unbiasing biomedical discovery
In the past decade, there has been a concerted effort to sequence and compare many individual genomes of a given species, and as a result a large number of (reference) genomes of various species are now publicly available. For example, there is now public data from the 1000 Genomes Project, the 100,000 Genomes Project, and the 1001 Genomes Project for Arabidopsis, among others. Short read aligners have been fundamental to the analysis of these datasets and have enabled the discovery of genetic markers associated with many diseases and phenotypes. These methods take as input a set of sequence reads and a reference genome, build an index from the reference genome, and use this index to find alignments, with the limitation that only a few insertions and deletions are allowed. The goal of this project is to develop the theoretical and practical methods needed to align to a population of genomes, rather than a single genome.
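The index-then-align workflow described above can be illustrated with a toy example. The sketch below is only a minimal illustration, not the method developed in this project or used by any particular aligner: it assumes a plain k-mer hash index, allows mismatches but no insertions or deletions, and the names build_kmer_index, align_read, and max_mismatches are hypothetical. Production aligners typically rely on compressed indexes and more careful scoring.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align_read(read, reference, index, k, max_mismatches=2):
    """Seed-and-verify alignment without indels (an illustrative assumption).

    Each k-mer of the read is looked up in the index; every hit proposes a
    candidate start position in the reference, which is then verified by
    counting mismatches over the full read length.
    Returns a sorted list of (start, mismatches) pairs.
    """
    hits = {}
    seen = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset
            if start in seen:
                continue  # this candidate start was already verified
            seen.add(start)
            if start < 0 or start + len(read) > len(reference):
                continue  # the read would hang off the end of the reference
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())
```

For instance, one would build the index once with build_kmer_index(reference, 4) and then call align_read for each read. Because the index is built from a single reference string, variation present elsewhere in the population is invisible to it, which is the limitation that motivates aligning to a population of genomes.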
identifying triggers of resistant bacteria
enabling mobile bioinformatics
testing in the field and beyond
Third-generation sequencing technologies, including Oxford Nanopore’s MinION and SmidgION, are again revolutionizing the biomedical sciences by combining high throughput with miniaturization and portability. A sequencer now fits in the palm of a hand and plugs directly into a smartphone, ready for on-site, real-time genomic applications. Our aim is to create mobile bioinformatics methods for on-site, real-time detection of pathogens and antimicrobial resistance (AMR) using nanopore technology. Funded by NSF: SCH (PI: Boucher).
optical mapping analysis
assembly and analysis of optical mapping data
Even with high coverage and a range of insert sizes, genome assembly and structural variation detection remain difficult computational problems when using short read data alone, because of repetitive regions in the genome. Optical mapping data is one type of data that can be used to overcome these challenges. Optical maps are ordered, genome-wide, high-resolution restriction maps that specify the positions at which one or more short nucleotide sequences occur. This research is funded by the National Science Foundation (NSF III 1618814; PI: Boucher).
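To make the notion of a restriction map concrete, the sketch below computes an idealized, in silico map from a known sequence: the ordered positions where a recognition sequence occurs and the fragment lengths between them. This is an illustration only, under simplifying assumptions: the function names restriction_map and fragment_lengths are hypothetical, the cut is placed at the start of each motif occurrence, and the sizing error, missed sites, and spurious sites found in real optical mapping data are ignored.

```python
def restriction_map(sequence, motif):
    """Return the ordered start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Search again one base further on, so overlapping sites are also found.
        start = sequence.find(motif, start + 1)
    return positions


def fragment_lengths(positions, sequence_length):
    """Convert ordered site positions into the lengths of the fragments between them."""
    cuts = [0] + list(positions) + [sequence_length]
    return [b - a for a, b in zip(cuts, cuts[1:])]
```

For the toy sequence "AAGAATTCCCGAATTCTT" and the motif "GAATTC", restriction_map returns [2, 10] and fragment_lengths returns [2, 8, 8]. An ordered list of fragment sizes like this is the kind of information an optical map carries, without the underlying base sequence.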