Pangenomics Alignment

unbiasing biomedical discovery

In the past decade, there has been an effort to sequence and compare large numbers of individual genomes of a given species, and as a result many (reference) genomes of various species have been made publicly available. For example, there is now public data from the 1000 Genomes Project, the 100,000 Genomes Project, and the 1001 Arabidopsis Genomes project, among others. Short-read aligners have been fundamental to the analysis of these datasets and have enabled the discovery of genetic markers that have causal relationships with many diseases and phenotypes. These methods take as input a set of sequence reads and a reference genome, build an index from the reference genome, and use this index to find alignments, with the limitation that only a few insertions and deletions are allowed. The goal of this project is to develop the theoretical and practical methods needed to align to a population of genomes, rather than a single genome.
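The index-then-align workflow described above can be illustrated with a toy example. This is a minimal sketch, not the method developed in this project: it assumes a plain k-mer hash index (production aligners typically use compressed full-text indexes) and allows only mismatches, with no insertions or deletions. The function names `build_kmer_index` and `align_read`, the parameter `max_mismatches`, and the example sequences are all hypothetical.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align_read(read, reference, index, k, max_mismatches=1):
    """Return sorted (start, mismatches) pairs for ungapped placements of the read.

    Each k-mer of the read is looked up in the index (the "seed"), and every
    implied start position is checked by counting mismatches over the whole read.
    """
    hits = {}
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            # Skip placements that fall off the reference or were already accepted.
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())


# Toy data, made up for illustration.
reference = "ACGTACGTTAGCCGATAGCTAGGCTA"
index = build_kmer_index(reference, k=4)
exact_hits = align_read("TAGCCGAT", reference, index, k=4)
one_mismatch_hits = align_read("TAGCGGAT", reference, index, k=4)
print(exact_hits)
print(one_mismatch_hits)
```

Because the index is built from one reference string, any variation absent from that string can only show up as a mismatch or a failed alignment. That is the bias that aligning to a population of genomes is meant to reduce.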

Antimicrobial resistance

identifying triggers of resistant bacteria

The World Health Organization describes antimicrobial resistance (AMR) as “an increasingly serious threat to global public health.” This threat has prompted an executive order initiating the National Action Plan for Combating Antibiotic-Resistant Bacteria, whose fifth goal is to: “Improve International Collaboration and Capacities for Antibiotic-resistance Prevention, Surveillance, Control, and Antibiotic Research and Development.” Yet, despite the growing concern about antibiotic use in agriculture, there are very few computational methods to measure, quantify, and track AMR genes within a food production system. Hence, one of my research interests is the development of methods that allow systematic surveillance of AMR in agricultural and clinical environments. Funded by NIH/NIAID R01AI141810-01 (PI: Boucher).





enabling mobile bioinformatics

testing in the field and beyond

Third-generation sequencing technologies, including Oxford Nanopore’s MinION and SmidgION, are once again revolutionizing the biomedical sciences by combining large throughput with miniaturization and portability. A sequencer now fits in the palm of a hand and plugs directly into a smartphone, ready for on-site, real-time genomic applications. Our aim is to create mobile bioinformatics methods for on-site, real-time detection of pathogens and AMR using nanopore technology. Funded by NSF: SCH (PI: Boucher).

See the UFL Highlight article about this project.

An illustration of Oxford Nanopore’s MinION

optical mapping analysis

assembly and analysis of optical mapping data

Even with high coverage and a variety of insert sizes, genome assembly and structural variation detection remain difficult with short-read data alone because of repetitive regions in the genome. One type of data that can help overcome these challenges is optical mapping data. Optical maps are ordered, genome-wide, high-resolution restriction maps that specify the positions at which one or more short nucleotide sequences occur. This research is funded by the National Science Foundation (NSF III 1618814; PI: Boucher).
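To make the idea of a restriction map concrete, the sketch below builds an idealized one in silico from a known sequence. It is an illustration under simplifying assumptions, not the lab's pipeline: real optical maps are measured from imaged DNA molecules and carry sizing error and missing or spurious sites, none of which is modeled here. The function names `restriction_sites` and `fragment_lengths` and the toy sequence are hypothetical. Sites are reported at the start of the recognition motif rather than at the enzyme's actual cut position.

```python
def restriction_sites(sequence, motif):
    """Return the 0-based start positions of every motif occurrence, in order."""
    sites = []
    pos = sequence.find(motif)
    while pos != -1:
        sites.append(pos)
        # Resume one base further on so overlapping occurrences are also found.
        pos = sequence.find(motif, pos + 1)
    return sites


def fragment_lengths(sequence_length, sites):
    """Turn ordered site positions into the ordered fragment lengths between them."""
    boundaries = [0] + sites + [sequence_length]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]


# Toy sequence containing three copies of the EcoRI recognition motif GAATTC.
genome = "TTGAATTCAAACCCGGGTTTGAATTCACGTACGTGAATTCGG"
sites = restriction_sites(genome, "GAATTC")
fragments = fragment_lengths(len(genome), sites)
print(sites)
print(fragments)
```

The ordered list of fragment lengths is the form in which such a map is usually compared against an assembly. A repeat that short reads cannot resolve still changes the spacing between neighboring sites.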





Above is an illustration of the BioNano Saphyr System.