Bioinformatics
Genomic Signatures of Speciation in Sympatric Flying Squirrels
Hybridization contributes to genetic diversity and can influence speciation. This study investigates the genetic evidence of recent hybridization under climate change in sympatric populations of northern and southern flying squirrels in Ontario. Using low-coverage whole-genome sequences, my research examines the existing population structure and measures the genomic variation of the Glaucomys species. The global estimates of FST (0.308) and DXY (0.141) indicate substantial differentiation between the species. Measures of genetic diversity (π), differentiation (FST), and divergence (DXY) across the genome reveal insights into the divergent selection driving speciation. Results indicate an absence of contemporary hybridization or introgression at a site with longstanding sympatry. Across both species' genomes, signatures of selection align with four different scenarios for the formation of genomic landscapes of differentiation, shedding light on the complex speciation history of these flying squirrels. These findings enhance understanding of evolutionary dynamics, adaptation, speciation, and genetic differentiation.
Author Keywords: Genomic differentiation, Glaucomys, northern flying squirrel, southern flying squirrel, speciation
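The three statistics named in the abstract above (π, DXY, and FST) can be illustrated with a minimal sketch. It assumes fully called, aligned haplotype sequences, which is a simplification: low-coverage data are usually analysed with genotype likelihoods rather than called bases. The FST shown is a Hudson-style ratio (one minus mean within-population diversity over between-population divergence), not necessarily the estimator used in the thesis, and the toy sequences and function names are hypothetical.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Proportion of sites that differ between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs):
    """pi: mean per-site pairwise difference within one population."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def dxy(pop1, pop2):
    """DXY: mean per-site pairwise difference between two populations."""
    pairs = list(product(pop1, pop2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst_hudson(pop1, pop2):
    """Hudson-style FST: 1 - (mean within-population pi) / DXY."""
    pi_within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    return 1 - pi_within / dxy(pop1, pop2)


# Hypothetical toy haplotypes, two per population (not real squirrel data).
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTAA", "TTAT"]

pi_a = nucleotide_diversity(pop_a)
d_ab = dxy(pop_a, pop_b)
fst_ab = fst_hudson(pop_a, pop_b)
```

In this toy case within-population diversity is low relative to between-population divergence, so FST is high; genome scans typically compute the same quantities in sliding windows rather than once per genome.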
Gene flow directionality and functional genetic variation among Ontario, Canada Ursus americanus populations.
Rapidly changing landscapes introduce challenges for wildlife management, particularly for large mammal populations with long generation times and extensive spatial requirements. Understanding how these populations interact with heterogeneous landscapes aids in predicting responses to further environmental change. In this thesis, I profile American black bears using microsatellite loci and pooled whole-genome sequencing. These data characterize gene flow directionality and functional genetic variation to understand patterns of dispersal and local adaptation, processes key to understanding vulnerability to environmental change. I show dispersal is positively density-dependent, male-biased, and influenced by food productivity gradients suggestive of source-sink dynamics. Genomic comparison of bears inhabiting different climate and forest zones identified variation in genes related to the cellular response to starvation and cold. My thesis demonstrates source-sink dynamics and local adaptation in black bears. Population management must balance dispersal to sustain declining populations against the risk of maladaptation under future scenarios of environmental change.
Author Keywords: American black bear, Dispersal, Functional Genetic Variation, Gene Flow Directionality, Genomics, Local Adaptation
Demographic history and conservation genomics of caribou (Rangifer tarandus) in Québec
Genetic variation is the raw material and basis for evolutionary changes in nature. The loss of genetic diversity is a challenge many species are facing, with genomics being a potential tool to inform and prioritize decision making. Whole genome analysis can be an asset to conservation biology and the management of species through the generation of more precise and novel metrics. This thesis uses whole genome re-sequencing to characterize the demographic history and quantify genomic metrics relevant to conservation of caribou (Rangifer tarandus) in Québec, Canada. We calculated the ancestral and contemporary patterns of genomic diversity of five representative caribou populations and applied a comparative population genomics framework to assess the interplay between demographic events and genomic diversity. When compared to the census size, NC, the endangered Gaspésie Mountain caribou population had the highest ancestral Ne:NC ratio, which is consistent with recent work suggesting high ancestral Ne:NC is of conservation concern. These ratios were highly correlated with genomic signatures (i.e. Tajima's D) of recent population declines and explicit demographic model parameters. Values of contemporary Ne, estimated from linkage disequilibrium, showed Gaspésie having among the highest contemporary Ne:NC ratios. Importantly, classic conservation genetics theory would predict this population to be of less concern based on this metric alone. Inbreeding measures suggested nuanced patterns of inbreeding and correlated with the demographic models. This study suggests that while the Québec populations are all under decline, they harbour enough ancestral genetic variation to replenish any lost diversity, if conservation decisions are made in favour of these populations, specifically supporting NC.
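Tajima's D, mentioned in the abstract above as a genomic signature, compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The following is a minimal sketch of the standard formula, assuming aligned, fully called sequences with no missing data; the toy sequences and the function name are hypothetical, and the thesis itself may have used a different implementation suited to whole-genome data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(seqs)
    # S: number of alignment columns with more than one allele.
    s = sum(len(set(col)) > 1 for col in zip(*seqs))
    if s == 0:
        return 0.0  # D is undefined without variation; 0.0 is this sketch's convention.

    # Mean number of pairwise differences (a count, not a per-site value).
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from the standard variance formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical sample in which every variant is a singleton (rare-allele excess).
sample = ["AAAA", "TAAA", "ATAA", "AATA"]
d_value = tajimas_d(sample)
```

D is negative when rare variants are in excess and positive when intermediate-frequency variants are in excess; the all-singleton toy sample therefore gives a negative value.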
An Investigation of the Impact of Big Data on Bioinformatics Software
As the generation of genetic data accelerates, Big Data has an increasing impact on the way bioinformatics software is used. The experiments become larger and more complex than originally envisioned by software designers. One way to deal with this problem is to use parallel computing.
Using the program Structure as a case study, we investigate ways in which to counteract the challenges created by the growing datasets. We propose an OpenMP and an OpenMP-MPI hybrid parallelization of the MCMC steps, and analyse the performance in various scenarios.
The results indicate that the parallelizations produce significant speedups over the serial version in all scenarios tested. This allows for using the available hardware more efficiently, by adapting the program to the parallel architecture. This is important because not only does it reduce the time required to perform existing analyses, but it also opens the door to new analyses, which were previously impractical.
Author Keywords: Big Data, HPC, MCMC, parallelization, speedup, Structure
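The speedup reported in the abstract above is conventionally the ratio of serial to parallel run time. The sketch below shows that calculation together with parallel efficiency and the Amdahl's-law bound on achievable speedup. All timings and the parallel fraction are hypothetical illustrations, not measurements from the thesis, and the function names are invented for this example.

```python
def speedup(serial_seconds, parallel_seconds):
    """Observed speedup: serial run time divided by parallel run time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Parallel efficiency: speedup per worker (1.0 would be ideal scaling)."""
    return speedup(serial_seconds, parallel_seconds) / workers


def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the run time parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


# Hypothetical numbers: a 120 s serial run that takes 40 s on 4 threads.
observed = speedup(120.0, 40.0)
per_worker = efficiency(120.0, 40.0, 4)
# If 90% of the run time were parallelizable, 4 workers could give at most:
bound = amdahl_speedup(0.9, 4)
```

Comparing the observed speedup with the Amdahl bound indicates how much of the remaining gap comes from the serial portion of the program rather than from parallel overhead.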
Interactome Study of Giardia Intestinalis Cytochromes B5
Giardia intestinalis is an anaerobic protozoan that lacks common eukaryotic heme-dependent respiratory complexes and does not encode any proteins involved in heme biosynthesis. Nevertheless, the parasite encodes several hemeproteins, including three members of the Type II cytochrome b5 sub-group of electron transport proteins found in anaerobic protist and amitochondriate organisms. Unlike the more well-characterized cytochrome b5s of animals, no function has been ascribed to any of the Type II proteins. To explore the functions of these Giardia cytochromes (gCYTB5s), I used bioinformatics, immunofluorescence microscopy (IFM) and co-immunoprecipitation assays. The protein-protein interaction in silico prediction tool, STRING, failed to identify relevant interacting partners for any of the Type II cytochromes b5 from Giardia or other organisms. Differential cellular localization of the gCYTB5s was detected by IFM: gCYTB5-I in the perinuclear space; gCYTB5-II in the cytoplasm with a staining pattern similar to peripheral vacuole-associated protein; and gCYTB5-III in the nucleus. Co-immunoprecipitation with the gCYTB5s as bait identified potential interacting proteins for each isotype. The most promising candidate is the uncharacterized protein GL50803_9861, which was identified in the immunoprecipitate of both gCYTB5-I and II, and which co-localizes with both. Structural analysis of GL50803_9861 using Swiss Model, Phyre2, I-TASSER and RaptorX predicts the presence of a nucleotide-binding domain, which is consistent with a potential redox role involving nicotinamide or flavin-containing cofactors. Finally, the protein GL50803_7204, which contains an RNA/DNA binding domain, was identified as a potential partner of gCYTB5-III. These findings represent the first steps in the discovery of the roles played by these proteins in Giardia.
Author Keywords: Cytochrome b5, Giardia intestinalis, Heme, Interactome, Protein structure prediction
Using environmental DNA (eDNA) metabarcoding to assess aquatic plant communities
Environmental DNA (eDNA) metabarcoding targets sequences with interspecific variation that can be amplified using universal primers allowing simultaneous detection of multiple species from environmental samples. I developed novel primers for three barcodes commonly used to identify plant species, and compared amplification success for aquatic plant DNA against pre-existing primers. Control eDNA samples of 45 plant species showed that species-level identification was highest for novel matK and pre-existing ITS2 primers (42% each); remaining primers each identified between 24% and 33% of species. Novel matK, rbcL, and pre-existing ITS2 primers combined identified 88% of aquatic species. The novel matK primers identified the largest number of species from eDNA collected from the Black River, Ontario; 21 aquatic plant species were identified using all primers. This study showed that eDNA metabarcoding allows for simultaneous detection of aquatic plants including invasive species and species-at-risk, thereby providing a biodiversity assessment tool with a variety of applications.
Author Keywords: aquatic plants, biodiversity, bioinformatics, environmental DNA (eDNA), high-throughput sequencing, metabarcoding
De novo transcriptome assembly, functional annotation, and SNP discovery in North American flying squirrels (genus Glaucomys)
Introgressive hybridization between northern (Glaucomys sabrinus) and southern flying squirrels (G. volans) has been observed in some areas of Canada and the USA. However, existing molecular markers lack the resolution to discriminate late-generation introgressants and describe the extent to which hybridization influences the Glaucomys gene pool. I report the first functionally annotated de novo transcriptome assembly for North American flying squirrels (genus Glaucomys), together with a set of 146,621 high-quality, annotated putative species-diagnostic SNP markers. RNA sequences were obtained from two northern flying squirrels and two southern flying squirrels sampled from Ontario, Canada. I reconstructed 702,228 Glaucomys transcripts using 193,323,120 sequence read-pairs, and captured sequence homologies, protein domains, and gene function classifications. These genomic resources can be used to increase the resolution of molecular techniques used to examine the dynamics of the Glaucomys hybrid zone.
Author Keywords: annotation, de novo transcriptome, flying squirrels, high-throughput sequencing, hybridization, single nucleotide polymorphisms
Investigating wheat rust virulence evolution through transcriptome analysis of a recently emerged race of Puccinia triticina
Puccinia triticina, wheat leaf rust (WLR), is the most economically damaging fungal rust of wheat on a global scale. This study identified transcriptome changes in a recently emerged race of WLR in Ontario with a new virulence type relative to a possible ancestor race. Also, this study focused on detecting variation in candidate virulence genes and uncovering novel insight into WLR virulence evolution. Various race-by-variety interactions were evaluated using RNA-seq experiments. A list of genes with statistically significant expression changes in each comparison was prepared and predicted effectors were retained for further analysis. Proteins with nonsynonymous substitutions were run through BLASTx to identify potential orthologs. Over 100 candidate effectors with a 2-fold or higher change in transcript level were identified. Seven of these candidate effector genes were recognized to contain single nucleotide polymorphisms (SNPs) which altered the amino acid sequence of the resulting protein. The information gained may aid in targeted breeding programs to combat new WLR races as well as provide the basis for functional analysis of WLR using potential orthologs in a model basidiomycete.
Author Keywords: effectors, RNA-seq, rust fungi, SNPs, transcriptome, wheat leaf rust
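The 2-fold expression threshold mentioned in the abstract above corresponds to an absolute log2 fold change of at least 1. The sketch below applies that filter to a small table of counts. It is only an illustration: the gene names and counts are hypothetical, the pseudocount of 1 is an assumption made to avoid division by zero, and the statistical significance testing described in the abstract is not reproduced here.

```python
import math


def log2_fold_change(treatment, control, pseudocount=1.0):
    """log2 ratio of treatment to control counts, with a pseudocount added to both."""
    return math.log2((treatment + pseudocount) / (control + pseudocount))


def filter_by_fold_change(counts, min_fold=2.0):
    """Keep genes whose expression changes by at least min_fold in either direction."""
    threshold = math.log2(min_fold)
    kept = {}
    for gene, (control, treatment) in counts.items():
        lfc = log2_fold_change(treatment, control)
        if abs(lfc) >= threshold:
            kept[gene] = lfc
    return kept


# Hypothetical (control, treatment) counts for three made-up candidate effectors.
counts = {
    "effector_A": (99, 399),   # about 4-fold up
    "effector_B": (99, 119),   # about 1.2-fold up, below the threshold
    "effector_C": (399, 99),   # about 4-fold down
}
candidates = filter_by_fold_change(counts)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a single absolute threshold captures both directions of change.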
Adaptive Genetic Markers Reveal the Biological Significance and Evolutionary History of Woodland Caribou (Rangifer tarandus caribou) Ecotypes
Migratory and sedentary ecotypes are phenotypic distinctions of woodland caribou. I explored whether I could distinguish between these ecotypes in Manitoba and Ontario using genetic signatures of adaptive differentiation. I anticipated that signatures of selection would indicate genetic structure and permit ecotype assignment of individuals. Cytochrome-b, a functional portion of the mitochondrial genome, was tested for evidence of adaptation using Tajima's D and by comparing variations in protein physiology. Woodland caribou ecotypes were compared for evidence of contemporary adaptive differentiation in relation to mitochondrial lineages. Trinucleotide repeats were also tested for differential selection between ecotypes and used to assign individuals to genetic clusters. Evidence of adaptive variation in the mitochondrial genome suggests woodland caribou ecotypes of Manitoba and Ontario correspond with an abundance of functional variation. Woodland caribou ecotypes coincide with genetic clusters, and there is evidence of adaptive differentiation between migratory caribou and certain sedentary populations. Previous studies have not described adaptive variation in caribou using the methods applied in this study. Adaptive differences between caribou ecotypes suggest selection may contribute to the persistence of ecotypes and provide new genetic tools for population assessment.
Author Keywords: Adaptation, Cytochrome-b, Ecotype, Rangifer tarandus caribou, Selection, Trinucleotide Repeat