Projects

Metagenomic Data Analytics for Human Identification

The availability of vast quantities of human and microbial genomic data can be exploited to uncover the ways in which the human genome affects both internal and external bacterial communities and, conversely, how these communities may affect our own genome.

Traits inherited from our parents shape our outward appearance and identity, and the field of epigenetics is uncovering regions of the human genome that reflect individual characteristics beyond readily observable physical appearance, such as personal habits and disease.

Similarly, the bacterial communities that colonize the human body, from skin surfaces to the digestive tract, are strongly shaped by health and environment, but may also correlate with host DNA through epigenetic factors.

The goal of this proposed project is to bridge the gap between these two lines of evidence by combining human genomic data sets with those from the microbiome. By doing so, we can mine human and microbial genomic data for potential biomarkers that help determine an individual’s habits, health, and identity.

The objective of this project is to study the potential association between the composition of the skin microbiome and certain genetic traits in the human genome, in order to build a framework that exploits these associations to determine identity. WVU has data from a previous collection effort comprising full human genomes and bacterial genomes from the same 20 individuals, along with metadata on ethnicity, hygiene habits, and health history. In the project, we will develop a computational framework for improved network propagation, using novel data structures, that makes it possible to connect information from disparate datasets, such as human DNA and bacterial data. Guided by panels of single nucleotide polymorphisms (SNPs) that are associated with human ethnicities, diseases, etc., we will investigate how these SNPs vary across our 20 subjects and whether they can correctly classify the individuals in our dataset according to their metadata.
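
To make the idea of network propagation concrete, the following is a minimal Python sketch of random walk with restart, a standard propagation technique, on a small joint graph. It is an illustration only and not the project's framework: the node names, edge weights, `restart` value, and the `build_graph` and `propagate` helpers are all hypothetical.

```python
def build_graph(edges):
    """Build an undirected weighted graph as a dict of dicts.

    edges: iterable of (node_a, node_b, weight) tuples (hypothetical data).
    """
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def propagate(graph, seeds, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart: spread seed scores across the graph.

    seeds: dict mapping node -> initial weight (normalized internally).
    Returns a dict mapping every node -> steady-state score (sums to 1).
    """
    nodes = sorted(graph)
    total = sum(seeds.values())
    start = {n: seeds.get(n, 0.0) / total for n in nodes}
    degree = {n: sum(graph[n].values()) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # Restart term: a fixed share of the score returns to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        # Walk term: the rest moves to neighbors in proportion to edge weight.
        for u in nodes:
            for v, w in graph[u].items():
                updated[v] += (1.0 - restart) * scores[u] * w / degree[u]
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical joint graph: SNP nodes linked to bacterial taxa.
edges = [
    ("snp:A", "taxon:X", 1.0),
    ("snp:B", "taxon:Y", 1.0),
    ("taxon:X", "taxon:Y", 0.2),
]
graph = build_graph(edges)
scores = propagate(graph, seeds={"snp:A": 1.0})
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Seeding the walk at one SNP ranks the taxa by how closely they are connected to it in the graph, which is the sense in which propagation links information from otherwise separate datasets.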

We will also characterize each subject’s microbiome to build a detailed individual profile. We will then perform a comparative genomic analysis to determine which markers tend to vary between groups of people and which tend to be unique to an individual. These analyses will provide more information on the SNPs, genomic variants, and bacterial metagenomic markers that could be used in human identification and human biogeographical classification.
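
As a rough illustration of the comparative step, the Python sketch below sorts markers into those carried by a single individual, those carried by exactly the members of one group, and those shared more broadly. The subject IDs, group labels, marker names, and the `categorize_markers` helper are hypothetical toy values, and a real analysis would need statistical testing rather than exact set matching.

```python
def categorize_markers(carriers, groups):
    """Label each marker by who carries it.

    carriers: dict mapping marker -> set of individuals carrying it.
    groups:   dict mapping individual -> group label (e.g. from metadata).
    """
    # Collect the members of each group.
    members = {}
    for person, label in groups.items():
        members.setdefault(label, set()).add(person)

    labels = {}
    for marker, people in carriers.items():
        if not people:
            labels[marker] = "absent"
        elif len(people) == 1:
            labels[marker] = "individual-specific"
        elif any(people == group for group in members.values()):
            # Carried by every member of one group and by nobody else.
            labels[marker] = "group-specific"
        else:
            labels[marker] = "shared"
    return labels


# Toy data: four hypothetical subjects in two hypothetical groups.
groups = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
carriers = {
    "m1": {"s1"},
    "m2": {"s1", "s2"},
    "m3": {"s1", "s3", "s4"},
}
labels = categorize_markers(carriers, groups)
for marker in sorted(labels):
    print(marker, labels[marker])
```

Markers labeled individual-specific are candidates for identification, while group-specific markers are candidates for classification by metadata such as ethnicity or habits.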

[Figure: metagenomic flowchart]
Figure 1. The proposed genomic-metagenomic joint network will relate the genome to an individual’s microbiome.