Personalized Genome

        The human genome is the complete DNA sequence information of a human cell. The reference human genome serves as a platform for describing the primary structure of that sequence, and the completion of the Human Genome Project made it possible to characterize the genome’s functional elements. However, different versions of the same gene circulate among people in different populations and geographical regions. This phenomenon is called genetic polymorphism. It is reflected in sequence variation such as single nucleotide polymorphisms (SNPs), copy number variations (CNVs), and insertions and deletions (InDels). Because of these polymorphisms, a single generalized human genome cannot represent the genomes of individuals from different populations. If “A” and “a” are two different versions of a given gene, we call them alleles. Humans have a diploid genome, meaning that human cells contain two copies of the genome, one inherited from the mother and one from the father. If the two alleles of a gene are the same (AA or aa), the genotype is called homozygous; otherwise it is heterozygous (Aa). If allele “a” is disease associated, carrying it in either the heterozygous (Aa) or the homozygous (aa) state may put an individual at risk of the disease, depending on whether the allele acts in a dominant or a recessive manner.
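
The genotype terminology above can be sketched in a few lines of Python. This is only an illustration: the function names and the `mode` argument ("recessive" or "dominant") are assumptions made for this sketch and do not come from any particular genetics tool.

```python
def zygosity(genotype):
    """Classify a two-allele genotype such as "AA", "Aa" or "aa"."""
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


def at_risk(genotype, risk_allele="a", mode="recessive"):
    """Return True if the genotype puts its carrier at risk.

    `mode` is an assumption of this sketch: a recessive risk allele
    needs two copies to act, a dominant one needs only one copy.
    """
    copies = genotype.count(risk_allele)
    if mode == "recessive":
        return copies == 2
    if mode == "dominant":
        return copies >= 1
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for g in ("AA", "Aa", "aa"):
        print(g, zygosity(g), at_risk(g), at_risk(g, mode="dominant"))
```

Under the recessive assumption only "aa" is flagged, while under the dominant assumption both "Aa" and "aa" are flagged.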

        A personalized genome is the complete sequence information of one individual. Individual-specific sequence differences (e.g. SNPs and InDels) can be used to infer a person’s ancestral ethnic group and geographical origin and to determine kinship among individuals. More importantly, a personalized genome is the biological feature that best represents that individual. Polymorphisms and mutations are major reasons for susceptibility to certain diseases, and also for responses to medication that range from altered to absent. For this reason, a personalized genome can support a detailed report of which diseases a person may be predisposed to and which environmental risk factors may contribute to certain health conditions, and can ultimately guide medication selection in personalized medicine. With advances in DNA/RNA sequencing technology, bioinformatics and data science, personal genome sequencing has become a mature and affordable technique.

Personalized genome sequencing

        Personalized genome sequencing strategies fall into three categories: genotyping microarrays, exome sequencing and whole-genome sequencing.

        Measuring the genotypes of known disease-related polymorphisms can determine whether a person carries disease-associated variants. A genotyping microarray is based on molecular hybridization between the DNA/RNA sample of the person tested and pre-designed nucleotide probes. This method offers fast turnaround and low analysis cost. However, the targets that can be analyzed are limited to the pre-designed probes, which are based on known polymorphisms, so this method cannot fully resolve the variation within one’s genome.

        Exome sequencing and whole-genome sequencing provide the most comprehensive information about one’s genome. Both strategies rely on Next Generation Sequencing (NGS) platforms. In addition to the information that can be obtained from genotyping analysis, exome sequencing covers the sequence of all exons. Whole-genome sequencing additionally covers non-coding and intergenic sequences, so in principle any sequence variation can be captured. The 1000 Genomes Project showed that, on average, each individual carries 250-300 loss-of-function variants in known genes and 50 to 100 variants that have been implicated in inherited disorders. More comprehensive sequence information improves the recognition of these variants and therefore the accuracy of personalized medicine. DNA is usually sequenced at 30-50 fold coverage to collect enough sequence information for statistically reliable detection of genetic variations. These variations can subsequently be compared against multiple databases to find associated diseases.
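
To make the 30-50 fold coverage figure concrete, the sketch below uses the common approximation that mean coverage equals the number of reads times the read length, divided by the genome size. The genome size of about 3.1 billion bases and the 150-base read length are illustrative assumptions, not values taken from this article, and the function names are hypothetical.

```python
import math


def reads_needed(coverage, genome_size_bp, read_length_bp):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def mean_coverage(num_reads, read_length_bp, genome_size_bp):
    """Mean fold coverage obtained from a given number of reads."""
    return num_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    GENOME_SIZE = 3_100_000_000  # assumed approximate human genome size (bp)
    READ_LENGTH = 150            # assumed read length (bp)
    for target in (30, 50):
        n = reads_needed(target, GENOME_SIZE, READ_LENGTH)
        print(f"{target}X coverage needs about {n:,} reads")
```

Under these assumptions, 30-fold coverage corresponds to several hundred million reads, which is why high-throughput platforms are required for whole-genome sequencing.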

Personalized medicine

        Personalized medicine relies heavily on personalized genomes. For example, an advanced chemotherapy selection method developed at Mayo Clinic uses xenografts of patient-derived tumor cells in immune-suppressed mice. Tumor cell clones isolated from patients are amplified in multiple animals, and drug selection is then performed on the tumor xenografts. The effectiveness of a drug is judged by how well it represses the tumor-related shift in gene expression profile in post-treatment cancer cells. Patient data suggested that genetic variations, gene expression and drug response can be used to classify patients into different cohorts. Using such cohorts, the medication likely to be effective for a new patient can be inferred in reverse.

        With advances in genetic research techniques and data science, and the growing availability of various types of data, the cost of sequencing is much lower than it was 4-5 years ago and is estimated at less than 1,500 dollars. If you are interested in having your own genome sequenced and analyzed, feel free to contact us with any questions; we are happy to design the most appropriate sequencing plan.

Figure 1 Disease related SNP

Figure 1 illustrates how a recessive SNP can cause disease. Assume there are two different alleles of a given gene in the population: one is not disease related (brown allele) and the other is disease related (yellow mixed allele). The disease-related allele leads to disease in a recessive manner, meaning it produces its effect only when the gene is homozygous for the disease-related allele. The father carries one copy of each allele (heterozygous) while the mother carries two copies of the normal allele (homozygous). By Mendelian inheritance, each of their children has a 50% chance of carrying the disease-related allele. If a carrier child and another carrier of the disease allele have children, each of those children has a 25% chance of being normal (homozygous for the normal allele), a 50% chance of being a carrier (heterozygous for the given gene) and a 25% chance of having the disease (homozygous for the disease allele). If this disease has environmental risk factor(s), knowing that one’s genome is prone to the disease helps in avoiding exposure to those risk factors and preventing the disease, or in starting preventative treatment early.
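
The percentages above follow from enumerating every combination of one allele from each parent (a Punnett square). The short Python sketch below reproduces that arithmetic; writing the normal allele as "A" and the disease-related allele as "a", and the function name itself, are conventions assumed for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return the probability of each offspring genotype for one gene.

    Each parent is a two-character genotype such as "Aa". One allele is
    drawn from each parent with equal probability (Mendelian inheritance).
    """
    # Sort each pair so that "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    # Heterozygous father, mother homozygous for the normal allele.
    print(offspring_genotypes("Aa", "AA"))
    # Two carriers.
    print(offspring_genotypes("Aa", "Aa"))
```

The first cross gives carriers (Aa) with probability 1/2, and the second gives AA, Aa and aa with probabilities 1/4, 1/2 and 1/4.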

Figure 2 Mutation

Figure 2 illustrates how a mutation can cause disease. Unlike polymorphisms, new mutations are not inherited from the parents; they arise in one’s own genome through various mechanisms. On the bright side, if a mutation has no deleterious effect, it can be passed on to the next generation like a normal allele and circulate in the population as a new polymorphism. On the other hand, a detrimental mutation may not only impair the function of its own gene product; the mutant product can also inhibit the product of the normal allele (e.g. tumor suppressor p53). In cancer research, loss of heterozygosity (LOH) serves as a marker of disease progression. The mutation originally takes place in one of the two alleles, while the other, normal allele can still produce normal gene product. When LOH is observed, the normal allele has also been lost or mutated, so the cell can no longer produce normal gene product.








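        Exome sequencing and whole-genome sequencing provide the most complete information about an individual’s genome. Both methods rely on high-throughput sequencing. Beyond the results of genotyping analysis, these data also provide all coding and non-coding sequences, SNPs/InDels and copy number changes. The 1000 Genomes Project showed that, on average, each person carries 250 to 300 genomic sequence variants that cause loss of function of a gene product, as well as 50 to 100 sequence variants acquired through inheritance. More data improves the recognition of these sequence variants and their associated diseases, and the accuracy of drug screening. For whole-genome sequencing, we generally sequence the genome to a depth of 30-50X to ensure that there is statistically enough sequence information to detect different alleles and mutations. We then align the data to the human reference genome and compare the sequence differences against multiple databases to uncover potential disease information.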


Figure 1 Disease-related single nucleotide polymorphism


Figure 2 Mutation



How to have your personalized genome sequenced

If you are interested in sequencing your own genome, we recommend that you start the process by contacting us. Based on your needs, we can help you design the best project protocol and arrange sample collection and sequencing. For example, if you are interested in where your ancestors came from, we will recommend a genotyping array plus population genetics analysis, which gives you the desired answer on a low budget. In contrast, if you wish to sequence your personal genome in full, a more comprehensive sequencing strategy would be more appropriate. We will arrange for your sample to be sequenced at a collaborating sequencing service provider close to you. Once sequencing is complete, the data will be transferred to our server for downstream analysis. Using state-of-the-art bioinformatics tools, we will identify gene polymorphisms and mutations. We will then compare your genome to known genomic information to determine whether your genetic variations are associated with certain health conditions. The final report sent back to you will list every polymorphism and mutation we found in your genome, as well as any diseases and risk factors potentially associated with them. We will also report any drugs shown to repress the function of particular genetic variants. After the whole process, we will contact you again to go through the details of your results and to ask whether you want to share your genetic information with your health provider when necessary. If you are interested in a genome project about yourself, please contact us.