A study by researchers at Yale University offers a new view on what causes the greatest genetic variability among individuals, suggesting that it is due less to single point mutations than to the presence of structural changes that cause extended segments of the human genome to be missing, rearranged or present in extra copies.
"The focus for identifying genetic differences has traditionally been on point mutations or SNPs – changes in single bases in individual genes," says Michael Snyder, the Cullman Professor of Molecular, Cellular & Developmental Biology and senior author of the study, which was published in Science Express. "Our study shows that a considerably greater amount of variation between individuals is due to rearrangement of big chunks of DNA."
Although the original human genome sequencing effort was comprehensive, it left regions that were poorly analysed. Recently, investigators found that even in healthy individuals, many regions in the genome show structural variation. This study was designed to fill in the gaps in the genome sequence and to create a technology to rapidly identify structural variation between genomes at very high resolution over extended regions.
"We were surprised to find that structural variation is much more prevalent than we thought and that most of the variants have an ancient origin. Many of the alterations we found occurred before early human populations migrated out of Africa," says first author Jan Korbel, a postdoctoral fellow in the Department of Molecular Biophysics & Biochemistry at Yale.
To look at structural variants that were shared or different, DNA from two females – one of African descent and one of European descent – was analysed using a novel DNA-based methodology called Paired-End Mapping (PEM). Researchers broke up the genomic DNA into manageable pieces about 3000 bases long, tagged and rescued the paired ends of the fragments, and then analysed their sequence with a high-throughput, rapid-sequencing method developed by 454 Life Sciences.
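The article does not spell out how paired ends reveal a structural change. The general idea behind paired-end approaches is that the two ends of a fragment, once mapped to a reference genome, should lie roughly one fragment length apart and on opposite strands; pairs that break this expectation hint at a rearrangement. The Python sketch below illustrates that reasoning only. The names `MappedPair` and `classify_pair`, the tolerance value and the example coordinates are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

EXPECTED_SPAN = 3000  # approximate fragment length reported in the article
TOLERANCE = 1000      # hypothetical cutoff, not a value from the study


@dataclass
class MappedPair:
    start: int         # reference position where the first end maps
    end: int           # reference position where the second end maps
    same_strand: bool  # True if both ends map to the same strand (unexpected)


def classify_pair(pair, expected=EXPECTED_SPAN, tolerance=TOLERANCE):
    """Label one mapped pair by how it deviates from the expected span."""
    span = abs(pair.end - pair.start)
    if pair.same_strand:
        # Ends no longer on opposite strands: one end may sit in an inverted segment.
        return "possible inversion"
    if span > expected + tolerance:
        # Ends map too far apart: the sample may lack sequence present in the reference.
        return "possible deletion"
    if span < expected - tolerance:
        # Ends map too close together: the sample may carry extra sequence.
        return "possible insertion"
    return "concordant"


# Made-up coordinates, for illustration only.
pairs = [
    MappedPair(10_000, 13_100, False),
    MappedPair(50_000, 58_000, False),
    MappedPair(90_000, 90_800, False),
    MappedPair(120_000, 123_000, True),
]
for p in pairs:
    print(p.start, classify_pair(p))
```

In practice a single discordant pair would not be treated as evidence on its own; an analysis along these lines would look for several independent pairs pointing to the same region before reporting a variant.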
"454 Sequencing can generate hundreds of thousands of long read pairs that are unique within the human genome to quickly and accurately determine genomic variations," explains Michael Egholm, a co-author of the study and vice-president of research and development at 454 Life Sciences.
"Previous work, based on point mutations, estimated that there is a 0.1% difference between individuals, while this work points to a level of variation between two and five times higher," says Snyder.
"We also found 'hot spots' – particular regions where there is a lot of variation," says Korbel. "While these regions may be still actively undergoing evolution, they are often regions associated with genetic disorder and disease."
"These results will have an impact on how people study genetic effects in disease," says Alex Eckehart Urban, a graduate student in Snyder's group, and one of the principal authors on the study. "It was previously assumed that 'landmarks,' like the SNPs mentioned earlier, were fairly evenly spread out in the genomes of different people. Now, when we are hunting for a disease gene, we have to take into account that structural variations can distort the map and differ between individual patients."
Adds Snyder: "While it may sound like a contradiction, this study supports results we have previously reported about gene regulation as the primary cause of variation. Structural variation of large spans of the genome will likely alter the regulation of individual genes within those sequences."
According to the authors, even in healthy people, there are variants in which part of a gene is deleted or sequences from two genes are fused together without destroying the cellular activity with which they are associated.
They say these findings show that the "parts list" of the human genome may be more variable, and possibly more flexible, than previously thought.