In the first study to run a genome-wide analysis of short tandem repeats (STRs) in gene expression, a large team of computational geneticists has shown that STRs, long thought to be neutral, or “junk,” DNA, actually play an important role in regulating gene expression. The work uncovers a new class of genetic variants that modulate gene expression.
Says the study’s leader Yaniv Erlich, who is an assistant professor of computer science at Columbia Engineering, a member of Columbia’s Data Science Institute, and a core member of the New York Genome Center:
“Our work expands the repertoire of functional genetic elements. We expect our findings will lead to a better understanding of disease mechanisms and perhaps eventually help to identify new drug targets.”
Spelling Errors in Different Flavors
Genomic variants are what make one person's DNA different from another's, and they come, Erlich explains, like spelling errors in different flavors. The most common variants are SNPs (single nucleotide polymorphisms). Computational geneticists have focused mostly on SNPs, which look like a single-letter typo (mother vs. muther), and on their effect on complex human traits.
Erlich’s study looked at Short Tandem Repeats (STRs), variants that create what look like typos: stutter vs. stututututututter. Most researchers, assuming that STRs were neutral, dismissed them as not important. In addition, these variants are extremely hard to study.
“They look so different to analysis algorithms,” Erlich notes, “that they just usually classify them as noise and skip these positions.”
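To make the "stutter" analogy concrete, here is a minimal Python sketch that scans a sequence for short units repeated back to back. It is an illustration only, not the method used in the study: the function name `find_strs` and the thresholds (units of 2 to 6 letters, at least 3 copies) are assumptions chosen for the example.

```python
import re

def find_strs(sequence, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for each run of a short unit repeated in tandem.

    The thresholds are illustrative assumptions, not values from the study.
    """
    # A unit of min_unit..max_unit letters (shortest unit preferred),
    # followed immediately by at least (min_repeats - 1) more copies of itself.
    pattern = re.compile(
        r"([A-Za-z]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# A made-up DNA fragment: five copies of "CAG" flanked by other letters.
print(find_strs("TT" + "CAG" * 5 + "AA"))  # [(2, 'CAG', 5)]
```

Two people carrying different numbers of copies at the same position would show up here as different `copies` values, which is the kind of length variation the article describes.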
Erlich used a multitude of statistical genetic and integrative genomics analyses to reveal that STRs have a function: they act like springs or knobs that can expand and contract, fine-tuning the expression of nearby genes. Different lengths correspond to different tensions of the spring and can control gene expression and disease traits.
He is calling these variants eSTRs, or expression STRs, to note that they regulate gene expression. He and his team also discovered that these eSTRs can be associated with a range of traits and conditions, including Crohn’s disease, high blood pressure, and a number of metabolites.
These eSTRs explain, on average, 10 to 15% of the genetic differences in gene expression between individuals.
“We’ve known that STRs are known to play a role in these diseases, but no one has ever conducted a genome-wide scan to find their effect on complex traits,” Erlich adds. “If we want to do personalized medicine, we really need to understand every part of the genome, including repeat elements—there’s a lot of exciting biology ahead.”
Erlich and his team, which included researchers from Harvard, MIT, Stanford University, and Mount Sinai, plan next to study the effect of these eSTRs on more human diseases and better understand their molecular mechanism.
Melissa Gymrek, Thomas Willems, Audrey Guilmatre, Haoyang Zeng, Barak Markus, Stoyan Georgiev, Mark J Daly, Alkes L Price, Jonathan K Pritchard, Andrew J Sharp & Yaniv Erlich
Abundant contribution of short tandem repeats to gene expression variation in humans
Nature Genetics (2015) doi:10.1038/ng.3461
Illustration: Anna Tanczos, Wellcome Images