About Our Research


Evolution-Directed Studies of Protein Functional Surfaces

The Lichtarge Computational Biology Lab combines novel evolutionary equations with machine learning to solve problems in genomic medicine and molecular engineering.

Evolutionary Trace Algorithm

This work began with the goal of identifying protein functional surfaces. For this, we sought to compute evolutionarily important sequence positions and introduced the Evolutionary Trace (ET) method. This algorithm was the first to explicitly use phylogenetic divergences to weigh the importance of amino acid variations across species1. Systematic studies of protein structure and function established that the more important positions cluster structurally2 to reveal functional sites and their specificity determinants3. With improved scalability4, these characteristic ET determinants could then be matched across the structural proteome to predict functions5 and, in favorable cases, even substrates6.
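As a rough illustration of the idea of ranking positions by phylogenetic divergence, the toy sketch below assigns each alignment column the shallowest tree depth at which it becomes invariant within every branch. It is a simplified teaching example, not the published ET implementation: the sequences, the nested partitions standing in for a phylogenetic tree, and the function name `trace_ranks` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy alignment: one aligned sequence per species.
ALIGNMENT: Dict[str, str] = {
    "seqA": "MKLV",
    "seqB": "MKLI",
    "seqC": "MRLV",
    "seqD": "MRFI",
}

# Hypothetical nested partitions, as if a phylogenetic tree were cut at
# increasing depth: level 1 is the whole family, deeper levels split it further.
PARTITIONS: List[List[List[str]]] = [
    [["seqA", "seqB", "seqC", "seqD"]],
    [["seqA", "seqB"], ["seqC", "seqD"]],
    [["seqA", "seqB"], ["seqC"], ["seqD"]],
    [["seqA"], ["seqB"], ["seqC"], ["seqD"]],
]


def trace_ranks(alignment: Dict[str, str],
                partitions: List[List[List[str]]]) -> List[int]:
    """Return, for each column, the first partition level at which the
    residue is invariant inside every group (lower rank = more important)."""
    length = len(next(iter(alignment.values())))
    ranks = []
    for pos in range(length):
        # Fallback: the deepest level is assumed to always qualify.
        rank = len(partitions)
        for level, groups in enumerate(partitions, start=1):
            invariant = all(
                len({alignment[name][pos] for name in group}) == 1
                for group in groups
            )
            if invariant:
                rank = level
                break
        ranks.append(rank)
    return ranks


if __name__ == "__main__":
    print(trace_ranks(ALIGNMENT, PARTITIONS))
```

In this toy family, the fully conserved first column receives the best rank, while a column that varies even between close relatives receives the worst. Mapping the best-ranked positions onto a structure is what would then reveal any spatial clustering.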

For more rotating images of functional sites, visit our sample traces page.

G Protein Signaling

A hallmark of ET development is that it was largely motivated by and applied to collaborative studies to elucidate G protein signaling. Discoveries with Bourne, Wensel, Bouvier, Caron and Lefkowitz included: G protein binding sites7; an allosteric switch in Regulators of G protein Signaling proteins8,9; and transmembrane micro-domains for ligand binding, allosteric triggering, and effector coupling in G Protein-Coupled Receptors10. ET also guided separation-of-function mutations in bioamine receptors, notably decoupling G protein from β-arrestin effector pathways in vitro11, and, later, in mice12. In other work, mutations targeted between the ligand and effector sites rewired a dopamine receptor so that it responded to serotonin, demonstrating allosteric modulation of functional bias13,14.

Together, these ET studies showed how to measure the sensitivity of protein sequence positions to mutations and, as a result, how to spot most sites that mediate function and efficiently guide experimental studies that expose and reprogram molecular mechanisms.

Evolutionary Action Equation

Shifting viewpoint, we next reinterpreted ET as the gradient of the fitness landscape. As such, it couples genotype variations with phenotype variations and makes it possible to solve a differential equation for the Evolutionary Action (EA) of mutations on fitness. This EA is computable for most proteins and organisms, and applies across biological scales: in proteins, EA tends to outperform other methods for scoring the deleterious impact of missense mutations24; in patients, EA correlates with disease morbidity; and in populations, EA explains the distribution of polymorphisms, connecting population genetics to molecular biology15. Clinically, the EA equation also separates head and neck cancer mortality based on p53 mutational severity16, an effect perhaps associated with cisplatin resistance17 and alternative treatment18. EA may also be integrated over patients and pathways25.
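In symbols, the reasoning above can be sketched to first order as follows. The notation is chosen here for illustration: f is an assumed, otherwise unknown function that maps genotype γ to fitness phenotype φ, r_i is the residue at sequence position i, and Δr_{i,X→Y} is the magnitude of a substitution from amino acid X to Y at that position.

```latex
% Genotype-to-phenotype map and its differential
\varphi = f(\gamma), \qquad d\varphi = \nabla f \cdot d\gamma

% First-order effect of a single substitution X -> Y at position i
\Delta\varphi \;\approx\; \frac{\partial f}{\partial r_i}\,\Delta r_{i,\,X \to Y}
```

The sensitivity term ∂f/∂r_i is the quantity that ET estimates. The substitution magnitude must be estimated separately, for example from amino acid substitution odds; that particular choice is an assumption of this sketch.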

Integrative Molecular Biology

Complementary studies focus on machine learning in networks of genes, drugs and diseases. Although fine details matter, such as when negative auto-regulatory feedback affects mutational tolerance and evolvability19,20, they are often lacking. This was so in a study that uncovered a new malarial Glutathione-S-Transferase by diffusing information across a gene interaction network spanning nearly 400 species. This new GST may play a role in pathogenesis and therapy as it degrades a toxic byproduct of metabolism and is inhibited by Artesunate, the best current antimalarial agent21. At even poorer resolution, a network that adapted IBM’s WATSON to molecular biology and text-mined the entire PubMed corpus of abstracts nevertheless identified new p53 kinases, among other automated hypotheses22,23.
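To make the notion of diffusing information across a network concrete, the sketch below spreads a functional annotation from a labeled gene to its unlabeled neighbors by repeated averaging. It is a generic label-diffusion toy, not the compression method of reference 21: the four-gene network, the seed annotation, the mixing weight `alpha`, and the function name `diffuse` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy network: each gene maps to its interaction partners.
NETWORK: Dict[str, List[str]] = {
    "g1": ["g2", "g3"],
    "g2": ["g1", "g3"],
    "g3": ["g1", "g2", "g4"],
    "g4": ["g3"],
}

# Hypothetical prior knowledge: only g1 is annotated with the function.
SEEDS: Dict[str, float] = {"g1": 1.0}


def diffuse(network: Dict[str, List[str]],
            seeds: Dict[str, float],
            alpha: float = 0.5,
            iterations: int = 50) -> Dict[str, float]:
    """Iteratively blend each gene's prior label with the mean score of
    its neighbors; alpha sets how much weight the neighbors receive."""
    scores = {gene: seeds.get(gene, 0.0) for gene in network}
    for _ in range(iterations):
        updated = {}
        for gene, neighbors in network.items():
            neighbor_mean = sum(scores[n] for n in neighbors) / len(neighbors)
            updated[gene] = (1 - alpha) * seeds.get(gene, 0.0) + alpha * neighbor_mean
        scores = updated
    return scores


if __name__ == "__main__":
    for gene, score in sorted(diffuse(NETWORK, SEEDS).items()):
        print(gene, round(score, 3))
```

Genes closer to the annotated seed end up with higher scores, which is how an unannotated gene can inherit a functional hypothesis from its network context.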

Together, these studies suggest that machine learning, text-mining, network analyses and evolutionary equations may soon integrate biological information in light of the genome variations most relevant to a given trait or disease. This will inform studies of the genotype-phenotype relationship across biology and help design individualized therapies based on a patient’s precise and unique genetic location in the human fitness landscape.

To get a comprehensive view of our lab's contributions, explore the core research publications.

To read about our previous research, click here.

References
  • 1 Lichtarge, O., Bourne, H. R. & Cohen, F. E. An evolutionary trace method defines binding surfaces common to protein families. J Mol Biol 257, 342-358 (1996).
  • 2 Madabushi, S. et al. Structural clusters of evolutionary trace residues are statistically significant and common in proteins. J Mol Biol 316, 139-154 (2002).
  • 3 Yao, H. et al. An accurate, sensitive, and scalable method to identify functional sites in protein structures. J Mol Biol 326, 255-261 (2003).
  • 4 Mihalek, I., Res, I. & Lichtarge, O. A family of evolution-entropy hybrid methods for ranking protein residues by importance. J Mol Biol 336, 1265-1282 (2004).
  • 5 Kristensen, D. M. et al. Prediction of enzyme function based on 3D templates of evolutionarily important amino acids. BMC Bioinformatics 9, 17, PMC2219985 (2008).
  • 6 Amin, S. R., Erdin, S., Ward, R. M., Lua, R. C. & Lichtarge, O. Prediction and experimental validation of enzyme substrate specificity in protein structures. Proc Natl Acad Sci U S A 110, E4195-4202, PMC3831482 (2013).
  • 7 Lichtarge, O., Bourne, H. R. & Cohen, F. E. Evolutionarily conserved Galphabetagamma binding surfaces support a model of the G protein-receptor complex. Proc Natl Acad Sci U S A 93, 7507-7511, PMC38775 (1996).
  • 8 Sowa, M. E., He, W., Wensel, T. G. & Lichtarge, O. A regulator of G protein signaling interaction surface linked to effector specificity. Proc Natl Acad Sci U S A 97, 1483-1488, PMC26460 (2000).
  • 9 Sowa, M. E. et al. Prediction and confirmation of a site critical for effector regulation of RGS domain activity. Nat Struct Biol 8, 234-237 (2001).
  • 10 Madabushi, S. et al. Evolutionary trace of G protein-coupled receptors reveals clusters of residues that determine global and class-specific functions. J Biol Chem 279, 8126-8132 (2004).
  • 11 Shenoy, S. K. et al. beta-arrestin-dependent, G protein-independent ERK1/2 activation by the beta2 adrenergic receptor. J Biol Chem 281, 1261-1273 (2006).
  • 12 Peterson, S. M. et al. Elucidation of G-protein and beta-arrestin functional selectivity at the dopamine D2 receptor. Proc Natl Acad Sci U S A 112, 7097-7102, PMC4460444 (2015).
  • 13 Rodriguez, G. J., Yao, R., Lichtarge, O. & Wensel, T. G. Evolution-guided discovery and recoding of allosteric pathway specificity determinants in psychoactive bioamine receptors. Proc Natl Acad Sci U S A 107, 7787-7792, PMC2867884 (2010).
  • 14 Sung, Y. M., Wilkins, A. D., Rodriguez, G. J., Wensel, T. G. & Lichtarge, O. Intramolecular allosteric communication in dopamine D2 receptor revealed by evolutionary amino acid covariation. Proc Natl Acad Sci U S A 113, 3539-3544, PMC4822589 (2016).
  • 15 Katsonis, P. & Lichtarge, O. A formal perturbation equation between genotype and phenotype determines the Evolutionary Action of protein-coding variations on fitness. Genome Res 24, 2050-2058, PMC4248321 (2014).
  • 16 Neskey, D. M. et al. Evolutionary Action Score of TP53 Identifies High-Risk Mutations Associated with Decreased Survival and Increased Distant Metastases in Head and Neck Cancer. Cancer Res 75, 1527-1536, PMC4383697 (2015).
  • 17 Osman, A. A. et al. Evolutionary Action Score of TP53 Coding Variants Is Predictive of Platinum Response in Head and Neck Cancer Patients. Cancer Res 75, 1205-1215, PMC4615655 (2015).
  • 18 Osman, A. A. et al. Wee-1 kinase inhibition overcomes cisplatin resistance associated with high-risk TP53 mutations in head and neck cancer through mitotic arrest followed by senescence. Mol Cancer Ther 14, 608-619, PMC4557970 (2015).
  • 19 Marciano, D. C., Lua, R. C., Herman, C. & Lichtarge, O. Cooperativity of Negative Autoregulation Confers Increased Mutational Robustness. Phys Rev Lett 116, 258104, PMC5152588 (2016).
  • 20 Marciano, D. C. et al. Negative feedback in genetic circuits confers evolutionary resilience and capacitance. Cell Rep 7, 1789-1795, PMC4103627 (2014).
  • 21 Lisewski, A. M. et al. Supergenomic network compression and the discovery of EXP1 as a glutathione transferase inhibited by artesunate. Cell 158, 916-928, PMC4167585 (2014).
  • 22 Spangler, S. et al. Automated hypothesis generation based on mining scientific literature. in 20th ACM SIGKDD international conference on Knowledge discovery and data mining 1877-1886 (ACM, New York, USA, 2014).
  • 23 Nagarajan, M. et al. Predicting Future Scientific Discoveries Based on a Networked Analysis of the Past Literature. in 21st ACM SIGKDD international conference on Knowledge discovery and data mining 2019-2028 (ACM, Sydney, Australia, 2015).
  • 24 Katsonis, P. & Lichtarge, O. Objective assessment of the evolutionary action equation for the fitness effect of missense mutations across CAGI-blinded contests. Hum Mutat doi: 10.1002/humu.23266 PMID28544059 (2017).
  • 25 Cancer Genome Atlas Network. Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma. Cell 169, 1327-1341 (2017).
Short reading list
  • Lichtarge, O., Bourne, H. R. & Cohen, F. E. An evolutionary trace method defines binding surfaces common to protein families. J Mol Biol 257, 342-358 (1996).
  • Sowa, M. E. et al. Prediction and confirmation of a site critical for effector regulation of RGS domain activity. Nat Struct Biol 8, 234-237 (2001).
  • Madabushi, S. et al. Structural clusters of evolutionary trace residues are statistically significant and common in proteins. J Mol Biol 316, 139-154 (2002).
  • Mihalek, I., Res, I. & Lichtarge, O. A family of evolution-entropy hybrid methods for ranking protein residues by importance. J Mol Biol 336, 1265-1282 (2004).
  • Madabushi, S. et al. Evolutionary trace of G protein-coupled receptors reveals clusters of residues that determine global and class-specific functions. J Biol Chem 279, 8126-8132 (2004).
  • Katsonis, P. & Lichtarge, O. A formal perturbation equation between genotype and phenotype determines the Evolutionary Action of protein-coding variations on fitness. Genome Res 24, 2050-2058, PMC4248321 (2014).
  • Lisewski, A. M. et al. Supergenomic network compression and the discovery of EXP1 as a glutathione transferase inhibited by artesunate. Cell 158, 916-928, PMC4167585 (2014).
  • Spangler, S. et al. Automated hypothesis generation based on mining scientific literature. in 20th ACM SIGKDD international conference on Knowledge discovery and data mining 1877-1886 (ACM, New York, USA, 2014).
  • Neskey, D. M. et al. Evolutionary Action Score of TP53 Identifies High-Risk Mutations Associated with Decreased Survival and Increased Distant Metastases in Head and Neck Cancer. Cancer Res 75, 1527-1536, PMC4383697 (2015).
  • Marciano, D. C., Lua, R. C., Herman, C. & Lichtarge, O. Cooperativity of Negative Autoregulation Confers Increased Mutational Robustness. Phys Rev Lett 116, 258104, PMC5152588 (2016).