While I ended up in a different field, the lessons I learned from structural biology helped me better appreciate how the conformation of a protein contributes to our understanding of its function in the cell and organism.
All biology students have heard of the central dogma, which states that genetic information (stored in DNA) is transcribed into messenger RNA and then translated into proteins by ribosomes. What isn’t emphasized is how many decades of research it took to arrive at the principle we now take for granted. Ever since Christian Anfinsen’s landmark work connecting a protein’s amino acid sequence to its biologically active conformation, which was recognized with the 1972 Nobel Prize in Chemistry, we have had a greater appreciation for how structure leads to function. Although folding is more complicated than just letting thermodynamics do its work, a properly folded protein is critical to normal cellular function and organismal health.
The easy part of the project, now that the Human Genome Project has been completed and technology has advanced, is determining the amino acid sequence from the candidate gene. The hard part, as many have discovered, is figuring out how that sequence folds into the final, biologically functional conformation. Assuming the protein can be expressed and purified in large enough quantities to be processed into a crystal, the next step is to use techniques such as X-ray crystallography, nuclear magnetic resonance (NMR), mass spectrometry, and cryo-EM to piece together enough data to generate a high-resolution structure [2]. As you can imagine, many of these methodologies are resource-heavy, labor-intensive, and time-consuming, so like many folks who would prefer to do literally anything else, structural biologists are always looking for a shortcut.
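To show just how easy the "easy part" is, here is a minimal Python sketch that goes from a coding DNA sequence to an amino acid sequence using the standard genetic code. The function names (`transcribe`, `translate`) and the toy sequence are made up for illustration, and the sketch assumes the input is the coding strand, already in frame and starting at the start codon. A real gene would also need introns removed first.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))  # '*' marks a stop codon


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(coding_dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids.

    Translation stops at the first stop codon. Codons that are not in the
    table (for example, ones containing N) are reported as 'X'. Any
    incomplete codon at the end is ignored.
    """
    seq = coding_dna.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    toy_gene = "ATGGCCAAGTGGTAA"  # hypothetical sequence, not a real gene
    print(transcribe(toy_gene))
    print(translate(toy_gene))
```

Getting that string of letters takes a few lines of code. Knowing what shape it folds into is the part that has kept structural biologists busy for decades.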
With the accelerated growth of computing power, researchers began to develop computational modeling methods to bridge the gap between spotty experimental data and a more refined structure [2, 3]. The structural biologist’s dream is to be able to feed an amino acid sequence into a machine, turn the crank, and have it spit out a structure. In a bid for further progress, as well as to stroke some egos, research groups have participated in the Critical Assessment of Structure Prediction (CASP) competition to display their newest prediction algorithms and strategies. Protein structure prediction strategies are split into template-based modeling (TBM), which uses reference structures deposited in the Protein Data Bank (PDB), and template-free modeling (FM), which does not rely on existing structures [2, 3]. These methods rely on machine learning and probabilistic models, with known physics and chemistry principles thrown in, to fit the structure [3].
The proficiency of these models has improved over the nearly three decades since CASP started, and in 2021, two major papers described new platforms that are freely accessible and predict structures with unprecedented accuracy [4, 5]. Science magazine recognized these accomplishments as its 2021 Breakthrough of the Year. The new platforms can compute structures in a fraction of the time needed for a crystallography experiment, and they allow fellow scientists to keep contributing to the broad knowledge base without barriers.
Despite the imperfections, the freely accessible software and ease of use provide plenty of avenues for advancement going forward. I hope that this encourages the scientific community to continue to be collaborative, to keep as much of science open access as possible, and given the nature of these experimental methods, to incorporate coding and statistical analysis into their research. Science is better when we can all share our stories, and this breakthrough has made it easier for us to exchange knowledge for the betterment of humanity.
References