book reviews
Structural Biology Books
Reviewed by T. Nelson
This is a book written by a group of eminent biophysicists, all of whom are big names in the field, telling us why their theories don't work.
The Human Genome Project created a gold mine for biomedicine. We now know vastly more about genetic diseases than before. We have found over 50 million protein sequences. In one project (headed by Craig Venter, in case you're wondering what he's up to these days), scientists are sequencing random DNA every 200 miles in the ocean, and discovering 1.3 million new genes and 50,000 new species in each barrel of seawater. So we know what proteins are out there; what we don't know is what they do.
Take my project. In my field, scientists have discovered hundreds of mutations in half a dozen proteins, all of which cause the same terrible disease that will ultimately kill 40–50 million people in the most horrific way imaginable. What we'd like to do is put those proteins in the computer and have it tell us what they do that's different from the normal protein: does it interact with something it's not supposed to, and if so what is the affinity? Or if not, then what the heck is it doing?
To do that, we need an accurate three-dimensional structure. That would tell us what's happening around the loops that join different regions of the protein. But even calculating the 3D structure of a protein from its amino acid sequence is a major unsolved problem. Right now, the field of protein structure prediction is split between those who are trying to use physics to model the protein ab initio and those who are trying to use homology to proteins whose structure is known from X-ray crystallography. Both have their advocates.
We've all heard, for example, about the Baker lab's Rosetta@home project and Stanford's Folding@home, which distribute the enormous number of calculations across the Internet. Biochemists know about Zhang's I-TASSER, which uses a combination of ab initio and homology searches. Some of the physics guys are even talking about improving accuracy by adding quantum mechanics, which would raise the computational cost enormously. But we can't even be sure that any amount of computer power will solve the problem. What if, for example, most proteins don't just fold up by themselves when you dissolve them in water, like ribonuclease A does, but require specialized fold-em-up-type proteins to make it happen? If so, all the simulated annealing and molecular dynamics algorithms in the world won't solve the structure.
The title of this book is misleading: it won't help you use bioinformatics to identify protein function. It does give you ideas on what's the best bioinformatics package to use, but not how to get it running (Rosetta takes a whole day just to compile; the molecule viewer Chimera has been upgraded: it no longer just hangs X.org; the latest, improved version crashes the Linux kernel itself). And like all scientific books, the writing is dry, dry, dry (Lawrence Kelley's article on fold recognition being a welcome exception) and technical, technical, technical. Almost every sentence is broken up with big clumps of literature citations in name-year format, which the reader just has to learn to skip over.
The articles have interesting color illustrations, and there are many interesting facts. But mostly what this book will teach you is that somebody else out there is facing an even more insurmountable problem than you. And that is really good to know. Now at least we have a list of who to blame.
apr 29, 2018