Friday, April 08, 2005

Protein folding and the benefits of ignorance

First, a bit of background: there are lots of different types of proteins and they [mostly] do different things. Some make chemical reactions go faster, some serve as little machines to help make other proteins, some degrade proteins, some form a cell's "skeleton" etc. The bottom line is that proteins make the biological world go round -- without them, we wouldn't be around, at least not in our present form.

What a protein actually does is, to a good first approximation, determined by its three-dimensional shape. If you know what shape a protein is, you can start to make some guesses about what it does, even if you've never seen the protein before; this is useful when, for example, you've isolated some proteins from a pathogen and are trying to figure out what they do that causes disease. It would also be nice to be able to "custom-design" proteins that have a particular shape so that you can, for example, have them stick to the proteins produced by a pathogen and prevent them from doing bad things.

So -- protein structure matters, for lots of reasons. Unfortunately, it's pretty difficult to determine what a protein's structure really is. The usual route involves stuff like X-ray crystallography, where you shine X-rays on a crystal of the protein, get a bunch of dots as a picture and then do lots of complicated math to try to figure out what structure could have produced those dots. And the hardest thing isn't even the math, it's trying to get the protein to form nice, regular crystals -- that part alone can easily take a couple of years [of a graduate student's life ...].

Given that it's so hard to get the "real" structure of the protein, lots of effort has been [and is being] expended on trying to predict protein shape using computer models.
One way that this is tackled is by trying to simulate a protein going from its initial extended, "cooked noodle"-like shape right after it's been made into the eventual complicated shape it assumes -- this process is called "protein folding". And trying to simulate protein folding is pretty hard. For example, it took 512 months of supercomputer time to simulate one-millionth of a second of actual protein folding a few years ago, for a pretty simple protein. Granted, that was in 1998, so it'd probably take a lot less computer time now, but, c'mon, one-millionth of a second? The fastest-folding protein takes one-thousandth of a second to fully fold, so they basically managed to simulate only one-thousandth of that time. What about proteins that take several seconds to fold? Makes the X-ray crystallography method not look so bad ... unless you're the poor sucker who has to try to get the protein to crystallize nicely.
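
For the curious, here is a rough back-of-the-envelope sketch of that arithmetic in Python. It only reuses the figures quoted above, and it assumes (my assumption, and almost certainly too simple) that simulation cost scales linearly with the amount of folding time simulated, on the same 1998-era hardware. The function and constant names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the post; exact fractions avoid floating-point noise.
MONTHS_SPENT = 512                        # supercomputer-months used in 1998
SIMULATED_SECONDS = Fraction(1, 1_000_000)  # folding time actually simulated
FASTEST_FOLD_SECONDS = Fraction(1, 1_000)   # fastest-folding protein, per the post


def months_needed(fold_seconds):
    """Supercomputer-months to simulate fold_seconds of folding.

    Assumption: cost grows linearly with simulated time (hypothetical model).
    """
    return MONTHS_SPENT * Fraction(fold_seconds) / SIMULATED_SECONDS


# Share of the fastest fold that the 1998 run covered.
fraction_covered = SIMULATED_SECONDS / FASTEST_FOLD_SECONDS

# Cost of simulating the whole fastest fold, and a three-second fold.
full_fold_months = months_needed(FASTEST_FOLD_SECONDS)
slow_fold_months = months_needed(3)

print("fraction of fastest fold covered:", fraction_covered)
print("months for the full fastest fold:", full_fold_months)
print("years for the full fastest fold:", float(full_fold_months / 12))
print("months for a 3-second fold:", slow_fold_months)
```

Under that linear assumption, the full one-thousandth-of-a-second fold would cost 512,000 supercomputer-months, which is a little over 42,000 years. A protein that takes a few seconds is not worth thinking about.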

In any case, I've now sat through three lectures that deal with how to predict the 3-d structure of proteins and I think my reaction is best summed up by what my niece Chloe said to the pastor when he asked her what she thought of the service: "Booooring". It's not that the professor is bad, it's that I find the material rather uninteresting. I think there are a couple of reasons for this. The most important one is probably that trying to figure out exactly what shape a protein is is a bit too far down in the details for me. Returning to my favorite "living things as machines" analogy, it's kind of like trying to figure out whether a bolt in the machine is 2.5cm or 2.51cm long and which direction its thread runs ... I'm much more interested in understanding what big piece that bolt is a part of, and how that piece interacts with the other big pieces. The other part of it is that a lot of these algorithms involve a level of physics and chemistry that's just way beyond me [and the other people in the class], so a lot of the lecture slides have basically consisted of complicated-looking equations that are explained in a hand-waving way by the professor because we're not really supposed to understand them anyway. Not something that inspires me to be very engaged.

So, in summary, computational structural biology is an important thing for somebody to do, as long as that somebody isn't me.

The ironic thing about this is that one of the things that first got me interested in the intersection of computer science and biology was IBM's Blue Gene supercomputer, built expressly to help model protein folding. It's a good thing I didn't know then what I know now [i.e., that I find computational structural biology boring], or I might have been turned off by the whole idea and never gone down the computational biology grad school path. Sometimes, ignorance is bliss.

[One question that occurred to me today as I thought about this was -- why is it called Blue Gene if it's concerned with protein folding? Shouldn't it be Blue Protein? I guess that wouldn't be quite as catchy a name ...]

