Thursday, April 26, 2007

The PNAS paper that inspired junk

The paper referenced in this SciAm article was finally online today in PNAS Early Edition. It's a neat little paper, and I think the SciAm write-up (minus the 'junk DNA') did a nice job summarizing it. Two complaints, still.

  1. '10,000 new uses for junk DNA!' was used out of context. 10,000 sounds like a lot. It's 0.4% of our genome.
  2. It missed a cool point I think pop culture readers might like (but maybe it is too much for a short article).
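To put point 1 in perspective, here is a quick back-of-envelope calculation. The genome size and element count are the round numbers from above; the per-element length is just what falls out of the division, not a figure from the paper.

```python
# Back-of-envelope check on the 0.4% figure.
# Genome size and element count are round numbers; per-element length is inferred.
GENOME_BP = 3.2e9        # approximate human genome size, in base pairs
N_ELEMENTS = 10_000      # newly implicated repetitive elements
FRACTION = 0.004         # 0.4% of the genome

total_bp = GENOME_BP * FRACTION
avg_len = total_bp / N_ELEMENTS
print(f"{total_bp:,.0f} bp total, ~{avg_len:,.0f} bp per element")
# prints: 12,800,000 bp total, ~1,280 bp per element
```

So "10,000" works out to roughly 12.8 megabases, a sliver of the genome.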
There are lots of ways to compare the genomes of humans and our nearest relatives-- you can sequence everybody and align as much as your computer can handle to make a phylogenetic tree (HAHAHA! Good luck!), or you can cut your computer (and yourself) some slack and only compare sequences of protein coding regions or specific proteins to generate a tree, or you can translate the mRNA from protein coding sequences into amino acid sequences for comparison... etc etc etc.
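The "compare protein sequences" route boils down to counting matching positions in an alignment. A minimal sketch, assuming two already-aligned, gap-free toy amino acid sequences (the strings below are invented for illustration, not real human/chimp data):

```python
def percent_identity(a, b):
    """Fraction of aligned positions that match (gap-free toy version)."""
    assert len(a) == len(b), "sequences must be pre-aligned"
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)

# Toy aligned amino acid sequences (made up for this example).
human = "MKTAYIAKQR"
chimp = "MKTAYIAKHR"
print(percent_identity(human, chimp))  # 90.0
```

Real pipelines do the hard part, the alignment itself, with tools like BLAST or ClustalW before any identity score is computed.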

So say you're lining up all these nucleotide and amino acid sequences and you find a gene we have in common with chimpanzees that's actively transcribed in, say, neurons. You can blow up some human neurons and some chimpanzee neurons and figure out how often the gene we have in common is transcribed by using a microarray.
Turns out that even though we have 98-99% of our protein coding regions in common with chimpanzees, a BIG difference between the two of us is the transcription rates of these genes. Sometimes the in-common-gene is up-regulated in humans, compared to chimpanzees. Or down-regulated. Or transcribed at the same rate.
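The up/down/same call from a microarray comparison is usually made on the log2 fold change between the two signals. A minimal sketch with made-up intensity numbers (the 2-fold cutoff is a common convention, not a value from this paper):

```python
import math

def regulation(human_signal, chimp_signal, threshold=1.0):
    """Classify a shared gene by log2 fold change.

    threshold=1.0 corresponds to a 2-fold difference, a common cutoff.
    Signals are toy microarray intensities, not real data.
    """
    log2fc = math.log2(human_signal / chimp_signal)
    if log2fc > threshold:
        return "up-regulated in humans"
    if log2fc < -threshold:
        return "down-regulated in humans"
    return "similar rates"

print(regulation(800, 150))  # up-regulated in humans
print(regulation(100, 90))   # similar rates
```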

You can also study the actual process of transcription with a neato technique called ChIP-on-chip. This assay is used to find the proteins that help transcribe genes, promoters, transcription enhancers or repressors, etc. This paper demonstrates that a portion of unexplored DNA affects the transcription rates of genes associated with system development, nervous system development, and transcription regulatory activity itself.
Minor problem: we ignore repetitive DNA in ChIP-on-chip assays.

Well, that's not entirely unexpected, considering that there is a lotta genome to explore. However, the paper's authors gave a rather snippy reason for why we 'ignore' repetitive elements in ChIP-chip assays:
The majority of current whole-genome experimental and computational approaches to gene regulation, such as tiling arrays used on ChIP-chip experiments, and transcription factor-binding site prediction, choose to ignore repetitive regions, for pragmatic reasons, assuming that most if not all are inert.
That sounds curiously similar to a Creationist Claim ("Evilutionists don't study 'junk DNA' because they think it's all garbage, but every base pair is sacred!").

A more reasonable explanation is that ChIP-on-chip assays are friggen expensive, and including a few million unexplored sequences in your assay is not only cost-prohibitive, but a lesson in futility. It would be a fishing expedition, and you probably couldn't wade through all the data if you worked on the same assay for years. I mean, we're thinking about doing what they're suggesting with HIV-1, but HIV is only ~10 kilobase pairs long. The human genome is ~3.2 billion base pairs long. Little bit of a difference, there.
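Just how big is that "little bit of a difference"? A one-liner with the approximate sizes from above:

```python
# Scale difference between the two genomes (both sizes are approximate).
HIV_BP = 10_000            # HIV-1 genome, ~10 kilobases
HUMAN_BP = 3_200_000_000   # human genome, ~3.2 billion base pairs
print(f"human genome is ~{HUMAN_BP // HIV_BP:,}x larger")
# prints: human genome is ~320,000x larger
```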

Now, after more of these regulatory regions are fleshed out, certainly they will be used. But I think that 'conclusion' in the discussion was a touch harsh.

(link for evolgen)
