Tuesday, May 15, 2012

Numbers numbers everywhere.

So I've finally got some data back from my pilot experiment. I've spent the last couple of days processing it. And doing it wrong. Which is not a bad thing; it's been quite nice to finally have some real data to play with.

The data that I got back was in the form of short reads. We take our sample cells, grind them up and extract the RNA, which is then sent away for sequencing. The sequencing process chops all the RNA up into little pieces and we end up with a large number of very small sequences, or reads. In my case, roughly 59,000,000 per sample. And I submitted 12 samples. Not as daunting as it sounds; automation is a wonderful thing.

After doing some basic quality control on the sequences, I've been using an application called bwa to compare each of those reads to the sequences of genes from kiwifruit and PSA that we know exist. (bwa uses a Burrows-Wheeler transform; for those of you that are of a geeky mindset, have a look, it's an interesting idea.) After that though, I'm getting bugger all reads matching the genes - they're not even registering at the level of background noise. So I'm obviously doing something wrong. Which, believe it or not, I don't mind. Now that I have an actual problem to solve, there is much fun to be had, he said. <grin>
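
For the geeky among you, here is a minimal sketch of the Burrows-Wheeler transform itself. It is a toy illustration only, and not how bwa does it internally (real aligners build a compressed index instead of sorting every rotation). The function names `bwt` and `inverse_bwt` and the `$` end marker are assumptions made up for this example.

```python
def bwt(text, end_marker="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if end_marker in text:
        raise ValueError("text must not contain the end marker")
    s = text + end_marker
    # Every cyclic rotation of the string, in sorted order.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, end_marker="$"):
    """Rebuild the original text by repeatedly prepending the transform and sorting."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The row that ends with the marker is the original string plus the marker.
    original = next(row for row in table if row.endswith(end_marker))
    return original[:-1]


if __name__ == "__main__":
    print(bwt("banana"))                    # annb$aa
    print(inverse_bwt(bwt("GATTACA")))      # GATTACA
```

The interesting part is that the transform is reversible and tends to group identical characters together, which is what makes it useful for both compression and fast searching.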

