While I was away…

View of Whitby

I’ve just returned from a week away in North Yorkshire, and scanning through my RSS backlog and FriendFeed from while I was away, I notice a few interesting developments.

  1. Zotero made a firm announcement of a standalone version of their excellent open-source reference management tool. I’ve been keen on Zotero for a while, but have moved away from Firefox as my browser lately (at present, Zotero is only available as a Firefox plugin). I’m looking forward to using it in earnest again, and trying to integrate it more fully into my workflow (which has been the main problem with other reference managers I’ve tried). No release date yet that I’ve found (sadly).
  2. Nodalpoint, the bioinformatics blog of old, has been reincarnated as a podcast. I love podcasts, and listen to a whole bunch of them (to the point where I don’t have much of a chance to listen to music on my commute any more), but there aren’t many around that are relevant to what I do with most of my time (apart from the excellent c2cbio podcast, of course). I’m really looking forward to listening to episode one, and hope more follow in due course.
  3. The people behind the ‘Science is Vital’ lobby group, as well as organising a public rally and a lobby of Parliament, have set up a petition. I urge everyone (in the UK) to go and sign it. If we don’t try to protect science in some degree in the forthcoming round of spending cuts, no one else will.

Telomerase – make your skin immortal!

I know that the beauty industry has made a habit of twisting science somewhat for its own ends (see this and this for instance), but this one takes the biscuit.
The wife spotted a piece in Harper’s Bazaar while she was in the hairdressers yesterday, about an amazing new beauty treatment (the article itself is hard to link to, but it’s number 3 in the list of “9 Skin Secrets for Spring“). Injections of telomerase for $1,500 a pop. Apparently it ‘stimulates resting stem cells’. Obviously the Harper’s piece has guff about it being Nobel-prize winning technology.
Telomerase is an enzyme that adds DNA repeats to the ends of chromosomes. Without this activity, the telomeres get progressively shorter until the “Hayflick limit” is reached and the cell stops dividing, or undergoes programmed cell death (there’s a reasonable review of the role of telomerase here:
Now I’m no expert, but as far as I know, telomerase is turned off in normal somatic cells, and telomerase activity has been associated with up to 90% of cancers (even its Wikipedia entry will tell me this much; a rather old paper with some concrete figures can be found here:). I’m not suggesting for a second that injecting telomerase will give you cancer (the overwhelming probability is it will do nothing at all), but this seems to be an amazing example of abusing science in the name of ‘beauty’.
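The telomere arithmetic behind the Hayflick limit can be sketched with a toy model. Everything below is illustrative (the lengths, the loss per division, and the function itself are my inventions, not measured values); the point is just that without telomerase the division count is finite, and with it the lineage never hits the limit.

```python
def divisions_until_senescence(telomere_bp=10_000, loss_per_division=75,
                               critical_bp=5_000, telomerase_active=False,
                               cap=1_000):
    """Count cell divisions before telomeres shorten to a critical length.

    Each division erodes the telomere; if telomerase is active the repeats
    are re-extended, so there is no net loss and the lineage would divide
    forever (capped at `cap` divisions to keep the demo finite).
    """
    divisions = 0
    while telomere_bp > critical_bp and divisions < cap:
        telomere_bp -= loss_per_division
        if telomerase_active:
            telomere_bp += loss_per_division  # net shortening cancelled
        divisions += 1
    return divisions
```

With these (made-up) numbers, a normal lineage stops after a few dozen divisions, while the telomerase-active lineage runs straight into the cap.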

Defining absolute protein abundance

At the heart of Systems Biology is a vast hunger for measurements: mRNA abundance, metabolite concentration, reaction rates, degradation rates, protein abundance. This last measurement has long been problematic for researchers. Mass spectrometers get increasingly accurate and powerful, but are still hindered by the simple fact that observed signal intensity does not necessarily correlate directly with the abundance of that peptide in the sample. Factors such as peptide ionisation efficiencies, dominant neighbour effects, and missing observations all give rise to erroneous estimates of peptide quantities. Until recently, the best way to get close to measures of protein abundance was to use a peptide tagging methodology, but these are typically expensive, and provide only relative quantification (useful for expression proteomics studies, less useful if you need to know the absolute levels of a protein for a Systems Biology study).

Recently, a three-step method has been proposed for determining the absolute quantities of proteins in the cell, on a proteome scale. Step one is isoelectric focussing of tryptic digests of whole-cell extracts. Step two is calculating the absolute abundance of a small group of proteins by Selected Reaction Monitoring (SRM). SRM uses spiked-in, isotopically labelled peptides of known concentration as references to calculate the actual abundance of peptides of interest. Finally, step three uses these abundances as reference points to calculate the abundance of all proteins in the sample, using the median intensities of the three most intense peptides for each protein.
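A minimal sketch of step three, under assumptions of my own (the function names, the toy numbers, and the simple proportional calibration are mine, not taken from the paper): use the SRM-quantified anchor proteins to fit a single signal-per-abundance factor, then scale every other protein’s top-three peptide signal by it.

```python
from statistics import mean, median

def top3(intensities):
    """Median intensity of the three most intense peptides of a protein."""
    return median(sorted(intensities, reverse=True)[:3])

def estimate_abundances(anchors, proteome):
    """Estimate absolute protein abundances from peptide intensities.

    anchors:  protein -> (peptide intensities, SRM-determined abundance)
    proteome: protein -> peptide intensities

    Fits one signal-per-unit-abundance factor k from the anchors, then
    converts each protein's top3 signal into an abundance estimate.
    """
    k = mean(top3(i) / a for i, a in anchors.values())
    return {p: top3(i) / k for p, i in proteome.items()}
```

For example, two anchor proteins whose top-three signal is ten times their SRM abundance give k = 10, so a protein with a top-three signal of 20 is estimated at an abundance of 2.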

Leptospira interrogans (Wikimedia Commons)

Using this methodology, the abundances of >50% of the proteome of a human pathogen (Leptospira interrogans) have been determined to an accuracy of ~2-fold. These abundance measurements were confirmed by (almost literally) counting the number of flagellar proteins present in a cell by cryo-electron tomography.

Although current hardware probably limits this technique to a few thousand proteins, that is still a big step forward on what was previously possible. If whole proteome scale absolute abundance measurements become an achievable reality, maybe proteomics can finally take on microarrays as the dominant technique in the post genomics world.
Malmström, J., Beck, M., Schmidt, A., Lange, V., Deutsch, E., & Aebersold, R. (2009). Proteome-wide cellular protein concentrations of the human pathogen Leptospira interrogans. Nature, 460 (7256), 762-765. DOI: 10.1038/nature08184

“Peer review does not guarantee quality”

I am still catching up on my podcast backlog after my two-week holiday in August. The excellent ‘More or Less’ provided the gem of a quote in the title during a discussion about meta-analyses.

Professor Stephen Senn was explaining why careless mathematics can distort the results of a meta-analysis (things like including a prior meta-analysis amongst your data sets can lead to double-counting – see this paper). The presenter, Tim Harford, suggested that surely this is a problem easily fixed. A reader spots an error in a published meta-analysis, contacts the journal and a correction ensues. A suggestion that was quickly knocked back by Prof Senn. The problem, as he sees it, is that we have no culture of correction; that peer reviewed results are considered irreproachable.
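The double-counting trap is easy to show numerically. The sketch below uses textbook fixed-effect inverse-variance pooling (my own worked numbers, not the show’s example): feeding a prior pooled estimate back in alongside its component studies leaves the effect unchanged but spuriously halves the variance, overstating the certainty of the result.

```python
def pool(estimates):
    """Fixed-effect inverse-variance meta-analysis.

    estimates: list of (effect, standard_error) pairs.
    Returns (pooled effect, pooled standard error).
    """
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    return pooled, sum(weights) ** -0.5

studies = [(0.4, 0.2), (0.6, 0.2)]
honest = pool(studies)
# Double-counting: the pooled result re-enters the analysis as if it
# were an independent study alongside the studies it was built from.
double_counted = pool(studies + [honest])
```

Both analyses report the same pooled effect (0.5), but the double-counted one claims a smaller standard error, which is exactly the distortion Senn was describing.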

Surely peer review offers some guarantee of quality, suggests Harford? “Peer review is of minimal value” is the response: “…checkability is what really guarantees quality”. Senn goes on to suggest that scientists sign an undertaking to provide raw original data to anyone who requests it.

This was the clearest argument I’ve heard, not against peer review, but for the availability of raw data, and for post-publication quality control on a grand scale.

This multi-eyes approach to quality checking, post-publication, is familiar from somewhere…

Charles Minard's 1869 chart showing the losses in men, their movements, and the temperature of Napoleon's 1812 Russian campaign.

The same edition of the show had a section on data visualisation, and brought the ‘Napoleon’s March’ graphic to my attention. I had not previously been aware of this ‘infographic’, produced in the mid-19th century.

From eczema to asthma (in mice)

Eczema and asthma often co-occur; indeed, I suffer from both (albeit mildly). What I wasn’t aware of was that eczema often comes first: asthma has an underlying rate of 4-8% in the general population, but 70% in individuals with a history of chronic severe eczema. The underlying mechanism for this so-called ‘atopic march’ isn’t known, though work published today in PLoS Biology elucidates a possible mechanism.

Researchers genetically engineered mice with chronic skin barrier defects (mice lacking Notch signalling in the skin, leading to impairment of epidermal differentiation), which exhibit an eczema-like skin condition. They then used these mice to demonstrate the predisposition of such affected individuals to allergic asthma. Occurrence of allergic asthma was 7-fold higher in the mutant mouse population, compared to a wild-type population.

The authors then went on to demonstrate that a cytokine called thymic stromal lymphopoietin (TSLP), which is secreted by the damaged skin into the circulation, is required for atopic march in the mutant mice. They show that by knocking out the TSLP receptor in these mice, they can prevent atopic march. They also show that over-production of TSLP in the skin is sufficient to cause allergic asthma, regardless of the cause of that over-production.

This is a paper a little outside my areas of expertise, which is why this is much more of a skim overview than normal. However, there is clearly good work being done here elucidating the molecular mechanisms of a very common disease process. There are also clear implications in this paper on the future management and treatment of eczema and asthma patients. Even though this is unlikely to improve my own experiences of these conditions, I’m very happy this kind of work is being done.

Demehri, S., Morimoto, M., Holtzman, M., & Kopan, R. (2009). Skin-Derived TSLP Triggers Progression from Epidermal-Barrier Defects to Asthma PLoS Biology, 7 (5) DOI: 10.1371/journal.pbio.1000067

Nature Methods

I love my free Nature Methods subscription. It allows me to get my hands on a paper journal, which I rarely get to do these days, and the content is actually pretty marvellous.

This month there’s a new technique for enzymatic assembly of DNA molecules from the Venter Institute, a standardised methodology for proteomics sample preparation, and a great technology feature from Nathan Blow about new proteomics techniques, including surface plasmon resonance (about which I knew nothing before today). Not to mention cool pictures of mice having light shone on their brains.

You can still apply for a free subscription, and if you are eligible to do so (individuals in North America and Europe involved in research within the life sciences or chemistry), I would urge you to.

IET BioSysBio 2009

Frank and Dan have already blogged about this year’s BioSysBio conference in Cambridge (23rd-25th March). I just thought I’d add my thoughts to theirs.

I don’t get to go to many conferences. The nature of my work doesn’t really demand it, but about once a year it does me good to reconnect with some cutting edge science, and get a good idea of developments in the field as a whole.

Before now, ISMB has been the conference of choice, as the largest gathering of bioinformatics types, it certainly was the obvious one. But in recent years it has become a cumbersome beast. Multi-tracked and vast, hard to pin down stuff you want to hear, often disappointing when you do find something. So this year we cast about for something smaller and fresher. We had heard good things about BioSysBio last year, and it certainly looked promising, so we made our decision.

And boy, was it the right decision. The conference is small enough to be single track, so there were very few choices to make in terms of which talks to attend (actually there were none: the only parallel session was the workshops on the Tuesday afternoon, and I was obliged to be at the ONDEX one, since I was helping out). This meant that instead of skipping between halls, missing bits and pieces of talks, and sometimes not bothering at all, I sat in one place, pretty much for three days straight, and listened to everything.

Highlights were the ethics and biosecurity debate, with a fabulously engaging talk from Drew Endy; showcases of the importance of transcription initiation and elongation from Marko Djordjevic and Andre Ribeiro; an excellent Synthetic Biology talk from a man apparently inspired by the iGEM competition, Philip LoCascio; and a couple of excellent videos of lab robots hard at work (Adam the Robot Scientist, and another in the final paper talk of the conference by T Ben Yehezkel).

Wordle of #biosysbio tweets

Wordle of #biosysbio tweets

Next year I would happily micro-blog the conference again. This was my first conference since I joined Twitter and FriendFeed, and I was unsure about how I (and my followers) would feel about really going hard at the live updating of the conference experience. I think, though, that those of us who tweeted gave those who could not attend an idea of the content being presented. Judging from the feedback we received, and the fact that not a single person unfollowed me in the three days, we were providing a useful service. It has also provided me with a useful resource: a set of notes on the event produced by a crowd, not just me. Search for #biosysbio to see what I mean. Oh, and no review of this conference would be complete without a mention of Ally’s blogging, in which she chronicled pretty much every single talk, except her own (I did that one!)

I do think that for future events I would create threads on FriendFeed for each talk, and group my thoughts about it there, then tweet the URL of the FriendFeed post – this might make things a little less noisy.

Coming back from a conference feeling exactly how you should feel, refreshed, invigorated and excited to get on with your own work, is a great thing. For this feeling alone I will be returning to BioSysBio next year.

Saint: A lightweight SBML annotation integration environment

Allyson Lister

CISBAN, Newcastle University

This post is an homage to Ally’s own herculean note taking style, since she can’t blog her own talk.

Saint has been developed to help modellers get information into their SBML models really quickly. Ally shows a picture of a model describing neuromuscular junctions (a standard biomodel). This model contains terms, which are descriptions, and the mathematical model. The maths doesn’t know anything about the underlying biology. For example, actin is just a label; there is no implicit knowledge contained in that label (i.e. that actin is a protein, involved in the cytoskeleton, etc.).

Short intro to SBML: SBML is a standard format, which is widely used. It stores the maths and enables linking to the underlying biology.

So what do we know about actin?

  • it’s a protein (UniProt)
  • interactions? (Pathway Commons, STRING)
  • reactions and parameters (SABIO-RK, BRENDA, KEGG)
  • vocab (SBO, GO)

Now we can use the MIRIAM standard to annotate the model with the above information.
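As a concrete illustration of what such an annotation looks like (the species and the UniProt accession here are made up for the example, and this hand-written fragment is only a sketch of the MIRIAM/RDF scheme, not output from Saint), very little code is needed to pull the resource URIs back out of an SBML species using only the Python standard library:

```python
import xml.etree.ElementTree as ET

# A minimal SBML species carrying a MIRIAM-style RDF annotation that
# links the bare label "actin" to a UniProt record (accession illustrative).
SBML = """<species xmlns="http://www.sbml.org/sbml/level2/version4"
          id="actin" name="actin">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#actin">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:uniprot:P60010"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>"""

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def annotations(species_xml):
    """Return the MIRIAM resource URIs attached to a species element."""
    root = ET.fromstring(species_xml)
    return [li.get("{%s}resource" % RDF_NS)
            for li in root.iter("{%s}li" % RDF_NS)]
```

The biology qualifier (bqbiol:is) is what states *how* the resource relates to the species, which is exactly the relationship Saint asks the user to confirm.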

When building a model, you need to add info to things like species, name, reaction and compartment.
Annotation and SBO terms sit between the model and the biological information; these can be used to retrieve the information from the databases. Currently this has to be done manually, which is hard, and it is often not done exhaustively, or even at all.

Saint enables automation of this procedure. It already links to a number of data sources – MIRIAM, UniProt, STRING, SBO, Pathway Commons. Reduces effort on the part of the modeller. Saint is lightweight and easy-to-use. Useful as a first pass annotation tool, or to add annotation to an existing model.

How Saint works:

  • import SBML into Saint
  • Saint then searches for appropriate annotation
  • and presents this annotation, allowing you to accept or reject the changes

Ally is using a model produced by Carole Proctor in CISBAN as an example run-through of Saint.

Saint does some validation via libsbml on import of an SBML model. The tool then presents a list of species found in the model; these can be hidden if you don’t want to retrieve information on them. Zoom into ‘Ctelo’ for an example: a plus next to the name of the species shows the annotation already available in the SBML model (‘known’ information). So we can see that Ctelo is a capped telomere. You can decide which species you want to annotate, and which data sources you want to retrieve that annotation from.

Queries are made to data sources by a Master Asynchronous Query Service; once information becomes available, it is immediately visible in the UI (as an ‘inferred’ tab), and you see a ‘New Annotation found’ message. Once Saint has retrieved annotation, the user can choose which annotation to keep, and how this information links to the species in the model (is, isPartOf, etc. – MIRIAM qualifiers).
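The “fire all queries at once, surface each result the moment it lands” pattern behind that asynchronous service can be sketched like this (a hypothetical stand-in using Python’s concurrent.futures, not Saint’s actual implementation; the `query` function is a dummy for a remote lookup):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def query(source, species):
    """Stand-in for a remote annotation lookup; a real implementation
    would call the data source's web service."""
    return source, "annotation for %s from %s" % (species, source)

def annotate(species, sources):
    """Run one query per data source concurrently and yield each result
    as soon as it completes, rather than waiting for the slowest source,
    so a UI can display annotations the moment they arrive."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(query, s, species) for s in sources]
        for done in as_completed(futures):
            yield done.result()
```

Because `as_completed` yields futures in completion order, a fast source like a local vocabulary never has to wait behind a slow remote interaction database.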

CDC13 = polypeptide chain, nuclear telomere cap complex, protein binding, single-stranded telomeric DNA binding, telomerase inhibitor activity.

Future work – more data sources, use of species type, better support for non-systematic names, adding software source attribution, incorporation of SBGN (Systems Biology Graphical Notation) for better display.

Personal comments – good job Ally – hope I did it justice!

Ally’s standard disclaimer:

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else’s. I’m happy to correct any errors you may spot – just let me know!

Save the Scientist, Save the World?

Gordon Brown has already saved the world once, but it didn’t take. So the world needs another solution. I humbly suggest that the way to help save the economy, of Britain at least, is to invest heavily in Science and Technology. In the following I try to justify this as anything other than pure selfishness.

Science is very often one of the first casualties of government spending cuts in a recession. This is because it is seen as a luxury, a good-time frippery that is difficult to justify when times are hard. The reverse should be true: science and technology investment is not disposable, because it generates future income and forms the basis of a future successful economy.

The economy of this country has, for a long time now, been based on the service sector. This keeps people employed, which drives the economy, because employed people buy things. But we no longer produce anything of note; we don’t generate significant external input into our economy – except through the financial sector… and I think everyone knows what happened there by now. We reached a point where confidence amongst those employed in the service sector collapsed, so they stopped buying things. The service sector then looked to the financial sector for support, but the financial sector, to all intents and purposes, no longer existed, so the service sector began collapsing upon itself. This is self-perpetuating: it leads to job losses, which lead to less buying of things, which leads to further job losses… and so on. (I realise this is a gross simplification of the real situation, and not 100% accurate, but it is pretty close to the real thing, and makes my point.)

The government has declared its intention to follow a Keynesian approach and spend its way out of recession, taking upon itself the responsibility of injecting the cash that the economy needs to rebuild itself. This is a well-recognised approach, and has merit; the new investment has to come from somewhere, and no institution has the borrowing power of the government. However, we (as a nation) must be able to recover this investment at some future point. This means we have to create wealth that is not already in the system. We have to make something that the rest of the world wants to buy.

So, invest in science, engineering and technology. Reverse the decline in these disciplines, the unpopularity of Maths and Physics in the classroom, and the haemorrhage overseas of the talent we do have. Make our innovation the product that the rest of the world buys. Funding research keeps the current generation of innovators employed (the selfish bit), and creates new opportunities for the next generation. And not just for those lucky enough to have the education to pursue this route: infrastructure is needed to surround research. Newcastle University is one of the largest employers in the North East.

At this point I am clearly in danger of getting carried away, so it’s probably best to wrap up. Since I started writing this particular perma-draft, many things have happened. Gordon Brown spoke in Congress about the need to ‘educate our way out of the downturn, invest and invent our way out of the downturn and re-tool and re-skill our way out of the downturn.’ The US stimulus package has promised vast investment in science and technology. And just today President Obama unfroze research into stem cells in the US. All of these are obviously good things; let’s hope the momentum can be maintained, and the doom merchants don’t win.

Fixing Proteomics

I’ve only just discovered the Fixing Proteomics Campaign, thanks to a post on FriendFeed. It’s an initiative that I probably should have known about before, since it appears to originate, at least partly, from Nonlinear Dynamics, a Newcastle-based proteomics informatics company. The campaign is also dedicated to a message I have been trying to spread among the researchers I interact with during my work: experiments must be robustly designed, and an unreproducible experimental result is meaningless.

The website for the campaign contains some useful resources for spreading this message, most effective are the analogies that illustrate the most common experimental design techniques, and the 4-step guide for Fixing Proteomics (the subject of the FF link, above). I have used something akin to the analogies in lectures I have given about experimental design (indeed I have used the apocryphal ‘Fahrenheit and the Cow’ story itself), and I will certainly be using the 4-steps in the future, and referencing the Fixing Proteomics website too.

Just one note: as Frank points out in the FriendFeed thread, the PSI could be highlighted a little more. Proteomics experiments would not be reproducible at all, particularly cross-site, without the efforts made by the standards community. As AnalysisXML enters its public comment phase, it is worth remembering the contribution they have made to opening up data formats and making data and metadata available in a non-proprietary way.