R programming courses at Newcastle University

An announcement courtesy of Colin Gillespie, a lecturer in Maths & Stats here in Newcastle:

The School of Mathematics & Statistics at Newcastle University is again running some R courses. In January 2012, we will run:

  • January 16th: Introduction to R;
  • January 17th: Programming with R;
  • January 18th & 19th: Advanced graphics with R.

The courses aren’t aimed at teaching statistics; rather, they aim to go through the fundamental concepts of R programming.

Further information is available at the course website.

Brand new same old job

The week before last I had my first job interview in 6 years (actually 6 years and one day, to be precise), and I’m delighted that 15 minutes after I left the interview room, I was offered the job.

I wasn’t expecting to get a new job this year; I’m perfectly happy where I am. I love my job. I enjoy the challenge of continually changing focus to work on different people’s projects, and of driving multiple lines of research all at once. However, this was an opportunity I simply couldn’t ignore, because it is the same job, only better. Heck, I don’t even have to move desks (if I don’t want to).

See, this all came about because my esteemed colleague, and the founding head of the Newcastle Bioinformatics Support Unit, Daniel Swan, has decided to move to pastures new, at Oxford Gene Technology. This means there was an opening to do pretty much what I already do, but while running the show too.

I’m very excited to have this opportunity. Dan will be sorely missed – I have very big boots to fill – but I’ll be working very hard to make sure the unit goes from strength to strength. Also, since I have effectively vacated my old job, we will be recruiting very shortly to fill that gap too. So watch this space if you’re interested in working in Bioinformatics support in the North East.

CASE PhD studentship in Bioinformatics available

I’m delighted to announce we’re offering a PhD studentship, commencing in October. I’ve spent most of my time on the Ondex project building an integrated network focussed on drug repositioning (see doi:10.2390/biecoll-jib-2010-116). I’m very excited that we’ve managed to secure a CASE studentship, in collaboration with Philippe Sanseau at GSK, to continue and considerably extend this work. I think this is a very exciting opportunity. Full details below.

Where? – Newcastle University – School of Computing Science

What? – Development of Novel Computational Approaches to Mine Integrated Datasets for Drug Repurposing Opportunities

The blurb

We invite applications for a CASE PhD studentship in Bioinformatics at Newcastle University in the North East of England. The project is a 3-year EPSRC PhD sponsored by GlaxoSmithKline (GSK) and involves the development of novel methods of finding new targets for existing drugs using data integration.

Ondex is a data integration computational platform for Systems Biology (SB). The student will research the optimization and application of Ondex integrated datasets to the identification of repurposing opportunities for existing compounds with a particular, but not exclusive, focus in the infectious diseases therapeutic area. The student will also use the dataset to explore the interplay between microbial targets and perturbations in the metabolic and community structure of the human gut microbiome.

An ideal student will have a background in computing science, good programming skills (preferably in Java), and an interest in biology and bioinformatics. Applicants should also possess an upper second class undergraduate degree. Only students who meet the EPSRC home student requirements are eligible for full fees; other EU students are only eligible for support for the fees. Students from outside the EU are not eligible to apply – please see the EPSRC website for details.

The studentship will start in October 2011, jointly supervised by Prof. Anil Wipat and Dr. Simon Cockell at Newcastle University, and Dr. Philippe Sanseau at GSK. The student will spend at least three months at GSK in Stevenage as part of the project. Home students are eligible for payment of full fees and an enhanced stipend of approximately £18,000 tax free. To apply, please send an email to [anil dot wipat at ncl dot ac dot uk] with CV (including the contact details of at least two referees) and a cover letter indicating your suitability for the position. Please include “Application CASE PhD” in the subject of the email. Applications will be dealt with as they arrive – there is no closing date.



Stack Overflow, BioStar and – shock, horror – some code

Stack Overflow reached critical mass a few months ago now. The site gets upwards of 6 million unique visitors a month. The chances are, if you write code, you know it exists, and you’ve received some sort of help there, whether directly or by proxy. In its wake, Stack Overflow has spawned Stack Exchange, a question and answer platform that anyone can buy into to set up a site of their own. So there are now Stack Overflow-type sites for any number of topics.

Amongst these sites, there have been a couple of attempts to set up science Stack Exchanges (http://asksci.com/ & http://science.stackexchange.com/), but to my mind, science as a whole is not specific enough; even biology is probably too broad an area for Stack Exchange to work as a platform. As a result, questions are too hand-wavy, and communities have not really seemed to build. The key to Stack Overflow’s success is that it has very tightly defined boundaries: only questions about programming are accepted, and anything else is removed for being off-topic. The site’s creators even set up more sites, Super User and Server Fault, to keep Stack Overflow on topic.

It seems, then, that bioinformatics is the perfect use case for Stack Exchange: a narrower domain than science or biology, with an already web-savvy community ready to coalesce around a useful focal point. Until recently, however, no one had made the site. Then a couple of weeks ago, http://biostar.stackexchange.com/ started to get some attention on Twitter and FriendFeed. It’s early days, but the site has made a good start, with some interesting questions and some good, intelligent answers. My main concerns would be:

  • Not enough users, no critical mass achieved.
    • The site seems to be gaining some traction, and getting more active by the day.
  • Not enough questions, no reason for users to keep coming back.
    • This does remain an area for concern, but is also starting to pick up a little.
  • No financial backing, the site may disappear after the test period comes to an end.
    • Stack Exchange is not a cheap platform, and a site like this will need funding to continue. However, the administrator, Istvan Albert, has insisted on the Google Group (set up for ‘meta’ discussion surrounding the site) that the site is funded for at least a year.

Finally, the real motivation for this post… A question on BioStar made me revisit some semi-abandoned code in order to post an answer, and I thought it was quite a nice snippet: about 15 lines of Python that use the UniProt ID Mapping service to automate protein ID conversion. I’ve stuck the code into a gist, and I thought I’d stick it up here too.
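
For anyone who wants to try the idea, here is a minimal sketch of automated ID conversion. It is not the gist mentioned above. It assumes UniProt's current REST interface (submit a job, poll its status, then fetch the results); the base URL, the `jobId`, `jobStatus` and `results` field names, and the helper function names are all assumptions of this sketch and should be checked against UniProt's own documentation.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL of UniProt's ID mapping REST service.
API = "https://rest.uniprot.org/idmapping"


def build_job_request(ids, from_db, to_db):
    """Build the POST request that submits a mapping job (no network access)."""
    form = urllib.parse.urlencode(
        {"from": from_db, "to": to_db, "ids": ",".join(ids)}
    )
    return urllib.request.Request(f"{API}/run", data=form.encode("utf-8"))


def pairs_to_dict(results):
    """Collapse [{'from': a, 'to': b}, ...] into {a: [b, ...]}."""
    mapping = {}
    for row in results:
        mapping.setdefault(row["from"], []).append(row["to"])
    return mapping


def fetch_json(url_or_request):
    """Open a URL (or prepared request) and decode the JSON body."""
    with urllib.request.urlopen(url_or_request) as response:
        return json.load(response)


def map_ids(ids, from_db, to_db, poll_seconds=2):
    """Submit a job, wait for it, and return {source_id: [target_ids]}."""
    job_id = fetch_json(build_job_request(ids, from_db, to_db))["jobId"]
    # Keep polling while the service reports the job as queued or running.
    while fetch_json(f"{API}/status/{job_id}").get("jobStatus") in ("NEW", "RUNNING"):
        time.sleep(poll_seconds)
    # Only the first page of results is read here; large jobs are paginated.
    payload = fetch_json(f"{API}/results/{job_id}")
    return pairs_to_dict(payload.get("results", []))
```

A call might look like `map_ids(["P13368", "P20806"], "UniProtKB_AC-ID", "RefSeq_Protein")`, where the two database names are again assumptions. Keeping the request building and the result parsing in their own functions means both can be checked without touching the network.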

Randomness, statistics and understanding

So here I am, sitting in a statistics workshop, having finished all the exercises ahead of time, musing on how much easier all this stuff is once you understand where it all comes from. This made me think that I have found this workshop more understandable and simpler to tackle because I have pretty much finished reading a marvellous little book called ‘The Drunkard’s Walk’ by Leonard Mlodinow.

Mlodinow aims to educate the reader about randomness and statistics, by way of history and illustrative example, and he succeeds admirably. The book is a walk through mathematics from the Greeks and Romans, by way of the Renaissance, to Einstein and the modern day. Each important advance toward the modern-day study of statistics is illustrated with excellent examples and anecdotes, many of them personal to the author. There is the Monty Hall problem, and the anomaly of Jeanne Calment, who reverse-mortgaged her apartment to a 47-year-old lawyer when she was 90, only to outlive him (and he died aged 77). Even the author’s own (false) positive AIDS test makes for an intriguing case study, and illustrates the importance of understanding prior probabilities when reporting the results of a test.
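
To put rough numbers on that last point, here is a minimal sketch of the arithmetic behind a false positive, using Bayes’ theorem. The figures are illustrative assumptions of mine (a prior of 1 in 10,000, a false positive rate of 1 in 1,000, and a test that never misses a true case), not values quoted from the book, and the function name is hypothetical.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Probability of truly having the condition, given a positive test."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Illustrative assumptions: rare condition, very accurate test.
p = posterior_given_positive(
    prior=1 / 10_000,
    sensitivity=1.0,
    false_positive_rate=1 / 1_000,
)
print(f"{p:.3f}")  # prints 0.091, i.e. roughly 1 in 11
```

Even with a test that is wrong only once in a thousand healthy people, a positive result under these assumptions means about a 1 in 11 chance of actually having the condition, because the condition itself is so rare to begin with.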

The setting of all this stuff in context has really helped my brain with the basic concepts, and even without this current course, I feel like I’ve got a much better grip on statistics in general. A remarkable claim for a popular science book. I look forward to the remaining 30 or so pages.