Announcing a Bioinformatics Kblog writeathon

(Reposted from Knowledgeblog.org)

The Knowledgeblog team is holding a ‘writeathon’ to produce content for a tutorial-focused bioinformatics kblog.

The event will take place in Newcastle on 21st June 2011. We’re looking for volunteer contributors who would like to join us in Newcastle on the day, or who would like to contribute tutorial material to the project remotely.

We will shortly be sending invitations to a few contributors, but are looking for 15 to 20 participants in total.

Travel and accommodation costs (where appropriate) can be reimbursed.

If you would like to contribute tutorial material on microarray analysis, proteomics, next-generation sequencing, bioinformatics workflow development, bioinformatics database resources, network analysis or data integration, and receive a citable DOI for your work, please get in touch with us at admin@knowledgeblog.org.

For more information about Knowledgeblog please see http://knowledgeblog.org.  For examples of existing Knowledgeblogs please see http://ontogeneis.knowledgeblog.org and http://taverna.knowledgeblog.org.

Automatic citation processing with Zotero and KCite

Writing papers. It’s a pain, right? Journals are finicky about formatting. You write the content and then the journal wants you to make it look right. You finally get the content in the right shape and then they tell you that you’ve formatted the bibliography wrong. Your bibliography is clearly in Harvard format, when the journal only accepts papers where the bibliography is formatted Chicago style. Another hour or two of spitting and cursing as you try to massage the citations and bibliography into the “correct” format. You’re not even allowed to cite everything you want to, because the internet is clearly so untrusted a resource.

I’m of the opinion that publishing should be lightweight: publishers should get out of the way of the author’s process, not actively obstruct it. Working on the Knowledgeblog project has only reinforced this opinion. Why should I spend days formatting the content, when any web content management system (CMS) worth its salt will take raw content and format it in a consistent way? Why should I process all the citations and format the bibliography when it should be (relatively) simple to do this in software? Why should I spend time producing complicated figures that compromise what I am able to show when data+code would give the reader far more power to visualise my results themselves?

This document is written in Word 2007 on a Windows 7 virtual machine, on which I have also installed Standalone Zotero. The final piece of this particular jigsaw is a Citation Style Language (CSL) style document I wrote (you can download it from the Knowledgeblog Google Code site) that formats a citation in such a way that KCite, Knowledgeblog’s citation engine, can understand it. When I insert citations into my Word document via the Zotero Add-In, I can pick the “KCite” style from the list, and the citation is popped into my document. When I then hit “Publish” in Word, the document is pushed to my blog, where KCite sees the citation added by Zotero and processes it, producing a nicely formatted bibliography. We are working on a citeproc-js implementation that will let the reader format this bibliography any way they choose (Phil has a working prototype of this). The biggest current limitation is that your Zotero library entry must have a DOI in it for everything to join up.

So, here is a paragraph with some (contextually meaningless) citations in it [cite]10.1006/jmbi.1990.9999[/cite]. All citations have been added into the Word doc via Zotero, and processed in the page you’re viewing by KCite [cite]10.1073/pnas.0400782101[/cite]. Adding a reference into the document from your Zotero library takes 3-4 clicks; no further processing is needed [cite]10.1093/bioinformatics/btr134[/cite].
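As a rough illustration of the shortcode convention used in the paragraph above, here is a small Python sketch that pulls the DOIs out of [cite]...[/cite] markers and swaps each one for a numbered reference. This is not how KCite itself is implemented (KCite is a WordPress plugin, and it goes on to turn each DOI into a formatted bibliography entry, which this sketch does not attempt). The function names extract_dois and number_citations are made up for this example.

```python
import re

# Matches the [cite]...[/cite] shortcode as it appears in the paragraph above.
CITE_PATTERN = re.compile(r"\[cite\](.*?)\[/cite\]")


def extract_dois(text):
    """Return the DOIs found inside [cite]...[/cite] shortcodes, in order."""
    return [doi.strip() for doi in CITE_PATTERN.findall(text)]


def number_citations(text):
    """Replace each shortcode with a numbered marker such as [1].

    Returns (new_text, dois), where dois lists each distinct DOI once,
    in order of first appearance. Hypothetical helper, for illustration only.
    """
    dois = []

    def replace(match):
        doi = match.group(1).strip()
        if doi not in dois:
            dois.append(doi)
        return "[{}]".format(dois.index(doi) + 1)

    return CITE_PATTERN.sub(replace, text), dois


sample = (
    "Some text [cite]10.1006/jmbi.1990.9999[/cite] and more "
    "[cite]10.1073/pnas.0400782101[/cite]."
)
numbered_text, reference_list = number_citations(sample)
print(numbered_text)
print(reference_list)
```

Because the DOI is the only thing stored in the post, the same text could in principle be re-rendered in any bibliography style later on, which is why a DOI in the Zotero entry matters so much.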

Other popular reference management tools, such as Mendeley and Papers, also use CSL styles to format citations and bibliographies, so this same style could be employed to enable KCite referencing with those tools as well. This opens up a wide range of possible tool chains for effective blogging. Mendeley + OpenOffice on Ubuntu. Papers + TextMate on OS X (Papers can be used to insert citations into more than just office suite documents, more on that in a later post). The possibilities are broad (but not endless, not yet anyway). Hopefully this means many people’s existing authoring toolchain is already fully supported by Knowledgeblog.

Image credit: http://www.flickr.com/photos/sybrenstuvel/2468506922/ (Sybren Stüvel on Flickr)

Pretty equations in WordPress

Chalkboard maths

We’ve spent a couple of months now on Knowledgeblog since JISC funded the project. My one day a week working on developing the tools and workflows for lightweight publishing has presented totally different challenges to the majority of my work, and I’m really enjoying it so far. Hopefully we’re engaged in building something that a lot of people will find useful in the long run.

Part of what should make the project useful to as many people as possible is its incremental milestones: pieces that we will combine into the whole platform, but that will hopefully be useful to a lot of folks in their own right. The first of these milestones is MathJax-LaTeX, a plugin for WordPress that renders mathematical equations as attractively as possible.

WP-LaTeX is the usual way to do this in WordPress. This plugin takes inline LaTeX code in blog posts and converts it into PNG images. These images look good at the default resolution and they do the job, but we thought there might be a better way. Images are not particularly accessible, and they don’t scale very well (as you zoom in on a page, they start to pixelate pretty badly). WP-LaTeX also requires running LaTeX locally, or on a third-party server, which might be undesirable for some people.

I’m aware of MathJax because I used to listen to the Stack Overflow podcast, and Joel and Jeff talked about it in one episode in relation to the Math Overflow site, where it was being used to render the large quantity of equations that a site like that requires. MathJax is a JavaScript library that interprets LaTeX and MathML, and renders them inline using scalable web fonts. The LaTeX that is interpreted remains in the source of the page, and the equations are not images, so they scale perfectly with the rest of the text on the page. So the question is, what’s the best way to use MathJax to render equations in blog posts?

The instructions on the MathJax page tell you to edit the header of your blog theme to introduce the JavaScript library on every page of the blog. We thought that using a plugin to inject the JavaScript only on the pages where it is required would be more efficient (it’s a big library, and you don’t want to load it on every page if you don’t have to). It also allows us to stay compatible with WP-LaTeX, because we can leverage the shortcode API that is a brilliant part of the WordPress environment.
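To make the "only load it where it is needed" idea concrete, here is a minimal sketch in Python. The real plugin is a WordPress plugin built on the shortcode API, so this is only an illustration of the logic, not its actual code. It assumes WP-LaTeX-style markup, either [latex]...[/latex] or $latex ...$, and the script URL is a placeholder rather than a real MathJax location. The names needs_mathjax and page_head_scripts are hypothetical.

```python
import re

# Assumption: equations are marked up WP-LaTeX style, either as
# [latex]...[/latex] or as $latex ...$.
LATEX_MARKUP = re.compile(r"\[latex\].*?\[/latex\]|\$latex\s.+?\$", re.DOTALL)

# Placeholder URL for illustration only; not a real MathJax location.
MATHJAX_SCRIPT = '<script src="https://example.org/mathjax/MathJax.js"></script>'


def needs_mathjax(post_content):
    """Return True if the post contains any LaTeX markup."""
    return LATEX_MARKUP.search(post_content) is not None


def page_head_scripts(post_content):
    """Return the script tags to add to the page header for this post.

    The large MathJax library is included only when the post uses it.
    """
    return [MATHJAX_SCRIPT] if needs_mathjax(post_content) else []


plain_post = "A post about citations, with no maths in it."
maths_post = r"Euler's identity: $latex e^{i\pi} + 1 = 0$"

print(page_head_scripts(plain_post))
print(page_head_scripts(maths_post))
```

The check is cheap compared with shipping the library to every visitor, which is the trade-off described above.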

Well, the MathJax-LaTeX plugin was published this week. You can download it now, and there’s a page on knowledgeblog.org describing in full how it works. If you’ve used WP-LaTeX in the past, MathJax-LaTeX understands the same syntax, so you can replace one with the other, if you wish.

Here’s a few examples:

The probability of getting \(k\) heads when flipping \(n\) coins:

\[ P(E) = {n \choose k} p^k (1-p)^{n-k} \]

This is an inline equation: \(\sqrt{3x-1}+(1+x)^2\); it should be rendered without affecting the text around it.

OK, one more, definition of \(e\):

\[ e = \lim_{n\to\infty} \left( 1 + \frac{1}{n} \right)^n \]
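
For anyone who wants to check the two displayed formulas numerically, here is a short Python sketch. It is only a sanity check of the maths and has nothing to do with how the plugin renders equations. The function names are made up for this example, and math.comb needs Python 3.8 or later.

```python
from math import comb, e


def binomial_probability(n, k, p):
    """Probability of exactly k heads in n flips, each landing heads with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def e_approximation(n):
    """Evaluate (1 + 1/n)^n, which tends to e as n grows."""
    return (1 + 1 / n) ** n


# 5 heads in 10 flips of a fair coin: 252 / 1024
print(binomial_probability(10, 5, 0.5))  # 0.24609375

# The approximation creeps towards e as n increases.
for n in (10, 1000, 1000000):
    print(n, e_approximation(n), abs(e - e_approximation(n)))
```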


If you want to keep tabs on how Knowledgeblog is developing, you can follow us on Twitter, watch our Google Code repository, and keep an eye on the site.


(Photo courtesy of bourgeoisbee on flickr.com)