Getting the picture - part two

If you read my last blog entry and thought "hey, why don't I see more results represented using circles?" then you're either going to like this second entry or you're going to learn what's wrong with circular charts.

You may also be thinking "wait, I thought you were trying to make results easier to decipher! These look dense and complex!" You're right! Circular diagrams are helpful specifically because they aren't constrained by rules about where text should go or whether a horizontal order is secretly an x-axis. We inherently treat any circle as a collection of objects. It may be due to the prevalence of pie charts; pies work well because they're each just a summary of subsets of a collection. Pie charts come with their own problems, though, primarily that their subsets are fractions rather than actual values. It also isn't easy to visually compare those subsets. 

An example pie chart. Don't make something like this. I made it on Chartgo.com.
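
To make the "fractions rather than actual values" problem concrete, here is a minimal Python sketch. The group names and counts are made up for illustration, and `pie_fractions` is a hypothetical helper, not part of any charting tool mentioned here. It shows two collections that differ a hundredfold in size yet would produce identical pies.

```python
def pie_fractions(counts):
    """Convert raw counts into the fractions a pie chart actually draws."""
    total = sum(counts.values())
    return {name: value / total for name, value in counts.items()}


# Made-up counts: the second sample is 100 times larger than the first.
small_sample = {"Firmicutes": 30, "Proteobacteria": 15, "Other": 5}
large_sample = {"Firmicutes": 3000, "Proteobacteria": 1500, "Other": 500}

# Both samples reduce to the same wedges, so the pies look identical.
print(pie_fractions(small_sample))
print(pie_fractions(small_sample) == pie_fractions(large_sample))
```

Once the counts become wedges, the difference in total size is gone unless you print the values alongside the chart.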

If we're not concerned about comparing exact group sizes and just want a way to keep a bunch of groups in one place, then circular charts are great! In bioinformatics, we often have to use hierarchies like taxonomic classifications. The following tools make visualizing such hierarchies easier. I've mixed two different types of tools: those producing circular trees and those simply visualizing hierarchies. This also isn't intended to be an exhaustive list. Some combination of the approaches - or even tools building upon these approaches - may be most appropriate for your needs.

 

iTOL (interactive Tree Of Life)

Perhaps you'd like to keep the underlying tree structure of your data intact, but the tree is way too large to draw as anything other than a circle. Here's an EMBL-hosted project for producing phylogenetic trees, especially the large, circular kind. It works nicely with taxonomy trees produced using phyloT. Unfortunately, it's written in Flash and, depending on your platform, browser, etc., may not render trees properly or at all. (I can't get iTOL to produce a usable tree in Chrome at the moment but Internet Explorer 11 works fine, oddly enough.) Hopefully they'll get the site updated soon. In the meantime, see below for the kind of trees iTOL can produce.

All the taxonomic groups in Mammalia, as per iTOL. Don't blame me for polytomy.

Krona

This is a tool for visualizing hierarchical data. No trees here - just the subsets of your data. Your data doesn't even have to be a taxonomy, though that's what Krona was designed for. The visualization is quite nice for interactive use: labels resize and reposition themselves automatically, subsets resize themselves to occupy the full chart when they're zoomed in on, and different chart views can be saved and shared as links. You can see it in action as part of the Islander project (a database of genomic islands in Bacteria and Archaea, courtesy of Sandia National Laboratories), MG-RAST (a server and project for analyzing metagenomics sequence data; it requires registration and really dislikes Chrome), and other projects.

Sources of known genomic islands in Firmicutes - those present in the Islander database, at least. Notice how Krona summarizes some groups as "n more". That's very helpful when there are as many group names as we have here.

Treevolution

Like the interactive Tree of Life project, this software assumes you're working with a phylogenetic tree of some sort. It produces circular output by default. The tree can be freely rotated (frotated?), a feature most tools appear to lack. Treevolution is written in Java, so it should work on a decent range of platforms. It can also produce images of your tree in a variety of vector and bitmap formats, though its output isn't always clear for large trees (in some cases, like PDF output, it renders the whole thing as an overly pixelated bitmap).

The Treevolution interface and an example tree. The interface is a bit like that in The Sims, with its curvy menus and cheery icons.

sunburstR

If you prefer the mind-numbing level of control over figure details that you can only get in a package like R, then here's a circular plot-maker for you (and for me, though I only found this one recently). Another detail I found recently: these visualizations can be called "sunburst" charts. Ignore your instincts and stare directly into the sun(burst). Or, er, just read this entry at Building Widgets.

Note that this is more of a widget than an old-fashioned 2D visualization, but it's quite attractive and interactive. Your R skills should be well-honed, though, as this widget doesn't come with much documentation. The input data format is crucial; set your data up as two columns of delimiter-separated values (CSV, for example), like the following:

groupA-groupA2-groupA3 900

groupB-groupB2-groupB4 400

groupX-groupX5-groupX6 400

where the first column is the place in the hierarchy and the second is the value determining the group size. Order matters, so the first group in each list will always be the top of the hierarchy and so on. So here's what we get from that example data:

Not very exciting, but fake data seldom is.
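
If your counts start out as one record per lineage rather than as ready-made paths, the two-column layout described above can be put together before the data ever reaches R. The following is a minimal Python sketch, not part of sunburstR itself: `to_sunburst_rows` and `write_rows` are hypothetical helper names, the records are the same fake groups as above, and the hyphen is assumed to be the level separator, as in the example.

```python
import csv
import io
from collections import Counter


def to_sunburst_rows(records, sep="-"):
    """Collapse (lineage, count) records into (path, value) rows."""
    totals = Counter()
    for lineage, count in records:
        # A separator inside a group name would be read as an extra level,
        # so replace it before joining the levels, top of the hierarchy first.
        path = sep.join(name.replace(sep, "_") for name in lineage)
        totals[path] += count
    return sorted(totals.items())


def write_rows(rows, handle):
    """Write the rows as two comma-separated columns, without a header."""
    csv.writer(handle).writerows(rows)


# Fake data: the first two records share a path and get summed to 900.
records = [
    (("groupA", "groupA2", "groupA3"), 500),
    (("groupA", "groupA2", "groupA3"), 400),
    (("groupB", "groupB2", "groupB4"), 400),
    (("groupX", "groupX5", "groupX6"), 400),
]

rows = to_sunburst_rows(records)
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

Swapping `io.StringIO()` for an open file handle gives a file to read into R. Whether the widget wants a header row or a different delimiter is something to check against its own examples.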

I unfortunately don't have a workflow in place to get the full, interactive output from RMarkdown to blog-friendly HTML, so you'll have to trust me that the widget works as advertised. It scales in size well, a useful property as exporting it from R to a vector image format like SVG doesn't appear to work. I didn't spend much time on that aspect.

 

Please feel free to notify me about any novel examples you've found or created!