Notes on notes 3: it's a process

  • August 11, 2013

It's time for a science story, presented in convenient bullet-point format for easy digestion. Most of this story was first told during the course of my Master's research in Gail Christie's lab at Virginia Commonwealth U., so if you're hungry to know more, please read my Master's thesis (a phrase I don't get to use often, if ever), the corresponding paper by Wall et al., or a more recent paper on the subject by Wall et al. that extends well beyond the material I was concerned with. To be honest, most of my work at the time was a combination of fighting with protein expression conditions and trying to figure out why my bacterial cultures were spontaneously dying, so I'm just glad the story of L27 in the Firmicutes yielded some interesting findings in the long run. Anyway, here are those bullet points I promised:

  • Staphylococcus aureus is a bacterial species best known for causing nasty, antibiotic-resistant infections (e.g., those involving MRSA or VRSA strains). It's frequently a benign resident of human skin and nasal passages, though depending on geography and other factors, you may or may not be colonized with it. Here, I'll just call it S. aureus, and I'm not referring to any specific pathogenic strain.
  • Bacteriophage 80alpha is a virus capable of infecting S. aureus. Here, I'll call it 80alpha.
  • 80alpha structural proteins get processed during viral assembly. Or, if we think of these viruses as Ikea furniture, this is one of those designs in which some of the parts have to be trimmed a bit to fit together properly. 

A bunch of 80alpha virions, complete with capsid (or head) structures, tails, and baseplates (those are harder to see; the arrow indicates one seen head-on). Image from Spilman et al. (2011) Journal of Molecular Biology.


  • The 80alpha genome doesn't appear to code for anything that would allow the virus to make those essential cuts. That suggests that its host S. aureus cell provides the necessary enzyme (or, to continue the Ikea metaphor, you're assembling furniture in a friend's house and have to borrow their table saw; or, to better represent the bacteriophage vs. bacterial host relationship, you steal it out of their garage).
  • As another clue, the sequence of the 80alpha proteins being cut is a close match to a sequence in S. aureus ribosomal protein L27. So if there's an enzyme that 80alpha sneakily borrows for its own purposes, does S. aureus normally use that enzyme to process its own ribosomal protein L27? Spoiler: yes. This was not previously known to be a common phenomenon among bacterial ribosomal proteins.
  • L27 occupies a rather crucial location in the middle of the ribosome. Recall that ribosomes are the structures in cells responsible for assembling proteins out of amino acids and you'll realize how important it is to have functional ribosomes. Ribosomes without L27 don't work terribly well, and neither do ribosomes in which that extra little bit on the end of L27 (the one with the similar sequence to the 80alpha protein, as noted above) doesn't get removed.
  • To make matters even stranger, the extra sequence on the end of L27 is present in S. aureus and many related bacterial species, but not in E. coli. This suggests there's a solid evolutionary reason why the extended protein has been kept around in some bacteria but not others. The "why" part remains unclear, but seeing as this L27 processing is essential to the survival of S. aureus and other potential bacterial pathogens, it may be a good basis for developing new antibiotic strategies.

The "eggnog" part here refers to eggNOG, my favorite database of orthologous groups, or clusters of genes with similar sequences. Similar sequence implies evolutionary relatedness, so resources like eggNOG provide a way to see how many branches of the tree of life contain sequences like the gene coding for ribosomal protein L27 or the enzyme responsible for processing it.

According to eggNOG, L27 is in the bacteria-specific orthologous group ENOG4105K46. The taxonomic profile of that group looks like this:


This doesn't tell us much other than that L27 is broadly conserved - that is, it's seen in a very wide variety of bacterial species. Cyanobacteria, Actinobacteria, Firmicutes, and Proteobacteria are all distantly related but certainly share features like the general structure of their ribosomes. What this doesn't show us is which species have that extended form of L27, like in S. aureus. That's more of a job for a sequence alignment, and we can align the sequences of all 1,660 predicted gene products in this orthologous group. Clustal Omega will do most of the heavy lifting for this sizable alignment, but it won't distinguish sequences with that extension from the others. Luckily, we can make a reasonable guess based on the consensus of the alignment, so after taking a look using Jalview, we see there does appear to be a contingent of sequences (somewhere around 430 out of the 1,660, in fact) with an extra bit on their N-terminal ends. That's the left side here.

L27 consensus.png (image: consensus of the L27 alignment as viewed in Jalview, with the N-terminus on the left)
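
For anyone who would rather count than eyeball, here is a minimal Python sketch of the same idea: find the column where most sequences in the alignment begin, then flag the ones that have residues to the left of it. This is not what I actually did (that was Jalview and squinting). The function names, the 50% occupancy cutoff, the three-residue minimum, and the toy sequences are all assumptions made up for illustration, not real L27 data.

```python
# Toy aligned FASTA. These sequences are invented placeholders, NOT real L27 sequences.
TOY_ALIGNMENT = """
>short_1
------MAHKKG
>short_2
------MAHKKA
>short_3
------MSHKKG
>long_1
MKLNLQFAHKKG
"""


def read_aligned_fasta(text):
    """Parse aligned FASTA text into a dict of {header: aligned_sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def consensus_start(records, min_occupancy=0.5):
    """Return the first column where at least min_occupancy of rows hold a residue."""
    seqs = list(records.values())
    if not seqs:
        return 0
    for col in range(len(seqs[0])):
        filled = sum(1 for seq in seqs if seq[col] != "-")
        if filled / len(seqs) >= min_occupancy:
            return col
    return len(seqs[0])


def has_extension(aligned_seq, start_col, min_residues=3):
    """True if the row has at least min_residues residues left of start_col."""
    prefix = aligned_seq[:start_col]
    return sum(1 for char in prefix if char != "-") >= min_residues


records = read_aligned_fasta(TOY_ALIGNMENT)
start = consensus_start(records)
extended = [name for name, seq in records.items() if has_extension(seq, start)]
print(f"Consensus starts at column {start}; {len(extended)} of {len(records)} rows look extended")
```

The occupancy cutoff only works here because the short form is the majority, as it appears to be in the real alignment (roughly 430 long out of 1,660). If the long form dominated, the "consensus start" would land at the first column and nothing would be flagged.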


This doesn't answer the question of which species have the longer L27. We know from previous comparisons that species in the Firmicutes, like Staphylococcus species, generally seem to have the long form while E. coli and its relatives do not. Can we get more specific? Yes - if we trim the alignment down to just the first few amino acids of the N-terminus and remove gap-only sequences, we can then use a taxonomy-based phylogenetic tree builder to see how diverse this set of species is. I've previously used phyloT for this purpose, but they recently switched to an unfortunate subscription-based funding model (though that's preferable to losing the site entirely, I suppose). Luckily, the ETE Toolkit treeviewer can handle the same function, as long as we provide it with a newly constructed Newick format tree. This is such a short sequence that it doesn't really provide the information content necessary for a fine-grained look at possible evolutionary relationships, but we can still compare it to the existing taxonomic groups.
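
The trimming step is simple enough to sketch in a few lines of Python. This is a rough illustration rather than the exact procedure I used: the function name, the choice of gap characters, and the toy alignment are all assumptions, and the sequences are invented placeholders.

```python
GAP_CHARS = "-."  # assumption: gaps are written as '-' or '.'


def trim_n_terminus(records, n_columns):
    """Keep the first n_columns of each aligned row and drop rows that are all gaps."""
    trimmed = {}
    for name, seq in records.items():
        prefix = seq[:n_columns]
        # A row survives only if at least one retained column holds a residue.
        if any(char not in GAP_CHARS for char in prefix):
            trimmed[name] = prefix
    return trimmed


# Toy alignment with invented sequences, NOT real L27 data.
toy_records = {
    "short_1": "------MAHKKG",
    "short_2": "------MSHKKG",
    "long_1": "MKLNLQFAHKKG",
    "long_2": "-MLKLNLAHKKG",
}

n_terminal_only = trim_n_terminus(toy_records, n_columns=6)
for name, prefix in n_terminal_only.items():
    print(name, prefix)
```

Only the rows with residues in the first six columns survive, which is the subset we want to hand off to the tree-building step.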

The tree is large and doesn't entirely match up with bacterial taxonomy, as expected, but we do see that Streptococcus and members of the Veillonellaceae family, among others, seem to form some solid clades. Again, this sequence is so short that even small differences can really throw off how related we estimate the species to be. The full tree is here, though I'm not certain how stable that link is in the long term, and it will take a while to load.

Not terribly readable, I know, but that second green column to the right indicates species in Firmicutes.



We can also just cut everything down to the taxonomy IDs and use the NCBI conversion tool to get their human-readable names. The resulting list isn't much to look at, but it tells us that, in addition to S. aureus, 12 other species of Staphylococcus, 60 species/strains of Streptococcus, 33 species/strains of Lactobacillus, and even some more exotic Fusobacteria and Acetomicrobium species have genomes appearing to code for the extended L27. Many of these species are at home in extreme conditions (temperatures or pH, in particular), so we could guess that the extension, which some species less tolerant of extremes lack, confers a benefit in regulating protein assembly under those conditions.
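
Cutting down to taxonomy IDs and tallying by genus can also be sketched in Python. I'm assuming here that each sequence header starts with a numeric taxonomy ID followed by a period and a protein identifier; check your own download before trusting that. The function names, locus names, and the species list below are hypothetical stand-ins, not the real output of the NCBI conversion tool.

```python
from collections import Counter


def taxid_from_header(header):
    """Return the integer taxonomy ID from an assumed 'taxid.protein' style header."""
    taxid, _, _ = header.partition(".")
    return int(taxid)


def unique_taxids(headers):
    """Sorted, de-duplicated taxonomy IDs, ready to paste into a conversion tool."""
    return sorted({taxid_from_header(header) for header in headers})


def genus_counts(names):
    """Tally scientific names by their first word (the genus)."""
    return Counter(name.split()[0] for name in names if name.strip())


# Hypothetical headers; 1280 is the NCBI taxonomy ID for Staphylococcus aureus,
# and the locus names are made up.
toy_headers = ["1280.locus_A", "1280.locus_B", "1282.locus_C"]
print(unique_taxids(toy_headers))

# Placeholder names standing in for the converted list.
toy_names = [
    "Staphylococcus aureus",
    "Staphylococcus epidermidis",
    "Streptococcus pneumoniae",
]
for genus, count in genus_counts(toy_names).most_common():
    print(genus, count)
```

Splitting on the first word is crude (strain names and bracketed genus names will trip it up), but it's enough to get rough counts per genus like the ones above.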

Conclusion: bacterial ribosomal proteins may be more diverse than we would otherwise have thought, and they may be involved in some neat regulatory functions.