The usual suspects: databases of virulence factors in Bacteria

Bacteria have earned much of their reputation. Despite their close relationship with humanity, whether in our microbiomes or in our food and beverages, various bacterial species still pose a threat to our society. Luckily, genomics allows us to quickly compare virulent bacterial strains with more temperate ones. The result is more data, and where there's data, there's eventually a database.

Here are a few on the subject:

PATRIC - The Pathosystems Resource Integration Center
Most recently, PATRIC has added a set of virulence factors, manually curated from the literature and other databases, for six big-name bacterial pathogen genera, including Mycobacterium and Listeria. They now specifically detail virulence factors for ten different pathogen species. Many of these are virulence factor homologs identified by protein BLAST.

PATRIC doesn't make it easy to just get virulence factors, but if you're on an organism page, hit the Specialty Genes tab and then click the Virulence Factor checkbox on the left to filter.
Listeria. Always clean your vegetables.
A recent citation: Mao, C. et al. Curation, integration and visualization of bacterial virulence factors in PATRIC. Bioinformatics 31, 252–8 (2015).

VFDB
Oh, wait, this one's included in PATRIC now as far as I can tell. Its last major release was in 2012.

PAI DB
This database is a bit different: it contains pathogenicity islands rather than just virulence factors. Pathogenicity islands are rich in virulence factors and all manner of other novel genes, so they're important to consider in any discussion of virulence factors. PAI DB appears to be actively updated and even contains a pathogenicity island search function (PAI Finder), though it's just based on BLASTing against known islands and virulence genes.

Their logo is perfect.

A recent citation: Yoon, S. H., Park, Y.-K. & Kim, J. F. PAIDB v2.0: exploration and analysis of pathogenicity and resistance islands. Nucleic Acids Res. gku985 (2014). doi:10.1093/nar/gku985

MvirDB
MvirDB has some overlap with PATRIC. It doesn't look like much and I'm not sure if it's actively maintained. That being said, it's essentially just a few tables, accessible through the download links on the left of the page. MvirDB unifies some smaller databases, so it may contain unique sets of annotations.

UniProt
OK, it's not explicitly a database of virulence factors. It's easy to filter it by GO terms, though, like the term for pathogenesis. That GO term has a matching keyword in UniProt, and you can search for it by dropping this in the search box: keyword:"Virulence [KW-0843]". The results can be filtered down to the reviewed Swiss-Prot set for a subset of reliable virulence factors.

Try it yourself!
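If you'd rather script the search than click through it, here's a minimal sketch that builds the equivalent query URL. It assumes UniProt's public REST search endpoint at rest.uniprot.org and its keyword and reviewed query fields. The function name and defaults are mine, not UniProt's. It only builds and prints the URL, so fetching the results is up to you.

```python
from urllib.parse import urlencode

# Assumption: UniProt's public REST search endpoint for UniProtKB entries.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def virulence_query_url(reviewed_only=True, fmt="tsv", size=25):
    """Build a search URL for entries tagged with the Virulence keyword (KW-0843).

    This helper is hypothetical; only the keyword ID comes from the post.
    """
    query = "keyword:KW-0843"
    if reviewed_only:
        # Restrict to the reviewed (Swiss-Prot) subset.
        query += " AND reviewed:true"
    params = {"query": query, "format": fmt, "size": size}
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


print(virulence_query_url())
```

Pass reviewed_only=False to include the unreviewed entries as well.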

The NCBI databases don't make it quite this easy, but linking through the Biosystems database seems to help. Here's an example.





I read this Forbes article about 23andMe on the bus this afternoon. It details a perfect instance of the customer as commodity. I'm excited about the possibilities regarding personal genomics, especially as whole-human-genome sequencing gets cheaper and easier to interpret. Control over the resulting data will need to remain in the hands of the consumers, though. Those who volunteer their genetic material for analysis need access to the results.

Opener and opener

Here's another decontextualized bit from Emotional Intelligence 2.0: "Be Open and Be Curious".* I think this may be one of the more useful strategies presented in the book, though that makes sense as it's strategy #1 on their list of Relationship Management techniques. The strategy could be reworded in this way: provide context for yourself and others.

It's easy to interact with people while making assumptions about them. We really don't have any other choice as it's just more efficient that way. If someone's behaving angrily, we tend to assume that something irritated them and leave it at that. If they're acting in a way we find strange or illogical, we tend to assume that's just how they are. These tendencies rub me the wrong way. There's a reason for everything, so why not pursue additional detail? A lack of context really may be one of the most pervasive communication issues of the information age.

The whole strategy of "be open and curious" is vague, of course. It's something we all have to do to some extent but also something we might avoid when it's most necessary. As it's worded, the strategy also doesn't really provide any meaningful way to get beyond small talk. That being said, I'd like to think it's applicable to more than just in-person interactions. It may be most effective in spaces where context is at a premium (i.e., social networks) and the culture of anonymity clashes with the demand for content. I'll try it out and see how it works.

*Alternate lesson: tell everyone you were in the Marines at every possible convenience unless that's actually true.

Ambience

The music for today is...well, it's this playlist. It's my carefully curated list of background noise, ideal for working. I've kept vocals to a minimum and it's all ambient but not necessarily peaceful. Give it a try if you can. The embedded playlist here only contains 200 tracks but the full version contains exactly 1000.

Oct 20, 2015 update: the full playlist is now more than 6,000 items. I've expanded its scope to be slightly more active and noticeably darker, so it keeps the music interesting without becoming intrusive. Well, it might be volatile at times. That's to remind you to take a break.

The wonders of combustion. Photo by Stewart Butterfield.
Happy new year to you, reader!

The keyword at this time of year is usually resolutions*, so I'll share one with you: I will read more papers from my field. It's usually not too difficult for me to skim a paper or two each day, at least, but they frequently aren't directly relevant to what I'm doing. There's just so much interesting material out there! Keeping a wide scope when reading research papers certainly aids with perspective but it provides less material to implement immediately.

Luckily, fresh research has never been easier to find. I'll find at least one directly-relevant paper each week and briefly discuss it on this blog. "Directly-relevant" could include computational microbiology, evolution, bacteriophage biology, or even just novel methods and software.

Today's material is a short report by Marc del Grande and Gabriel Moreno-Hagelsieb in BMC Research Notes. It's relevant because it deals with a concept I've explored lately: gene conservation across numerous bacterial species. A group led by Moreno-Hagelsieb already developed a tool for generating a set of non-redundant genomes**, but here they use it to generate a set of prokaryote*** genomes to analyze co-conservation of their transcription factors (TFs). Why transcription factors? Bacterial genomes contain many transcription factors - even more if we count predicted ones - but it isn't clear if they're broadly conserved in pathways or if they usually show up through rapid evolutionary processes like horizontal gene transfer.**** The set of genomes in this paper doesn't include genomes smaller than 2.5 Mb, presumably to avoid the bias of minimal genomes, but it would have been nice to see them.

As with similar studies, this one was limited by the available data. It's difficult to make predictions about TF interactor conservation when we're not sure what these TFs interact with in the first place. In the end, the authors had to predict the existence of TFs across 857 genomes, then examine the TFs and their interactors in their chosen models of E. coli and B. subtilis. Their conclusion: compared to other protein-coding genes, those coding for TFs have fewer conserved potential interactions across Bacteria. The emphasis on "potential" is mine, as these are predicted interactions based on conservation. That's fine unless the TF has some other reason for its conservation. It's an interesting comparison so I'm curious to see how TF interactions look across bacterial protein interactomes (spoiler alert: available data sources are still a limiting factor).
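To make "co-conservation" a bit more concrete, here's a toy sketch of the general idea: record which genomes carry each gene, then score how often two genes turn up together. This is not the authors' method. I've swapped in a plain Jaccard index as a simple stand-in, and the gene names, genome labels, and presence sets are all made up for illustration.

```python
def jaccard(profile_a, profile_b):
    """Share of genomes carrying both genes among genomes carrying either."""
    either = profile_a | profile_b
    return len(profile_a & profile_b) / len(either) if either else 0.0


# Hypothetical presence sets: gene name -> genomes where a homolog was found.
profiles = {
    "tf_x": {"g1", "g2", "g3"},
    "target_y": {"g1", "g2", "g4"},
    "enzyme_a": {"g1", "g2", "g3", "g4"},
    "enzyme_b": {"g1", "g2", "g3", "g4"},
}

# A TF and its (made-up) target that only partly co-occur.
tf_score = jaccard(profiles["tf_x"], profiles["target_y"])
# Two (made-up) enzymes found in exactly the same genomes.
enzyme_score = jaccard(profiles["enzyme_a"], profiles["enzyme_b"])

print(f"TF pair: {tf_score:.2f}, enzyme pair: {enzyme_score:.2f}")
```

In this invented example the TF pair scores lower than the enzyme pair, which is the kind of difference the paper reports. Real profiles would come from homolog predictions across hundreds of genomes.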

Citation:
Del Grande, M. & Moreno-Hagelsieb, G. The loose evolutionary relationships between transcription factors and other gene products across prokaryotes. BMC Res. Notes 7, 928 (2014).


*I'm also resolving to use more images in my blog posts.

**Their report promised a web interface but that doesn't seem to have happened.

***The term "prokaryote" is archaic, isn't it? This paper is really only talking about Bacteria.

****These two models aren't mutually exclusive.