MyBASE is an integrated platform for the functional and evolutionary genomic study of the important bacterial genus Mycobacterium. Genomic polymorphisms, especially large sequence polymorphisms (LSPs), are believed to contribute to the differing pathological phenotypes of mycobacterial strains. A growing body of literature reports different types of polymorphism data and functional annotations. MyBASE was therefore developed to study genomic diversity (especially LSPs), virulence genes, gene function and evolution through comparative genomic analysis of mycobacterial species, based on large-scale data integration, data production and data mining.
MyBASE Data Content:
a) NimbleGen whole-genome tiling microarray experiments in our lab.
b) extensive literature review and data curation on LSPs, virulence factors, and essential genes.
c) literature review and integration of public resources, including complete genome sequences, annotations and genome structures.
In this search engine, users can enter various keywords to find genes of interest: species in the genus Mycobacterium, gene name or accession number, external ID (such as protein accession ID, protein GI, InterPro ID, UniProt-SwissProt ID or EC_number), gene product, function or note from NCBI, and biological annotation by MyBASE (mainly virulence factors). Users can also find genes belonging to a COG by COG name or description, and genes located in an LSP/RD by LSP/RD name, experiment or literature information.
The Sequence Search runs BLAST on a query sequence. The BLAST databases contain all protein and nucleotide sequences of Mycobacterium. The best-matching sequences are shown, and users can follow a link from each matched gene to its detailed gene information page.
On the result page, users can explore a gene with all the tools MyBASE provides. The gene info page gives annotation information, in particular the MyBASE annotation of virulence or essentiality. If the gene is located in an LSP/RD, users can check the LSP information on the Polymorphism page. We also predicted homologous genes across the 18 genomes in MyBASE; the Homology page lists all homologs of the gene of interest in the genus Mycobacterium. The Genome Viewer shows the gene on a circular genome map that includes virulence gene and pseudogene annotations, operons, LSPs/RDs, etc., with zoom-in, zoom-out and rotate functions. The Genome Comparison page performs multiple-genome comparison using the current gene as the anchor gene.
LSP/RD data are an important feature of MyBASE. MyBASE collects and curates LSP data from both the literature and experiments in our own lab: we used NimbleGen tiling microarrays for whole-genome comparison of 13 Mycobacterium bovis BCG strains, with subsequent confirmation by PCR and DNA sequencing.
The search helps users find LSPs/RDs of interest. The LSP name or synonym is the name given to the LSP/RD in the original reference; although naming is somewhat inconsistent across references, the name still serves as the identifier for an RD. Every LSP experiment, whether performed by microarray or PCR, has query species and a reference species. The query is the set of samples the authors studied for genomic diversity, while the reference is the genome against which the query samples were compared, in many cases Mtb H37Rv. Users can also search for LSPs described in a given reference by citation information such as author, year, journal or title.
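The search fields described above can be pictured with a minimal sketch. This is only an illustration of the idea, not the MyBASE implementation: the `LSPRecord` class, its field names, the `search_lsps` function and the sample records are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class LSPRecord:
    """One LSP/RD as reported by one experiment (hypothetical schema)."""
    name: str        # name used in the original reference
    synonyms: list   # alternative names used in other references
    method: str      # e.g. "microarray" or "PCR"
    query: str       # samples studied for genomic diversity
    reference: str   # genome the query samples were compared to
    citation: dict   # keys such as author, year, journal, title


def search_lsps(records, name=None, **citation_terms):
    """Return records matching an LSP name/synonym and all citation terms.

    Name matching is exact but case-insensitive; citation terms are
    case-insensitive substring matches.
    """
    hits = []
    for rec in records:
        if name is not None:
            labels = [rec.name, *rec.synonyms]
            if not any(name.lower() == label.lower() for label in labels):
                continue
        if all(
            str(term).lower() in str(rec.citation.get(key, "")).lower()
            for key, term in citation_terms.items()
        ):
            hits.append(rec)
    return hits


# Made-up sample records, for illustration only.
RECORDS = [
    LSPRecord("RD-A", ["LSP-1"], "microarray", "query strain X", "Mtb H37Rv",
              {"author": "Author One", "year": 2001,
               "journal": "Journal P", "title": "Placeholder title one"}),
    LSPRecord("RD-B", [], "PCR", "query strain Y", "Mtb H37Rv",
              {"author": "Author Two", "year": 2005,
               "journal": "Journal Q", "title": "Placeholder title two"}),
]

by_synonym = search_lsps(RECORDS, name="lsp-1")
by_year = search_lsps(RECORDS, year=2005)
```

Keeping synonyms next to the original name reflects the point that the same RD may be named differently in different references.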
The LSP Viewer shows the distribution of the selected RDs in the reference genome. All LSPs/RDs described in the same paper are drawn on one line. Mouse over an LSP/RD to see the reference in which it was described.
Figure 1. LSP Viewer of selected LSPs/RDs.
The Multi-genome Comparison Viewer (MCV) allows users to rapidly align and compare mycobacterial genome synteny by selecting an anchor gene of interest. It is helpful for the genome structure and evolutionary analysis of Mycobacterium. Users can select any number of genomes, zoom in or out, and move upstream or downstream along the genome in the viewer. Genes shown in the same color in MCV are predicted homologs by COG designation, while grey means no homolog was detected. Various properties of a gene, such as being a virulence factor, a pseudogene, or part of an operon or a polymorphic region, are marked with different legend symbols, as follows:
common gene in an operon
virulence gene in an operon
pseudogene in an operon
common gene in a polymorphic region (e.g. RD)
virulence gene in a polymorphic region (e.g. RD)
pseudogene in a polymorphic region (e.g. RD)
common gene in an operon and polymorphic region (e.g. RD)
virulence gene in an operon and polymorphic region (e.g. RD)
pseudogene in an operon and polymorphic region (e.g. RD)
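The nine legend entries above are the combinations of three gene kinds with three genomic contexts. A small sketch of that mapping follows; the function name `legend_label` and its arguments are hypothetical, and the "(e.g. RD)" suffix is left out of the returned text for brevity.

```python
def legend_label(kind, in_operon, in_polymorphic_region):
    """Return the legend text for a gene, or None if no entry applies.

    kind: "common", "virulence" or "pseudogene"
    in_operon / in_polymorphic_region: booleans describing the gene's context
    """
    nouns = {
        "common": "common gene",
        "virulence": "virulence gene",
        "pseudogene": "pseudogene",
    }
    if kind not in nouns:
        raise ValueError(f"unknown gene kind: {kind!r}")

    if in_operon and in_polymorphic_region:
        context = "an operon and polymorphic region"
    elif in_operon:
        context = "an operon"
    elif in_polymorphic_region:
        context = "a polymorphic region"
    else:
        return None  # the list above has no entry for genes outside both contexts

    return f"{nouns[kind]} in {context}"


# Enumerate all nine legend entries.
ALL_LABELS = [
    legend_label(kind, in_operon, in_region)
    for in_operon, in_region in [(True, False), (False, True), (True, True)]
    for kind in ("common", "virulence", "pseudogene")
]
```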
By clicking a gene, users can either re-anchor the viewer on that gene or go to its detailed gene information page. Mousing over a gene displays its function description. Hold "CTRL" to select multiple genomes.
Figure 2. A comparison picture of the trp operon among different mycobacterial genomes generated by MCV.
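The anchoring and color-coding idea behind MCV can be sketched as follows. This is a simplified illustration under stated assumptions, not the MCV code: genomes are treated as linear ordered lists of `(gene_id, cog)` pairs, strand and circularity are ignored, a COG counts as "homologous" when it appears in at least two of the displayed windows, and all names (`anchored_windows`, `color_by_cog`, the gene and COG identifiers) are hypothetical.

```python
def anchored_windows(genomes, anchor_cog, flank=2):
    """Center each genome on its gene carrying anchor_cog.

    genomes: dict mapping genome name -> ordered list of (gene_id, cog or None)
    Returns a dict mapping genome name -> list of 2*flank+1 slots,
    where a slot is a (gene_id, cog) pair or None past the end of the list.
    Genomes without a gene in anchor_cog are left out.
    """
    windows = {}
    for name, genes in genomes.items():
        idx = next((i for i, (_, cog) in enumerate(genes) if cog == anchor_cog), None)
        if idx is None:
            continue
        row = []
        for offset in range(-flank, flank + 1):
            pos = idx + offset
            row.append(genes[pos] if 0 <= pos < len(genes) else None)
        windows[name] = row
    return windows


def color_by_cog(windows, palette):
    """Give one color per COG shared by two or more windows, grey otherwise."""
    counts = {}
    for row in windows.values():
        for cog in {gene[1] for gene in row if gene and gene[1]}:
            counts[cog] = counts.get(cog, 0) + 1
    shared = sorted(cog for cog, n in counts.items() if n >= 2)
    colors = {cog: palette[i % len(palette)] for i, cog in enumerate(shared)}
    return {
        name: [None if gene is None else colors.get(gene[1], "grey") for gene in row]
        for name, row in windows.items()
    }


# Placeholder genomes with made-up gene and COG identifiers.
genomes = {
    "G1": [("g1a", "COG_X"), ("g1b", "COG_A"), ("g1c", "COG_Y")],
    "G2": [("g2a", "COG_A"), ("g2b", "COG_Y"), ("g2c", None)],
}
windows = anchored_windows(genomes, "COG_A", flank=1)
colors = color_by_cog(windows, ["red", "blue"])
```

Because every row places its anchor gene in the middle slot, conserved neighbors line up column by column, which is what makes synteny easy to compare by eye.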
The Genome Viewer is built with CGView. It shows not only gene information but also LSP, operon, RNA, pseudogene and virulence factor information. Users can visualize a particular segment of a genome by zooming in or out, rotating forward or in reverse, and specifying the start and end positions. On the zoomed-in page, users can go to the gene, RD and operon pages for detailed information (Figure 3).
Figure 3. Genome Viewer near RD1 region in Mycobacterium tuberculosis H37Rv.
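Selecting a segment by start and end position on a circular map has one subtlety: the window may cross the origin, so the start can be larger than the end. The sketch below illustrates only that coordinate arithmetic. It is not CGView's API; the function names, the 1-based coordinate convention, the genome length and the features are assumptions made for the example, and features themselves are assumed not to span the origin.

```python
def in_window(pos, start, end):
    """True if a 1-based position lies in the window on a circular genome."""
    if start <= end:
        return start <= pos <= end
    return pos >= start or pos <= end  # window wraps past the origin


def rotate(start, end, shift, genome_length):
    """Shift a window by `shift` bases (negative = reverse), wrapping around."""
    new_start = (start - 1 + shift) % genome_length + 1
    new_end = (end - 1 + shift) % genome_length + 1
    return new_start, new_end


def features_in_window(features, start, end):
    """Names of (name, f_start, f_end) features overlapping the window."""
    hits = []
    for name, f_start, f_end in features:
        overlaps = (
            in_window(f_start, start, end)
            or in_window(f_end, start, end)
            or f_start <= start <= f_end  # feature contains the window start
        )
        if overlaps:
            hits.append(name)
    return hits


# Made-up genome length and features, for illustration only.
GENOME_LENGTH = 1000
FEATURES = [("featA", 10, 50), ("featB", 400, 600), ("featC", 980, 995)]

window = rotate(1, 100, -10, GENOME_LENGTH)  # rotate 10 bases in reverse
visible = features_in_window(FEATURES, *window)
```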