Author: Natasha Glover
Last updated Aug 2018
You are interested in studying gene families. The exercises below each use a different way to access OMA: the browser at http://omabrowser.org, OMA Standalone, and programmatic access via the API.
You ran a network analysis and found that the human gene with UniProt ID OR2L5_HUMAN is involved in an interesting pathway. Search for this gene on the OMA homepage.
You recently read about carbohydrate-active enzymes (CAZymes) that are potentially involved in degradation of polysaccharides in A. bisporus when grown on compost. You want to know if this gene is conserved in Penicillium. Here is the protein sequence:
MLFKLASTVFLAQFFALTSAQTISGPFDCLPAGNSYTLCQNLWGRTSGVGSQSSTLVGSSGDSVSW
STNWNWQNNQNSVKSYANIIADNAMGKQLSAVTSAPTSWSWSYETKSDPIRANVAYDLWLGASPVG
APASRNSSYEIMVWLSRQGGIQPIGGPTASGIQLAGNTWTLWSGPNSNWQVLSFVSDTGDIPNFNA
DFKEFFDYLVQNSGVSSQQYVQAIQAGEPFTGSANLVTHSYSVALN
Search for this protein on the OMA website.
Tip: search for the most similar sequence to this gene by selecting “Protein Sequence” from the drop-down menu and pasting part or all of the sequence.
You want to know how well the mushroom genome is conserved relative to the Ascomycota clade. Perform the following tasks and answer the following questions.
Export the following genomes from omabrowser.org: Schizosaccharomyces pombe (strain 972 / ATCC 24843), Saccharomyces cerevisiae (strain ATCC 204508 / S288c), Agaricus bisporus.
Tips:
- From OMA home page, click Download → Export All/All
- Once on the export page, click on the (?) for more information on how to export genomes.
- Create a working directory and copy the downloaded .tgz file into it
- Uncompress the file: tar -zxvf AllAllExport953417362.tgz
Examine the contents of the archive. We have exported the precomputed all-by-all for our 3 fungal genomes in order to save time when running OMA standalone. The genomes are stored in the DB/ folder.
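One way to examine the archive without unpacking it is to list its top-level entries. The helper below is only a sketch: the function name list_top_level is made up for this exercise, and the archive name in the usage line is the one from the tip above (yours may differ).

```shell
# List the distinct top-level entries of a gzip-compressed tar archive.
# (list_top_level is a hypothetical helper name, not part of OMA.)
list_top_level() {
    tar -tzf "$1" |          # print member paths without extracting
        sed 's|^\./||' |     # drop a leading "./" if the archive uses one
        cut -d / -f 1 |      # keep only the first path component
        grep -v '^$' |       # skip the empty entry left by "./" itself
        LC_ALL=C sort -u     # one line per distinct entry
}

# Example (archive name taken from the tip above):
# list_top_level AllAllExport953417362.tgz
```

The output should include the DB folder mentioned above, alongside whatever else the export contains.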
Now we want to add our own, newly sequenced genome. (For demonstration purposes, this genome is reduced to cut down on computation time.) Add the following dummy fungal genome to your dataset: mygenome.fa
Tip: download and copy the genome to the DB folder
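The copy step can be sketched as a small shell function that also counts the FASTA records as a sanity check. This is an illustration under assumptions: add_genome is a hypothetical helper, and the DB default simply mirrors the folder named in the tip.

```shell
# Copy a genome FASTA file into the DB folder and print how many
# sequences it contains. (add_genome is a hypothetical helper name.)
add_genome() {
    fasta=$1
    db_dir=${2:-DB}          # assumed default: the DB folder of the export

    [ -f "$fasta" ] || { echo "missing file: $fasta" >&2; return 1; }
    mkdir -p "$db_dir"
    cp "$fasta" "$db_dir"/

    # Every FASTA record starts with a '>' header line.
    grep -c '^>' "$db_dir/$(basename "$fasta")"
}

# Example, run from the directory that contains DB/:
# add_genome mygenome.fa
```

If the printed count is zero, the file is probably not in FASTA format or the download was incomplete.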
Now run OMA standalone on the 4 fungal genomes.
Tip: although it may be more convenient to install OMA standalone on your system, you don’t have to: simply navigate inside the OMA.2.2.0 folder, then run the following:
bin/oma
The Output folder has now been created; take a look at its contents. OMA has estimated a rough species tree from the orthologous groups, using a distance-based method. (If you know the phylogeny of the species, you can provide a predefined species tree in the parameters file.)
Examine the pairwise orthologs:
grep -v "#" [file] | cut -f 5 | sort | uniq -c
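The same tally can be wrapped in a function so it is easy to repeat for every genome pair. This is a sketch rather than an official OMA tool: tally_relation_types is a made-up name, the directory in the usage comment is an assumption about the Output layout, and unlike the one-liner above it skips only lines that begin with "#".

```shell
# Count how often each value occurs in the fifth tab-separated column
# of a pairwise orthologs file, ignoring comment lines.
# (tally_relation_types is a hypothetical helper name.)
tally_relation_types() {
    awk -F '\t' '
        !/^#/ && NF >= 5 { n[$5]++ }                 # skip comments and short lines
        END { for (t in n) print n[t] "\t" t }       # print count, then value
    ' "$1" | LC_ALL=C sort -k2,2
}

# Example loop; the directory name is an assumption, adjust it to your Output folder:
# for f in Output/PairwiseOrthologs/*.txt; do
#     echo "== $f"
#     tally_relation_types "$f"
# done
```

Each output line gives a count followed by the column value, which makes it easy to compare genome pairs side by side.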
If you want to retrieve information from the OMA browser in R, you can also use the OmaDB package.
Install the OmaDB Bioconductor package and load it.