Multiple alignments are central to analyzing gene and protein sequence patterns and functional homologies through the construction of HMM profiles, consensus sequences and sequence logos. GPRO constructs sequence logos from both gapped and ungapped alignments using CheckAlign (Muñoz-Pomer et al. 2008), a logo-maker implementation that follows the Information Theory-based methodology introduced by Schneider et al. (Schneider et al. 1986; Schneider and Stephens 1990).
Following the menu path Alignment analysis > Sequence Logos opens a GUI (Figure 9.1) with a box for pasting your multiple alignment in FASTA format (you can also upload the alignment from a file). Then select whether your alignment is based on DNA or protein sequences and choose a method for constructing the logo. If you want a logo that is significant under Information Theory, select the Schneider method, which offers three options for applying corrections to alignments with a small number of aligned sequences. If you are only interested in visualizing the most prominent consensus common to your aligned sequences, you can try a logo approximation based on relative frequency analysis. This approximation is not particularly significant under Information Theory, but it may give you some clues for further analyses if the Shannon approach fails because of the high divergence of the aligned sequences.
Figure 9.1. Screenshot of the GPRO logo maker tool. Upload or paste a multiple alignment, then click the create logos button (circled) to create a logo representation (shown within a square).
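To make the difference between the two approaches more concrete, the following is a minimal Python sketch, not the CheckAlign implementation. It computes the per-column information content in bits (Shannon approach) and the plain relative frequencies for a FASTA alignment. The function names are hypothetical, and the small-sample term `(s - 1) / (2 ln 2 · n)` is the commonly used approximation of the Schneider correction; it is assumed here for illustration and may not match any of the three correction options offered by GPRO.

```python
import math
from collections import Counter

GAP_CHARS = "-."


def parse_fasta(text):
    """Return the aligned sequences found in a FASTA-formatted string."""
    seqs, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current:
                seqs.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        seqs.append("".join(current))
    return seqs


def column_counts(seqs):
    """Count residues per alignment column, ignoring gap characters."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(s[i] for s in seqs if s[i] not in GAP_CHARS)
            for i in range(length)]


def column_frequencies(seqs):
    """Relative frequency of each residue per column (frequency-based logo)."""
    result = []
    for counts in column_counts(seqs):
        n = sum(counts.values())
        result.append({r: c / n for r, c in counts.items()} if n else {})
    return result


def column_information(seqs, alphabet_size=4, correct=True):
    """Information content per column in bits (4 symbols for DNA, 20 for protein)."""
    info = []
    for counts in column_counts(seqs):
        n = sum(counts.values())
        if n == 0:
            info.append(0.0)
            continue
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        # Assumed approximate small-sample correction; grows as n shrinks.
        e_n = (alphabet_size - 1) / (2 * math.log(2) * n) if correct else 0.0
        info.append(max(0.0, math.log2(alphabet_size) - entropy - e_n))
    return info
```

In a logo, the height of each column would be its information content, and each residue would be drawn with a height proportional to its relative frequency within that column. With few aligned sequences the correction term lowers every column, which is why a divergent or small alignment can yield an almost flat logo under the Shannon approach.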
Hidden Markov Model (HMM) profiles (Eddy 1998) are probabilistic models capable of capturing specific information about the consensus of a set of aligned sequences. In this regard, the most representative tool to our knowledge is the HMMER package created by Sean Eddy. GPRO implements a GUI for constructing HMMs and Majority Rule Consensus (MRC) sequences from a multiple alignment input by running HMMER.
Figure 9.2 shows a screenshot of the GPRO GUI for running a HMMER-based server installed on the remote computing cluster. Note that you must be connected to the internet to run HMMER via GPRO. The procedure is similar to that previously described for running all other GPRO pipelines.
Figure 9.2. Creating HMM profiles and consensus sequences using a HMMER server via GPRO.
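As a rough illustration of what a majority rule consensus represents, the sketch below derives one directly from the aligned sequences. This is a simplified reading assumed for illustration only: the function name, the 50% threshold, the `X` placeholder and the rule of dropping columns that are mostly gaps are all hypothetical, and the consensus that HMMER reports through GPRO is derived from the profile and may differ.

```python
from collections import Counter

GAP_CHARS = "-."


def majority_rule_consensus(seqs, threshold=0.5, unknown="X"):
    """Build a simple majority-rule consensus from aligned sequences.

    Columns where at most half of the sequences carry a residue are skipped.
    A residue is reported only if its share among the non-gap residues of the
    column exceeds `threshold`; otherwise `unknown` is emitted.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for i in range(length):
        column = [s[i].upper() for s in seqs]
        residues = [c for c in column if c not in GAP_CHARS]
        if len(residues) <= len(column) / 2:
            continue  # mostly gaps: treat as an insertion and omit the column
        residue, count = Counter(residues).most_common(1)[0]
        consensus.append(residue if count / len(residues) > threshold else unknown)
    return "".join(consensus)


if __name__ == "__main__":
    alignment = ["ACGT-", "ACGA-", "ACCAT", "A-TA-"]
    print(majority_rule_consensus(alignment))
```

Unlike this column-by-column count, an HMM profile also models insertions, deletions and residue probabilities at each position, which is what makes it suitable for searching for remote homologs.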
Llorens, C., Futami, R., Covelli, L., Dominguez-Escriba, L., Viu, J.M., Tamarit, D., Aguilar-Rodriguez, J., Vicente-Ripolles, M., Fuster, G., Bernet, G.P., Maumus, F., Munoz-Pomer, A., Sempere, J.M., LaTorre, A., Moya, A. (2011) The Gypsy Database (GyDB) of Mobile Genetic Elements: Release 2.0. Nucleic Acids Research (NARESE) 39 (suppl 1): D70-D74. doi: 10.1093/nar/gkq1061