CAARD:Pepsins A1a d2

Revision as of 10:45, 8 May 2009 by imported>Gydbwiki
Welcome to the CAARD
The Clan AA Reference Database (CAARD) is an in-progress database developed to investigate the major consensus and the phylogeny of clan AA, and to typify its different protein families according to prior (and further) estimations of their relationships using phylogenies, ancestral reconstructions, sequence logos and HMMs.

Ancestral Maximum Likelihood (ML) Joint Reconstruction

Comments

Ancestral ML reconstructions were performed using FastML 2.02 (Pupko et al. 2000). The tool generates six outputs:

  • Pin; reconstructed ancestral NJ tree in Newick format.
  • Fin; parental relationship among reconstructed nodes and contemporary sequences.
  • Jrof; multiple alignment of contemporary sequences and nodes reconstructed with the Joint method.
  • Jpf; joint probability per position and total log likelihood.
  • Mrof; multiple alignment of contemporary sequences and nodes reconstructed with the Marginal method.
  • Mpf; marginal probability per position and total log likelihood.

Pin output : Ancestral tree

((CatDchi2:0.124412,CatDonc2:0.082390)N2:0.053454,(CatDceg2:0.248584,(NapAmo2:0.368014,((CatDdro2:0.191392,CatDschi2:0.321620) N6:0.074849,(((Renrat2:0.220135,Rencal2:0.227798)N9:0.224064,(Eimpeim2:0.451561,Plasmep1pl:0.675691)N10:0.047604)N8:0.085498, ((Sccpepcan2:0.426127,(Gastmac2:0.427370,(Mucorhi2:0.784918,((SAP7can2:0.674817,Cparapcan2:0.609072)N16:0.193456, ((Memapsin1h:0.347342,Memapsin2h:0.251921)N18:0.495260,(Sypepsyn2:0.560354,(Enpepcry2:0.352425,Ppeppen2:0.254676)N20:0.205481) N19:0.158248)N17:0.068820)N15:0.012393)N14:0.116440)N13:0.028146)N12:0.089438,(CatExen2:0.270432,(PepArin2:0.227019, (Pepgal2:0.337171,PepFequ2:0.414174)N23:0.021412)N22:0.070832)N21:0.044379)N11:0.027323)N7:0.061125)N5:0.043160)N4:0.007450) N3:0.032229,(CatDgal2:0.208863,(CatDhum2:0.146380,CatDhyn2:0.119802)N25:0.028346)N24:0.007137);

Fin output: Parental relationship among reconstructed nodes and OTUs

Name        Father  Distance to father  Sons
CatDchi2    N2      0.124412            -
CatDonc2    N2      0.0823897           -
CatDceg2    N3      0.248584            -
NapAmo2     N4      0.368014            -
CatDdro2    N6      0.191392            -
CatDschi2   N6      0.32162             -
Renrat2     N9      0.220135            -
Rencal2     N9      0.227798            -
Eimpeim2    N10     0.451561            -
Plasmep1pl  N10     0.675691            -
Sccpepcan2  N12     0.426127            -
Gastmac2    N13     0.42737             -
Mucorhi2    N14     0.784918            -
SAP7can2    N16     0.674817            -
Cparapcan2  N16     0.609072            -
Memapsin1h  N18     0.347342            -
Memapsin2h  N18     0.251921            -
Sypepsyn2   N19     0.560354            -
Enpepcry2   N20     0.352425            -
Ppeppen2    N20     0.254676            -
CatExen2    N21     0.270432            -
PepArin2    N22     0.227019            -
Pepgal2     N23     0.337171            -
PepFequ2    N23     0.414174            -
CatDgal2    N24     0.208863            -
CatDhum2    N25     0.14638             -
CatDhyn2    N25     0.119802            -
N1          root!   -                   N2 N3 N24
N2          N1      0.0534545           CatDchi2 CatDonc2
N3          N1      0.0322288           CatDceg2 N4
N4          N3      0.00745046          NapAmo2 N5
N5          N4      0.0431602           N6 N7
N6          N5      0.0748491           CatDdro2 CatDschi2
N7          N5      0.0611249           N8 N11
N8          N7      0.0854978           N9 N10
N9          N8      0.224064            Renrat2 Rencal2
N10         N8      0.0476035           Eimpeim2 Plasmep1pl
N11         N7      0.0273227           N12 N21
N12         N11     0.0894381           Sccpepcan2 N13
N13         N12     0.028146            Gastmac2 N14
N14         N13     0.11644             Mucorhi2 N15
N15         N14     0.0123925           N16 N17
N16         N15     0.193456            SAP7can2 Cparapcan2
N17         N15     0.0688203           N18 N19
N18         N17     0.49526             Memapsin1h Memapsin2h
N19         N17     0.158248            Sypepsyn2 N20
N20         N19     0.205481            Enpepcry2 Ppeppen2
N21         N11     0.0443787           CatExen2 N22
N22         N21     0.0708323           PepArin2 N23
N23         N22     0.021412            Pepgal2 PepFequ2
N24         N1      0.00713701          CatDgal2 N25
N25         N24     0.0283457           CatDhum2 CatDhyn2
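
The Fin table carries the same information as the Pin tree: each node's father, the branch length to that father, and the node's sons. As an illustration only, the sketch below derives such a table from a Newick string with labelled internal nodes. The function name `parent_table` and the `root_name` parameter are hypothetical; the unlabelled root is called N1 only to match the table above, and the input is a pruned excerpt of the Pin tree (the N3 subtree is omitted), not the full tree. This is not how FastML itself produces the Fin file.

```python
def parent_table(newick, root_name="N1"):
    """Map each node to (father, distance_to_father, sons) from a Newick string.

    Assumes every internal node except the root carries a label such as N2.
    """
    text = "".join(newick.split()).rstrip(";")  # drop whitespace and the final ";"
    pos = 0

    def read_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(read_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(read_node())
            pos += 1  # skip ")"
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        label, _, length = text[start:pos].partition(":")
        return label, (float(length) if length else None), children

    rows = {}

    def walk(node, father):
        label, length, children = node
        label = label or root_name  # only the root is expected to be unlabelled
        rows[label] = (father, length, [child[0] for child in children])
        for child in children:
            walk(child, label)

    walk(read_node(), None)
    return rows


# Pruned excerpt of the Pin tree (clades N2 and N24 only), for illustration.
excerpt = ("((CatDchi2:0.124412,CatDonc2:0.082390)N2:0.053454,"
           "(CatDgal2:0.208863,(CatDhum2:0.146380,CatDhyn2:0.119802)"
           "N25:0.028346)N24:0.007137);")

for name, (father, distance, sons) in parent_table(excerpt).items():
    print(name, father or "root!", distance, " ".join(sons) or "-")
```

The branch lengths come out rounded as in the Pin tree, so they can differ from the Fin table in the last digits.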

Jpf output: Joint probability per amino acid position and total Log Likelihood

Position  Joint probability  Position  Joint probability  Position  Joint probability  Position  Joint probability
0         4.94094e-017       37        4.31185e-005       74        3.51464e-035       111       5.33244e-016
1         5.57941e-007       38        0.000120145        75        1.13017e-044       112       2.75464e-010
2         8.38563e-012       39        1.06592e-016       76        1.63349e-024       113       3.30375e-033
3         4.85499e-005       40        9.80352e-028       77        4.58615e-030       114       2.87831e-027
4         3.1369e-010        41        2.93883e-017       78        2.05358e-005       115       1.33256e-021
5         5.07064e-011       42        1.15015e-031       79        1.37119e-015       116       6.8233e-011
6         4.62614e-018       43        2.60512e-018       80        2.56005e-022       117       0.0013376
7         8.04008e-018       44        2.00401e-032       81        2.44994e-027       118       3.50171e-006
8         1.60938e-028       45        3.0187e-005        82        2.2556e-034        119       7.00563e-019
9         8.32192e-016       46        1.75929e-026       83        4.4112e-031        120       4.6136e-025
10        1.6694e-012        47        3.16716e-005       84        6.37787e-012       121       5.21357e-021
11        5.70622e-035       48        2.12209e-005       85        0.00064118         122       7.36872e-013
12        3.69498e-032       49        0.000557609        86        2.26245e-025       123       1.39168e-012
13        1.18278e-029       50        3.02287e-005       87        1.04723e-028       124       4.85499e-005
14        5.21348e-023       51        0.000168844        88        3.68479e-023       125       0.00306766
15        0.000372035        52        1.90925e-032       89        1.13955e-034       126       6.25802e-013
16        2.17025e-011       53        4.10285e-030       90        3.48358e-022       127       1.33452e-020
17        2.63053e-039       54        9.49964e-028       91        1.09003e-010       128       1.13531e-010
18        2.46171e-035       55        1.61502e-030       92        3.99096e-013       129       1.10816e-017
19        5.34598e-029       56        2.5761e-016        93        3.19415e-020       130       2.3978e-014
20        2.09935e-026       57        1.13087e-007       94        6.13553e-040       131       4.70512e-031
21        3.60249e-032       58        6.16256e-005       95        9.94857e-014       132       2.63348e-017
22        0.0003113          59        8.61014e-010       96        1.42392e-024       133       1.49577e-013
23        2.13297e-030       60        7.12792e-037       97        1.91691e-020       134       1.04752e-019
24        5.72777e-016       61        7.323e-021         98        4.54469e-028       135       2.0358e-016
25        4.4986e-015        62        2.33191e-026       99        3.0187e-005        136       2.39004e-009
26        1.86447e-013       63        6.04569e-017       100       3.32302e-025       137       6.84931e-010
27        5.62491e-005       64        2.44523e-039       101       1.59141e-014       138       0.0142937
28        0.000105914        65        2.7881e-025        102       2.16154e-012       139       9.47552e-025
29        7.82516e-013       66        1.57491e-023       103       2.4206e-016        140       6.0131e-028
30        1.68331e-037       67        1.69989e-009       104       1.0509e-030        141       1.56861e-026
31        5.36881e-027       68        1.84721e-005       105       7.44251e-021       142       3.18695e-022
32        2.78921e-027       69        6.2142e-009        106       1.12762e-022       143       5.13349e-025
33        8.26258e-034       70        1.69934e-006       107       3.46639e-012       144       7.15639e-013
34        4.63722e-030       71        1.69862e-005       108       1.08623e-006       145       6.7068e-009
35        5.85682e-006       72        1.04907e-007       109       1.45573e-005       146       1.55118e-013
36        3.13127e-005       73        1.00911e-007       110       8.21575e-024
Total log likelihood of joint reconstruction: -5979.9
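
The per-position probabilities above are far too small to multiply directly: a product over 147 such values underflows double-precision arithmetic, which is why a total is reported as a log likelihood. The sketch below shows the usual way to combine per-position values, assuming positions are treated as independent so that the total is the sum of their natural logarithms. That assumption, and the helper name `total_log_likelihood`, are ours; this page does not state how FastML derives its reported total, so the sketch is not claimed to reproduce it. Only the first five Jpf entries are used.

```python
import math

# Joint probabilities for positions 0-4, copied from the Jpf table above.
joint_prob = {
    0: 4.94094e-17,
    1: 5.57941e-07,
    2: 8.38563e-12,
    3: 4.85499e-05,
    4: 3.1369e-10,
}


def total_log_likelihood(probabilities):
    """Sum natural logs instead of multiplying tiny probabilities."""
    return sum(math.log(p) for p in probabilities)


partial = total_log_likelihood(joint_prob.values())
print(f"log likelihood over positions 0-4: {partial:.2f}")
```

Working in log space keeps every intermediate value in a safe numeric range, whatever the alignment length.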

Jrof output: Ancestral ML Reconstruction Alignment

There are two methods of ancestral reconstruction: Joint and Marginal. In this section, we provide a multiple alignment including both the input peptidases and the ancestral ML sequences reconstructed using the Joint method. The alignment is available in several formats by clicking the option "Set 1" below. To build HMM profiles and MRC sequences, we removed non-informative amino acid stretches and gaps from several ancestral ML reconstruction analyses. You can also retrieve the processed Jrof output by clicking the option "Set 2" below. Note, however, that if you cannot select "Set 2", it is because the output was not processed. <align id="pepsins_a1a_d2" folder="jrof"></align>

Mpf output: Marginal probability per amino acid position and total Log Likelihood

Position  Marginal probability  Position  Marginal probability  Position  Marginal probability  Position  Marginal probability
0         1.43403e-016          37        0.000209369           74        1.66248e-033          111       2.55966e-015
1         5.69315e-007          38        0.0011179             75        2.15894e-042          112       5.86543e-009
2         1.01213e-011          39        5.273e-016            76        1.27158e-023          113       6.3741e-032
3         4.86247e-005          40        1.0718e-025           77        1.05031e-029          114       9.42056e-027
4         3.52034e-010          41        7.42972e-017          78        0.000486728           115       2.596e-020
5         7.13239e-011          42        2.93695e-031          79        4.61461e-015          116       4.6494e-010
6         5.62984e-018          43        1.37577e-017          80        1.1481e-021           117       0.0307901
7         1.13039e-016          44        2.52127e-031          81        7.60626e-026          118       0.000120002
8         5.3983e-027           45        3.01897e-005          82        1.21371e-032          119       7.5057e-018
9         1.07048e-015          46        3.026e-026            83        2.78912e-029          120       1.29241e-023
10        2.26856e-012          47        0.000652008           84        1.3444e-011           121       2.66532e-020
11        3.87047e-033          48        0.00130118            85        0.00479164            122       1.60138e-012
12        2.90212e-031          49        0.00387182            86        7.19269e-024          123       4.63714e-012
13        2.02869e-028          50        0.000773129           87        5.70947e-028          124       4.86247e-005
14        5.82542e-022          51        0.00315376            88        2.17718e-022          125       0.051544
15        0.068765              52        9.97463e-030          89        2.38317e-033          126       7.62804e-013
16        1.46816e-009          53        1.19872e-028          90        2.36317e-021          127       1.00179e-019
17        1.20194e-037          54        5.06038e-027          91        1.91207e-009          128       1.94684e-010
18        4.52193e-033          55        3.39022e-029          92        2.28514e-011          129       1.83492e-017
19        9.06587e-028          56        3.46636e-016          93        9.53966e-019          130       5.55369e-014
20        2.71969e-025          57        6.74451e-006          94        8.78935e-038          131       4.36333e-030
21        5.57062e-031          58        0.000711655           95        2.95601e-013          132       2.52605e-016
22        0.00780915            59        1.13836e-009          96        4.52399e-023          133       1.87238e-013
23        3.47568e-029          60        2.30963e-034          97        5.83155e-019          134       9.8498e-019
24        2.01026e-014          61        4.70893e-020          98        4.71229e-027          135       3.48592e-016
25        1.22459e-014          62        1.62484e-025          99        3.01897e-005          136       4.35359e-009
26        2.68186e-013          63        1.13445e-016          100       1.2267e-023           137       9.26567e-010
27        0.00110595            64        1.20312e-037          101       2.40957e-013          138       0.073152
28        0.000351207           65        1.2696e-024           102       1.27481e-011          139       8.19025e-024
29        2.73261e-011          66        3.06999e-022          103       1.05068e-015          140       7.5968e-027
30        3.75556e-036          67        2.07178e-009          104       2.42764e-029          141       3.38358e-025
31        2.22656e-025          68        0.000535229           105       1.33533e-019          142       6.15789e-022
32        2.10844e-026          69        3.86228e-007          106       2.77548e-021          143       5.17223e-024
33        8.973e-032            70        5.46968e-005          107       3.49388e-011          144       4.99939e-012
34        1.19482e-028          71        0.000633539           108       0.000133871           145       1.17126e-008
35        5.48105e-005          72        1.03907e-006          109       0.000437352           146       2.55118e-013
36        0.000343088           73        1.43079e-006          110       1.83318e-023
Total log likelihood of marginal reconstruction: -5641.19

Mrof output: Ancestral ML Reconstruction Alignment

There are two methods of ancestral reconstruction: Joint and Marginal. In this section, we provide a multiple alignment including both the input peptidases and the ancestral ML sequences reconstructed using the Marginal method. The alignment is available in several formats by clicking the option "Set 1" below. To build HMM profiles and MRC sequences, we removed non-informative amino acid stretches and gaps from several ancestral ML reconstruction analyses. You can also retrieve the processed Mrof output by clicking the option "Set 2" below. Note, however, that if you cannot select "Set 2", it is because the output was not processed. <align id="pepsins_a1a_d2" folder="mrof"></align>

Models

Sequence logo constructed from the processed Jrof alignment. At every position, each residue is shown as a letter whose height is proportional to its frequency multiplied by the information content of that position, measured in bits. Letters are stacked so that the most common is at the top.

  • Basic residues are represented in red
  • Hydrophobic residues in black
  • Amino acids frequent in β-turns (G and P) in dark grey
  • Small nucleophiles in violet
  • Acidic residues in orange
  • Acidic-relative amides in green

The logo was constructed using the ChekAlign server with Shannon's algorithm (Shannon 1997) and the options "include gaps" and "Correction factor". Gaps are not represented by any symbol but occupy a blank space proportional to their frequency, always placed at the top for aesthetic reasons. Because the alignment gap is considered to be another state or amino acid species, there are 21 states and the maximum entropy is log2(21), about 4.39 bits.
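
As a rough illustration of the letter heights described above, the following sketch computes the information content of one alignment column with the gap counted as a 21st state, then scales each state's frequency by it. The function name `logo_column` and the toy columns are hypothetical. The "Correction factor" option is not modelled, so these are uncorrected values and need not match the server's output.

```python
import math
from collections import Counter


def logo_column(column, n_states=21):
    """Return (information in bits, {state: height}) for one alignment column.

    The gap character "-" is counted as a state, so the maximum entropy is
    log2(21) when n_states is 21. No small-sample correction is applied.
    """
    counts = Counter(column)
    total = len(column)
    freqs = {state: count / total for state, count in counts.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    information = math.log2(n_states) - entropy
    heights = {state: f * information for state, f in freqs.items()}
    return information, heights


# Toy columns: one fully conserved, one mixed and containing a gap.
conserved_info, conserved_heights = logo_column("DDDD")
mixed_info, mixed_heights = logo_column("DDE-")
print(round(conserved_info, 3), conserved_heights)
print(round(mixed_info, 3), mixed_heights)
```

A fully conserved column reaches the maximum of log2(21) bits, while the mixed column loses 1.5 bits to entropy, and the gap's share of the remaining height corresponds to the blank space described above.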

HMMs

>AP_pepsins_a1a_d2 profile HMM generated consensus sequence
vDTGTSLitgPssvvqalqkaiGAtessdgeYvvnCskintLPtitFtlgGkqytLpps
dYvlqvsGiClsGfggmDipplwILGDvFlrkyYtVFDrdnnrVGF

Pairwise alignment between the MRC sequence and the DTG/ILG template HMM-profile

         Domain 1 of 1, from 1 to 98: score 40.9, E = 4.8e-13

DTG_ILG template *->vDTGAsvlsviskecklaqklgltrkkafdp..SS...Y.v.C..iv
                    vDTG+s+++ +s  + +a++++      +++++SS+++Y v+C++i+
AP_pepsins     1    VDTGTSLITGPSSVV-QALQKA------IGAteSSdgeYvVnCskIN 40   

                 tllsysqPssktsttaqdtirgagGqskiyvSklktsgqirknllslvti
                      ++P+  ++      + + gG  k+y    + +++       ++ +
AP_pepsins    41 -----TLPT--IT------F-TLGG--KQY----TLPPS-------DYVL 63   

                 kitkGnvTevenrslpsdgvflvv.tdpedqksrydvILGrldfLrqlns
                 ++  G       +  +    f    ++p      + +ILG+  fLr++++
AP_pepsins    64 QVS-G------ICLSG----F-GGmDIP------PLWILGD-VFLRKYYT 94   

                 vhidl<-*
                 v +d+   
AP_pepsins    95 V-FDR    98   

Cite this site

Llorens, C., Futami, R., Renaud, G. and Moya, A. (2009). Bioinformatic Flowchart and Database to Investigate the Diversity of Clan AA Peptidases. Biology Direct, 4:3.




