CAARD:Pepsins A1a d1

Welcome to the CAARD
The Clan AA Reference Database (CAARD) is an in-progress database developed to investigate the major consensus and the phylogeny of clan AA, and to typify its protein families according to prior (and further) estimations of their relationships using phylogenies, ancestral reconstructions, sequence logos and HMMs.

Ancestral Maximum Likelihood (ML) Joint Reconstruction

Comments

Ancestral ML reconstructions were performed using FastML 2.02 (Pupko et al. 2000). The tool generates six outputs:

  • Pin; reconstructed ancestral NJ tree in Newick format.
  • Fin; parental relationship among reconstructed nodes and contemporary sequences.
  • Jrof; multiple alignment of contemporary sequences and nodes reconstructed with the Joint method.
  • Jpf; joint probability per position and total log likelihood.
  • Mrof; multiple alignment of contemporary sequences and nodes reconstructed with the Marginal method.
  • Mpf; marginal probability per position and total log likelihood.

Pin output : Ancestral tree

((Plasmepsin:0.619584,((Mucorhi1:0.588375,Eimpeim1:0.453746)N4:0.071315,((Sypepsyn1:0.330143,(Ppeppen1:0.259579,Enpepcry1:0.237607) N7:0.229824)N6:0.135018,((Memapsin1h:0.315785,Memapsin2h:0.249853)N9:0.402007,(Cparapcan1:0.435582,SAP7can1:0.508581)N10:0.163198) N8:0.054444)N5:0.032440)N3:0.055109)N2:0.058724,((CatDdro1:0.207070,Sccpepcan1:0.326066)N12:0.032808,(CatDceg1:0.338483, (CatDschi1:0.205753,(CatDgal1:0.133100,(CatDhum1:0.267800,(CatDhyn1:0.103214,(CatDonc1:0.039951,CatDchi1:0.050869)N18:0.062134) N17:0.020139)N16:0.012072)N15:0.074766)N14:0.020364)N13:0.023514)N11:0.044815,((CatExen1:0.226556,(Gastmac1:0.308945, (Pepgal1:0.218648,(PepFequ1:0.295050,PepArin1:0.210448)N23:0.037629)N22:0.033798)N21:0.029503)N20:0.111089,(NapAmo1:0.298268, (Renrat1:0.178563,Rencal1:0.219815)N25:0.204093)N24:0.024229)N19:0.006440);
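The father/son relationships listed in the Fin output can be read directly off a Newick string like the one above. The following is a minimal sketch, not the FastML implementation: it assumes every internal node except the root carries a label (as in the Pin output), and the function name `parse_newick` and the default root name `N1` are illustrative choices. The example string is the N6 clade copied from the tree.

```python
import re

def parse_newick(text, root_name="N1"):
    """Map each node to (father, distance to father).

    Assumes every internal node except the root is labelled (N4, N7, ...).
    """
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    lengths, fathers = {}, {}
    stack = []        # one list of child names per open parenthesis
    closed = None     # children of the group closed by the latest ")"
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            closed = stack.pop()
        elif tok == ",":
            continue
        elif tok == ";":
            if closed is not None:   # unlabelled outermost group: the root
                for child in closed:
                    fathers[child] = root_name
                closed = None
        else:
            name, _, length = tok.partition(":")
            if closed is not None:   # this label names the group just closed
                for child in closed:
                    fathers[child] = name
                closed = None
            lengths[name] = float(length) if length else None
            if stack:
                stack[-1].append(name)
    return {child: (father, lengths[child]) for child, father in fathers.items()}

# The N6 clade, copied from the tree above (stray space included).
clade = "(Sypepsyn1:0.330143,(Ppeppen1:0.259579,Enpepcry1:0.237607) N7:0.229824)N6:0.135018;"
for name, (father, dist) in parse_newick(clade).items():
    print(name, father, dist)
```

The regular expression skips whitespace, so the stray spaces that appear before some node labels in the tree do not affect the result.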

Fin output: Parental relationship among reconstructed nodes and OTUs

Name        Father  Distance to father  Sons
Plasmepsin  N2      0.619584            -
Mucorhi1    N4      0.588375            -
Eimpeim1    N4      0.453746            -
Sypepsyn1   N6      0.330143            -
Ppeppen1    N7      0.259579            -
Enpepcry1   N7      0.237607            -
Memapsin1h  N9      0.315785            -
Memapsin2h  N9      0.249853            -
Cparapcan1  N10     0.435582            -
SAP7can1    N10     0.508581            -
CatDdro1    N12     0.20707             -
Sccpepcan1  N12     0.326066            -
CatDceg1    N13     0.338483            -
CatDschi1   N14     0.205753            -
CatDgal1    N15     0.1331              -
CatDhum1    N16     0.2678              -
CatDhyn1    N17     0.103214            -
CatDonc1    N18     0.0399508           -
CatDchi1    N18     0.0508695           -
CatExen1    N20     0.226556            -
Gastmac1    N21     0.308945            -
Pepgal1     N22     0.218648            -
PepFequ1    N23     0.29505             -
PepArin1    N23     0.210448            -
NapAmo1     N24     0.298268            -
Renrat1     N25     0.178563            -
Rencal1     N25     0.219815            -
N1          root!   -                   N2 N11 N19
N2          N1      0.0587244           Plasmepsin N3
N3          N2      0.0551087           N4 N5
N4          N3      0.0713151           Mucorhi1 Eimpeim1
N5          N3      0.03244             N6 N8
N6          N5      0.135018            Sypepsyn1 N7
N7          N6      0.229824            Ppeppen1 Enpepcry1
N8          N5      0.0544444           N9 N10
N9          N8      0.402007            Memapsin1h Memapsin2h
N10         N8      0.163198            Cparapcan1 SAP7can1
N11         N1      0.0448152           N12 N13
N12         N11     0.0328084           CatDdro1 Sccpepcan1
N13         N11     0.0235138           CatDceg1 N14
N14         N13     0.020364            CatDschi1 N15
N15         N14     0.0747663           CatDgal1 N16
N16         N15     0.0120716           CatDhum1 N17
N17         N16     0.0201385           CatDhyn1 N18
N18         N17     0.062134            CatDonc1 CatDchi1
N19         N1      0.00643973          N20 N24
N20         N19     0.111089            CatExen1 N21
N21         N20     0.0295032           Gastmac1 N22
N22         N21     0.0337978           Pepgal1 N23
N23         N22     0.0376293           PepFequ1 PepArin1
N24         N19     0.0242294           NapAmo1 N25
N25         N24     0.204093            Renrat1 Rencal1
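As an illustration of how the Father and Distance to father columns can be used, the sketch below walks from one contemporary sequence up to the root and sums the branch lengths along the way. The dictionary holds only the Fin entries for the CatDhum1 lineage; the names `FIN` and `path_to_root` are hypothetical helpers introduced for this example.

```python
# Father and distance-to-father entries copied from the Fin output
# for the lineage leading from CatDhum1 to the root (N1).
FIN = {
    "CatDhum1": ("N16", 0.2678),
    "N16": ("N15", 0.0120716),
    "N15": ("N14", 0.0747663),
    "N14": ("N13", 0.020364),
    "N13": ("N11", 0.0235138),
    "N11": ("N1", 0.0448152),
}

def path_to_root(node, table):
    """Follow Father links until a node without an entry (the root) is reached."""
    path, total = [node], 0.0
    while node in table:
        father, dist = table[node]
        total += dist
        path.append(father)
        node = father
    return path, total

path, total = path_to_root("CatDhum1", FIN)
print(" -> ".join(path), f"(total branch length {total:.6f})")
```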

Jpf output: Joint probability per amino acid position and total Log Likelihood

Position  Joint probability  Position  Joint probability  Position  Joint probability  Position  Joint probability
0         1.57781e-009       43        3.81446e-039       86        0.00517803         129       3.12784e-013
1         3.50414e-006       44        1.73779e-023       87        0.0215351          130       0.00263569
2         2.48179e-007       45        4.53685e-007       88        1.70663e-022       131       0.00241536
3         0.000163858        46        0.000132969        89        1.25428e-019       132       3.92194e-007
4         1.11117e-007       47        0.0191658          90        2.57776e-023       133       0.00699781
5         1.06101e-011       48        0.00257837         91        5.26452e-042       134       5.27849e-024
6         1.16e-011          49        5.93795e-011       92        7.29433e-024       135       9.3712e-031
7         1.56882e-010       50        1.92363e-025       93        0.000240074        136       1.42885e-009
8         1.83393e-006       51        3.02305e-031       94        2.56137e-037       137       1.75103e-023
9         3.57934e-011       52        0.00418673         95        5.15622e-032       138       2.27454e-021
10        9.34664e-015       53        0.00379987         96        7.35543e-014       139       3.5293e-011
11        1.82275e-011       54        5.33712e-022       97        1.57836e-036       140       2.28218e-006
12        0.00121012         55        5.37481e-029       98        1.09138e-011       141       1.29124e-005
13        0.000209351        56        2.57533e-014       99        6.13381e-017       142       7.76068e-023
14        0.00196933         57        6.11189e-031       100       2.2621e-014        143       5.03096e-035
15        2.24754e-025       58        0.000107549        101       3.93064e-018       144       1.23292e-029
16        1.85793e-025       59        2.73109e-012       102       3.42667e-031       145       2.64246e-021
17        6.01532e-009       60        1.7434e-019        103       0.000138263        146       3.82402e-028
18        0.000275473        61        0.000163858        104       7.78449e-033       147       9.59371e-011
19        0.00359027         62        3.07528e-021       105       5.61524e-022       148       7.76025e-006
20        1.52583e-017       63        1.54296e-006       106       5.48253e-014       149       5.84296e-005
21        2.22854e-020       64        1.10097e-021       107       1.33023e-021       150       1.9744e-014
22        2.89413e-032       65        2.56224e-025       108       2.81822e-031       151       1.2577e-028
23        1.15099e-021       66        2.69949e-006       109       1.14296e-029       152       1.23442e-018
24        8.77868e-025       67        8.72986e-032       110       6.15328e-012       153       1.24454e-019
25        1.46125e-017       68        2.86576e-027       111       3.14941e-029       154       6.5831e-025
26        0.0109094          69        0.00144514         112       6.27328e-010       155       1.78284e-024
27        2.61775e-009       70        2.17415e-019       113       0.00303256         156       5.19879e-033
28        2.04772e-039       71        1.43954e-028       114       0.00199313         157       2.42067e-032
29        2.11629e-033       72        3.50414e-006       115       0.00826312         158       1.88211e-015
30        9.77622e-017       73        3.31617e-023       116       0.00179456         159       3.07712e-023
31        2.18054e-031       74        9.53364e-018       117       0.000324837        160       7.91072e-026
32        2.36159e-027       75        5.20412e-026       118       2.66695e-025       161       1.93661e-025
33        4.65787e-012       76        9.48168e-019       119       2.74398e-019       162       1.36541e-029
34        6.66723e-021       77        0.000374123        120       1.03705e-028       163       5.90665e-033
35        1.07017e-023       78        0.00569751         121       3.77463e-015       164       3.88203e-029
36        3.04273e-031       79        0.00416667         122       8.07189e-010       165       1.67091e-021
37        3.275e-031         80        0.00301618         123       0.000163858        166       1.77373e-007
38        1.11117e-007       81        0.00517803         124       4.36959e-011       167       1.70787e-016
39        1.6086e-014        82        0.00301618         125       1.95391e-012       168       9.4127e-021
40        8.98171e-011       83        0.00301618         126       0.000163858        169       1.50813e-018
41        2.91712e-015       84        0.00517803         127       2.61029e-013       170       1.63551e-015
42        7.52062e-028       85        0.00301618         128       3.66987e-017
Total log likelihood of joint reconstruction: -6213.39
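Per-position probabilities this small cannot be multiplied directly without underflowing double precision, which is one reason the total is reported as a log likelihood. The sketch below assumes the total is the sum of the natural logarithms of the per-position values. That relationship is an assumption made for illustration and is not stated on this page. The function name `total_log_likelihood` is hypothetical, and only the first five Jpf positions are used.

```python
import math

def total_log_likelihood(position_probs):
    """Sum natural logs instead of multiplying raw probabilities,
    which would underflow for an alignment of this length."""
    return sum(math.log(p) for p in position_probs)

# Joint probabilities for positions 0-4, copied from the Jpf output.
first_five = [1.57781e-09, 3.50414e-06, 2.48179e-07, 0.000163858, 1.11117e-07]
partial = total_log_likelihood(first_five)
print(f"log likelihood over positions 0-4: {partial:.2f}")
```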

Jrof output: Ancestral ML Reconstruction Alignment

There are two methods of ancestral reconstruction: Joint and Marginal. In this section, we provide a multiple alignment including both the input peptidases and the ancestral ML sequences reconstructed using the Joint method. The alignment is available in several formats by clicking the option "Set 1" below. To build HMM profiles and MRC sequences, we removed non-informative amino acid stretches and gaps from several ancestral ML reconstruction analyses. You can also retrieve the processed Jrof output by clicking the option "Set 2" below. Note, however, that if you cannot select option 2, it is because the output was not processed. <align id="pepsins_a1a_d1" folder="jrof"></align>

Mpf output: Marginal probability per amino acid position and total Log Likelihood

Position  Marginal probability  Position  Marginal probability  Position  Marginal probability  Position  Marginal probability
0         3.34251e-009          43        3.05371e-037          86        0.076748              129       5.75976e-013
1         3.54939e-006          44        2.42892e-022          87        0.091904              130       0.058565
2         2.50684e-007          45        3.49699e-006          88        1.36361e-021          131       0.068765
3         0.00016402            46        0.00229331            89        2.50311e-019          132       2.8083e-006
4         1.12845e-007          47        0.091904              90        5.49643e-022          133       0.06183
5         1.2665e-011           48        0.058565              91        3.32244e-040          134       2.90441e-022
6         3.62428e-011          49        6.10382e-011          92        5.33894e-023          135       1.09448e-029
7         2.37112e-010          50        5.83566e-024          93        0.00582599            136       1.16641e-008
8         1.92388e-006          51        8.33377e-030          94        8.95451e-037          137       2.22109e-022
9         3.94487e-011          52        0.076748              95        5.03291e-031          138       2.87431e-020
10        1.90098e-013          53        0.066005              96        1.16863e-013          139       4.46269e-010
11        4.00658e-011          54        1.02315e-021          97        1.69357e-035          140       4.30976e-005
12        0.0208031             55        5.71144e-028          98        1.59169e-011          141       0.000243769
13        0.00189978            56        6.90962e-014          99        3.46818e-016          142       1.40991e-021
14        0.042645              57        2.8051e-029           100       3.63896e-013          143       3.48369e-033
15        1.09625e-024          58        0.000107696           101       7.38059e-018          144       2.43725e-028
16        1.82739e-024          59        3.10544e-012          102       9.52361e-030          145       1.0427e-020
17        2.75977e-008          60        4.32435e-019          103       0.0017831             146       1.50499e-027
18        0.00308764            61        0.00016402            104       2.64125e-031          147       5.89056e-010
19        0.066005              62        1.05825e-020          105       3.26134e-021          148       0.000126302
20        7.08355e-016          63        3.13799e-005          106       1.1691e-013           149       0.000939208
21        8.27968e-019          64        4.58474e-021          107       1.2043e-020           150       5.64029e-014
22        6.69124e-031          65        9.41842e-025          108       3.9134e-030           151       3.93207e-028
23        6.81495e-021          66        2.72295e-006          109       3.90017e-028          152       1.44839e-018
24        4.4163e-024           67        3.14928e-030          110       1.15676e-011          153       1.58141e-019
25        4.42521e-016          68        2.34229e-026          111       4.03159e-028          154       2.88988e-024
26        0.0385965             69        0.023826              112       2.65099e-008          155       1.4066e-023
27        8.10582e-009          70        8.86579e-019          113       0.040752              156       2.7059e-032
28        1.76359e-037          71        6.60123e-028          114       0.068765              157       1.31139e-031
29        3.22736e-032          72        3.54939e-006          115       0.050901              158       2.18749e-015
30        1.57764e-016          73        5.48229e-023          116       0.042645              159       1.33166e-022
31        5.87258e-030          74        1.28698e-017          117       0.00777343            160       1.92666e-025
32        3.10825e-026          75        2.20247e-025          118       1.55377e-024          161       5.66728e-024
33        6.37907e-012          76        3.92732e-018          119       8.25521e-018          162       8.35743e-029
34        1.70946e-020          77        0.00109304            120       3.33439e-027          163       3.17694e-032
35        1.4559e-022           78        0.019803              121       1.11701e-014          164       3.34513e-028
36        1.98059e-030          79        0.040752              122       2.25426e-009          165       1.2324e-020
37        1.86749e-030          80        0.068765              123       0.00016402            166       2.24069e-007
38        1.12845e-007          81        0.076748              124       1.12148e-010          167       6.17886e-016
39        4.48316e-014          82        0.068765              125       2.32044e-012          168       2.54455e-020
40        3.00031e-010          83        0.068765              126       0.00016402            169       3.46674e-018
41        1.26268e-014          84        0.076748              127       7.97693e-013          170       2.87298e-015
42        1.34742e-027          85        0.068765              128       1.39808e-016
Total log likelihood of marginal reconstruction: -5892.98

Mrof output: Ancestral ML Reconstruction Alignment

There are two methods of ancestral reconstruction: Joint and Marginal. In this section, we provide a multiple alignment including both the input peptidases and the ancestral ML sequences reconstructed using the Marginal method. The alignment is available in several formats by clicking the option "Set 1" below. To build HMM profiles and MRC sequences, we removed non-informative amino acid stretches and gaps from several ancestral ML reconstruction analyses. You can also retrieve the processed Mrof output by clicking the option "Set 2" below. Note, however, that if you cannot select option 2, it is because the output was not processed. <align id="pepsins_a1a_d1" folder="mrof"></align>

Models

Sequence logo constructed from the input of the processed Jrof alignment. In every position, each residue is a letter whose height is proportional to its frequency multiplied by the information content of each position measured in bits. Letters are placed such that the most common is at the top.

  • Basic residues are represented in red
  • Hydrophobic residues in black
  • Amino acids frequent in β-turns (G and P) in dark grey
  • Small nucleophiles in violet
  • Acidic residues in orange
  • Acidic-relative amides in green

The logo was constructed using the CheckAlign server with Shannon's algorithm (Shannon 1997) and the options "include gaps" and "Correction factor". Gaps are not represented by any symbol but occupy a blank space, also proportional to their frequency and, for aesthetic reasons, always placed at the top. Maximum entropy is log2(21), because the alignment gap is considered to be another state or amino acid species in addition to the 20 amino acids.
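The letter heights described above can be sketched as follows. This is a minimal illustration rather than the server's implementation. It treats the gap as a 21st state, computes the information content of a column as the maximum entropy minus the observed Shannon entropy, and scales each residue's frequency by that value. The "Correction factor" option is not modelled, and the function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET_SIZE = 21  # 20 amino acids plus the gap state

def column_information(column):
    """Information content in bits: maximum entropy minus observed entropy."""
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
    return math.log2(ALPHABET_SIZE) - entropy

def letter_heights(column):
    """Height of each symbol: its frequency times the column's information content."""
    n = len(column)
    info = column_information(column)
    return {symbol: (c / n) * info for symbol, c in Counter(column).items()}

# A fully conserved column reaches the maximum of log2(21) bits;
# a column split evenly between two residues loses exactly one bit.
print(letter_heights("DDDD"))
print(letter_heights("DDEE"))
```

Because the gap character `-` is counted like any other symbol, a gapped column yields a height for `-` as well, which corresponds to the blank space left at the top of the stack.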

HMMs

>AP_pepsins_a1a_d1 profile HMM generated consensus sequence
FDTGSSNLWVpStqChgtsiaCslhnkFdpskSSTYkenGtfsIqYGtGsslsGflsqD
tVtigGltvtnqtfgeavkepgstFvdakfDGILGLgYpslavdgvtpvfdnlikqgli
ekpvFSvyL

Pairwise alignment between the MRC sequence and the DTG/ILG template HMM-profile

        Domain 1 of 1, from 1 to 104: score 45.8, E = 1.6e-14

DTG_ILG template *->vDTGAsvlsviskecklaqklgltrkk.a.fdp..SS.Y..v.C.iv
                    +DTG+s l+v+s++c +    +++  ++++fdp++SS+Y+++++ ++
AP_pepsins     1    FDTGSSNLWVPSTQC-HGTSIACS--LhNkFDPskSStYkeNgTfSI 44   

                 tllsysqPssktsttaqdtirgagGqskiyvSklktsgqirknllslvti
                 +   y+ +ss  + ++qdt+ ++gG   +                  vt+
AP_pepsins    45 Q---YGTGSSLSGFLSQDTV-TIGG--LT------------------VTN 70   

                 kitkGnvTevenrslpsdgvflvvtdpedqksrydvILGrldfLrqlnsv
                 +++ G++    +++++s+  f+++        ++d+ILG+ +  + +   
AP_pepsins    71 QTF-GEA----VKEPGST--FVDA--------KFDGILGL-GYPSLA--- 101  

                 hidl<-*
                  +d    
AP_pepsins   102 -VDG    104  

Cite this site

Llorens, C., Futami, R., Renaud, G. and Moya, A. (2009). Bioinformatic Flowchart and Database to Investigate the Diversity of Clan AA Peptidases. Biology Direct, 4:3.




Welcome to the Gypsy Database (GyDB), an open editable database about the evolutionary relationships of viruses, mobile genetic elements (MGEs) and genomic repeats, where we invite all authors to contribute their knowledge to improve and expand the topics.
Cite this project:

Llorens, C., Futami, R., Covelli, L., Dominguez-Escriba, L., Viu, J.M., Tamarit, D., Aguilar-Rodriguez, J., Vicente-Ripolles, M., Fuster, G., Bernet, G.P., Maumus, F., Munoz-Pomer, A., Sempere, J.M., LaTorre, A., Moya, A. (2011) The Gypsy Database (GyDB) of Mobile Genetic Elements: Release 2.0. Nucleic Acids Research (NARESE) 39 (suppl 1): D70-D74. doi: 10.1093/nar/gkq1061
