The Clan AA Reference Database (CAARD) is an in-progress database of ancestral maximum likelihood reconstructions (AMLRs), sequence logos and hidden Markov models (HMMs) constructed for different protein families according to estimates of their taxonomy and relationships. The objective is to investigate the major consensus and phylogeny in order to classify the different protein families. The current version is based on 323 non-redundant sequences belonging to different clan AA families (to visualize a conventional phylogeny inferred from these sequences, click here). The tools are stored in family datasheets in the database, which can be navigated using the phylogenetic tree shown on this web site. This tree acts as a dynamic map of links: by clicking the name of a cluster in the tree, the user can open the datasheet for the selected family.
The tree was reconstructed from an alignment of AMLR sequences, which is available by clicking on the blue circle in the center of the tree. Additionally, the sequences can be retrieved in two independent ways: (1) within each datasheet, as a Jrof alignment; (2) from the leaves of the tree, which link to the AMLR sequences available in separate files. Sequences carry tags with information about the parental relationships of the sequence represented by each leaf. "N_x" denotes the ancestral ML sequence (or node) inferred in each ancestral reconstruction. The topology is a majority-rule consensus (MRC) tree (Margush and McMorris 1981) inferred using the Felsenstein parsimony method, based on the approaches of Eck and Dayhoff (1966) and Fitch (1971). The values accompanying the clusters are two independent bootstrap estimates (separated by bars) for clusters that occurred more than 55% of the time in the analysis when using the parsimony and neighbor-joining (NJ) methods of phylogenetic reconstruction, respectively. Branch lengths are not scaled to distance.
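To illustrate how a bootstrap value of this kind can be read, the following minimal Python sketch counts how often each cluster appears across a set of replicate trees and keeps only the clusters above a cut-off. It is not the pipeline used to build CAARD: the function names, the representation of a tree as a set of clusters (each cluster being a set of leaf names), and the example leaf names are all assumptions made for illustration. Only the 55% cut-off is taken from the description above.

```python
from collections import Counter


def cluster_support(replicate_trees):
    """Return, for each cluster, the percentage of replicate trees containing it.

    Each replicate tree is given as an iterable of clusters, and each cluster
    as a frozenset of leaf names (a hypothetical, simplified representation).
    """
    counts = Counter()
    for clusters in replicate_trees:
        # set() ensures a cluster is counted at most once per replicate.
        counts.update(set(clusters))
    total = len(replicate_trees)
    return {cluster: 100.0 * n / total for cluster, n in counts.items()}


def supported_clusters(replicate_trees, threshold=55.0):
    """Keep only the clusters found in more than `threshold` percent of replicates."""
    support = cluster_support(replicate_trees)
    return {c: s for c, s in support.items() if s > threshold}


# Four hypothetical bootstrap replicates over the leaves A, B, C and D.
replicates = [
    [frozenset("AB"), frozenset("ABC")],
    [frozenset("AB"), frozenset("ABC")],
    [frozenset("AB"), frozenset("ABD")],
    [frozenset("AC"), frozenset("ABC")],
]

for cluster, value in supported_clusters(replicates).items():
    print(sorted(cluster), value)
```

Running the same count separately on replicates obtained with parsimony and with NJ would give the two values shown next to each cluster in the tree.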
Llorens, C., Futami, R., Renaud, G. and Moya, A. (2009). Bioinformatic Flowchart and Database to Investigate the Origins and Diversity of Clan AA Peptidases. Biology Direct, 4:3.