Retroelement integrases (INTs) are zinc finger nucleic acid-processing enzymes that catalyze the insertion of reverse-transcribed retroviral DNA into the host genome (Chiu and Davies 2004; Nowotny 2009). These enzymes remove two bases from the end of the LTR and then insert the linear double-stranded viral DNA copy into the host cell DNA. The INT amino acid architecture includes three subdomains:

  • The N-terminal subdomain, which displays a conserved "HHCC" zinc finger binding motif (Lodi et al. 1995).
  • The central subdomain, which contains a catalytic core characterized by the presence of a conserved D-D-E motif (Kan et al. 1991; Polard and Chandler 1995).
  • The C-terminal subdomain, which is less conserved than the others.
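
As an illustration of how the N-terminal "HHCC" motif might be located in a protein sequence, the following minimal sketch scans for a His-His-Cys-Cys arrangement with a regular expression. The residue spacing used here, H-X(3-7)-H-X(23-32)-C-X(2)-C, is an assumption taken from the spacing commonly quoted for retroviral integrases and is not stated in this text; the function name and the toy sequence are hypothetical, and the sequence is synthetic rather than a real integrase.

```python
import re

# Assumed spacing (not given in the text): H-X(3-7)-H-X(23-32)-C-X(2)-C.
HHCC_PATTERN = re.compile(r"H.{3,7}H.{23,32}C.{2}C")


def find_hhcc(seq):
    """Return (start, end) of the first HHCC-like match in seq, or None."""
    match = HHCC_PATTERN.search(seq.upper())
    return (match.start(), match.end()) if match else None


# Synthetic toy sequence built to satisfy the assumed spacing.
toy = "MA" + "H" + "AAA" + "H" + "A" * 25 + "C" + "AA" + "C" + "GG"

print(find_hhcc(toy))
print(find_hhcc("AAAA"))
```

A pattern match of this kind only flags candidate regions; it says nothing about whether the residues actually coordinate zinc, which would need structural or profile-based evidence.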

This subdomain seems to be involved in non-specific DNA binding, although several studies of chimeric integrases assign this function to the central core (Katzman and Sudol 1995; Shibagaki and Chow 1997), while other authors suggest instead that the C-terminal subdomain might interact with a sub-terminal region of the viral DNA (Jenkins et al. 1997; Heuer and Brown 1997; Esposito and Craigie 1998; Heuer and Brown 1998).

The functional structure of LTR retroelement-like INTs is still under study. The enzyme seems to be part of a pre-integration complex, together with a proviral DNA molecule and other viral and host proteins, about which little is known. Several studies suggest that this enzyme could act as a multimer, or at least as a dimer (for a review on this topic, see Craigie 2001).

HIV-1 INT 3D structure, adapted from PDB file 1bis.

There are four known families of eukaryotic LTR retroelements: Ty3/Gypsy, Ty1/Copia, Bel/Pao, and Retroviridae. The INT domain usually constitutes the C-terminal region of the pol polyprotein encoded by most, but not all, of the elements belonging to these families. The exceptions are all the elements of the Ty1/Copia group and several Ty3/Gypsy elements of the Gmr1 clade (for more details, see Butler et al. 2001; Goodwin & Poulter 2002), which display the INT domain N-terminal to the RT, as well as other retroelement lineages that do not encode any known type of INT.

Ty3/Gypsy, Retroviridae and Bel/Pao INTs usually present the typical INT core followed by an additional module termed GPY/F, which is recognized by the conserved GPY/F motif that gives the module its name (Malik and Eickbush 1999). The GPY/F module is thought to mediate multimerization (Ebina et al. 2008).


Adjacent to this module, some Ty3/Gypsy elements described in the genomes of plants, fungi, and vertebrates incorporate an additional chromodomain similar to that of certain nuclear and chromatin-interacting proteins (Malik and Eickbush 1999). This finding inspired the term "Chromovirus" to describe elements with chromodomain-containing integrases as a genus (Marin and Llorens 2000). The chromodomain (chromatin organization modifier) is a small protein module involved in chromatin remodeling, regulation of gene expression (Koonin et al. 1995), and differential host genome integration of LTR retroelements (Wright et al. 2005; Singleton & Levin 2002).


Ty1/Copia INTs lack a GPY/F module at their C-terminus, but they usually present an additional module called GKGY (Malik & Eickbush 1999; Peterson-Burch & Voytas 2002; Gao & Voytas 2005).


Although Ty1/Copia INTs usually lack chromodomain features, recent research has revealed a particular clade of protist Ty1/Copia elements called CoDi-I (also called CoDi-A) that also carries a chromodomain at the C-terminus (see Maumus et al. 2009; Llorens et al. 2009).
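
The C-terminal modules described in the preceding paragraphs can be summarized as a small lookup table. The sketch below is only a convenience summary of the statements made in this text: the dictionary layout, the qualifier strings, and the function name are hypothetical, and a module that is not listed for a family is simply not mentioned here, which is not the same as being shown absent.

```python
# C-terminal INT modules per LTR retroelement family, as described in the text.
# Values are free-text qualifiers; unlisted modules are not discussed in the text.
INT_FEATURES = {
    "Ty3/Gypsy": {
        "GPY/F": "usual",
        "chromodomain": "some elements (chromoviruses)",
    },
    "Retroviridae": {"GPY/F": "usual"},
    "Bel/Pao": {"GPY/F": "usual"},
    "Ty1/Copia": {
        "GKGY": "usual",
        "chromodomain": "CoDi-I clade",
    },
}


def families_with(module):
    """Return the sorted family names for which the text reports the module."""
    return sorted(f for f, mods in INT_FEATURES.items() if module in mods)


print(families_with("GPY/F"))
print(families_with("chromodomain"))
```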

There are two complementary classifications for the LTR retroelement INTs. Based on sequence, these enzymes can be classified as DDE TRs and INTs. Based on potential INT-like structural similarities, the LTR retroelement INTs are members of the Retroviral Integrase Superfamily (Nowotny 2009) of nucleic acid-processing enzymes involved in: a) selfish evolution; b) replication and repair of DNA; c) recombination and gene fusion; d) RNA-mediated gene silencing; and e) oncogenesis.

Welcome to the Gypsy Database (GyDB), an open, editable database about the evolutionary relationships of viruses, mobile genetic elements (MGEs) and genomic repeats, where we invite all authors to contribute their knowledge to improve and expand the topics.
Cite this project:

Llorens, C., Futami, R., Covelli, L., Dominguez-Escriba, L., Viu, J.M., Tamarit, D., Aguilar-Rodriguez, J., Vicente-Ripolles, M., Fuster, G., Bernet, G.P., Maumus, F., Munoz-Pomer, A., Sempere, J.M., LaTorre, A., Moya, A. (2011) The Gypsy Database (GyDB) of Mobile Genetic Elements: Release 2.0. Nucleic Acids Research (NARESE) 39 (suppl 1): D70-D74. doi: 10.1093/nar/gkq1061
