Retroelement integrases (INTs) are zinc finger nucleic acid-processing enzymes that catalyze the insertion of reverse-transcribed retroviral DNA into the host genome (Chiu and Davies 2004; Nowotny 2009). These enzymes remove two bases from the end of the LTR and then insert the linear double-stranded viral DNA copy into the host cell DNA. The INT amino acid architecture includes three subdomains: an N-terminal zinc finger, a central catalytic core, and a C-terminal subdomain.
The C-terminal subdomain seems to be involved in non-specific DNA binding, although several studies of chimeric integrases assign this function to the central core (Katzman and Sudol 1995; Shibagaki and Chow 1997), while other authors suggest instead that the C-terminal subdomain interacts with a sub-terminal region of the viral DNA (Jenkins et al. 1997; Heuer and Brown 1997; Esposito and Craigie 1998; Heuer and Brown 1998).
The functional structure of LTR retroelement-like INTs is still under study. The enzyme appears to be part of a pre-integration complex, together with a proviral DNA molecule and other viral and host proteins, about which little is known. Several studies suggest that this enzyme acts as a multimer, or at least as a dimer (for a review on this topic see Craigie 2001).
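The end-processing step mentioned above (removal of two bases from the end of the LTR) can be pictured with a minimal Python sketch. It only models trimming a string written 5' to 3'; the function name `three_prime_process` and the toy sequence are hypothetical, chosen for illustration, and are not taken from any real element.

```python
from typing import Tuple


def three_prime_process(strand: str, n_removed: int = 2) -> Tuple[str, str]:
    """Trim n_removed bases from the 3' end of a strand written 5'->3'.

    Returns (processed_strand, removed_bases). This is a string-level
    illustration only, not a model of the enzymatic reaction itself.
    """
    if n_removed < 0 or len(strand) <= n_removed:
        raise ValueError("strand must be longer than the number of bases removed")
    cut = len(strand) - n_removed
    return strand[:cut], strand[cut:]


# Hypothetical toy LTR end (illustrative sequence, not from a real genome).
toy_ltr_end = "GGAAAATCTCTAGCAGT"
processed, removed = three_prime_process(toy_ltr_end)
print(processed, removed)
```

Returning the removed bases alongside the processed strand makes it easy to check that exactly two bases were trimmed.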
There are four known families of eukaryotic LTR retroelements: Ty3/Gypsy, Ty1/Copia, Bel/Pao, and Retroviridae. The INT domain usually constitutes the C-terminal region of the pol polyprotein encoded by most, but not all, of the elements belonging to these families. The exceptions are all the elements of the Ty1/Copia group and several Ty3/Gypsy elements forming the Gmr1 clade (for more details, see Butler et al. 2001; Goodwin & Poulter 2002), which display the INT domain N-terminal to the RT, as well as other retroelement lineages that do not encode any known type of INT.
Ty3/Gypsy, Retroviridae and Bel/Pao INTs usually present the typical INT core followed by an additional module termed GPY/F, which was first recognized through the conserved GPY/F motif that gives the module its name (Malik and Eickbush 1999). The GPY/F module is thought to mediate multimerization (Ebina et al. 2008).
Adjacent to this module, some Ty3/Gypsy elements described in the genomes of plants, fungi, and vertebrates incorporate an additional chromodomain similar to that found in certain nuclear and chromatin-interacting proteins (Malik and Eickbush 1999). This finding inspired the term "Chromovirus" to describe elements with chromodomain-containing integrases as a genus (Marin and Llorens 2000). The chromodomain (chromatin organization modifier) is a small protein module involved in chromatin remodeling, regulation of gene expression (Koonin et al. 1995) and differential host genome integration of LTR retroelements (Wright et al. 2005; Singleton & Levin 2002).
Ty1/Copia INTs lack a GPY/F module at their C-terminus, but they usually present an additional module called GKGY (Malik & Eickbush 1999; Peterson-Burch & Voytas 2002; Gao & Voytas 2005).
Although Ty1/Copia INTs usually lack chromodomain features, recent research has revealed a particular clade of protist Ty1/Copia elements called CoDi-I (also called CoDi-A), whose INTs also carry a chromodomain at their C-terminus (see Maumus et al. 2009; Llorens et al. 2009).
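The module layouts described above can be summarized in a small lookup table. The following Python sketch is only a reading aid: the names `INT_MODULES` and `int_layout` are hypothetical, the entries reflect the "usual" cases stated in the text rather than every element, and it assumes that CoDi-I INTs keep the usual Ty1/Copia module, which the text does not state.

```python
# Usual INT module layout per LTR retroelement family, as summarized in the text.
# "usual_module": module that usually accompanies the INT core.
# "chromodomain_lineages": lineages reported to carry a C-terminal chromodomain.
INT_MODULES = {
    "Ty3/Gypsy":    {"usual_module": "GPY/F", "chromodomain_lineages": {"Chromovirus"}},
    "Retroviridae": {"usual_module": "GPY/F", "chromodomain_lineages": set()},
    "Bel/Pao":      {"usual_module": "GPY/F", "chromodomain_lineages": set()},
    "Ty1/Copia":    {"usual_module": "GKGY",  "chromodomain_lineages": {"CoDi-I"}},
}


def int_layout(family, lineage=None):
    """Return the usual N- to C-terminal module order for an INT.

    Assumption (not stated in the text): a chromodomain-bearing lineage
    keeps the usual module of its family and adds the chromodomain after it.
    """
    entry = INT_MODULES[family]
    layout = ["INT core", entry["usual_module"]]
    if lineage in entry["chromodomain_lineages"]:
        layout.append("chromodomain")
    return layout


print(int_layout("Ty3/Gypsy", "Chromovirus"))
print(int_layout("Ty1/Copia"))
```

Because the text qualifies most of these statements with "usually", the table should be treated as a default expectation, not a rule that every element satisfies.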
There are two complementary classifications for the LTR retroelement INTs. Based on sequence, these enzymes can be classified as DDE TRs and INTs. Based on potential INT-like structural similarities, the LTR retroelement INTs are members of the Retroviral Integrase Superfamily (Nowotny 2009) of nucleic acid-processing enzymes involved in: a) selfish evolution; b) replication and repair of DNA; c) recombination and gene fusion; d) RNA-mediated gene silencing; and e) oncogenesis.
Llorens, C., Futami, R., Covelli, L., Dominguez-Escriba, L., Viu, J.M., Tamarit, D., Aguilar-Rodriguez, J., Vicente-Ripolles, M., Fuster, G., Bernet, G.P., Maumus, F., Munoz-Pomer, A., Sempere, J.M., LaTorre, A., Moya, A. (2011) The Gypsy Database (GyDB) of Mobile Genetic Elements: Release 2.0. Nucleic Acids Research (NARESE) 39 (suppl 1): D70-D74. doi: 10.1093/nar/gkq1061