Software Layout and Computing Cluster


Workbench and GUIs

GPRO provides a user-friendly central menu with distinct tools accessible via graphical user interfaces (GUIs), each usually displaying its own submenu of functions. The GUI varies with the tool invoked. As this manual shows, there are GUIs for a database and sequence editor called TIME, for the worksheet tool for annotation and functional analysis, for all the pipelines and online servers, for the data management tool, and more. The layout of GPRO is based on an Eclipse-like workbench, shown in Figure 3.1, that consists of the following:

  1. MAIN DESKTOP: this is the central working space, where the most important tools and interfaces of GPRO are launched.

  2. DIRECTORY: users can select any folder on their PC and set it as the directory shown to the left of the main desktop. Files and folders are stored and organized there in a dynamic file tree that supports a variety of actions and analyses through both menu and mouse utilities. The directory can be shown or hidden via the menu.

  3. FTP EXPLORER: this is a File Transfer Protocol (FTP) client that lets you transfer files and folders, using the mouse, between the directory and your remote user account on the computing cluster, in either direction.

  4. FASTA EXPLORER: this is a window-based utility, coupled with a "Database Editor" accessible via the menu, that lets users visualize, search and select sequences from plain-text files in FASTA format. It is particularly useful for managing large databases and RefSeq files because it implements a buffer that lets you browse sequences one at a time by clicking on the sequence names shown in the FASTA explorer.


Figure 3.1. GPRO layout organization and interface implementation. Numbers indicate the four window-based sections described in the text and shown in the figure: (1) main desktop; (2) directory; (3) FTP explorer; (4) FASTA explorer.
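
How the buffer of the FASTA explorer is implemented is not described in this manual. As a rough illustration of the general idea of browsing a large FASTA file one sequence at a time without loading it whole, the following Python sketch records the byte offset of each header line and then seeks to the requested record on demand. The function names `index_fasta` and `fetch_sequence` are hypothetical and are not part of GPRO.

```python
from typing import BinaryIO, Dict


def index_fasta(handle: BinaryIO) -> Dict[str, int]:
    """Map each sequence name to the byte offset of its header line.

    The name is taken as the first whitespace-delimited token after '>'.
    The file must be opened in binary mode so that offsets are exact.
    """
    index: Dict[str, int] = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break  # end of file
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:  # ignore headers that carry no name
                index[fields[0].decode()] = offset
    return index


def fetch_sequence(handle: BinaryIO, index: Dict[str, int], name: str) -> str:
    """Read one record by seeking to its stored offset."""
    handle.seek(index[name])
    handle.readline()  # skip the header line itself
    parts = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break  # next record starts here
        parts.append(line.strip().decode())
    return "".join(parts)
```

With an index of this kind, selecting a name only reads the lines of that one record, which is why such a scheme scales to large database files. A file opened with `open(path, "rb")` can be passed as `handle`.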


Computing cluster overview

The software is coupled, via distinct menu tabs, with an online computing infrastructure that is under continuous development and is installed on the high-end remote cluster of the GyDB project. This infrastructure can only be accessed via GPRO (which acts as its remote manager) and includes hard-disk accounts for all GPRO users so that they can run intensive computing jobs in private sessions. At present it comprises the following items:

  • 50 GB of hard-disk space, which will be increased periodically to guarantee sufficient space for the most demanding projects.

  • Guaranteed quality-of-service distributed CPU capacity for high-throughput computing analyses, providing users with the maximum available processing capacity on the cluster.

  • An SSH client for logging into a user's private account on the remote cluster and sending commands to launch automated analysis tools.

  • An FTP client organized as a remote file tree manager for transferring analysis files between a client computer and the user's account on the remote cluster. Users can upload sequence files for processing on the remote cluster and download the generated result files to a local computer.

  • A server for constructing HMM profiles based on HMMER.

  • A server for mapping and assembling NGS raw data based on the following tools: BWA, BFAST and MIRA. The GPRO GUI for accessing this server is still under development, so at present the server is only accessible via the SSH command shell client.

  • A pipeline for genome/exome analysis plus SNP/indel calling and annotation based on the BFAST and BWA mappers plus PICARD, GATK-LITE (McKenna et al. 2010), and the snpEff annotator (Cingolani et al. 2012). The GPRO GUI for accessing this pipeline is still under development, so at present the pipeline is only accessible via the SSH command shell client.

  • A post-processing pipeline for the correction of frameshifts, with particular focus on those generated by 454 and IonTorrent homopolymer artifacts. It combines a collection of scripts with the BLAST and HMMER packages, the INTERPROSCAN databases and the frameshift corrector HMM-FRAME (Zhang and Yung 2011). The GPRO GUI for accessing this pipeline is still under development, so at present the pipeline is only accessible via the SSH command shell client.

  • A pipeline for mapping and annotating the mobilome (mobile genetic elements) based on the GyDB and other RefSeq databases and the following tools: REPEATMODELER, REPEATMASKER, RECON (Bao and Eddy 2002), REPEATSCOUT (Price et al. 2005), TANDEM REPEAT FINDER (Benson 1999), and RMBLAST. The GPRO GUI for accessing this pipeline is still under development, so at present the pipeline is only accessible via the SSH command shell client.

  • A MySQL database architecture for storing, managing and consulting any kind of relational, classificatory or GoldenPath database, including user-created databases, through the worksheet system of the GPRO software.
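
Several of the servers listed above are currently reachable only through the SSH command shell. The exact commands and wrapper scripts available on the cluster are not documented in this section, so the following Python sketch is only an assumption-laden illustration of the kind of command lines that the underlying tools accept: `hmmbuild` from HMMER and `bwa index` / `bwa mem` from BWA, called in their standard forms. All file names are hypothetical placeholders, and the helper functions are not part of GPRO.

```python
from shlex import quote


def hmmbuild_command(hmm_out: str, alignment: str) -> str:
    """Build an HMM profile from a multiple sequence alignment.

    Standard HMMER usage: hmmbuild <hmmfile_out> <msafile>
    """
    return f"hmmbuild {quote(hmm_out)} {quote(alignment)}"


def bwa_commands(reference: str, reads: str, sam_out: str) -> list:
    """Index a reference, then map reads to it, writing SAM to a file.

    Standard BWA usage: bwa index <ref.fa>; bwa mem <ref.fa> <reads.fq>
    Whether the cluster exposes `bwa mem` is an assumption.
    """
    return [
        f"bwa index {quote(reference)}",
        f"bwa mem {quote(reference)} {quote(reads)} > {quote(sam_out)}",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(hmmbuild_command("domain.hmm", "domain_alignment.sto"))
    for command in bwa_commands("reference.fa", "reads.fq", "mapped.sam"):
        print(command)
```

Quoting each path with `shlex.quote` keeps file names that contain spaces or shell metacharacters from being misread when the resulting string is typed or pasted into the remote shell.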






Welcome to the Gypsy Database (GyDB) an open editable database about the evolutionary relationship of viruses, mobile genetic elements (MGEs) and the genomic repeats where we invite all authors to contribute with their knowledge to improve and expand the topics.
Cite this project:

Llorens, C., Futami, R., Covelli, L., Dominguez-Escriba, L., Viu, J.M., Tamarit, D., Aguilar-Rodriguez, J. Vicente-Ripolles, M., Fuster, G., Bernet, G.P., Maumus, F., Munoz-Pomer, A., Sempere, J.M., LaTorre, A., Moya, A. (2011) The Gypsy Database (GyDB) of Mobile Genetic Elements: Release 2.0 Nucleic Acids Research (NARESE) 39 (suppl 1): D70-D74 doi: 10.1093/nar/gkq1061
