Biological data, from Babel to BIANA

A new platform integrates various scattered databases

The chaos of biological data, hitherto scattered across various databases, is coming to an end. BIANA, a new freely accessible program, has been made available to scientists to facilitate their research. With this new tool, developed in Catalonia, scientists have a wealth of information at their disposal: they will be able to find new relationships between different biological elements, make predictions and identify new therapeutic targets.

Patricia Morén | 22 February 2010

Research with biological data will be easier and faster from now on. A new platform, BIANA (an acronym for Biological Interactions and Network Analysis), has made this possible by gathering an enormous amount of data scattered across various sources and stored in multiple formats. Researchers who use the new platform will have the entire integrated database at their disposal. They will be able to manipulate the data according to their research interests, infer new interactions between proteins and find new therapeutic targets, among other uses. The journal Bioinformatics has described the new software.

The new tool has been developed entirely by the Laboratory of Structural Bioinformatics, led by Baldomero Oliva, within the Research Group on Biomedical Informatics (GRIB) of Pompeu Fabra University and the Municipal Institute of Medical Research (IMIM-UPF). The project stems from a previous platform from the same group, PIANA (Protein Interactions and Network Analysis), but has more applications than its predecessor. It is accessible to all researchers, easy to use and available under the GNU GPL (General Public License) on the project website. BIANA has the added value of using the Cytoscape graphical interface to manage the data in an easy and interactive way.

“It is a system open to anyone, who can insert the desired biological data and integrate them in the system with other databases, instead of using so many specific databases”, explains Javier García, one of the researchers of the GRIB. He and Emre Güney, also of the group, have been directly involved in the development of this software. But why is the development of bioinformatics so important? Let’s take a look back.

Context

Until now, biological data have been stored at three main centres around the world: Japan; the United States, through the National Institutes of Health (NIH); and Europe, through the European Molecular Biology Laboratory (EMBL) in Heidelberg (Germany) and its node, the European Bioinformatics Institute (EBI) in Hinxton (Great Britain). They all have different databases.

During the 1980s and 1990s, studies on the sequencing of genes and proteins began to proliferate. The results were stored in different ways and in different places. In Europe, Swiss-Prot (Switzerland) was the first well-organized database, until it was superseded by the Universal Protein Resource (UniProt), which merged the data from the Swiss database with others. More and more databases were established throughout the world, such as BioGRID, from Canada; the Database of Interacting Proteins (DIP) and GenBank, from the United States; the Human Protein Reference Database (HPRD), from India; and the Munich Information Center for Protein Sequences (MIPS), from Germany. And the list does not end here.

Different databases contain large quantities of information, but in different codes and formats. Together they gather a great wealth of information on biological elements. It must be said, however, that determining the structure of a protein is an arduous task that can take between one and two years, whilst sequencing is becoming ever faster. This explains why these databases contain information on the structure of about 50,000 proteins, compared with more than 500,000 sequences with their function (in UniProt) and more than 10 million DNA sequences from plants, animals and humans, Oliva reports. The problem is that this large amount of information has been saved in different formats and codes, and across several databases. All in all, it resembles a Babel of biological data.

Contributions of BIANA

BIANA unifies protocols and criteria for the analysis of biological data that were hitherto dispersed. More specifically, one of its distinguishing features with respect to these databases is that it can create interaction networks for all the stored proteins, a task already carried out by PIANA. In addition, explains Oliva, it opens the door to exploring all types of relationships, such as the one between an enzyme and another belonging to the same metabolic pathway.
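The idea of merging interaction data from several sources into a single network can be sketched in a few lines of Python. This is only an illustration of the general principle, not BIANA's code or data model: the identifiers, the cross-reference table `XREF` and the function `build_network` are all invented for the example.

```python
from collections import defaultdict

# Toy cross-reference table: every source-specific identifier maps to one
# unified key. All identifiers here are invented for illustration.
XREF = {
    "dbA:0001": "prot1", "dbB:X17": "prot1",
    "dbA:0002": "prot2", "dbB:X42": "prot2",
    "dbA:0003": "prot3",
}

# Toy interaction lists as they might arrive from two separate databases.
SOURCE_A = [("dbA:0001", "dbA:0002"), ("dbA:0002", "dbA:0003")]
SOURCE_B = [("dbB:X42", "dbB:X17")]  # same pair as the first one, other ids


def build_network(*sources):
    """Merge interaction lists into one undirected network of unified keys."""
    network = defaultdict(set)
    for source in sources:
        for left, right in source:
            a, b = XREF[left], XREF[right]
            network[a].add(b)
            network[b].add(a)
    return dict(network)


network = build_network(SOURCE_A, SOURCE_B)
for protein in sorted(network):
    print(protein, "->", sorted(network[protein]))
```

Because both sources are translated to the same keys before being merged, the interaction reported twice under different identifiers appears only once in the resulting network.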

The tool also makes it possible to make predictions based on known protein domains. This is possible because UniProt and other databases have not limited themselves to storing entire protein sequences; they have also collected domains, i.e. parts of proteins that fold independently, have their own function and are present in many species because they are conserved throughout evolution. Therefore, when there is no information on a complete protein, its function can be inferred from the functions of its domains. “With a simple integration of information from these databases we can make predictions based on domains”, stresses Oliva.
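A minimal Python sketch of this kind of domain-based inference follows. It assumes the simplest possible rule, namely that a protein inherits the union of the functions of its known domains; the domain names, the annotations in `DOMAIN_FUNCTIONS` and the function `infer_functions` are hypothetical and are not taken from BIANA or UniProt.

```python
# Toy annotations: domain names and functions are invented for illustration.
DOMAIN_FUNCTIONS = {
    "domain_kinase": {"phosphorylation"},
    "domain_dna_binding": {"DNA binding"},
}


def infer_functions(domains):
    """Return the union of the functions of the known domains in a protein."""
    inferred = set()
    for domain in domains:
        # Domains without any annotation contribute nothing.
        inferred |= DOMAIN_FUNCTIONS.get(domain, set())
    return inferred


# An uncharacterised protein with one annotated and one unannotated domain.
print(infer_functions(["domain_kinase", "domain_unknown"]))
```

A real system would weigh the evidence behind each annotation rather than simply taking the union, but the principle of borrowing function from conserved domains is the same.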

BIANA can also identify and correct errors in the information stored in the previous databases. It even allows the user to trace the information back to the actual experiment in which the specific gene was found.

The ranking of the targets

BIANA is being used to study diseases such as Alzheimer's, diabetes, cancer and aneurysms. In the case of tumours, the tool can be especially useful, since the databases contain a great deal of information that can be cross-referenced. Thanks to this platform, proteins related to Alzheimer's have been found that interact with proteins linked to diabetes and may be potential targets for drug treatments.

In fact, one of the main uses of this program is that it will help to find new therapeutic targets in the future. For this reason, researchers are now working on another program, NETSCORE, to rank the targets. Once the candidates have been identified (let's say 1,000 are found), the system assigns each one a probability and orders them from the most to the least likely. In this way, researchers will have a clearer direction for their work: instead of studying 'a thousand' candidates, they will only have to analyse a few.
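The ranking step described above, ordering candidates from most to least likely and keeping only the best few, can be illustrated with a short Python sketch. It does not reproduce NETSCORE's scoring method, which the article does not describe; the candidate names, the probabilities and the function `rank_candidates` are invented for the example.

```python
def rank_candidates(scores, top_n=None):
    """Order candidates from highest to lowest score; optionally keep top_n."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Invented probabilities for three hypothetical candidate targets.
scores = {"candidate_a": 0.12, "candidate_b": 0.87, "candidate_c": 0.45}

for name, probability in rank_candidates(scores, top_n=2):
    print(name, probability)
```

With a thousand candidates instead of three, the same call with a small `top_n` would return only the handful worth examining first.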

Regarding the impact of BIANA on research, Oliva believes that with this tool, “Fleming is not finished”. Serendipitous discoveries will continue to happen. Even so, BIANA will allow results to multiply dramatically when they do, since its great advantage is that it lets researchers process the information they find and points them in the right direction.

