A large number of long non-coding RNAs (lncRNAs) have been identified over the past decades. Accumulating evidence proves that lncRNAs play key roles in various biological processes. However, the majority of the lncRNAs have not been functionally characterized. The annotation of lncRNA functions has become an area of focus in the fields of biology and bioinformatics. In this paper, we develop a global...
Ligand binding site prediction from protein structure plays an important role in various complex rational drug design efforts. Its applications include drug side effects prediction, docking prioritization in inverse virtual screening and elucidation of protein function in genome wide structural studies. Currently available tools have limitations that disqualify them from many possible use cases. In...
In bioinformatics, protein multiple sequence alignment (MSA) and phylogenetic tree construction are among the major problems for which many algorithms have been developed to improve the accuracy of the results. However, finding the best algorithm among the available ones remains a challenging task since the efficiency of an algorithm is closely related to the characteristics of the input sequences. Moreover,...
In this paper, we propose algorithms for the biomolecular docking site selection problem using various machine learning approaches with selective feature reduction. The proposed method can reduce the number of amino acid features before constructing machine learning prediction models. Given frame boxes with features, the proposed method analyzes the important features by correlation coefficients...
Protein function prediction is a challenging and essential research problem in the field of computational biology. Conventionally, a protein consists of a number of structural domains and performs multiple functions. By representing proteins, domains, and functions as bags, instances, and classes, respectively, we are able to model the protein function prediction task as the Multi-Instance Multi-Label...
Next Generation Sequencing (NGS) technologies have led to fast and inexpensive production of large amounts of biological sequence data, including nucleotide sequences and derived protein sequences. These fast-increasing volumes of data pose challenges to computational methods for annotation. Machine learning approaches, primarily supervised algorithms, have been widely used to assist with classification...
The inference of a Gene Regulatory Network (GRN) using gene expression data is a major research topic in bioinformatics. Modeling GRNs is important for understanding gene dependencies, regulatory functions among genes, biological processes and how they occur, and for avoiding some unplanned processes (disease). Due to the huge number of genes and the small number of samples,...
Many studies have shown roles of miRNAs (microRNAs) in human disease, and a number of computational methods have been proposed to predict such associations by ranking candidate microRNAs according to their relevance to a disease. Among them, network-based methods are becoming dominant since they exploit the “disease module” principle in miRNA functional similarity networks well. Among these, Random Walk...
Physical interactions between the proteins in a living organism help to identify most protein-protein interaction data. Annotated proteins are those whose functions are already known with certainty. Un-annotated proteins are annotated by estimating similar functions. Generally, a cluster containing annotated nodes with their adjacent unlabeled nodes is assumed...
The identification of disease genes is the first step towards the understanding of genetic disease mechanisms. Although many computational algorithms have been proposed to identify disease genes, they either perform poorly in terms of AUC scores or are very time-consuming. To overcome these two problems, a logistic-regression-based algorithm is proposed in this study for identifying disease genes...
The reduced cost of the next generation sequencing technologies provides opportunities to study non-model organisms. However, one challenge is the large volume of data generated and, thus, the need to use automated approaches to annotate these data. Machine learning algorithms could provide a cost-effective solution but they need lots of labeled data and informative features to represent these data...
Most proteins form macromolecular complexes to perform their biological functions. With the increasing availability of large amounts of high-throughput protein-protein interaction (PPI) data, a vast number of computational approaches have been proposed to discover protein complexes from PPI networks. However, such approaches are not good enough since the high rate of...
microRNA (miRNA) is a post-transcriptional gene regulation mechanism that mediates sequence-specific degradation of the targeted RNA and thus provides an opportunity for the development of oligonucleotide-based drugs. Here, we propose a systematic approach for finding, selecting, and validating miRNAs that target conserved regions in the hepatitis C virus (HCV). Different factors, such as target conservation,...
Virtual screening based on protein-ligand docking is widely applied at the early stage of drug discovery. Scoring functions from a diverse set of existing protein-ligand docking tools, however, often poorly distinguish bioactive compounds from inactive ones. As a result, considerable effort has been devoted to the combination of multiple scoring functions for more reliable evaluation. State-of-the-art...
We present EcSymDock, an algorithm that optimizes SymDock using evolution couplings for better performance. SymDock is a Rosetta protocol used to predict protein oligomer structure. Evolution Coupling (EC) is an optimized mutual information measure calculated from the MSA of a specific protein family; it reveals certain contacts between residues. Today, more and more researchers are paying close attention to how to use ECs...
The experimental study of signal transduction over a decade has made a substantial contribution to understanding functional mechanisms in a cell. A signaling pathway represents a linear path of a signaling cascade involving a series of proteins. As an advanced model, multiple linear pathways with extensive cross-talk between receptors can be merged into a larger-scale signaling network. We present...
Proteins play a key role in cells' function and metabolism. Their functions are directly related to their three-dimensional (3D) native structure. Different algorithms have been proposed to predict the 3D protein structure from the amino acid sequence by minimizing its free energy. Nonetheless, this problem is still a great challenge in structural biology. The space of possible conformations becomes...
The analysis of the whole set of molecular interactions in an organism, often referred to as interaction networks, is becoming an important research area. A main approach for such analysis relies on the application of clustering techniques to such networks. The meaning of discovered clusters (i.e., highly interconnected regions) is strictly related to the type of network. For instance, in protein-protein...
Proteins perform the most important biochemical reactions in organisms, such as catalysis, signal transduction, and nutrient transport. The urgent need for automatic annotation arises from the advent of high-throughput sequencing techniques in the post-genomic era. Proteins consist of domains, which are elementary building units of protein folding, function, and evolution. The evidence of protein...
We present CARAS, a web server that allows the automatic annotation of a chloroplast genome sequence, and the visualization and editing of the annotation results interactively and in real time. CARAS accepts a complete chloroplast genome sequence as input. First, it accurately predicts protein-coding sequences and exon-intron structures by combining the results from two types of annotation approaches:...