Algorithms in Bioinformatics

Papers

Reversing Gene Erosion – Reconstructing Ancestral Bacterial Genomes from Gene-Content and Order Data

Joel V. Earnest-DeYoung, Emmanuelle Lerat, Bernard M. E. Moret

In the last few years, it has become routine to use gene-order data to reconstruct phylogenies, both in terms of edge distances (parsimonious sequences of operations that transform one end point of the edge into the other) and in terms of genomes at internal nodes, on small, duplication-free genomes. Current gene-order methods break down, though, when the genomes contain more than a few hundred genes,...

Reconstructing Ancestral Gene Orders Using Conserved Intervals

Anne Bergeron, Mathieu Blanchette, Annie Chateau, Cedric Chauve

Conserved intervals were recently introduced as a measure of similarity between genomes whose genes have been shuffled during evolution by genomic rearrangements. Phylogenetic reconstruction based on such similarity measures raises many biological, formal and algorithmic questions, in particular the labelling of internal nodes with putative ancestral gene orders, and the selection of a good tree topology...

Sorting by Reversals with Common Intervals

Martin Figeac, Jean-Stéphane Varré

Studying rearrangements from gene order data is a standard approach in evolutionary analysis. Gene order data are usually modeled as signed permutations. Computing the minimal number of reversals between two signed permutations has generated a large body of literature over the last decade. The algorithms designed were first approximate, then polynomial, and were later improved to give a linear one....
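
To make the signed-permutation model in this abstract concrete, here is a minimal Python sketch of a single reversal and of the usual breakpoint count. It is an illustration only, not the algorithm of the paper; the function names `apply_reversal` and `count_breakpoints` and the example genome are assumptions introduced here.

```python
def apply_reversal(perm, i, j):
    """Reverse perm[i..j] (0-based, inclusive) and flip the sign of each gene in it."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def count_breakpoints(perm):
    """Count adjacent pairs (a, b) of the framed permutation with b != a + 1.

    The permutation is framed with 0 on the left and n + 1 on the right,
    so the identity permutation has zero breakpoints.
    """
    framed = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


# Hypothetical 4-gene genome; two reversals sort it to the identity.
genome = [3, -2, -1, 4]
step1 = apply_reversal(genome, 0, 2)   # [1, 2, -3, 4]
step2 = apply_reversal(step1, 2, 2)    # [1, 2, 3, 4]
```

Each reversal here removes at least one breakpoint, which is the intuition behind breakpoint-based lower bounds on the reversal distance.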

A Polynomial-Time Algorithm for the Matching of Crossing Contact-Map Patterns

Jens Gramm

Contact maps are a model to capture the core information in the structure of biological molecules, e.g., proteins. A contact map consists of an ordered set S of elements (representing a protein’s sequence of amino acids), and a set A of element pairs of S, called arcs (representing amino acids which are closely neighbored in the structure). Given two contact maps (S,A) and (S ...

A 1.5-Approximation Algorithm for Sorting by Transpositions and Transreversals

Tzvika Hartman, Roded Sharan

One of the most promising ways to determine evolutionary distance between two organisms is to compare the order of appearance of orthologous genes in their genomes. The resulting genome rearrangement problem calls for finding a shortest sequence of rearrangement operations that sorts one genome into the other. In this paper we provide a 1.5-approximation algorithm for the problem of sorting by transpositions...

Algorithms for Finding Maximal-Scoring Segment Sets

Miklós Csűrös

We examine the problem of finding maximal-scoring sets of disjoint regions in a sequence of scores. The problem arises in DNA and protein segmentation, and in post-processing of sequence alignments. Our key result states a simple recursive relationship between maximal-scoring segment sets. The statement leads to an algorithm that finds such a k-set of segments in a sequence of length n in O(nk) time...
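
As a rough illustration of the problem stated in this abstract, the sketch below computes the best total score of at most k disjoint segments in O(nk) time. It uses a textbook dynamic program, not the recursive relationship the paper describes; the function name `max_k_segments` and the example scores are assumptions made for this sketch.

```python
def max_k_segments(scores, k):
    """Best total score of at most k disjoint, non-empty contiguous segments.

    best[j]   = best total using exactly j segments within the prefix seen so far
    ending[j] = best total using exactly j segments, the j-th ending at the
                current position
    Runs in O(n * k) time and O(k) space.
    """
    neg_inf = float("-inf")
    best = [0.0] + [neg_inf] * k
    ending = [neg_inf] * (k + 1)
    for x in scores:
        # Go from k down to 1 so best[j - 1] still describes the prefix
        # that ends before the current position (no overlapping segments).
        for j in range(k, 0, -1):
            ending[j] = max(ending[j], best[j - 1]) + x
            best[j] = max(best[j], ending[j])
    return max(best)


# Hypothetical score sequence.
scores = [2, -3, 4, -1, 2, -5, 3]
```

With k = 1 this reduces to the classic maximum-sum segment computation; the empty selection (score 0) is always allowed, so the result is never negative.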

Gapped Local Similarity Search with Provable Guarantees

Manikandan Narayanan, Richard M. Karp

We present a program qhash, based on q-gram filtration and high-dimensional search, to find gapped local similarities between two sequences. Our approach differs from past q-gram-based approaches in two main aspects. Our filtration step uses algorithms for a sparse all-pairs problem, while past studies use suffix-tree-like structures and counters. Our program works in sequence-sequence mode, while...
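
For readers unfamiliar with q-gram filtration, the following Python sketch shows the general idea in its simplest form: index the q-grams of one sequence, look up the q-grams of the other, and count shared q-grams per diagonal. It uses a plain hash index and is not the sparse all-pairs method of qhash; the names `qgram_index` and `diagonal_hits` and the example sequences are assumptions.

```python
from collections import defaultdict


def qgram_index(seq, q):
    """Map every length-q substring of seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - q + 1):
        index[seq[i:i + q]].append(i)
    return index


def diagonal_hits(a, b, q):
    """Count shared q-grams per diagonal d = j - i (i in a, j in b).

    Diagonals with many hits are candidate regions that a filtration
    step would pass on to a more expensive alignment stage.
    """
    index = qgram_index(a, q)
    hits = defaultdict(int)
    for j in range(len(b) - q + 1):
        for i in index.get(b[j:j + q], ()):
            hits[j - i] += 1
    return dict(hits)


# Hypothetical sequences for illustration.
hits = diagonal_hits("ACGTACGT", "TTACGTAA", 4)
```

A filter would then keep only diagonals (or bands of neighbouring diagonals, to tolerate gaps) whose hit count exceeds a chosen threshold.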

Monotone Scoring of Patterns with Mismatches

Alberto Apostolico, Cinzia Pizzi

We study the problem of extracting, from given source x and error threshold k, substrings of x that occur unusually often in x within k substitutions or mismatches. Specifically, we assume that the input textstring x of n characters is produced by an i.i.d. source, and design efficient methods for computing the probability and expected number of occurrences for substrings of x with (either exactly...

Suboptimal Local Alignments Across Multiple Scoring Schemes

Morris Michael, Christoph Dieterich, Jens Stoye

Sequence alignment algorithms have a long-standing tradition in bioinformatics. In this paper, we formulate an extension to existing local alignment algorithms: local alignments across multiple scoring functions. For this purpose, we use the Waterman-Eggert algorithm for suboptimal local alignments as a template and introduce two new features therein: 1) an alignment of two strings over a set of score...

A Faster Reliable Algorithm to Estimate the p-Value of the Multinomial llr Statistic

Uri Keich, Niranjan Nagarajan

The problem of estimating the p-value of the log-likelihood ratio statistic for the multinomial distribution has been studied extensively in the statistical literature. Nevertheless, bioinformatics has posed new challenges for that research by often concentrating its interest on the “thin tail” of the distribution, where classical statistical approximation typically fails. Hence, some of the more recent...

Adding Hidden Nodes to Gene Networks

Benny Chor, Tamir Tuller

Bayesian networks are widely used for modelling gene networks. We investigate the problem of expanding a given Bayesian network by adding a hidden node – a node on which no experimental data are given. Finding a good expansion (a new hidden node and its neighborhood) can point to regions where the model is not rich enough, and help locate new, unknown variables that are important for understanding...

Joint Analysis of DNA Copy Numbers and Gene Expression Levels

Doron Lipson, Amir Ben-Dor, Elinor Dehan, Zohar Yakhini

Genomic instabilities, amplifications, deletions and translocations are often observed in tumor cells. In the process of cancer pathogenesis cells acquire multiple genomic alterations, some of which drive the process by triggering overexpression of oncogenes and by silencing tumor suppressors and DNA repair genes. We present data analysis methods designed to study the overall transcriptional effects...

Searching for Regulatory Elements of Alternative Splicing Events Using Phylogenetic Footprinting

Daichi Shigemizu, Osamu Maruyama

We consider the problem of finding candidates for regulatory elements of alternative splicing events from orthologous genes, using phylogenetic footprinting. The problem is formulated as follows: We are given orthologous sequences P_1, ..., P_a and N_1, ..., N_b from a + b different species, and a phylogenetic tree...

Supervised Learning-Aided Optimization of Expert-Driven Functional Protein Sequence Annotation

Lev Soinov, Alexander Kanapin, Misha Kapushesky

The aim of this work is to use a supervised learning approach to identify sets of motif-based sequence characteristics, combinations of which can give the most accurate annotation of new proteins. We assess several of InterPro Consortium member databases for their informativeness for the annotation of full-length protein sequences. Thus, our study addresses the problem of integrating biological information...

Multiple Vector Seeds for Protein Alignment

Daniel G. Brown

We present a framework for improving local protein alignment algorithms. Specifically, we discuss how to extend local protein aligners to use a collection of vector seeds [3] to reduce noise hits. We model picking a set of vector seeds as an integer programming problem, and give algorithms to choose such a set of seeds. A good set of vector seeds we have chosen allows four times fewer false positive...

Solving the Protein Threading Problem by Lagrangian Relaxation

Stefan Balev

This paper presents an efficient algorithm for aligning a query amino-acid sequence to a protein 3D structure template. Solving this problem is one of the main steps in methods of protein structure prediction by threading. We propose an integer programming model and solve it by a branch-and-bound algorithm. The bounds are computed using a Lagrangian dual of the model which turns out to be much easier...

Protein-Protein Interfaces: Recognition of Similar Spatial and Chemical Organizations

Alexandra Shulman-Peleg, Shira Mintz, Ruth Nussinov, Haim J. Wolfson

Protein-protein interfaces, which are regions of interaction between two protein molecules, contain information about patterns of interacting functional groups. Recognition of such patterns is useful both for prediction of binding partners and for the development of drugs that can interfere with the formation of the protein-protein complex. We present a novel method, Interface-to-Interface (I2I)-SiteEngine,...

ATDD: An Algorithmic Tool for Domain Discovery in Protein Sequences

Stanislav Angelov, Sanjeev Khanna, Li Li, Fernando Pereira

The problem of identifying sequence domains is essential for understanding protein function. Most current methods for protein domain identification rely on prior knowledge of homologous domains and construction of high quality multiple sequence alignments. With rapid accumulation of enormous data from genome sequencing, it is important to be able to automatically determine domain regions from a set...

Local Search Heuristic for Rigid Protein Docking

Vicky Choi, Pankaj K. Agarwal, Herbert Edelsbrunner, Johannes Rudolph

We give an algorithm that locally improves the fit between two proteins modeled as space-filling diagrams. The algorithm defines the fit in purely geometric terms and improves by applying a rigid motion to one of the two proteins. Our implementation of the algorithm takes between three and ten seconds and converges with high likelihood to the correct docked configuration, provided it starts at a position...

Algorithms in Bioinformatics
4th International Workshop, WABI 2004, Bergen, Norway, September 17-21, 2004. Proceedings