Combinatorial Pattern Matching
15th Annual Symposium, CPM 2004, Istanbul, Turkey, July 5-7, 2004. Proceedings

Suleyman Cenk Sahinalp, S. Muthukrishnan, Ugur Dogrusoz

Items from 1 to 20 out of 37 results

chapter

Back Matter

Lecture Notes in Computer Science > Combinatorial Pattern Matching

chapter

Sorting by Reversals in Subquadratic Time

Eric Tannier, Marie-France Sagot

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 1-13

The problem of sorting a signed permutation by reversals is inspired by genome rearrangements in computational molecular biology. Given two genomes represented as two signed permutations of the same elements (e.g. orthologous genes), the problem consists in finding a most parsimonious scenario of reversals that transforms one genome into the other. We propose a method for sorting a signed permutation...
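
As a small illustration of the operation this abstract refers to, the sketch below applies a single reversal to a signed permutation: the chosen segment is reversed and the sign of every element in it is flipped. The function name `reverse_segment` and the list-of-integers representation are assumptions made for this example; it does not reproduce the subquadratic method proposed in the chapter.

```python
def reverse_segment(perm, i, j):
    """Return a copy of perm with perm[i..j] (inclusive) reversed and every sign flipped.

    perm is a signed permutation stored as a list of non-zero integers
    (hypothetical representation chosen for this sketch).
    """
    if not (0 <= i <= j < len(perm)):
        raise ValueError("need 0 <= i <= j < len(perm)")
    # Reverse the order of the segment and negate each element in it.
    middle = [-x for x in reversed(perm[i:j + 1])]
    return perm[:i] + middle + perm[j + 1:]


# One possible scenario (not claimed to be most parsimonious) that sorts [3, -1, 2].
genome = [3, -1, 2]
step1 = reverse_segment(genome, 0, 1)
step2 = reverse_segment(step1, 1, 2)
step3 = reverse_segment(step2, 1, 1)
print(step1, step2, step3)
```

Each call returns a new list, so intermediate genomes in a scenario can be inspected side by side.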

chapter

Computational Problems in Perfect Phylogeny Haplotyping: Xor-Genotypes and Tag SNPs

Tamar Barzuza, Jacques S. Beckmann, Ron Shamir, Itsik Pe’er

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 14-31

The perfect phylogeny model for haplotype evolution has been successfully applied to haplotype resolution from genotype data. In this study we explore the application of the perfect phylogeny model to other problems in the design and analysis of genetic studies. We consider a novel type of data, xor-genotypes, which distinguish heterozygote from homozygote sites but do not identify the homozygote...

chapter

Sorting by Length-Weighted Reversals: Dealing with Signs and Circularity

Firas Swidan, Michael A. Bender, Dongdong Ge, Simai He, et al.

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 32-46

We consider the problem of sorting linear and circular permutations and 0/1 sequences by reversals in a length-sensitive cost model. We extend the results on sorting by length-weighted reversals in two directions: we consider the signed case for linear sequences and also the signed and unsigned cases for circular sequences. We give lower and upper bounds as well as guaranteed approximation ratios...

chapter

Optimizing Multiple Spaced Seeds for Homology Search

Jinbo Xu, Daniel G. Brown, Ming Li, Bin Ma

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 47-58

Optimized spaced seeds improve sensitivity and specificity in local homology search [1]. Recently, several authors [2-4] have shown that multiple seeds can have better sensitivity and specificity than single seeds. We describe a linear programming-based algorithm to optimize a set of seeds. Our algorithm offers a performance guarantee: the sensitivity of a chosen seed set is at least 70% of what can...

chapter

Approximate Labelled Subtree Homeomorphism

Ron Y. Pinter, Oleg Rokhlenko, Dekel Tsur, Michal Ziv-Ukelson

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 59-73

Given two undirected trees T and P, the Subtree Homeomorphism Problem is to find whether T has a subtree t that can be transformed into P by removing entire subtrees, as well as repeatedly removing a degree-2 node and adding the edge joining its two neighbors. In this paper we extend the Subtree Homeomorphism Problem to a new optimization problem by enriching the subtree-comparison with node-to-node...

chapter

On the Average Sequence Complexity

Svante Janson, Stefano Lonardi, Wojciech Szpankowski

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 74-88

In this paper we study the average behavior of the number of distinct substrings in a text of size n over an alphabet of cardinality k. This quantity is called the complexity index and it captures the “richness of the language” used in a sequence. For example, sequences with low complexity index contain a large number of repeated substrings and they eventually become periodic (e.g., tandem repeats...
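
To make the quantity in this abstract concrete, the following brute-force sketch counts the distinct substrings of a short text. It assumes that only non-empty substrings are counted, which the truncated abstract does not state, and the function name `complexity_index` is chosen here for illustration. It runs in quadratic space and is meant only for small inputs, not as the analysis used in the chapter.

```python
def complexity_index(text):
    """Count distinct non-empty substrings of text by brute force."""
    substrings = {
        text[i:j]
        for i in range(len(text))
        for j in range(i + 1, len(text) + 1)
    }
    return len(substrings)


# A highly repetitive text has few distinct substrings; a text with no
# repeated symbol reaches the maximum n * (n + 1) / 2 for its length n.
print(complexity_index("aaaa"), complexity_index("abab"), complexity_index("abcd"))
```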

chapter

Approximate Point Set Pattern Matching on Sequences and Planes

Tomoaki Suga, Shinichi Shimozono

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 89-101

The point set pattern matching problem is, given two sets “pattern” and “text” of points in Euclidean space, to find a linear transformation that maps the pattern to a subset of the text. We introduce an approximate point set pattern matching for axis-sorted point sequences that allows a translation, space insertions and deletions between points. We present an approximate pattern matching algorithm...

chapter

Finding Biclusters by Random Projections

Stefano Lonardi, Wojciech Szpankowski, Qiaofeng Yang

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 102-116

Given a matrix X composed of symbols, a bicluster is a submatrix of X obtained by removing some of the rows and some of the columns of X in such a way that each row of what is left reads the same string. In this paper, we are concerned with the problem of finding the bicluster with the largest area in a large matrix X. The problem is first proved to be NP-complete. We present a fast and efficient...
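
The definition of a bicluster given above can be checked directly for a candidate choice of rows and columns. The sketch below is only such a checker, with the hypothetical names `is_bicluster` and `area`; it does not implement the random-projection algorithm of the chapter, and the example matrix is made up.

```python
def is_bicluster(matrix, rows, cols):
    """Return True if every selected row reads the same string over the selected columns."""
    words = {tuple(matrix[r][c] for c in cols) for r in rows}
    return len(words) <= 1


def area(rows, cols):
    """Area of a submatrix: number of kept rows times number of kept columns."""
    return len(rows) * len(cols)


# Made-up 3 x 4 symbol matrix, one string per row.
X = ["abca",
     "abda",
     "cbca"]

print(is_bicluster(X, [0, 2], [1, 2, 3]), area([0, 2], [1, 2, 3]))
print(is_bicluster(X, [0, 1], [2]))
```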

chapter

Real-Time String Matching in Sublinear Space

Leszek Gąsieniec, Roman Kolpakov

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 117-129

We study a problem of efficient utilisation of extra memory space in real-time string matching. We propose, for any constant ε > 0, a real-time string matching algorithm claiming O(m^ε) extra space, where m is the size of a pattern. All previously known real-time string matching algorithms use Ω(m) extra space.

chapter

On the k-Closest Substring and k-Consensus Pattern Problems

Yishan Jiao, Jingyi Xu, Ming Li

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 130-144

Given a set S = {s₁, s₂, ..., s_n} of strings each of length m, and an integer L, we study the following two problems. k-Closest Substring problem: find k center strings c₁, c₂, ..., c_k of length L minimizing d such that for each s_j ∈...

chapter

A Trie-Based Approach for Compacting Automata

Maxime Crochemore, Chiara Epifanio, Roberto Grossi, Filippo Mignosi

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 145-158

We describe a new technique for reducing the number of nodes and symbols in automata based on tries. The technique stems from some results on anti-dictionaries for data compression and does not need to retain the input string, differently from other methods based on compact automata. The net effect is that of obtaining a lighter automaton than the directed acyclic word graph (DAWG) of Blumer et al...

chapter

A Simple Optimal Representation for Balanced Parentheses

Richard F. Geary, Naila Rahman, Rajeev Raman, Venkatesh Raman

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 159-172

We consider succinct, or highly space-efficient, representations of a (static) string consisting of n pairs of balanced parentheses, that support natural operations such as finding the matching parenthesis for a given parenthesis, or finding the pair of parentheses that most tightly enclose a given pair. This problem was considered by Jacobson, [Proc. 30th FOCS, 549–554, 1989] and Munro and Raman,...
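
The two operations named in this abstract can be illustrated with a plain, non-succinct sketch: a stack pass builds a table of matching positions, and a leftward scan finds the tightest enclosing pair. The names `match_table` and `enclose` are assumptions for this example, and the table uses far more space than the representations the chapter is about; the sketch only shows what the operations return.

```python
def match_table(parens):
    """Return a list where entry p is the position matching the parenthesis at p."""
    match = [None] * len(parens)
    stack = []
    for pos, ch in enumerate(parens):
        if ch == "(":
            stack.append(pos)
        else:
            if not stack:
                raise ValueError("unbalanced input")
            opener = stack.pop()
            match[opener] = pos
            match[pos] = opener
    if stack:
        raise ValueError("unbalanced input")
    return match


def enclose(parens, open_pos):
    """Return (open, close) of the pair most tightly enclosing the pair opened at open_pos."""
    match = match_table(parens)
    depth = 0
    # Scan left, skipping complete pairs, until an unmatched "(" is met.
    for pos in range(open_pos - 1, -1, -1):
        if parens[pos] == ")":
            depth += 1
        elif depth == 0:
            return pos, match[pos]
        else:
            depth -= 1
    return None  # the pair is outermost


s = "(()(()))"
print(match_table(s)[3], enclose(s, 4), enclose(s, 3), enclose(s, 0))
```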

chapter

Two Algorithms for LCS Consecutive Suffix Alignment

Gad M. Landau, Eugene Myers, Michal Ziv-Ukelson

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 173-193

The problem of aligning two sequences A and B to determine their similarity is one of the fundamental problems in pattern matching. A challenging, basic variation of the sequence similarity problem is the incremental string comparison problem, denoted Consecutive Suffix Alignment, which is, given two strings A and B, to compute the alignment solution of each suffix of A versus B. Here, we present...

chapter

Efficient Algorithms for Finding Submasses in Weighted Strings

Nikhil Bansal, Mark Cieliebak, Zsuzsanna Lipták

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 194-204

We study the Submass Finding Problem: Given a string s over a weighted alphabet, i.e., an alphabet Σ with a weight function $\mu: \Sigma \to \mathbb{N}$, decide for an input mass M whether s has a substring whose weights sum up to M. If M is indeed a submass, then we want to find one or all occurrences of such substrings. We present efficient algorithms for both the decision and the search problem...
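
For a single query mass, the decision and search versions stated above can be answered with one pass over prefix sums. The sketch below is that straightforward baseline under the assumption that weights are non-negative integers and that only non-empty substrings count; the name `find_submass` and the example weights are made up, and this is not the algorithm presented in the chapter.

```python
def find_submass(s, mu, mass):
    """Return (start, end) such that the weights of s[start:end] sum to mass, or None.

    mu maps each symbol to a non-negative integer weight (assumption of this sketch).
    """
    first_seen = {0: 0}  # prefix weight -> shortest prefix length reaching it
    total = 0
    for end, ch in enumerate(s, start=1):
        total += mu[ch]
        start = first_seen.get(total - mass)
        if start is not None:
            return start, end
        first_seen.setdefault(total, end)
    return None


weights = {"a": 1, "b": 3, "c": 5}  # hypothetical weight function
print(find_submass("abcab", weights, 8))
print(find_submass("abcab", weights, 7))
```

Storing the first position of each prefix weight keeps the pass linear in the length of s for one query.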

chapter

Maximum Agreement and Compatible Supertrees

Vincent Berry, François Nicolas

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 205-219

Given a collection of trees on n leaves with identical leaf set, the Mast, resp. Mct, problem consists in finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, resp. have a common refinement. For Mast, resp. Mct, on k rooted trees, we give an O(min{3^p kn, 2.27^p + kn³}) exact algorithm,...

chapter

Polynomial-Time Algorithms for the Ordered Maximum Agreement Subtree Problem

Anders Dessmark, Jesper Jansson, Andrzej Lingas, Eva-Marta Lundell

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 220-229

For a set of rooted, unordered, distinctly leaf-labeled trees, the NP-hard maximum agreement subtree problem (MAST) asks for a tree contained (up to isomorphism or homeomorphism) in all of the input trees with as many labeled leaves as possible. We study the ordered variants of MAST where the trees are uniformly or non-uniformly ordered. We provide the first known polynomial-time algorithms for the...

chapter

Small Phylogeny Problem: Character Evolution Trees

Arvind Gupta, Ján Maňuch, Ladislav Stacho, Chenchen Zhu

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 230-243

Phylogenetics is a science of determining connections between groups of organisms in terms of ancestor/descendant relationships, usually expressed by phylogenetic trees, also called “trees of life”, cladograms, or dendrograms. In the parsimony approach to reconstructing phylogenetic trees, the goal is to find the most parsimonious tree, i.e., the tree requiring the smallest number/score of evolutionary...

chapter

The Protein Sequence Design Problem in Canonical Model on 2D and 3D Lattices

Piotr Berman, Bhaskar DasGupta, Dhruv Mubayi, Robert Sloan, et al.

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 244-253

In this paper we investigate the protein sequence design (PSD) problem (also known as the inverse protein folding problem) under the Canonical model on 2D and 3D lattices [12,25]. The Canonical model is specified by (i) a geometric representation of a target protein structure with amino acid residues via its contact graph, (ii) a binary folding code in which the amino acids are classified as hydrophobic...

chapter

A Computational Model for RNA Multiple Structural Alignment

Eugene Davydov, Serafim Batzoglou

Lecture Notes in Computer Science > Combinatorial Pattern Matching > 254-269

This paper addresses the problem of aligning multiple sequences of non-coding RNA genes. We approach this problem with the biologically motivated paradigm that scoring of ncRNA alignments should be based primarily on secondary structure rather than nucleotide conservation. We introduce a novel graph theoretic model (NLG) for analyzing algorithms based on this approach, prove that the RNA multiple...

Publication language

English (37)
Polish (1)

Keywords

APPROXIMATE MATCHING (1)
AUTOMATA AND FORMAL LANGUAGES (1)
CLOSEST STRING AND SUBSTRINGS (1)
CONSENSUS PATTERN (1)
DESIGN AND ANALYSIS OF ALGORITHMS (1)
EDIT DISTANCE (1)
FACTOR AND SUFFIX AUTOMATA (1)
GRAPH REALIZATION (1)
HAPLOTYPES (1)
INDEX (1)
INVERTED INDICES (1)
MERGING (1)
MULTIPLE SEARCH (1)
MUSICAL SEQUENCE SEARCH (1)
K-CENTER PROBLEMS (1)
PERFECT PHYLOGENY (1)
POINT SET PATTERN MATCHING (1)
POLYNOMIAL TIME APPROXIMATION SCHEME (1)
PROTEIN DESIGN (1)
ROTATION (1)
SECIS (1)
SELENOCYSTEINE (1)
SET OPERATIONS (1)
SNPS (1)
SUFFIX TREE (1)
TAG SNPS (1)
TEXT COMPRESSION (1)
TWO DIMENSIONAL PATTERN MATCHING (1)
WEB SEARCH ENGINES (1)

INFONA - science communication portal
