FASTQ is the de facto standard for data from next-generation sequencing platforms. The FASTQ format uses four lines per read: two lines for header information, one for the sequence itself, and one for the quality scores. The proposed compression scheme treats the various lines of each four-line set differently. The highly repetitive headers are encoded using an LZ77 variant. The reads themselves are...
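The four-line layout described above can be illustrated with a minimal Python sketch. It only splits a FASTQ stream into its four per-read fields, so that each field could be handed to a different coder; it is not the compression scheme of the paper. The names `FastqRecord` and `read_fastq` are hypothetical, and the sketch assumes the common convention that the first header line starts with `@` and the second with `+`.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read, split into the four lines of the FASTQ layout (hypothetical type)."""
    header: str    # first header line, without the leading '@'
    sequence: str  # the read itself
    plus: str      # second header line, without the leading '+' (often empty)
    quality: str   # quality scores, one character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a text stream, four lines at a time."""
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return  # clean end of file
        header, sequence, plus, quality = (ln.rstrip("\r\n") for ln in lines)
        # Assumed convention: '@' opens the first header, '+' the second.
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, plus[1:], quality)


# Usage: separate the streams so each could be compressed differently.
example = "@read1\nACGT\n+\nIIII\n@read2\nGG\n+read2\n!!\n"
records = list(read_fastq(io.StringIO(example)))
headers = [r.header for r in records]
qualities = [r.quality for r in records]
```

Keeping the headers, reads, and quality strings in separate lists mirrors the idea of treating each line type differently, since each stream has its own statistics.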
We propose a compression algorithm for the quality scores contained in FASTQ files, which are generated in large volumes during high-throughput sequencing. The proposed algorithm is a context-dependent arithmetic coder based on observations of the structure of quality scores in FASTQ files. Simulation results indicate a significantly superior performance of the algorithm to the current state...
Average mutual information (AMI) has been used in a number of applications in bioinformatics. In this paper we present its use to study genetic changes in populations, in particular populations of HIV viruses. Disease progression of HIV-1 infection in infants can be rapid, resulting in death within the first year, or slow, allowing the infant to survive beyond the first year. We study the development...
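As a rough illustration of the quantity involved, the following Python sketch computes an AMI profile for a symbol sequence. The abstract does not state the definition used, so this assumes a commonly used one: the mutual information, in bits, between symbols located k positions apart, estimated from empirical pair frequencies. The function names `average_mutual_information` and `ami_profile` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def average_mutual_information(seq: str, k: int) -> float:
    """Mutual information (bits) between symbols k positions apart.

    Assumed definition: sum over symbol pairs (x, y) of
    p_k(x, y) * log2(p_k(x, y) / (p(x) * p(y))), with all
    probabilities estimated from the sequence itself.
    """
    n = len(seq) - k  # number of (x, y) pairs at lag k
    if k < 1 or n < 1:
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    pairs = Counter(zip(seq, seq[k:]))
    left = Counter(seq[:n])   # marginal of the first symbol of each pair
    right = Counter(seq[k:])  # marginal of the second symbol of each pair
    ami = 0.0
    for (x, y), count in pairs.items():
        p_xy = count / n
        ami += p_xy * math.log2(p_xy / ((left[x] / n) * (right[y] / n)))
    return ami


def ami_profile(seq: str, lags: Iterable[int]) -> Dict[int, float]:
    """AMI evaluated at each requested lag k."""
    return {k: average_mutual_information(seq, k) for k in lags}


# Usage on a toy sequence; real inputs would be genomic sequences.
profile = ami_profile("ACGTACGTTGCAACGT" * 4, range(1, 6))
```

Taking the marginals from the same positions that form the pairs keeps the estimate non-negative, which is a design choice of this sketch rather than something stated in the source.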
This paper presents the design and measurements of a predictive coding on-sensor compression CMOS imager. Predictive coding is employed to decorrelate the image. The prediction operations are performed in the analog domain to avoid quantization noise and to decrease the area complexity of the circuit. The decorrelated image is encoded with a bank of column-parallel entropy encoders. Each encoder is...
A focal plane video compression integrated circuit is presented. The design consists of a 128 × 128 pixel array and a bank of column-level processors. Each of the column-level processors performs the tasks of image decorrelation, quantization, and entropy encoding. The chip provides at its output a compressed bit stream. The integration of the quantizer and the entropy encoder at the column...
Two sets of figures are presented without discussion. The first set shows: (1a) Average Mutual Information Profile for the Human Chromosomes plotted for values of k between 5 and 50; and (1b) Average Mutual Information Profile for the Mouse Chromosomes plotted for values of k between 5 and 50. The second set of figures shows: (2a) Average Mutual Information Profile for the C. elegans Chromosomes plotted for...
We propose a "genome signature" for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at genus, species and strain levels. Based on the model, a simple distance measure is proposed for constructing phylogeny trees. Unlike other...