A set of protein pairs predicted to interact, with a high ratio of true positives, is valuable for target selection in experiments such as protein structure determination. Our goal in this paper is to investigate the problem of finding such a set of protein pairs in an organism using machine learning methods. The yeast genome was used for this study and a support vector machine was adopted as the classification...
To compare large genomic DNA sequences of related organisms and of different species, researchers need efficient methods to align many long sequences. We developed a new tool, super multiple genome alignment (SMGA), specifically for rapid multiple global alignment of genomic sequences. SMGA can align a data set consisting of 50M of 9 enterobacterial genomic sequences in 40 minutes. Our method...
Structural genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good financial value, and tractable. In 2003, we presented the "Pfam5000" strategy, which involves selecting the 5,000...
The identification of protein-protein interactions, along with their spatial and temporal localization, provides vital data for assigning functional information to proteins. Historically, these data sets, obtained from fluorescence microscopy, have been analyzed manually, a process that is both time consuming and tedious. The development of an automated system that can measure the location dynamics of the...
The CellML language is an XML-based specification for representing mathematical models of biological processes. The University of Auckland's Bioengineering Institute is committed to creating tools and evolving the language so that biologists and modelers can build, understand, and share models. The CellML standard and tool developments are open source; we encourage researchers and software developers...
The Internet is becoming increasingly accessible and new technologies are enabling the delivery of more features to end users. It is therefore increasingly compelling to develop technology to facilitate the delivery of educational content and computational tools via the Internet. Here we report on the Internet enabling of the CMISS package as a Web browser extension, and its use in a custom online...
The performance of support vector regression estimation was analyzed. It was found that the insensitive factor ε can significantly affect the performance of support vector regression estimation. The noise in the sample data should be considered when determining the insensitive factor ε for support vector regression. A novel support vector regression based on non-uniform lost...
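For readers unfamiliar with the insensitive factor mentioned in this abstract, the following is a minimal sketch of the standard ε-insensitive loss used in support vector regression. It is not the non-uniform variant the abstract goes on to propose, and the function names are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard SVR loss: residuals smaller than epsilon cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_loss(targets, predictions, epsilon):
    """Average epsilon-insensitive loss over paired targets and predictions."""
    losses = [
        epsilon_insensitive_loss(t, p, epsilon)
        for t, p in zip(targets, predictions)
    ]
    return sum(losses) / len(losses)
```

A larger ε widens the tube in which errors are ignored, which is why the noise level of the data is relevant when choosing it.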
This paper presents a microfluidic mixing module array developed for bio-fluid/chemical delivery and mixing. Vortex micropumps, microchannels and pillared-surface diaphragm (PSD) active micromixers were successfully integrated into a single polymer-based microfluidic chip, consisting of three mixing modules. The pumping characteristics of the vortex micropump were investigated with both analytical...
A new cryosection milling imaging system with high spatial resolution has been developed to screen small laboratory animals such as mice and rats. The system hardware consists of a cutting device, an image capture and photography device, refrigerated storage, and a parallel data processing system. With this system, a high-spatial-resolution (no less than 20 μm) atlas of small laboratory animals can be obtained. After image...
A new method based on support vector regression (SVR) has been introduced to predict the relative solvent accessibility (RSA) of residues from protein primary sequences, using the local information of protein primary sequences as input. Unlike most previous methods, which are designed to predict the exposure state (exposed/buried, exposed/intermediate/buried, etc.) of a particular residue...
Geometry-based mechanical models called elastic network models (ENMs) have been developed at various resolutions for the study of macromolecular motions. In a coarse-grained ENM, a biological system is represented as a network of springs connecting representative points. These points range from single atoms to functional domains, depending on the level of detail in the modeling. In this paper presented are...
Conformational searching is common in many applications, and algorithmic improvements in either speed or quality would have a profound impact. In this paper, we address a target-constrained conformational searching problem and show two subdivision methods that approximate the solutions rather than solving for the exact solutions. The performance of the methods is presented.
Annotation of the functional sites on the surface of a protein has been the subject of many studies. In this regard, the search for attributes and features characterizing these sites is of prime importance. Here, we present an implementation of a kernel-based machine learning protocol for identifying the residues on a DNA-binding protein that form the interface with the DNA. Sequence and structural features...
Protein sequence alignments reveal the evolutionary information between homologous sequences. Traditional sequence alignment methods use only sequence information and ignore the structural information from the template. Recently, Kleinjung et al. developed a contact-based sequence alignment method that uses the structural information from side-chain contacts. Alignment scores are provided by the CAO...
The subcellular location of a protein is one of its key functional characteristics, as proteins must be localized correctly at the subcellular level to function normally. In this paper, all motifs in PROSITE were examined and those that are indicative of eukaryotic protein subcellular localization were selected. A corresponding motif module was built and combined with our earlier work: LOCSVMPSI...
Proteolytic processing occurs predominantly at basic amino acid residues. The existence of cleavage sites not recognized by the rules proposed in previous studies prompts us to test whether, and to what extent, the sites cleave. Because the cleavage site database from SWISS is imbalanced, SMOTE combined with Tomek links is applied to over-sample the data. A neural network method is then developed to predict...
This paper presents an application of neural networks to locating the copper-binding sites of metalloproteins. Using annotated metalloproteins downloaded from the PDB, sequences including copper-binding sites were extracted. By further finding the particular core segments of copper-binding sites, the input and output information for training is refined. Moreover, this paper investigates the number of...
DNA sequences are generally very long chains of sequentially linked nucleotides. There are four different nucleotides and combinations of these build the nucleotide information of sequence files contained in data sources. When a user searches for any sequence for an organism, a compressed sequence file can be sent from the data source to the user. The compressed file then can be decompressed at the...
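The abstract is cut off before it describes the compression scheme, so the following is only a generic illustration of why a four-letter alphabet compresses well: each nucleotide fits in two bits, so four bases fit in one byte. This is a minimal sketch under that assumption, not the authors' method, and the names `pack` and `unpack` are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte.

    Returns the packed bytes and the original length, which is needed
    to discard padding when the length is not a multiple of four.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, length):
    """Recover the original string from packed bytes and its length."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])
```

In this sketch the sender would transmit the packed bytes together with the sequence length, and the receiver would call `unpack` to restore the text.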
We propose a "genome signature" for bacterial genomes based on a triplet Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at the genus, species and strain levels. Based on the model, a simple distance measure is proposed for constructing phylogenetic trees. Unlike other...
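The abstract does not give the details of the Markov model or of the distance measure. As a rough stand-in, the sketch below builds an alignment-free signature from overlapping triplet frequencies and compares two sequences by Euclidean distance. Both choices are assumptions made for illustration, not the model proposed in the paper.

```python
from collections import Counter
from itertools import product
import math

# All 64 possible nucleotide triplets, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def signature(seq):
    """Return the relative frequency of each overlapping triplet in seq."""
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts[t] for t in TRIPLETS)
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


def distance(seq_a, seq_b):
    """Euclidean distance between the triplet signatures of two sequences."""
    sig_a, sig_b = signature(seq_a), signature(seq_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sig_a, sig_b)))
```

A matrix of such pairwise distances could then be passed to a standard tree-building method, which is the general idea the abstract describes.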
Multiple sequence alignment is a central topic of extensive research in computational biology. Basically, two or more protein sequences are compared so as to evaluate their similarity. This work reports a methodology for parallel processing of a multiple sequence alignment algorithm (ClustalW) in an environment of networked computers. A detailed description of the modules that compose the distributed...