Data in large-scale genetic studies of complex human diseases, such as substance use disorders, are often incomplete. Despite great progress in genotype imputation, e.g., the IMPUTE2 method, considerably less progress has been made in inferring phenotypes. We designed a novel approach to integrate individuals' comorbid conditions with their genotype data to infer missing (unreported) diagnostic criteria...
Clustering cancer patients into subgroups and identifying cancer subtypes is an important task in cancer genomics. Clustering based on comprehensive multi-omic molecular profiling can often achieve better results than clustering on a single data type, since each omic data type may contain complementary information. However, it is challenging to integrate heterogeneous omic data directly. Based on one...
Representation learning algorithms in the medical domain map high-dimensional real-world medical concepts to a low-dimensional vector space, encode rich medical knowledge, and have improved various machine learning applications in the field. However, previous representation learning models in the medical domain failed to consider the multi-sense characteristic of medical concepts. Moreover, the...
Heart failure (HF) has a highly variable annual mortality rate, and there is an urgent need to determine patient prognosis to enable informed decision-making about heart failure treatment strategies. Existing survival risk prediction models either require features that limit their applicability or pose difficulties for parameter estimation as physicians have to use a limited set of variables with...
There has been a continuing demand for traditional and complementary medicine worldwide. A fundamental and important topic in Traditional Chinese Medicine (TCM) is to optimize the prescription and to detect herb regularities from TCM data. In this paper, we propose a novel clustering model to solve this general problem of herb categorization, a pivotal task of prescription optimization and herb regularities...
We present a novel computational method for Multiple Sequence Alignment (MSA), a fundamental problem in computational biology. In contrast to other known approaches, our method searches for an optimal alignment — structurally and evolutionarily — by inserting or deleting gaps from a set of initial candidates in an efficient manner. Our method called a Universal Partitioning Search (UPS) approach for...
A fundamental and important challenge in modern datasets of ever increasing dimensionality is variable selection, which has taken on renewed interest recently due to the growth of biological and medical datasets with complex, non-i.i.d. structures. Naïvely applying classical variable selection methods such as the Lasso to such datasets may lead to a large number of false discoveries. Motivated by...
There are intensive computational efforts to discover large-scale microbial interactions from metagenomic abundance data; however, it is often difficult to validate such inferred interactions without a manually curated dataset. There are also a number of small-scale microbial interactions reported with experimental confidence across a large body of literature. Text mining can be employed to extract such microbial...
Trigger detection plays a key role in biomedical event extraction and directly influences its results. The traditional biomedical event trigger recognition method relies on manually designed features to construct feature vectors; it not only consumes a great deal of manpower but also generalizes poorly. Most methods of trigger...
Non-invasive blood glucose measurement is a crucial challenge in both the academic and industrial communities. Currently, most non-invasive solutions are based on optical signals. However, their accuracy is still far from clinical requirements if the measured optical signals are used directly to estimate the corresponding glucose levels. To solve this challenge, a novel Back-propagation Monte Carlo...
Next Generation Sequencing has introduced novel means of sequencing millions of DNA molecules simultaneously and has opened up new avenues in the field of bioinformatics that requires high performance computing technologies. Bioinformatics pipelines are constructed to carry out bioinformatics analyses in a fast and efficient manner. Workflow systems are developed to simplify the construction of pipelines...
High-sensitivity C-reactive protein (hs-CRP) plays an important role in the onset of metabolic syndrome and cardiovascular diseases (CVD), but little is known about the association between hs-CRP and obesity-related metabolic abnormalities in young people without classical CVD risk factors. This motivated us to investigate the association among hs-CRP, body fat mass (FM) distribution, and other cardiometabolic...
Drug-target interaction identification is of high importance in drug research and development. The traditional experimental paradigm is costly, while the previous in silico prediction paradigm remains a challenge because of diversified data production platforms and data scarcity. In this paper, we modeled drug-target interaction prediction as a binary classification task based on transcriptome data...
In molecular biology, phenotypes are often described using complex semantics and diverse biomedical expressions, thereby facilitating the development of named entity recognition (NER). Here, we propose a novel approach to recognizing plant phenotypes by cascading word embedding to sentence embedding with a class label enhancement. We utilized a word embedding method to find high-frequency phenotypes...
Clinical decision support systems can effectively compensate for the limits of doctors' knowledge, reduce misdiagnosis, and help improve health. Traditional genetic data storage and analysis technology based on a stand-alone environment has limited scalability, making it difficult to meet the computational requirements of rapidly growing genetic data. In this paper, we propose a distributed...
To better estimate a patient's coronary artery disease condition, we aim to predict the number of angioplasties (a coronary artery procedure) by taking into account all the information in the patient's Electronic Health Record (EHR) data. For this purpose, two exponential family models, the multinomial distribution and the Poisson distribution, are considered, which treat the target...
Antimicrobial peptides are short amino acid sequences with antibacterial, antifungal, and antiviral properties. Antibacterial peptides have the potential to form a new class of antibiotics to aid in combating bacterial antibiotic resistance. Most machine learning methodologies applied to the task of identifying antimicrobial peptides have used features representing the presence or absence of...
Recent studies show that drug-disease associations provide important information for drug discovery and drug repositioning. Wet experimental identification of drug-disease associations is time-consuming and labor-intensive. Therefore, the development of computational methods that predict drug-disease associations is an urgent task. In this paper, we propose a novel computational method named NTSIM,...
Sequence alignment is a core step in the processing of DNA and RNA sequencing data. In this paper, we present a high performance GPU accelerated set of APIs (GASAL) for pairwise sequence alignment of DNA and RNA sequences. The GASAL APIs provide accelerated kernels for local, global as well as semi-global alignment, allowing the computation of the alignment score, and optionally the start and end...
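As a rough illustration of what a local alignment score is, the following minimal Python sketch computes a Smith-Waterman score on the CPU. It is not the GASAL API and says nothing about how GASAL is implemented; the function name `local_alignment_score`, the scoring values, and the linear gap penalty are all assumptions chosen for brevity.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with a linear gap penalty.

    Hypothetical helper for illustration only; scoring values are assumed.
    Returns (best_score, end_in_a, end_in_b), where the end positions are
    1-based indices of the last aligned character in each sequence
    (0, 0 when no positive-scoring alignment exists).
    """
    prev = [0] * (len(b) + 1)          # previous row of the DP matrix
    best, best_end = 0, (0, 0)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)      # current row of the DP matrix
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_end = curr[j], (i, j)
        prev = curr
    return best, best_end[0], best_end[1]
```

Keeping only two rows of the matrix is enough when just the score and end position are needed; recovering the start position or the full alignment would require a traceback or a second pass.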
Inverted repeats in DNA sequences have long been known to have both major beneficial and detrimental effects on how DNA is transcribed and duplicated. Palindromic sequences are frequently translated into proteins and may also facilitate DNA repair in some instances. However, they are also associated with significantly increased risk of mutation. Current methods are either slow or limited...
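To make the term concrete, the sketch below finds perfect DNA palindromes, i.e. stretches that equal their own reverse complement, by naive expansion around each position. It is only an illustration of what is being searched for, not the method of the abstract above; the names `reverse_complement` and `find_palindromes`, the `min_len` threshold, and the restriction to spacer-free, mismatch-free repeats are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=4):
    """List (start, length) of maximal perfect reverse-complement palindromes.

    Hypothetical helper: handles only even-length palindromes with no
    spacer and no mismatches. Runs in O(n^2) time in the worst case.
    """
    hits = []
    n = len(seq)
    for center in range(1, n):
        left, right = center - 1, center
        # Expand while the outer bases are complementary to each other.
        while left >= 0 and right < n and seq[left].translate(COMPLEMENT) == seq[right]:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, length))
    return hits
```

For example, `find_palindromes("AAGAATTCCC")` reports the site `GAATTC` as a hit starting at index 2 with length 6.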