Thanks to recent advances in the field of genomics, it is now possible to create a comprehensive atlas of the basic units of life: cells. In this paper, we present a framework for single-cell genomics research which employs several new machine learning models, such as convolutional neural networks, deep auto-encoders, and recurrent neural networks. With these effective learning models on multi-source...
Breast cancer is the most common type of invasive cancer in females. It accounts for 18.2% of all cancer deaths worldwide. Although somatic mutations play important roles in cancer development and prognosis, the outcome predictions are largely based on the expression of marker genes. We submit that developing an innovative prognostic model incorporating somatic mutations with gene expression can improve...
Pulmonary adenocarcinoma in situ (AIS) is an intermediate subtype of lung adenocarcinoma that exhibits non-invasive growth patterns but can develop into invasive adenocarcinoma. Almost 100% of AIS patients can be cured with complete resection. In contrast, the five-year survival rate for those diagnosed with invasive lung adenocarcinoma is only about 4%. To gain a better understanding of adenocarcinoma...
Diabetes mellitus and obesity are becoming some of the most serious public health challenges in the world. To help researchers more quickly uncover the complex relationships among diabetes mellitus, obesity, and related diseases in the literature, and to inspire their search for effective treatments for these diseases, we propose a novel model named representative latent Dirichlet...
The flow-diverting (FD) stent has become a commonly used endovascular device for treating cerebral aneurysms. It discourages blood from entering the aneurysm, thereby reducing the likelihood of aneurysm rupture. Using computational fluid dynamics (CFD) to simulate the aneurysmal haemodynamics after FD treatment could help clinicians predict the stent effectiveness prior to the real procedure in the...
Long noncoding RNAs (lncRNAs) function as regulators and play critical roles in diverse biological processes; however, the majority of lncRNAs have not been characterized, and their roles in regulation remain to be elucidated. Present RNA-seq assembly approaches are insufficient to identify complete full-length transcripts and often reveal an excessive number of single-exon lncRNAs, many of which tend to be...
Efficacy prediction is an inseparable part of traditional Chinese medicine (TCM). We first analyze the correlation between indicators and efficacy, and the maximum blood-drug concentration (Cmax) is chosen as the target to reflect the efficacy of drugs. Then we apply linear regression (LR), support vector regression (SVR), and artificial neural networks (ANNs) to predict the efficacy of Wuji pills. The results of the leave-one-out...
The completeness of gene expression data is essential to many gene expression data analysis tasks. In this paper, inspired by the idea of semi-supervised learning with tri-training, a hybrid iterative imputation method called tri-imputation is proposed to estimate the missing values in gene expression data. In detail, in each round of tri-imputation, any two imputation methods collaborate with...
In this paper, we present a study on how to achieve Byzantine fault tolerance for collaborative editing systems with commutative operations. Recent research suggests that Conflict-free Replicated Data Types (CRDTs) can be used to construct collaborative editing systems where concurrent update operations are commutative. This new approach is shown to avoid the complex issue of conflict resolution for...
The antioxidant activity of the green tea polyphenol epigallocatechin-3-gallate (EGCG) has been found to be critical in inhibiting carcinogenesis. In our previous study, we identified a set of protein-coding genes and microRNAs whose expression was significantly modulated in response to EGCG treatment in tobacco carcinogen-induced lung adenocarcinoma in A/J mice. In this study, we further conducted...
In this paper, we present a model to automatically generate efficient transportation networks given a simulated urban environment with predefined population distributions and other physical constraints. Based on an empirical analysis of different topological structures of networks, we found that the efficiency of transportation networks heavily depends on the layout of the stations. The model...
Distributional semantics and frame semantics are two representative views on language understanding in the statistical world and the linguistic world, respectively. In this paper, we combine the best of both worlds to automatically induce the semantic slots for spoken dialogue systems. Given a collection of unlabeled audio files, we exploit continuous-valued word embeddings to augment a probabilistic...
Spoken dialogue systems typically use predefined semantic slots to parse users' natural language inputs into unified semantic representations. To define the slots, domain experts and professional annotators are often involved, and the cost can be high. In this paper, we ask the following question: given a collection of unlabeled raw audio recordings, can we use the frame semantics theory to automatically...
Previous work on dialogue act classification has primarily focused on dense generative and discriminative models. However, since automatic speech recognition (ASR) outputs are often noisy, dense models might generate biased estimates and overfit to the training data. In this paper, we study sparse modeling approaches to improve dialogue act classification, since the sparse models maintain a compact...
We study the opportunity for using crowdsourcing methods to acquire language corpora for use in natural language processing systems. Specifically, we empirically investigate three methods for eliciting natural language sentences that correspond to a given semantic form. The methods convey frame semantics to crowd workers by means of sentences, scenarios, and list-based descriptions. We discuss various...
We investigate the problem of automatically detecting unnatural word-level segments in unit selection speech synthesis. We use a large set of features, namely, target and join costs, language models, prosodic cues, energy and spectrum, and Delta Term Frequency Inverse Document Frequency (TF-IDF), and we report comparative results between different feature types and their combinations. We also compare...
In this paper, we use extended letter-to-sound rules for automatic mispronunciation detection, aiming to detect pronunciation errors made by Chinese learners of English. A knowledge-based approach is used to generate an extended pronunciation lexicon, which is incorporated into the HMM-based mispronunciation detection system. Pronunciation errors that lead to misunderstanding of a word are expected to be...