Current standard genome-wide association studies (GWAS) rely on a simple analysis that focuses on the association between a single genetic factor and a single common complex trait. However, since most common complex traits are associated with multiple genetic factors and their epistasis, this simple analysis is not powerful enough to detect multiple genetic factors. Furthermore, in many GWAS,...
Mining published articles in biology and medicine is a favored means of identifying potential biomarkers compared with the conventional reviewing process. This is made possible by the development of public literature databases and data mining algorithms. In this article, we present a method to extract novel protein interactions from online full-text articles for biomarker discovery. By evaluating support...
In this paper we present a global analysis of colorectal cancer genes and their associated miRNAs. Significant genes in colon cancer were obtained by mining the literature, and cancer-related miRNAs were obtained from miRBase. Five different features were analyzed to obtain a global gene-miRNA profile. By combining the topological features with miRNA-gene associations and gene propensity...
In this paper, we describe our contribution to the Spoken Dialog Challenge (SDC). We set up a user simulation using the large Let's Go corpus as a resource to build our models. Automatic calls were made to all four dialog systems in the SDC, which are bus information systems that cover the schedule of Pittsburgh, PA. We discuss in detail the architecture and required setup for our system-independent user simulation...
When active learning is applied to real-world applications, human experts usually act as oracles to provide labels. However, humans make mistakes, so noise might be introduced during the learning process. Most previous studies simplify the problem by assuming uniformly distributed noise over the sample space. Such an assumption, however, might fail to precisely reflect the human experts' behaviour in...
When we think of an object in a supervised learning setting, we usually perceive it as a collection of fixed attribute values. Although this setting may be well suited to many classification tasks, we propose a new object representation and, with it, a new challenge in data mining: an object is no longer described by one set of attributes but is represented in a hierarchy of attribute sets in different...
In active learning, where a learning algorithm has to purchase the labels of its training examples, it is often assumed that there is only one labeler available to label examples, and that this labeler is noise-free. In reality, it is possible that there are multiple labelers available (such as human labelers in the online annotation tool Amazon Mechanical Turk) and that each such labeler has a different...
In recent years, emerging application domains have introduced new constraints and methods in the data mining field. One such application domain is activity discovery from sensor data. Activity discovery and recognition play an important role in a wide range of applications, from assisted living to security and surveillance. Most of the current approaches for activity discovery assume a static...
Human motion recognition in video data has several interesting applications in fields such as gaming, senior/assisted living environments, and surveillance. In these scenarios, we might have to consider adding new motion classes (i.e. new types of human motions to be recognized) as well as new training data (say, for handling different types of subjects). Hence, both accuracy of classification and...
Understanding the connectome of the human brain is a major challenge in neuroscience. Discovering the wiring and the major cables of the brain is essential for a better understanding of brain function. Diffusion tensor imaging (DTI) offers a potential way to explore the organization of white matter fiber tracts in human subjects non-invasively. However, it is a long way from the approximately...
In this paper, the definition and function of curation are extended from the perspective of chance discovery. A new definition, function, and effect of curation in chance discovery are discussed. First, ordinary types of curation are reviewed. The definition by the American Association of Museums Curators Committee (AAMCC) and digital data curation are presented. The latter in particular is not curation in (art)...
The problem of link prediction has been studied extensively in the literature. There are various versions of the link prediction problem, e.g., the link existence problem, the link removal problem, and predicting edge weights over time. In this paper we describe a new type of link prediction problem, called the Internetwork link-prediction problem, where the task is to predict links across different networks. Thus...
Document classification plays an increasingly important role in extracting and organizing knowledge. However, the Web document classification task is hindered by the huge number of Web documents and the limited human judgment available for labeling training data. To obtain sufficient training data in a cost-efficient way, we propose a semi-supervised learning approach to predict a document's...
The popularity of e-commerce has made the web an excellent source of customer reviews and opinions about purchased products. The number of customer reviews that a product receives is growing very quickly (it can reach hundreds or thousands). Opinion mining from product reviews, forum posts and blogs is an important research topic today with many applications. However, existing...
This paper introduces a clustering-based approach to sentiment analysis. By applying a TF-IDF weighting method, a voting mechanism, and importing term scores, an acceptable and stable clustering result can be obtained. It has competitive advantages over the two existing kinds of approaches: symbolic techniques and supervised learning methods. It is a well...
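The TF-IDF weighting step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant the authors use, so the formula here (relative term frequency times the natural log of inverse document frequency), the function name `tf_idf`, and the toy reviews are all assumptions made for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy reviews, already tokenized by whitespace.
docs = [
    "great phone great battery".split(),
    "poor battery".split(),
    "great screen".split(),
]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in every document receives weight zero, and a term confined to a single document is weighted most heavily, which is why such weights are commonly used as clustering features.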
In this paper we present a divisive hierarchical method for the analysis and segmentation of visual images. The proposed method is based on the use of the k-means method embedded in a recursive algorithm to obtain a clustering at each node of the hierarchy. The recursive algorithm determines automatically at each node a good estimate of the parameter k (the number of clusters in the k-means algorithm)...
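The recursive scheme described in the abstract above, with k-means applied at each node of a hierarchy, might be sketched roughly as follows. The abstract is truncated before it explains how k is estimated at each node, so this sketch swaps in a fixed `k` (bisecting by default) and simple size and depth limits. The names `kmeans` and `divisive`, the stopping parameters, and the toy points are hypothetical and are not the authors' method.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on tuples of numbers; returns non-empty clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [points]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            j = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[j].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]

def divisive(points, k=2, min_size=3, max_depth=3, depth=0):
    """Recursively split points with k-means; returns a nested dict tree.

    Assumption: k is fixed here, whereas the paper estimates it per node.
    """
    node = {"points": points, "children": []}
    if depth >= max_depth or len(points) < max(min_size, k):
        return node
    parts = kmeans(points, k)
    if len(parts) > 1:
        node["children"] = [
            divisive(part, k, min_size, max_depth, depth + 1) for part in parts
        ]
    return node

# Hypothetical feature vectors, e.g. two well-separated groups of pixel features.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
tree = divisive(points)
```

For image segmentation, each point would typically be a per-pixel feature vector such as color or position, and each level of the tree would give a coarser or finer segmentation.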
This paper studies the problem of extracting data from large numbers of semi-structured web pages. The fact that many websites have enormous numbers of pages generated dynamically from an underlying structured source like a database makes it feasible to induce a common template for similar web pages and then extract data accordingly. Previous work on this problem has limited practical utility because of either...
Arterial geometry variability is present both within and across individuals. To analyze the influence of geometric parameters on maximal wall shear stress (MWSS) in the human carotid artery bifurcation, computer simulations were run to generate data on this phenomenon. In our work we evaluate various prediction models for modeling the relationship between geometric parameters of the...
The number of elderly people in Japan has been increasing, and a system that supports them is needed. Such a system must recognize human actions. The system also uses mining technology and confirms actions by using voice recognition. The authors construct this system based on the concept of "Kukanchi". The intelligent space built on the concept...
Since a dance motion is represented as a temporal sequence of poses, comparison of motion data is reduced to the comparison of individual poses. In the present paper, the pose comparison problem is basically solved by 2D correlation assuming that human motion is bound by gravity. Therefore, dance motion analysis is discussed based on correlation matrices that are calculated between two motion sequences...