Protein-protein interaction (PPI) networks are a valuable biological data source containing rich information useful for protein function prediction. However, PPI network data obtained from high-throughput experiments are known to be noisy and incomplete. By modeling PPI data as a graph, research efforts in the literature aim to improve the performance of protein function prediction by extending...
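A common baseline for graph-based function prediction is majority voting over a protein's annotated interaction partners. The sketch below illustrates the idea on a tiny adjacency-list graph; the proteins, edges, and labels are all invented for illustration, not taken from the paper.

```python
# Toy PPI graph as an adjacency list; annotations map proteins to a known
# function. Both the graph and the labels are illustrative examples.
ppi = {
    "P1": ["P2", "P3", "P4"],
    "P2": ["P1", "P3"],
    "P3": ["P1", "P2"],
    "P4": ["P1"],
}
annotations = {"P2": "kinase", "P3": "kinase", "P4": "transport"}

def predict_function(protein, graph, labels):
    """Majority vote over the annotated neighbors of `protein`."""
    votes = {}
    for neighbor in graph.get(protein, []):
        label = labels.get(neighbor)
        if label is not None:
            votes[label] = votes.get(label, 0) + 1
    if not votes:
        return None
    return max(votes, key=votes.get)

print(predict_function("P1", ppi, annotations))  # kinase: 2 of 3 neighbors agree
```

Extensions of this scheme (weighted edges, multi-hop neighborhoods) are what such papers typically build on.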
Reducing unnecessary lab tests is an essential issue in the intensive care unit (ICU). In this paper we analyze lab tests ordered for ICU patients using data mining methods. The selected dataset is extracted from the Multi-parameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database. The calcium test is selected as the target test, as it is one of the frequent tests ordered for gastrointestinal bleeding patients...
Parkinson's is a disease that attacks the nervous system and progressively impairs its function over time. The disease is incurable; the therapies available today can only help relieve its symptoms. Hence, early diagnosis is essential for determining an accurate course of therapy. Parkinson's disease can be diagnosed by examining the symptoms apparent in the patient. One of these symptoms is the...
The number of applications based on Apache Hadoop is increasing dramatically due to the robustness and dynamic features of this system. At the heart of Apache Hadoop, the Hadoop Distributed File System (HDFS) provides reliability, scalability and high availability to computation by applying a static replication strategy. However, because of the characteristics of parallel operations on the application layer,...
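"Static replication" means every block gets the same fixed number of replicas (three by default in HDFS), regardless of how hot the data is. The sketch below shows that idea with a simple round-robin placement; the node names are invented and the round-robin policy is a simplification, not HDFS's actual rack-aware placement algorithm.

```python
import itertools

REPLICATION_FACTOR = 3  # HDFS default; static, i.e. the same for every block

def assign_replicas(blocks, nodes, k=REPLICATION_FACTOR):
    """Static placement sketch: every block gets exactly k replicas,
    assigned round-robin over the node list (real HDFS is rack-aware)."""
    placement = {}
    ring = itertools.cycle(nodes)
    for block in blocks:
        placement[block] = [next(ring) for _ in range(k)]
    return placement

nodes = ["node1", "node2", "node3", "node4"]
print(assign_replicas(["blk_0", "blk_1"], nodes))
```

Dynamic strategies, which such papers propose, would instead vary `k` or the placement per block based on observed access patterns.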
In this paper, a kernel evolving neural network and its learning algorithm are investigated. The proposed system finds the eigenvectors and the corresponding principal components in online mode, in an environment where the interdependencies hidden in the experimental data are nonlinear and can change over time.
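The classic linear ancestor of online principal-component extraction is Oja's rule, which updates a weight vector one sample at a time and converges to the leading eigenvector of the input covariance. The sketch below shows that linear special case on a synthetic 2-D stream; the kernel network in the abstract generalizes this to nonlinear dependencies, which this sketch does not attempt.

```python
import random

def oja_step(w, x, lr=0.05):
    """One Oja's-rule update: y = w . x, then w += lr * y * (x - y * w).
    Drives w toward the first principal direction of the input stream."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

random.seed(0)
w = [1.0, 0.0]
for _ in range(2000):
    t = random.gauss(0, 1)
    # Synthetic stream of 2-D points concentrated along the direction (1, 1)
    x = [t + random.gauss(0, 0.1), t + random.gauss(0, 0.1)]
    w = oja_step(w, x)

norm = sum(wi * wi for wi in w) ** 0.5
unit = [wi / norm for wi in w]
print(unit)  # close to (0.707, 0.707), the leading principal direction
```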
This paper explores a way to extract person attributes from profiles. These attributes exist in large volumes of unstructured data and are very difficult to obtain in a short time. We therefore use a method combining patterns and SVM to extract the person attributes. Firstly, we collect many raw profiles from websites with our configurable crawler. Secondly, we use statistical methods...
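The pattern half of a pattern-plus-SVM pipeline is typically a set of surface patterns that propose candidate attribute values, which a classifier then accepts or rejects. The regexes below are invented for illustration and are not the paper's pattern set; in the described pipeline, these pattern hits would next be filtered by the SVM.

```python
import re

# Illustrative extraction patterns for person attributes in profile text.
PATTERNS = {
    "birth_year": re.compile(r"born in (\d{4})"),
    "occupation": re.compile(r"works as an? ([a-z]+)"),
}

def extract_attributes(profile_text):
    """Return the first match of each pattern as a candidate attribute."""
    found = {}
    for name, pattern in PATTERNS.items():
        m = pattern.search(profile_text)
        if m:
            found[name] = m.group(1)
    return found

text = "Jane Doe, born in 1975, works as a biologist."
print(extract_attributes(text))  # {'birth_year': '1975', 'occupation': 'biologist'}
```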
A hybrid convolution tree kernel is applied to extract the history evolution information. The hybrid kernel consists of two individual convolution kernels: a Path kernel, which captures predicate-argument link features, and a Constituent Structure kernel, which captures the syntactic structure features of arguments. The Predicate-Arguments Feature (PAF) kernel was extracted and decomposed into...
Multiclass classification is the task of classifying samples into more than two classes. Generally, multiclass classifiers have difficulty classifying samples that lie very close to the separating hyperplane; the resulting misclassification is known as generalization error. Generalization error can be reduced by maximizing the margin of the separating hyperplanes. The Support Vector Machine (SVM) is a maximum-margin classifier; its aim...
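The margin being maximized has a simple closed form: for a hyperplane w·x + b = 0, a correctly classified point (x, y) with y ∈ {−1, +1} has geometric margin y(w·x + b)/‖w‖, and SVM training maximizes the smallest such margin over the training set. A minimal sketch with an invented hyperplane and toy points:

```python
# Geometric margin of a labeled point with respect to the hyperplane
# w . x + b = 0; SVM maximizes the minimum of these over the training set,
# which is equivalent to minimizing ||w|| subject to the margin constraints.
def margin(w, b, x, y):
    norm = sum(wi * wi for wi in w) ** 0.5
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

w, b = [1.0, 1.0], -3.0          # illustrative hyperplane x1 + x2 = 3
points = [([1.0, 1.0], -1), ([3.0, 2.0], +1), ([0.0, 1.0], -1)]
margins = [margin(w, b, x, y) for x, y in points]
print(min(margins))  # the quantity SVM training tries to maximize
```

All margins here are positive, so this hyperplane separates the toy points; the nearest point sits at distance 1/√2.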
Clustering high-dimensional data, which is now found in almost all fields, is becoming a very tedious process. The key disadvantage of high-dimensional data is the curse of dimensionality: as the size of a dataset grows, the data points become sparse and the density of any region decreases, making the data difficult to cluster, which further reduces the performance of traditional...
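One concrete symptom of the curse of dimensionality is distance concentration: as dimensionality grows, the nearest and farthest points become almost equally far away, which undermines distance-based clustering. The sketch below demonstrates this with uniform random points (all parameters are illustrative):

```python
import random

def distance_spread(dim, n_points=200, seed=0):
    """(max - min) / min of the distances from the origin for uniform random
    points in [0, 1]^dim; this relative spread shrinks as dim grows."""
    rng = random.Random(seed)
    dists = []
    for _ in range(n_points):
        p = [rng.random() for _ in range(dim)]
        dists.append(sum(c * c for c in p) ** 0.5)
    return (max(dists) - min(dists)) / min(dists)

for d in (2, 10, 100, 1000):
    print(d, round(distance_spread(d), 3))  # spread shrinks as d grows
```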
We present Rough-Fuzzy Support Vector Domain Description (RFSVDD), a novel data description algorithm that provides a rough-fuzzy characterization of a data set and shows its potential for outlier detection. Its resulting data structure is characterized by two components: a crisp lower-approximation and a fuzzy boundary. While the lower-approximation consists of those data points that lie inside the...
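The rough-fuzzy structure can be pictured on the simplest spherical data description: an inner region whose points certainly belong (the crisp lower approximation, membership 1), a shell where membership decays (the fuzzy boundary), and an exterior flagged as outliers. The sketch below uses a center and two hand-picked radii purely for illustration; RFSVDD itself learns the description by solving an optimization problem, which this sketch does not do.

```python
# Illustrative rough-fuzzy membership on a spherical description:
# inside r_inner  -> lower approximation (membership 1),
# between radii   -> fuzzy boundary (linear decay, an assumed profile),
# outside r_outer -> outlier (membership 0).
def membership(x, center, r_inner, r_outer):
    dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center)) ** 0.5
    if dist <= r_inner:
        return 1.0
    if dist >= r_outer:
        return 0.0
    return (r_outer - dist) / (r_outer - r_inner)

center, r_in, r_out = [0.0, 0.0], 1.0, 2.0
print(membership([0.5, 0.0], center, r_in, r_out))  # 1.0 (lower approximation)
print(membership([1.5, 0.0], center, r_in, r_out))  # 0.5 (fuzzy boundary)
print(membership([3.0, 0.0], center, r_in, r_out))  # 0.0 (outlier)
```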
Conventional methods for mixed pixel analysis are limited in performance when the scene is highly mixed, with no pure endmembers, or when only virtual endmembers can be generated. Moreover, these approaches do not address endmember variability. In this study, a multiple-endmember extraction algorithm based on Archetypal Analysis (AA) is proposed to solve the above problems. AA aims at...
As a new kind of social media, the query log has gained a massive user base and volume of data. It is easy for people to post, and posts spread fast and can easily be seen by many other users. For these reasons, users post a large and varied body of content. Among these posts, we find many that express the author's purchase wish for a certain product, in other words, consumption...
We report on the use of a CMOS Contrast-based Binary Vision Sensor (CBVS), with embedded contrast extraction, for gesture detection applications. The first advantage of using this sensor over commercial imagers is a dynamic range of 120dB, made possible by a pixel design that effectively performs auto-exposure control. Another benefit is that, by only delivering the pixels detecting a contrast, the...
The Internet produces massive amounts of unstructured financial text every day. How to utilize these unstructured data effectively is a challenging topic. Against the background of the promotion of A-share T+0 trading and stock options in the Chinese securities market, we present a model that recognizes risk and investment opportunities from massive online financial text. Since the key word vector...
In many applications, such as dynamic social networks and customer behavior analysis, the data intrinsically have many dimensions and can be naturally represented as high-order tensors. In this study, an SVM ensemble learning method is proposed for classification using tensor data. The method is applied to identify cross-selling opportunities in order to recommend personalized products and services to customers...
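Before a standard vector-based SVM can consume a tensor sample, the tensor is usually unfolded (matricized) along one of its modes. The sketch below shows mode-1 unfolding of a third-order tensor as pure-Python nested lists; the shape and values are illustrative, and the interpretation (e.g. customer × time × feature) is an assumed example, not from the paper.

```python
# Mode-1 unfolding of an I x J x K tensor: entry tensor[i][j][k] lands in
# row i, column j * K + k, yielding an I x (J * K) matrix that a standard
# vector-based classifier can consume row by row.
def mode1_unfold(tensor):
    J, K = len(tensor[0]), len(tensor[0][0])
    return [[tensor[i][j][k] for j in range(J) for k in range(K)]
            for i in range(len(tensor))]

t = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]          # shape 2 x 2 x 2
print(mode1_unfold(t))          # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

An ensemble method can then train one base SVM per unfolding mode and combine their votes.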
Knowledge is stored in an enterprise in various forms, ranging from unstructured operational data and legal documents to structured information such as programs and relational data stored in databases, as well as semi-structured information stored in XML files. All this information, if viewed from a holistic standpoint, can help an enterprise understand and reflect upon itself and thereby make knowledgeable...
To improve the performance of large-scale scientific applications, scientists or tuning experts make various empirical attempts, changing compiler options, program parameters, or even the syntactic structure of programs. These attempts, each followed by a performance evaluation, are repeated until satisfactory results are obtained. The task of performance tuning requires a great deal of time and effort. On account...
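The empirical tuning loop described above is, at its core, a search over a configuration space with a measurement in the inner loop. The sketch below shows an exhaustive version of that loop; the option names and the `measure` cost model are illustrative stand-ins, since in practice each measurement would compile and run the real program.

```python
import itertools

def measure(config):
    """Placeholder cost model (invented numbers); in a real tuner this
    would compile with the given flags, run the program, and time it."""
    penalty = {"-O1": 3.0, "-O2": 2.0, "-O3": 1.5}
    return penalty[config["opt"]] / config["unroll"]

def tune(search_space):
    """Try every configuration, measure it, and keep the fastest."""
    keys = sorted(search_space)
    best_cfg, best_time = None, float("inf")
    for values in itertools.product(*(search_space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        elapsed = measure(cfg)
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time

space = {"opt": ["-O1", "-O2", "-O3"], "unroll": [1, 2, 4]}
print(tune(space))  # ({'opt': '-O3', 'unroll': 4}, 0.375)
```

Real auto-tuners replace the exhaustive loop with search heuristics or, as in work of this kind, with models that predict good configurations and cut down the number of costly measurements.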
This dataset documents the activity in the public portion of the git Super-repository of the Linux kernel during 2012. In a distributed version control system, such as git, the Super-repository is the collection of all the repositories (repos) used for development. In such a Super-repository, some repos will be accessible only by their owners (they are private, and are located in places that are unreachable...
Over the last years, energy consumption has become a first-class citizen in software development practice. While energy-efficient solutions on lower-level layers of the software stack are well-established, there is convincing evidence that even better results can be achieved by encouraging practitioners to participate in the process. For instance, previous work has shown that using a newer version...
Build systems contain a lot of configuration knowledge about a software system, such as under which conditions specific files are compiled. Extracting such configuration knowledge is important for many tools analyzing highly-configurable systems, but very challenging due to the complex nature of build systems. We design an approach, based on SYMake, that symbolically evaluates Makefiles and extracts...