To address the difficulty of domain-specific automatic entity relation extraction, this paper proposes a method that applies a binary classification approach, combined with reasoning rules, to extract domain entity relations. The feature set is constructed by comprehensively considering the context information of the entities, the entity types and combinations of these characteristics, in order to...
This thesis selects 108 documents from the Chinaqking.com and Emerald databases that relate to the value-added services of government information resources, and discusses their research topics and methods. It also provides insights into the main research conclusions and research limitations in terms of the classification of the government information resources,...
Following the traditional morphological classification, the quality of the traditional Chinese medicine White Peony Root is divided into first, second and third grade. The chromatography data of White Peony Root, obtained under standard test conditions, are discretized and the information is reduced. Obtaining the major peaks of the linearly independent vectors and obtaining every...
We give sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination...
In this paper, we present the problem of appropriate feature selection for constructing a Maximum Entropy (ME) based Named Entity Recognition (NER) system under the multiobjective optimization (MOO) framework. Two conflicting objective functions are simultaneously optimized using the search capability of MOO. These objectives are (i) the dimensionality of the features, which is to be minimized,...
Instance-based learning algorithms are typically sensitive to the dissimilarity function they use. The problem is frequently related to the Nearest Neighbor rules of these algorithms. This paper introduces a new dissimilarity measure, called the Heterogeneous Centered Difference Measure, which is tested on many well-known databases. The results are compared with those of other distance functions.
Aiming at the ever-present problem of imbalanced data in text classification, the authors study several forms of imbalance, such as text number, class size, subclass and class fold. A series of related experiments yields some useful conclusions: first, when two classes contain almost the same number of texts, the difference in word count becomes the major factor affecting the accuracy...
Data mining, or knowledge discovery, is seen by modern business as an increasingly important tool for transforming data into an informational advantage. Mining is the process of finding correlations among dozens of fields in large relational databases and extracting useful information that can be used to increase revenue, cut costs, or both. Classification is a supervised machine learning procedure and an...
In this paper we propose an extension of the kernel k-means clustering algorithm for symbolic interval data with aggregated kernel functions. To evaluate this method, experiments with synthetic interval data sets were performed, and we compared our method with a dynamic clustering algorithm with a single adaptive distance. The evaluation is based on an external cluster validity index (corrected...
This paper presents an off-line signature verification system composed of a combination of several different classifiers. Identity authentication is a very important feature, especially in systems that require a high degree of security, such as bank transactions. In our experiments, a one-class classifier was used to create a signature verification system; consequently, only genuine signatures...
This paper presents Perturbed Frequent Itemset based Classification Technique (PERFICT), a novel associative classification approach based on perturbed frequent itemsets. Most of the existing associative classifiers work well on transactional data where each record contains a set of boolean items. They are not very effective in general for relational data that typically contains real valued attributes...
This work studies the use of Particle Swarm Optimization (PSO) as a classification technique. Beyond assessing classification accuracy, it investigates the following questions: does PSO present limitations for high-dimensional application domains? Is it less efficient for multi-class problems? To answer these questions, an experimental setup was built that uses three high-dimensional data sets....
After comparing various fatigue detection methods for the smart vehicle space, the paper selects PERCLOS to evaluate driving fatigue. Driver fatigue status is detected by measuring the proportion of time the eyes are closed within a given period and the duration of continuous eye closure. On the basis of Haar-like features, the AdaBoost algorithm is adopted to produce the strong classifier for face and eye detection...
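The two quantities named in this abstract, the proportion of closed-eye time in a window and the longest continuous closure, can be illustrated with a minimal sketch. It assumes a per-frame eye state has already been produced by some detector. The function names, the frame rate and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def perclos(closed: Sequence[bool]) -> float:
    """Fraction of frames in the window in which the eyes are closed."""
    if not closed:
        raise ValueError("window must contain at least one frame")
    return sum(closed) / len(closed)


def longest_closure_seconds(closed: Sequence[bool], fps: float) -> float:
    """Duration in seconds of the longest run of consecutive closed-eye frames."""
    longest = run = 0
    for is_closed in closed:
        run = run + 1 if is_closed else 0  # extend or reset the current run
        longest = max(longest, run)
    return longest / fps


# Hypothetical 1-second window sampled at 10 frames per second.
frames = [False, True, True, True, False, False, True, False, False, False]
print(perclos(frames))
print(longest_closure_seconds(frames, fps=10.0))
```

Any threshold on these values for declaring fatigue would be a further assumption and is left out here.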
With the boom of the web and social networking, the amount of generated text data has increased enormously. Much of this data can be considered and modeled as a stream, and the volume of such data necessitates the application of automated text classification strategies. Although streaming data classification is not new, considering text data streams for classification purposes has been extensively researched...
A tool for discovering gait anomalies in the elderly from motion sensor data is proposed. The gait of the user is captured with a motion capture system, which consists of tags attached to the body and sensors situated in the apartment. The positions of the tags are acquired by the sensors, and the resulting time series of position coordinates are analyzed with dynamic time warping and machine learning algorithms...
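Dynamic time warping, mentioned in this abstract, aligns two time series that may differ in speed. The sketch below is the textbook dynamic-programming formulation for one-dimensional sequences with an absolute-difference cost. It is not the authors' implementation, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The second series is the first with one sample repeated, so the
# warped alignment has zero cost.
print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))
```

For multi-dimensional position coordinates, the per-step cost could be replaced by a Euclidean distance between coordinate vectors, which is again an assumption rather than something stated in the abstract.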
The covering algorithm, proposed by Professors Zhang Ling and Zhang Bo in the 20th century, simulates the structure of human learning by building a constructive neural network learning model. It has been widely used to solve massive data classification problems because of its performance. The covering classification algorithm has fast learning, high recognition rate, massive data processing...
A multi-class classifier is usually constructed by combining the outputs of several binary ones according to an error-correcting output code (ECOC) scheme. In this paper, within the ECOC framework, we analyse the previously proposed ECOC of kernel machines. We then present generalization bounds for the ECOC of kernel machines according to the results of stability and generalization...
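The ECOC idea of combining binary outputs into a multi-class decision can be sketched as Hamming decoding: each class is assigned a codeword, each binary classifier predicts one bit, and the class whose codeword is nearest to the predicted bits wins. The codebook, class names and function names below are hypothetical, and the sketch shows only the decoding step, not the kernel machines or the bounds analysed in the paper.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical codebook: any two codewords differ in at least 3 bits,
# so a single wrong binary prediction can still be corrected.
CODEBOOK: Dict[str, Tuple[int, ...]] = {
    "cat": (0, 0, 0, 0, 0),
    "dog": (1, 1, 1, 0, 0),
    "bird": (0, 0, 1, 1, 1),
}


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(u, v))


def ecoc_decode(bits: Sequence[int], codebook: Dict[str, Tuple[int, ...]]) -> str:
    """Return the class whose codeword is closest in Hamming distance."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# The five binary classifiers output "dog" with the second bit flipped.
print(ecoc_decode((1, 0, 1, 0, 0), CODEBOOK))
```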
We propose a sparse probabilistic learning approach for nonlinear channel equalization in wireless communication systems, using the relevance vector machine (RVM) technique. In particular, we propose two versions of the RVM-based equalizer: 1) maximum a posteriori RVM (MAP-RVM), 2) marginalized RVM (MRVM). Compared to the standard support vector machine (SVM) method, the proposed RVM approach not...
Cognitive radio (CR) is a promising technology for improving the utilization of the scarce radio spectrum by allowing secondary users to regularly sense the spectrum and opportunistically access under-utilized frequency bands. However, spectrum sensing in a CR environment is a challenging task due to varying radio channel conditions, and it might lead to interference with licensed users. In this paper,...
In email networks, user behaviors affect the way emails are sent and replied to. While knowing these user behaviors can help to create more intelligent email services, there has not been much research into mining these behaviors. In this paper, we investigate user engagingness and responsiveness as two interaction behaviors that give us useful insights into how users email one another. Engaging users...