Knowledge discovery from the Web is a cyclic process. In this paper we focus on the important part of transforming unstructured information from Web pages into structured relations. Relation extraction systems capture information from natural language text on Web pages, called Web text. However, extraction is quite costly and time consuming. Worse, many Web pages may not contain a textual representation...
Although data mining techniques have made tremendous progress, being "knowledge-poor" remains a major gap in current data mining systems. Few researchers note that useful knowledge is not only the final result of an intelligent classification, clustering, or prediction algorithm, but also runs through the whole data mining process, in which much potentially useful information is...
This article proposes a question classification approach that integrates multiple semantic features. It addresses two problems in Chinese question classification models: inaccurate semantic information extraction and slow processing caused by an excessively high eigenvector dimension. With the help of HowNet, the support vector machine, and the syntactic and semantic information of question...
Hepatitis patients need continuous specialized medical treatment to reduce the mortality rate. Using clinical test data and machine learning techniques such as Support Vector Machines (SVM), their life prognosis can be classified and predicted. However, we cannot guarantee that all the feature values in the data are correlated with each other. Therefore, we incorporate...
Many text documents on the Web contain opinions or sentiments about an object, such as software reviews, product reviews, movie reviews, music reviews, and book reviews. Opinion mining, or sentiment classification, aims to extract the features on which reviewers express their opinions and to determine whether those opinions are positive or negative. In this paper we propose an ontology based...
Because of its many applications in data mining and information technology, automatic document classification is an important topic in computer science. Classification plays a vital role in many information management and retrieval tasks. Document classification, also known as document categorization, is the process of assigning a document to one or more predefined category labels. Classification...
Sensor subset selection is considered one of the significant issues in machine olfaction. In principle, each sensor should provide a different selectivity profile over the range of target odor applications so that a unique odor pattern is produced from each sensor in the array. However, in practice some or most of the features obtained from an array of sensors are redundant and irrelevant due...
In network intrusion detection systems, feature extraction plays an important role in improving classification performance and reducing computational complexity. Principal Component Analysis and Independent Component Analysis are both common feature extraction methods. This paper proposes a novel feature extraction method for network intrusion detection, and the core of this...
The paper provides a novel approach to emotion recognition from the facial expressions and voices of subjects. The subjects are asked to express their emotions in both facial expression and voice while uttering a given sentence. Facial features, including mouth-opening, eye-opening, and eyebrow-constriction, and voice features, including the first three formants (F1, F2, and F3) and their respective powers...
The application of feature ranking to software engineering datasets is rare at best. In this study, we consider wrapper-based feature ranking where nine performance metrics aided by a particular learner are evaluated. We consider five learners and take two different approaches, each in conjunction with one of two different methodologies: 3-fold Cross-Validation (CV) and 3-fold Cross-Validation Risk...
This paper presents a computer-aided diagnosis (CAD) system based on a combined support vector machine (SVM) and linear discriminant analysis (LDA) classifier for the detection and classification of breast cancer in digital mammograms. The proposed system has been implemented in four stages: (a) Region of interest (ROI) selection of 32×32 pixels, which identifies suspicious regions, (b) Feature extraction...
The support vector machine (SVM) is a machine learning method based on statistical learning theory (SLT). SVM is well suited to problems with small samples, nonlinearity, and high dimensionality. In this paper, a multi-class SVM classifier is applied to predict coal and gas outbursts. In this model, the dominant factors are the input vectors and the degree of outburst danger is divided into four types:...
Feature extraction is of great importance in condition monitoring and fault diagnosis of rolling machinery. Nonlinear dimensionality reduction (NDR) theories offer a new way of recognizing and predicting the underlying nonlinear behavior. In this paper, we propose an NDR-based feature extraction method for fault classification of rolling element bearings. Original feature spaces are constructed...
India is a multi-lingual and multi-script country, where eighteen official scripts are accepted and there are over a hundred regional languages. In this paper we propose a zone-based hybrid feature extraction system. The character centroid is computed and the image (character/numeral) is further divided into n equal zones. An average angle from the character centroid to the pixels present in the zone,...
A novel method of feature extraction from protein sequences, structures, and physicochemical properties is proposed. It obtains better classification results by combining the key eigenvector obtained from knowledge reduction with the support vector machine algorithm. Based on jackknife testing, the comprehensive classification results are 78.3% and 90.9% for all-α, all-β, α+β and...
Kernel principal component analysis (kernel PCA or KPCA) has been widely used for non-linear feature extraction, dimensionality reduction, and classification problems. However, KPCA is known to have high computational complexity: it requires the eigenvalue decomposition of a matrix whose size equals the number of samples n. Moreover, in order to calculate the projection of a vector onto the subspace obtained by KPCA,...
This paper presents the results of using statistical analysis and automatic text categorization to identify an author's age group based on the author's online chat posts. A naive Bayesian classifier and support vector machine (SVM) model were used. The SVM model experiments generated an f-score measurement of 0.996 on test data distinguishing teens from adults. We also introduce an alternative method...
Automatically acquiring mood information from music data is an important topic in music retrieval. In this paper, we try to find the strongest emotional expression of a song in large music databases. By analyzing hundreds of credible reviews from websites, a seven-keyword mood model is constructed. 217 songs were collected in our dataset. Every song was divided into several 10-second segments and our...
Even though numerous anti-virus software packages have been in use for many years, previously unseen malware is still a serious threat to computer and information systems. By analyzing portable executable header entries of executables, a malware detection model which consists of four stages: attribute extraction, attribute binarization, attribute elimination, and feature selection and classifier...
In industry, carpet quality is currently assessed by human experts, because automated assessment cannot yet match human expertise. Carpet companies therefore demand a reliable and economical standardization of carpet wear levels. This paper presents a new strategy for analyzing and classifying the texture of worn carpet surfaces from 3D images, where the 3D image is produced...