This paper studies aspect extraction from product reviews with an unsupervised topic model, an important subtask of opinion mining. The topic distributions of topic models such as LDA lean toward high-frequency words because word frequencies in documents follow a power-law distribution; as a result, most of the words that can represent topics are overwhelmed by...
In this paper, a new technique is presented for mining key domain areas from scientific publications. A domain refers to a particular branch of scientific knowledge and hence largely defines the theme of any scientific research paper. The proposed technique stems from a fusion of knowledge derived from natural language processing and machine learning. Some words or phrases are extracted based on their...
The major challenge that the big data era brings to the services computing landscape is debris of unstructured data. The high-dimensional data is in heterogeneous formats, schemaless, and requires multiple storage APIs in some cases. This situation has made it almost impractical to apply existing data mining techniques, which are designed for schema-based data sources, in a knowledge discovery in database...
In this paper, we introduce a system for extracting alpha-numerical sequences (keywords, numerical fields or alpha-numerical sequences) from unconstrained handwritten documents. Contrary to most of the approaches presented in the literature, our system relies on a global handwriting line model describing two kinds of information: i) the relevant information and ii) the irrelevant information represented...
It is now widely recognized that user interactions with search results can provide substantial relevance information on the documents. In this paper, we focus on extracting relevance information from one source of user interactions: user click-through data, which record the sequence of documents clicked in the result sets during a user search session. We emphasize the importance of the temporal...
In the field of speaker recognition, the Gaussian mixture model with diagonal covariance matrices is a popular technique. It simplifies the model and reduces the amount of computation, but it loses the correlation information between feature vectors, which in turn affects classification performance. In this paper, in order to compensate for the correlation between feature elements, we propose...
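The trade-off described in the abstract above can be illustrated with a minimal sketch of a diagonal-covariance GMM likelihood. This is not the paper's method (the abstract is truncated before the proposal); the function names `diag_gaussian_logpdf` and `gmm_loglik` are hypothetical and chosen only for illustration. Each component stores one variance per feature dimension, so evaluation costs O(d) per component, and no cross-dimension correlation is modeled.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance.

    Only per-dimension variances are stored, so off-diagonal
    (correlation) terms are ignored by construction.
    """
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    maha = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + maha)

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance GMM.

    Uses the log-sum-exp trick for numerical stability.
    """
    terms = [math.log(w) + diag_gaussian_logpdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

A full-covariance model would replace the per-dimension division with a matrix inverse and determinant, which captures correlations at a higher computational cost.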
In this work, we use Hidden Markov Models (HMM), Conditional Random Fields (CRF), Gaussian Mixture Models (GMM) and Mathematical Methods of Statistics (MMS) for Chinese and Japanese text summarization. The purpose of this work is to study the applicability of the three trainable models mentioned for cross-language text summarization. For model training, we use several training features such as sentence...
A novel system for the recognition of spatiotemporal hand gestures used in sign language is presented. While recognition of valid sign sequences is an important task in the overall goal of machine recognition of sign language, recognition of movement epenthesis is an important step towards continuous recognition of natural sign language. We propose a framework for recognizing valid sign segments and...
Hierarchical topic structure can express topics in a natural way that is more reasonable for human-machine interfaces. However, the hierarchical topic structure extracted by most topic analysis algorithms cannot present a meaningful description for all subtopics in the hierarchical tree. We propose a new hierarchical clustering algorithm based on variable feature selection for each...
A novel affective video segment retrieval method based on the correlation between emotion and emotional audio events (EAEs) is presented. The proposed method focuses on retrieving three types of affective video segments, joy, sadness and excitement, by utilizing correlations between emotions and EAEs. The correlation between these emotions and EAEs is investigated by a subjective evaluation. The proposed...
This work proposes a fast decision algorithm for pattern classification based on Gaussian mixture models (GMM). Statistical pattern classification problems often encounter situations in which the comparison between probabilities is obvious and involves redundant computations. When a GMM is adopted as the probability model, the exponential function must be evaluated. This work first reduces the exponential computations...
Passage extraction is an important component of passage retrieval. Sentence coherence and relevance are the two main factors considered in passage extraction. This paper proposes a subsequence-based, query-sensitive maximum cut algorithm for passage extraction. It incorporates the sentence coherence cut measure and the sentence relevance cut measure into the normalized cut criterion. It also uses suffix tree...
In natural languages, compound words play an important role, and their automatic extraction is very helpful in information retrieval, information extraction and text classification. In this paper, we introduce a semi-supervised Chinese compound extraction approach based on HMM using bootstrapping. First, we define a set of tags BEMI {beginning, end, middle, independence}, which means the position...
To study effective speech features that can represent different emotion styles in infant voice, nonlinear features based on the Teager Energy Operator are investigated. A neutral state and four emotional states (i.e. happiness, impatience, anger and fear) are classified from the infant voice database. MFCC extraction and HMM-based emotion classification are used as the baseline system to evaluate the emotional...