Feature extraction is the key technology in text categorization. Traditional text classification uses the word as the feature, and its effect on classification is evident. The feature extraction method using base phrases and keywords changes the feature extraction of Chinese text from
This paper presents a new keyword extraction algorithm for Chinese news Web pages using lexical chains and word co-occurrence combined with frequency features, cohesion features, and correlation features. A lexical chain is the external expression of consistency among semantically related words of a text, and is the
Keyword extraction is a long-standing topic in Natural Language Processing. However, most methods have been too complicated and slow to be applied in real applications, for example in web-based systems. This paper proposes an approach which will complete some preparatory work focusing on exploring the
In this paper, we tackle the problem of automatic keyword extraction in the meeting domain, a genre significantly different from written text. For the supervised framework, we proposed a rich set of features beyond the typical TFIDF measures, such as sentence salience weight, lexical features, summary sentences, and
Two approaches to keyword extraction are commonly used: one relies only on information about individual words, such as word frequency and TF.IDF; the other is based on the relationships between words. The relationship is usually described as word similarity, which derives from a corpus (WordNet, HowNet) or man-made thesaurus
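
As an illustration of the single-word statistics mentioned in the abstract above, the following is a minimal sketch of ranking words by TF.IDF. It is not the method of any listed paper: the function name `tf_idf_keywords`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the sample documents are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tf_idf_keywords(docs, doc_index, top_k=3):
    """Rank the words of docs[doc_index] by TF-IDF against the whole collection.

    Hypothetical helper for illustration: whitespace tokenization,
    relative term frequency, and unsmoothed idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each word.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    target = tokenized[doc_index]
    tf = Counter(target)

    # A word that occurs in every document gets idf = log(1) = 0.
    scores = {
        word: (count / len(target)) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }

    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


docs = [
    "keyword extraction uses word frequency",
    "word similarity derives from a thesaurus",
    "keyword spotting is a branch of speech recognition",
]
print(tf_idf_keywords(docs, 0))
```

Words shared with other documents ("keyword", "word") score lower than words unique to the first document, which is the behavior a frequency-only approach relies on. The second family of methods described above would replace or supplement these scores with word-similarity information.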
This paper proposes a new keyword extraction method that uses bag-of-concept to extract keywords from Arabic text. The proposed algorithm utilizes a semantic vector space model instead of the traditional vector space model to group words into classes. The new method builds a word-context matrix where synonymous words will be
Keyword spotting has become a very important branch of speech recognition. However, the acoustic mismatch between training and testing environments often causes a severe degradation in recognition performance. This paper presents an improved keyword spotting strategy. A fuzzy search algorithm is proposed to extract
In this paper, a method of automatic Chinese keyword extraction based on KNN is proposed. First, it preprocesses the document using the vector space model. Second, it constructs a set of candidate keywords based on the KNN method and the labeled dataset. Finally, it post-processes the candidate keywords by the character of
enhances the machine learning based Stanford CoreNLP Part-of-Speech (POS) tagger with the Twitter model to extract essential keywords from a tweet. The system was enhanced using two rule-based parsers and a corpus. The research was conducted using tweets of customer service requests sent to a telecommunication company. A
avoid unnecessary email reading; for that, a better email management system is required. Here the author uses fuzzy logic techniques for email clustering. Concepts and features are extracted, and keywords with the same feature go into one cluster; if a new keyword is found and does not match any existing cluster, then a new cluster is defined for
both manual and automatic transcriptions; for non-English documents, we use automatic translations. In this work, we use AdaBoost, a discriminative classification method, with both lexical and semantic features. The results indicate an 11%-13% relative improvement over a baseline keyword-spotting-based approach. We also show
Twitter, as a social medium, is a very popular way of expressing opinions and interacting with other people in the online world. When taken in aggregate, tweets can provide a reflection of public sentiment towards events. In this paper, we assign a positive or negative sentiment to Twitter posts using a well-known machine learning method for text categorization. In addition, we use manually labeled...
With the increased demand for English communication, various styles of learning support methods have been proposed and provided to Japanese learners. However, there are still many learners who find it hard to read, write and speak in English. Regardless of language difference, understanding the other's intention and emotional status accurately and expressing what they think or feel to the others...
livelihoods, how to deal with its negative impacts, and which mitigation or adaptation policies to support. A line of related work has used bag-of-words and word-level features to detect frames automatically in text. Such works face limitations, since standard keyword-based features may not generalize well to accommodate surface
accuracy than individual classifiers. The maximum accuracy was obtained by enhancing the ensemble with an additional, automatically generated, domain-specific, class-wise keyword list. Use of this system gave us greater than 4 percent improvement over using the ensemble classifier alone. A further improvement in
This paper presents a novel method to extract Protein-Protein Interaction (PPI) information from biomedical literature based on Support Vector Machine (SVM) and K Nearest Neighbors (KNN). The two protein names, words between two proteins, words surrounding two proteins, keyword between or among the surrounding words
videos and generate corresponding MPEG-7 description files. Subsequently, it establishes a distributed index of the MPEG-7 files and distributed storage of the video files separately. The system provides numerous web query interfaces, including keyword semantic expansion query, semantic graph query and natural language query
keywords in common, then the image is added to an image repository. Additional meta-information is now associated with each image, such as caption, cluster features, names of people in the news article, etc. A very large repository containing more than 983k images from 12 million news articles was built using this approach