To capture the trends of topics of interest in a specific field, people often use topic discovery methods. Traditional topic discovery algorithms generally fall into two types: text clustering algorithms and text topic models. The former pay too little attention to semantic information, and the latter always ignore the relativity of topics. These affect the topic discovery and topic...
Feature selection is an important preprocessing step in Chinese text categorization; it reduces the high dimensionality and, compared with feature extraction, keeps the reduced results comprehensible. A novel criterion for coarse feature filtering is proposed, which integrates the strengths of term frequency-inverse document frequency as an inner-class measure and chi-square as an inter-class measure, and a new feature...
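As an illustration of the inter-class measure named in this abstract, here is a minimal sketch of the standard chi-square statistic for one term and one class, computed from a 2x2 document-count table. The function name `chi_square` and its parameter names are assumptions chosen for this example; the abstract is truncated, so this does not show how the paper combines the statistic with TF-IDF.

```python
def chi_square(in_with, out_with, in_without, out_without):
    """Chi-square statistic for one term and one class.

    in_with     -- documents in the class that contain the term
    out_with    -- documents outside the class that contain the term
    in_without  -- documents in the class that lack the term
    out_without -- documents outside the class that lack the term
    """
    n = in_with + out_with + in_without + out_without
    numerator = n * (in_with * out_without - out_with * in_without) ** 2
    denominator = (
        (in_with + in_without)        # documents in the class
        * (out_with + out_without)    # documents outside the class
        * (in_with + out_with)        # documents containing the term
        * (in_without + out_without)  # documents lacking the term
    )
    # An empty row or column means the table carries no evidence of association.
    return numerator / denominator if denominator else 0.0
```

A term that occurs equally often inside and outside the class scores 0, while a term that occurs only inside the class scores highest, which is why the statistic is useful for ranking candidate features per class.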
Text representation is the basis of text processing. Most current text representation models ignore the inter-relations between words, which results in the loss of the text's structural information. This paper proposes a novel text representation model that uses a lexical network to represent the text and retain its structure. According to the different levels of words' inter-relations, co-occurrence...
We propose a novel, convenient way to build an emotion thesaurus that can be used to assess the affective qualities of natural language in text. Our main goals are fast analysis and visualization of affective content, so that machines can communicate smoothly with humans and realize emotional communication. Although there have been some studies about analyzing affective content...
Keyword indexing is widely used in natural language processing. This paper proposes an unsupervised keyword indexing method based on PageRank and HowNet. In the method, a free text is first represented as a sememe graph, with sememes as vertices and the HowNet-based relatedness of sememes as weighted edges. Then UW-PageRank is applied to the sememe graph to score the importance of sememes. Score of each...
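The graph-scoring step described in this abstract can be illustrated with a generic weighted PageRank over an undirected graph. This is a sketch under stated assumptions, not the paper's UW-PageRank, whose exact definition is not given in the truncated abstract. The function name `weighted_pagerank`, the dictionary-of-dictionaries graph format, and the damping and iteration defaults are all choices made for this example, and the edge weights would come from some relatedness measure that is not shown here.

```python
def weighted_pagerank(graph, damping=0.85, iterations=50):
    """Score the vertices of an undirected weighted graph.

    graph maps each vertex to a dict {neighbour: edge_weight}. It is
    assumed to be symmetric: graph[u][v] == graph[v][u] for every edge.
    """
    nodes = list(graph)
    n = len(nodes)
    scores = {v: 1.0 / n for v in nodes}
    # Total weight of the edges attached to each vertex.
    strength = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        new_scores = {}
        for v in nodes:
            # Each neighbour u passes on its score in proportion to the
            # share of its total edge weight that lies on the edge (u, v).
            incoming = sum(
                scores[u] * weight / strength[u]
                for u, weight in graph[v].items()
                if strength[u] > 0
            )
            new_scores[v] = (1 - damping) / n + damping * incoming
        scores = new_scores
    return scores


# Hypothetical toy graph: the labels stand in for sememes.
toy_graph = {
    "x": {"a": 1.0, "b": 1.0, "c": 1.0},
    "a": {"x": 1.0},
    "b": {"x": 1.0},
    "c": {"x": 1.0},
}
ranking = sorted(weighted_pagerank(toy_graph).items(), key=lambda kv: -kv[1])
```

In the toy graph the hub vertex `x` ranks first because every other vertex passes its whole score to it.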
Term extraction is a basic research topic in building knowledge bases. This paper puts forward a new automatic Chinese term extraction method based on cognition theory. Supervised by both linguistic knowledge and statistical information from research papers, we improve the traditional fair SCP and C-Value measures originally developed for multi-word terms, and then present a new comprehensive metric called MC-SCP...
A spoken dialog system can provide an interface between the user and a computer-based application that permits spoken interaction with the application in a relatively natural manner. However, extracting the user's intention from such spoken queries is very challenging. This paper presents a mixed approach to spoken language understanding that tries to make the best use of the statistical and knowledge-based...