An important issue in text summarization is that a summary has to reflect the user's personal information need and provide an indicative message to the user. In this paper, a method of acquiring relevant documents based on user-feedback information and transductive inference SVM machine learning is presented. This method avoids the subjectivity of deciding relevant documents empirically. Furthermore, a sentence selection...
The paper proposes a new way of combining non-negative matrix factorization (NMF) and Testor theory for topic discovery. NMF is good at handling high-dimensional documents and clustering, while Testor theory is used to find the topic of each cluster. Using an example of ten abstracts of Chinese science literature from magazines related to environmental science, the whole process...
This paper proposes a summary sentence selection strategy for query-focused multi-document summarization that extracts keywords from the relevant document set. It calculates a query-related feature and a topic-related feature for every word in the relevant document set, then obtains the importance of each word by combining the two features. The score of a candidate sentence is computed through...
An important issue in text summarization is that a summary has to reflect the user's personal information need and provide an indicative message to the user. In this paper, a method of acquiring relevant documents based on user-feedback information and transductive inference SVM machine learning technology is presented. This method avoids the subjectivity of deciding relevant documents empirically. To validate...
This paper discusses the problem of text representation in text mining. It focuses on how to simplify the text model before the term-by-document matrix is constructed. By using association rule mining to find highly correlated words and group them into word sets, the vocabulary is effectively reduced, which leads to the text model's simplification...
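The idea of grouping highly correlated words before building the term-by-document matrix can be sketched as follows. This is a minimal illustration, not the paper's method: the function name `correlated_word_pairs`, the restriction to word pairs, and the `min_support` and `min_confidence` thresholds are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def correlated_word_pairs(docs, min_support=0.5, min_confidence=0.8):
    """Return word pairs that co-occur often enough to be merged into one word set.

    Assumption: a pair qualifies when its document-level support and the
    confidence of the rule in both directions reach the given thresholds.
    """
    doc_sets = [set(d.lower().split()) for d in docs]
    n_docs = len(doc_sets)
    # Number of documents containing each word, and each unordered word pair.
    word_count = Counter(w for s in doc_sets for w in s)
    pair_count = Counter(
        pair for s in doc_sets for pair in combinations(sorted(s), 2)
    )
    selected = []
    for (a, b), count in pair_count.items():
        support = count / n_docs
        confidence = min(count / word_count[a], count / word_count[b])
        if support >= min_support and confidence >= min_confidence:
            selected.append((a, b))
    return sorted(selected)


docs = ["data mining text", "data mining rules", "text model"]
pairs = correlated_word_pairs(docs)
```

Each returned pair could then be treated as a single vocabulary entry, which shrinks the number of rows in the term-by-document matrix.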
This paper presents how to use ROUGE to evaluate summaries without human reference summaries. ROUGE is a widely used evaluation tool for multi-document summarization and has great advantages in summarization evaluation. However, manual reference summaries written beforehand by assessors are indispensable for a ROUGE test. There has so far been no research on ROUGE's ability to evaluate...
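For orientation, the standard reference-based ROUGE-N recall can be sketched as below. This shows only the conventional measure with a single reference; what the paper substitutes for the human reference is not stated in this excerpt. The function names and whitespace tokenization are assumptions made for the example.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram matches divided by reference n-grams."""
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


unigram_recall = rouge_n_recall("the cat sat", "the cat sat on the mat", n=1)
bigram_recall = rouge_n_recall("the cat sat", "the cat sat on the mat", n=2)
```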
The most important step in query-focused extractive summarization is deciding which sentences to include in the final summary. In this paper, we propose a feature-fusion-based sentence selection strategy to identify the sentences with high query relevance and high information density. We score each sentence by computing its similarity and skip-bigram co-occurrence with the query. These...
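A skip-bigram is any ordered pair of words from a text, with gaps allowed between them. The co-occurrence part of the score can be sketched as below. This is an illustrative sketch only: normalizing by the number of query skip-bigrams, the optional `max_gap` parameter, and whitespace tokenization are assumptions, and the similarity feature and the fusion step are not shown.

```python
from itertools import combinations


def skip_bigrams(tokens, max_gap=None):
    """Return the set of ordered word pairs, optionally limiting the gap between them."""
    pairs = set()
    for i, j in combinations(range(len(tokens)), 2):
        if max_gap is None or j - i - 1 <= max_gap:
            pairs.add((tokens[i], tokens[j]))
    return pairs


def skip_bigram_overlap(sentence, query, max_gap=None):
    """Fraction of the query's skip-bigrams that also occur in the sentence."""
    sent_pairs = skip_bigrams(sentence.lower().split(), max_gap)
    query_pairs = skip_bigrams(query.lower().split(), max_gap)
    if not query_pairs:
        return 0.0
    return len(sent_pairs & query_pairs) / len(query_pairs)


same_order = skip_bigram_overlap("the cat sat on the mat", "cat on mat")
reversed_order = skip_bigram_overlap("the cat sat on the mat", "mat on cat")
```

Because the pairs are ordered, a sentence that contains the query words in reverse order scores zero, which is what distinguishes this measure from a plain bag-of-words overlap.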
This paper proposes an autocorrection algorithm for noise texts based on modularity optimization. By noise texts we mean documents in a text corpus that have been assigned to the wrong category. First, a document-similarity network is constructed, in which each node represents a document. If two nodes are similar in content, they are connected by a weighted edge, and their similarity is the...
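The network construction step can be sketched as follows. The excerpt does not say which similarity measure or cutoff is used, so the cosine similarity over raw term counts, the `threshold` value, and the use of the similarity as the edge weight are all assumptions made for the example. The modularity optimization step itself is not shown.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_similarity_network(docs, threshold=0.2):
    """Return {(i, j): weight} for document pairs whose similarity reaches the threshold."""
    vectors = [Counter(d.lower().split()) for d in docs]
    edges = {}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            weight = cosine(vectors[i], vectors[j])
            if weight >= threshold:
                edges[(i, j)] = weight  # assumed: similarity used as edge weight
    return edges


docs = ["apple banana", "apple banana", "car truck"]
edges = build_similarity_network(docs)
```

In this toy corpus only documents 0 and 1 are connected; a community detection step would then operate on the resulting weighted graph.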