To overcome the shortcoming of incomplete summarization, this paper proposes a new lexical-chain-based keyword extraction and automatic summarization algorithm for Chinese texts, built on unknown-word recognition using the co-occurrence of neighboring words, and an algorithm for constructing
This paper presents a corpus-based approach for extracting keywords from text written in a language without word boundaries. Based on the concept of the Thai character cluster (TCC), a Thai running text is first segmented into a sequence of inseparable units, called TCCs. To enable the handling of a large-scale
This paper proposes a systematic full-text document search that combines keywords with the structural similarity of the documents under consideration. The approach operates in two steps. The first step uses a set of designated keywords to retrieve candidate documents by means of an open source tool. The second step
A common strategy for assigning keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for selecting a word as a keyword is its relevance to the text. The tf.idf score of a term is a widely used relevance measure. While easy to compute and giving quite
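The tf.idf relevance measure named in the abstract above can be sketched as follows. This is a minimal illustration, not the method of the paper: it assumes one common variant (term frequency as count divided by document length, inverse document frequency as ln(N/df)), pre-tokenized input, and the hypothetical function names `tf_idf` and `top_keywords`.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, index, k=3):
    """Return the k highest-scoring terms of document `index` (hypothetical helper)."""
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(tf_idf(docs)[index].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

Under this variant, a term that occurs in every document scores zero, which is why frequent but uninformative words are not selected as keywords.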
Searching published papers is an essential part of the research process. Because articles are written in various languages, precise queries are hard to formulate. In this paper, we propose an automatic thesis clustering method based on bilingual and synonymous keyword sets that include Chinese and English
Keywords are critical resources for information management and retrieval, automatic text classification, and clustering. Keyword extraction plays an important role in constructing structured text. Current keyword extraction algorithms have matured in some respects. However, the errors of word
word segmentation and POS tagging, language modeling and term translation, text clustering, text categorization, text summarization, keyword identification in a single document, and duplicate detection. The application can invoke any module of LJParser on Windows and Linux from any language, including C, C# and Java
when the sentence is analyzed. The goal is to put each noun and verb of the sentence in the right place in the tree. With this information, it is possible to resolve the ambiguity of the query keywords and to create indicative summaries that take into account the query words and semantically related
This article proposes a question classification approach that integrates multiple semantic features. It addresses two problems in Chinese question classification models: inaccurate semantic information extraction and slow processing caused by excessively high feature vector dimensionality. With the help of HowNet, the support vector machine, and the syntactic and semantic information of the question...
attribute labels to them. It can greatly boost the efficiency of text processing. To build two views, we split the features into two parts, each of which forms an independent view. One view consists of the feature set of the abstract, and the other consists of the feature sets of the title, keywords, creator and department
semantic lexicon for domain-specific term extraction. The experimental results show that our approach achieves high precision in the legal field. Keywords: automatic term recognition, bilingual seeds set, Chinese concept dictionary, legal terminology, single word term.