The goal of Information Extraction is to automatically generate structured pieces of information from the relevant information contained in text documents. Machine Learning techniques have been applied to reduce the cost of Information Extraction system adaptation. However, elements of human supervision strongly bias the learning process. Unsupervised learning approaches can avoid these biases. In...
Scoring the text features is the only way to determine the key ideas of a text to be presented in a summary. An effective technique for scoring the sentences can produce a good summary. The feature scores are imprecise and uncertain, which makes it difficult to differentiate important features from unimportant ones. In this paper, we introduce...
Automatic text summarization is a wide research area. It aims to compress the original text into a shorter version and help the user quickly understand large volumes of information. Approaches to text summarization can be characterized in several ways: extractive or abstractive, single-document or multi-document. This paper focuses on the automatic...
This paper explores a method for Korean text watermarking and develops a morpheme-based scheme in which a predicate nominal is segmented into a nominal and a predicate. Korean, as an agglutinative language, provides good ground for morpheme-based natural language watermarking. A Korean word usually consists of a content morpheme and function morphemes. However, a predicate nominal has two content...
Rich information spaces (like the Web or scientific publications) are full of "stories": sets of statements that evolve over time, manifested as, for example, collections of newspaper articles reporting events relating to an evolving crime investigation, sets of news articles and blog posts accompanying the development of a political election campaign, or sequences of scientific papers on...
With the rapid development of text summarization, evaluation methods for automatic summarization systems are becoming more and more important in natural language processing, because they can greatly promote the development of text summarization. This paper analyzes the existing methods for automatic summarization evaluation and introduces a new evaluation method based on HowNet. The original tests have shown that...
The textual entailment (TE) task consists of discovering unidirectional semantic inferences between the meanings of two text snippets. Taking advantage of this, in this paper we propose using the TE system as an answer validation (AV) engine to improve the performance of question answering (QA) systems and help humans in the assessment of QA systems' outputs. To achieve these aims and in order to...
For a human being, it is easy to understand and resolve anaphora in natural language; for a computer, it is a relatively complex task. This paper addresses the resolution of distributive anaphoric devices in the Urdu language. The algorithm presented in this paper is tested on real-world text taken from sources such as newspapers and novels. The algorithm successfully searches the antecedents to distributive...