We investigate the problem of analyzing word frequencies in multiple text sources, with the aim of giving an overview of word-based similarities across several texts as a starting point for further analysis. To reach this goal, we designed a visual analytics approach composed of typical stages and processes, combining algorithmic analysis, visualization techniques, the human users with their perceptual abilities,...
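As an illustration of the word-frequency comparison this abstract describes, the following is a minimal sketch, not the authors' visual analytics system. It assumes a simple regex tokenizer and uses cosine similarity over raw word counts as the similarity measure; the function names `word_frequencies`, `cosine_similarity`, and `similarity_matrix` are hypothetical.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text and count alphabetic word tokens (assumed tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(texts):
    """Pairwise word-based similarities, e.g. as input for a heatmap view."""
    freqs = [word_frequencies(t) for t in texts]
    return [[cosine_similarity(a, b) for b in freqs] for a in freqs]
```

A matrix like this could serve as the algorithmic input to a visualization, with the interpretation left to the human analyst.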
In text categorization, feature selection is an effective method of feature dimension reduction. To address the problems of an excessively high-dimensional original feature space, many irrelevant features, data redundancy, and the difficulty of selecting a threshold, we propose an improved LAM feature selection algorithm (ILAMFS). Firstly, combining the gold segmentation and the LAM algorithm based on the characteristics...
With the rapid development of text summarization, evaluation methods for automatic Chinese text summarization systems are becoming increasingly important in natural language processing, since they can greatly promote the development of text summarization. This paper analyzes the existing methods for automatic summarization evaluation and introduces a new evaluation method based on clustering. The main idea of...
A BBS, which is built on the Internet to provide public information to people, is one type of electronic information system and social network. It also offers functions such as sending messages and email service. In this paper, we focus on the relations among users, boards and posts, especially by analyzing and mining historical data. We have downloaded a large amount of data from one forum, named New SMTH,...
The first and foremost question to be considered in clustering analysis is how to measure similarity, which directly determines the result of clustering. However, there are many shortcomings in traditional methods. This paper deals with the similarity of English texts using sequence alignment, which is commonly used in bioinformatics. This method does not use the traditional way that transforms texts so...
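To make the idea of sequence alignment on text concrete, here is a minimal sketch of a global alignment score (Needleman-Wunsch style) computed over word sequences. It is an illustration only, not the method of the paper above: the function name `alignment_score` and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions.

```python
def alignment_score(words_a, words_b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two word lists (assumed scoring scheme)."""
    m = len(words_b)
    # Row 0: aligning an empty prefix of words_a against j words costs j gaps.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(words_a) + 1):
        curr = [i * gap] + [0] * m
        for j in range(1, m + 1):
            same = words_a[i - 1] == words_b[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)
            # Best of: align the two words, or insert a gap in either sequence.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[m]


if __name__ == "__main__":
    a = "the cat sat on the mat".split()
    b = "the cat sat on a mat".split()
    print(alignment_score(a, b))
```

A higher score means the two texts share more words in the same order; dividing by the length of the longer text would be one way to turn the score into a normalized similarity.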
Text clustering is an important task of text mining. The purpose of text clustering is to group similar text documents together efficiently to meet human interests in information searching and understanding. The procedure of clustering should involve a cognitive process of text understanding or comprehension. This paper introduces an innovative research effort, CogHTC, a hierarchical text clustering...
Traditional statistics-based single-document automatic abstracting extracts a number of sentences ranked by importance to form a summary, which often neglects the secondary topics of the text and leaves the summary incomplete. To overcome this shortcoming, the paper presents an improved k-means algorithm to divide topics in the analysis of text structure...
Nowadays digital scanning techniques are widely used in many fields, such as medical imaging and office work. In order to achieve fast and accurate text image acquisition and recognition, a new digital scanning pen and several image processing techniques are developed in this paper. An adaptive image division algorithm based on potential function clustering is achieved by the combination of total...