To reduce the human effort in labeling the training set for document classification, some learning algorithms ask users to provide representative keywords for each class rather than any labeled documents. The key challenge in such keyword-labeled classification is how to learn a high-quality classifier with
Keyword clustering is useful for text information retrieval, text document classification, and so on. This paper introduces an unsupervised method to cluster Chinese keywords using a self-organizing map (SOM) artificial neural network. Keywords are encoded into numeric vectors by the similarities of their contextual
To improve Web page search results and enhance Web crawling, this paper proposes Web page clustering based on search keywords. It first uses the matching degree between Web pages and search keywords to decide the order in which pages are shown in the search results. Then
Traditional Web search engines mostly adopt a keyword-based approach. When the keyword submitted by the user is ambiguous, the search result usually consists of documents related to various meanings of the keyword, while the user is probably interested in only one of them. In this paper we attempt to provide a solution to
This paper proposes a novel method to generate labels for grouping and organizing the search results returned by auxiliary search engines. It applies statistical techniques to measure keyword co-occurrence counts, forms a label matrix from them, and then agglomerates them into higher-level
In this paper we propose an approach for Chinese question analysis and answer extraction. A general question analysis process contains keyword extraction and question classification. Question classification plays a crucial role in automatic question answering. To implement the question classification, we have carried
detect user sentiments. The keyword-based approaches for identifying such themes fail to give a satisfactory level of accuracy. Here, we address the above problems using statistical text mining of blog entries. The crux of the analysis lies in mining quantitative information from textual entries. Once the relevant blog
method on the basis of the skew detection. OCR keyword recognition is then used to classify spam faxes a second time. The method is simple to implement, achieves high accuracy, and has been applied successfully.
Computer forensics applies computer investigation and analysis techniques to identify and obtain potential evidence that has legal effect. It mainly includes the processes of data access, data analysis, data submission, and so on. Data analysis is the key link in computer forensics. It faces the problem of extracting useful information from the massive...
It is well known that the working condition of a pipeline, including leaks, can be identified by pressure signal analysis. Because of high-frequency data collection and always-online pipeline leak detection, the pressure signal produces massive amounts of data. A methodology for pipeline leak detection using data mining technology and working condition identification is presented here. Sixteen groups of raw...
Traditional text learning algorithms need labeled documents to supervise the learning process, but labeling documents of a specific class is often expensive and time-consuming. We observe that it is sometimes convenient to describe a class with a few keywords (i.e., class descriptions). However, short class-description
With the rapid growth of information on the Internet, classifying large amounts of text has become an important data mining problem. In this paper, we propose an improved text classification and clustering algorithm; when calculating similarity, we comprehensively consider the relationship between keywords and