Due to the exponential growth of available text documents in digital form, it is of great importance to develop techniques for automatic document classification based on the textual contents. Earlier document classification techniques have used keyword-based features and related statistics to achieve good results when
Social tagging allows users to assign keywords (tags) to resources, facilitating their future access by the tag creator and possibly by other users. In terms of its support for resource discovery, social tagging has both proponents and critics. This paper investigates whether tags are an effective means for
quality of text-mined data, while efficacy relied on the context of the choice of techniques. Although developments in automated keyword extraction methods have made a difference in the quality of data selection, the efficacy of Natural Language Processing (NLP) methods using verified keywords remains a challenge. In this
(MWE) and they do not scale very well. This paper proposes a clustering and classification algorithm for semantic similarity using sample web pages. A further improvement is to analyze short text for classification, labeling it according to its keywords and producing the result for the end user. This type
Text classification is an important research topic for managing numerous electronic documents. Feature reduction is the key issue for text classification with high-dimensional keywords. A document analysis method called discriminant coefficient was proposed to reduce features and achieve high precision text
Web page classification plays an essential role in facilitating more efficient information retrieval and information processing. Conventionally, web text documents are represented by a term-frequency matrix for classification purposes. However, considering the limitations of representing documents using terms or keywords
In text categorization, vectorizing a document by probability distribution is an effective dimension-reduction method that saves training time. However, data sets in which categories share many common keywords seriously degrade classification performance. To address that problem, firstly, we conduct an effective
of the classifier. Our experimental results show that these measures can improve the classifier's performance, because keywords change too rapidly in emails while address groups are much steadier.
likelihood in the entire training documents, where the training and test data are split randomly into k subsets, such as 2/3 for training and 1/3 for testing. In addition, it utilizes a two-level hierarchical structure for training documents, such as features from the title, keywords and content, with the predefined knowledge available
Traditional image classification techniques are based on the analysis of low-level visual features or on textual information. In this paper, we describe a novel solution which tries to improve image analysis and processing algorithms by incorporating keywords and textual annotation produced by humans in a folksonomy