expansion of the keyword lexicon used to discover IED-related web pages, which identified new relevant terms for inclusion. Additionally, we present an improved web page feature representation designed to better capture the structural and stylistic cues that reveal genres of communication, and a series of experiments
In this paper, the current classification is reclassified with K-means, based on feedback from Web usage mining, in order to improve the accuracy of news recommendation and the convergence of classification. It can extract the most relevant keywords and eliminate the disturbance of multi-vocal
Social tagging allows users to assign keywords (tags) to resources, facilitating their future access by the tag creator, and possibly by other users. In terms of its support for resource discovery, social tagging has both proponents and critics. This paper investigates whether tags are an effective means for
(MWE) and they do not scale very well. This paper proposes a clustering and classification algorithm for semantic similarity using sample web pages. A further improvement is to analyze the short text for classification, label it according to the keyword, and produce the result for the end user. This type
There are huge numbers of valuable information resources residing on the Invisible Web. However, they are hard for us to use. In this paper we propose a system called NewsReaper that is capable of making the Invisible Web visible, especially the huge amount of real-time information, which is updated frequently and is time-sensitive. NewsReaper makes use of information extraction, text classification, full...
Current classification techniques use word matching and clustering to classify webpages. These techniques use an ad hoc approach of checking and matching all the keywords in a webpage for classification. These methods are efficient but not without problems. In general, they suffer from the following
Web page classification plays an essential role in facilitating more efficient information retrieval and information processing. Conventionally, web text documents are represented by a term frequency matrix for classification purposes. However, considering the limitations of representing documents using terms or keywords
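As an illustration of the term frequency representation mentioned in the snippet above, here is a minimal Python sketch. It is not taken from the cited paper: the function name `term_frequency_matrix`, the simple letters-only tokenizer, and the two sample strings are all assumptions made for the example.

```python
import re
from collections import Counter


def term_frequency_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][j] is the number of
    times vocabulary[j] occurs in documents[i]."""
    # Hypothetical tokenizer: lowercase, then keep runs of ASCII letters.
    # A real pipeline would also strip HTML markup and stop words.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]

    # The vocabulary is the sorted set of all terms seen in any document.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})

    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Illustrative input only; these strings are not from any dataset.
pages = [
    "web page classification of web text",
    "keyword based classification",
]
vocab, tf = term_frequency_matrix(pages)
```

Each row of `tf` is one document and each column one vocabulary term, so the rows can be passed to a classifier as feature vectors.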
visualize the lattice structure of web pages and keywords as a line diagram. This system is implemented on a computer (CPU = 2.83 GHz, MM = 2 GB), using Python, which is an object-oriented programming language, an Application Program Interface (API), and one of the GUI libraries, Tkinter. Through the subjective evaluation and sign
the websites into their most appropriate category. Several parameters, such as the weight applied to each feature and the keywords used to classify the websites, were tuned to yield better results. The experimental evaluation revealed that the implemented method provides very high accuracy. In particular, we obtained an
small number of HTML input elements extracted from user input HTML forms and a few keywords. It utilizes pre-query and post-query techniques in a hierarchical manner. Decision trees and multilayer artificial neural networks were used to obtain classification rates over 91% in classifying search forms and non
likely encountered a high-ranking page that consists of nothing more than a bunch of query keywords. These pages detract both from the user experience and from the quality of the search engine. Search engine spam is a webpage that has been designed to artificially inflate its search engine ranking. Recently this search
Traditional automatic classifiers often make misclassifications. Folksonomy, a new manual classification scheme based on the tagging efforts of users with freely chosen keywords, can effectively resolve this problem. Even though the scalability of folksonomy is much higher than the other manual classification schemes, the