The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
The extraordinary growth in both the size and popularity of the World Wide Web has created a growing interest not only in identifying Web page genres, but also in using these genres to classify Web pages. The hypothesis of this research is that an n-gram representation of a Web page can be used effectively to automatically classify that Web page by genre, even when the Web page belongs to more than...
Extracting instances of a given target relation from a given Web page corpus seems to be the basic work to exploit nearly endless source of knowledge which provided by the World Wide Web. Supervised learning requires a large amount of labeled data, but the data labeling process can be expensive and time consuming. In this paper we present a kernel-based weakly supervised machine learning algorithm...
As the World Wide Web expands and more users join, it becomes an increasingly attractive means of distributing malware. Malicious javascript frequently serves as the initial infection vector for malware. We train several classifiers to detect malicious javascript and evaluate their performance. We propose features focused on detecting obfuscation, a common technique to bypass traditional malware detectors...
In this paper, we construct and compare several feature extraction approaches in order to find a better solution for classification of Turkish Web documents in the marketing domain. We produce our feature extraction techniques using characteristics of the Turkish language, structures of Web documents and online content in the marketing domain. We form datasets in different feature spaces and we apply...
Extracting instances of a given target relation from a given Web page corpus seems to be the basic work to exploit nearly endless source of knowledge which provided by the World Wide Web. In this paper, we present an automated system which could extract instances of an arbitrary given binary relation from Chinese Web documents in domain of football games. Different syntactic sources are combined by...
Sentiment classification aims at mining reviews of people for a certain event's topic or product by automatic classifying the reviews into positive or negative opinions. With the fast developing of World Wide Web applications, sentiment classification would have huge opportunity to help people automatic analysis of customers' opinions from the web information. Automatic opinion mining will benefit...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.