This paper describes a multi-dimensional knowledge discovery and data mining (KDD) methodology that aims at discovering actionable knowledge related to Internet threats, taking into account domain expert guidance and the integration of domain-specific intelligence during the data mining process. The objectives are twofold: i) to develop global indicators for assessing the prevalence of certain malicious...
Most research on Internet topology is based on active measurement methods. A major difficulty in using these tools is that one comes across many unresponsive routers. Different methods of dealing with these anonymous nodes to preserve the connectivity of the real graph have been suggested. One of the more practical approaches involves using a placeholder for each unknown, resulting in multiple copies...
Word meaning disambiguation has always been an important problem in many computer science tasks, such as information retrieval and extraction. One of the problems faced in automatic word sense discovery is the number of different senses a word can have. Often, senses are dominated by other, more frequent ones. Discovering such dominated meanings can significantly improve the quality of many text-related...
Recently, many commercial products, such as Google Trends and Yahoo! Buzz, have been released to monitor past trends in search engine query frequency. However, little research has been devoted to predicting upcoming query trends, which is of great importance in providing guidelines for future business planning. In this paper, a unified solution is presented for this purpose. Besides the classical...
We describe Deimos, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that make an end-to-end solution to this problem possible. First, given an example source, Deimos finds other similar sources online. Second, it invokes and extracts data from these sources. Third, given the syntactic structure of a source,...
Semantic concept learning is one of the most challenging problems in video retrieval. The key barrier to semantic concept learning is the lack of annotated training data. Internet videos differ from ordinary videos: they are massive in number, rich in information, and customized, with non-uniform formats, uneven quality, little descriptive text, and only a few shots of limited length. Therefore, the Internet is a potential repository...
In contrast with most Internet topology measurement research, our concern here is not to obtain a map of the whole Internet that is as complete and precise as possible. Instead, we claim that each machine's view of this topology, which we call an ego-centered view, is an object worthy of study in itself. We design and implement an ego-centered measurement tool, and perform radar-like measurements consisting of...
Recently, a new temporal dataset has been made public: it consists of a series of twelve 100 M-page snapshots of the .uk domain. The Web graphs of the twelve snapshots have been merged into a single time-aware graph that provides constant-time access to temporal information. In this paper we present the first statistical analysis performed on this graph, with the goal of checking whether the information...
We present a demo of ESTER, a search engine that combines the ease of use, speed and scalability of full-text search with the powerful semantic capabilities of ontologies. ESTER supports full-text queries, ontological queries and combinations of these, yet its interface is as easy as can be: a standard search field with semantic information provided interactively as one types. ESTER works by reducing...
This demonstration concerns a system designed and implemented to automatically build multimodal aggregations of informative news items drawn from two domains: digital television and the Web. Although several recent technological solutions have addressed the problem of clustering online articles, little is available that can integrate these two sources of information. The...