In a disaster, accurate information is a resource that is often in short supply. The combination of rapidly unfolding events and the frequent loss of communication infrastructure, including mobile phones, landlines, Internet and television broadcast, makes it difficult to gain situational awareness. And while there is obvious value in aggregating information for centralized emergency authorities, disasters...
With the rapid growth in the quantity of XML data on the Internet, research on XML data retrieval has attracted more and more attention. Keyword-based search algorithms are a research hotspot in this field. We present a context-based layered intersection scan algorithm (CLISA), which uses the context semantics of keywords to filter out a large amount of redundant information, different from...
Collaborative filtering (CF) based recommender systems have gained wide popularity in Internet companies like Amazon, Netflix, Google News, and others. These systems make automatic predictions about the interests of a user by inferring from information about like-minded users. Realtime CF on highly sparse massive datasets, while achieving a high prediction accuracy, is a computationally challenging...
To block pornographic and illegal websites by Internet host domain, most current solutions rely on software-based identification and blocking. Based on an analysis of the Bloom filter algorithm and the features of FPGA chips, this paper proposes an efficient, high-speed hardware-based implementation of host domain blocking.
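As a rough software illustration of the Bloom filter membership test that the abstract above builds on, here is a minimal sketch. It is not the paper's FPGA design: the `BloomFilter` class, the bit-array size, the number of hash functions, the use of seeded SHA-256, and the example host names are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal Bloom filter for host names (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing a one-byte seed
        # before hashing the lower-cased host name.
        data = item.lower().encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly blocked" (false positives can occur);
        # False means "definitely not in the blocklist".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


blocklist = BloomFilter()
for host in ("blocked.example", "bad.example"):
    blocklist.add(host)
```

A lookup such as `"blocked.example" in blocklist` never misses an inserted host, but it may occasionally report a host that was never added, so a real deployment would size the bit array for an acceptable false-positive rate.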
In this paper, Singular Value Decomposition (SVD) is combined with hybrid collaborative filtering (CF), which has proved to be an effective solution to the sparsity problem. SVD is used to reduce the dimension of the user-pageview matrix obtained from web usage mining. Afterwards, both low-rank matrices are employed to generate item-based and user-based predictions. A framework for building...
People have benefited a great deal from the rapid development of the Internet, but at the same time it also spreads harmful and erotic content widely. Whereas researchers previously concentrated on mechanical filtering and text-based filtering, they now focus more on image-based filtering and fusion-algorithm filtering. This paper first introduces the single filtering...
During the last few years, search result clustering has attracted a substantial amount of research. In this paper, we present a comparative study of the performance of two fuzzy clustering algorithms, namely the Fuzzy C-Means (FCM) and Gustafson-Kessel (GK) algorithms, in clustering search results. Therefore, there is a need to reduce the information, help filter out irrelevant items, and favors...
HTTP-based web applications account for the majority of the traffic carried over the web. A major drawback of online filtering is its effect on data transfer throughput. The filter is also poor at analyzing HTTP traffic, for example HTML reconstruction, HTML encoding, and HTTP analysis. This paper presents a remedy to this issue by putting the filter engine in sniffer mode. In this way, filtering...
Cloud computing provides dynamic computing services for large-scale data over the Internet. IaaS (information as a service) is one of the utilities that provide information services in cloud computing. Large volumes of XML data are continually produced on the Internet, so efficient information filtering services are needed. Previous XML filtering approaches target XPath queries. However, many users tend to use keywords...
The TF-IDF algorithm is widely used in text feature extraction, where the IDF value indicates the importance of a term. When applied to the processing of web news, traditional IDF does not work well, especially in a collection divided according to channels. To solve this problem, a refined IDF scheme is proposed, named Channel Distribution Information (CDI) IDF, which is based on the information...
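For orientation, the traditional TF-IDF weighting that the abstract above takes as its baseline can be sketched as follows. This shows only the classic formula, with term frequency normalized by document length and idf = log(N / df); it does not attempt the CDI variant, whose definition is cut off in the listing. The function name `tf_idf` and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Classic TF-IDF over tokenized documents (lists of terms)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


# Hypothetical toy corpus of three tokenized news items.
docs = [
    ["match", "goal", "team"],
    ["market", "stock", "team"],
    ["goal", "goal", "score"],
]
weights = tf_idf(docs)
```

A term that occurs in every document gets weight zero under this formula, which is one reason a plain IDF can behave poorly on collections that are already grouped by topic.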
Collaborative Filtering (CF) algorithms are widely used in many recommender systems. However, the computational complexity of CF is high, which hinders its use in large-scale systems. In this paper, we implement a user-based CF algorithm on a cloud computing platform, namely Hadoop, to solve the scalability problem of CF. Experimental results show that a simple method that partitions users into groups...
Collaborative filtering algorithms are successfully used in personalized recommender systems for their simplicity and high recommending quality. However, significant vulnerabilities have recently been identified in collaborative filtering recommender systems. Malicious users can inject a large number of biased profiles into such a system in order to make recommendations that favor or disfavor given...
Web sequential pattern mining is an important way to learn the access behavior of Web users. In this paper, we present an efficient method of Web sequential pattern mining in the e-learning environment. Unlike traditional mining methods, we categorize the user sessions into human user sessions, crawler sessions and resource-download user sessions. Then we filter out the non-human user sessions,...
Similarity calculation has many applications, such as information retrieval and collaborative filtering, among many others. It has been shown that link-based similarity measures, such as SimRank, are very effective in characterizing object similarities in networks, such as the Web, by exploiting object-to-object relationships. Unfortunately, it is prohibitively expensive to compute the link-based...
There is an urgent need to filter invalid or even hostile information in web information systems. A configurable rule engine is designed to address the simplicity and low adaptability of the rule matching methods used in traditional information filtering algorithms. In the engine, functions are called dynamically through a reflection mechanism to compute atomic rules, logic rules are described...
Image-based spam is becoming a new threat to the Internet and its users. In our earlier work, we proposed an image filtering system that detects spam images by matching them against user-specified image content using the SIFT algorithm. To further improve efficiency, we develop a quick image matching algorithm to replace SIFT. After using difference-of-Gaussian to extract image feature points, we adopt...
The domain name system (DNS) is a fundamental component of the modern Internet, and repeated queries make up a large share of DNS traffic. To filter out the repeated queries in BIND DNS log files, an efficient algorithm is proposed. The algorithm is characterized by maintaining the time sequence of the original queries during processing. The experimental results using CN TLD root server log are...
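The property highlighted in the abstract above, removing repeated queries while keeping the original time order, can be illustrated with a short sketch. This is not the paper's algorithm, whose details are not given in the listing; the function name `drop_repeated_queries`, the `(name, record type)` representation of a query, and the sample data are assumptions, and no real BIND log parsing is attempted.

```python
def drop_repeated_queries(queries):
    """Keep the first occurrence of each query, preserving input order."""
    seen = set()
    unique = []
    for query in queries:
        # Treat names case-insensitively, as DNS names are.
        key = (query[0].lower(), query[1])
        if key not in seen:
            seen.add(key)
            unique.append(query)
    return unique


# Hypothetical queries already extracted from a log, in time order.
log_queries = [
    ("example.cn", "A"),
    ("mail.example.cn", "MX"),
    ("EXAMPLE.cn", "A"),
    ("example.cn", "AAAA"),
    ("mail.example.cn", "MX"),
]
filtered = drop_repeated_queries(log_queries)
```

Each query is checked against a hash set in constant expected time, so the whole pass is linear in the number of log entries.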
With the development of information technology and the popularization of the Internet, the amount of information on the network has risen rapidly, and the emergence and development of network information filtering gives people a better way to access information quickly, accurately and completely. In this paper, PSO is introduced, using its simplicity and efficiency to establish...
Web pages are often decorated with extraneous information (such as navigation bars, branding banners, JavaScript and advertisements). This kind of information may distract users from the actual content they are interested in and may reduce the effectiveness of many advanced Web applications. Automatic content extraction has many applications ranging from providing data for Web mining to realizing better...
In the past few years, as the volume of junk information on the Internet has grown tremendously, researchers have begun to address this issue. In this paper, an algorithm called DCM (Discriminative Category Matching) is employed to design a content-based information filtering system. To our knowledge, this is the first time the algorithm has been applied to filtering information on the Internet. System...