Internet resources are very rich, and traditional search engines can speed up network information retrieval. However, the information in bilingual translation comparison pages on the Internet is not fully used, so combining a personalized search engine with bilingual auxiliary translation has become an important research subject. In this paper, based on the large-scale network-based...
Keyword extraction is an important application in the area of information technology. Automatic keyword extraction can help people learn what an article is primarily about without reading the whole passage carefully. This paper introduces a keyword extraction algorithm that applies PageRank to synonyms. First, the content of a single document is represented as a weighted synonym co-occurrence...
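The general idea of ranking terms with PageRank over a weighted co-occurrence graph can be sketched as follows. This is only a minimal illustration, not the paper's algorithm: the abstract is truncated, so the sliding-window co-occurrence rule, the `synonyms` lookup table, the damping factor of 0.85, and all function names here are assumptions made for the example.

```python
from collections import defaultdict


def build_graph(tokens, synonyms, window=3):
    """Build a weighted, undirected co-occurrence graph.

    Each token is first mapped to a canonical term through the
    (hypothetical) `synonyms` dict, so synonyms share one node.
    Two terms co-occur when they appear within `window` tokens.
    """
    terms = [synonyms.get(t, t) for t in tokens]
    graph = defaultdict(lambda: defaultdict(float))
    for i, a in enumerate(terms):
        for b in terms[i + 1 : i + window]:
            if a != b:
                graph[a][b] += 1.0
                graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (assumed damping 0.85)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    # Total edge weight leaving each node; every node has at least one edge.
    out_weight = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        new_rank = {}
        for v in nodes:
            # The graph is symmetric, so graph[v] also lists incoming edges.
            incoming = sum(rank[u] * w / out_weight[u] for u, w in graph[v].items())
            new_rank[v] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def extract_keywords(tokens, synonyms, top_k=3):
    """Return the top_k canonical terms by PageRank score."""
    rank = pagerank(build_graph(tokens, synonyms))
    return sorted(rank, key=rank.get, reverse=True)[:top_k]
```

For example, with `synonyms = {"automobile": "car"}`, the tokens "car" and "automobile" are merged into a single node before ranking, so their co-occurrence counts reinforce each other.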
A myriad of accessible information is stored in the Deep Web, and the amount of that information is increasing rapidly. With the development of web applications, there are more and more online databases, which makes the Deep Web a hot research topic. To make searching for information in the Scientific Data Sharing Platform more convenient, this paper studies Deep Web applications and presents a system architecture...
Information needs and knowledge discovery are submerged in the vast flood of network information: the sheer quantity of information (the "many") undermines the demand for precise information (the "quasi") and hinders the transformation of information into knowledge. How to manage information effectively, discover knowledge, and provide intelligence services has become a hot topic in the management and application of network information resources...
In the Web environment, information is easier for retail enterprises to obtain. However, knowledge discovery in databases is becoming increasingly difficult. Data sources in the Web environment are unstructured, so the data must pass through a preprocessing system before entering the business intelligence system. Business intelligence makes retail intelligent and automated, including market basket analysis, customer value...
This paper studies the problem of extracting data from large numbers of semi-structured web pages. The fact that many websites have an enormous number of pages generated dynamically from an underlying structured source, such as a database, makes it feasible to induce a common template for similar web pages and then extract data accordingly. Previous work on this problem has limited practical utility because of either...
With the rapid growth of e-commerce, a large e-commerce site may now list millions of products, and customers are unable to choose effectively among the products they are exposed to. To overcome this product overload problem, a variety of recommendation methods have been developed. Collaborative filtering (CF) is the most successful recommendation method. However, the CF method has two well-known limitations,...
Recently, pattern matching with flexible gap constraints has attracted extensive attention, especially in biological sequence analysis and in mining patterns from sequences. One issue is to search for Maximal Pattern Matching with Gaps and the One-Off Condition (MPMGOOC). First, we introduce the concept of MPMGOOC. To solve the problem, we propose some special concepts of Nettree, which is different...
Telecom broadband is a main channel for Internet access in China. As market competition develops, customer churn management has become a core marketing task for telecommunication operators. Traditional market research methods struggle to meet the challenge of churn. Data mining techniques are applied to customer churn management to establish an early-warning...
With the arrival of the network age, enterprise online trust crises have become commonplace and have aroused increasing concern among the senior leaders of various enterprises. The channels through which an enterprise online trust crisis spreads are very broad, and Web mining is the key technology for early warning of such crises. This paper constructs an enterprise online trust crisis early warning model...
In data mining, clustering of large datasets is an important technique that is widely applied in many applications, including social network analysis. Applying a more specific pre-processing method to prepare the data for clustering algorithms is considered a significant step toward generating meaningful segments. In this paper we propose an innovative clustering technique...
Mobile Social Network Analysis is the mapping and measuring of interactions and flows between people, groups, and organizations based on the usage of their mobile communication services. Social Network Analysis and Mining has been highly influenced by the online social web sites, telecom consumer data and instant messaging systems, and has widely analyzed the presence of dense communities using graph...
Nowadays, online social networks host more and more applications in order to provide their users with the possibility of finding everything they need on a single platform. The number and diversity of interactions that take place over time between users and applications within these platforms make these environments very good candidates for learning various types of information about users' interests...
The world has fundamentally changed as the Internet has become a universal means of communication. The Web is a huge virtual space in which to express individual opinions and influence any aspect of life. The Internet contains a wealth of data that can be mined to detect valuable opinions, with implications even in the political arena. Nowadays, Web sources are more accessible and valuable than ever before,...
The development of the Web has given rise to virtual communities. Analyzing information flows and discovering leaders in these communities has thus become a major challenge in different application areas. In this paper, we present an algorithm that aims at detecting leaders in the context of behavioral networks. This algorithm considers the high connectivity and the potentiality of propagating...
Whenever the question arises how a product, a personality, a technology or some other specific entity is perceived by the public, the blogosphere is a very good source of information. This is what usually interests business users from marketing or PR. Modern search services offer a rich set of tools to monitor or track the blogosphere as a whole, but the analysis with respect to a certain domain is...
Social networks have generated great expectations connected with their potential business value. The purpose of our research is to show that even a rudimentary application of data mining techniques can bring a statistically significant improvement in marketing response accuracy throughout the virtual community. In our test the C&RT (classification and regression tree) approach was used to generate...
Automatic generation of malicious behavior patterns from system call traces is important for malware detection. This paper studies existing methods for generating malicious behavior specifications. To reduce the complexity of pattern generation, it constructs graphs whose vertex labels are unique and uses these graphs to mine patterns. To address the issue of limitation of the minimal...
Session identification is an important step in data processing for web log mining. To overcome the defects of traditional session identification, an improved session identification algorithm is proposed. After specific users are identified, a large number of frame pages are filtered out, and a relatively reasonable access time threshold for each page is set according to the contents of each page and all web structure...
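As a rough illustration of timeout-based session identification with a per-page threshold, the sketch below splits one user's requests into sessions. It is not the paper's algorithm: the abstract does not say how thresholds are derived, so the `page_timeout` table, the `default_timeout` of 1800 seconds, the `frame_pages` set, and the function name are all hypothetical.

```python
def split_sessions(requests, page_timeout, frame_pages=frozenset(), default_timeout=1800):
    """Split one user's requests into sessions.

    requests:        list of (timestamp_seconds, url), sorted by time
    page_timeout:    assumed dict mapping url -> max seconds a user is
                     expected to stay on that page
    frame_pages:     assumed set of frame-page urls to drop before splitting
    default_timeout: fallback threshold for pages not in page_timeout
    """
    sessions = []
    current = []
    for ts, url in requests:
        if url in frame_pages:
            continue  # frame pages are filtered out, not counted as visits
        if current:
            prev_ts, prev_url = current[-1]
            # The threshold depends on the page the user was just viewing.
            limit = page_timeout.get(prev_url, default_timeout)
            if ts - prev_ts > limit:
                sessions.append(current)
                current = []
        current.append((ts, url))
    if current:
        sessions.append(current)
    return sessions
```

A new session starts whenever the gap since the previous request exceeds the threshold of the previously viewed page, so a content-heavy page can be given a longer allowance than a navigation page.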
With the rapid development of the Internet, network security is becoming an increasingly important issue. The Network Intrusion Detection System (IDS), as the main security defense technique, is widely used against malicious attacks. Data mining and machine learning technologies have been extensively applied in network intrusion detection and prevention systems by discovering user behavior patterns...