The rise of the modern Internet brought with it many opportunities for attackers to profit illegally from spreading spam mail. Spam consists of irrelevant or inappropriate messages sent over the Internet to a large number of recipients. Many researchers have applied a wide range of machine learning classification methods to filter spam messages. However, there is still limited research which evaluates the use of...
This paper first studies methods of web document mining and text clustering and summarizes fuzzy clustering algorithms and similarity measure functions, then proposes a modified similarity function that can solve the problems of feature selection and feature extraction in high-dimensional space. Finally, this paper puts forward a dynamic fuzzy clustering algorithm (DCFCM) by combining...
A web crawler is a relatively simple automated program or script that methodically scans or “crawls” through Internet pages to retrieve information. Alternative names for a web crawler include web spider, web robot, bot, crawler, and automatic indexer. Web crawlers have many different uses. Their primary purpose is to collect data so that when Internet surfers enter a search term...
During the last few years, search result clustering has attracted a substantial amount of research. In this paper, we present a comparative study of the performance of two fuzzy clustering algorithms, Fuzzy C-Means (FCM) and Gustafson-Kessel (GK), on clustering search results. Therefore, there is a need to reduce the information, help filter out irrelevant items, and favor...
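As an illustration of the FCM algorithm named in this abstract, the following is a minimal sketch of the standard Fuzzy C-Means update rules, not the paper's implementation. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the stopping tolerance, and the toy data are all assumptions made for the example; the Gustafson-Kessel variant is not shown.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM sketch: returns (centers, memberships).

    x is an (n_samples, n_features) array, c the number of clusters,
    m > 1 the fuzzifier. All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        w = u ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Toy data: two well-separated groups of points (hypothetical).
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
centers, memberships = fuzzy_c_means(points, c=2)
labels = memberships.argmax(axis=1)
```

Each row of `memberships` gives one sample's degree of belonging to every cluster; taking the `argmax` is only one way to harden the result for inspection.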
Clustering organizes text in an unsupervised fashion. In this paper, we propose an algorithm for the fuzzy clustering of text documents using the naive Bayesian concept. Fuzzy clustering implies that the text documents are assigned to multiple clusters, ranked in descending order of probability. The Vector Space Model is used to represent our dataset as a term-weight matrix. In any natural language,...
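To make the term-weight matrix of the Vector Space Model concrete, here is a small sketch. The abstract does not say which weighting scheme the authors use, so TF-IDF is assumed here; the function name `term_weight_matrix`, the whitespace tokenization, and the sample documents are likewise hypothetical.

```python
import math
from collections import Counter

def term_weight_matrix(docs):
    """Build a documents-by-terms matrix with TF-IDF weights (assumed scheme)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Weight = raw term frequency * log(N / document frequency).
        matrix.append([tf[term] * math.log(n_docs / df[term]) for term in vocab])
    return vocab, matrix

# Hypothetical three-document corpus.
docs = ["web mining web", "fuzzy clustering", "web clustering"]
vocab, matrix = term_weight_matrix(docs)
```

Each row of `matrix` is one document vector and each column corresponds to one entry of `vocab`; a clustering step would then operate on these rows.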
The paper reports a study on information categorization based on highly efficient feature selection and a comprehensive semi-supervised learning algorithm. Feature selection or conversion is performed using maximum mutual information, including linear and non-linear feature conversions. Entropy is used and extended to find the right features with a machine learning method. Fuzzy partition...
In Internet-based e-commerce, the trust levels of transaction entities provided by the e-commerce platform can be viewed as a key indicator for users selecting transaction partners. The trust of transaction entities objectively and comprehensively represents the integrated influence of various trust attributes. Aiming at the uncertainty and ambiguity of trust in e-commerce, the concept of membership degree...
E-learning behavior analysis is an important issue for Internet-based instruction. This paper proposes a new method to analyze e-learning behavior. It classifies e-learning behaviors into several clusters using a fuzzy clustering algorithm. Behaviors in the same cluster share the most characteristics, while behaviors in different clusters share the fewest. Experiments fully demonstrated that...
Fuzzy clustering is one of the important and valid methods for knowledge discovery. One problem in fuzzy clustering is determining a definite fuzzy classification of samples in a given limited sample space. Another is its validity, that is to say, if samples are similar in the sample space, their fuzzy types should be similar too. In our research, firstly, using the triangle arithmetic operator and triangle transference,...
In order to solve the problem that user classification reflects the features of Web users inflexibly, a novel user classification model is presented in this paper. By introducing the concept of time discretization and applying fuzzy equivalence relation clustering to classify Web users, the model can rationally solve user classification problems. Empirical results showed that the output of user...
Fuzzy clustering is a popular method for modeling web usage data, and a number of techniques have been proposed. The performance of such techniques has been demonstrated through experiments using datasets which are often limited in size and/or variety. This is mainly due to the difficulty of acquiring large real datasets, and also to the huge amount of time and effort required in performing experiments...
Web mining is defined as applying data mining techniques to the content, structure, and usage of Web resources. Three areas of Web mining are commonly distinguished: content mining, structure mining, and usage mining. In all these areas, a wide range of general data mining techniques, in particular association rule discovery, clustering, classification, and sequence mining, are employed and developed...
Clustering Web sessions is an important aspect of Web usage mining. In this paper, we propose a new algorithm for the fuzzy clustering of Web sessions, which applies the t-bridge algorithm to the fuzzy equivalence matrix clustering algorithm. Experiments show that this algorithm has better accuracy, lower CPU time, and better scalability than others.
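For orientation, the classical fuzzy equivalence matrix clustering that this abstract builds on can be sketched as follows: take a reflexive, symmetric fuzzy similarity matrix, compute its max-min transitive closure, and cut it at a threshold λ. This is a sketch of the textbook baseline only; the t-bridge algorithm proposed by the authors is not reproduced here, and the function names and the sample similarity matrix are assumptions.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def transitive_closure(r):
    """Square the relation repeatedly until it stops changing."""
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared

def lambda_cut_clusters(r, lam):
    """Group items whose closure similarity is at least lam."""
    closure = transitive_closure(r)
    n = len(closure)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if closure[i][j] >= lam]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters

# Hypothetical similarity matrix for three Web sessions.
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
coarse = lambda_cut_clusters(similarity, 0.3)
fine = lambda_cut_clusters(similarity, 0.5)
```

Lowering λ merges clusters and raising it splits them, so a single closure yields a whole hierarchy of partitions. The repeated squaring is the costly step in this baseline.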