Recently, with the wide use of computer systems and the Internet and the rapid growth of computer networks, intrusion detection has become an important concern in network security. In this regard, various intrusion detection systems have been developed using misuse detection and anomaly detection methodologies. These systems try to improve detection rates of variation in attack types...
With the rapid development of the Internet, e-commerce websites now routinely have to work with log datasets of up to a few terabytes in size. How to remove messy data promptly and at low cost, and how to find useful information, is a problem we have to face. The mining process involves several steps, from pre-processing the raw data to establishing the final models. In this paper we describe our method to...
Regression problems on massive data sets are ubiquitous in many application domains, including the Internet, earth and space sciences, and finance. In many cases, regression algorithms such as linear regression or neural networks attempt to fit the target variable as a function of the input variables without regard to the underlying joint distribution of the variables. As a result, these global models...
Fraud is increasing with the extensive use of the Internet and the growth of online transactions. More advanced solutions are needed to protect financial service companies and credit card holders from constantly evolving online fraud attacks. The main objective of this paper is to construct an efficient fraud detection system which is adaptive to the behavior changes by combining classification and...
It is obvious that the Internet has become a key medium for sharing resources and exchanging information. As a special category of social activity, the behavior of network users is complex and diverse, which has drawn increasing attention to studying and managing it. Based on the formation mechanism of ant colonies, this paper proposes an ant colony algorithm to do cluster analysis...
With the widespread use of Internet applications, more and more enterprises build Web sites and provide business information through Web pages. Web page classification can be used to assign enterprise Web pages to one or more predefined business categories. For the purpose of administering Internet-based enterprises in an E-government system, algorithms and application related to web page classification...
Real-time classification of Internet traffic according to application types is vital for network management and surveillance. Identifying emerging applications based on well-known port numbers is no longer reliable. While deep packet inspection (DPI) solutions can be accurate, they require constant signature updates and become infeasible for encrypted payloads, especially in multimedia applications...
This paper proposes a novel weighted feature fusion in color face recognition (FR) to automatically annotate faces in personal videos. In the proposed FR method, multiple face images (belonging to the same subject) are clustered from a sequence of video frames. To facilitate a complementary effect on improving annotation performance, the grouped faces are combined using the proposed weighted feature...
Most traditional classification methods perform poorly on imbalanced data from real-world applications, in particular producing poor predictive accuracy for the minority class. This paper proposes a novel over-sampling strategy to handle imbalanced data based on cluster ensembles, named CE-SMOTE, which aims to provide a better training platform by introducing clustering consistency index...
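As a rough illustration of the over-sampling idea only: the abstract is truncated, so the sketch below shows just the plain SMOTE interpolation step that the name CE-SMOTE suggests it extends, not the cluster-ensemble consistency index. The function name `smote_sample`, the parameter `k`, and the sample points are all hypothetical.

```python
import math
import random

def smote_sample(minority, k=2, rng=random):
    """Create one synthetic point on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours (plain SMOTE)."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    # k nearest neighbours of `base` among the other minority samples
    neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]

# Made-up minority-class points, for illustration only.
rng = random.Random(0)
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
```

Each synthetic point lies between two existing minority samples, so the minority class grows without exact duplicates.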
Phishing fraudsters attempt to create an environment which looks and feels like a legitimate institution, while at the same time attempting to bypass filters and suspicions of their targets. This is a difficult compromise for the phishers and presents a weakness in the process of conducting this fraud. In this research, a methodology is presented that looks at the differences that occur between phishing...
There are many models in the literature and in practice that analyse user behaviour based on user navigation data and use clustering algorithms to characterize access patterns. The navigation patterns identified are expected to capture the user's interests. In this paper, we model user behaviour as a vector of the time the user spends at each URL, and further classify a new user access pattern. The clustering...
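The time-per-URL representation described in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical fixed URL list, made-up visit times, and nearest-centroid assignment by Euclidean distance; the abstract is truncated, so the paper's actual clustering and classification methods may differ.

```python
import math

# Hypothetical URL vocabulary; its order fixes the vector layout.
URLS = ["/home", "/products", "/support"]

def time_vector(visits):
    """Turn {url: seconds} into a fixed-length vector ordered like URLS."""
    return [float(visits.get(url, 0.0)) for url in URLS]

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def classify(vector, centroids):
    """Return the label of the centroid closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))

# Illustrative, made-up clusters of known users (assumed already clustered).
clusters = {
    "shopper": [time_vector({"/home": 10, "/products": 120}),
                time_vector({"/home": 20, "/products": 100})],
    "help_seeker": [time_vector({"/home": 5, "/support": 90}),
                    time_vector({"/support": 110})],
}
centroids = {label: centroid(vs) for label, vs in clusters.items()}

# Classify a new user's access pattern against the cluster centroids.
new_user = time_vector({"/home": 15, "/products": 80, "/support": 5})
label = classify(new_user, centroids)
```

Here the new user spends most time on `/products`, so the pattern falls closest to the `shopper` centroid.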
Extracting useful information from user-generated text on the web is an important ongoing research area in natural language processing, machine learning, and data mining. Online tools like emails, news groups, blogs, and web forums provide an effective communication platform for millions of users around the globe and also provide an added advantage of anonymity. Millions of people post information on different...
The increasing availability of digital educational resources on the Internet, called learning objects, has been followed by the definition of indexing standards. However, the lack of consensus about the definition of learning objects, as well as the diversity of metadata approaches for their classification, hinders the selection process of these elements. This scenario requires new investigations that allow...
Web document classification and clustering are two crucial parts of Web data mining. In this paper, the models, algorithms, and simulation experiments for Web document classification and clustering are studied separately, to support personalized services and to overcome the shortcomings of existing algorithms of the same type. The Web document classification based on fuzzy...
The popularity of the Internet has caused a massive increase in the number of Web pages. The information explosion has led to a growing challenge for information retrieval systems. Document clustering becomes an important process for helping information retrieval systems organize this vast amount of data. It is believed that grouping similar documents together into clusters will help the users...
An adaptive bottom-up Web news extraction approach based on human perception is presented in this paper. The approach simulates how a human perceives and identifies Web news information by using an adaptive bottom-up clustering strategy to detect possible news areas. It first detects news areas based on content function, space continuity, and formatting continuity of news information. It further identifies...
A large quantity of uncertain and unstructured content exists in Web text at present. It is difficult to cluster such text with conventional classification methods. A Web text clustering algorithm based on fuzzy sets is proposed in this paper, and the algorithm is described in detail by example. The technique can improve the time and space complexity of the algorithm,...
This paper presents a new algorithm for Web page classification, CUCS (Combined UC and SVM), for large training sets. CUCS combines the advantages of SVM (Support Vector Machine) and UC (Unsupervised Clustering), achieving high precision and fast speed. In the training stage, CUCS uses UC to obtain clustering centers, which include positive example centers and negative ones. Then CUCS prunes training...
Because of today's information explosion on the Internet, people encounter a great deal of new information at every moment. How to analyze this non-stationary information is therefore increasingly important. Clustering analysis is a good method of information analysis, but many clustering algorithms are suited only to stationary situations. In this paper, a novel incremental clustering algorithm based on self-organizing-mapping-IGSOM...
A new algorithm for Web text clustering mining is presented, which is based on the Discovery Feature Sub-space Model (DFSSM). This algorithm includes the training stage of SOM and the clustering stage, and is characterized by self-stability and strong noise resistance. It can distinguish the most meaningful features from the Concept Space without the evaluation function. We have applied the algorithm...