Search results

Items from 1 to 19 out of 19 results

chapter

Flow classification using clustering and association rule mining

U K Chaudhary, I Papapanagiotou, M Devetsikiotis

2010 15th IEEE International Workshop on Computer Aided Modeling, Analysis and Design of Communication Links and Networks (CAMAD) > 76 - 80

2010 15th IEEE International Workshop on Computer Aided Modeling, Analysis and Design of Communication Links and Networks (CAMAD 2010)

Traffic classification has become a crucial domain of research due to the rise in applications that are either encrypted or tend to change port consecutively. The challenge of flow classification is to determine the applications involved without any information on the payload. In this paper, our goal is to achieve a robust and reliable flow classification using data mining techniques. We propose a...

chapter

User behavior mining on large scale web log data

Shun-Hua Tan, Miao Chen, Guo-Hai Yang

The 2010 International Conference on Apperceiving Computing and Intelligence Analysis Proceeding > 60 - 63

2010 International Conference on Apperceiving Computing and Intelligence Analysis (ICACIA 2010)

In this paper we propose a web log mining-based network user behavior analysis scheme, which plays an important role in network structure optimization and website server configuration. Based on clustering and regression model, we studied the network user's visit model in a university by analyzing a large amount of web log data which is collected from the university campus network. The data analyzing...

chapter

Dynamic Fluzzy Clustering Algorithm for Web Documents Mining

Qi Luo

2010 International Conference on Computational Intelligence and Security > 64 - 67

2010 International Conference on Computational Intelligence and Security (CIS 2010)

This paper first studies the methods of web documents mining and text clustering, and summarizes the fuzzy clustering algorithms and similarity measure functions, then proposes a modified similarity function which can solve the problems of feature selection and feature extraction in high-dimensional space. Finally, this paper puts forward a dynamic fuzzy clustering algorithm (DCFCM) by combining...

chapter

Distributed log information processing with Map-Reduce: A case study from raw data to final models

Mingyue Luo, Gang Liu

2010 IEEE International Conference on Information Theory and Information Security > 1143 - 1146

2010 IEEE International Conference on Information Theory and Information Security

With the rapid development of the Internet, e-commerce websites now routinely have to work with log datasets which are up to a few terabytes in size. How to remove messy data promptly at low cost and find useful information is a problem we have to face. The mining process involves several steps from pre-processing the raw data to establishing the final models. In this paper we describe our method to...

chapter

An Improved Data Clustering Algorithm for Mining Web Documents

O H Odukoya, G A Aderounmu, E R Adagunodo

2010 International Conference on Computational Intelligence and Software Engineering > 1 - 8

2010 International Conference on Computational Intelligence and Software Engineering (CiSE 2010)

This paper formulates, simulates and assesses an improved data clustering algorithm for mining web documents with a view to preserving their conceptual similarities and eliminating the problem of speed while increasing accuracy. The improved data clustering algorithm was formulated using the concept of K-means algorithm. Real and artificial datasets were used to test the proposed and existing algorithm...

chapter

Block-GP: Scalable Gaussian Process Regression for Multimodal Data

K Das, A N Srivastava

2010 IEEE International Conference on Data Mining > 791 - 796

2010 10th IEEE International Conference on Data Mining (ICDM 2010)

Regression problems on massive data sets are ubiquitous in many application domains including the Internet, earth and space sciences, and finances. In many cases, regression algorithms such as linear regression or neural networks attempt to fit the target variable as a function of the input variables without regard to the underlying joint distribution of the variables. As a result, these global models...

chapter

Automatic semantic annotation of images based on Web data

Guiguang Ding, Na Xu

2010 Sixth International Conference on Information Assurance and Security > 317 - 322

2010 Sixth International Conference on Information Assurance and Security (IAS 2010)

Image annotation is a promising approach to bridging the semantic gap between low-level features and high-level concepts, and it can avoid heavy manual labor. Most existing automatic image annotation approaches are based on supervised learning. They often encounter several problems, such as insufficient training data, an inability to deal with new concepts, and a limited number of semantic...

chapter

An approach in web content mining for clustering web pages

R Etemadi, N Moghaddam

2010 Fifth International Conference on Digital Information Management (ICDIM) > 279 - 284

2010 Fifth International Conference on Digital Information Management (ICDIM 2010)

Nowadays, using the web and the Internet as a worldwide information system confronts us with a great deal of data. Tools for web-level data processing that help people intelligently transform these data into useful knowledge are therefore important. Clustering web pages is one of these techniques. In this paper, a new algorithm is presented to cluster...

chapter

A Personalized Resource Recommendation System Using Data Mining

Huimin Qi, Ming Cui, Mingming Xiao

2010 International Conference on E-Business and E-Government > 5365 - 5368

2010 International Conference on E-Business and E-Government (ICEE 2010)

This paper offers an overview of the concept of "personalization" applied to e-Learning processes. We introduce a Personalized Resource Recommendation System (PRRS) for e-Learning that uses data mining techniques. In the PRRS, there are four sub-modules: Learner Model, Learning Materials Clustering, Personalized Recommendation and Personalized Evaluation. PRRS is proposed for the purpose...

chapter

Toward Proper Random Graph Models for Real World Networks

Robert Elsässer, Andre Neubert

2010 Ninth International Conference on Networks > 306 - 315

2010 Ninth International Conference on Networks (ICN 2010)

Inspired by a huge amount of empirical study of real world networks such as the Internet, the Web, as well as various social and biological networks, researchers have in recent years developed several random graph models to help us to understand the most fundamental properties of these systems. Simple characteristics observed in many real world networks are 1.) a high clustering coefficient, i.e.,...

chapter

Sparsity-cognizant overlapping co-clustering for behavior inference in social networks

Hao Zhu, Gonzalo Mateos, Georgios B Giannakis, Nicholas D Sidiropoulos, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 3534 - 3537

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Co-clustering can be viewed as a two-way (bilinear) factorization of a large data matrix into dense/uniform and possibly overlapping sub-matrix factors (co-clusters). This combinatorially complex problem emerges in several applications, including behavior inference tasks encountered with social networks. Existing co-clustering schemes do not exploit the fact that overlapping factors are often sparse,...

chapter

Formalizing MapReduce with CSP

Fan Yang, Wen Su, Huibiao Zhu, Qin Li

2010 17th IEEE International Conference and Workshops on Engineering of Computer Based Systems > 358 - 367

2010 17th IEEE International Conference and Workshops on Engineering of Computer-Based Systems (ECBS 2010)

As a programming model, MapReduce is widely used for processing and generating large data sets distributed across a large number of machines. With its widespread use, its validity and other major properties need to be analyzed in a formal framework. In this paper, a formal model is presented using the CSP method. We focus on the dominant parts of MapReduce and formalize them in detail...

chapter

Research of Web Transactions Clustering Analysis Based on Ant-Colony Algorithm

Kejun Zhang, Rong Qian, Xiaokun Zhang, Zhixiang Zhu, et al.

2009 International Conference on Computational Intelligence and Software Engineering > 1 - 4

2009 International Conference on Computational Intelligence and Software Engineering

This paper discusses the two important phases of Web transactions clustering analysis: data preprocessing and clustering analysis. To obtain an easily interpreted clustering result, we introduce the "Concept URL" in the data preprocessing phase. In the clustering analysis phase, a model of an artificial ant is set up. Based on this model, we implement an ant-colony clustering...

chapter

A simulation approach to evaluating design decisions in MapReduce setups

Guanying Wang, A.R. Butt, P. Pandey, K. Gupta

2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems > 1 - 11

2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS)

MapReduce has emerged as a model of choice for supporting modern data-intensive applications. The model is easy-to-use and promising in reducing time-to-solution. It is also a key enabler for cloud computing, which provides transparent and flexible access to a large number of compute, storage and networking resources. Setting up and operating a large MapReduce cluster entails careful evaluation of...

chapter

Dynamic Modeling by Usage Data for Personalization Systems

S.R. Aghabozorgi, Teh Yang Wah

2009 13th International Conference Information Visualisation > 450 - 455

2009 13th International Conference Information Visualisation, IV

With the extensive growth of data available on the Internet, personalization of this huge amount of information becomes essential. Although there are various techniques of personalization, in this paper we concentrate on using data mining algorithms to personalize web sites' usage data. This paper proposes an off-line model-based web usage mining approach in which the model is generated by a clustering algorithm. Then, we will...

chapter

Proposal for a Growth Model of Social Network Service

K. Ishida, F. Toriumi, K. Ishii

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 1 > 91 - 97

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

In this paper, we analyze the network structure of two SNSs, academic community system (ACS) and Amippy. From the viewpoint of network topology, the major characteristics of these data sets can be summarized as follows: low average shortest-path length, high clustering coefficient, presence of a power law degree distribution and negative assortativity. Based on our analysis, we propose a growth model...

chapter

IGSOM: Incremental Clustering Based on Self-Organizing-Mapping

Ming Liu, Yuan-Chao Liu, Xiao-Long Wang

2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 885 - 890

2008 Fourth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

Because of today's information explosion on the Internet, people encounter new information at every moment, so how to analyze this non-stationary information becomes more and more important. Clustering analysis is a good information analysis method, but many clustering algorithms only suit stationary situations. In this paper, a novel incremental clustering algorithm based on self-organizing-mapping-IGSOM...

chapter

Cluster research based on remote server contention states using K-Means over the internet

Yu Song, Fan Xiaoping, Liao Zhifang

2008 27th Chinese Control Conference > 773 - 776

2008 Chinese Control Conference (CCC)

In the environment of data integration over the Internet, the remote server's contention states directly affect the cost of a data query. Determining the server contention states therefore plays an important role in estimating the cost of a query. This paper uses sample queries and the k-means algorithm to determine the remote server's contention states and obtain the response cost of the server, then...

chapter

A study on the feature selection of network traffic for intrusion detection purpose

Wanli Ma, D. Tran, D. Sharma

2008 IEEE International Conference on Intelligence and Security Informatics > 245 - 247

2008 IEEE International Conference on Intelligence and Security Informatics (ISI 2008)

The 3 most important issues for anomaly detection based intrusion detection systems by using data mining methods are: feature selection, data value normalization, and the choice of data mining algorithms. In this paper, we study primarily the feature selection of network traffic and its impact on the detection rates. We use KDD CUP 1999 dataset as the sample for the study. We group the features of...

Filter options

Data set: ieee
Keywords: DATA MODELS, INTERNET, PATTERN CLUSTERING
