Search results

Items from 101 to 120 out of 2,294 results

chapter

An empirical method to improve the performance of the classifiers on imbalanced dataset

S. Babu, N. R. Anantha Narayanan

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 8

Research focus on mining imbalanced data sets has increased rapidly in recent years because of its challenge and its extensive application in the real world. A dataset is said to be imbalanced if the representation of attribute categories is not approximately even. All the existing classifiers are inclined to perform poorly on imbalanced datasets. Hence it is essential to go for well balanced...

chapter

Opinion mining and sentiment analysis on online customer review

K L Santhosh Kumar, Jayanti Desai, Jharna Majumdar

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 4

Opinion mining is essential for e-commerce websites and is also advantageous to individuals. An ever increasing amount of results is stored on the web, and the number of people acquiring items from the web is increasing. As a result, the users' reviews or posts are increasing day by day. The reviews toward shipper sites express their feeling. Any organization for example,...

chapter

Discovering the knowledge to find the affected areas of a plague for taking accurate decision

Ramesh Babu Pittala, M. Nagabhushana Rao, M. Shiva Kumar

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 6

Correctness and completeness are the two major factors in the medical field for taking an accurate treatment decision within a span of time. Automated Patient Records (APR) help the Health Management Organization (HMO) take a decision on any specific disease. Retrieving data from among the huge volume of APRs is very important to HMOs. Proposed Collocation Rules in the spatial data mining will...

chapter

A novel approach for data retrieval using watermark technique

Anmol Chaturvedi, Devendra Kumar Somwanshi, Pranjal Ranjan

2016 International Conference on Recent Advances and Innovations in Engineering (ICRAIE) > 1 - 5

Watermarking is the process of embedding information into a digital signal which may be used to verify its authenticity or the identity of its owners which can further resolve the pilfering of analytical properties. Watermarking plays major role in quality retrieval of data or the message embedded in the image. The process of data retrieval is very much critical as many stumbling blocks are there...

chapter

Mining Statistically Significant Attribute Associations in Attributed Graphs

Jihwan Lee, Keehwan Park, Sunil Prabhakar

2016 IEEE 16th International Conference on Data Mining (ICDM) > 991 - 996

Graphs are widely used to represent many different kinds of real world data such as social networks, protein-protein interactions, and road networks. In many cases, each node in a graph is associated with a set of its attributes and it is critical to not only consider the link structure of a graph but also use the attribute information to achieve more meaningful results in various graph mining tasks. Most...

chapter

A Scalable and Generic Framework to Mine Top-k Representative Subgraph Patterns

Dheepikaa Natarajan, Sayan Ranu

2016 IEEE 16th International Conference on Data Mining (ICDM) > 370 - 379

Mining subgraph patterns is an active area of research. Existing research has primarily focused on mining all subgraph patterns in the database. However, due to the exponential subgraph search space, the number of patterns mined, typically, is too large for any human mediated analysis. Consequently, deriving insights from the mined patterns is hard for domain scientists. In addition, subgraph pattern...

chapter

Efficient Extraction of Non-negative Latent Factors from High-Dimensional and Sparse Matrices in Industrial Applications

Xin Luo, Mingsheng Shang, Shuai Li

2016 IEEE 16th International Conference on Data Mining (ICDM) > 311 - 319

High-dimensional and sparse (HiDS) matrices are commonly encountered in many big data-related industrial applications like recommender systems. When acquiring useful patterns from them, non-negative matrix factorization (NMF) models have proven to be highly effective because of their fine representativeness of non-negative data. However, current NMF techniques suffer from a) inefficiency in addressing...

chapter

Improving the Prediction Cost of Drift Handling Algorithms by Abstaining

Pierre-Xavier Loeffel, Vincent Lemaire, Christophe Marsala, Marcin Detyniecki

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1213 - 1222

The problem considered in this paper is regression with a constraint on the precision of each prediction in the framework of data streams subject to concept drifts (when the hidden distribution which generates the observations can change over time). Concept drifts can diminish the reliability of the predictions over time and it might not be possible to output a prediction which satisfies the constraints...

chapter

Clustering with the Levy Walk: “Hunting” for Clusters

Benjamin Schelling, Claudia Plant

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1251 - 1260

The Levy Walk (or Levy flight) is a concept from Biomathematics to describe the hunting-behaviour of many predatory species. It is a very efficient way to find prey in a very short time frame. We now want to use this concept in a clustering-context to, if you so will, "hunt" for clusters. We describe how we convert this concept into an efficient way to find cluster centres by linking the data...

chapter

What Drives Consumer Choices? Mining Aspects and Opinions on Large Scale Review Data Using Distributed Representation of Words

Kasturi Bhattacharjee, Linda Petzold

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 908 - 915

With the increasing popularity of online review sites, developing methods to mine and analyze information contained in the vast amounts of noisy user-generated reviews becomes a necessity. In this work, we develop a method to uncover the various aspects of a product or service reviewed by a user, and the opinions associated with them, in an automated fashion. We use the neural network model Word2Vec...

chapter

Markov Switching Copula Models for Longitudinal Data

Alfredo Cuesta-Infante, Kalyan Veeramachaneni

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1104 - 1109

In this paper we present a novel Markov Switching generative model for continuous multivariate time series and longitudinal data based on Gaussian copula functions. We assume that the values of the multivariate time series at every time slice are sampled out of a joint probability distribution that is selected by the latent state. The use of Gaussian copula functions gives the flexibility of individual...

chapter

Mining Effective Subsequences with Application in Marketing Attribution

Zi Yin, Ying Li, Pietro Mazzoleni, Yuanyuan Shen

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 700 - 707

In this paper, we present a new data mining framework for discovering sequence effects. In particular, we focus on the sequences consisting of actions that are taken in chronological order, like sequences of clinical procedures or marketing actions. Each sequence is associated with a binary outcome, a success or a failure. We investigate the hypothesis that certain subsequences of actions contribute...

chapter

Multi-type Co-clustering of General Heterogeneous Information Networks via Nonnegative Matrix Tri-Factorization

Xianchao Zhang, Haixin Li, Wenxin Liang, Jiebo Luo

2016 IEEE 16th International Conference on Data Mining (ICDM) > 1353 - 1358

Many kinds of real world data can be modeled by a heterogeneous information network (HIN) which consists of multiple types of objects. Clustering plays an important role in mining knowledge from HIN. Several HIN clustering algorithms have been proposed in recent years. However, these algorithms suffer from one or more of the following problems: (1) inability to model general HINs, (2) inability to...

chapter

DeBot: Twitter Bot Detection via Warped Correlation

Nikan Chavoshi, Hossein Hamooni, Abdullah Mueen

2016 IEEE 16th International Conference on Data Mining (ICDM) > 817 - 822

We develop a warped correlation finder to identify correlated user accounts in social media websites such as Twitter. The key observation is that humans cannot be highly synchronous for a long duration, thus, highly synchronous user accounts are most likely bots. Existing bot detection methods are mostly supervised, which requires a large amount of labeled data to train, and do not consider cross-user...

chapter

BRPS: A Big Data Placement Strategy for Data Intensive Applications

Lihui Liu, Junping Song, Haibo Wang, Pin Lv

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 813 - 820

The Market of Data is an environment where data are reasonably dealt with. Some data in the market of data are large and hard to analyze. How to efficiently analyze and organize such large scale data in the market of data is a difficult problem. When using Hadoop to analyze these massive data, if input data of a data mining task are not locally available in a processing node, data have to be migrated...

chapter

Informed Design Platform: Interpreting “Big Data” to Adaptive Place Designs

Linlin You, Bige Tuncer

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1332 - 1335

As a novel concept, "Informed Design" is proposed in a multidisciplinary project "Livable Places" in Singapore to innovate place design from empirical to evidential by harnessing geo-referenced "Big Data" for a responsive design. As a final delivery, an Informed Design Platform (IDP) is being implemented as a design support tool interpreting multi-source big data to adaptive...

chapter

Graph Mining for Complex Data Analytics

Andre Petermann, Martin Junghanns, Stephan Kemper, Kevin Gomez, et al.

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1316 - 1319

Complex data analytics that involve data mining often comprise not only a single algorithm but also further data processing steps, for example, to restrict the search space or to filter the result. We demonstrate graph mining with Gradoop, the first scalable system supporting declarative analytical programs composed from multiple graph operations. We use a business intelligence example including frequent...

chapter

Detecting Smooth Cluster Changes in Evolving Graphs

Sohei Okui, Kaho Osamura, Akihiro Inokuchi

2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) > 369 - 374

Clustering vertices in graphs or in sequences of graphs has important applications in network science, bioinformatics, and other areas. Most research to date has focused on static graphs or sequences where the number of vertices does not change. We propose a new algorithm that successfully partitions the vertices of a graph sequence into smooth clusters, even when the number of vertices is allowed...

chapter

Steady Patterns

Willy Ugarte, Alexandre Termier, Miguel Santana

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 692 - 699

Skypatterns are an elegant answer to the pattern explosion issue, when a set of measures can be provided. Skypatterns for all possible measure combinations can be explored thanks to recent work on the skypattern cube. However, this leads to too many skypatterns, where it is difficult to quickly identify which ones are more important. First, we introduce a new notion of pattern steadiness which measures...

chapter

Max-node sampling: An expansion-densification algorithm for data collection

Katchaguy Areekijseree, Ricky Laishram, Sucheta Soundarajan

2016 IEEE International Conference on Big Data (Big Data) > 3944 - 3946

In this work, we propose Max-Node sampling, a novel sampling algorithm for data collection. The goal of Max-Node is to maximize the number of nodes observed in the sample, given a budget constraint. Max-Node is based on the intuition that networks contain many densely connected regions (i.e., communities), that may be only weakly connected to another, and to maximize the number of nodes observed,...

Keywords: CONFERENCES, DATA MINING

INFONA - science communication portal
