The most widely adopted approach for knowledge extraction from raw data generated at the edges of the Internet (e.g., by IoT or personal mobile devices) is through global cloud platforms, where data is collected from devices and analysed. However, with the increasing number of devices spread throughout the physical environment, this approach raises several concerns. The data gravity concept, one of the basis...
Recent technological advancements in typical domains (e.g., the Internet, financial companies, health care, user-generated data, supply chain systems, etc.) have led to an inundation of data from these domains. This data outburst gave real meaning to the buzzword ‘Big Data’. Compared with traditional data, Big Data exhibits some unique characteristics: it is commonly enormous and unstructured...
Fast data analytics at an increasingly large scale has become a critical task in any Internet service company. For example, in Baidu, the major search engine company in China, large volumes of Web and business data at PB scale are constantly acquired and analyzed in a timely manner for the purposes of evaluating product revenue, tracking product demand on the market, predicting user behavior, upgrading...
This paper analyzes the challenges of data management in army data engineering, such as big data volume, data heterogeneity, high rates of data generation and update, strict time requirements for data processing, and widely separated data sources. We discuss the disadvantages of traditional data management technologies in dealing with these problems. We also highlight the key problems of data management...
Data shuffling is one of the fundamental building blocks of distributed learning algorithms, increasing the statistical gain of each step of the learning process. In each iteration, differently shuffled data points are assigned by a central node to a distributed set of workers to perform local computation, which leads to communication bottlenecks. The focus of this paper is on formalizing and...
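The paper's formalization is beyond this excerpt; as a rough illustration of the shuffling step described above, a minimal sketch (function and variable names are hypothetical, not taken from the paper):

```python
import random

def shuffle_and_assign(data, num_workers, seed):
    """Centrally shuffle the dataset and split it across workers.

    Each iteration uses a fresh permutation, so every worker sees a
    different slice of the data over time (the source of the statistical
    gain the abstract mentions), at the cost of the central node
    re-communicating assignments every round.
    """
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    # Round-robin the permuted indices onto workers.
    return {w: [data[i] for i in indices[w::num_workers]]
            for w in range(num_workers)}

# Example: 10 data points, 3 workers, a new shuffle each iteration.
data = list(range(10))
for iteration in range(2):
    assignment = shuffle_and_assign(data, num_workers=3, seed=iteration)
    print(iteration, assignment)
```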
We have implemented an updated Hierarchical Triangular Mesh (HTM) as the basis for a unified data model and an indexing scheme for geoscience data to address the variety challenge of Big Earth Data. In the absence of variety, the volume challenge of Big Data is relatively easily addressable with parallel processing. The more important challenge in achieving optimal value with a Big Data solution for...
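The details of the updated HTM are not given in this excerpt; as background, here is a minimal, unoptimized sketch of classic HTM trixel indexing (recursive subdivision of eight spherical triangles), the scheme the unified data model builds on:

```python
import numpy as np

def _norm(v):
    return v / np.linalg.norm(v)

# The 8 initial spherical triangles of classic HTM (octahedron faces).
V = [np.array(p, dtype=float) for p in
     [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]]
FACES = {  # name -> corner vertices, counter-clockwise seen from outside
    "N0": (V[1], V[0], V[4]), "N1": (V[4], V[0], V[3]),
    "N2": (V[3], V[0], V[2]), "N3": (V[2], V[0], V[1]),
    "S0": (V[1], V[5], V[2]), "S1": (V[2], V[5], V[3]),
    "S2": (V[3], V[5], V[4]), "S3": (V[4], V[5], V[1]),
}

def _inside(p, v0, v1, v2):
    # p lies in the spherical triangle if it is on the inner side of
    # all three great circles bounding it.
    return (np.dot(np.cross(v0, v1), p) >= 0 and
            np.dot(np.cross(v1, v2), p) >= 0 and
            np.dot(np.cross(v2, v0), p) >= 0)

def htm_name(p, depth):
    """Return the name of the depth-level HTM trixel containing unit vector p."""
    p = _norm(np.asarray(p, dtype=float))
    for name, (v0, v1, v2) in FACES.items():
        if _inside(p, v0, v1, v2):
            break
    for _ in range(depth):
        # Subdivide: edge midpoints become corners of four child trixels.
        w0, w1, w2 = _norm(v1 + v2), _norm(v0 + v2), _norm(v0 + v1)
        for child, tri in (("0", (v0, w2, w1)), ("1", (v1, w0, w2)),
                           ("2", (v2, w1, w0)), ("3", (w0, w1, w2))):
            if _inside(p, *tri):  # boundary ties go to the first match
                name += child
                v0, v1, v2 = tri
                break
    return name

print(htm_name((0.5, 0.5, 0.7071), depth=5))
```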
Driven by the trends of Big Data and cloud computing, there is a growing demand for processing and analyzing data that are generated and stored across geo-distributed data centers. However, due to the limited network bandwidth between data centers and the growing data volume spread across different locations, it has become increasingly inefficient to aggregate data and to perform computations at a...
Big data applications that rely on relational databases have gradually exposed limitations in scalability and performance. In recent years, the Hadoop ecosystem has been widely adopted as an evolving solution. This paper presents the migration of a legacy data analytics application in a provincial data center. The target platform follows a "no one size fits all" approach. Considering different workloads,...
An exponential growth in the availability of data from various sources has enabled large-scale adoption of data-driven decision making. Much of the present day's data was generated in recent years; complementing this, there has been a substantial reduction in data storage costs. Hence, the analysis of the data collected will assist decision making in our future smart environments. Here sustainability...
The vigorous growth of big data has triggered both opportunities and challenges in business and industry. However, Web big data distributed across diverse sources with multiple data structures frequently conflict with each other, i.e., there is inconsistency in cross-source Web big data. In this paper, we propose a state-of-the-art architecture for auto-discovering inconsistency in Web big data. Our contributions...
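The architecture itself is not detailed in this excerpt; as a loose illustration of the underlying idea, records referring to the same entity can be grouped across sources and scanned for conflicting attribute values (the record schema below is hypothetical, chosen only for illustration):

```python
from collections import defaultdict

def find_inconsistencies(records):
    """Group records by (entity, attribute) and flag values that
    disagree across sources.

    `records` is an iterable of (source, entity_id, attribute, value)
    tuples -- a hypothetical schema, not the paper's.
    """
    values = defaultdict(set)   # (entity, attr) -> {values}
    origin = defaultdict(set)   # (entity, attr) -> {sources}
    for source, entity, attr, value in records:
        values[(entity, attr)].add(value)
        origin[(entity, attr)].add(source)
    return [
        {"entity": e, "attribute": a,
         "values": sorted(values[(e, a)]),
         "sources": sorted(origin[(e, a)])}
        for (e, a) in values
        if len(values[(e, a)]) > 1   # conflicting values = inconsistency
    ]

records = [
    ("site_a", "movie:42", "release_year", 1999),
    ("site_b", "movie:42", "release_year", 2000),  # conflicts with site_a
    ("site_b", "movie:42", "director", "L. Wachowski"),
]
print(find_inconsistencies(records))
```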
In their quest for data-driven insight, firms align their resources to produce information that is actionable. Moreover, the bundling and utilization of these valuable resources is what defines an organizational capability. Thus, in this paper we conceptualize a new type of capability, data analytics capability (DAC), as the ability to assemble, coordinate, mobilize, and deploy analytics-based resources...
Digital libraries' large data resources suffer from a lack of analysis and use; in order to mine the value of these big data resources, a platform-based analysis and processing mode is proposed. By integrating R and Hadoop to construct a distributed data analysis platform, many big data analysis tasks can be decomposed into "large" and "small" data processing sections, overcoming the difficulties of previous schemes for analytical...
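As a loose sketch of that "large"/"small" decomposition (written in Python rather than the paper's R-plus-Hadoop stack, with the Hadoop side simulated locally):

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean

# "Large" section: a MapReduce-style aggregation that would run on the
# Hadoop side of the platform (simulated in-process here).
def map_phase(records):
    for user, pages in records:      # emit one key-value pair per record
        yield user, pages

def reduce_phase(pairs):
    pairs = sorted(pairs, key=itemgetter(0))  # stand-in for the shuffle
    for user, group in groupby(pairs, key=itemgetter(0)):
        yield user, sum(pages for _, pages in group)

# "Small" section: the reduced output is tiny, so the statistical
# analysis (done in R on the actual platform) fits on a single node.
logs = [("u1", 3), ("u2", 5), ("u1", 2), ("u3", 1), ("u2", 4)]
totals = dict(reduce_phase(map_phase(logs)))
print(totals, "mean:", mean(totals.values()))
```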
The analysis of data is typically accompanied by concern as to the correctness of recorded data points; some of the points might be contaminated, thereby distorting the result of the analysis. This paper proposes a novel cluster-based and distribution-independent method for outlier detection. Based on Monte Carlo simulations, the new method is tested with different data distributions and compared...
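The paper's method itself is not spelled out in this excerpt; a generic cluster-based outlier check, with k and the quantile threshold chosen purely for illustration, might look like:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_outliers(X, k=2, quantile=0.95):
    """Flag points unusually far from their cluster centroid.

    Generic illustration only: cluster with k-means, measure each
    point's distance to its own centroid, and mark points beyond the
    `quantile` of those distances as outliers. Distribution-independent
    in the sense that no parametric model is fit to the data.
    """
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    dists = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    return dists > np.quantile(dists, quantile)

# Two clean Gaussian clusters plus one injected contaminant at (20, 20).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),
               rng.normal(8, 1, (50, 2)),
               [[20.0, 20.0]]])
print(np.where(cluster_outliers(X, k=2))[0])  # includes index 100
```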
More varied data channels, increasingly diverse analytic methods, and new deployment models, along with some fundamental technology shifts, will significantly impact the next generation of big data systems.
Data mining brings to light hidden and valuable information in data; the facts revealed by data mining were previously unknown, are potentially useful, and of high quality. Data mining offers a means by which we can explore the knowledge in a database. Data stream mining and outlier detection are dynamic research areas of data mining. It is thought that ‘data stream...
As data volumes in scientific applications have grown exponentially, new scientific methods to analyze and organize the data are required. MapReduce programming is driving Internet services, and those services operate in a cloud environment. Hence it is necessary to efficiently provision resources for handling diverse MapReduce applications. In this paper we present the Hadoop application with...
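The application itself is truncated here; a minimal Hadoop Streaming job in Python (word count as a stand-in workload; the streaming jar path varies by installation and is an assumption) illustrates the MapReduce model being provisioned for:

```python
#!/usr/bin/env python3
"""Minimal Hadoop Streaming mapper/reducer in one file. Test locally:

    cat input.txt | python3 wc.py map | sort | python3 wc.py reduce

or submit to a cluster (jar path is an assumption; it varies by install):

    hadoop jar hadoop-streaming.jar \
        -input /data/in -output /data/out \
        -mapper "python3 wc.py map" -reducer "python3 wc.py reduce" \
        -file wc.py
"""
import sys

def mapper():
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")            # emit (word, 1)

def reducer():
    current, count = None, 0
    for line in sys.stdin:                 # input arrives sorted by key
        word, n = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```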
With the rapid development of the Internet, e-commerce websites now routinely have to work with log datasets that are up to a few terabytes in size. How to remove messy data promptly at low cost and find useful information is a problem we have to face. The mining process involves several steps, from pre-processing the raw data to establishing the final models. In this paper we describe our method to...
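The method is truncated here; as an illustration of the pre-processing step, a sketch that drops malformed and noisy log records (the log format and noise rules below are hypothetical):

```python
import re

# Hypothetical combined-log-format line; real field layouts vary by site.
LOG = re.compile(r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
                 r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3})')

NOISE = re.compile(r'\.(css|js|png|jpg|gif|ico)(\?|$)')  # static assets

def clean(lines):
    """Drop malformed lines, failed requests, and static-asset noise,
    keeping only the fields later mining steps need."""
    for line in lines:
        m = LOG.match(line)
        if not m:
            continue                       # messy or truncated record
        if m["status"].startswith(("4", "5")) or NOISE.search(m["url"]):
            continue                       # error responses and assets
        yield m["ip"], m["ts"], m["method"], m["url"]

sample = [
    '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /item/42 HTTP/1.1" 200 512',
    '1.2.3.4 - - [10/Oct/2023:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 80',
    'garbled line that the parser rejects',
]
print(list(clean(sample)))   # only the /item/42 request survives
```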
Earth and environmental scientists collect and use a wide range of observational data. This data often exhibits high structural and semantic heterogeneity due to the variety of data collected and the ways in which observational datasets are structured in practice. However, to address questions at broad temporal, geographic, and biological scales, researchers often need to access and combine data from...
User-friendliness and performance are important properties of data mining and analysis tools. In this demo, we introduce an agent-based distributed data mining platform that allows users to manage and share data-mining-related resources conveniently. Furthermore, the platform employs agents for workflow enactment, in which performance is improved through the agents' capabilities. We also present an example...
The Web has been flooded with highly heterogeneous data sources that freely offer their data to the public. Careful design and compliance with standards is a way to cope with the heterogeneity. However, any agreement and compliance is practically hard to achieve across different communities. In this work we describe a framework that enables the exploitation of content across different scientific disciplines...