Search results

Items from 1 to 16 out of 16 results

chapter

Multi-time scale forecast for schedulable capacity of EVs based on big data and machine learning

Meiqin Mao, Yangyang Wang, You Yue, Liuchen Chang

2017 IEEE Energy Conversion Congress and Exposition (ECCE) > 1425 - 1431

The application of large-scale electric vehicles (EVs) into the future smart grid may bring about serious power quality problems. But EVs can provide ancillary services for the power system as distributed energy resources through Vehicle-to-grid (V2G) technology. The fast and accurate prediction of schedulable capacity (SC) of EVs is the first step to enable this benefit. In this paper, two different...

chapter

Big data anonymization with spark

Yavuz Canbay, Seref Sagiroglu

2017 International Conference on Computer Science and Engineering (UBMK) > 833 - 838

Privacy is an important issue for big data including sensitive attributes. In the case of directly sharing or publishing these data, privacy breach occurs. In order to overcome this problem, previous studies were focused on developing big data anonymization techniques on Hadoop environment. When compared to Hadoop, Spark facilitates to develop faster applications with the help of keeping data in memory...

chapter

Deep-level quality management based on big data analytics with case study

Xiaolei Li, Zhenyu Tu, Quanchao Jia, Xinjiang Man, more

2017 Chinese Automation Congress (CAC) > 4921 - 4926

The Big data analytics gives new chances to the enterprises to enhance their management and manufacturing levels. A solution with case study is proposed to accomplish deep-level quality management based on big data analytics. First, the implementation of big data analytics based on industrial process data is illustrated with case study illustration. Through the analysis and feature extraction of off-line...

chapter

A Big Data architecture for knowledge discovery in PubMed articles

Francesco Gargiulo, Stefano Silvestri, Mario Ciampi

2017 IEEE Symposium on Computers and Communications (ISCC) > 82 - 87

The need of smart information retrieval systems is in contrast with the difficulties to deal with huge amount of data. In this paper we present a Big Data Analytics architecture used to implement a semantic similarity search tool for natural language texts in biomedical domain. The implemented methodology is based on Word Embeddings (WEs) models obtained using the word2vec algorithm. The system has...

chapter

An Ensemble Random Forest Algorithm for Insurance Big Data Analysis

Ziming Wu, Weiwei Lin, Zilong Zhang, Angzhan Wen, more

2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC) > 1 > 531 - 536

Due to the imbalanced distribution of business data, missing of user features and many other reasons, directly using big data techniques on realistic business data tends to deviate from the business goals. It is difficult to model the insurance business data by classification algorithms like Logistic Regression and SVM etc. This paper exploits a heuristic bootstrap sampling approach combined with...

chapter

GPU in-Memory Processing Using Spark for Iterative Computation

Sumin Hong, Woohyuk Choi, Won-Ki Jeong

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) > 31 - 41

Due to its simplicity and scalability, MapReduce has become a de facto standard computing model for big data processing. Since the original MapReduce model was only appropriate for embarrassingly parallel batch processing, many follow-up studies have focused on improving the efficiency and performance of the model. Spark follows one of these recent trends by providing in-memory processing capability...

chapter

The IPTV video evaluation model based on big data

Longfeng Yu, Junhua Gu, Shoubin Wang, Suqi Zhang

2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA) > 159 - 163

The IPTV video evaluation model based on big data is a beneficial basis for IPTV video evaluation. With the new media, social network, Internet of things and cloud computing continuing to evolve, the video-related big data arises at the historic moment. IPTV has also become the choice of more and more users. And IPTV editors are troubled by how to choose the best video for IPTV users. In this paper,...

chapter

Research the Data Analysis and Processing between MapReduce and Spark

Jaime Raigoza, Vijay Parmar

2016 International Conference on Computational Science and Computational Intelligence (CSCI) > 1401 - 1402

Big Data can be defined as large data sets which are being generated from different sources like social media, audios, imaging, logging online websites etc. A need exists to process and analyze this huge amount of data to extract meaningful information. This can be a challenging task. Big data exceeds the processing capability of traditional databases to capture, manage, and process the voluminous...

chapter

Big data techniques, systems, applications, and platforms: Case studies from academia

Atanas Radenski, Todor Gurov, Kalinka Kaloyanova, Nikolay Kirov, more

2016 Federated Conference on Computer Science and Information Systems (FedCSIS) > 883 - 888

Big data is a broad term with numerous dimensions, most notably: big data characteristics, techniques, software systems, application domains, computing platforms, and big data milieu (industry, government, and academia). In this paper we briefly introduce fundamental big data characteristics and then present seven case studies of big data techniques, systems, applications, and platforms, as seen from...

chapter

Predictive Spatio-Temporal Query Processor on Resilient Distributed Datasets

Vijay Akkineni, Berkay Aydin, Sajitha Naduvil-Vadukootu, Rafal Angryk

2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom) > 50 - 58

Moving object prediction and indexing have been a well studied area of research and include applications in environment monitoring, traffic prediction, advertising, and efficient routing. Spark is a cluster computing framework, which utilizes Resilient Distributed Datasets (RDD) on a cluster of several commodity machines. Spark is popularly used for parallel processing of massive datasets. The modeling...

chapter

Visualization and Adaptive Subsetting of Earth Science Data in HDFS: A Novel Data Analysis Strategy with Hadoop and Spark

Xi Yang, Si Liu, Kun Feng, Shujia Zhou, more

Data analytics becomes increasingly important in big data applications. Adaptively subsetting large amounts of data to extract the interesting events such as the centers of hurricane or thunderstorm, statistically analyzing and visualizing the subset data, is an effective way to analyze ever-growing data. This is particularly crucial for analyzing Earth Science data, such as extreme weather. The Hadoop...

chapter

Online Credit Card Fraud Detection: A Hybrid Framework with Big Data Technologies

You Dai, Jin Yan, Xiaoxin Tang, Han Zhao, more

2016 IEEE Trustcom/BigDataSE/ISPA > 1644 - 1651

In this paper, we focus on designing an online credit card fraud detection framework with big data technologies, by which we want to achieve three major goals: 1) the ability to fuse multiple detection models to improve accuracy, 2) the ability to process large amount of data and 3) the ability to do the detection in real time. To accomplish that, we propose a general workflow, which satisfies most...

chapter

Topic Modeling and Visualization for Big Data in Social Sciences

Nitin Sukhija, Mahidhar Tatineni, Nicole Brown, Mark Van Moer, more

2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld) > 1198 - 1205

Topic modeling is a widely used approach for analyzing large text collections. In particular, Latent Dirichlet Allocation (LDA) is one of the most popular topic modeling approaches to aggregate vocabulary from a document corpus to form latent "topics". However, learning meaningful topic models with massive document collections which contain millions of documents, billions of tokens is challenging,...

chapter

Stage Aware Performance Modeling of DAG Based in Memory Analytic Platforms

Giovanni Paolo Gibilisco, Min Li, Li Zhang, Danilo Ardagna

2016 IEEE 9th International Conference on Cloud Computing (CLOUD) > 188 - 195

Spark has grown both in popularity and complexity in recent years. In order to use available resources in an efficient way, users need to understand how the behavior of their applications is affected by the size of the datasets and various configuration settings. Indeed, Spark allows users to specify many configuration parameters and understanding the impact of these choices with respect to the application...

chapter

SparkRDF: Elastic Discreted RDF Graph Processing Engine With Distributed Memory

Xi Chen, Huajun Chen, Ningyu Zhang, Songyang Zhang

2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) > 1 > 292 - 300

With the explosive growth of semantic data on the Web over the past years, many large-scale RDF knowledge bases with billions of facts are generating. This poses significant challenges for the storage and query of big RDF graphs. Current systems still have many limitations in processing big RDF graphs including scalability and real-time. In this paper, we introduce the SparkRDF, an elastic discreted...

chapter

A Multi-dimensional Comparison of Toolkits for Machine Learning with Big Data

Aaron N. Richter, Taghi M. Khoshgoftaar, Sara Landset, Tawfiq Hasanin

2015 IEEE International Conference on Information Reuse and Integration (IRI) > 1 - 8

Big data is a big business, and effective modeling of this data is key. This paper provides a comprehensive multidimensional analysis of various open source tools for machine learning with big data. An evaluation standard is proposed along with detailed comparisons of the frameworks discussed, with regard to algorithm availability, scalability, speed, and more. The major tools profiled are Mahout,...

Filter options

Keywords:
DATA MODELS
BIG DATA
SPARK

Keywords

SPARKS (13)
HADOOP (5)
COMPUTATIONAL MODELING (4)
DATA MINING (3)
DISTRIBUTED DATABASES (3)
ENGINES (3)
MAPREDUCE (3)
REAL-TIME SYSTEMS (3)
ANALYTICAL MODELS (2)
BIG DATA ANALYTICS (2)
CLUSTERING ALGORITHMS (2)
DATA ANALYSIS (2)
MACHINE LEARNING (2)
NOSQL (2)
PREDICTIVE MODELS (2)
PROGRAMMING (2)
SEMANTICS (2)
TRAINING (2)
VISUALIZATION (2)
ALGORITHM DESIGN AND ANALYSIS (1)
ANONYMIZATION (1)
BIO-MEDICAL LITERATURE (1)
BIOLOGICAL SYSTEM MODELING (1)
BP NEURAL NETWORK (1)
CLASSIFICATION ALGORITHMS (1)
CLOUD COMPUTING (1)
COMPUTER ARCHITECTURE (1)
COUPLINGS (1)
CREDIT CARDS (1)
DATA PRIVACY (1)
DATA VISUALIZATION (1)
DATA-INTENSIVE APPLICATIONS (1)
DISTRIBUTED MEMORY (1)
ELECTRIC VEHICLE (1)
ELECTRONIC MAIL (1)
ENSEMBLE LEARNING (1)
ESTIMATION (1)
FEATURE EXTRACTION (1)
FUSES (1)
GEOSCIENCE (1)
GPU (1)
GRAPHICS PROCESSING UNITS (1)
HBASE (1)
HIDDEN MARKOV MODELS (1)
IMBALANCE CLASSIFICATION (1)
IN-MEMORY COMPUTING (1)
INDEXES (1)
INDEXING (1)
INSURANCE (1)
IPTV (1)
IPTV VIDEO EVALUATION MODEL (1)
LARGE RDF GRAPH (1)
LDA (1)
MACHINE LEARNING ALGORITHMS (1)
MAHOUT (1)
MALLET (1)
MARKET RESEARCH (1)
MEDIA (1)
METADATA (1)
MODEL (1)
MODEL FUSION (1)
MOVING OBJECTS (1)
MULTI-TIME SCALE (1)
NATURAL LANGUAGE PROCESSING (1)
NATURAL LANGUAGES (1)
NEURAL NETWORKS (1)
ONLINE CREDIT CARD FRAUD DETECTION FRAMEWORK (1)
ONTOLOGIES (1)
PAGERANK (1)
PERFORMANCE MODELS (1)
PREDICTIVEKNN (1)
PRIVACY (1)
PRIVACY PRESERVING (1)
PRODUCTION (1)
PUBLISHING (1)
PUBMED (1)
QUALITY MANAGEMENT (1)
QUERY PROCESSING (1)
R (1)
RANDOM FOREST (1)
RDD (1)
REDUCER (1)
REGRESSION TREE ANALYSIS (1)
RESOURCE DESCRIPTION FRAMEWORK (1)
RESOURCE MANAGEMENT (1)
REVIEW (1)
RUNTIME (1)
SCALABILITY (1)
SCHEDULABLE CAPACITY (1)
SEMANTIC ANNOTATIONS (1)
SEMANTIC SIMILARITY SEARCH (1)
SOCIAL SCIENCE (1)
SPARQL (1)
SPATIO-TEMPORAL (1)
SPATIOTEMPORAL PHENOMENA (1)
SPRING MVC (1)
STORM (1)