The aim of this paper is to introduce a semantic methodology that uses an ontology to improve the results of data mining over a judicial-decisions database. An intelligent, automatic method for finding sentences in lawsuits related to the one on trial is presented. A judicial ontology is built both with and without expert-supplied rules. The method can contribute to judicial celerity, seeking to address the yearning...
As big data spreads rapidly, performance problems in these systems have become a common concern. As the first line of defense against such problems, performance diagnosis plays an essential role in big data systems, yet it is notoriously difficult to conduct in large distributed systems. Previous work either pinpoints the root causes by instrumenting the applications or runtime systems in a...
Over the last 20 years the term Big Data has gained currency; it refers to datasets whose size exceeds the ability of typical database tools to capture, store, manage, and analyze them. Big Data visual analysis is an emerging field that is proving a powerful tool for extracting useful information. This paper reviews 83 articles on visualization techniques for Big Data published over the last six years for...
The paper proposes a methodology for the development of a marketing decision support system using Big Data technology and data mining techniques. The approach was inspired by the CRISP-DM methodology, which is not oriented towards Big Data projects. Therefore, we have modified this methodology with respect to the purpose and technological requirements of the project. The proposed methodology was tested...
The paper presents a definition of Big Data and a description of its main characteristics. A model of the association between entities and characteristics is constructed. A method for integrating heterogeneous data and mapping it to the relational “entity-characteristic” data model was created. Test results for the developed methods and algorithms are presented.
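The “entity-characteristic” mapping described above resembles the classic entity–attribute–value (EAV) pattern for bringing heterogeneous records into one relational shape. A minimal sketch in Python with SQLite follows; the table name, columns, and sample records are illustrative assumptions, not the paper's actual schema:

```python
import sqlite3

# Illustrative EAV-style "entity-characteristic" store (schema is assumed).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entity_characteristic (
        entity_id TEXT,
        characteristic TEXT,
        value TEXT
    )
""")

def ingest(entity_id, record):
    """Flatten a heterogeneous record into entity-characteristic rows."""
    for key, value in record.items():
        conn.execute(
            "INSERT INTO entity_characteristic VALUES (?, ?, ?)",
            (entity_id, key, str(value)),
        )

# Heterogeneous sources: different fields per record, same target model.
ingest("sensor-1", {"type": "thermometer", "reading": 21.5})
ingest("user-42", {"name": "Alice", "city": "Lviv"})

rows = conn.execute(
    "SELECT characteristic, value FROM entity_characteristic "
    "WHERE entity_id = ?", ("sensor-1",)
).fetchall()
print(rows)  # [('type', 'thermometer'), ('reading', '21.5')]
```

The design trade-off is the usual EAV one: any record shape fits without schema changes, at the cost of type information and per-attribute queries.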
The conviction that big data analytics is key to the success of modern businesses is growing deeper, and mobilising companies to adopt it is becoming increasingly important. Big data integration projects enable companies to capture their relevant data, store it efficiently, turn it into domain knowledge, and finally monetize it. In this context, historical data, also called temporal...
In this paper we present a recommendation service based on the Neo4j graph database, applied in the context of smart-city applications and leveraging the potential of big data. The current work studies a modelling approach for user-generated data combined with open big data, and proceeds with a reference implementation and experimentation to validate recommendation services for innovative...
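To give a flavour of the graph traversal such a service runs: the paper uses Neo4j, but the same "users who liked this also liked" pattern can be mimicked by an in-memory Python sketch (the venue data and scoring rule here are invented, not taken from the paper):

```python
from collections import Counter

# Toy bipartite graph: user -> set of liked city venues (data is invented).
likes = {
    "anna":  {"museum", "park", "cafe"},
    "boris": {"museum", "theater"},
    "clara": {"park", "cafe", "theater"},
}

def recommend(user, likes):
    """Recommend venues liked by users who share at least one venue
    with `user`, ranked by how many such neighbours liked them."""
    own = likes[user]
    scores = Counter()
    for other, venues in likes.items():
        if other == user or not (venues & own):
            continue  # no shared venue: not a neighbour in the graph
        for venue in venues - own:
            scores[venue] += 1
    return [v for v, _ in scores.most_common()]

print(recommend("anna", likes))  # ['theater']
```

In Neo4j the same traversal would be a single Cypher `MATCH` over `(:User)-[:LIKES]->(:Venue)` paths, which is where the graph model pays off at big-data scale.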
The purpose of data mining is to explore, find, and analyze relevant data from a massive data source using various technical means. This paper introduces the development of data mining to date, its functions, tasks, and algorithms, as well as the data mining process. The applications and problems of data mining are also presented, and finally the potential future development of data mining technology...
In recent years, the State Grid has built a large number of business systems, e.g. OA, marketing, and MIS systems, whose contents have gradually grown into big data. However, as informatization deepens and data volume sharply increases, new challenges and inconveniences have arisen for data seekers. In this paper, an architecture model of data integration in the power field...
The back-end database is pivotal to the storage of the massive size of big data Internet exchanges stemming from cloud-hosted web applications to Internet of Things (IoT) smart devices. Structured Query Language (SQL) Injection Attack (SQLIA) remains an intruder's exploit of choice on vulnerable web applications to pilfer confidential data from the database with potentially damaging consequences....
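For readers unfamiliar with the attack class at issue (the abstract concerns its detection; this sketch only illustrates the vulnerability itself), the canonical defence is parameterized queries. A minimal Python `sqlite3` example with an invented `users` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

payload = "' OR '1'='1"  # classic SQLIA input

# Vulnerable: string concatenation lets the payload rewrite the WHERE clause.
unsafe = conn.execute(
    "SELECT secret FROM users WHERE name = '" + payload + "'"
).fetchall()

# Safe: the driver binds the payload as a literal value, never as SQL.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (payload,)
).fetchall()

print(unsafe)  # [('s3cret',)] -- the tautology leaks every row
print(safe)    # []           -- no user is literally named "' OR '1'='1"
```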
Graphs represent an increasingly popular data model for data analytics, since they can naturally represent relationships and interactions between entities. Relational databases and their purely table-based data model are not well suited to storing and processing sparse data. Consequently, graph databases have gained interest in the last few years, and the Resource Description Framework (RDF) became the...
RDF models are widely used in the web of data due to their flexibility and similarity to graph patterns. Because the use of RDF is growing, data volumes and contents are increasing, so processing such massive amounts of data on a single machine is not efficient enough, given response-time constraints and limited hardware resources. A common approach to overcoming this limitation is cluster...
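The RDF model behind the two abstracts above reduces to subject–predicate–object triples queried by pattern matching. A single-machine toy in Python makes the idea concrete (the predicates and data are invented; real deployments use a triple store and SPARQL, and the paper's point is precisely that one machine does not scale):

```python
# Toy in-memory RDF-style triple store (single machine, for illustration).
triples = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "livesIn", "Paris"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard,
    like a variable in a SPARQL basic graph pattern."""
    return {
        (ts, tp, to)
        for ts, tp, to in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    }

print(match(s="alice"))           # the two triples about alice
print(match(p="knows", o="bob"))  # {('alice', 'knows', 'bob')}
```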
The Consumer Financial Protection Bureau was established in the USA to enable consumers to report customer-support and complaint-related information regarding their financial issues to the US government. The complaint data is freely available for analyzing and tracking how efficiently and effectively financial institutions handle the complaints lodged against them. Each complaint consists...
In the era of big data, correlations between data need to be understood more intuitively and effectively. This paper designs an interactive visualization tool that explores the overall data through exploring part of the data, and analyzes the interactive visualization functionality built on the Baidu Map API, ECharts, and jQuery UI. It describes the general structure and the interaction...
The CloudMdsQL polystore provides integrated access to multiple heterogeneous data stores, such as RDBMS, NoSQL or even HDFS through a big data analytics framework such as MapReduce or Spark. The CloudMdsQL language is a functional SQL-like query language with a flexible nested data model. A major capability is to exploit the full power of each of the underlying data stores by allowing native queries...
In this paper we present a new Smart Online Vehicle Tracking System for Security Applications (AMOTSSA) and describe how it can be modelled and implemented as a Big Data application. In order to model AMOTSSA as a big data application, we justify the design choices that meet its specific data and processing needs, and we present a set of data-analytic algorithms that would achieve a set of investigation...
This paper explores scalable implementation strategies for carrying out lazy schema evolution in NoSQL data stores. For decades, schema evolution has been an evergreen in database research. Yet new challenges arise in the context of cloud-hosted data backends: With all database reads and writes charged by the provider, migrating the entire data instance eagerly into a new schema can be prohibitively...
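The lazy strategy discussed above can be caricatured as migrate-on-read: a record is rewritten to the latest schema version only when it is next accessed, so the cloud provider's per-operation charges are paid incrementally rather than in one eager bulk migration. A minimal sketch over a Python dict store, where the version field, migration chain, and documents are all assumptions for illustration:

```python
# Toy document store: records carry a schema version "_v"; migrations are
# applied lazily, on first read, instead of eagerly over the whole store.

def migrate_1(doc):
    # v1 -> v2: rename "name" to "full_name".
    doc = dict(doc)
    doc["full_name"] = doc.pop("name")
    doc["_v"] = 2
    return doc

def migrate_2(doc):
    # v2 -> v3: add a "tags" field with a default value.
    doc = dict(doc)
    doc.setdefault("tags", [])
    doc["_v"] = 3
    return doc

MIGRATIONS = {1: migrate_1, 2: migrate_2}
LATEST = 3

store = {
    "u1": {"_v": 1, "name": "Ada"},                        # two versions behind
    "u2": {"_v": 3, "full_name": "Bob", "tags": ["admin"]},
}

def read(key):
    """Fetch a record, upgrading it step by step to the latest schema."""
    doc = store[key]
    while doc["_v"] < LATEST:
        doc = MIGRATIONS[doc["_v"]](doc)
    store[key] = doc  # write back, so the migration cost is paid only once
    return doc

print(read("u1"))  # {'_v': 3, 'full_name': 'Ada', 'tags': []}
```

Records that are never read are never migrated, which is exactly where the eager/lazy cost trade-off the paper studies comes from.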
Modern trends in the agriculture domain have made people realize the importance of big data. The key challenge of big data in agriculture is to identify the effectiveness of big data analytics, and how big data analytics can be used to improve productivity in agricultural practices. The purpose of the proposed research is to reduce the technological gap between rural communities and information...
NoSQL databases are an effective solution for storing and processing large data volumes, but they are heterogeneous: they offer developers and users different data storage models, implementations, and languages. This wide variety of platforms makes data interoperability, data integration, and even data migration from one system to another difficult. This paper proposes a literature review of...
Hadoop has become a popular platform for the management of big data. To keep a Hadoop platform healthy for big data applications, an HMM-based approach to performance diagnosis in Hadoop clusters is proposed. We use metrics collected under normal conditions to train an HMM (Hidden Markov Model), then use this model to detect anomalies based on the resulting probability, which is more accurate than...