The behavior of university students is a field of study on the rise, whose main objective is the search for patterns that help improve their learning process. This paper analyzes the use of Learning Management Systems (LMS) in Higher Education and the interactions with their different tools from the students' viewpoint. For the analysis of the student activity statistical techniques and algorithms...
We present our preliminary results on building a framework for analyzing online ratings. The framework comprises four major components: data retrieval, data processing, data analytics, and data visualization. The data retrieval module is responsible for scraping or streaming online rating data. Cleaning, filtering, and parsing unstructured data are done in the data processing...
In this paper, we present DPWeka, a differentially private prototype based on the widely used data mining software WEKA, for practical data analysis. DPWeka includes a suite of differential-privacy-preserving computation blocks which support a variety of data analysis tasks including test statistics calculation, regression analysis, and interactive exploratory data analysis. We illustrate the use of...
Objective: This paper presents a novel supervised regularized canonical correlation analysis, termed CuRSaR, to extract relevant and significant features from multimodal high-dimensional omics datasets. Methods: The proposed method extracts a new set of features from two multidimensional datasets by maximizing the relevance of extracted features with respect to sample categories and significance...
The use of smart meters has increased in many countries across the globe. Smart meters are used not only to measure electricity consumption but also to measure gas and water consumption. The smart meter is an integral part of the smart grid system, and its use brings social, economic and environmental benefits to people. The smart meters' main...
As scientific data analysis applications become more and more complex, there is a great need to simplify the definition and execution of such applications, particularly when dealing with large datasets. The Data Mining Cloud Framework (DMCF) is a system allowing domain experts to design and execute complex data analysis workflows on cloud platforms, relying on cloud storage services for every I/O...
Social media analysis is a fast-growing research area aimed at extracting useful information from huge amounts of data generated by social media users. This work presents a Java library, called ParSoDA (Parallel Social Data Analytics), which can be used for developing parallel data analysis applications based on the extraction of useful knowledge from large datasets gathered from social networks. The...
The cumulative growth of data from various sources has led to the era of big data. Big Data analytics gives rise to opportunities in designing competitive offer packages for customers to provide reliable services, but analysis must be accurate and timely for successful decision making. For testing and analyzing Big Data, various statistical methods have been developed. Traditional statistical analysis focuses...
Monitoring the soil moisture level of crop fields is one of the most important steps toward an optimal crop yield. In this paper we investigate the capabilities of Sentinel-1 for assessing soil moisture states. We consider aspects of modeling a soil moisture map from a Sentinel-1 radar scene, using as an example a territory that underwent intensive precipitation. Representations...
As data mining grows in significance for real-world information extraction, we need a robust methodology that does not rely on rich assumptions. In the detection of irregularities, statistical methods are very successful when we assume some distribution. However, even when we encounter data with an unknown distribution, we must first assume some distribution, which may cause incorrect inference. This paper applies...
Today smart devices such as smartphones, smartwatches and activity trackers are widely available and accepted in most developed societies. These devices present a broad set of sensors capable of extracting detailed information about different situations of daily life, which, if used for good, have the potential to improve the quality of life not only for individuals but also for the society in general...
An outlier is an observation (or measurement) that differs from the other values contained in a given data set. Outliers can occur due to several causes. The measurement may have been incorrectly observed, recorded or processed, or it may have been measured correctly but represent a rare event. In this paper it is shown that observed data can contain values that differ from expected ones and...
The Internet of Things (IoT) enables connected objects to capture, communicate, and collect information over the network through a multitude of sensors, setting the foundation for applications such as smart grids, smart cars, and smart cities. In this context, large scale analytics is needed to extract knowledge and value from the data produced by these sensors. The ability to perform analytics on...
Many new applications have recently been developed to satisfy users' special needs on the web. In this context, we are interested in personalized systems and particularly in Personalized Multi-Agent Systems (PMAS) characterized by collective and intelligent resolution in a distributed and parallel environment. This work assesses personalization, the most important characteristic of interface in multi-agent...
A growing number of online experiments are run in e-commerce to understand the behavior of users or customers and then apply data analysis techniques to provide business guidance. One such technique is A/B testing. However, there is no clear guidance on the sample size needed to obtain valuable, trustworthy findings. The purpose of this work is to find out a way to group customers...
The advent of Big Data has triggered disruptive changes in many fields including Intelligent Transportation Systems (ITS). The emerging connected technologies created around ubiquitous digital devices have opened unique opportunities to enhance the performance of ITS. However, the magnitude and heterogeneity of Big Data are beyond the capabilities of the existing approaches in ITS. Therefore,...
The aim of this paper is to examine possibilities for the initial analysis of failure data from an industrial production process. To perform this initial analysis we used a graphical statistical method as well as data mining methods such as drill-down analysis and cluster analysis. Before applying the mentioned techniques and methods it was necessary to know...
The process of Knowledge Discovery in Databases, or KDD for short, has been intensively used in tasks focused on searching for useful information in data. The reason is that such data is generated in significant volume, at high speed and with a large variety, which calls for accurate, efficient and scalable methods to handle it. Given this scenario, several tools and methodologies have been...
Fault classification in power systems is a challenging and complex task because of the variety and variability of the electrical parameters of the various network components across spatial and temporal scales. The majority of machine learning methods for event detection require labeled data sets or examples of previous events. However, the recorded event data happen in different locations, time and system...
The most widely adopted approach for knowledge extraction from raw data generated at the edges of the Internet (e.g., by IoT or personal mobile devices) is through global cloud platforms, where data is collected from devices and analysed. However, with the increasing number of devices spread in the physical environment, this approach raises several concerns. The data gravity concept, one of the basis...