The conviction that big data analytics is key to the success of modern businesses is growing deeper, and mobilising companies to adopt it is becoming increasingly important. Big data integration projects enable companies to capture their relevant data, store it efficiently, turn it into domain knowledge, and finally monetize it. In this context, historical data, also called temporal...
Industry 4.0 proposes the integration of the new generation of ICT solutions for the monitoring, adaptation, simulation, and optimisation of factories. With the democratization of sensors and actuators, factories and machine tools can now be sensorized and the data generated by these devices can be exploited, for instance, to optimize the utilization of the machines as well as their operation and...
Trust and reputation systems represent a significant trend in decision support, including the selection of the best-matching cloud providers to process Big Data. Reputation is often considered a collective measure of trustworthiness based on referrals or ratings from members of a community. Reputation systems have been applied in various applications, such as online service provision. However, reputation...
Infrastructure failures have severe consequences that often negatively impact society and the economy. In this paper, we propose a machine learning model to assist risk management and minimise the cost of infrastructure maintenance. Due to the vast volume and complexity of infrastructure datasets, such problems are often computationally expensive. A Bayesian nonparametric approach...
The current procedure in travel demand estimation models is to deal separately with attraction, production and trip distribution, where the latter typically assumes inverse distance proportionality. We show that this procedure leads to errors in the demand estimation, particularly when dealing with very specific zones and heterogeneous travel behavior. We argue that this traditional procedure is rooted...
The increasing amount of data generated every second, available from multiple sources and in various formats, has given rise to new ways of dealing with it: Big Data. Methodologies and technologies have been developed to make good use of these data, but their adoption by organizations is complex in many respects. A review of the state of the art shows several models and frameworks where solutions...
As companies generate and handle increasingly large amounts of data in the Big Data era, several data lifecycle models have been proposed to deal with this situation. The analysis, management and use of data become more complicated, or in some cases almost impossible, for companies. To transform these data into knowledge, the choice of an adequate lifecycle that matches with...
A huge amount of data is constantly being produced in the world. Data coming from the IoT, from scientific simulations, or from any other field of eScience accumulate into historical data sets and form the seed for future Big Data processing, with the final goal of generating added value and discovering knowledge. In such computing processes, data are the main resource; however, organizing...
We introduce a highly efficient online nonlinear regression algorithm. We process the data in a truly online manner so that no storage is needed, i.e., the data is discarded after use. For nonlinear modeling we use a hierarchical piecewise linear approach based on the notion of decision trees, where the regressor space is adaptively partitioned based directly on performance. For the first time...
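The general idea behind such a method can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the partition here is fixed rather than adaptively grown, and the per-region linear models are updated with a plain LMS (stochastic gradient) rule. The class name, `boundaries` parameter, and learning rate are all illustrative assumptions. Note that each sample is discarded immediately after the update, matching the "no storage" property.

```python
import numpy as np

class PiecewiseLinearOnlineRegressor:
    """Sketch of piecewise linear online regression: the regressor space
    is split into regions (here: fixed 1-D split points), each region
    holds its own linear model, and models are updated online via LMS.
    Samples are discarded after each update, so nothing is stored."""

    def __init__(self, boundaries, dim, lr=0.01):
        self.boundaries = np.asarray(boundaries)   # split points of the regressor space
        self.lr = lr                               # LMS step size
        # One weight vector (plus bias) per region.
        self.w = np.zeros((len(boundaries) + 1, dim + 1))

    def _region(self, x):
        # Index of the region containing x (based on its first coordinate).
        return int(np.searchsorted(self.boundaries, x[0]))

    def predict(self, x):
        r = self._region(x)
        return float(self.w[r, :-1] @ x + self.w[r, -1])

    def update(self, x, y):
        # One LMS step on the region's local linear model; x is then discarded.
        r = self._region(x)
        err = y - self.predict(x)
        self.w[r, :-1] += self.lr * err * x
        self.w[r, -1] += self.lr * err
        return err
```

With a single split at 0, the model can fit a piecewise linear target such as y = |x| exactly, which a single global linear model cannot.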
One of the biggest challenges in Big Data is to exploit value from large volumes of variable and changing data. To do so, one must analyze the data in these Big Data sources and classify the data items according to a domain model (e.g., an ontology). To automatically classify unstructured text documents according to an ontology, a hierarchical multi-label classification process called Semantic...
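The shape of such a process can be sketched top-down: a document descends the ontology's class tree, and every class whose local classifier fires is kept, so assigned labels always form paths from the root. The toy ontology, keyword sets, and keyword-matching "classifiers" below are illustrative assumptions standing in for whatever real classifiers the approach trains per class.

```python
# Sketch of top-down hierarchical multi-label classification against an
# ontology. The ontology and the keyword "classifiers" are hypothetical
# stand-ins; a real system would train a proper classifier per class.

ONTOLOGY = {               # class -> child classes
    "Science": ["Biology", "Physics"],
    "Biology": ["Genetics"],
    "Physics": [],
    "Genetics": [],
}
KEYWORDS = {               # toy per-class classifiers: fire if any keyword occurs
    "Science": {"science", "study"},
    "Biology": {"cell", "organism"},
    "Physics": {"quantum", "energy"},
    "Genetics": {"dna", "gene"},
}

def classify(text, node="Science", labels=None):
    """Keep a class if its classifier fires, then recurse into its children.
    Children are only visited when their parent matched, so the returned
    label set is closed under the ancestor relation."""
    labels = set() if labels is None else labels
    tokens = set(text.lower().split())
    if tokens & KEYWORDS[node]:
        labels.add(node)
        for child in ONTOLOGY[node]:
            classify(text, child, labels)
    return labels
```

For example, a document mentioning "study", "cell" and "dna" receives the full path Science → Biology → Genetics, while a document that fails at the root receives no labels at all, however well it matches deeper classes.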
To avoid unpredictable losses caused by network failures, the reliability of the network needs to be evaluated in some application scenarios. This paper starts network failure prediction research on 14 months of network alarm logs that we collected from one metropolitan area network. The research method is as follows: first, construct features to represent network characteristics by...
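The feature-construction step can be sketched in a common windowed form, assuming (since the excerpt does not give the paper's exact feature set) that the log is a sequence of timestamped, typed alarms: slice the log into fixed time windows and count alarms per type, so each window becomes one feature vector for a downstream failure predictor. The function name, window size, and alarm encoding are all illustrative.

```python
import numpy as np

# Hypothetical alarm log format: (hour, alarm_type_index) pairs.
def build_features(log, window=24, n_types=3):
    """Slice the log into fixed time windows and count alarms per type;
    each window yields one feature vector for a failure predictor."""
    end = max(t for t, _ in log) + 1
    n_windows = -(-int(end) // window)            # ceiling division
    X = np.zeros((n_windows, n_types), dtype=int)
    for t, a in log:
        X[int(t) // window, a] += 1               # count alarm type a in its window
    return X
```

A classifier would then be trained to map each window's counts to whether a failure occurs in the following window.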
With the continuous increase of heterogeneous multimedia data, how to access big multimedia data efficiently has become a question of crucial importance. In order to provide fast access to complex multimedia data, we propose to approximate content-based features of multimedia objects by means of generative models. The proposed gradient-based signatures epitomize a high-quality content-based approximation...
Mining massive data streams in real time is one of the contemporary challenges for machine learning systems. This domain encompasses many of the difficulties hidden beneath the term Big Data. We deal with massive incoming information that must be processed on the fly, with the lowest possible response delay. We are forced to take into account time, memory and quality constraints. Our models must be able...
Large-scale maritime simulation platforms face a contradiction among timeliness, accuracy and integrity in data collection, storage, analysis and exploitation, due to the features of maritime big data, such as large volume, multi-source heterogeneity, abruptness and high noise. Against this background, in order to deal with the complexity and imperfection of maritime big data, this paper employs...
The analysis of Big Data needs to be performed on a range of data stores, both traditional and modern, on data sources that are heterogeneous in their schemas and formats, and on a diversity of query engines. The users who need to perform such data analysis may have several roles, such as business analysts, engineers, and end-users. Therefore, Big Data analytics should be expressed and executed in a...
Modern computer systems generate massive amounts of data in real time. We have entered the age of big data, where the amount of information exceeds the perceptive abilities of any human being. Frequently, massive data collections arrive over time in the form of a data stream. Not only do the volume and velocity of data pose a challenge for machine learning systems, but so does their variability. Such...