Given a corrupted low-rank matrix, robust principal component analysis performs a low-rank-plus-sparse matrix decomposition by solving a convex program. In this paper we first develop an efficient rank-revealing decomposition algorithm aided by randomization, which provides information about the singular subspaces and singular values of a given data matrix. The proposed factorization termed randomized...
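The low-rank-plus-sparse convex program mentioned above (principal component pursuit) can be sketched with a generic inexact augmented Lagrangian scheme; this is a plain PCP solver for illustration, not the paper's randomized rank-revealing algorithm:

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, lam=None, mu=None, n_iter=300):
    """Split M ~ L + S (low-rank + sparse) via an inexact augmented Lagrangian."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 1.25 / np.linalg.norm(M, 2)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        Y += mu * (M - L - S)
        mu *= 1.05  # gradually tighten the penalty
    return L, S
```

The randomization in the paper replaces the full SVD inside `svt` with a cheaper randomized sketch; the convex program being solved is the same.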
Gaussian Processes (GPs) are state-of-the-art tools for regression. Inference of GP hyperparameters is typically done by maximizing the marginal log-likelihood (ML). If the data truly follow the GP model, the ML approach is optimal and computationally efficient. Unfortunately, very often this is not the case, and suboptimal results are obtained in terms of prediction error. Alternative procedures...
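The ML inference step described above can be sketched in a few lines: minimize the negative marginal log-likelihood of a 1-D GP with an RBF kernel over (lengthscale, signal variance, noise variance). The kernel choice and synthetic data here are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_ml(log_params, X, y):
    """Negative marginal log-likelihood of a 1-D GP with an RBF kernel."""
    ell, sf2, sn2 = np.exp(log_params)  # lengthscale, signal var, noise var
    d2 = (X[:, None] - X[None, :]) ** 2
    K = sf2 * np.exp(-0.5 * d2 / ell**2) + (sn2 + 1e-8) * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
X = np.linspace(0.0, 5.0, 40)
y = np.sin(X) + 0.1 * rng.normal(size=40)
res = minimize(neg_log_ml, np.zeros(3), args=(X, y), method="Nelder-Mead")
```

When the data are not generated by the assumed kernel, this optimum can be far from the best predictor, which is the mismatch the abstract's alternative procedures target.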
The K-means algorithm is a classical clustering algorithm and has been widely used in many applications. However, the traditional K-means algorithm is easily influenced by outliers and usually yields an unstable clustering result and poor clustering accuracy. In this paper, to make the K-means algorithm resistant to outliers, we propose a Capped Robust K-means algorithm (CRK-means) by adding a capped norm and...
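The capped-norm idea can be sketched as follows: points whose distance to every centroid exceeds a cap are treated as outliers and excluded from the centroid update. This is a minimal illustration of the mechanism, not the paper's exact CRK-means formulation:

```python
import numpy as np

def capped_kmeans(X, k, theta, n_iter=50):
    """K-means with a capped norm: points farther than `theta` from their
    nearest centroid are flagged as outliers and skipped in the update."""
    C = X[:k].copy()  # simple deterministic initialization
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        inlier = d[np.arange(len(X)), labels] <= theta
        for j in range(k):
            members = (labels == j) & inlier
            if members.any():
                C[j] = X[members].mean(axis=0)
    return C, labels, inlier
```

Because a single distant point contributes at most `theta` to the objective, centroids stop chasing outliers, which is the source of the stability the abstract claims.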
Learning a robust regression model from high-dimensional corrupted data is an essential and difficult problem in many practical applications. State-of-the-art methods have studied low-rank regression models that are robust against typical noise (such as Gaussian noise and out-sample sparse noise) or outliers, such that a regression model can be learned from clean data lying on underlying subspaces...
In recent years, Deep Learning has been successfully applied to multimodal learning problems, with the aim of learning useful joint representations in data fusion applications. When the available modalities consist of time series data such as video, audio and sensor signals, it becomes imperative to consider their temporal structure during the fusion process. In this paper, we propose the Correlational...
In this work, we introduce a highly efficient algorithm to address the nonnegative matrix underapproximation (NMU) problem, i.e., nonnegative matrix factorization (NMF) with an additional underapproximation constraint. NMU results are interesting as, compared to traditional NMF, they present additional sparsity and part-based behavior, explaining unique data features. To show these features in practice,...
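The underapproximation constraint (W H <= X elementwise, on top of nonnegativity) can be illustrated for the rank-one case with simple alternating updates: each factor takes its least-squares optimum, clipped to the largest value that keeps u vᵀ below X. This is a didactic sketch, not the paper's algorithm:

```python
import numpy as np

def nmu_rank_one(X, n_iter=100):
    """Rank-one NMU for strictly positive X: alternate between u and v,
    clipping each to preserve the elementwise bound u_i * v_j <= X_ij."""
    m, n = X.shape
    u, v = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        cap_u = np.min(X / np.maximum(v, 1e-12)[None, :], axis=1)
        u = np.clip(X @ v / (v @ v), 0.0, cap_u)
        cap_v = np.min(X / np.maximum(u, 1e-12)[:, None], axis=0)
        v = np.clip(X.T @ u / (u @ u), 0.0, cap_v)
    return u, v
```

The residual X - u vᵀ stays nonnegative, so it can be factored again in the same way; peeling off rank-one pieces like this is what produces the sparse, part-based components the abstract highlights.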
Subspace learning has been widely used in signal processing, machine learning, computer vision, and other fields. Matrix rank minimization is a fundamental model, and the nuclear norm is a convex relaxation of the rank function. In this paper, we propose a polynomial function to smooth the nuclear norm. The Lagrange multiplier method is employed to solve the problem, and the optimal solution is obtained by an iterative procedure...
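The snippet does not show the paper's specific polynomial, but the general idea of smoothing the nuclear norm can be illustrated with a Huber-style quadratic applied to each singular value near zero, where the absolute value is nondifferentiable (an assumption for illustration only):

```python
import numpy as np

def smoothed_nuclear_norm(X, eps=0.1):
    """Smooth surrogate of ||X||_*: singular values below eps are replaced
    by the quadratic s^2/(2*eps) + eps/2, which matches s and its slope at eps."""
    s = np.linalg.svd(X, compute_uv=False)
    return np.where(s < eps, s**2 / (2 * eps) + eps / 2, s).sum()
```

For matrices whose singular values all exceed eps the surrogate coincides with the exact nuclear norm, so the smoothing only changes behavior near rank deficiency, which is where gradient-based iterative procedures need differentiability.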
Modern Cyber-Physical Systems are often driven by a plethora of controllers that are connected with each other and their environment. To guarantee a safe and robust execution of the systems, their control units have to strictly fulfill certain properties, which calls for the use of formal analysis methods in the software development process. We present the combination of the model-based engineering technique...
Low rank matrix approximation, in the presence of missing data and outliers, has previously shown its significance as a theoretical foundation in a wide spectrum of tabulated information processing applications. To fit low rank models, minimizing the nuclear norm of matrices is a popular scheme, the computational load of which, however, is heavy. While bilinear factorization can largely mitigate the...
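The reason bilinear factorization can stand in for the nuclear norm is the variational identity ||X||_* = min over X = U Vᵀ of (||U||_F² + ||V||_F²)/2, which a small numerical check makes concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# nuclear norm via a full SVD (the expensive route)
U_s, s, Vt = np.linalg.svd(X, full_matrices=False)
nuclear = s.sum()

# balanced factorization X = U V^T attains the variational bound
U = U_s * np.sqrt(s)
V = Vt.T * np.sqrt(s)
bound = 0.5 * (np.linalg.norm(U) ** 2 + np.linalg.norm(V) ** 2)
```

Optimizing over small factors U and V with squared Frobenius penalties therefore regularizes rank without ever forming an SVD of the full matrix, which is where the computational savings come from.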
To allow a priori optimization of robot manipulation and improve performance in unmanned environments, it is critical for an augmented reality system to estimate the attitude of point clouds during model reconstruction. The estimation of planar parameters is not always faithful in point cloud fitting, because gross errors and outliers are not considered by traditional plane fitting...
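A standard way to make plane fitting robust to the gross errors mentioned above is RANSAC: sample minimal 3-point hypotheses and keep the plane with the most inliers. This is a generic sketch of that well-known technique, not necessarily the method the paper proposes:

```python
import numpy as np

def ransac_plane(P, n_iter=200, tol=0.05, seed=0):
    """Fit a plane to 3-D points while ignoring gross outliers: sample
    minimal 3-point hypotheses and keep the one with the most inliers."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, None
    for _ in range(n_iter):
        p0, p1, p2 = P[rng.choice(len(P), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        nrm = np.linalg.norm(normal)
        if nrm < 1e-12:
            continue  # degenerate (collinear) sample, skip it
        normal /= nrm
        dist = np.abs((P - p0) @ normal)  # point-to-plane distances
        inliers = dist < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (normal, p0), inliers
    return best_model, best_inliers
```

A final least-squares refit on the returned inlier set usually sharpens the plane parameters once the outliers are excluded.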
The feasibility of large-scale decentralized networks for local computations, as an alternative to big data systems that are often privacy-intrusive, expensive and serve exclusively corporate interests, is usually questioned because of network dynamics such as nodes leaving, failing and rejoining the network. This is especially the case when decentralized computations performed in a network, such as the estimation...
Digitalisation of industrial processes, also called the fourth industrial revolution, is leading to the availability of large volumes of data containing measurements of many process variables. This offers new opportunities to gain deeper insights into process variability and its effects on quality and performance. Manufacturing facilities already use data-driven approaches to study process variability and...
With the arrival of the big data era, data mining techniques have been widely used to build models for cyber security applications such as spam filtering, malware or virus detection, and intrusion detection. This project proposes a novel approach that uses randomness to improve the robustness of data mining models used in cyber security applications against attacks that try to evade detection by adapting...
Feature learning based on deep learning is now widely used in various fields. When dealing with missing data or noisy data, auto-encoder-based methods have been proposed to learn more robust features. However, existing deep-learning methods that aim to handle missing data or noise depend on the kind of signal itself. In order to enhance the fault-tolerance mechanism and improve the anti-noise...
We present a new algorithm for discovering clusters in noisy data streams using dynamic and cluster-specific temporal decay factors. Our improvement helps identify and adapt to evolving trends by adapting the weighting of stream data based on both content attributes and temporal arrival patterns. Our experimental results show that the proposed algorithm can discover better quality clusters in noisy...
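The temporal-decay weighting described above can be sketched with an exponential decay: each stream item's weight shrinks with its age, so cluster statistics track recent trends. A cluster-specific decay rate, as in the abstract, would simply use a different `lam` per cluster; the functions below are an illustrative assumption, not the paper's algorithm:

```python
import numpy as np

def decay_weights(arrival_times, now, lam):
    """Exponential temporal decay: a point's weight halves every ln(2)/lam
    time units, so older stream items count less."""
    return np.exp(-lam * (now - arrival_times))

def weighted_centroid(points, arrival_times, now, lam):
    """Decay-weighted centroid of the points currently assigned to a cluster."""
    w = decay_weights(arrival_times, now, lam)
    return (w[:, None] * points).sum(axis=0) / w.sum()
```

With a large `lam` the cluster forgets quickly and follows drift; with a small `lam` it averages over more history and resists noise, which is the trade-off a dynamic, per-cluster decay tries to balance.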
This paper presents a comparative study of continuous-time and discrete-time system identification approaches for estimating the parameters of an electrical equivalent circuit model of a Li-ion battery. Three such methods are studied here. While the first two represent examples of direct continuous-time approaches, the third one represents an indirect discrete-time approach. Based on both simulated...
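The discrete-time route can be sketched for a single first-order RC branch of an equivalent circuit model, v[k+1] = a·v[k] + b·i[k], identified by least squares from voltage and current samples. The model order, parameter values, and excitation here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# simulate a first-order RC branch in discrete time
a_true, b_true = 0.95, 0.02
rng = np.random.default_rng(0)
i = rng.normal(size=200)   # excitation current
v = np.zeros(201)
for k in range(200):
    v[k + 1] = a_true * v[k] + b_true * i[k]

# least-squares identification from the regressor matrix [v[k], i[k]]
Phi = np.column_stack([v[:-1], i])
a_est, b_est = np.linalg.lstsq(Phi, v[1:], rcond=None)[0]
```

The continuous-time circuit parameters (R, C) can then be recovered from a and b given the sampling period; direct continuous-time methods instead estimate them without this intermediate discrete model.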
The recognition of human activities in the field of video surveillance is attracting more and more researchers. This has led to various approaches and proposals using different methods and techniques. The growing interest in surveillance has also led researchers to give importance to abnormal human activities, in order to propose appropriate techniques dedicated to this type of activity. Unfortunately,...
Even in the era of big data, it remains a challenge to build forecasting models with high accuracy and robustness from small-sized samples. One effective tool for addressing this problem is virtual sample generation (VSG), which can generate a mass of new virtual samples on the basis of small sample sets. The bootstrap method is adopted to feasibly resample the...
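The bootstrap-based generation step can be sketched as resampling rows with replacement and adding small feature-scaled jitter; the jitter scheme here is a common illustrative choice, not necessarily the paper's exact VSG procedure:

```python
import numpy as np

def bootstrap_virtual_samples(X, n_virtual, noise_scale=0.05, seed=0):
    """Generate virtual samples: bootstrap-resample rows of X with
    replacement, then add Gaussian jitter scaled to each feature's spread."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=n_virtual)          # resample rows
    jitter = rng.normal(0.0, 1.0, size=(n_virtual, X.shape[1]))
    return X[idx] + noise_scale * X.std(axis=0) * jitter
```

The virtual set preserves the small sample's feature-wise statistics while filling in density between observed points, which is what lets a forecasting model train on far more examples than were measured.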
In our previous work, we applied an ordinary linear regression equation to network anomaly detection. However, the performance of the ordinary linear regression equation is susceptible to outliers. Unfortunately, it is almost impossible to obtain a “clean” traffic data set for an ordinary regression model, due to the burstiness of network traffic and pervasive network attacks. In this paper, we make...
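The outlier sensitivity of ordinary least squares, and the standard robust-regression remedy, can be seen on synthetic data: a Huber loss caps the influence of large residuals, so a few bursty outliers barely move the fit. This is a generic sketch of robust regression, not the specific estimator the paper proposes:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.2, 50)
y[::10] += 30.0  # a few bursty outliers, as in contaminated traffic data

def residuals(theta):
    return theta[0] * x + theta[1] - y

# Huber loss bounds each residual's influence on the fit
robust = least_squares(residuals, x0=[0.0, 0.0], loss="huber", f_scale=1.0).x
ols = np.polyfit(x, y, 1)  # ordinary least squares, for comparison
```

The ordinary fit absorbs the outliers into a biased intercept, while the Huber fit stays near the true line, which is exactly the failure mode in contaminated traffic data that motivates a robust baseline for anomaly detection.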
The location model is a classification model that is capable of dealing with mixtures of binary and continuous variables simultaneously. The binary variables create a segmentation of the groups, called cells, whilst the continuous variables measure the differences between groups based on information inside the cells. It is important to note that the location model is biased, and even impossible to construct, when...