Deep multi-layer neural networks are generally trained using variants of gradient-descent-based algorithms. However, these algorithms usually suffer from a series of shortcomings, such as low training efficiency, local minima, difficult control-parameter tuning, and vanishing or exploding gradients. Moreover, for a specific application, how to design the structure of the network, that is, how...
Stochastic gradient algorithms are the workhorse of large-scale optimization problems and have led to important successes in the recent advancement of deep learning. The convergence of SGD depends on a careful choice of learning rate and on the amount of noise in the stochastic estimates of the gradients. In this paper, we propose an adaptive learning rate algorithm, which utilizes stochastic...
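Since the abstract is truncated before the proposed rule, the sketch below shows only the generic idea of an adaptive learning rate in SGD, using an AdaGrad-style accumulator; the interface and constants are assumptions, not the paper's method:

```python
import numpy as np

def adaptive_sgd(grad_fn, w0, lr=0.1, eps=1e-8, steps=1000):
    """Generic adaptive-step SGD sketch (AdaGrad-style accumulator).
    grad_fn(w) returns a stochastic estimate of the gradient at w."""
    w = np.array(w0, dtype=float)
    g2_sum = np.zeros_like(w)      # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(w)
        g2_sum += g * g
        # per-coordinate step size shrinks as observed gradient energy grows
        w -= lr * g / (np.sqrt(g2_sum) + eps)
    return w
```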
Despite the importance of distributed learning, few fully distributed support vector machines exist. In this paper, not only do we provide a fully distributed nonlinear SVM, but we also propose the first distributed constrained-form SVM. In the fully distributed setting, a dataset is distributed among networked agents that cannot divulge their data, let alone centralize it, and can only communicate with...
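As a rough illustration of this setting (private local data, communication restricted to neighbors), here is a minimal consensus-plus-subgradient sketch for a linear SVM; it is not the paper's constrained-form algorithm, and all names and parameters are assumptions:

```python
import numpy as np

def distributed_linear_svm(local_X, local_y, mixing, lam=0.01, lr=0.05, rounds=200):
    """Each agent holds private data (local_X[i], local_y[i]), labels in {-1,+1}.
    'mixing' is a row-stochastic matrix encoding the communication graph.
    Agents exchange only their weight vectors, never their data."""
    n_agents = len(local_X)
    dim = local_X[0].shape[1]
    W = np.zeros((n_agents, dim))
    for _ in range(rounds):
        W = mixing @ W                         # consensus step with neighbors
        for i in range(n_agents):
            X, y = local_X[i], local_y[i]
            margins = y * (X @ W[i])
            g = lam * W[i]                     # gradient of the l2 regularizer
            viol = margins < 1                 # hinge-loss violators
            if viol.any():
                g -= (y[viol, None] * X[viol]).mean(axis=0)
            W[i] = W[i] - lr * g
    return W.mean(axis=0)                      # agents agree approximately
```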
We consider supervised learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. We focus on the case where the loss function defining the quality of the parameter we wish to estimate may be non-convex but includes a convex regularizer. We propose a Doubly Stochastic Successive Convex approximation scheme (DSSC) able...
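A minimal sketch of this problem class (non-convex loss plus convex regularizer), shown with a plain stochastic proximal-gradient step and an l1 regularizer rather than the DSSC scheme itself:

```python
import numpy as np

def prox_sgd_step(w, stoch_grad, lam, lr):
    """One stochastic proximal-gradient step for min_w f(w) + lam*||w||_1:
    a gradient step on the (possibly non-convex) loss f, then the
    closed-form prox of the convex l1 regularizer (soft-thresholding)."""
    z = w - lr * stoch_grad(w)
    return np.sign(z) * np.maximum(np.abs(z) - lr * lam, 0.0)
```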
Short text classification uses a supervised learning process, which needs a huge amount of labeled data for training and therefore consumes substantial human effort. In traditional supervised learning problems, active learning can reduce the number of samples that must be labeled manually. It achieves this goal by selecting the samples that best represent the whole training set. Uncertainty...
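The abstract breaks off at "Uncertainty", but uncertainty-based selection is the most common active-learning criterion; a minimal sketch (assuming a scikit-learn-style classifier exposing predict_proba; names are illustrative) follows:

```python
import numpy as np

def uncertainty_query(model, pool_X, batch_size=10):
    """Select the pool samples the current model is least sure about,
    using the margin between its top two predicted class probabilities."""
    proba = np.sort(model.predict_proba(pool_X), axis=1)
    margin = proba[:, -1] - proba[:, -2]       # small margin = high uncertainty
    return np.argsort(margin)[:batch_size]     # indices to label manually
```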
Many research works have successfully extended algorithms such as evolutionary algorithms, reinforcement learning agents, and neural networks using "opposition-based learning" (OBL). Two types of "opposites" have been defined in the literature, namely type-I and type-II. The former are linear in nature and applicable to the variable space, hence easy to calculate. On the other hand, type-II opposites capture...
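For reference, the type-I opposite of a point x over an interval [a, b] is the standard reflection a + b - x, which is what makes it easy to calculate:

```python
def type1_opposite(x, a, b):
    """Type-I opposite of x over the interval [a, b]."""
    return a + b - x

# e.g. over [0, 10], the opposite of 2 is 8
assert type1_opposite(2, 0, 10) == 8
```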
Data augmentation is the process of generating samples by transforming training data, with the goal of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations of the samples used in data augmentation. Specifically, for each sample, our main idea is to seek a small transformation that yields maximal classification...
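A simplified rendering of that idea: from a fixed pool of small candidate transforms, pick the one that currently maximizes the classifier's loss on the sample (names and the pool are illustrative; the paper's actual search over transformations is truncated):

```python
import numpy as np

def pick_augmentation(loss_fn, x, y, candidate_transforms):
    """Among a pool of small candidate transforms, return the one that
    currently yields the largest classification loss on sample (x, y)."""
    losses = [loss_fn(t(x), y) for t in candidate_transforms]
    return candidate_transforms[int(np.argmax(losses))]
```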
Classifier competence is critically important for classifier ensembles. This study formulates an optimization problem on the neighborhood graph of the data and develops an iterative algorithm to learn the competences of classifiers. The learned competences not only reflect the competitiveness of the classifiers but also vary smoothly over neighboring data. Experimental results on five different...
In this paper we propose a novel algorithm that improves the one-stage dictionary learning (OS-DL) algorithm by imposing an l2-norm constraint on the update of the atoms. Our contribution starts from the OS-DL algorithm and incorporates the well-known proximal point method from convex optimization into it. Experimental results on recovering a known dictionary and sparsely...
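To make the norm-constrained atom update concrete, here is a generic sketch: a gradient step on the Frobenius-norm fit followed by projection onto the unit l2 ball, a proximal-style treatment of the constraint. It is not the OS-DL update itself:

```python
import numpy as np

def constrained_atom_step(D, X, Z, j, step=0.1):
    """Gradient step on atom j (column of D) for the fit ||X - D @ Z||_F^2,
    followed by projection onto the unit l2 ball (constant factors are
    absorbed into the step size)."""
    residual = X - D @ Z
    grad = -residual @ Z[j]                    # gradient w.r.t. column j
    d = D[:, j] - step * grad
    D[:, j] = d / max(1.0, np.linalg.norm(d))  # enforce ||d||_2 <= 1
    return D
```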
An efficient algorithm is presented for calculating the approximate Hessian matrix used by the Levenberg-Marquardt (LM) optimization algorithm when training a single-hidden-layer feedforward network with linear outputs. The algorithm avoids explicit calculation of the Jacobian matrix and computes the gradient vector and approximate Hessian matrix directly. It requires approximately 1/N the floating...
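For context, the quantities involved are the approximate Hessian J^T J and the gradient J^T e of the squared-error objective; the conventional, explicitly Jacobian-forming step that the paper's algorithm avoids looks like this (a reference sketch, not the paper's implicit computation):

```python
import numpy as np

def lm_step(J, e, mu):
    """Standard LM step: solve (J^T J + mu*I) dw = J^T e for the weight
    update dw, where J is the Jacobian of the residual vector e."""
    H = J.T @ J                                # approximate Hessian
    g = J.T @ e                                # gradient of 0.5*||e||^2
    return np.linalg.solve(H + mu * np.eye(H.shape[0]), g)
```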
This paper presents a pruned sparse extreme learning machine (PS-ELM) algorithm, which can generate a compact single-hidden-layer neural network (SLNN) by automatically pruning the number of hidden nodes while keeping high accuracy. In the PS-ELM algorithm, the input connections between the input and hidden layers are basis vectors, which sparsely map the input features into the hidden layer by using gradient...
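For context, a plain (unpruned) ELM trains in one shot: random hidden-layer weights followed by a closed-form least-squares solve for the output weights. A minimal sketch under those standard assumptions, omitting the PS-ELM sparse mapping and pruning steps:

```python
import numpy as np

def elm_fit(X, T, n_hidden, seed=0):
    """Plain ELM: random input weights and biases, sigmoid hidden layer,
    output weights solved in closed form by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))     # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W, b, beta
```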
Dictionary learning for sparse representations is traditionally approached with sequential atom updates, in which an optimized atom is used immediately for the optimization of the next atoms. We propose instead a Jacobi version, in which groups of atoms are updated independently, in parallel. Extensive numerical evidence for sparse image representation shows that the parallel algorithms, especially...
Several distributed coordinated precoding methods relying on over-the-air (OTA) iterations in time-division duplex (TDD) networks have recently been proposed. Each OTA iteration incurs overhead, which reduces the time available for data transmission. In this work, we therefore propose an algorithm which reaches good sum rate performance within just a few OTA iterations, partially due to...
A new challenge for learning algorithms in cyber-physical network systems is the distributed solution of big-data classification problems, i.e., problems in which both the number of training samples and their dimension are high. Motivated by several problem set-ups in machine learning, in this paper we consider a special class of quadratic optimization problems involving a “large” number of input data,...
Least squares support vector machine (LS-SVM) has been successfully applied in many classification and regression tasks. The main drawback of the LS-SVM algorithm is its lack of sparseness. Combining the primal least squares twin support vector machine (LS-TSVM) and the sparse LS-SVM with L0-norm minimization, a new sparse least squares support vector regression algorithm with L0-norm in primal space (L...
Particle swarm optimisation has been successfully applied as a neural network training algorithm before, often outperforming traditional gradient-based approaches. However, recent studies have shown that particle swarm optimisation does not scale very well, and performs poorly on high-dimensional neural network architectures. This paper hypothesises that hidden layer saturation is a significant factor...
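For reference, the canonical PSO update that such training relies on moves each particle toward its personal best and the swarm's global best; a minimal sketch (the inertia and acceleration constants are standard illustrative values, not the paper's settings):

```python
import numpy as np

def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.4, c2=1.4, rng=None):
    """One canonical PSO update. pos, vel, pbest have shape
    (n_particles, dim); gbest has shape (dim,)."""
    rng = rng or np.random.default_rng()
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    return pos + vel, vel
```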
Deep neural networks (DNN) are typically optimized with stochastic gradient descent (SGD) using a fixed learning rate or an adaptive learning rate approach such as ADAGRAD. In this paper, we introduce a new learning rule for neural networks that is based on an auxiliary-function technique and requires no parameter tuning. Instead of minimizing the objective function directly, a quadratic auxiliary function is recursively...
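As a sketch of the auxiliary-function idea: if a quadratic function majorizes the objective at the current iterate, minimizing it has a closed form. This illustrates the mechanism only; the paper's recursive, tuning-free construction is truncated and not reproduced here:

```python
def mm_step(w, grad, L):
    """Minimizing the quadratic auxiliary function
        q(w) = f(w_t) + grad . (w - w_t) + (L/2) * ||w - w_t||^2,
    which upper-bounds f when L is large enough, gives the closed-form
    update w_{t+1} = w_t - grad / L."""
    return w - grad / L
```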
Dual decomposition methods are the current state-of-the-art for training multiclass formulations of Support Vector Machines (SVMs). At every iteration, dual decomposition methods update a small subset of dual variables by solving a restricted optimization problem. In this paper, we propose an exact and efficient method for solving the restricted problem. In our method, the restricted problem is reduced...
A Semi-supervised Segmentation Fusion algorithm is proposed using consensus and distributed learning. The aim of Unsupervised Segmentation Fusion (USF) is to achieve a consensus among the segmentation outputs of different segmentation algorithms by computing an approximate solution to the underlying NP-hard consensus problem at lower computational cost. Semi-supervision is incorporated in USF using...
Iterative learning control (ILC) algorithms are typically used to iteratively refine the feed-forward control input to a system to achieve an optimized performance objective. Because of its ease of implementation and robustness, ILC has found widespread use in a variety of industrial applications. However, a key limitation of ILC is the requirement that learning has to be re-initiated for each new...
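For reference, the simplest first-order ILC law adds a scaled copy of the previous trial's tracking error to the feed-forward input; a minimal sketch (the gain value is illustrative):

```python
import numpy as np

def ilc_update(u, e, gain=0.5):
    """First-order ILC law: u_{k+1}(t) = u_k(t) + gain * e_k(t), where
    e_k is trial k's tracking error over the whole trajectory."""
    return np.asarray(u) + gain * np.asarray(e)
```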