Recurrent neural networks have been widely used as auto-regressive models for time series. The most commonly used training method for recurrent neural networks is back-propagation. However, recurrent neural networks trained with back-propagation can get trapped at local minima and saddle points. In these cases, auto-regressive models cannot effectively model time series patterns. In order to address these...
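The failure mode described above can be made concrete with a minimal sketch: a single-layer recurrent network trained by backpropagation through time (BPTT) on one-step-ahead prediction of a sine wave. Everything here (hidden size, learning rate, data) is illustrative rather than taken from the paper; the point is that the updates are plain gradient descent, exactly the local search that can stall at local minima or saddle points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one-step-ahead prediction of a noisy sine wave.
T = 100
series = np.sin(np.linspace(0, 8 * np.pi, T + 1)) + 0.05 * rng.standard_normal(T + 1)
x, y = series[:-1], series[1:]

H = 16                                        # hidden units (assumed, not from the paper)
Wx = rng.standard_normal((H, 1)) * 0.1
Wh = rng.standard_normal((H, H)) * 0.1
Wy = rng.standard_normal((1, H)) * 0.1
bh, by = np.zeros(H), np.zeros(1)
lr = 0.01

for epoch in range(200):
    # Forward pass, storing activations for backpropagation through time.
    hs, preds = [np.zeros(H)], []
    for t in range(T):
        h = np.tanh(Wx[:, 0] * x[t] + Wh @ hs[-1] + bh)
        hs.append(h)
        preds.append(Wy @ h + by)
    # Backward pass (BPTT) for squared-error loss.
    dWx = np.zeros_like(Wx); dWh = np.zeros_like(Wh); dWy = np.zeros_like(Wy)
    dbh = np.zeros_like(bh); dby = np.zeros_like(by)
    dh_next = np.zeros(H)
    for t in reversed(range(T)):
        dy = preds[t] - y[t]                  # d(loss)/d(prediction)
        dWy += np.outer(dy, hs[t + 1]); dby += dy
        dh = Wy.T @ dy + dh_next              # gradient flowing into h_t
        dz = (1 - hs[t + 1] ** 2) * dh        # through tanh
        dWx[:, 0] += dz * x[t]; dWh += np.outer(dz, hs[t]); dbh += dz
        dh_next = Wh.T @ dz
    for p, g in [(Wx, dWx), (Wh, dWh), (Wy, dWy), (bh, dbh), (by, dby)]:
        p -= lr * g / T                       # plain gradient descent step
```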
Accelerating the inference of a trained DNN is a well-studied subject. In this paper we shift the focus to the training of DNNs. The training phase is compute-intensive, demands complicated data communication, and contains multiple levels of data dependencies and parallelism. This paper presents an algorithm/architecture space exploration of efficient accelerators to achieve better network convergence...
This paper presents a novel nonlinear adaptive filter method, namely, the Hammerstein adaptive filter with single feedback under minimum mean square error (HAF-SF-MMSE). A single delayed output is incorporated into the estimation of the current output based on the minimum mean square error criterion, so the history information of the output is taken into account. Moreover, hybrid learning rates and adaptive...
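The paper's exact HAF-SF-MMSE formulation is not reproduced above, but the general structure it describes, a static nonlinearity feeding a linear adaptive filter whose regressor also includes the single delayed output, can be sketched with LMS-style stochastic-gradient updates under the MMSE criterion. The polynomial nonlinearity, step sizes, and filter order below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def haf_sf_lms(u, d, order=4, poly_deg=3, mu_w=0.01, mu_a=0.001):
    """Sketch of a Hammerstein adaptive filter with a single output feedback.

    u: input signal, d: desired signal. The static nonlinearity is a
    polynomial with coefficients `a`; its output feeds an FIR filter `w`
    whose regressor includes the previous output as one extra feedback tap.
    Both parameter sets are adapted by stochastic gradient (LMS-style)
    descent on the instantaneous squared error. Hyperparameters are
    illustrative, not from the paper.
    """
    a = np.zeros(poly_deg + 1); a[1] = 1.0    # start as identity nonlinearity
    w = np.zeros(order + 1)                   # last tap weights y[n-1]
    y_prev, out = 0.0, np.zeros(len(u))
    for n in range(order, len(u)):
        seg = u[n - order + 1:n + 1][::-1]    # most recent inputs first
        x = np.polyval(a[::-1], seg)          # nonlinearity applied tap-wise
        phi = np.append(x, y_prev)            # regressor with feedback term
        y = w @ phi
        e = d[n] - y                          # instantaneous error
        w += mu_w * e * phi                   # LMS update of the linear part
        # d(y)/d(a_k) = sum_i w_i * seg_i**k (feedback term treated as constant)
        grad_a = np.array([(w[:-1] * seg ** k).sum() for k in range(poly_deg + 1)])
        a += mu_a * e * grad_a
        y_prev, out[n] = y, y
    return out, w, a

# Toy identification task: tanh nonlinearity followed by a short FIR channel.
u = rng.standard_normal(5000)
d = np.convolve(np.tanh(u), [0.6, 0.3, 0.1])[:len(u)]
out, w, a = haf_sf_lms(u, d)
```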
Independent Component Analysis (ICA) is a dimensionality reduction technique that can boost the efficiency of machine learning models that deal with probability density functions, e.g. Bayesian neural networks. Algorithms that implement adaptive ICA converge more slowly than their nonadaptive counterparts; however, they are capable of tracking changes in the underlying distributions of input features. This intrinsically...
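For concreteness, the classic online natural-gradient ICA rule (one small update per incoming sample, hence slower convergence but the ability to track a drifting mixture) fits in a few lines; the learning rate and tanh score function are standard textbook choices, not details from this abstract:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two super-Gaussian sources mixed linearly; recover them online.
n, dim = 20000, 2
S = rng.laplace(size=(n, dim))               # unknown sources
A = rng.standard_normal((dim, dim))          # unknown mixing matrix
X = S @ A.T                                  # observed mixtures

W = np.eye(dim)                              # unmixing matrix estimate
eta = 1e-3                                   # learning rate (assumed)
I = np.eye(dim)
for x in X:
    y = W @ x
    # Natural-gradient update; the tanh score suits super-Gaussian sources.
    # Processing one sample at a time is what lets the rule track
    # nonstationary inputs, at the price of slower convergence.
    W += eta * (I - np.outer(np.tanh(y), y)) @ W
```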
This paper develops a distributed stochastic subgradient-based support vector machine algorithm for the setting where the training data are distributed across a network. In this situation, all the data are stored in a decentralized fashion and unavailable in full to any agent, and each agent has to make its own update based on its local computation and communication with neighbors. Under mild connectivity conditions,...
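A minimal sketch of the idea, assuming a ring network with Metropolis mixing weights and a hinge-loss subgradient step at each agent (the regularization, step-size schedule, and topology are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy setup: 4 agents on a ring, each holding a private shard of the data.
m, d = 4, 5
X = [rng.standard_normal((50, d)) for _ in range(m)]
true_w = rng.standard_normal(d)
Y = [np.sign(Xi @ true_w + 0.1 * rng.standard_normal(50)) for Xi in X]

# Doubly stochastic mixing matrix for a ring graph (Metropolis weights).
P = np.zeros((m, m))
for i in range(m):
    P[i, i] = 0.5
    P[i, (i - 1) % m] = P[i, (i + 1) % m] = 0.25

W = np.zeros((m, d))                 # each row: one agent's local iterate
lam = 0.01                           # regularization (assumed)
for t in range(1, 2001):
    W = P @ W                        # consensus: average with neighbors
    step = 1.0 / (lam * t)           # diminishing step size
    for i in range(m):
        margins = Y[i] * (X[i] @ W[i])
        viol = margins < 1           # local points violating the margin
        g = lam * W[i]               # subgradient of the regularized hinge loss
        if viol.any():
            g -= (Y[i][viol, None] * X[i][viol]).sum(axis=0) / len(Y[i])
        W[i] -= step * g
```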
This paper investigates an event-triggered distributed cooperative learning (DCL) algorithm using radial basis function networks (RBFNs), where training samples are often extremely large-scale, high-dimensional and located on distributed nodes over strongly connected and weight-balanced networks. The algorithm is based on Zero-Gradient-Sum (ZGS) distributed optimization strategy and works in a fully...
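A rough single-machine simulation of the ZGS idea (each node starts at its own local minimizer, consensus on the RBFN weights is then driven through the inverses of the local Hessians, and nodes broadcast only when their state has drifted past a threshold) might look as follows; the complete-graph topology, trigger threshold, and step size are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(10)

# Toy setting: 4 nodes, each with local samples of a scalar target function,
# cooperatively fit a shared RBF network by zero-gradient-sum (ZGS) consensus.
nodes, centers = 4, np.linspace(-3, 3, 20)
phi = lambda x: np.exp(-0.5 * (x[:, None] - centers[None, :]) ** 2)  # RBF features

Phi, Y, H, W, What = [], [], [], [], []
reg = 1e-3
for i in range(nodes):
    xi = rng.uniform(-3, 3, 40)
    Phi.append(phi(xi)); Y.append(np.sin(xi))
    Hi = Phi[i].T @ Phi[i] + reg * np.eye(len(centers))   # local Hessian
    H.append(Hi)
    W.append(np.linalg.solve(Hi, Phi[i].T @ Y[i]))        # local minimizer (ZGS init)
    What.append(W[i].copy())                              # last broadcast state

# Complete graph stands in for a strongly connected, weight-balanced network.
A = np.ones((nodes, nodes)) - np.eye(nodes)
eps, thresh = 0.05, 1e-3
for t in range(500):
    for i in range(nodes):
        if np.linalg.norm(W[i] - What[i]) > thresh:       # event-triggered broadcast
            What[i] = W[i].copy()
    for i in range(nodes):
        consensus = sum(A[i, j] * (What[j] - What[i]) for j in range(nodes))
        W[i] = W[i] + eps * np.linalg.solve(H[i], consensus)
```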
Feature selection is an important task in machine learning, which aims to reduce the dataset dimensionality while at least maintaining the classification performance. Particle Swarm Optimisation (PSO) has been widely applied to feature selection because of its effectiveness and efficiency. However, since feature selection is a challenging task with a complex search space, PSO easily gets stuck at...
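A minimal binary-PSO sketch of wrapper feature selection, with a sigmoid transfer function mapping velocities to bit-flip probabilities; the nearest-centroid fitness, inertia, and acceleration coefficients below stand in for whatever classifier and settings a real study would use:

```python
import numpy as np

rng = np.random.default_rng(4)

def fitness(mask, X, y):
    """Toy wrapper fitness: nearest-centroid accuracy on the selected features.

    A stand-in for the classifier-based evaluation a real PSO feature
    selector would use; any cross-validated learner could replace it.
    """
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    c0, c1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xs - c1, axis=1) < np.linalg.norm(Xs - c0, axis=1)).astype(int)
    return (pred == y).mean()

# Synthetic data: only the first 3 of 20 features are informative.
n, d = 200, 20
X = rng.standard_normal((n, d))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

swarm, iters = 20, 50
pos = rng.random((swarm, d)) < 0.5               # binary positions = feature masks
vel = np.zeros((swarm, d))
pbest = pos.copy()
pbest_fit = np.array([fitness(p, X, y) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(iters):
    r1, r2 = rng.random((swarm, d)), rng.random((swarm, d))
    vel = (0.7 * vel
           + 1.5 * r1 * (pbest.astype(float) - pos)
           + 1.5 * r2 * (gbest.astype(float) - pos))
    pos = rng.random((swarm, d)) < 1 / (1 + np.exp(-vel))   # sigmoid transfer
    fit = np.array([fitness(p, X, y) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()
```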
In this work we propose an accelerated stochastic learning system for very large-scale applications. Acceleration is achieved by mapping the training algorithm onto massively parallel processors: we demonstrate a parallel, asynchronous GPU implementation of the widely used stochastic coordinate descent/ascent algorithm that can provide up to 35× speed-up over a sequential CPU implementation. In order...
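For reference, the sequential baseline being accelerated can be sketched as LIBLINEAR-style dual coordinate ascent for a linear SVM, where each coordinate (training example) has a closed-form update; the paper's contribution is running many such updates asynchronously on a GPU, which this single-threaded sketch does not attempt:

```python
import numpy as np

rng = np.random.default_rng(5)

def dual_cd_svm(X, y, C=1.0, epochs=10):
    """Sequential stochastic dual coordinate ascent for a linear SVM.

    One coordinate (one training example's dual variable) is optimized at a
    time in closed form while w = sum_i alpha_i y_i x_i is kept up to date.
    Sketch under standard assumptions, not the paper's implementation.
    """
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    qii = (X ** 2).sum(axis=1)            # diagonal of the Gram matrix
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = y[i] * (X[i] @ w) - 1     # gradient of the dual in alpha_i
            a_new = np.clip(alpha[i] - g / qii[i], 0.0, C)
            w += (a_new - alpha[i]) * y[i] * X[i]
            alpha[i] = a_new
    return w

n, d = 500, 20
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))
w = dual_cd_svm(X, y)
print("train accuracy:", (np.sign(X @ w) == y).mean())
```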
The performance of convolutional neural networks is normally improved by increasing the depth, i.e. adding more layers to the network. By doing so, the number of parameters is increased. In this paper, NU-InNet, which was developed from GoogLeNet, is modified by adding more layers to the network in order to improve the accuracy of the network while keeping the number of the parameters...
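The parameter-count trade-off that motivates GoogLeNet-style designs is easy to verify by arithmetic: a 1×1 reduction placed before a 3×3 convolution cuts parameters sharply at the same input/output width. The channel sizes below are illustrative, not NU-InNet's:

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out

# Naive 3x3 convolution from 256 to 256 channels:
naive = conv_params(3, 256, 256)                       # 590,080 parameters
# GoogLeNet-style bottleneck: 1x1 reduction to 64 channels, then 3x3 back to 256:
bottleneck = conv_params(1, 256, 64) + conv_params(3, 64, 256)   # 164,160
print(naive, bottleneck)   # the bottleneck needs roughly a quarter of the parameters
```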
After analyzing the traditional Levenberg-Marquardt algorithm and its quadratic-function properties, and combining it with the Tara formula, we find that it inherits the properties of the graph of a quadratic function and approximates its axis of symmetry. By judging the sign of the second derivative, it accelerates the change of the damping factor and switches between the Gauss-Newton method...
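The damping behavior being tuned here, shrinking the factor toward a Gauss-Newton step when an iteration succeeds and inflating it toward a small gradient step when it fails, is the textbook Levenberg-Marquardt scheme, sketched below with the standard factor-of-10 schedule (the paper's modified schedule is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(6)

def levenberg_marquardt(residual, jacobian, p0, iters=50):
    """Basic Levenberg-Marquardt with adaptive damping.

    Each step solves (J^T J + lam*I) delta = J^T r. Shrinking lam moves the
    step toward Gauss-Newton; growing it moves toward small gradient-descent
    steps, which is the switching behavior the abstract describes.
    """
    p = p0.copy()
    lam = 1e-3
    for _ in range(iters):
        r, J = residual(p), jacobian(p)
        A = J.T @ J + lam * np.eye(len(p))
        delta = np.linalg.solve(A, J.T @ r)
        if np.sum(residual(p - delta) ** 2) < np.sum(r ** 2):
            p, lam = p - delta, lam / 10   # success: trust Gauss-Newton more
        else:
            lam *= 10                      # failure: damp harder
    return p

# Example: fit y = a * exp(b * x) to noisy data.
x = np.linspace(0, 1, 40)
y = 2.0 * np.exp(1.5 * x) + 0.01 * rng.standard_normal(40)
res = lambda p: p[0] * np.exp(p[1] * x) - y
jac = lambda p: np.column_stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)])
print(levenberg_marquardt(res, jac, np.array([1.0, 1.0])))
```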
Multi-label data with high dimensionality arise frequently in data mining and machine learning. Using high-dimensional data directly is not only time-consuming but also computationally unreliable. Supervised dimensionality reduction approaches are based on the assumption that large amounts of labeled data are available, yet it is infeasible to label a large number of training samples in practice...
Computational analysis of transcription factor binding sites (TFBS) is one of the most challenging topics in bioinformatics. A set of TFBS sequences is a type of multiple sequence alignment (MSA). Thus, the hidden Markov model (HMM), as a powerful tool for modeling MSA, has been extensively applied in TFBS analysis. However, at the scale of TFBS problems, training an HMM in a deterministic way is computationally...
Low Rank Matrix Factorization (LRMF) is a classical problem that arises in a wide range of practical contexts, especially in collaborative filtering, dimension reduction, etc. In this paper, a stochastic alternating minimization approach applied to LRMF problem is proposed. The main idea of the approach is to randomly sample partial rows of the matrix to perform parameter update during training using...
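A minimal sketch of the row-sampling idea, alternating ridge least-squares updates computed only on a random subset of rows each iteration (batch size, regularization, and the synthetic data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic low-rank matrix M = U V^T with rank r.
m, n, r = 500, 300, 5
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

U = rng.standard_normal((m, r)) * 0.1
V = rng.standard_normal((n, r)) * 0.1
batch, reg = 50, 1e-3

for it in range(200):
    rows = rng.choice(m, size=batch, replace=False)   # sampled partial rows
    # Update the sampled rows of U by ridge least squares given V:
    G = V.T @ V + reg * np.eye(r)
    U[rows] = np.linalg.solve(G, V.T @ M[rows].T).T
    # Update V using the same sampled rows only:
    Gu = U[rows].T @ U[rows] + reg * np.eye(r)
    V = np.linalg.solve(Gu, U[rows].T @ M[rows]).T

print("relative error:", np.linalg.norm(M - U @ V.T) / np.linalg.norm(M))
```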
Full-batch update and mini-batch update are the two most widely used algorithms in back-propagation (BP) neural networks for coping with the huge training time and computation cost of the learning process. Parallel computing can improve computational efficiency, and both algorithms have been implemented on the MapReduce framework. In this paper, we implement these two algorithms on the Spark framework and evaluate...
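The two update rules differ only in how much data each gradient step sees; on Spark each mini-batch gradient would typically be computed as a map over partitions followed by a reduce. A single-machine sketch of both rules on a logistic-regression stand-in for one BP layer (step sizes and batch size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)

n, d = 10000, 20
X = rng.standard_normal((n, d))
y = (X @ rng.standard_normal(d) > 0).astype(float)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

def grad(w, Xb, yb):
    """Average gradient of the logistic loss over a batch."""
    return Xb.T @ (sigmoid(Xb @ w) - yb) / len(yb)

# Full-batch update: one gradient step per pass over all the data.
w_full = np.zeros(d)
for _ in range(100):
    w_full -= 0.5 * grad(w_full, X, y)

# Mini-batch update: many cheaper steps per pass over shuffled chunks.
w_mini, B = np.zeros(d), 128
for _ in range(10):
    idx = rng.permutation(n)
    for i in range(0, n, B):
        b = idx[i:i + B]
        w_mini -= 0.1 * grad(w_mini, X[b], y[b])
```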
In analysis dictionary learning, the learned dictionary may contain similar atoms, leading to a degenerate dictionary. To address this problem, we propose a novel incoherent analysis dictionary learning algorithm with an ℓ1-norm sparsity term and, simultaneously, a coherence penalty. The whole problem is convex but nonsmooth due to the sparsity regularizer and the coherence penalty. Hence, the...
Sparsity-inducing penalties are useful tools in variational methods for machine learning. In this paper, we propose two block-coordinate descent strategies for learning a sparse multiclass support vector machine. The first one works by selecting a subset of features to be updated at each iteration, while the second one performs the selection among the training samples. These algorithms can be efficiently...
This paper presents a new algorithm called Feature Selection Age Layered Population Structure (FSALPS) for feature subset selection and classification in varied supervised learning tasks. FSALPS is a modification of Hornby's ALPS algorithm, an evolutionary algorithm renowned for avoiding premature convergence on difficult problems. FSALPS uses a novel frequency count system to rank features in the...
A learning process is easily trapped in a local minimum when training multi-layer feed-forward neural networks. An algorithm called Wrong Output Modification (WOM) was proposed to help a learning process escape from local minima, but WOM still cannot totally solve the local minimum problem. Moreover, there is no performance analysis showing that the learning has a higher probability of converging...
The Support Vector Machines (SVMs) dual formulation has a non-separable structure that makes the design of a convergent distributed algorithm a very difficult task. Recently some separable and distributable reformulations of the SVM training problem have been obtained by fixing one primal variable. While this strategy seems effective for some applications, in certain cases it could be weak since it...
A bidirectional blind equalization scheme based on the constant modulus algorithm (CMA) and a subspace-based algorithm (SBA) is proposed in this paper. Without any training sequence or channel estimation, blind equalization improves transmission efficiency significantly in underwater acoustic communications. The combining scheme, in which two outputs run in opposite directions, exploits the diversity and...
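The CMA half of such a scheme is standard and compact: an FIR equalizer adapted, with no training sequence, by penalizing the deviation of the output modulus from a constant. The tap count, step size, and toy channel below are illustrative; the subspace-based branch and the bidirectional combining are not sketched:

```python
import numpy as np

rng = np.random.default_rng(9)

def cma_equalize(x, taps=11, mu=1e-3, R2=1.0):
    """Constant modulus algorithm: adapt an FIR equalizer blindly by
    penalizing deviation of |y|^2 from the constant modulus R2.
    Center-spike initialization; parameters are illustrative.
    """
    w = np.zeros(taps, dtype=complex)
    w[taps // 2] = 1.0
    y = np.zeros(len(x), dtype=complex)
    for n in range(taps, len(x)):
        u = x[n - taps:n][::-1]               # regressor (most recent first)
        y[n] = w @ u
        e = y[n] * (R2 - abs(y[n]) ** 2)      # CMA error term
        w += mu * e * np.conj(u)              # stochastic gradient update
    return y, w

# QPSK symbols through a toy multipath channel plus noise:
sym = (rng.choice([-1, 1], 4000) + 1j * rng.choice([-1, 1], 4000)) / np.sqrt(2)
chan = np.array([1.0, 0.4 + 0.3j, 0.2])
rx = (np.convolve(sym, chan)[:len(sym)]
      + 0.01 * (rng.standard_normal(4000) + 1j * rng.standard_normal(4000)))
y, w = cma_equalize(rx)
```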