The approximate dynamic programming method combines neural networks, reinforcement learning, and the idea of dynamic programming. It is an online control method based on actual data rather than a precise mathematical model of the system. The method is suitable for the optimal control of nonlinear systems and can avoid the curse of dimensionality. It can effectively solve...
This paper addresses dynamic pricing in a duopoly in electronic retail markets. Combined with the concept of performance potential, simulated annealing Q-learning (SA-Q) and the win-or-learn-fast policy hill climbing algorithm (WoLF-PHC) are used to solve the learning problems of multi-agent systems with either average- or discounted-reward criteria, under the case that...
Combining neural networks with genetic algorithms to solve technical and practical problems is a research trend. Because a genetic algorithm alone converges slowly and easily falls into local optima, this paper presents a hybrid genetic and simulated annealing algorithm that searches the neighborhood using chaos variables. The paper also trains a neural network using the single...
Production scheduling is critical for manufacturing systems. Dispatching rules are usually applied dynamically to schedule jobs in a dynamic job shop. The paper presents an adaptive iterative scheduling algorithm that operates dynamically to schedule jobs in the dynamic job shop. To obtain adaptive behavior, the reinforcement learning system is implemented with phased Q-learning by defining...
To address the large image storage required in synergetic recognition, this paper proposes an improved synergetic recognition method based on manifold learning and discusses the advantages and feasibility of combining the two methods. The geometrical structure in high dimensions can be well maintained by using manifold learning to reduce dimensions while it is somehow...
The hierarchical finite state machine (HFSM) has proven to be a powerful tool for controlling non-player characters (NPCs) in computer games due to its flexibility and modularity. In most implementations, however, the control details at all levels are hand-coded. As a result, the development process is often time-intensive and error-prone. In this paper, we explore the use of...
To address the poor convergence of neural networks used to generalize reinforcement learning, the neural and case-based Q-learning (NCQL) algorithm is proposed. The basic principle of NCQL is that reinforcement learning is generalized by a neural network (NN), while convergence and learning efficiency are improved by cases. The detail elements of the learning algorithm are fulfilled...
In hierarchical reinforcement learning (HRL), the main methods include HAMs, options, and MAXQ. A main problem with HAMs is the joint state space, consisting of the cross-product of the machine states in the HAM and the states in the original MDP, which cannot be completely solved by a subroutine-based state abstraction method. This paper analyzes this problem in detail, provides formal definitions of homomorphism...
The Distribution Static Compensator (DSTATCOM) is a shunt compensation device generally used to solve power quality problems in distribution systems. In distribution power systems, these power quality problems arise mainly from pulsed loads, which degrade the performance of the entire system. The control strategy of the DSTATCOM plays an important role in meeting the objectives. A novel...
In this paper, piecewise linearization and piecewise variable slope are used to train the detecting methods of non-linear analogue based on variable threshold neurons. The relationship between the two training methods is given. The piecewise variable slope training method can calibrate the detecting curve obtained from the piecewise linearization training method at the sampled-data feature, which improves...
To identify chatter gestation, a new method based on HMM-SVM is proposed for the dynamic patterns of chatter gestation in the cutting process. First, FFT features are extracted from the model signal of the cutting process; then the FFT vectors are fed to an HMM-SVM (hidden Markov model-support vector machine) for machine learning and classification. The vibration signal of cutting...
Because annotating a large-vocabulary Tibetan language corpus is time-consuming and costly, it is not suitable to directly adopt traditional automatic speech recognition (ASR) methods such as the Hidden Markov Model (HMM), Dynamic Bayesian Networks (DBN), or Artificial Neural Networks (ANN). Active learning can reduce annotation cost through sample selection. This paper proposes a new method to...
An intelligent tool wear measurement method based on discrete hidden Markov models (DHMM) is proposed to monitor tool wear and to predict tool failure. FFT features are first extracted from the vibration signal and cutting force in the cutting process, and then the FFT vectors are presorted and converted into integers by SOM. Finally, these codes are fed to the DHMM for machine learning and 3 models for...
To achieve comprehensive evaluation of faults occurring in maglev trains, and to address the difficulty of establishing the evaluation weight matrix and subjection matrix parameters, a faint comprehensive evaluation method based on an ensemble learning algorithm is proposed. First, the structure of the suspension system of the maglev train is analyzed and a fault diagnosis model is built. Then ensemble...
With the development and wide use of the Internet and information technology, the Web has become one of the most important means for people to obtain information. Based on Web document classification and the theory of artificial neural networks, a Web classification mining method based on a fuzzy neural network (FNN) is presented in this paper. We construct the structure of fuzzy neural network...
The theory of learning automata has already been applied to single-agent, single-stage reinforcement learning. This paper proposes a multi-robot cooperative Q-learning algorithm based on learning automata. Each robot constantly updates its action-selection probabilities through the learning automata and then converts the probability to special experience. Robots can accelerate...
Bipedal locomotion is one of the most challenging problems in control, artificial intelligence, mechanics, and other related fields. This article presents a model-free approach with emphasis on making the robot's walking more stable and faster. To this end, we use particle swarm optimization (PSO) to optimize the signals produced by truncated Fourier series (TFS), which control the joints' angles....