In this research we present a new framework and associated algorithms for mining high-speed data streams that take advantage of concept recurrence. Unlike previous work, our approach detects volatility in a stream and then matches the learning paradigm to the degree of volatility. In high volatility stream segments a decision forest is used as the learning mechanism whereas in low volatility...
In recent years, many new algorithms for data stream classification have been developed. The occurrence of different concept drift types in data streams has turned out to be especially challenging. Much attention has been paid to ensemble methods because of their desirable properties. However, the problem of deciding how many components should be stored in the ensemble is still an open issue. Therefore in...
This study brings together systematised views of two related areas: data editing for the nearest neighbour classifier and adaptive learning in the presence of concept drift. The growing number of studies in the intersection of these areas warrants a closer look. We revise and update the taxonomies of the two areas proposed in the literature and argue that they are not sufficiently discriminative with...
An important problem that remains in online data mining systems is how to accurately and efficiently detect changes in the underlying distribution of large data streams. The challenge for change detection methods is to maximise the accumulative effect of changing regions with unknown distribution, while at the same time providing sufficient information to describe the nature of the changes. In this...
Outliers are observations that lie far away from the fitting function deduced from the bulk of a set of observations. Outlier detection becomes more challenging when the data are subject to concept drift. To address this issue, this study explores a decision support mechanism (DSM) for coping with the outlier detection problem in a concept-drifting environment...
Incrementally learning from large volumes of streaming data over time is a problem that is of crucial importance to the computational intelligence community, especially in scenarios where it is impractical or simply unfeasible to store all historical data. Learning becomes a particularly challenging problem when the probabilistic properties of the data are changing with time (i.e., gradual, abrupt,...
A recent paper and patent claim to have found a method of finding a threshold logic solution for all linearly separable Boolean functions. Although the method appears to work, one step of the method has not previously been proven. This paper gives a proof that the method does work.
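To make the term "threshold logic solution" concrete, the sketch below checks whether a given set of weights and a threshold reproduce a Boolean function on every input. It is only an illustration of the definition, not the method discussed in the abstract; the helper name `realizes` and the example gates are assumptions introduced here.

```python
from itertools import product


def realizes(weights, threshold, func):
    """Return True if the gate 'sum(w * x) >= threshold' matches func on every input.

    `realizes` is a hypothetical helper used for illustration only.
    """
    n = len(weights)
    for bits in product((0, 1), repeat=n):
        # A threshold gate fires when the weighted sum reaches the threshold.
        fires = sum(w * x for w, x in zip(weights, bits)) >= threshold
        if fires != bool(func(*bits)):
            return False
    return True


# AND and OR are linearly separable, so simple weights realize them.
print(realizes((1, 1), 2, lambda a, b: a and b))
print(realizes((1, 1), 1, lambda a, b: a or b))
# The same OR weights do not realize XOR.
print(realizes((1, 1), 1, lambda a, b: a ^ b))
```

The exhaustive check is exponential in the number of inputs, so it is only practical for small functions.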
Feedforward neural networks are neural networks with (possibly) multiple layers of neurons such that each layer is fully connected to the next one. They have been widely studied in the past partially due to their universal approximation capabilities and empirical effectiveness on a variety of application domains for both regression and classification tasks. In this paper, we provide an overview on...
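As a minimal sketch of the layer-to-layer structure described above, the following forward pass sends an input through a stack of fully connected layers. The `tanh` hidden activation, the linear output layer, and the name `forward` are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def forward(x, layers):
    """Propagate x through fully connected layers given as (W, b) pairs.

    Hidden layers use tanh and the last layer is linear; both are
    illustrative assumptions.
    """
    a = np.asarray(x, dtype=float)
    for i, (W, b) in enumerate(layers):
        z = W @ a + b  # every neuron sees every output of the previous layer
        a = np.tanh(z) if i < len(layers) - 1 else z
    return a


# Hypothetical 2-2-1 network.
layers = [
    (np.eye(2), np.zeros(2)),
    (np.array([[1.0, 1.0]]), np.array([0.5])),
]
print(forward([0.0, 0.0], layers))
```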
A conventional weight in an artificial neural network has a single trainable real value and produces a linear relationship between the weight input and the weight output. A real synaptic cleft is also trainable but provides a more complex relationship. It is natural to wonder whether adding extra complexity to the conventional weight response would lead to more capable networks. This work describes a weight...
An efficient algorithm for the calculation of the approximate Hessian matrix for the Levenberg-Marquardt (LM) optimization algorithm for training a single-hidden-layer feedforward network with linear outputs is presented. The algorithm avoids explicit calculation of the Jacobian matrix and computes the gradient vector and approximate Hessian matrix directly. It requires approximately 1/N the floating...
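One generic way to obtain the gradient and the approximate Hessian without storing the full Jacobian is to accumulate them one Jacobian row at a time. The sketch below shows that idea only; it is not the algorithm of the abstract, whose details are not given here. The function names, the error convention (prediction minus target), and the damping parameter `mu` are assumptions.

```python
import numpy as np


def accumulate_lm_terms(rows_and_errors, n_params):
    """Accumulate H = J^T J and g = J^T e from (jacobian_row, error) pairs.

    Only one row of the Jacobian is held in memory at a time.
    """
    H = np.zeros((n_params, n_params))
    g = np.zeros(n_params)
    for j_row, err in rows_and_errors:
        j_row = np.asarray(j_row, dtype=float)
        H += np.outer(j_row, j_row)  # this sample's contribution to J^T J
        g += j_row * err             # this sample's contribution to J^T e
    return H, g


def lm_step(H, g, mu):
    """Levenberg-Marquardt update: solve (H + mu * I) step = -g."""
    return -np.linalg.solve(H + mu * np.eye(len(g)), g)


# Hypothetical three-sample, two-parameter problem.
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0)]
H, g = accumulate_lm_terms(samples, n_params=2)
print(lm_step(H, g, mu=0.1))
```

Accumulating per sample trades the memory needed for the full Jacobian for one outer product per row.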
RAM-based neural networks, despite their speed and hardware implementability, have a strong drawback in representing continuous input variables that affects their performance. Usually these networks require value discretization followed by binary encoding to access the binary memory addresses where the learned content is stored. Besides being a cumbersome process, binary encoding does not usually preserve...
We view the neural network as a non-linear probability density function (pdf) transformer trained using the stochastic learning cumulative (SLC) technique. We formulate a potential function that drives a neural network to non-linearly transform the input pdf to the desired pdf. We demonstrate the algorithm using synthetic data drawn from three different pdfs and estimate the parameters of the distributions...
Clustering analysis has been widely used in many areas such as astronomy, bioinformatics, and pattern recognition. In 2014, Rodriguez proposed an algorithm based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher density. But the density relies on cutoff distance, which might be affected by large statistical...
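The two quantities mentioned above, a local density that depends on a cutoff distance and the distance to the nearest point of higher density, can be sketched as follows. This is a minimal version with a hard cutoff and Euclidean distance; the function name `density_peaks` and the sample points are assumptions, and it does not include the modification the truncated abstract goes on to propose.

```python
import numpy as np


def density_peaks(points, cutoff):
    """Return (rho, delta) per point using a hard cutoff kernel.

    rho[i]   = number of other points closer than `cutoff`
    delta[i] = distance to the nearest point with higher rho, or the
               largest distance from i when no point is denser
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    rho = (dist < cutoff).sum(axis=1) - 1     # subtract 1 for the point itself
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta


# Hypothetical data: three nearby points and one isolated point.
rho, delta = density_peaks([[0, 0], [1, 0], [2, 0], [10, 0]], cutoff=1.5)
print(rho, delta)
```

Points with both large `rho` and large `delta` are the candidate cluster centers. Changing `cutoff` changes `rho` directly, which is the sensitivity the abstract points to.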
Coupled Tensor Factorization (CTF) has become one of the most popular methods for joint analysis of high dimensional data generated from multiple sources. The goal of CTF is to factorize correlated datasets into latent factors efficiently. This research was undertaken with the particular goal of improving the accuracy of CTF. It is important to optimize the factorization of each single tensor of the coupled...
An enterprise social network (ESN) involves diverse user groups, from producers, suppliers, and logistics to end consumers; users have different scales, broad interests, and various objectives, such as advertising, branding, and customer relationship management. In addition, such a highly diversified network also features rich content, including recruiting messages, advertisements, news...
Community-based Question Answering (CQA) sites have become popular since they allow users to get answers to complex, detailed and personal questions from other users directly. However, since answering a question depends on the ability and willingness of other users to address the askers' real needs, a significant fraction of the questions remain unanswered. To decrease the unanswered question rate...
For recent or planned deep astronomical surveys, it is important to tell stars and galaxies apart, a task known as the Star/Galaxy Separation Problem (SGSP). At faint magnitudes, the separation between point-like and extended sources is fuzzy, which makes SGSP a hard task. This problem is even harder for large surveys like the Dark Energy Survey (DES) and, in the near future, the Large Synoptic Survey Telescope...
Surveys are used by hospitals to evaluate patient satisfaction and to improve operations. Collected satisfaction data are usually presented to the hospital administration using statistical charts and graphs. Although these statistics and visualizations are helpful, the size and dimensionality of the dataset make it very difficult, if not impossible, to identify important factors that could...
In this paper, a Modular Neural Network (MNN) with a granular optimization approach is proposed, in which a firefly optimization is used to design an optimal MNN architecture. The proposed method can optimize parameters such as the number of sub-modules, the percentage of information used for the training phase, and the number of hidden layers (with their respective numbers of neurons) for each...