In recent years, the deep web has become an extremely popular mode of data dissemination. A key characteristic of deep web data sources is that their data can only be accessed through the limited query interfaces they support. This paper develops a methodology for mining the deep web. Because these data sources cannot be accessed directly, data mining must be performed based on sampling...
Effectively utilizing readily available auxiliary data to improve predictive performance on new modeling tasks is a key problem in data mining. The goal of this research is to transfer knowledge between sources of data, particularly when ground truth information for the new modeling task is scarce or expensive to collect, so that leveraging auxiliary sources of data becomes a necessity. Towards...
Quantification is the name given to a novel machine learning task that deals with correctly estimating the number of elements of one class in a set of examples. The output of a quantifier is a real value. Since the training instances are the same as in a classification problem, a natural approach is to train a classifier and derive a quantifier from it. Some previous works have shown that just classifying...
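The idea of deriving a quantifier from a classifier can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, and the "adjusted count" correction shown here is one well-known way to fix the bias of plain counting, assuming the classifier's true-positive and false-positive rates have been estimated elsewhere (for example by cross-validation).

```python
def classify_and_count(predictions):
    """Estimate class prevalence as the fraction of items predicted positive.

    `predictions` is a list of 0/1 labels produced by some classifier
    (hypothetical input; any binary classifier would do).
    """
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the raw count using the classifier's error rates.

    Assumption: `tpr` and `fpr` were estimated on held-out data.
    The observed positive rate satisfies cc = p * tpr + (1 - p) * fpr,
    so the true prevalence is p = (cc - fpr) / (tpr - fpr).
    """
    cc = classify_and_count(predictions)
    if tpr == fpr:
        # The classifier carries no information; fall back to the raw count.
        return cc
    p = (cc - fpr) / (tpr - fpr)
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, p))


if __name__ == "__main__":
    preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    print(classify_and_count(preds))
    print(adjusted_count(preds, tpr=0.8, fpr=0.1))
```

Multiplying the estimated prevalence by the size of the set gives the estimated number of elements of the class.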
Uncontrolled project investment attracts more and more public attention. The inaccuracy of cost estimation is one of the main reasons that government investment gets out of control. Cost estimation is affected by many uncertain factors whose relationships are nonlinear, which makes it hard to handle with traditional models. This paper brings forward a model based on rough set and neural...
Although many weighting modes can be used in the Shell-Neighbor Imputation (SNI) algorithm, this paper studies three typical weighting strategies. To best capture imputation efficiency, a new metric, called goodness, is proposed for evaluating imputation algorithms. We conduct experiments to examine the proposed approach, and demonstrate that (1) distance-frequency-weighting...
Feature selection plays an important role in machine learning. Class labels are often used as the supervision information in supervised feature selection algorithms, while constraints are rarely used. An effective feature selection algorithm based on pairwise constraints, called Constraints Score, was therefore proposed. However, its performance is still limited because it neglects the correlation between features...
In a data-mining approach, a model for estimation of aerosol optical depth (AOD) from satellite observations is learned using collocated satellite and ground-based observations. For accurate learning of such a spatio-temporal model, it is important to collect ground-based data from a large number of sites. The objective of this project is to determine appropriate locations for the next set of ground-based...
Target intention inference is an important aspect of situation assessment. The evidence system for target intention inference is discussed in terms of the independence relationship between the targets' intention and the input evidence. A probabilistic inference model of target intention is proposed based on a static Bayesian network. In order to expand the application domain and simplify the parameter learning...
Traditional machine learning algorithms assume that data are exact or precise. However, this assumption may not hold in some situations because of data uncertainty arising from measurement errors, data staleness, and repeated measurements, etc. With uncertainty, the value of each data item is represented by a probability distribution function (pdf). In this paper, we propose a novel naive Bayes classification...
Representation learning is a fundamental challenge for feature selection and plays an important role in applications such as dimension reduction, data mining and object recognition. Traditional linear representation methods, such as principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA), have good performance on certain applications based on...
A central problem in reinforcement learning is how to deal with large state and action spaces. When the problem domain presents intrinsic symmetries, exploiting them can be key to achieving good performance. We analyze the gains that can be effectively achieved by exploiting different kinds of symmetries, and the effect of combining them, in a test case: the stand-up and stabilization of an inverted...
We recently proposed EDA-RL, estimation of distribution algorithms for reinforcement learning problems. EDA-RL improves policies using the EDA scheme: first, select the better episodes; second, estimate probabilistic models, i.e., policies; and finally, interact with the environment to generate new episodes. In this paper, the EDA-RL is extended for multi-objective reinforcement learning...
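The three-step loop described in this abstract (select, estimate, interact) can be sketched on a toy problem. Everything below is an illustrative assumption rather than the authors' implementation: the environment has a single state and two actions, action 1 earns reward 1 and action 0 earns nothing, and the policy is a single probability of choosing action 1. The function names and parameters are hypothetical.

```python
import random


def run_episode(policy, rng, length=5):
    """Interact with the toy environment for one episode.

    `policy` is the probability of choosing action 1. Action 1 earns
    reward 1 and action 0 earns reward 0, so the return is the number
    of times action 1 was chosen.
    """
    actions = [1 if rng.random() < policy else 0 for _ in range(length)]
    return actions, sum(actions)


def eda_rl(iterations=10, n_episodes=50, n_selected=10, seed=0):
    """Toy version of the select / estimate / interact loop."""
    rng = random.Random(seed)
    policy = 0.5  # start from a uniform policy over the two actions
    episodes = [run_episode(policy, rng) for _ in range(n_episodes)]
    for _ in range(iterations):
        # Step 1: select the episodes with the highest return.
        best = sorted(episodes, key=lambda e: e[1], reverse=True)[:n_selected]
        # Step 2: estimate a probabilistic model (the policy) from them,
        # here action frequencies with Laplace smoothing.
        chosen = [a for actions, _ in best for a in actions]
        policy = (sum(chosen) + 1) / (len(chosen) + 2)
        # Step 3: interact with the environment to generate new episodes.
        episodes = [run_episode(policy, rng) for _ in range(n_episodes)]
    return policy


if __name__ == "__main__":
    print(eda_rl())
```

In this sketch the estimated probability of the rewarded action rises towards 1 over the iterations, because each new policy is fitted only to the best episodes of the previous round.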
Naive Bayes classifiers are known for their high efficiency and good classification accuracy, and they have been widely used in many domains. However, these classifiers need complete data, while missing data are widespread in practice. To address this, this paper builds methods for learning a naive Bayes classifier and for classification with missing data. Compared with...
Using the Chebyshev nodes and methods from the references, we establish an estimate of the covering number in learning theory for reproducing kernel Hilbert spaces. A counterexample is presented concerning the estimation of the covering number of Gaussian kernel functions.
In this paper we propose a new approach to estimating the ratio of two probability density functions. The proposed approach is inspired by the kernel-based function approximation technique. We apply this estimator to derive an estimator of mutual information and show that it can be successfully used to detect dependence between two random variables.
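To make the density-ratio task concrete, here is a naive baseline sketch. It is not the estimator proposed in the paper: it simply divides two one-dimensional Gaussian kernel density estimates, and the function names and the bandwidth value are assumptions chosen for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def density_ratio(numer_samples, denom_samples, bandwidth=0.5):
    """Naive plug-in estimate of p(x) / q(x) from two sample sets.

    Assumption: both densities are one-dimensional and share a bandwidth.
    """
    p = gaussian_kde(numer_samples, bandwidth)
    q = gaussian_kde(denom_samples, bandwidth)

    def ratio(x):
        qx = q(x)
        if qx == 0.0:
            raise ZeroDivisionError("denominator density is zero at x")
        return p(x) / qx

    return ratio


if __name__ == "__main__":
    r = density_ratio([-0.5, 0.0, 0.5], [1.5, 2.0, 2.5])
    print(r(0.0), r(2.0))
```

Dividing two separately estimated densities tends to be unstable where the denominator is small, which is one common motivation for estimating the ratio directly.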
It is well known that spatial perception is a basic ability in daily life, and that we routinely compute spatial relationships between objects. This study examined how people perceive spatial categories using three tasks: a learning task, a producing task, and a rating task. Three different kinds of spatial configurations were manipulated. Twenty-seven subjects were assigned randomly to each kind of spatial...
This paper presents an algorithm based on supervised machine learning and multiple keyframes to achieve markerless augmented reality (AR) applications when there is a locally planar object in the scene. The main goal is to solve the problem of AR tracking in outdoor environments using only vision and natural features. Instead of tracking fiducial markers, we track natural keypoints, during...
The detection of the iris boundaries is considered in the literature to be one of the most critical steps in the identification task of iris recognition systems. In this paper we present an iterative approach to detecting the iris center and boundaries using neural networks. The proposed algorithm starts from a random initial point in the input image, then processes a set of local image...
In the present paper, we treat the problem of learning averages from a set of symmetric positive-definite matrices (SPDMs). We discuss a possible learning technique based on the differential-geometric properties of the SPDM manifold, which was recently shown to possess a Lie-group structure under an appropriate group definition. We first recall some relevant notions from differential geometry, mainly...
This paper proposes a novel feature ranking method, DensityRank, based on kernel estimation on the feature spaces to improve the classification performance. As the availability of raw data in many of today's applications continues to grow at an explosive rate, it is critical to assess the learning capabilities of different features and select the important subset of features to improve learning accuracy...