An adaptive k-nearest neighbor algorithm (AdaNN) is proposed in this paper to overcome a limitation of the traditional k-nearest neighbor algorithm (kNN), which usually identifies the same number of nearest neighbors for each test example. It is known that the value of k has a crucial influence on the performance of the kNN algorithm, and our improved kNN algorithm focuses on finding out the...
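For reference, the fixed-k baseline that this abstract contrasts itself with can be sketched as below. This is plain kNN with majority voting, not the AdaNN method (whose details are cut off in the listing); the function name `knn_predict` and the toy data are illustrative assumptions.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    # The same k is used for every query, which is the limitation the abstract names.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two points of class "a", three of class "b".
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = ["a", "a", "b", "b", "b"]
```

On this toy set, a query near the origin is labeled "a" with `k=1` but "b" with `k=5`, since the larger neighborhood is dominated by the majority class. That sensitivity to k is what motivates choosing k per test example.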
In machine learning classification, the classifier can be described by some rules, and the rules can be expressed by fuzzy granules corresponding to fuzzy concepts. In this paper we introduce fuzzy information granulation into the process of building a fuzzy classifier. Furthermore, we present a machine learning classification algorithm based on optimized information granulation. Experiments carried...
To reduce the computational complexity of local map matching in hierarchical simultaneous localization and mapping (SLAM), a novel approach based on self-organizing fuzzy neural networks (SOFNN) is proposed in this paper. The matching component for local maps in the hierarchical SLAM is realized by an SOFNN. A subset of signature elements included in a local map was chosen by a clustering...
An assessment method for water shortage risk based on neural network classificatory of fuzzy sets is presented in this paper. Risk rate, weakness, possibility of recovery, period of recurrence and risk level are defined as the indexes for water shortage risk assessment of regional resources. The suggested model is used to evaluate the water shortage risk of the Zhanghe irrigation region in Hubei Province in...
The index system for the environmental impact assessment of a tailings pond has a hierarchical structure. Because there is inherent uncertainty in determining the evaluation grade of low-level indicators, and the relationship between the upper-level and low-level indicators is non-linear, the evaluation model has to deal with the uncertainty of information and achieve the non-linear conversion...
Learning from imbalanced data sets presents a new challenge to the machine learning community, as traditional methods are biased toward majority classes and produce poor detection rates for minority classes. This paper presents a new approach, namely a fuzzy-rough k-nearest neighbor algorithm for learning from imbalanced data sets, to improve the classification performance on the minority class. The approach defines fuzzy...
This study utilizes a fuzzy message requirement classifiers system (FMRCS) that integrates both learning and inference into the learning of computer troubleshooting ability, and adopts a problem-solving teaching strategy. The main purpose of this study is to give learners a clear direction when they face computer problems. Consequently, learners can be based on FMRCS with...
Annual runoff forecasting is very important for improvement of the management performance of water resources: high accuracy in runoff prediction can lead to more effective use of water resources. The purpose of this study is to apply the adaptive network based fuzzy inference system (ANFIS) model to forecast annual runoff of Yamadu hydrological station in Xinjiang Province, China. The subtractive...
Text Categorization (TC) is an important component in many information organization and information management tasks. In many TC applications, the case-base grows at a fast rate, and this makes the case retrieval process inefficient. Using Case-Base Maintenance learning via the GC (Generalization Capability) algorithm, which can reduce the number of cases passed to the KNN algorithm, can improve efficiency...
In the field of imbalance learning and cost sensitive learning, minimization of the classification error rate is not an appropriate approach due to class skew and cost distributions. Thus the area under the ROC Curve (AUC) has been widely utilized to assess the performance of the classifiers in such cases. The Maximum AUC Linear Classifier (MALC), aiming at maximizing AUC directly, is a nonparametric...
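As a reminder of what the AUC measures, it equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by direct pairwise comparison; it is not the MALC method from the abstract, and the function name `auc` and the 0/1 label encoding are assumptions made for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A positive scored above a negative counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])` gives 0.75, because three of the four positive-negative pairs are ranked correctly. Since the value depends only on the ranking within such pairs, it is not affected by how skewed the class proportions are, unlike the error rate.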
The imbalanced data set has been reported to hinder the classification performance of many machine learning algorithms on both accuracy and speed. But extremely imbalanced data sets (3~5% positive samples) are common for many applications, such as multimedia semantic classification. In this paper, we propose a novel algorithm to automatically remove samples that have no or negative effects on classifier...
K-means clustering is a popular conventional clustering algorithm. As it does not use the structure information of data sets, the clustering result is sometimes unsatisfactory. Manifold learning algorithms can reveal the low-dimensional geometric structure of the data sets. In this paper, we combine the K-means clustering algorithm with manifold learning algorithms into a coherent framework. We show...
Hepatitis patients need continuous special medical treatment to reduce the mortality rate. Using clinical test data and machine learning technology such as Support Vector Machines (SVM), their life prognosis can be classified and predicted. However, we cannot guarantee that all the feature values in the data are correlated with each other. Therefore, we incorporate...
A sample and class incremental learning algorithm based on the hyper-sphere support vector machine is proposed. For every class, a hyper-sphere support vector machine is used to find the smallest hyper-sphere that contains most samples of the class, which separates the samples of that class from the others. In the process of incremental learning, the hyper-spheres of every new class are trained, and the history hyper-spheres...
Name disambiguation has received considerable attention as an important subtask of NLP (Natural Language Processing). Given many potential references to person entities, the goal is to find, for each reference in the context, the person entity it most likely refers to. However, much research in this field either focuses on name disambiguation within a single text or employs machine learning...
Most previous research on sentiment analysis concentrates on the binary distinction of positive vs. negative. This paper presents the multi-class sentiment classification problem, which attempts to mine the implied rating information from reviews. We use four machine learning methods and two feature selection methods to find out whether or not the multi-class sentiment classification problem...
Entity extraction involves multiple factors, and each factor affects the answer to a different degree. This paper presents a machine learning approach to parameter learning for entity answers. First, in view of the characteristics of the Question Answering (QA) system, we define three elements that influence answer extraction, namely the text score, passage score and entity score, and also give...
Traditional machine learning and data mining algorithms mainly assume that the training and test data are in the same feature space and follow the same distribution. However, in real applications these two assumptions often do not hold, so traditional algorithms are no longer applicable. As a new learning framework, transfer learning can solve this problem effectively. This paper focuses...
Feature selection is a crucial step in the supervised learning process. Traditional feature selection methods based on mutual information cannot directly handle feature sets that mix continuous and categorical features, and cannot dynamically eliminate redundant features during the feature selection process. Resorting to mutual information, a hybrid feature selection method named PGFB is proposed...