Traditional research into the arts has almost always been based around the subjective judgment of human critics. The use of data mining tools to understand art has great promise as it is objective and operational. We investigate the distribution of music from around the world: geographical ethnomusicology. We cast the problem as training a machine learning program to predict the geographical origin...
Credit risk analysis plays an important role in the financial market. In this paper, a discriminative restricted Boltzmann machine (RBM) is used for credit risk classification. An RBM is a generative model associated with an undirected graph, which can capture complicated features from observed data, and after introducing a discriminative component into the RBM, it can be used to train a non-linear classifier...
Effective machine-learning handles large datasets efficiently. One key feature of handling large data is the use of databases such as MySQL. The freeware fuzzy decision tree induction tool, FDT, is a scalable supervised-classification software tool implementing fuzzy decision trees. It is based on an optimized fuzzy ID3 (FID3) algorithm. FDT 2.0 improves upon FDT 1.0 by bridging the gap between data...
Rules inferring membership in single decision classes have been induced in rough set approaches and used to build a classifier system. Rules inferring membership in unions of multiple decision classes can also be induced in the same manner. In this paper, we show that a classifier system with rules about unions of multiple decision classes has an advantage in classification accuracy...
Recently, semi-supervised sparse feature selection, which can exploit large amounts of unlabeled data and small amounts of labeled data simultaneously, has played an important role in web image annotation. However, most semi-supervised feature selection methods are developed for single-view data, and cannot reveal and leverage the correlated and complementary information between different views...
The security of cloud networks is heavily contingent upon their ability to detect incoming attacks. An Intrusion Detection System (IDS) monitors a network for precisely this purpose. IDSs fall into one of two categories: signature-based and anomaly-based IDSs. Whereas signature-based IDSs rely upon pre-programmed matching rules designed by security experts and are therefore limited to pre-existing...
State-of-the-art phrase-based machine translation (MT) systems usually demand large parallel corpora for training. The quality and quantity of the training data directly influence the performance of such translation systems. The lack of open-source bilingual corpora for a particular language pair results in lower translation scores reported for that pair. This is...
Based on the incremental nature of knowledge learning, this study proposes a growing self-organizing neural network approach for modeling the acquisition of semantic features. The Growing Self-Organizing Map (GSOM) algorithm is extended and applied to the problem of language acquisition. Based on that algorithm, experiments are conducted using a corpus of Standard German children's books...
People use both their speech and their body when they communicate face to face; human communication is thus multimodal. The development of multimodal coginfocom systems requires models of the relation between the various modalities, but many studies have shown that multimodal behaviours depend on numerous factors, including the culture, the setting and the communicative situation. Thus, annotated...
In this paper, we present our solution and experimental results for applying semi-supervised machine learning techniques and an improved SVM algorithm to build text classification applications. First, we create a feature model based on labeled data, and then we improve it using the unlabeled data. The technique for assigning a label to new data is based on...
Because the standard label propagation algorithm does not use the correct posterior probability of each iteration, and does not distinguish the propagation information of labeled data from that of unlabeled data during the label propagation process, this paper proposes a multi-level label propagation algorithm based on data reconstruction. It adds the data that is correctly labeled in each iteration into the...
The Albayzin 2012 language recognition evaluation (LRE) is one of the most challenging language recognition evaluations, mainly because: (1) the target languages are more confusable with other languages, which might push down system performance; (2) development and test data are heterogeneous regarding duration, number of speakers, ambient noise/music, channel conditions, etc.; (3) signals...
As with neural networks, improving the generalization of wavelet neural networks is an important issue, since a given network may have good approximation accuracy but perform poorly on unseen data. Generally, different techniques, including regularization, can be used to improve generalization. In this paper, two new regularization techniques, applied to radial wavelet neural networks,...
This paper presents a hybrid model that combines conditional random fields (CRFs) with dynamic gazetteers (DGs) for the task of Chinese named entity recognition (NER). Gazetteers were widely used in previous work on NER, but they were all static and could not adapt to new domains and new out-of-vocabulary named entities (OOVNEs). In this work, we build and maintain...
Chinese is a graceful language, not only because of its hieroglyphic writing but also because it is flexible and can describe things and human sensations visually. This is due to the polyphones that exist in Chinese. However, polyphones also cause difficulties in Chinese speech synthesis, especially in the conversion between characters and pronunciation. Polyphone is a large...
Electronic learning (e-Learning) is used to educate people these days. Using e-Learning, a number of world-ranking universities are starting courses from high school level to degree level, and even at postgraduate level, through distance learning. This paper describes the best-known machine learning techniques for improving the e-Learning education standard and model. Comprehensively...
User generated content on Twitter (produced at an enormous rate of 340 million tweets per day) provides a rich source for gleaning people's emotions, which is necessary for deeper understanding of people's behaviors and actions. Extant studies on emotion identification lack comprehensive coverage of "emotional situations" because they use relatively small training datasets. To overcome this...
To predict the continuous value of a target variable from the values of explanatory variables, we often use multiple linear regression methods, and many successful applications have been reported. However, for some data, multiple linear regression methods may not work because of a strong local dependency of the target variable on the explanatory variables. In such cases, the use of the k nearest-neighbor...
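The contrast drawn in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the toy data, the single explanatory variable, and the helper names `fit_line` and `knn_predict` are all assumptions made for illustration. The target depends on x only within a narrow region, so a global least-squares line misses it, while an average over the nearest neighbors follows it.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single explanatory variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def knn_predict(xs, ys, query, k=3):
    """Average the targets of the k training points closest to the query."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical toy data: the target is non-zero only near x = 5.
xs = [float(i) for i in range(11)]
ys = [10.0 if 4 <= x <= 6 else 0.0 for x in xs]

slope, intercept = fit_line(xs, ys)
linear_at_5 = slope * 5.0 + intercept      # global fit flattens the bump
knn_at_5 = knn_predict(xs, ys, 5.0, k=3)   # local average recovers it
knn_at_0 = knn_predict(xs, ys, 0.0, k=3)   # far from the bump

print(f"linear at x=5: {linear_at_5:.3f}")
print(f"kNN    at x=5: {knn_at_5:.3f}")
print(f"kNN    at x=0: {knn_at_0:.3f}")
```

On this symmetric toy data the fitted line is flat at the overall mean of the targets, whereas the three-neighbor average returns the local value at both query points.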
Teachers and parents may use readability to select appropriate learning materials for primary school students. This research constructs a Thai stop word list and evaluates the impact of eliminating stop words on the readability assessment of Thai text. The corpus contains 1,188 textbook articles used by students from grade 1 to grade 6. Word segmentation, stop word list extraction, and feature selection...
In domains in which single-agent learning is a more natural metaphor for an artifact-embedded agent, Exemplar-Based Learning (EBL) requires very large sets of training examples to be applicable. Large sets of training examples obviously conflict with the resource capabilities of artifacts. To make EBL a possibility for these artifacts, sets of training examples must be reduced in size in...