Many real-world problems involve the analysis of multi-view, high-dimension, small-sample-size data, such as multi-omics data. Combining multi-view databases is expected to provide better biological significance. However, multi-view data always contain noise and outlying entries that lead to inaccurate and unreliable results. How to analyze these data effectively has become an urgent need. We proposed...
The bag of words (BOW) represents a corpus as a matrix whose elements are word frequencies. However, each row in the matrix is a very high-dimensional sparse vector. Dimension reduction (DR) is a popular way to address the sparsity and high-dimensionality issues. Among the different strategies for developing DR methods, Unsupervised Feature Transformation (UFT) is a popular strategy to map all words on...
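As a minimal sketch of the BOW matrix described in the abstract above, assuming whitespace tokenization and lowercasing (the function name `bag_of_words` and the toy documents are hypothetical, not taken from the paper):

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocab, matrix), where matrix[i][j] counts word vocab[j] in docs[i]."""
    # Assumption: tokens are lowercased, whitespace-separated words.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    column = {word: j for j, word in enumerate(vocab)}

    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for word, count in Counter(tokens).items():
            row[column[word]] = count
        matrix.append(row)
    return vocab, matrix


# Toy corpus (illustrative only).
docs = ["the cat sat", "the dog sat on the mat"]
vocab, matrix = bag_of_words(docs)
```

Even in this tiny corpus most entries of a row are zero, which illustrates the sparsity that dimension reduction is meant to address.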
Medicinal plants are becoming increasingly popular across the world for their ability to cure different diseases, including chronic ones. The chemical compositions of the leaves of these plants are the main contributors to their healing characteristics. The potential of using such plants also depends on the maturity of the medicinal plant in use. Leaves with appropriate maturity can cause better healing...
In this work, terahertz time-domain spectroscopy was employed to establish the fingerprints of Radix Angelicae Dahuricae (RAD). Principal component analysis (PCA) and k-subspace clustering were employed to study the similarities and dissimilarities among the 120 RAD samples. The methods can be successfully used to classify these samples according to their regions. The results show that the referred...
Activity recognition based on multiple wearable sensors remains a challenging task due to the diversity of human activities. Unsupervised classification helps to discover new activity classes and improve the activity classification model. Therefore, a new multi-sensor activity recognition scheme using the two-dimensional principal component analysis (2DPCA) and...
Recognition of human actions using wearable sensors has become an important research field. Segmentation of sensor data is a vital issue in reconstructing and understanding daily human actions, and it strongly affects the accuracy of human action recognition. Traditional online segmentation approaches are mostly designed for one-dimensional sensor data, which greatly limits these approaches to multi-dimensional...
The paper focuses on using stacking and rotation-based techniques to improve the performance and generalization ability of machine learning classification with data reduction. The aim of data reduction is to decrease the quantity of information required to learn high-quality classifiers, especially when the data are huge. The paper shows that merging both stacking and rotation-based ensemble...
This paper presents the author clustering problem and compares it to related authorship attribution questions. The proposed model is based on a distance measure called Spatium, derived from the Canberra measure (a weighted version of the L1 norm). The selected features consist of the 200 most frequent words and punctuation symbols. An evaluation methodology is presented and the test collections are extracted...
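For orientation, the standard Canberra distance that the abstract above refers to can be sketched as follows. This is only the textbook measure; the Spatium measure itself is not defined in the snippet and is not reproduced here. The function name `canberra` and the example vectors are assumptions for illustration.

```python
def canberra(a, b):
    """Canberra distance: sum of |x - y| / (|x| + |y|) over coordinates.

    Each L1 term is weighted by the magnitude of the two values, so
    differences between small values count as much as those between
    large ones. Terms where both values are zero are skipped.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


# Hypothetical relative word frequencies for two texts.
d_same = canberra([1, 2, 3], [1, 2, 3])
d_swap = canberra([1, 3], [3, 1])
```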
In commercial banks, data centers often integrate different data sources, which represent complex and independent business systems. Due to inherent data variability and measurement or execution errors, some abnormal customer records (data) may exist. Existing automatic methods for detecting abnormal customers rely on outlier detection, which focuses on the differences between customers, and it ignores...
Accurate network and phase connectivity models are crucial to distribution system analytics, operations and planning. Although network connectivity information is mostly reliable, phase connectivity data is typically missing or erroneous. In this paper, an innovative phase identification algorithm is developed by clustering of voltage time series gathered from smart meters. The feature-based clustering...
The bearing is a critical component that affects the operational performance of a machine. Bearing fault classification, which aims to identify the category of a bearing fault, helps to improve the reliability and safety of the bearing. In this paper, a classification process is presented based on sparse subspace clustering. A sample corresponding to a specific fault state of the bearing is represented by its neighbourhood...
In this paper, an image segmentation method is presented to analyze the clusters of a Computed Tomography (CT) image. The target image is divided into small parts called observation screens. Principal Component Analysis (PCA) is used for a better representation of the features of the observation screens. The optimal number of components for each observation screen is determined by Horn's Parallel Analysis...
In this paper, we propose a novel method to extract keyframes from motion capture data for people to better visualize and understand the content of the motion. It first applies a Butterworth filter to remove the noise in the motion capture data, then carries out principal component analysis (PCA) to reduce the dimension. By detecting the zero-crossing points of the velocity in the principal components,...
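The zero-crossing step mentioned in the abstract above can be sketched on a single one-dimensional component. This is a minimal illustration only: it assumes the signal has already been filtered and projected (the Butterworth filtering and PCA steps are not shown), approximates velocity by first differences, and uses a hypothetical function name and toy signal.

```python
def velocity_zero_crossings(signal):
    """Return sample indices where the first-difference velocity changes sign.

    Index i is reported when the velocity before sample i and the velocity
    after it have opposite signs, i.e. sample i is a local extremum.
    Flat stretches (zero velocity) are ignored in this simple sketch.
    """
    velocity = [b - a for a, b in zip(signal, signal[1:])]
    crossings = []
    for i in range(1, len(velocity)):
        if velocity[i - 1] * velocity[i] < 0:
            crossings.append(i)
    return crossings


# Toy component: rises to a peak, falls to a trough, rises again.
component = [0, 1, 2, 1, 0, 1, 2]
candidates = velocity_zero_crossings(component)
```

Here the reported indices are the peak and the trough of the toy signal, which is the kind of frame such a method would consider as a keyframe candidate.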
A novel method for defining an index based on multi-level clustering of the 40-Hz auditory steady state response is presented in this paper. The index is a measure of depth of anaesthesia that can help monitor depth of anaesthesia more closely and accurately. Multi-level expectation maximization (EM) is used for clustering the 40-Hz auditory steady state response signals recorded from human...
Herein, we explore both a new supervised and a new unsupervised technique for dimensionality reduction or multispectral sensor design via band group selection in hyperspectral imaging. Specifically, we investigate two algorithms, one based on the improved visual assessment of clustering tendency (iVAT) and the other based on the automatic extraction of “blocklike” structure in a dissimilarity matrix (CLODD...
This paper describes different approaches for the detection and identification of diseases in apples using computer vision. Our proposed algorithms analyze the surface appearance of apples for defects using image features, viz. color and texture. For segmentation of the Region Of Interest (ROI), K-means clustering is performed over the image pixels based on their intensity values. For creation of feature vector,...
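K-means clustering of pixels by intensity, as mentioned in the abstract above, can be sketched in one dimension. This is an illustrative sketch, not the paper's implementation: the evenly spaced initial centers, the fixed iteration count, the function names, and the toy pixel values are all assumptions.

```python
def assign(values, centers):
    """Label each value with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: abs(v - centers[c]))
        for v in values
    ]


def kmeans_1d(values, k, iters=20):
    """Plain Lloyd-style k-means on scalar intensities."""
    lo, hi = min(values), max(values)
    # Assumption: start with centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        labels = assign(values, centers)
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return assign(values, centers), centers


# Toy flattened image: dark background pixels and a bright region.
pixels = [10, 12, 11, 200, 205, 198]
labels, centers = kmeans_1d(pixels, k=2)
```

The label of each pixel then gives a segmentation mask; for real images the same idea is usually applied with a library implementation over the flattened pixel array.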
Feature learning algorithms aim to provide a compact and discriminative representation of complex datasets in order to increase the speed and accuracy of clustering or classification. In this paper, we propose a novel interactive feature learning approach which is mainly based on 3D interactive data visualization and Non-negative Matrix Factorization (NMF). Here, the data is visualized in a 3D interface...
Most existing methods for generating a visual dictionary are based on SIFT local features and adopt the common K-means clustering method to obtain the visual dictionary. However, as the dimension of the local feature vectors grows, the distribution of the local feature vectors becomes sparse, resulting in a high correlation distance between the image vectors and reducing...
Reconstruction of gene regulatory networks, or ‘reverse-engineering’, is the process of identifying gene interaction networks from experimental microarray gene expression profiles through computational techniques. However, some issues and challenges remain in gene regulatory network construction. One of them is the inference complexity due to the high dimensionality of gene expression data. The...
Because the structure of most electronic systems is complex and uncertain, this paper presents a new efficient method for identifying spacecraft electrical characteristics. PCA (Principal Component Analysis) feature extraction, offline FCM (Fuzzy C-means) clustering and an online SVM (Support Vector Machine) classifier are introduced into the registration model. At the first step of the algorithm, get an expert training...