The bag of words (BOW) model represents a corpus as a matrix whose elements are word frequencies. However, each row in the matrix is a very high-dimensional sparse vector. Dimension reduction (DR) is a popular way to address the sparsity and high-dimensionality issues. Among the different strategies for developing DR methods, Unsupervised Feature Transformation (UFT) is a popular strategy to map all words on...
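As a rough illustration of the BOW matrix described in the abstract above (not the authors' method), the following sketch builds a document-by-word count matrix in plain Python. The function name `bag_of_words` and the lowercase, whitespace-based tokenizer are assumptions made for this example only.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocab, matrix): matrix[i][j] counts vocab[j] in docs[i]."""
    # Assumed tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    # Sorted vocabulary gives every word a stable column index.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    column = {word: j for j, word in enumerate(vocab)}

    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for word, count in Counter(tokens).items():
            row[column[word]] = count
        matrix.append(row)
    return vocab, matrix


if __name__ == "__main__":
    vocab, matrix = bag_of_words(["the cat sat", "the cat saw the dog"])
    print(vocab)
    for row in matrix:
        print(row)
```

Each row has one entry per vocabulary word, so most entries are zero for a realistic corpus, which is the sparsity and dimensionality problem the abstract refers to.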
Recently, the problem of intrusion detection has been widely studied by the computer and network security communities. The Intrusion Detection System (IDS) has thus become a topic of interest in research, in particular in machine learning and data mining, in order to improve classification accuracy and to reduce the high false alarm rate on classical databases such as KDD99. In this...
To utilize asynchronous multichannel recordings with different start and end times for acoustic scene analysis, we propose a combination method for estimating unrecorded durations and extracting spatial features. Focusing on the fact that amplitude information is relatively robust to the estimation error of the unrecorded durations and the synchronization mismatch of multichannel recordings,...
As Internet applications become more complex and diverse, simple network traffic matrix estimation or approximation methods such as the gravity model are no longer adequate. In this paper, we advocate a novel approach of approximating traffic matrices with multiple low-rank matrices. We build the theory behind the MULTI-LOW-RANK approximation and discuss the conditions under which it is better than...
Recognition of vehicle types in real life traffic scenarios is a challenging task due to the diversity of vehicles and uncontrolled environments. Efficient methods and feature representations are needed to cope with these challenges. In this paper, we address the vehicle type classification problem in real life traffic scenarios and propose a multimodal method that uses efficient representations of...
A stroke occurs when the blood supply to a person's brain is interrupted or reduced. The stroke deprives the person's brain of oxygen and nutrients, which can cause brain cells to die. Numerous works have been carried out for predicting various diseases by comparing the performance of predictive data mining technologies. In this work, we compare different methods with our approach for stroke prediction...
In the quest to develop more accurate methodologies for Earth Observation (EO) image retrieval, visualization and information content exploration, a deep understanding of the data being analyzed is needed. In this paper we propose a simple but efficient visual data mining methodology that can be used for these tasks. Our solution consists of a patch-based feature extraction to derive image features...
This paper aims to provide a new method of visualizing high-dimensional data classification by employing principal component analysis (PCA) and support vector machine (SVM). In this method, PCA is adopted to reduce the dimension of high-dimensional data, and then SVM is used for the data classification process. Finally, the classification result is projected onto a two-dimensional map. The method can visualize...
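The dimension-reduction step mentioned in the abstract above can be sketched as follows. This is a minimal, standard-library-only PCA that uses power iteration with deflation to project data onto its top two principal components; it is not the paper's implementation, the SVM classification step is deliberately omitted, and the name `pca_project` and its parameters are assumptions for illustration.

```python
import math
import random


def pca_project(rows, k=2, iterations=500, seed=0):
    """Return (components, projected) for the top-k principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Power iteration: repeated multiplication by the covariance
        # matrix converges to its dominant eigenvector.
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                         for i in range(d))
        components.append(v)
        # Deflation: remove the found direction so the next pass
        # converges to the next principal component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]

    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return components, projected


if __name__ == "__main__":
    data = [[1, 2, 0.1], [2, 4, -0.1], [3, 6, 0.1], [4, 8, -0.1], [5, 10, 0.0]]
    components, projected = pca_project(data, k=2)
    for point in projected:
        print(point)
```

The two-column `projected` output is what would be drawn on a 2-D scatter plot; in a pipeline like the one the abstract describes, a classifier's predicted labels could then be used to color the points.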
One of the main challenges in pattern recognition is handling variations in pose, which has been addressed in the past using exhaustive training, increasingly complex neural network architectures, or state space transformations, but often with limits on pose variation. The solution presented here implements complete pose invariance by estimating affine transform parameters and then registering samples...
With the arrival of the era of big data, people's ability to collect and obtain data is becoming more powerful. These data are characterized by high dimension, large scale and complex structure. High-dimensional data seriously hinders the efficiency of data mining algorithms, a problem we call the "dimension disaster". Therefore, dimension reduction technology has become the primary...
In recent years, we have been faced with large amounts of sporadic unstructured data on the web. With the explosive growth of such data, there is a growing need for effective methods such as clustering to analyze and extract information. Biological data forms an important part of unstructured data on the web. Protein sequence databases are considered a primary source of biological data. Clustering can...
Hyperspectral images (HSIs) provide hundreds of narrow spectral bands for land covers and can thus provide more powerful discriminative information for land-cover classification. However, HSIs suffer from the curse of high dimensionality; therefore, dimension reduction and feature extraction are essential for the application of HSIs. In this paper, we propose an unsupervised feature extraction...
Medical image analysis is a pioneering research domain due to the challenges posed by different kinds of images and the complexity of accurately predicting the presence of abnormalities. Classification of brain MRI into normal and abnormal has received increasing attention because of the high level of difficulty in handling the huge number of images. Recently, many computational techniques...
Accurate network and phase connectivity models are crucial to distribution system analytics, operations and planning. Although network connectivity information is mostly reliable, phase connectivity data is typically missing or erroneous. In this paper, an innovative phase identification algorithm is developed by clustering voltage time series gathered from smart meters. The feature-based clustering...
Feature selection is important for dimensionality reduction, analysis, and pattern discovery applications. We consider multivariate time series data and propose an unsupervised learning technique to identify the top-k discriminative features. The proposed technique uses statistics drawn from the Principal Component Analysis (PCA) of the input data to leverage the relative importance of the principal...
Accurate feature extraction plays a vital role in the fields of machine learning, pattern recognition and image processing. Feature extraction methods based on principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA) are capable of improving the performance of classifiers. In this paper, we propose two feature extraction approaches, which integrate...
Analyzing driving behavior data is essential for developing driver assistance systems. Statistical segmentation is one of the important methods for this analysis. In practice, driving behavior data include undesirable defects caused by the external environment and sensor failures. Defects in the data have a large negative effect on the segmentation. In this paper, we show that a feature extraction...
In this paper, an image segmentation method is presented to analyze clusters in Computed Tomography (CT) images. The target image is divided into small parts called observation screens. Principal Component Analysis (PCA) is used for a better representation of the features of the observation screens. The optimal number of components for the observation screens is determined by Horn's Parallel Analysis...
In this paper, we propose a novel method to extract keyframes from motion capture data for people to better visualize and understand the content of the motion. It first applies a Butterworth filter to remove the noise in the motion capture data, then carries out principal component analysis (PCA) to reduce the dimension. By detecting the zero-crossing points of the velocity in the principal components,...
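The last step named in the abstract above, detecting zero-crossings of the velocity, can be sketched in a few lines. The example assumes a single, already filtered one-dimensional trajectory (for instance one principal component over time); the Butterworth filtering and PCA steps are not reproduced, and the function names `velocity` and `keyframe_candidates` are hypothetical.

```python
def velocity(signal):
    """Finite-difference velocity: v[i] = signal[i + 1] - signal[i]."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def keyframe_candidates(signal):
    """Return frame indices where the velocity changes sign.

    A sign change between v[i - 1] and v[i] means frame i is a local
    extremum of the trajectory, which is taken here as a keyframe candidate.
    """
    v = velocity(signal)
    return [i for i in range(1, len(v)) if v[i - 1] * v[i] < 0]


if __name__ == "__main__":
    trajectory = [0, 1, 2, 1, 0, 1, 2, 3]
    print(keyframe_candidates(trajectory))
```

Frames where the velocity is exactly zero for several samples (a held pose) are not reported by this simple sign test; handling such plateaus would need an extra rule.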
Nowadays, online social networks have become one of the main tools with which people communicate with each other every day. Online users' behavior generates a large amount of high-dimensional data. However, is there a mapping relation between online and offline behavior? This paper explores whether there is consistency between online and offline behavior using the Alternating Direction Method of Multipliers (ADMM) algorithm...