Many real-world applications exhibit scenarios where the distributions represented by training and test data are not identical but are related by a covariate shift, i.e., they have equal class-conditional distributions with unequal covariate distributions. Traditional data mining techniques struggle to learn a good predictive model in the presence of covariate shift. Recent studies have proposed approaches to address...
Proportion-SVM has been studied in depth due to its broad application prospects, such as modeling voting behaviors and spam filtering. However, geometric information has been widely ignored, so current methods are usually sensitive to noise. To address these problems, in this paper, we combine the proportion learning framework with a Laplacian term. We exploit the advantages of the Laplacian term...
Risk permeates all aspects of doing business. However, support tools capable of systematically identifying the complete spectrum of risks that a company might face are currently lacking. Such a tool would need to reliably identify company-risk relationships from unstructured sources, therefore providing a qualitative assessment of risk exposure. We propose a supervised learning approach that combines...
Bipolar magnetic regions (BMRs) are the cornerstone of solar variability. They are tracers of the large-scale magnetic processes that give rise to the solar cycle, shapers of the solar corona, building blocks of the large-scale solar magnetic field, and significant contributors to the free-energy budget that gives rise to flares and coronal mass ejections. Surprisingly, no homogeneous catalog...
Limited access to supervised information may create scenarios in real-world data mining applications where training and test data are related by a covariate shift, i.e., they have equal class-conditional distributions with unequal covariate distributions. Traditional data mining techniques assume that both training and test data represent an identical distribution, and therefore suffer in the presence of...
Outlier detection, or anomaly detection, is an important and challenging issue in data mining, even more so in the domain of energy data mining, where data are often collected in large amounts but with little labeled information. This paper presents a couple of online outlier detection algorithms based on principal component analysis. Novel algorithmic treatments are introduced to build incremental PCA and...
Monitoring high performance computing systems has become increasingly difficult as researchers and system analysts face the challenge of synthesizing a wide range of monitoring information in order to detect system problems on ever larger machines. We present a method for anomaly detection on syslog data, one of the most important data streams for determining system health. Syslog messages pose a...
As secure processing and correct recovery of data become more important, digital forensics gains more value each day. This paper investigates the digital forensics tools available on the market and analyzes each tool from a database perspective. We present a survey of digital forensics tools that are either focused on data extraction from databases or assist in the process of database...
While state-of-the-art kernels for graphs with discrete labels scale well to graphs with thousands of nodes, the few existing kernels for graphs with continuous attributes, unfortunately, do not scale well. To overcome this limitation, we present hash graph kernels, a general framework to derive kernels for graphs with continuous attributes from discrete ones. The idea is to iteratively turn continuous...
Micron's new Automata Processor (AP) architecture exploits the very high and natural level of parallelism found in DRAM technologies to achieve native-hardware implementation of non-deterministic finite automata (NFAs). The use of DRAM technology to implement the NFA states provides high capacity and therefore provides extraordinary parallelism for pattern recognition. In this paper, we give an overview...
Conventional echo hiding methods have simple encoding and decoding processes and are robust to MP3 compression, but the correct rate of the extracted information needs to be improved. To cope with this problem, we propose a time-spread echo with random intervals. The encoding process generates an interval sequence that controls the intervals between echoes. The echo is added into the audio according to the interval sequence,...
To solve real-world multi-class classification problems, we improve TSVM in this paper. We combine LSTSVM with a partial binary tree to improve classification efficiency. The binary tree hierarchy can solve the inseparable-region issues in OVO-SVM and OVA-SVM classification. Experimental results show that it improves classification accuracy. It also has a better speed-up ratio than the...
This paper proposes a profiling-based method to extract a task graph, which describes the system behavior of a multiprocessor system-on-chip running Android OS. The proposed method computes the resource usage of each task and extracts dependencies among tasks using run-time system profiling results. The proposed method calculates the CPU resource usage and I/O waiting time of each task by analyzing CPU...
An interesting class of irregular algorithms is tree traversal algorithms, which repeatedly traverse spatial trees to perform efficient computations. Optimizing tree traversal algorithms requires understanding the specific characteristics of these algorithms that affect their behavior and govern which types of optimizations are likely to perform well. In this work, we present a set of tree traversal...
The imbalanced learning problem is becoming pervasive in today's data mining applications. This problem refers to the uneven distribution of instances among the classes, which makes rare instances difficult to classify. Several undersampling and oversampling methods have been proposed to deal with such imbalance. Many undersampling techniques do not consider the distribution of information...
Traditional network classification techniques become computationally intractable when applied to a network that is presented in a streaming fashion with continuous updates. In this paper, we examine the problem of classification in dynamic streaming networks, or graphs. Two scenarios are considered: the graph transaction scenario and the one-large-graph scenario. We propose a unified framework...
In a clinical environment, an Interventional X-Ray (IXR) system is used on various anatomies and for various types of procedures. It is important to correctly classify each exam of the IXR system into its respective procedure and/or assign it to the correct anatomy. This classification enhances the productivity of the system through better scheduling of the Cath lab, and also provides a means to perform device usage/revenue...
Anomaly detection is an active research field in machine learning and data mining. Current outlier mining approaches based on distance or nearest neighbors take too long to run when applied to high-dimensional, massive data. Many improvements to these algorithms have been proposed, but they do not yet satisfy the demand of...
This article proposed ‘TLiSVM’ or ‘3LiSVM’ (Triple Linear SVM Weight) as an alternative technique for dimensionality reduction with a Support Vector Machine (SVM) classifier on a two-class dataset. The efficiency of TLiSVM was compared with two chosen techniques, including Linear SVM Weight (LiSVM) and Double Linear SVM Weight (DLiSVM). Three datasets, including DLBCL, Duke Breast-Cancer and Leukemia,...
To improve the prediction accuracy of power load and guarantee a safe power supply, this paper proposes a new power load prediction method based on a support vector machine optimized by particle swarm optimization (PSO-SVM). The method is also applied in the data analysis subsystem of a power dispatching automation system, and the paper designs and completes a set of periodic data set and periodic association rule mining-based...