In this paper, we consider a novel problem, referred to as term filtering with bounded error, which reduces the term (feature) space by eliminating terms without (or with bounded) information loss. Unlike existing work, the obtained term space provides a complete view of the original term space. More interestingly, several important questions can be answered, such as: 1) how different terms interact...
Mass estimation, an alternative to density estimation, has been shown recently to be an effective base modelling mechanism for three data mining tasks of regression, information retrieval and anomaly detection. This paper advances this work in two directions. First, we generalise the previously proposed one-dimensional mass estimation to multidimensional mass estimation, and significantly reduce the...
Distributions over permutations arise in applications ranging from multi-object tracking to ranking of instances. The difficulty of dealing with these distributions is caused by the size of their domain, which is factorial in the number of considered entities (n!). It makes the direct definition of a multinomial distribution over permutation space impractical for all but a very small n. In this work...
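The factorial growth mentioned in this abstract can be made concrete with a short calculation. This is only an illustrative sketch, not the paper's method: the function name `free_parameters` is hypothetical, and it assumes a fully general distribution that assigns one probability to each of the n! permutations (so n! - 1 free parameters, since the probabilities sum to one).

```python
import math


def free_parameters(n: int) -> int:
    """Free parameters of an unrestricted distribution over n! permutations.

    One probability per permutation, minus one for the sum-to-one constraint.
    """
    return math.factorial(n) - 1


if __name__ == "__main__":
    # Show how quickly the parameter count grows with the number of entities.
    for n in (3, 5, 10, 15):
        print(n, free_parameters(n))
```

Even at n = 15 the count exceeds 10^12, which is why an explicit table over permutation space is only workable for very small n.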
The paper focuses on mining clusters that are characterized by a lagged relationship between the data objects. We call such clusters lagged co-clusters. A lagged co-cluster of a matrix is a submatrix determined by a subset of rows and their corresponding lags over a subset of columns. Extracting such subsets (not necessarily successive) may reveal an underlying governing regulatory mechanism. Such...
Non-negative matrix factorization is an important method for analyzing high-dimensional datasets. It has a number of applications, including pattern recognition, data clustering, information retrieval, and computer security. One of its significant drawbacks is its computational complexity. In this paper, we introduce a new method allowing fast approximate transformation from input space...
Attribute reduction is one of the main issues in the theoretical research of rough set theory and is known to be an NP-hard optimization problem. The objective is to find the minimal number of attributes from a large dataset; because the problem is NP-hard, it is difficult to solve to optimality. This paper proposes a composite neighbourhood structure approach to solve the attribute reduction problem that consists of two versions...
In rough set theory (RST), the equivalence relation is defined by strict equality, which makes knowledge reduction computationally complex in real-world applications. To overcome this shortcoming of general rough set theory, the elementary concept of tolerance rough set theory is proposed in this paper, and the theory is employed to build objects' tolerance...
A Bayesian network is a probability-based network for inference under uncertainty. Learning its structure is one of the main research topics in data mining and knowledge discovery, yet constructing Bayesian network structures from data is NP-hard. Based on information theory and conditional independence tests, a new algorithm is presented for the construction of optimal Bayesian...
Existing discernibility matrices have defects, and attribute reduction algorithms based on them involve a complex process. This paper optimizes part of that process: the condition attributes are used to classify and group the data, generating representative data that simplify the discernibility matrix, and the order of the discernibility matrix and the complexity of the attribute reduction...
We propose a novel shot detection algorithm based on RS in the compressed domain. First, the algorithm extracts I frames from the compressed-domain data sequence. We construct an information system with the differences between adjacent I frames as columns and the attribute sets extracted from decompressed I frames as rows. The established information system is then normalized and discretized. Second,...
We consider the following tree-matching problem: Given labeled, ordered trees P and T, can P be obtained from T by deleting nodes? Deleting a node v entails removing all edges incident to v and, if v has a parent u, replacing the edges from u to v by edges from u to the children of v. This problem has many applications in computer engineering, such as XML tree pattern query evaluation, video...
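The node-deletion operation defined in this abstract can be sketched in a few lines. This is a minimal illustration of the deletion step only, not the paper's matching algorithm; the `Node` class and the helpers `delete_child` and `to_tuple` are hypothetical names introduced here, and the sketch assumes the deleted node has a parent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of a labeled, ordered tree (children are kept in order)."""
    label: str
    children: list = field(default_factory=list)


def delete_child(parent: Node, index: int) -> None:
    """Delete the child v = parent.children[index].

    The edge from parent to v is replaced by edges from parent to the
    children of v, which take v's position and keep their relative order.
    """
    removed = parent.children[index]
    parent.children[index:index + 1] = removed.children


def to_tuple(node: Node):
    """Convert a tree to nested tuples for easy comparison and printing."""
    return (node.label, tuple(to_tuple(c) for c in node.children))


if __name__ == "__main__":
    # T = a(b(c, d), e); deleting b yields a(c, d, e).
    tree = Node("a", [Node("b", [Node("c"), Node("d")]), Node("e")])
    delete_child(tree, 0)
    print(to_tuple(tree))
```

Splicing the children into the slot of the deleted node is what keeps the left-to-right order of the remaining nodes intact, which matters because the trees are ordered.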
An improved image signature is proposed and extracted. First, an image signature based on color information and the ICM (Intersecting Cortical Model) is presented. Then a new object feature named SRIC (Silhouette Rotate Interceptive Curve) is described. Building on this work, a more distinguishable image signature algorithm based on ICM combined with SRIC is proposed and implemented. The contrast experimentation...
The decision tree, an important classification algorithm in data mining, has been successfully applied in many fields. In this paper, based on an analysis of the essential characteristics of the decision tree algorithm, we give a leaf criterion for decision attributes with multiple decision values and establish a mathematical model for selecting the attributes to expand; we also give a concrete model based...
The vehicle routing problem (VRP) is an important problem in logistics and is known to be NP-hard. Most researchers try to make the best use of resources in various ways. However, real-world problems are much more difficult than the standard VRP. To model the real-world VRP and derive a solving strategy, the paper analyzes the customers' demands and divides the problem into...
Wavelet analysis is considered one of the most efficient methods for detecting DDoS attacks. However, during peak communication hours with a large volume of data transactions, this method must collect a very large number of samples, which greatly increases the computational complexity. Therefore, the real-time response time as well as the accuracy of attack detection become very...
The paper analyzes methods of extracting information from network traffic, with an in-depth study of information extraction based on the Tag marker. Tags, which contain specific core words, can be considered markers of different types of information. Due to the non-standardized nature of the Tag marker, a Tag often has many interfering characters and the traditional pattern...
In this paper we present a new subspace clustering algorithm, TGSCA, for large datasets with noise. Experiments show that TGSCA can discover clusters in both the entire space and subspaces; its computational complexity is approximately linear in the number of objects, the space dimension, and the clusters' dimension, respectively; it is not sensitive to noise; it can find both disjoint and overlapping clusters; it can find...
The secure sum protocol is a well-known protocol for computing the sum of private inputs from distributed entities such that the inputs remain private. In this paper we present protocols for computing reputation in a privacy preserving manner that are inspired by the secure sum protocol. We provide a protocol that is secure under the semi-honest adversarial model as well as one that is secure under...
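For readers unfamiliar with the secure sum protocol that this abstract builds on, the following is a minimal single-process simulation of the textbook ring variant, assuming semi-honest parties that do not collude. It is not the reputation protocol proposed in the paper; the function name `secure_sum` and the `modulus` parameter are assumptions made for illustration.

```python
import secrets


def secure_sum(private_inputs, modulus=2**32):
    """Simulate the ring-based secure sum of non-negative integer inputs.

    Assumes the true sum is smaller than `modulus` and that there is at
    least one party. Each loop iteration stands for one party receiving
    the running total from its predecessor.
    """
    # The initiator hides its own input behind a random mask.
    mask = secrets.randbelow(modulus)
    running = (mask + private_inputs[0]) % modulus

    # Every other party adds its input to the masked running total,
    # so no party sees another party's input in the clear.
    for value in private_inputs[1:]:
        running = (running + value) % modulus

    # The total returns to the initiator, who removes the mask.
    return (running - mask) % modulus


if __name__ == "__main__":
    print(secure_sum([3, 5, 7]))
```

Because the mask is uniformly random, each intermediate total looks uniformly random to the party receiving it; two colluding neighbours of a party could still recover that party's input, which is one reason stronger adversarial models call for different protocols.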
Outlier mining is an important branch of data mining and has attracted much attention recently. The density-based method LOF is widely used in applications. However, the complexity of the method is quadratic in the size of the dataset, and it is very sensitive to its parameter MinPts. In this paper, we propose a new outlier detection method based on the Voronoi diagram, called Voronoi based Outlier Detection...
Occlusion is an important problem in moving target detection. This paper proposes a new vehicle detection method that addresses the deficiencies of common vehicle detection methods. First, the background is modeled with an improved histogram-mean model to extract a more accurate background model and update it in real time; then we obtain the background through background subtraction and supplement...