Differential privacy (DP) is a promising tool for preserving privacy during data publication, as it provides strong theoretical privacy guarantees in the face of adversaries with arbitrary background knowledge. The histogram, as the result of a set of count queries, serves as a core statistical tool for reporting data distributions and is in fact viewed as the fundamental method for many other statistical analysis...
Analysis of lace texture images is a challenging problem because lace is a soft, extensible material that is easily deformed. This paper investigates a complete system for lace classification. A first step, based on Otsu's segmentation method, removes the background. The lace texture is then characterized using local binary patterns (LBP). In order to be robust against rotation the...
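As a rough illustration of the background-removal step mentioned in the lace classification abstract above, the following is a minimal sketch of Otsu's thresholding in pure Python. The function name `otsu_threshold`, the 8-bit grayscale range, and the flat list of pixel values are assumptions made for this example; the abstract does not say how the authors implement the step or which side of the threshold they treat as background.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    `pixels` is assumed to be a flat iterable of integers in [0, levels).
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_low = 0      # number of pixels with value <= t
    sum_low = 0         # sum of gray levels of those pixels
    best_t, best_var = 0, -1.0

    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue            # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break               # all pixels are in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical usage: a dark background (10) and a brighter texture (200).
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
foreground_mask = [p > t for p in sample]
```

Treating the brighter class as foreground is a choice made only for this example; with a light background the mask would simply be inverted.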
Microcalcifications are the earliest sign of breast carcinoma. Their typical size is about 1 mm, which makes them difficult for an expert to detect. Therefore, a tool that eases their visualization becomes relevant. Segmentation gives the candidate areas that could contain microcalcifications. A preprocessing step can improve segmentation performance but the algorithm becomes database dependent...
In this paper we compare the performances of three automatic methods of identifying hemangioma regions in images: 1) unsupervised segmentation using the Otsu method, 2) Fuzzy C-means clustering (FCM) and 3) an improved region growing algorithm based on FCM (RG-FCM). For each image, the starting point of the algorithms is a rectangular region of interest (ROI) containing the hemangioma. For computing...
The dynamic time warping (DTW) algorithm is a pattern matching algorithm that allows a nonlinear stretching of the data. In a recognition system using a matching algorithm, data clustering methods are used to reduce the number of gesture templates in the database, and thus reduce the computational cost; however, the recognition rate is degraded. In this paper, we propose a DTW gesture recognition system that...
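To make the "nonlinear stretching" in the abstract above concrete, here is a minimal sketch of the textbook dynamic time warping recurrence in Python. It is not the gesture recognition system from the paper: the function name `dtw_distance`, the use of one-dimensional numeric sequences, and the absolute-difference local cost are assumptions chosen for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Return the cumulative DTW cost of aligning sequences a and b.

    Each element of one sequence may be matched to one or more
    consecutive elements of the other, which is what allows one
    sequence to be stretched relative to the other.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])   # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched to an earlier b
                cost[i][j - 1],      # b[j-1] also matched to an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical usage: the second sequence is a time-stretched copy of the first.
template = [1, 2, 3]
observed = [1, 2, 2, 3]
print(dtw_distance(template, observed))
```

The table has (n + 1) x (m + 1) entries, so the cost of one comparison grows with the product of the sequence lengths, which is why reducing the number of stored templates, as the abstract notes, lowers the overall computational cost.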
This paper presents a method for the detection of regions of interest (ROIs) in mammograms using a dynamic k-means clustering algorithm. In this approach, a method has been developed to determine the initial number of clusters in mammograms by using a data mining algorithm based on the Local Binary Pattern (LBP) and the gray-level co-occurrence matrix (GLCM) technique. Our method consists of three...
Real datasets always play an essential role in graph mining and analysis. However, nowadays most available real datasets only support millions of nodes. Therefore, the literature on Big Data analysis utilizes statistical graph generators to generate a massive graph (e.g., billions of nodes) for evaluating the scalability of an algorithm. Nevertheless, current popular statistical graph generators are...
The paper deals with an automatic training process using clustering algorithms. We provide a comparative study of several clustering algorithms such as K-means, self-organizing map (SOM) and DBSCAN. We use these algorithms for automatic selection of appropriate training samples for a face recognition system. We provide an overview of selected methods and we compare their performance on the CMU PIE face...
A novel sports genre categorization algorithm based on representative shot extraction and geometry visual phrase (GVP) is presented in this paper. Performance of sports classification can be noticeably improved by generating a reduced image set containing representative information and encoding spatial information into the bag-of-words (BOW) model. Firstly, shots containing significant information of videos...
Texture images can be characterized with key features extracted from images. In this paper, the scale invariant feature transform (SIFT) algorithm is utilized to generate local features for texture image classification. The local features are selected as inputs for the texture classification framework. For each texture category, a texton dictionary is built based on the local features. To...
Logo spotting is of great interest because it enables the document images in a digital library of scanned documents to be categorized according to their sources, without any costly semantic analysis of their textual transcript. In this paper, we present an approach for logo spotting, based on the matching of keypoints extracted both from the query document images and a given set of logos (gallery) using...
We propose two novel algorithms for fully-unsupervised, super-fast, and cross-channel TV commercial mining in this paper. The tasks involved in the process include: 1) mining commercial clusters from streams of individual channels, and 2) grouping identical commercial clusters across multiple channels. The first process is achieved with a dual-stage hashing algorithm, which searches for recurring...
A novel approach to recognizing facial expressions from static images is proposed in this paper. The local binary pattern (LBP) operator is adopted as an effective feature extraction tool for facial image data. An unsupervised competitive neural network, called a centroid neural network with χ² distance measure, CNN-χ², is then utilized as the classification tool for the histogram data obtained by the...
To deal with the problem of too many answers returned from a Web database in response to a user query, this paper proposes a novel categorization approach which takes advantage of the user's contextual preferences to construct a navigational tree in order to reduce the information overload. Based on the user's original query, we first speculate how much the user cares about each attribute in the specified...
State-of-the-art object retrieval systems are mostly based on the bag-of-visual-words representation, which encodes local appearance information of an image in a feature vector. A search is performed by comparing the query object's feature vector with those of database images. However, a database image vector generally carries mixed information of an entire image which may contain multiple objects and...
Strings are a primary data format in the majority of applications. With the rapid growth of diverse data-driven applications in the current information era, retrieving string data from heterogeneous structured sources becomes more and more significant and challenging. The main concern is that duplicate records are created when data is integrated from heterogeneous sources. Those duplicate records represent the...
Spectral clustering algorithms have seen explosive development over the past years and have been successfully used in data mining and image segmentation. They can handle datasets with arbitrary distributions and are easy to implement. However, they are sensitive to datasets that include clusters with distinctly different densities, and their parameters must be selected carefully. This paper proposes an improved...
With the growth of computer networks, accessible data is becoming increasingly distributed. Understanding and integrating remote and unfamiliar data sources are important data management issues. In this paper, we propose to utilize self-organizing map (SOM) clustering to aid with the visualization of similar columns, and integration of relational database tables and attributes based on the content...
This paper proposes a new similarity measure for content-based image retrieval (CBIR) systems. The similarity measure is based on the multidimensional generalization of the Wald-Wolfowitz (MWW) runs test and the k-means clustering algorithm. Performance comparisons between the proposed method and the current CBIR method based on the MWW runs test were performed, and it can be seen that the proposed...
Due to the rapid development of motion capture technology, more and more human motion databases are appearing. In order to manage human motion databases effectively and efficiently, human motion classification is necessary. In this paper, we propose an ensemble-based human motion classification approach (EHMCA). Specifically, EHMCA first extracts the descriptors from human motion sequences. Then, singular...