Clustering homologous proteins is one of the important tasks in functional genomics. Homologous proteins may share common functions. Annotating proteins of unknown function by transferring annotations from their homologues of known annotations is one of the efficient ways to predict protein function. We use a modularity-based method called CD for grouping together homologous proteins. The method employs...
The isometric feature mapping (Isomap) method has demonstrated promising results in finding low-dimensional manifolds from data points in high-dimensional input space. Isomap has one free parameter (number of nearest neighbours K or neighbourhood radius ε), which has to be specified manually. This paper presents a novel method called Hierarchical Neighbourhood Technique (HNT), in order to obtain a...
With the development of online shopping services, more and more web sites such as Amazon and Dangdang have appeared. To help consumers compare and choose the same merchandise across different web sites, we present AIE, an extractor that pulls images from the result pages of the deep web. This extractor can also get the images from the surface web sites which have some relations...
Image segmentation is a very important process for multimedia applications. Multimedia databases use segmentation for the storage and indexing of images. This paper presents a way to segment images by applying both a clustering method and watershed transformation. It is well known that the major drawback of the watershed transformation method is the oversegmentation phenomenon it produces. For this...
Density-based clustering algorithms are one of the primary methods for data mining. The clusters formed by density clustering are easy to understand, and the approach does not limit itself to particular cluster shapes. Existing density-based algorithms struggle because they cannot find all meaningful clusters when the density varies widely. VDBSCAN is introduced to compensate...
This paper presents an incremental clustering algorithm based on DGC, a density-based algorithm we developed earlier. We experimented with real-life datasets and both methods perform satisfactorily. The methods have been compared with some well-known clustering algorithms and they perform well in terms of z-score cluster validity measure.
As a density-based clustering algorithm, DBSCAN plays an important role in data mining. DBSCAN is normally computationally expensive, which limits its performance on large-scale data sets, especially high-dimensional ones. The high complexity stems from the region queries, a very common operation in density-based algorithms, which brings the complexity of the algorithms to O(n^2),...
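A minimal sketch of why region queries lead to quadratic cost, assuming a naive linear scan with no spatial index. The function names `region_query` and `count_core_points` and the sample points are hypothetical and are not taken from the paper above.

```python
import math

def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (linear scan, O(n))."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def count_core_points(points, eps, min_pts):
    """Issue one region query per point: n queries x n distance checks = O(n^2)."""
    return sum(
        1
        for i in range(len(points))
        if len(region_query(points, i, eps)) >= min_pts
    )

# Hypothetical toy data: three nearby points and one distant outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
neighbours = region_query(points, 0, 1.0)
core = count_core_points(points, 1.0, 3)
```

Each point triggers a full scan of the data set, so the total work grows with the square of the number of points. Replacing the linear scan with a spatial index is the usual way to reduce this cost.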
In recent years, the advent of high-throughput data generation techniques has increased not only the number of objects collected in databases, but also the number of attributes describing these objects. Clustering is the process of grouping the data into classes or clusters, so that objects within a cluster have high similarity in comparison to one another but are very dissimilar to objects in other...
Density-based clustering algorithms are very powerful at discovering arbitrary-shaped clusters in large spatial databases. However, in many cases, clusters of varied local density exist in different regions of the data space. In this paper, a new algorithm, LD-BSCA, is proposed that introduces the concept of local MinPts (a minimum number of points) and the new cluster expanding condition: ExpandConClId (Expanding...
Cluster analysis is a primary method for database mining. Most clustering algorithms require input parameters that are hard to determine but have a significant influence on the clustering result. Furthermore, for many real data sets there is no global parameter setting for which the result of the clustering algorithm accurately describes the intrinsic clustering structure. We introduce...
This paper proposes a new anomaly detection algorithm that can dynamically update the normal profile of a system's usage pattern. The feature used to model the system's usage pattern is program behavior. When the system usage pattern changes, new program behaviors are inserted into old profiles by density-based incremental clustering. Compared to traditional re-clustering updating, it is much more efficiently...
The partition-based K-means algorithm and the density-based DBSCAN algorithm are analyzed. Weighing the advantages and disadvantages of the two algorithms, the improved algorithm DBSK is proposed. Because it partitions the data set, DBSK reduces the memory requirement; the method of computing variable value is put forward; to the uneven data set, because of adopting different variable values...
DBSCAN is one of the most popular algorithms for cluster analysis. It can discover clusters of arbitrary shape and separate out noise. However, the algorithm cannot choose its parameters according to the distribution of the data set. It simply uses the global MinPts parameter, so the clustering result for a multi-density database is inaccurate. In addition, when it is used to cluster large databases, it will...
User-supplied data such as browsing logs, click-through data, and relevance feedback judgements are an important source of knowledge during semantic indexing of documents such as images and video. Low-level indexing and abstraction methods are limited in the manner with which semantic data can be dealt. In this paper and in the context of this semantic data, we apply latent semantic analysis on two...