Clustering is one of the fundamental data mining procedures. Bisecting K-means (BKM) clustering has been shown to have higher computing efficiency and better clustering quality than the basic Lloyd version of K-means clustering. Elkan's method of utilizing the triangle inequality significantly reduces distance calculations, and is applicable to each K-means iteration without affecting...
Pairwise link discovery approaches for the Web of Data do not scale to many sources thereby limiting the potential for data integration. We thus propose a holistic approach for linking many data sources based on a clustering of entities representing the same real-world object. Our clustering approach utilizes existing links and can deal with entities of different semantic types. The approach is able...
Ceph is an open source distributed file system. Based on two methods, the command line and the librados library, we implement file read/write (also called download/upload) algorithms. For the librados method, we apply two different multi-threaded algorithms to optimize file reads/writes. The results show that the performance of multi-threaded algorithms of downloading/uploading small files...
Discovery of useful patterns from human movement behavior can convey valuable knowledge to a variety of critical applications. Existing approaches focus on outdoor group discovery and mainly consider objects who belong to the same cluster as a possible group, which leads to the inability to discover all the existing groups. This is especially true for indoor human-generated trajectories, where spatially...
This paper proposes a cluster-based, distributed scheme for on-line testing of short faults in NoC interconnects. The proposed scheme detects both intra- and inter-interconnect short faults and identifies faulty interconnect wires at a node. Nodes are then scheduled such that the proposed scheme offers a comparatively lower test time. We further see that the proposed scheduling suggests the same time for larger...
Clustering is useful for discovering underlying groups and identifying interesting patterns in scientific data and engineering systems. Affinity propagation (AP) is an effective clustering algorithm which has been successfully applied to broad areas of computer science. To generate high quality clusters, AP iteratively performs information propagation on the full similarity matrix and requires excessive...
The success of any Information Retrieval system relies upon extracting relevant pages of similar knowledge matching the requirements of the user. Even the best traditional statistical methodologies fail to overcome the issues of relevancy and redundancy in the retrieved web pages. In this paper we propose a novel architecture, FP Growth based Fuzzy Particle swarm optimization which captures the dynamicity...
In this paper, a self-constructing neuro-fuzzy (SCNF) classifier optimized by swarm intelligence technique is proposed for breast cancer diagnosis. The first step in the design is the definition of the fuzzy network structure. Accordingly, a rule generation approach with self-constructing property is proposed. Based on similarity measures, the given input-output patterns are organized into clusters...
Public health is one of the major concerns worldwide. Toxicology is an extremely challenging field, given that toxic substances are harmful to human health. In fact, toxicology studies are indispensable to evaluate the toxic effects on humans. Recently, researchers have developed a new evaluation technique based on the analysis of dendritic cells in vitro. This analysis that remains...
Hyperspectral image clustering is commonly applied for unsupervised classification. However, the clustering results of traditional methods are not sufficient seeing the nature of the image as a data cube with high dimensionality. In addition, the complex relations between spatial neighboring pixels are not considered in traditional methods. In this paper the fuzzy c-means clustering is revisited and...
In software projects, there is a data repository which contains the bug reports. These bugs must be carefully analysed and resolved. Handling these bugs manually is an extremely time-consuming process, and it can delay the resolution of some important bugs. To overcome this problem, researchers have introduced many techniques. One of the commonly used algorithms...
Image segmentation is one of the essential tasks in the field of computer vision. This paper proposes a new image segmentation approach based on Fuzzy C Means (FCM) and Ant Lion Optimization (ALO). FCM has the ability to represent ambiguous information in a more robust way. Bio-inspired algorithms such as ALO have the ability to find optimal parameters in search spaces. These characteristics of FCM...
Preserving the content of historic handwritten manuscripts is important for a variety of reasons. Meanwhile, digital libraries are rapidly expanding and thus make it easier to store this information directly in digital form. For digitising text documents, a crucial step is to binarise the captured images to separate the text from the background. In this paper, we propose an effective approach...
This work introduces a hard clustering algorithm based on the Particle Swarm Optimization (PSO) metaheuristic that is able to partition objects considering their relational descriptions given by a single dissimilarity matrix. PSO is a population-based metaheuristic that is well known for its simplicity and good performance, and it has already been designed as a clustering algorithm for vector data. The proposed...
In this study, we propose a hybrid knowledge-based framework for author name disambiguation. The developed approach helps incrementally identify authors of documents in data acquired from various sources. The nature of the problem calls for an orchestrated use of several methods; thus, the framework is composed of two levels. The first level contains a rule-based disambiguation algorithm. The second...
This paper describes a method for extracting relevant tokens of entity from semi-structured administrative documents. This method is used for mislabeling correction by employing the entity tokens physically close in a document. Firstly, the entities are labeled. Secondly, each entity is modeled by a tokens structure graph in which the nodes represent the tokens and the arcs represent the distances...
Eye gaze patterns or scanpaths of subjects looking at art while answering questions related to the art have been used to decode those tasks with the use of certain classifiers and machine learning techniques. Some of these techniques require the artwork to be divided into several Areas or Regions of Interest. In this paper, two ways of clustering the static visual stimuli - k-means and the density...
With the rapid development of uncertain and large-scale datasets, Fuzzy Possibilistic C-means Clustering (FPCM) and Granular Computing (GrC) were introduced together with the aim to solve the feature selection and outlier detection problems. Utilizing the advantages of the FPCM and GrC, an Advanced Fuzzy Possibilistic C-means Clustering based on Granular Computing (GrFPCM) was proposed to select features...
This paper presents a machine learning approach to explore the phenetic relations of historical scripts and their glyphs. Its first step is the identification of the observable topological transformations in the development of the glyphs, and with the use of these transformations, the method collects the possible cognate glyphs by minimizing the necessary topological transformations between the glyphs...
Dichromats (humans with red-green color blindness, as considered in this paper) cannot always perceive meaningful visual information due to deficiencies in their eye cells. Stroke information extraction from a color-blindness image (CBI) is presented in order to deliver direct and effective information to dichromats. A CBI is first transformed into the pattern-highlighted image by means of color component...