The process of mining includes various methodologies and data classification is one of the advantageous methods involved in it. It not only eases the process of machine learning but also gives a platform for proper functioning of the process. There are cases wherein the data which is important or unidentified is missed during the process of classification. The process of mining is highly affected...
This paper presents the Shape Movement Pattern (ShaMP) algorithm, an algorithm for extracting Movement Patterns (MPs) from network data, and a prediction mechanism whereby the identified MPs can be used to predict the nature of movement in a previously unseen network. The principal advantage offered by ShaMP is that it lends itself to parallelisation. The reported evaluation was conducted using both Massage...
Pill identification is a serious concern for pharmacists due to similarity of pill appearances. Pill imprints usually contain important information that can be used to add or search for pill information on existing pill databases. However, current techniques for extracting imprints often give results as vectors which cannot be used with existing databases. Thus, this paper proposed an approach for...
Technology has made images a communication medium for humans. Image communication is used today in many fields, such as education, media, healthcare, and other domains. Image retrieval based on user input selection is one of the most powerful techniques and has been an active research direction for the past couple of years. Various features are used for image retrieval. Most retrieval techniques use image...
Breast cancer is one of the leading causes of death for women today, and it is the most common cancer in developed countries. The cause and degree of breast cancer are closely associated with the malfunctions of its tissues and cells. It is a very hard and rigorous task for doctors to observe the clinical records of many affected patients and regulate the therapy manually. Therefore, it is...
The score in the art exam, part of the college entrance examination, is very useful and helpful. Thanks to the exam management system, we obtained the data easily, and a huge amount of data has accumulated. We analyzed and visualized the data through cluster analysis and correlation analysis, and obtained the relationship between the number of students and age, subject, and position. Finally we visualize the analyzed...
We propose a novel supervised initialization scheme for cascaded face alignment by searching nearest neighbors based on global image descriptors. Unlike existing schemes, which resort to additional large training data sets for learning features, our method does not require additional training steps, which keeps its computational cost low. Moreover, we found that it is sufficient to use a simple low-dimensional...
Recognition of spatial relations between pairs of subexpressions is a key problem of recognition of handwritten mathematical expressions. Most methods for spatial relation classification are based on handcrafted rules and geometric indices extracted from the subexpression bounding boxes. In this work, we propose new spatial relation features that combine subexpression bounding box and intra-subexpression...
This paper proposes an original method for extracting the centerline of 3D objects given only partial mesh scans as input data. Its principle relies on the construction of a normal vector accumulation map built by casting digital rays from input vertices. This map is then pruned according to a confidence voting rule: confidence in a point increases if this point has maximal votes along a ray. Points...
This paper introduces three interpolation methods that enrich complex evolving region trajectories that are captured every day from numerous ground-based and space-based solar observatories. The interpolation module takes a trajectory as its input and generates an enriched trajectory with interpolated time-geometry pairs. We created three different interpolation techniques: MBR-Interpolation...
The Levy Walk (or Levy flight) is a concept from Biomathematics to describe the hunting behaviour of many predatory species. It is a very efficient way to find prey in a very short time frame. We now want to use this concept in a clustering context to, if you will, "hunt" for clusters. We describe how we convert this concept into an efficient way to find cluster centres by linking the data...
Thanks to the rise of wearable and connected devices, sensor-generated time series comprise a large and growing fraction of the world's data. Unfortunately, extracting value from this data can be challenging, since sensors report low-level signals (e.g., acceleration), not the high-level events that are typically of interest (e.g., gestures). We introduce a technique to bridge this gap by automatically...
In this paper, we present a dynamic clustering algorithm that efficiently deals with data streams and achieves several important properties which are not generally found together in the same algorithm. The dynamic clustering algorithm operates online in two different time-scale stages, a fast distance-based stage that generates micro-clusters and a density-based stage that groups the micro-clusters...
Traditional CFSFDP (Clustering by fast search and find of density peaks) recognizes clusters well regardless of their shape and of the dimensionality of the space in which they are embedded. But when a large-scale dataset is processed, it takes too long to calculate the distance between two data points. In this paper, we present a novel MapReduce-based CFSFDP clustering algorithm called...
The class of density-based clustering algorithms excels in detecting clusters of arbitrary shape. DBSCAN, the most common representative, has been demonstrated to be useful in a lot of applications. Still the algorithm suffers from two drawbacks, namely a non-trivial parameter estimation for a given dataset and the limitation to data sets with constant cluster density. The first was already addressed...
One of the more challenging real-world problems in computational intelligence is to learn from non-stationary streaming data, also known as concept drift. Perhaps an even more challenging version of this scenario is when, following a small set of initial labeled data, the data stream consists of unlabeled data only. Such a scenario is typically referred to as learning in initially labeled nonstationary...
This paper extends our previous work on deriving meaningful storm patterns from very large rainfall data. In an earlier work, we described MapReduce-based algorithms to identify three types of storms: local, hourly and overall storms. In general, local storms have temporal characteristics of the storms at a particular site, hourly storms have spatial characteristics of the storms at a particular...
We propose a framework for Twitter events detection, differentiation and quantification of their significance for predicting spikes in sales. In previous approaches, the differentiation between Twitter events has mainly been done based on spatial, temporal or topic information. We suggest a novel approach that performs clustering of Twitter events based on their shapes (taking into account growth...
In this paper, we present a new approach of distributed clustering for spatial datasets, based on an innovative and efficient aggregation technique. This distributed approach consists of two phases: 1) local clustering phase, where each node performs a clustering on its local data, 2) aggregation phase, where the local clusters are aggregated to produce global clusters. This approach is characterised...
The number of devices capable of measuring Power Quality (PQ) parameters is increasing continuously at all voltage levels. Consequently, the amount of available PQ data is also growing very fast. These data contain a lot of valuable information about the behavior of PQ, but up to now they are in most cases used only to assess compliance with limits (e.g. EN 50160 in Europe). Beside long-term characteristics...