Clustering is among the most common data mining techniques, and fuzzy clustering can model the world more realistically and more precisely. One of the most popular fuzzy clustering methods is the Fuzzy C-Means (FCM) algorithm, which is essentially the (original) K-Means clustering algorithm with a fuzzy flavor. However, there are some issues with the fuzzy clustering methods;...
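For orientation, the textbook FCM iteration mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard algorithm, not the method the paper proposes; the function names, the fuzzifier default `m=2.0`, the fixed iteration count, and the `eps` guard against zero distances are all assumptions made for the example.

```python
import math


def fcm_memberships(points, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: one row per point, one column per center."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # eps avoids division by zero when a point coincides with a center.
        dists = [max(math.dist(p, c), eps) for c in centers]
        row = [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: mean of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations (assumed stop rule)."""
    for _ in range(iterations):
        memberships = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, memberships, m)
    return centers, fcm_memberships(points, centers, m)
```

Unlike K-Means, each point receives a membership degree in every cluster, and each row of memberships sums to 1. Calling `fuzzy_c_means(points, initial_centers)` returns the final centers together with that membership matrix.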
Nearest-neighbor density estimator methods usually suffer from problems such as a high time complexity of O(n²) and a high memory requirement, especially when indexing is used. These problems limit their application to small datasets. In this paper a new method is proposed that calculates distances to nearest and farthest neighbor nodes to make dataset subgroups; therefore, computational...
Nowadays many crimes are related to the financial domain, so forensic analysis of the associated documents is required. Due to digitization, investigation of many documents is faster. If an analyst examines the documents manually, it is a time-consuming and tedious task, so we follow an approach that applies a clustering algorithm to documents for forensic analysis of a seized system, which will help the...
Although the fuzzy c-means algorithm has shown great capability on spherical clusters, it does not yet perform well on non-spherical data sets. To deal with this problem, kernel-based fuzzy clustering has been proposed, which maps data points into a high-dimensional Hilbert space with kernel functions. However, the computational complexity of the kernel matrix is always quadratic, which usually makes kernel...
Community structure is a common feature of real-world networks. Overlapping community detection is an important method for analyzing the topological structure and function of a network. Most algorithms are based on the network structure, without considering the node attributes. In this paper, we propose an overlapping community detection algorithm based on node convergence degree which combines the network topology...
Iterative SpMV (ISpMV) is a key operation in many graph-based data mining and machine learning algorithms. With the development of big data, the matrices can be so large, perhaps billion-scale, that the SpMV cannot be implemented on a single computer. Therefore, implementing and optimizing SpMV for large-scale data sets is a challenging issue. In this paper, we used an in-memory...
Data mining has gained much importance in research these days. It is well suited to analyzing data from any field and providing decision-based output. Data is now generated and stored at high speed. Non-stationary systems play a major role in providing such data. The availability of such data creates scope for analysis by researchers. Such data, which are continuous, unbounded,...
As the data volume of a single XML document grows ever larger, a core problem of XML data query technology is how to efficiently compute results that satisfy given query semantics, and the labeling used for the document tree of an XML document is the key factor affecting query efficiency. In this paper, all experimental processing is completed through MapReduce...
High performance computing (HPC) means the aggregation of computational power to increase the ability to process large problems in science, engineering, and business. HPC on the cloud allows on-demand HPC tasks to be performed by high performance clusters in a cloud environment. The connection structure of the nodes in HPC clusters should provide fast internode communication. It is important that scalability...
Capsule endoscopy (CE), introduced as a modality for non-invasive examination of the entire gastrointestinal tract, demands an efficient computer-aided decision-making system to relieve the physician of the responsibility of screening around 60,000 video frames per patient. An automatic and robust segmentation algorithm can aid the automation of the CE screening and decision-making procedure. In this...
This paper proposes an approach using a MapReduce-based Rocchio relevance feedback algorithm, which improves on the traditional Rocchio algorithm in the MapReduce paradigm, to resolve the problem of massive information filtering. Traditional text classification algorithms have a vital impact on information filtering.
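As background, the traditional Rocchio relevance feedback update that this abstract builds on can be sketched as below. This shows only the classic single-machine formula, not the paper's MapReduce variant; the function name, the sparse dictionary representation, the default weights `alpha`, `beta`, `gamma`, and the dropping of non-positive term weights are assumptions made for illustration.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio update on sparse term-weight vectors (dicts).

    new_query = alpha * query
              + beta  * centroid(relevant)
              - gamma * centroid(non_relevant)
    """
    updated = defaultdict(float)
    for term, weight in query.items():
        updated[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue  # an empty feedback set contributes nothing
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Assumed convention: terms whose weight is not positive are dropped.
    return {term: weight for term, weight in updated.items() if weight > 0}
```

The update moves the query vector toward the centroid of the documents judged relevant and away from the centroid of those judged non-relevant. Both centroids are sums over documents, which is the part that a map and reduce step could in principle distribute.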
Feature detection or extraction for polygon meshes is one of the most fundamental and extensively used techniques. Applications such as mesh simplification, smoothing, parameterization, segmentation, morphing, and shape matching require feature detection. A common technique is to identify the edge features through the dihedral angles of the two neighbouring faces adjacent to the edge. In this...
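The common dihedral-angle technique described in this abstract can be sketched as follows. This is a minimal illustration of that baseline test, not the method the paper goes on to propose; the function names, the 30-degree threshold, and the assumption that both triangles are consistently wound and non-degenerate are choices made for the example.

```python
import math


def face_normal(a, b, c):
    """Unit normal of triangle (a, b, c); assumes a non-degenerate triangle."""
    u = tuple(b[i] - a[i] for i in range(3))
    v = tuple(c[i] - a[i] for i in range(3))
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(x * x for x in n))
    return tuple(x / length for x in n)


def angle_between_faces(tri1, tri2):
    """Angle in degrees between the normals of two adjacent triangles.

    Assumes consistent winding; 0 means the faces are coplanar.
    """
    n1 = face_normal(*tri1)
    n2 = face_normal(*tri2)
    dot = sum(x * y for x, y in zip(n1, n2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def is_feature_edge(tri1, tri2, threshold_deg=30.0):
    """Flag the shared edge as a feature when the faces bend past the threshold."""
    return angle_between_faces(tri1, tri2) > threshold_deg
```

Two coplanar triangles give an angle of 0 degrees and are not flagged, while a right-angle fold gives 90 degrees and is flagged. The threshold controls how sharp a crease must be to count as a feature.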
A virtual learning community is a network-based learning environment and a new type of learning organization. However, the teaching data in a virtual learning community is often messy and fragmentary, and its value is often difficult to detect. Using data mining techniques appropriately to deal with the data will give us an analysis that achieves twice the result with half the effort...
Mobile crowdsensing (MCS) is a new paradigm which takes advantage of pervasive mobile devices to collaboratively collect data and analyze physical phenomena. As mobile devices are owned and controlled by individuals with various capabilities and intentions, a main challenge MCS applications face is to ensure the credibility of the crowd-contributed data. Existing works attempt to increase confidence...
Image segmentation is a critical issue in vision processing. Segmenting images more efficiently is still a challenging problem. Many techniques have been proposed so far, but none is effective for every kind of image. The comprehensive review has shown that the use of fuzzy enhancement is not considered in the gray stretch based method. The existing technique provides poor...
Detection of dental plaque is important for doctors, patients, and researchers. Plaque is a microbial biofilm that continuously forms on the tooth surface; it later reacts with food materials that contain higher concentrations of sugars and starches and then releases acids. Finally it attacks the tooth enamel, causing gingivitis and other diseases if proper treatment is not given. Though...
Image matching is one of the active areas of research in today's world. There exist many ways in which two images can be matched. When the size of the image is very large, matching incurs more computational time and cost. In this paper, an efficient algorithm to match large images is proposed. This algorithm works by generating key features using the SIFT algorithm. These SIFT features are then clustered using...
Nowadays, it is widely accepted that exploiting all forms of parallelism is the only way to significantly improve performance. The three major forms of parallelism on a modern processor are ILP, DLP, and TLP, which are not mutually exclusive. To gain further performance improvements, MPI can be used on a cluster of computers. This paper exploits the capabilities of distributed multi-core Intel processors...
With the amount of data increasing rapidly, how to improve the scalability of nonlinear clustering has become a crucial and challenging problem. In this paper, we design an efficient parallel nonlinear clustering algorithm by using a four-stage MapReduce framework. In our approach, we need to compute two quantities based on distance matrices, which, however, are difficult to compute in a MapReduce...
Investigating insider threat cases is challenging because activities are conducted with legitimate access that makes distinguishing malicious activities from normal activities difficult. To assist with identifying non-normal activities, we propose using two types of pattern discovery to identify a person's behavioral patterns in network data. The behavioral patterns serve to deemphasize normal behavior...