Process frameworks capture the main logic of a business. Extracting and filling frameworks can formulate a new process efficiently and accurately. To guide the construction of new business processes with frameworks, an intuitive solution based on the similarity analysis of process diagrams is proposed. In this solution, the process diagrams are abstracted into process trees to get process...
Datasets obtained through recently advanced measurement techniques tend to possess a large number of dimensions. This leads to explosively increasing computation costs for analyzing such datasets, thus making formulation and verification of scientific hypotheses very difficult. Therefore, an efficient approach to identifying feature subspaces of target datasets, that is, the subspaces of dimension...
Many organizations generate huge amounts of data and use this data for their own benefit. Data mining extracts useful knowledge from such data. Association rule mining is a powerful technique for finding hidden patterns in large databases. A limitation of mining association rules is that sensitive rules can reveal sensitive patterns. It is necessary to hide sensitive rules...
In data mining, clustering is a technique for grouping similar objects with common properties into clusters. The k-means algorithm is a basic clustering technique and the most widely used algorithm for diverse applications. This paper studies and analyses the efficiency of extending the k-means results of a perfect sample set to different sets by using Z-test properties; this is based on the...
Crimes are a social nuisance and cost our society deeply in several ways. Any research that can help in solving crimes quickly will pay for itself. About 10% of the criminals commit about 50% of the crimes [9]. The system is trained by feeding it previous years' crime records taken from a legitimate online portal of India listing various crimes such as murder, kidnapping and abduction, dacoity, robbery,...
Data mining is an advanced technology for discovering actionable information in large data sets: it analyzes large volumes of data and extracts patterns that can be converted into useful knowledge. Medical data mining has great potential for exploring the hidden patterns in data sets from the medical domain. These patterns can be utilized to do clinical diagnosis...
Spatiotemporal data mining is gaining importance with the increasing availability of spatiotemporal datasets in tremendous amounts. This work introduces an algorithm that identifies clusters with respect to the spatial and temporal properties of objects. Ordinary density-based algorithms like DBSCAN consider only spatial properties and take the spatial distance parameter and number of minimum...
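The abstract above is truncated, so the following is only a minimal sketch of how a density-based neighborhood test might be extended from spatial to spatiotemporal data. It is not the authors' algorithm. The point layout `(x, y, t)` and the names `eps_space`, `eps_time` and `min_pts` are assumptions made for illustration.

```python
import math

def st_neighbors(points, i, eps_space, eps_time):
    """Return indices of points close to point i in both space and time.

    A point counts as a neighbor only if its Euclidean distance is at most
    eps_space AND its time difference is at most eps_time (both hypothetical
    parameters). The point itself is included, as in DBSCAN.
    """
    x, y, t = points[i]
    return [
        j
        for j, (xj, yj, tj) in enumerate(points)
        if math.hypot(xj - x, yj - y) <= eps_space and abs(tj - t) <= eps_time
    ]

def is_core(points, i, eps_space, eps_time, min_pts):
    """A point is a core point if it has at least min_pts neighbors."""
    return len(st_neighbors(points, i, eps_space, eps_time)) >= min_pts

# Toy data: three points close in space and time, one close in space only,
# and one far away in space.
points = [
    (0.0, 0.0, 0),
    (0.5, 0.0, 1),
    (0.0, 0.5, 2),
    (0.2, 0.2, 50),
    (9.0, 9.0, 1),
]

print(st_neighbors(points, 0, eps_space=1.0, eps_time=5))
print(is_core(points, 3, eps_space=1.0, eps_time=5, min_pts=3))
```

In this toy data the fourth point is spatially close to the first three but far from them in time, so a purely spatial neighborhood would group it with them while the combined test does not.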
Opinion mining is one of the newer concepts in data mining. The rapid growth of the World Wide Web has resulted in an enormous increase in online communication. Online communication data consist of feedback, comments and reviews on particular topics that are posted on the internet by its users. Sentiment analysis is a sub-domain of opinion mining where the analysis is focused on the extraction...
Data mining can help bank managers effectively analyze and predict customer churn in the era of big data. After analyzing the reasons for bank customer churn and the defects of the FCM algorithm as a data mining algorithm, a new method of calculating the effectiveness function to improve the FCM algorithm is proposed. At the same time, it has been applied to predict bank customer...
Cities are complex systems evolving constantly. Thus, it is necessary to improve the way we collect intra-urban data in order to quantify such evolution. We propose a methodology to transform geo-located tweets into labels for different areas of a given city using DBPedia, Wikipedia and Foursquare categories. We conduct experiments using 77K geolocated tweets posted in Milan during November and December...
Location Based Services (LBS) have become extremely popular over the past decade, being used on a daily basis by millions of users. Instances of real-world LBS range from mapping services (e.g., Google Maps) to lifestyle recommendations (e.g., Yelp) to real-estate search (e.g., Redfin). In general, an LBS provides a public (often web-based) search interface over its backend database (of tuples...
Fuzzy clustering of time series in real-world applications is one of the main challenges in data mining. The reason lies in the characteristics of time-series data, which include high dimensionality, large volume and temporal ordering. So far, many studies have addressed issues such as the high dimensionality of time-series data and applying a different effect of each dimension...
Clustering is a classical unsupervised learning task that aims to divide a data set into several groups of similar objects. The clustering problem has been studied for many years, and many excellent clustering algorithms have been proposed. In this paper, we propose a novel clustering method based on density, which is simple but effective. The primary idea of the proposed method is given as follows...
Social networking websites allow us to understand users' interests and behavior patterns regarding various travel and tourism services, especially travel attractions and points of interest, which can be exploited to recommend personalized lists of places to users. The major challenge faced by travel and tourism recommendation systems is to understand the implicit relationships that exist between the user...
Frequent itemset mining is one of the most investigated fields of data mining. Mining frequent itemsets from a large-scale data set is expensive. In particular, when data is added to the data set, re-computing over the complete data set from scratch to update its frequent itemsets is still time-consuming. Aiming to improve the performance of frequent itemset mining for large...
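To make the cost mentioned above concrete, here is a minimal brute-force sketch of what computing frequent itemsets from scratch involves. It is not the method of the truncated abstract. The function name `frequent_itemsets`, the `min_support` threshold, the `max_size` cap and the sample transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets meeting min_support.

    Support is the fraction of transactions containing the itemset.
    Every transaction is scanned and every itemset of up to max_size
    items is counted, which is why a full recomputation after each
    update becomes expensive on large data sets.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    total = len(transactions)
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }

# Hypothetical toy transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Appending a single transaction changes the denominator for every itemset, so this naive version has to rescan everything, which is the inefficiency that incremental approaches try to avoid.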
The Web contains a colossal volume and variety of information, so the relevant information has to be extracted from it. Different techniques and tools are used to extract information, such as DOM parsers, fuzzy algorithms, tag ratios and many other template-dependent approaches, as users are concerned with relevant information. In our proposed framework information extraction is performed...
A Distributed Denial of Service (DDoS) attack is a congestion-based attack that makes both network and host-based resources unavailable to legitimate users by sending flooding attack packets to the victim's resources. The absence of predefined rules for correctly identifying genuine network flows makes DDoS attack detection very difficult. In this paper, a combination of unsupervised...
Random sampling could enhance classification performance by selecting many representative samples to be included in the training dataset. The representative samples usually include the samples located at the border of each class or cluster. In this paper, a new sampling algorithm is proposed which forces the training sample to include the border points between classes. Considering a point...
The clustering of customer transaction data is very important to retail and e-commerce companies. The authors propose a local PurTree spectral clustering algorithm for massive customer transaction data that uses a purchase tree to represent customer transaction data and a PurTree distance to compute the distance between two trees. The new method learns a data similarity matrix from the local distances...
With the continuous growth of micro-blog services, Sina Weibo is increasingly found in the daily lives of ordinary Chinese individuals. More than one hundred million tweets are posted on Sina Weibo every day. By analyzing this mass of data in a timely manner, media companies could learn how to generate buzz for new films, famous stars, or fashion shows more effectively. However, how to predict which topics will...