Biological signals collected by a multi-electrode array are contaminated by heavy noise. Quickly and accurately classifying the original action potentials from the measured noisy signals is the basis of research in neuroscience. In this paper, we analyze the characteristics and shortcomings of the Wave-clus sorting algorithm and present a novel sorting algorithm to solve the...
Clustering is an important tool for analyzing gene expression data, and many clustering algorithms have been proposed for this purpose. In this article, we cluster real-life gene expression data with K-means, one such algorithm. We also propose a new method for determining the initial cluster centers for K-means. We have compared results of our method...
Clustering is an important unsupervised data analysis technique, which divides data objects into clusters based on similarity. Clustering has been studied and applied in many different fields, including pattern recognition, data mining, decision science and statistics. Clustering algorithms can be mainly classified as hierarchical and partitional clustering approaches. Partitioning around medoids...
Web spam is a major problem for search engine users on the World Wide Web. Spam websites use deceptive techniques to achieve high rankings. Although many researchers have presented different approaches for classification and web spam detection, it remains an open issue in computer science. Analyzing and evaluating these websites can be an effective step for discovering and categorizing the features of these websites...
Current hierarchical clustering algorithms face the risk of privacy leakage during the clustering process for big datasets. Differential privacy is a relatively recent development in the field of privacy-preserving data mining that offers more robust privacy guarantees. In this paper, the BIRCH algorithm under differential privacy is studied and analyzed. First, the Diff-BIRCH algorithm, which directly...
Association rule mining is an essential data mining technique in many fields. The enormous growth of information demands increased computational power. To address this issue, it is important to study the execution of mining algorithms. Finding frequent itemsets is a vital issue in numerous data mining applications. Many algorithms exist to extract...
Semi-supervised clustering is one of the vital tasks and aims at grouping the data objects into classes (clusters) such that the similarity of objects within clusters is high and the similarity of objects between clusters is low. The dataset may sometimes be of a mixed nature, that is, it may consist of both numeric and categorical types of data. So two types...
The World Wide Web has been considered the most important information store in recent years. Web development expands to a great extent with new technologies. Search engines become ineffective as the number of documents on the web multiplies. Likewise, most of the results retrieved for a query are not related to what the user was looking for. The documents on the web are varied and flexible, and there are tough...
Clustering is one of the prime topics in data mining. Clustering partitions the data and classifies it into meaningful subgroups. Document clustering organizes a set of documents into groups such that different groups show different characteristics with respect to similarity. In this paper, an experimental exploration of a similarity-based method, HSC, for measuring the similarity between data objects, particularly...
This paper proposes an improved KNN algorithm to overcome the class overlapping problem when the class distribution is skewed. Different from the conventional KNN algorithm, it not only finds out the k nearest neighbors of each sample (even the test object itself) in the training dataset, but also the neighbors of the unknown test object. Then the validity value of a data point is computed based on...
With the development of cloud computing technology, there are many scientists who want to perform their experiments in cloud environments. Because of the pay-per-use method, it is cost-optimal for scientists to only pay for the cloud services needed for their experiments. However, selection of suitable resources is difficult because they are composed of various characteristics. Therefore, a method...
One of the most important stages of computer-aided engineering is commutation circuit partitioning. It requires the designer to make decisions in weakly formalized domains. During design, a decision maker faces many problems, one of which is choosing a method for solving the partitioning problem that provides an effective solution. This problem belongs to the NP-hard and NP-complete...
A technique and algorithms for early detection of an ongoing attack and subsequent blocking of malicious traffic are proposed. The primary separation of mixed traffic into trustworthy and malicious traffic was carried out using cluster analysis. Newly arrived requests were classified using different classifiers with the help of the obtained training samples and the developed success criteria.
Web recommendation systems help overcome information overload on the web by retrieving the information required by the user with respect to the user's or similar users' preferences and interests. For a web recommendation system to work, web users have to be clustered based on their common interests. The web user clusters are used to obtain knowledge about the web pages accessed. This...
In traditional text sentiment analysis methods, the text feature vector suffers from high dimensionality and high sparseness. To address this, we can cluster similar words together and use the generated clusters as new dimensions so that the dimension of the text feature vector is reduced. By using the Word2Vec tool and the K-means clustering algorithm, this task can be completed...
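The idea of replacing per-word features with per-cluster features can be illustrated with a minimal sketch. It is not the paper's implementation: the 2-D vectors in `WORD_VECTORS` are hypothetical stand-ins for Word2Vec embeddings, the initial centers are chosen by hand, and the helper names (`kmeans`, `cluster_features`) are made up for illustration.

```python
from collections import Counter

# Hypothetical 2-D word vectors standing in for real Word2Vec embeddings.
WORD_VECTORS = {
    "good": (0.9, 0.1), "great": (1.0, 0.2), "nice": (0.8, 0.0),
    "bad": (-0.9, 0.1), "awful": (-1.0, 0.0), "poor": (-0.8, 0.2),
}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: squared_distance(point, centers[i]))

def kmeans(points, centers, iterations=10):
    """Plain Lloyd's K-means with caller-supplied initial centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group; keep the old one if empty.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else old
            for g, old in zip(groups, centers)
        ]
    return centers

def cluster_features(tokens, centers):
    """Count how many known tokens fall into each word cluster."""
    counts = Counter(
        nearest(WORD_VECTORS[t], centers) for t in tokens if t in WORD_VECTORS
    )
    return [counts.get(i, 0) for i in range(len(centers))]

centers = kmeans(list(WORD_VECTORS.values()), centers=[(1.0, 0.0), (-1.0, 0.0)])
features = cluster_features("good great movie bad".split(), centers)
```

With six vocabulary words grouped into two clusters, every document is described by a vector of length two instead of six, which is the dimensionality reduction the abstract refers to.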
We often take inspiration from nature in solving complex scientific and day-to-day problems. Nature-inspired computing is a branch of computer engineering that deals with the development of algorithms simulating the behaviors of natural species for solving complex problems not easily solvable by available computational models. Based on biological systems, various algorithms have been presented in...
Social networking portals serve as an ideal platform for a person or an organization to accomplish self-presentation and self-enhancement goals and thereby understand their social relevance. Hence, there have been many studies attempting to identify the relationship between different aspects of social media articles. Machine learning methods play a critical role in social media data analytics...
This paper presents a novel adaptive resampling algorithm, named DP-SMOTE, based on the clustering by fast search and find of density peaks (CFSFDP) algorithm and the synthetic minority oversampling technique (SMOTE). The essential idea of the proposed method is to use the improved CFSFDP algorithm to find the subclasses and remove noisy data automatically, and then to generate the minority samples...
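For orientation, the interpolation step of standard SMOTE can be sketched as follows. This shows only the generic technique, not DP-SMOTE: the CFSFDP clustering and noise removal are omitted, the neighbor is assumed to be already chosen, and `smote_sample` and the toy data are hypothetical.

```python
import random

def smote_sample(sample, neighbor, rng):
    """Return a synthetic point on the segment between two minority samples."""
    gap = rng.random()  # uniform in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]

rng = random.Random(0)                  # seeded for repeatability
minority = [[1.0, 2.0], [2.0, 4.0]]     # toy minority-class samples
synthetic = smote_sample(minority[0], minority[1], rng)
```

The synthetic point always lies between the original sample and its neighbor, so new minority samples stay inside the region already occupied by the minority class.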
Voltage is one of the measurement standards of power quality. The issue of low voltage in distribution networks seriously affects social and economic development and people's lives. Consequently, it is important to optimize the low-voltage investment program, clarify the direction of low-voltage investment, and provide decision support for the management of low voltage. Canopy-Kmeans is one of the widely used classical partition...
The fruit fly optimization algorithm (FOA) is a new method for global optimization based on the food-finding behavior of the fruit fly. The original FOA can only solve problems whose optimal solutions lie in the vicinity of zero. To make FOA more universal for continuous optimization problems, especially those whose optimal solutions are not zero, this paper proposes a hybrid fruit fly...