Dealing with high-dimensional data is a challenging and computationally complex task in the data pre-processing phase of text clustering. Conventionally, union and intersection approaches have been used to combine the results of different feature selection methods to optimize the relevant feature space for a document collection. The union method selects all features from the considered sub-models, whereas intersection...
Recognition of human actions by using wearable sensors has become an important research field. Segmentation of sensor data is a vital issue in reconstructing and understanding human daily actions, and strongly affects the accuracy of human action recognition. Traditional online segmentation approaches are mostly designed for one-dimensional sensor data, which greatly limits these approaches to multi-dimensional...
Finding appropriate web APIs to develop mashup services is becoming difficult because of the increasing number of web APIs offered by different sources. If we can recommend relevant web APIs for a mashup service based on its requirements, it will help software developers find suitable APIs easily instead of searching through thousands of web APIs. Although there are many existing methods to recommend...
To extract key topics from news articles, this paper investigates an efficient way to construct text vectors and to improve the efficiency and accuracy of document clustering based on the Word2Vec model. This paper proposes a novel algorithm, which combines the Jaccard similarity coefficient and inverse dimension frequency to calculate the importance degree between each dimension...
Network anomaly detection plays an important part in network security. Among the state-of-the-art approaches, unsupervised anomaly detection is effective when dealing with unlabelled data. However, these approaches also suffer from a high false positive rate. We observed that different methods have their own defects and advantages. Inspired by this observation, we provide a new ensemble clustering (NEC)...
With the development of the Internet, detecting web-based anomalies has become vital for Internet security. Clustering based on manual feature extraction has been verified as a significant way to detect new anomalies. However, the representations of these features cannot express the semantic information of the URLs. In addition, few studies try to cluster the anomalies into specific types like SQL-injection...
The advancement of smartphones with various types of sensors has enabled us to harness diverse information with crowd sensing mobile applications. However, traditional approaches suffer from drawbacks such as high battery consumption as a trade-off for obtaining high-accuracy data using a high sampling rate. To mitigate the battery consumption, we proposed a low-sampling point of interest (POI) extraction framework,...
Object clustering is a very challenging unsupervised learning problem in machine learning and pattern recognition. In this paper, we study the visual object pattern clustering problem by combining the k-means clustering algorithm and binary sketch templates, which quantify each image by a vector of indicators showing whether a sketch at a certain location, scale, and orientation exists. This...
In this paper, an approach for an industrial machine vision system is introduced for effective maintenance of inventory in order to minimize the production cost in a supply chain network. The objective is to propose an efficient technique for object identification, localization, and report generation to monitor the inventory level in a real-time video stream based on the object appearance model. The appearance...
In this article we propose a method to refine the clustering results obtained with the nonnegative matrix factorization (NMF) technique, imposing consistency constraints on the final labeling of the data. The research community focused its effort on the initialization and on the optimization part of this method, without paying attention to the final cluster assignments. We propose a game theoretic...
Community detection is an important approach to identifying community structure in a network and can also be considered as graph clustering. This paper investigates community detection using combined topological and topical features in Twitter. The combined features were compared to topological-only and topical-only features. The topological features that were used are following-follower relationship...
This paper introduces an automatic classification of mammogram images, categorizing them as malignant or normal after segmenting the suspected region. Fuzzy and fuzzy soft set approaches have been used successfully to deal with diverse uncertainties, imprecision and vagueness in data. We have advocated a method of fuzzy soft set using a fuzzy soft aggregation operator for solving the problem. The proposed...
With the rapid growth of Internet consumption, the varied forms of product comments and the redundant information they contain make it hard for customers to grasp the hot opinions in historical comments. In view of this, this paper studies the hot opinions in product comments, taking hotel comment data as the main research object. We filter the comment data from the length of the comments...
A powerful and flexible organization of documents can be obtained by mixing fuzzy and possibilistic clustering. In such an organization, documents can belong to more than one cluster simultaneously with different compatibility degrees. Clusters represent topics, which are identified by one or more descriptors extracted by a proposed method. In this manuscript, we investigated whether or not the descriptors...
With the expansion of World Wide Web services due to the explosion of information in recent years, automatic summarization has become essential for providing efficient mechanisms to summarize and present textual information effectively. This technology can summarize single or multiple documents. In this paper we develop a method based on a fuzzy ontology extraction technique...
This paper presents an implementation of probabilistic and statistical models for speech recognition. Three models, namely the Gaussian mixture model (GMM), the hidden Markov model (HMM), and the Gaussian mixture model-universal background model, are discussed. In GMM, both speech identification of unknown isolated words and classification of unknown test patterns are discussed. In HMM, speech identification of isolated...
A high-definition visual attention based video summarization algorithm is proposed to extract feature frames and create a video summary. It uses a colour histogram shot detection algorithm to separate the video into shots, then applies a novel high-definition visual attention algorithm to construct a saliency map for each frame. A multivariate mutual information algorithm is applied to select a feature...
A Brain Computer Interface (BCI) is a system that provides a communication channel between the human brain and a computer system. The vital aim of BCI research is to develop a system that helps disabled people interact with other persons and with external environments, or that serves as an additional man-machine interaction channel for healthy users. Different techniques have...
Anomaly detection is an important use of Automatic Identification Systems (AIS), because it helps users evaluate whether a vessel is in trouble or causing trouble. For instance, it can be used to detect whether a ship is doing something that may cause an accident or has changed its route to avoid bad weather conditions. In this work, a new method for finding anomalies in the ships' movements...
Clustering, or unsupervised classification, is an important problem in bioinformatics which serves to automatically group protein sequences into families. In this paper we explain the process of our approach. In the first part, we present the extraction phase, followed by feature weighting and then feature selection. Then we explain our new distance equation and finally we describe the clustering method:...