Multi-label learning has recently drawn considerable attention because of its many applications in text classification, image annotation, query/keyword suggestion, and similar tasks. In recent years, a number of methods have been proposed to address this challenging task. However, they are either tree-based methods, which have expensive training costs, or embedding-based methods, which have relatively lower accuracy...
Conventional pedestrian detection methods construct models based on hand-crafted features or deep learning. They are powerful but limited by the finite capability of a single classifier. Ensemble models avoid this limitation by assembling multiple classifiers using hand-designed criteria that jointly use information from all combined models. However, these criteria lack theoretical support...
We present a novel approach for the quantization of large speech databases. It uses an unsupervised iterative process to regulate a similarity measure to set the number of clusters and their boundaries, thus overcoming the shortcomings of conventional clustering algorithms such as k-Means and Fuzzy C-Means, which require a priori knowledge of the number of clusters and a similarity measure that follows the...
Defect prediction on projects with limited historical data has attracted great interest from both researchers and practitioners. Cross-project defect prediction has been the main area of progress by reusing classifiers from other projects. However, existing approaches require some degree of homogeneity (e.g., a similar distribution of metric values) between the training projects and the target project...
Bus arrival time prediction underpins the development of urban public transport services. To address the prediction problem, a MapReduce-based model for predicting bus arrival time that combines clustering with a neural network is proposed. First, according to the running characteristics of the bus, the running time of the bus is partitioned using the K-means clustering method, and...
Analyzing and processing quality-inspection big data is a key factor in ensuring product quality and the security of people's property. Quality-inspection data collected from social networks and e-commerce is incomplete in most cases, and this incompleteness poses a major challenge for analysis and processing. Therefore, a data-filling algorithm based on a stacked denoising auto-encoder is...
Evaluating the technical state of vehicle equipment is a necessary step in its operation and support. Considering conditions such as technical characteristics, operating environments, and support elements, this paper studies clustering of the technical state based on BIC (Schwarz's Bayesian Criterion). The results indicate that BIC is accurate and concise for clustering the technical state of vehicle equipment.
This paper focuses on designing an Intrusion Detection System (IDS) that detects the attack family in a dataset. An IDS detects various types of malicious traffic and computer usage that cannot be detected by a conventional firewall. In the proposed work, the data is extracted from the UNSW_NB15 dataset. The k-means algorithm is used to identify the data cluster centers. A new and one dimensional...
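As a rough illustration of how k-means identifies cluster centers, here is a minimal sketch of plain Lloyd's k-means in standard-library Python. It is not the paper's implementation: the `records` values are hypothetical stand-ins for traffic features (not UNSW_NB15 data), and seeding the centers with the first `k` points is a simplifying assumption.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment

# Hypothetical two-feature records, chosen only to form two obvious groups.
records = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
           (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignment = kmeans(records, k=2)
```

On this toy input the two centers settle at the means of the two groups, and `assignment` records which center each record belongs to.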
Random sampling can enhance classification performance by selecting many representative samples to be included in the training dataset. The representative samples usually include the samples located at the border of each class or cluster. In this paper, a new sampling algorithm is proposed that forces the training sample to include the border points between classes. Considering a point...
In this paper, a blind bandwidth extension algorithm for music signals is proposed. The method first applies the K-means algorithm to cluster audio data in the feature space, and then constructs multiple envelope predictors for each cluster using Support Vector Regression (SVR). A set of well-established audio features for Music Information Retrieval (MIR) has been used to characterize...
This paper proposes a novel active learning method to save annotation effort when preparing material to train sound event classifiers. K-medoids clustering is performed on unlabeled sound segments, and medoids of clusters are presented to annotators for labeling. The annotated label for a medoid is used to derive predicted labels for other cluster members. The obtained labels are used to build a classifier...
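The cluster-then-label idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: `k_medoids` is a basic alternating variant seeded with the first `k` points, the `segments` coordinates are hypothetical feature vectors, and `ask_annotator` is a hypothetical callback standing in for a human annotator.

```python
import math

def assign(points, medoids):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
        clusters[nearest].append(i)
    return clusters

def k_medoids(points, k, iterations=20):
    """Alternating k-medoids; the first k points seed the medoids (an assumption)."""
    medoids = list(range(k))
    for _ in range(iterations):
        clusters = assign(points, medoids)
        # Re-pick each medoid as the member closest in total to its cluster mates.
        updated = [
            min(members, key=lambda c: sum(math.dist(points[c], points[j])
                                           for j in members)) if members else m
            for m, members in clusters.items()
        ]
        if updated == medoids:
            break
        medoids = updated
    return assign(points, medoids)

def propagate_labels(clusters, ask_annotator):
    """Query one label per medoid and copy it to every member of that cluster."""
    labels = {}
    for medoid, members in clusters.items():
        label = ask_annotator(medoid)  # one annotation per cluster
        for i in members:
            labels[i] = label
    return labels

# Hypothetical segment features and ground truth, used only to simulate the annotator.
segments = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
truth = ["speech", "music", "speech", "music", "speech", "music"]
clusters = k_medoids(segments, k=2)
labels = propagate_labels(clusters, lambda i: truth[i])
```

Here only two segments are shown to the simulated annotator, and the remaining four inherit their labels from the medoid of their cluster.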
This paper proposes a new scheme for hyperspectral image classification through k-means clustering. The scheme includes three steps. First, principal component analysis (PCA) is used to reduce the dimensionality of the hyperspectral image. Second, the reduced features are clustered using the k-means clustering algorithm and subsequently the clusters are trained separately by multi-class support vector...
Among the various indoor localization systems, received signal strength (RSS) based fingerprinting localization provides the most cost-effective solution, as it uses the existing wireless network infrastructure. The positioning accuracy of such localization systems can be improved by incorporating a large amount of training data, which in turn increases the search overhead of such localization systems...
The self-organizing algorithm of Kohonen is well known for its ability to map an input space. This technique is known as the Self-Organizing Map (SOM). A SOM can be trained in a short period of time with a few optimization techniques, such as limiting the search scope for the “winning” neurons. In this paper we propose alternative options for improving the SOM learning speed. The basic idea of the proposed modification...
Internet users face a tremendous amount of information from websites. Clustering is a good way to organize information. However, most clustering algorithms operate in a static setting; that is, they do not accommodate incremental data. This restriction is ill-suited to a network environment, since data from the internet is continuously increasing. Thus, an incremental clustering algorithm based...
We develop a new method for a particular type of classification problem, where the positive class is a mixture of multiple clusters and the negative class is drawn from a single cluster. The new method employs an alternating optimization approach, which jointly discovers the clusters in the positive class, and at the same time, optimizes the classifiers that separate each positive cluster from the...
The Slope One algorithm recommends poorly because it does not consider time weight, and it suffers from data sparsity and poor real-time performance. To address these defects, this paper proposes a weighted Slope One algorithm based on cluster filling and time weight (WSOBCFT). To reduce the time of generating the nearest neighbor, the rating matrix of...
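For context, the baseline that this abstract builds on can be sketched as follows. This is the basic weighted Slope One predictor only, without the time weight or cluster filling that WSOBCFT adds; the function name `slope_one_predict` and the toy `ratings` dictionary are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def slope_one_predict(ratings, user, target):
    """Predict user's rating of target with basic weighted Slope One."""
    diff_sum = defaultdict(float)  # item j -> sum of (r_target - r_j) over co-raters
    count = defaultdict(int)       # item j -> number of users who rated both items
    for other in ratings.values():
        if target not in other:
            continue
        for item, r in other.items():
            if item != target:
                diff_sum[item] += other[target] - r
                count[item] += 1
    num = den = 0.0
    for item, r in ratings[user].items():
        if item != target and count[item]:
            avg_dev = diff_sum[item] / count[item]
            # Weight each per-item prediction by its number of co-raters.
            num += (r + avg_dev) * count[item]
            den += count[item]
    return num / den if den else None

# Hypothetical toy rating matrix (user -> {item: rating}).
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
prediction = slope_one_predict(ratings, "lucy", "A")
```

In this toy case item A sits on average 0.5 above B (two co-raters) and 3 above C (one co-rater), so the weighted prediction for Lucy is (2.5 × 2 + 8 × 1) / 3 = 13/3, roughly 4.33.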
In this paper, we present a tag-based recommendation system which generates personalized recommendations for TV users. The proposed approach, based on a collaborative filtering recommendation algorithm, uses similarity calculation and vector production to process the users' data. To test the applicability of this method, we ran several experiments on random users' data, and the overall...
Every day a huge amount of information is transferred from one network to another, and this information may be exposed to attacks. Information and information systems should be protected from unauthorized users. Providing and maintaining the confidentiality and integrity of information is a very tedious job, so intrusion detection plays a very important role. Although various methods are used to protect...
The efficiency of the Wang-Mendel (WM) algorithm is severely affected by the number of fuzzy rules and the data scale. Thus, this paper proposes a reduced weighted WM algorithm that addresses the problem by balancing completeness and computation time. A clustering algorithm is first introduced to obtain the cluster centers. Then, only the cluster centers are used to generate fuzzy rules, namely, the...