In conventional regularized learning, training time grows as the training set expands. Recent work on the L2 linear SVM challenges this assumption by demonstrating an inverse dependence of training time on training set size. In this paper, we first put forward a Primal Gradient Solver (PGS) to efficiently solve the convex regularized learning problem. This solver is based on stochastic gradient descent...
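The abstract is truncated here, but a stochastic-subgradient primal solver for the L2-regularized linear SVM is well illustrated by the standard Pegasos-style update; the sketch below is a generic example of that technique, not necessarily the paper's exact PGS, and the toy data are hypothetical.

```python
import numpy as np

def sgd_l2_svm(X, y, lam=0.1, epochs=20, seed=0):
    """Pegasos-style stochastic subgradient descent for the primal
        min_w  lam/2 * ||w||^2 + mean_i max(0, 1 - y_i <w, x_i>)
    with labels y in {-1, +1} and a 1/(lam*t) decaying step size."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)        # decaying learning rate
            margin = y[i] * (X[i] @ w)   # functional margin under current w
            w *= (1 - eta * lam)         # gradient step on the regularizer
            if margin < 1:               # subgradient of the hinge loss
                w += eta * y[i] * X[i]
    return w

# Toy linearly separable data (illustrative only)
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -2.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])
w = sgd_l2_svm(X, y)
preds = np.sign(X @ w)
```

Each update touches a single example, which is what makes the per-step cost independent of the training set size.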
Dimension reduction (DR) algorithms are generally categorized into feature extraction and feature selection algorithms. In the past, little work has been done to contrast and unify the two categories. In this work, we introduce a matrix-trace-oriented optimization framework that provides a unifying view of both feature extraction and feature selection algorithms. We show that the unified view of DR...
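The trace-oriented view referred to above is commonly illustrated by writing classical DR objectives as trace optimizations; as a hedged sketch (standard formulations, not necessarily the paper's exact framework), with $S_t$, $S_b$, $S_w$ the total, between-class, and within-class scatter matrices:

```latex
% PCA: maximize projected total scatter under orthonormality
\max_{W^\top W = I} \; \operatorname{tr}\!\left(W^\top S_t W\right)

% LDA: maximize between-class relative to within-class scatter
\max_{W} \; \operatorname{tr}\!\left((W^\top S_w W)^{-1}\,(W^\top S_b W)\right)
```

Feature selection fits the same mold when $W$ is constrained to be a 0/1 column-selection matrix, which is one way such a framework can cover both categories.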
Subspace learning techniques for text analysis, such as latent semantic indexing (LSI), have been widely studied in the past decade. However, to the best of our knowledge, no previous study has leveraged rank information for subspace learning in ranking tasks. In this paper, we propose a novel algorithm, called learning latent semantics for ranking (LLSR), to seek the optimal latent semantic space tailored...
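For background on the latent semantic space the abstract builds on: classical LSI embeds documents via a truncated SVD of the term-document matrix. A minimal sketch on a hypothetical toy corpus (terms as rows, documents as columns):

```python
import numpy as np

# Hypothetical 5-term x 4-document count matrix.
# Docs 0 and 2 share terms 0-1; docs 1 and 3 share terms 2-3; term 4 is common.
A = np.array([
    [2, 0, 1, 0],
    [1, 0, 2, 0],
    [0, 1, 0, 2],
    [0, 2, 0, 1],
    [1, 1, 1, 1],
], dtype=float)

# LSI: keep the top-k singular directions of the term-document matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_embeddings = (np.diag(s[:k]) @ Vt[:k]).T   # one k-dim row per document

def cos(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

sim_02 = cos(doc_embeddings[0], doc_embeddings[2])  # same topic cluster
sim_01 = cos(doc_embeddings[0], doc_embeddings[1])  # different clusters
```

In the k-dimensional space, documents with overlapping vocabulary land close together, which is the "latent semantics" the proposed LLSR then tailors to rank supervision.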
Recently, the learning-to-rank technique has attracted much attention. However, the lack of labeled training data seriously limits its application in real-world tasks. In this paper, we propose to break this bottleneck by considering the cross-domain "transfer of rank learning" problem. We also propose a novel algorithm called TransRank, which can effectively utilize the labeled data...
Many text processing applications have adopted the bag-of-words (BOW) representation of documents, in which each document is represented as a vector of weighted terms or n-grams, and the cosine similarity between two vectors is used as the similarity measure. Despite its great success in information retrieval and text categorization, the conventional BOW model ignores the detailed local text...
This paper presents a novel algorithm to cluster emails according to their contents and the sentence styles of their subject lines. In our algorithm, natural language processing techniques and frequent itemset mining techniques are utilized to automatically generate meaningful generalized sentence patterns (GSPs) from subjects of emails. Then we put forward a novel unsupervised approach which treats...
Subspace learning approaches aim to discover the important statistical structure of high-dimensional data in lower-dimensional spaces. Methods such as principal component analysis (PCA) make no use of class information, and linear discriminant analysis (LDA) cannot be performed efficiently at scale. In this paper, we propose a novel, highly scalable supervised subspace learning algorithm...
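As a reference point for the unsupervised baseline mentioned above, here is a minimal PCA sketch on hypothetical toy data; note that class labels are never consulted, which is precisely the limitation the abstract contrasts with supervised subspace learning.

```python
import numpy as np

def pca(X, k):
    """Plain PCA: project centered data onto the top-k eigenvectors of
    the sample covariance matrix. No class labels are used anywhere."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :k]           # reorder to take the top-k directions
    return Xc @ W, W

# Toy data lying mostly along the direction (1, 2)
X = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]])
Z, W = pca(X, 2)
```

The first projected coordinate captures the dominant variance direction; a supervised method would instead pick directions that separate the classes, even when those directions carry little overall variance.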