Clustering text data streams is an important issue in the data mining community and has a number of applications, such as news group filtering, text crawling, document organization, and topic detection and tracing. However, most methods are similarity-based, use the TF*IDF scheme to represent the semantics of text data, and often yield poor clustering quality. In this paper, we first...
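For readers unfamiliar with the similarity-based baseline this abstract refers to, the following is a minimal sketch of TF*IDF weighting combined with cosine similarity. It is not the method proposed in the paper. The function names `tfidf_vectors` and `cosine`, the toy documents, and the specific weighting variant (relative term frequency times the logarithm of inverse document frequency) are all assumptions chosen for illustration; other TF*IDF variants exist.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document (illustrative variant)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Weight = relative term frequency * log(N / document frequency).
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: two finance snippets and one sports snippet.
docs = [
    "stock market falls".split(),
    "stock market rises".split(),
    "team wins match".split(),
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
```

In this sketch, documents that share no terms always receive a similarity of zero, even when they are topically related. That reliance on exact term overlap is one commonly cited limitation of purely term-based representations.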
Subspace clustering is one of the best approaches for discovering meaningful clusters in high-dimensional spaces. However, existing algorithms often produce highly redundant clusters that are hard to interpret. In this paper, based on the subspace enumeration tree, we propose a new subspace clustering algorithm, MSC, to find the clusters hidden in the maximal subspace. MSC uses the...
Semantic smoothing has recently been proposed as an efficient way to improve document clustering quality. However, the existing semantic smoothing model is not effective at improving the quality of partitional clustering. In this paper, inspired by the TF*IDF scheme and the background elimination strategy, we first introduce an improved semantic smoothing model, which is suitable...
Clustering text documents into different category groups is an important problem, and the size of the desired clusters is an important requirement for a clustering solution. In this paper, we present an efficient clustering algorithm called RTC, based on the spherical k-means algorithm, for small text documents. In RTC, we present a new method for choosing initial centers based on the density and farthest distance...
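As background for the abstract above, here is a minimal sketch of plain spherical k-means, the algorithm RTC is said to build on. It is not RTC itself: the abstract is truncated before it describes the initialization, so this sketch simply takes the first k points as initial centers, which is a placeholder assumption. The names `normalize` and `spherical_kmeans`, the `iters` parameter, and the toy data are likewise hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def spherical_kmeans(points, k, iters=20):
    """Cluster points by cosine similarity; returns (labels, unit-length centers)."""
    unit = [normalize(p) for p in points]
    # Placeholder initialization (assumption): first k points, NOT RTC's method.
    centers = [list(p) for p in unit[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: for unit vectors, cosine similarity is the dot product.
        new_labels = [
            max(range(k), key=lambda j: sum(a * b for a, b in zip(p, centers[j])))
            for p in unit
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not change
        labels = new_labels
        # Update step: average the members, then project back onto the unit sphere.
        for j in range(k):
            members = [p for p, lab in zip(unit, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[j] = normalize(mean)
    return labels, centers


# Hypothetical toy data: two directions in a 2-D term space.
points = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]
labels, centers = spherical_kmeans(points, k=2)
```

The only differences from ordinary k-means are that similarity is measured by the cosine rather than Euclidean distance and that each center is renormalized to unit length after averaging. The outcome depends on the initial centers, which is the part the abstract says RTC replaces with its own choice method.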