The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
This paper proposes an extended vector space model (VSM), which is called M2VSM (meta keyword-based modified VSM). When conventional VSM is applied to document clustering, it is difficult to adjust the granularity of cluster in terms of topic. In order to solve the problem, M2VSM considers meta keywords such as
propose the framework of applying text and data mining techniques not only to analyze a large number of petitions filed to e-People but also to predict the trend of petitions. In detail, we apply text mining techniques to unstructured data of petitions to elicit keywords from petitions and identify groups of petitions with
The purpose of this research is to propose an appropriate classification approach to improving the effectiveness of spam filtering on the issue of skewed class distributions. A clustering-based classifier is proposed to first cluster documents into several groups, and then an equal number of keywords are extracted
In document categorization method by using similarity measures based on word vectors, it is important to determine key words to characterize each document. However, conventional methods select the key words based on their frequency or/and particular importance index such as tf-idf. In this paper, we propose a method to characterize each document by using temporal clusters of technical term usages...
of the documents considered for context assessment contains the authors list, keywords list and list of document versioning time schedules. The experiments were conducted to assess the significance of the proposed model.
using multidimensional scaling. Based on the Vector Space Model, several Term Spaces were built on the basis of a set of terms extracted from the posters’ abstracts and titles, and a set of free keywords assigned to the posters by their authors. The ensuing Term Spaces were compared from the point of view of retrieving the
compared for the analysis: Full Text, Abstract, Title, Keywords. Results and Conclusions: Highly cited documents were compared among Cortex, Neuropsychologia, andBrain, and a number of interesting parametric trends were observed. The characteristics of the papers that cite Cortex papers were examined, and some interesting
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.