measuring the distance between categories and the assigned points, a ranking of keywords will be generated. Then, keywords are selected as attributes according to the rank, and training examples for classifiers will be generated. Finally, learning methods are applied to the training examples. Experimental validation shows that random forest achieved the best performance and the second best was the deep learner with a small
Web page classification plays an essential role in facilitating more efficient information retrieval and information processing. Conventionally, web text documents are represented by a term frequency matrix for classification purposes. However, considering the limitations of representing documents using terms or keywords
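As a rough illustration of the term frequency representation mentioned in the abstract above, the following minimal sketch builds a document-by-term count matrix. The function name `term_frequency_matrix`, the whitespace tokenizer, and the sample documents are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def term_frequency_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term vocab[j] in docs[i].

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Sorted vocabulary gives a stable column order.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix


# Hypothetical sample documents, for illustration only.
docs = ["cheap flights cheap hotels", "hotels near the beach"]
vocab, matrix = term_frequency_matrix(docs)
```

Each row of `matrix` can then serve as the feature vector of one document for a classifier.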
, naive Bayes and rule-based (Ripper) classification algorithms for classification purposes. The classifiers from the three algorithms were able to classify the tweets into one of six dialects with some error rate, but the classifier study revealed that the algorithms were able to pick the keywords that are the salient features of the
classification. A dataset of 2500 phishing and non-phishing emails is analyzed after extracting 23 keywords from the email bodies using text mining from the original dataset. Further, we selected the 12 most important features using t-statistic-based feature selection. Here, we did not find a statistically significant difference in
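The abstract above does not say which form of the t-statistic was used. As a minimal sketch, assuming a two-class Welch t-statistic computed per feature and ranked by absolute value, feature selection could look like the following. The names `welch_t` and `select_top_k` and the toy data are hypothetical, not from the paper.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic between two samples (each needs at least 2 values)."""
    diff = mean(a) - mean(b)
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0:
        # Both samples are constant: no separation, or perfect separation.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / denom


def select_top_k(X, y, k):
    """Return indices of the k features with the largest |t| between classes 1 and 0."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    # Highest score first; ties broken by feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Toy data (illustrative only): rows are emails, columns are keyword counts.
X = [[5, 1, 0], [6, 0, 1], [7, 1, 0], [1, 1, 1], [0, 0, 0], [2, 1, 1]]
y = [1, 1, 1, 0, 0, 0]
selected = select_top_k(X, y, 2)
```

In the setting described, `X` would hold the 23 keyword features and `k` would be 12.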