Search results

Items from 1 to 17 out of 17 results

chapter

Performance analysis of Naïve Bayes, PART and SMO for classification of page interest in web usage mining

Saucha Diwandari, Adhistya Erna Permanasari, Indriana Hidayah

2015 International Seminar on Intelligent Technology and Its Applications (ISITIA) > 39 - 44

2015 International Seminar on Intelligent Technology and Its Applications (ISITIA)

User interaction with web sites generates a large amount of web access data stored in the web access logs. Those data can be used for e-commerce to conduct an evaluation of possessed website pages as one of the efforts to understand the desires of the user. Through classification techniques in web usage mining, we conducted an experiment to categorize a number of data obtained from the client log...

chapter

A Comparison of Stylometric and Lexical Features for Web Genre Classification and Emotion Classification in Blogs

Elisabeth Lex, Andreas Juffinger, Michael Granitzer

2010 Workshops on Database and Expert Systems Applications > 10 - 14

2010 21st International Conference on Database and Expert Systems Applications

In the blogosphere, the amount of digital content is expanding and for search engines, new challenges have been imposed. Due to the changing information need, automatic methods are needed to support blog search users to filter information by different facets. In our work, we aim to support blog search with genre and facet information. Since we focus on the news genre, our approach is to classify blogs...

chapter

Crawling Result Pages for Data Extraction Based on URL Classification

Tiezheng Nie, Zhenhua Wang, Yue Kou, Rui Zhang

2010 Seventh Web Information Systems and Applications Conference > 79 - 84

2010 7th Web Information Systems and Applications Conference (WISA 2010). Workshop on Semantic Web and Ontology (SWON2010). Workshop on Electronic Government Technology and Application (EGTA 2010)

In Web database integration, crawling data pages is important for data extraction. The fact that data are contained by multiple result pages increases the difficulty of accessing data for integration. Thus, it is necessary to accurately and automatically crawl query result pages from Web database. To address this problem, we propose a novel approach based on URL classification to effectively identify...

chapter

Improvement of Feature Extraction in Web Page Classification

Jiao Lijuan, Feng Liping

2010 2nd International Conference on E-business and Information System Security > 1 - 3

2010 2nd International Conference on E-business and Information System Security (EBISS 2010)

Mutual information formula is improved by using the hyperlink factor in this paper. Introduction of hyperlink elements of web pages can improve the classification accuracy in feature selection method based on mutual information and correlation by experiment, especially for those of strong. So the improvement is effective in web page classification.

chapter

Cross-domain classification: Trade-off between complexity and accuracy

E. Lex, C. Seifert, M. Granitzer, A. Juffinger

2009 International Conference for Internet Technology and Secured Transactions, (ICITST) > 1 - 6

2009 4th International Conference for Internet Technology and Secured Transactions (ICITST 2009)

Text classification is one of the core applications in data mining due to the huge amount of not categorized digital data available. Training a text classifier generates a model that reflects the characteristics of the domain. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to...

chapter

Phishing detection using classifier ensembles

F. Toolan, J. Carthy

2009 eCrime Researchers Summit > 1 - 9

2009 eCrime Researchers Summit. eCRIME 2009

This paper introduces an approach to classifying emails into phishing/non-phishing categories using the C5.0 algorithm which achieves very high precision and an ensemble of other classifiers that achieve high recall. The representation of instances used in this paper is very small consisting of only five features. Results of an evaluation of this system, using over 8,000 emails approximately half...

chapter

Simple linguistic processing effect on multi-label emotion classification

Ye Wu, F. Ren

2009 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 5

2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

Emotion plays a significant role in human communications in our daily life. With progress in human-machine interface technology, recent research has placed more emphasis on the recognition of emotion reaction. Comparing to some other ideal experimental settings, blog posts online would be respond more to real-world events. And a huge resource of text-based emotion can be found from the World Wide...

chapter

Using sentiment orientation features for mood classification in blogs

F. Keshtkar, D. Inkpen

2009 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 6

2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

In this paper we explore the task of mood classification for blog postings. We propose a novel approach that uses the hierarchy of possible moods to achieve better results than a standard machine learning approach. We also show that using sentiment orientation features improves the performance of classification. We used the Livejournal blog corpus as a dataset to train and evaluate our method.

chapter

True Positive Cost Curve: A Cost-Based Evaluation Method for High-Interaction Client Honeypots

C. Seifert, P. Komisarczuk, I. Welch

2009 Third International Conference on Emerging Security Information, Systems and Technologies > 63 - 69

2009 Third International Conference on Emerging Security Information, Systems and Technologies (SECURWARE)

Client honeypots are security devices designed to find servers that attack clients. High-interaction client honeypots (HICHPs) classify potentially malicious Web pages by driving a dedicated vulnerable Web browser to retrieve and classify these pages. Considering the size of the Internet, the ability to identify many malicious Web pages is a crucial task. HICHPs, however, present challenges: They...

chapter

Learning to Extract Content from News Webpages

A. Spengler, P. Gallinari

2009 International Conference on Advanced Information Networking and Applications Workshops > 709 - 714

2009 IEEE 23rd International Conference on Advanced Information Networking and Applications Workshops (WAINA)

We consider the problem of content extraction from online news Web pages. To explore to what extent the syntactic markup and the visual structure of a Web page facilitate the extraction of its content, we compare two state-of-the-art classifiers as first instantiations of a general framework that allows for proper model comparison. To this end, we introduce the publicly available NEWS600 corpus, a...

chapter

A Talent Classification Method Based on SVM

Hua Hu, Jing Ye, Chunlai Chai

2009 International Symposium on Intelligent Ubiquitous Computing and Education > 160 - 163

2009 International Symposium on Intelligent Ubiquitous Computing and Education, IUCE

Nowadays, any employment and recruitment Web sites receive immense personal information and recruit information every day. But most information canpsilat be properly analyzed and canpsilat meet the recruit requirement. In fact, the recruiting units are looking for talents of both high and low levels talents. However, many talents information canpsilat be evaluated correctly so that the appliers lose...

chapter

News Contents Recommendation Model Based on Feedback of Web Usage

Ping Ni, Jianxin Liao, Xiaomin Zhu, Keyan Ren

2009 WRI World Congress on Computer Science and Information Engineering > 4 > 431 - 435

2009 WRI World Congress on Computer Science and Information Engineering, CSIE

In this paper, reclassification for the current classification through K-means would be implemented based on the feedback of Web usage mining in order to improve the accuracy of news recommendation and convergence of classification. It could extract most relative keywords and eliminate the disturbance of multi-vocal word in one category based on feedback of Web usage. The reclassification of news...

chapter

An Efficient Text Classification Algorithm in E-commerce Application

Wu Da-sheng, Yu Qin-fen, Liu Li-juan

2009 WRI World Congress on Computer Science and Information Engineering > 4 > 458 - 461

2009 WRI World Congress on Computer Science and Information Engineering, CSIE

In this paper, an efficient text classification algorithm for repeating-text information on the e-commerce site can automatically classify and sort the similar string. This algorithm will greatly increase the efficiency and accuracy of audited information. All tests show that for the number of information between 100 and 1000 the algorithm is very efficient, and the 1000 text information(strings)...

chapter

Folksonomy for the Blogosphere: Blog Identification and Classification

Rujiang Bai, Xiaoyue Wang, Junhua Liao

2009 WRI World Congress on Computer Science and Information Engineering > 3 > 631 - 635

2009 WRI World Congress on Computer Science and Information Engineering, CSIE

Traditional automatic classifiers often conduct misclassifications. Folksonomy, a new manual classification scheme based on tagging efforts of users with freely chosen keywords can effective resolve this problem. Even though the scalability of folksonomy is much higher than the other manual classification schemes, the method cannot deal with tremendous number of items such as whole Weblog articles...

chapter

An Analysis of Visual and Presentation Factors Influencing the Design of E-commerce Web Sites

B. Soiraya, A. Mingkhwan, C. Haruechaiyasak

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 3 > 525 - 528

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

Two important factors which indirectly influence the Internet shoppers to make some online purchases are the visual layout and the presentation of web page. In this paper, we propose an approach of web page layout analysis in order to assess the design of e-commerce Web sites. Firstly, our proposed method segments each web page into five different blocks: top, left, center, right and bottom. We study...

chapter

Cool Blog Identi?cation Using Topic-Based Models

K. Sriphaew, H. Takamura, M. Okumura

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 1 > 402 - 406

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

Among a huge number of blogs on the internet, only some of them are considered to have great contents and worth to be explored. We call such kind of blogs cool blogs and attempt to identify them. To solve the cool blog identification problem, we consider three assumptions on cool blogs: (1) cool blogs tend to have definite topics, (2) cool blogs tend to have sufficient amount of blog entries, and...

chapter

The study on Detecting Near-Duplicate WebPages

YuJuan Cao, ZhenDong Niu, WeiQiang Wang, Kun Zhao

2008 8th IEEE International Conference on Computer and Information Technology > 95 - 100

2008 8th IEEE International Conference on Computer and Information Technology

Reprinting information among websites produces a great deal redundant WebPages. To improve search efficiency and user satisfaction, an algorithm to Detect near-Duplicate WebPages (DDW) is proposed. In the course of developing a near-duplicate detection system for a multi-billion page repository, we make two research contributions. First, we consider both syntactic and semantic information to present...

Filter options

Keywords:
CLASSIFICATION
WEB SITES

Publication date

Set your own date range

Keywords

CLASSIFICATION ALGORITHMS (9)
DATA MINING (9)
SUPPORT VECTOR MACHINES (8)
TRAINING (7)
WEB PAGES (7)
FEATURE EXTRACTION (6)
INFORMATION RETRIEVAL (5)
INTERNET (5)
TEXT ANALYSIS (4)
ALGORITHM DESIGN AND ANALYSIS (3)
BLOGS (3)
DISTANCE MEASUREMENT (3)
ARTIFICIAL NEURAL NETWORKS (2)
BLOGOSPHERE (2)
CLUSTERING ALGORITHMS (2)
DATABASES (2)
ELECTRONIC COMMERCE (2)
EMOTION CLASSIFICATION (2)
FEATURE SELECTION (2)
INFORMATION SERVICES (2)
MATHEMATICAL MODEL (2)
MUTUAL INFORMATION (2)
ONLINE FRONT-ENDS (2)
SUPPORT VECTOR MACHINE (2)
SVM (2)
VISUALIZATION (2)
ASSOCIATION RULES (1)
AUTOMATIC CONTENT EXTRACTION (1)
BEHAVIOURAL SCIENCES COMPUTING (1)
BLOG (1)
BLOG CLASSIFICATION (1)
BLOG IDENTIFICATION (1)
BLOG POST (1)
BLOG POSTING (1)
BLOG SEARCH (1)
BUSINESS (1)
C5.0 ALGORITHM (1)
CATEGORY MAP (1)
CENTROID-BASED ALGORITHM (1)
CLASS-FEATURE-CENTROID CLASSIFIER (1)
CLASSIFICATION ALGORITHM (1)
CLASSIFICATION TREE ANALYSIS (1)
CLASSIFIER ENSEMBLES (1)
CLIENT HONEYPOT (1)
CLIENT-SERVER SYSTEMS (1)
COMPANIES (1)
COMPLEXITY THEORY (1)
COMPONENT (1)
COMPUTATIONAL COMPLEXITY (1)
COMPUTATIONAL MODELING (1)
COMPUTER CRIME (1)
CONDITIONAL RANDOM FIELDS (1)
COOL BLOG (1)
COOL BLOG IDENTIFICATION (1)
COST-BASED EVALUATION METHOD (1)
CRAWLING DATA PAGES (1)
CRAWLING RESULT PAGES (1)
CROSS-DOMAIN CLASSIFICATION (1)
DATA EXTRACTION (1)
DEDICATED VULNERABLE WEB BROWSER (1)
DISTANCE FUNCTIONS (1)
DOCUMENT CLASSIFICATION (1)
DOCUMENT HANDLING (1)
E-COMMERCE (1)
E-COMMERCE WEB SITES (1)
ELECTRONIC MAIL (1)
EMAIL CLASSIFICATION (1)
EMOTION KEYWORDS (1)
EMOTION RECOGNITION (1)
EMOTIONALITY FACET (1)
EMPIRICAL ANALYSIS (1)
EQUATIONS (1)
ESTIMATION (1)
EVALUATION (1)
FEATURES (1)
FOLISONOMY-AND-SUPPORT VECTOR MACHINE CLASSIFIER (1)
FOLKSONOMY (1)
FSVMC (1)
HICHP (1)
HIERARCHY (1)
HIGH-INTERACTION CLIENT HONEYPOT (1)
HOME SHOPPING (1)
HTML (1)
HUMAN COMMUNICATION (1)
HUMAN COMPUTER INTERACTION (1)
HUMAN-MACHINE INTERFACE (1)
HUMANS (1)
HYPERLINK FACTOR (1)
HYPERLINKS (1)
INDEXES (1)
INDEXING (1)
INFORMATION REPRINTING (1)
INTERNET SHOPPING (1)
INTRUSION DETECTION (1)
JOB SITE (1)
JOB SPECIFICATION (1)
K-MEANS CLASSIFICATION (1)
more

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options