Search results

Items from 1 to 17 out of 17 results

chapter

Presenting a fuzzy relation to classify the Persian Web documents

A Yari, A Abbasi, S Moemen Bellah

2010 IEEE International Conference on Intelligent Computing and Intelligent Systems > 2 > 220 - 223

2010 IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS 2010)

The accumulation of web documents on the Internet, together with the rapid change and exponential growth of these pages, has made organizing and retrieving them manually almost impossible. It is therefore necessary to have a system that can automatically assign these pages to the relevant classes so that the results can be used by other tools. Unfortunately, the classification of Persian...

chapter

Associative web document classification based on word mixed weight

Xingyi Li, Jun Lan, Huaji Shi

2010 3rd International Conference on Computer Science and Information Technology > 3 > 578 - 581

2010 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2010)

Classification based on association rules has two shortcomings when applied to web documents: first, the method processes the web document as plain text, ignoring the HTML tag information of the web page; second, either each item of the association rules is only a word in the web page, without considering the weight of the word, or it quantifies the weight...

chapter

Web Document Classification Based on Fuzzy k-NN Algorithm

Juan Zhang, Yi Niu, Huabei Nie

2009 International Conference on Computational Intelligence and Security > 1 > 193 - 196

2009 International Conference on Computational Intelligence and Security (CIS 2009)

Web document classification is an important technique of Web mining. Web page classification has been studied extensively since the Internet became a huge database of information. k-NN is a simple classification algorithm that assigns a pattern of unknown classification to the class of the majority of its k nearest neighbors of known classification according to the distance measure,...
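The majority-vote rule described in this abstract can be sketched as follows. This is a minimal illustration of plain k-NN only, not the fuzzy k-NN variant the paper proposes; the function name `knn_classify`, the Euclidean distance, and the toy two-dimensional feature vectors are assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(query, labeled_points, k=3):
    """Return the majority class among the k labeled points nearest to `query`.

    `labeled_points` is a list of (feature_vector, label) pairs. Euclidean
    distance is an assumption; the abstract only says "the distance measure".
    """
    # Order the known patterns by their distance to the query vector.
    by_distance = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))
    # Count the class labels of the k closest patterns and take the most common.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2-D feature vectors with made-up class labels.
training = [
    ((0.0, 0.0), "sports"),
    ((0.1, 0.2), "sports"),
    ((1.0, 1.0), "news"),
    ((0.9, 1.1), "news"),
    ((1.2, 0.8), "news"),
]

predicted = knn_classify((0.2, 0.1), training, k=3)
```

In a real web-document setting the feature vectors would come from some text representation (for example term weights), which the truncated abstract does not specify.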

chapter

The Automatic Categorization of Arabic Documents by Boosting Decision Trees

S Raheel, J Dichy, M Hassoun

2009 Fifth International Conference on Signal Image Technology and Internet Based Systems > 294 - 301

2009 Fifth International Conference on Signal-Image Technology & Internet-Based Systems (SITIS 2009)

Automatic document classification has been subject to research since the early 1960s. However, additional research is still required and possible because the results obtained until now remain subject to further enhancement and refinement. Although a lot of literature has been written on the subject, very little research was reported on the automatic classification of Arabic documents, none of which...

chapter

Classifying Sentence-Based Summaries of Web Documents

M.S. Pera, Yiu-Kai Ng

2009 21st IEEE International Conference on Tools with Artificial Intelligence > 433 - 440

2009 21st IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2009)

Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consuming, and users are still required to spend a considerable amount of time scanning through the classified Web documents to identify the ones that satisfy their information needs. In solving this problem, we first introduce CorSum,...

chapter

Opinion Extraction & Classification of Reviews from Web Documents

S.K. Shandilya, S. Jain

2009 IEEE International Advance Computing Conference > 924 - 927

2009 IEEE International Advance Computing Conference. IACC 2009

Automatic extraction of opinions on products from the Web has been receiving increasing interest. Such extracted knowledge helps to find out what other people think about a particular product or service. With the growing availability of resources like online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek...

chapter

A Combined Template-Based and Case-Based Metadata Extraction for Heterogeneous Thai Documents

K. Khankasikam, N. Chakpitak, T. Udomsripaiboon

2009 International Conference on Advanced Computer Control > 292 - 296

2009 International Conference on Advanced Computer Control (ICACC)

Nowadays, the number of universities, laboratories, government agencies and companies that place their documents online and make them searchable is increasing because the Internet infrastructure for global data access is fully functional. However, a large number of organizations have documents that lack metadata. The lack of metadata hinders not only the discovery and dissemination of these...

chapter

Web Pages Classification and Clustering by Means of Genetic Algorithms: A Variable Size Page Representing Approach

Z. Hossaini, A.M. Rahmani, S. Setayeshi

2008 International Conference on Computational Intelligence for Modelling Control & Automation > 436 - 440

2008 International Conference on Computational Intelligence for Modelling Control & Automation (CIMCA 2008)

Arranging masses of data into related groups is an important way to help us make better decisions about them. Clustering and classification are two efficient methods of grouping huge volumes of data. Most clustering and classification methods that work on Web page grouping problems use fixed-size vectors in their learning algorithms. In the real world of the WWW, this assumption is not reliable. In this paper...

chapter

NS-IMMC: A Method of Generating the Cause-and-Effect of News Topic

Wang Zhi-ming, Zhou Xusheng

2008 International Symposium on Information Science and Engineering > 1 > 330 - 335

2008 International Symposium on Information Science and Engineering (ISISE)

On the basis of NS-IMMC, this paper proposes a new method of generating the cause-and-effect of a news topic. The new method chooses representative sentences for news documents according to the specialty of news structure (NS), and then utilizes IMMC (Improved Min-Max clustering) to classify these representative sentences to generate a multi-document summary which represents the topic cause-and-effect...

chapter

Emotion Classification of Online News Articles from the Reader's Perspective

K.H.-Y. Lin, Changhua Yang, Hsin-Hsi Chen

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 1 > 220 - 226

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

Past studies on emotion classification focus on the writer's emotional state. This research addresses the reader aspect instead. The classification of documents into reader-emotion categories has several applications. One of them is to integrate reader-emotion classification into a Web search engine to allow users to retrieve documents that contain relevant contents and at the same time instill...

chapter

Ontology-supported webpage classifier for scholar’s webpages in ubiquitous information environment

Dong-Liang Lee, Sheng-Yuan Yang, Chun-Liang Hsu

2008 First IEEE International Conference on Ubi-Media Computing > 523 - 528

2008 First IEEE International Conference on Ubi-media Computing (U-Media 2008)

In this highly developed and fast-changing era of the Internet, the amount of information on the Internet is increasing massively. Webpage indexing catalogues and search engines, which help information seekers collect Web information rapidly and precisely, have already become indispensable tools on the Internet. How to precisely do Webpage classification that can effectively assist with...

chapter

An Algorithm for Classifying Articles and Patent Documents Using Link Structure

K.V. Indukuri, P. Mirajkar, A. Sureka

2008 The Ninth International Conference on Web-Age Information Management > 203 - 210

2008 9th International Conference on Web-Age Information Management (WAIM)

Studying link structure of the World Wide Web (WWW) is an area which has attracted a lot of interest. Several papers have been published on structural analysis of hyperlinked environments such as the WWW. The WWW can be modeled as a graph and valuable information can be derived by analyzing links between the Web-pages primarily for the purpose of building better search engines. Many novel methods...

chapter

Free-Form Annotation Tool for Collaboration

Han-Zhen Wu, S.J.H. Yang, Yu-Sheng Su

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (sutc 2008) > 555 - 560

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC '08)

People often take notes while reading and browsing; hence annotation is very important in daily life, anytime and anywhere. We developed a free-form annotation tool for collaboration that provides a convenient way to create annotations easily. Our approach is characterized by two design criteria: 1) digital ink annotation: helps users to focus on annotated...

chapter

A PSO-Based Web Document Classification Algorithm

Ziqiang Wang, Qingzhou Zhang, Dexian Zhang

Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007) > 3 > 659 - 664

2007 8th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing

Due to the exponential growth of documents on the Internet and the emergent need to organize them, automatic document classification has received ever-increasing attention in recent years. The particle swarm optimization (PSO) algorithm, new to the document classification community, is a robust stochastic evolutionary algorithm based on the movement and intelligence of swarms. In this paper,...

chapter

Extraction of Reliable Reputation Information Using Contributor's Stance

Takayuki Yamada, Daisaku Sakano, Yoshiaki Yasumura, Kuniaki Uehara

2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings) (WI'06) > 382 - 385

2006 IEEE/WIC/ACM International Conference on Web Intelligence

This paper describes a method for extracting reliable reputation on the Web. In this research, reliable reputation is the information that has an opposite polarity value of contributor's stance (positive or negative). We call this information "fair reputation". In order to extract fair reputations, we develop the following two tasks. The first task is classification of feedback documents...

chapter

Analysis of Web Search Engine Clicked Documents

David Nettleton, Liliana Calderon-Benavides, Ricardo Baeza-Yates

2006 Fourth Latin American Web Congress > 209 - 219

2006 4th Latin American Web Congress

In this paper we process and analyze Web search engine query and click data from the perspective of the documents (URLs) selected. We initially define possible document categories and select descriptive variables to define the documents. The URL dataset is preprocessed and analyzed using some traditional statistical methods, and then processed by the Kohonen (1984) SOM clustering technique, which we...

chapter

Enhancing Web Based Services by Coupling Document Classification with User Profile

G. Potamias, L. Koumakis, V.S. Moustakis

EUROCON 2005 - The International Conference on "Computer as a Tool" > 1 > 205 - 208

EUROCON 2005-The International Conference on 'Computer as a Tool'

Current Web search engines are not able to adapt their operations to the evolving needs, interests and preferences of the users. To cope with this weakness, we developed a system able to classify HTML (or XML) documents into user-prespecified categories of interest. The system processes the user's current profile and a set of representative documents for each category of interest, and produces a...

Filter options

Keywords:
INTERNET
DOCUMENT HANDLING


INFONA - science communication portal
