Document Analysis Systems VI
6th International Workshop, DAS 2004, Florence, Italy, September 8 - 10, 2004. Proceedings

chapter

A Graph-Based Framework for Web Document Mining

Adam Schenker, Horst Bunke, Mark Last, Abraham Kandel

Lecture Notes in Computer Science > Document Analysis Systems VI > Internet Documents > 401-412

In this paper we describe methods of performing data mining on web documents, where the web document content is represented by graphs. We show how traditional clustering and classification methods, which usually operate on vector representations of data, can be extended to work with graph-based data. Specifically, we give graph-theoretic extensions of the k-Nearest Neighbors classification algorithm...

chapter

XML Documents Within a Legal Domain: Standards and Tools for the Italian Legislative Environment

Carlo Biagioli, Enrico Francesconi, Pierluigi Spinosa, Mirco Taddei

Lecture Notes in Computer Science > Document Analysis Systems VI > Internet Documents > 413-424

The Norme in rete (NIR) [Legislation on the Net] national project aims to make it easier to retrieve and navigate between legal documents in a distributed environment, and to encourage the development of systems that are interoperable and effective to use. To this end, two standards have been defined: a URN standard, to identify these materials through uniform names,...

chapter

Rule-Based Structural Analysis of Web Pages

Fabio Vitali, Angelo Iorio, Elisa Ventura Campori

Lecture Notes in Computer Science > Document Analysis Systems VI > Internet Documents > 425-437

Structural analysis of web pages has been proposed several times and for a number of reasons and purposes, such as the re-flowing of standard web pages to fit a smaller PDA screen. elISA is a rule-based system for the analysis of regularities and structures within web pages that is used for a fairly different task, the determination of editable text blocks within standard web pages, as needed by the...

chapter

Extracting Table Information from the Web

Yeon-Seok Kim, Kyong-Ho Lee

Lecture Notes in Computer Science > Document Analysis Systems VI > Internet Documents > 438-441

With the ubiquity of the Web, the volume of Web documents continues to grow rapidly. Since the Web is a vast source of information, extracting useful information from Web documents is important. HTML (Hypertext Markup Language), a format for the visual rendering of Web documents, defines tags for representing tables. On the other hand, most of the existing HTML documents...

chapter

A Neural Network Classifier for Junk E-Mail

Ian Stuart, Sung-Hyuk Cha, Charles Tappert

Lecture Notes in Computer Science > Document Analysis Systems VI > Internet Documents > 442-450

Most e-mail readers spend a non-trivial amount of time regularly deleting junk e-mail (spam) messages, even as an expanding volume of such e-mail occupies server storage space and consumes network bandwidth. An ongoing challenge, therefore, rests within the development and refinement of automatic classifiers that can distinguish legitimate e-mail from spam. A few published studies have examined spam...
