A Scalable and Distributed NLP Architecture for Web Document Annotation

Julien Deriviere; Thierry Hamon; Adeline Nazarenko

doi:10.1007/11816508_8

Source

Lecture Notes in Computer Science, Advances in Natural Language Processing, Research Papers, pp. 56-67

Abstract

In the context of the ALVIS project, which aims at integrating linguistic information into topic-specific search engines, we develop an NLP architecture to linguistically annotate large collections of web documents. This context leads us to address the scalability of Natural Language Processing. The platform can be viewed as a framework built on existing NLP tools. We focus on the efficiency of the platform by distributing linguistic processing over several machines. We carry out an experiment on 55,329 web documents focusing on biology. This 79-million-word collection of web documents was processed in 3 days on 16 computers.
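
The figures in the abstract can be turned into a rough throughput estimate. The sketch below is only back-of-the-envelope arithmetic, not a result reported by the authors: it assumes that "3 days" means three full days of continuous wall-clock time, that the load was spread evenly over the 16 computers, and that the word count is exactly 79 million. The function and variable names are illustrative.

```python
# Figures quoted in the abstract.
DOCUMENTS = 55_329
WORDS = 79_000_000      # "79 million" taken as an exact value (assumption)
DAYS = 3                # assumed to be continuous wall-clock time
MACHINES = 16           # assumed to be evenly loaded

SECONDS_PER_DAY = 24 * 60 * 60


def throughput(words: int, days: float, machines: int) -> dict:
    """Estimate aggregate and per-machine processing rates.

    Assumes uninterrupted processing and perfect load balancing,
    neither of which is stated in the abstract.
    """
    wall_seconds = days * SECONDS_PER_DAY
    aggregate = words / wall_seconds
    return {
        "aggregate_words_per_s": aggregate,
        "per_machine_words_per_s": aggregate / machines,
        "machine_days": days * machines,
    }


stats = throughput(WORDS, DAYS, MACHINES)
avg_words_per_doc = WORDS / DOCUMENTS

print(f"aggregate:   {stats['aggregate_words_per_s']:.1f} words/s")
print(f"per machine: {stats['per_machine_words_per_s']:.1f} words/s")
print(f"total cost:  {stats['machine_days']} machine-days")
print(f"avg length:  {avg_words_per_doc:.0f} words/document")
```

Under these assumptions the platform processes roughly 300 words per second overall, or about 19 words per second per machine, for a total of 48 machine-days and an average document length of about 1,400 words.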

Identifiers

Series ISSN:	0302-9743
Series e-ISSN:	1611-3349
Book ISBN:	978-3-540-37334-6
Book e-ISBN:	978-3-540-37336-0
DOI	10.1007/11816508_8

Authors

Julien Deriviere

LIPN – UMR CNRS 7030, Villetaneuse, France

Thierry Hamon

LIPN – UMR CNRS 7030, Villetaneuse, France

Adeline Nazarenko

LIPN – UMR CNRS 7030, Villetaneuse, France

Additional information

Copyright owner: Springer-Verlag Berlin Heidelberg, 2006

Data collection: Springer

Publisher

Springer Berlin Heidelberg
