Search results

Items from 1 to 11 out of 11 results

chapter

Web Information Extraction Algorithm Based on Ontology and DOM Tree

Li Liu, Junfang Shi, Xinrui Liu

2010 International Conference on Computational Intelligence and Software Engineering > 1 - 4

2010 International Conference on Computational Intelligence and Software Engineering (CiSE 2010)

Because the information on the Web is vast, dynamic and irregular, it is difficult to search and integrate. This paper proposes a Web information extraction algorithm based on ontology and the DOM tree. The relevant areas are located accurately, and the information of interest is extracted precisely by extraction rules generated from the ontology. Furthermore this algorithm...

chapter

Web Data Extraction Based on Tree Structure Analysis and Template Generation

Haikun Hong, Xiaoxin Chen, Guoshi Wu, Jing Li

2010 International Conference on E-Product E-Service and E-Entertainment > 1 - 5

2010 International Conference on E-Product E-Service and E-Entertainment (ICEEE 2010)

This paper studies the problem of extracting data from large numbers of semi-structured web pages. The fact that many websites have enormous numbers of pages generated dynamically from an underlying structured source like a database makes it feasible to induce a common template for similar web pages and then extract data accordingly. Previous work on this problem has limited practical utility because of either...

chapter

An Information Retrieval Method Based on Sequential Access Patterns

Xiaogang Wang, Yan Bai, Yue Li

2010 Asia-Pacific Conference on Wearable Computing Systems > 247 - 250

2010 Asia-Pacific Conference on Wearable Computing Systems (APWC 2010)

With the explosive growth of information available on the World Wide Web, it has become much more difficult to access relevant information. One promising approach is web usage mining, which mines web logs for user models and recommendations. Unlike most web recommender systems, which are mainly based on clustering and association rule mining, this paper proposes a web personalization...

chapter

Web behind Web - A Steganographic Web Framework

H. Hioki

2009 Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 60 - 63

2009 Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing. IIH-MSP 2009

This paper presents a Web framework as an application of steganography. The framework enables us to stealthily organize a tree of Web objects behind another. The Web objects to be embedded are automatically assigned to appropriate cover files. When embedding is done, we obtain stego-objects that can be uploaded to a Web server as ordinary Web objects. We can retrieve files embedded in stego-objects...

chapter

FastWrap: An efficient wrapper for tabular data extraction from the web

M.S. Amin, H. Jamil

2009 IEEE International Conference on Information Reuse & Integration > 354 - 359

2009 IEEE International Conference on Information Reuse & Integration (IRI 2009)

In the last few years, several works in the literature have addressed the problem of data extraction from Web pages. The importance of this problem derives from the fact that, once extracted, data can be handled in a way similar to instances of a traditional database, which in turn can facilitate application of Web data integration and various other domain specific problems. In this paper, we propose...

chapter

Reverse Method for Labeling the Information from Semi-Structured Web Pages

Z. Akbar, L.T. Handoko

2009 International Conference on Signal Processing Systems > 551 - 555

2009 International Conference on Signal Processing Systems (ICSPS)

We propose a new technique to infer the structure of, and extract the data tokens from, semi-structured Web sources that are generated using a consistent template or layout with some implicit regularities. The attributes are extracted and labeled in reverse, starting from the region of interest of the targeted content. This is in contrast with existing techniques, which always generate the trees from the...

chapter

Data Retrieval and Security Using Lightweight Directory Access Protocol

M. Salim, M.S. Akhtar, M.A. Qadeer

2009 Second International Workshop on Knowledge Discovery and Data Mining > 685 - 688

2009 Second International Workshop on Knowledge Discovery and Data Mining. WKDD 2009

In the present world of communication and information interchange where more and more users are bound to use the same services and data with different access levels, the need for providing protection against potential breach of secured data has gained profound importance. Some of the common services are virtual private network (VPN), remote access server (RAS), Web server, mail server etc. In the...

chapter

An Approach to Extracting Central URLs on Catalog Page

He Bai, JinLin Wang, Ye Li

2008 International Symposium on Knowledge Acquisition and Modeling > 388 - 392

2008 International Symposium on Knowledge Acquisition and Modeling (KAM)

Catalog pages form the intermediate layer in the architecture of a standard Web site; research on information retrieval for this kind of page can therefore help improve a Web crawler's efficiency. A page is called "catalog-style" if its main body is displayed as a sequence of regular entries, and the central link in each entry apparently contains the page's major information...

chapter

An Efficient Algorithm for Web Access Pattern Mining

Sen Yang, Jiankui Guo, Yangyong Zhu

Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007) > 2 > 726 - 729

2007 International Conference on Fuzzy Systems and Knowledge Discovery

WAP-Mine is one of the algorithms for mining frequent web access patterns from a web access database. It generates frequent web access patterns by recursively mining web access pattern trees using the WAP-tree structure. However, in the process of mining frequent web access patterns, WAP-Mine generates a large amount of intermediate data, which lowers efficiency, especially at lower support thresholds. In this paper, TD-Mine, a new algorithm...

chapter

The Mining and Extraction of Primary Informative Blocks and Data Objects from Systematic Web Pages

Yi-feng Tseng, Hung-yu Kao

2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06) > 370 - 373

2006 IEEE/WIC/ACM International Conference on Web Intelligence

With the fast development of the Internet, the Web has become an enormous database containing extremely abundant information. Most Web pages present their content as a list of objects, such as search engine results or product information on shopping Web sites, and these objects form the primary information of each page. In this paper, we focus on the issues of...

chapter

Discovery of Maximally Frequent Tag Tree Patterns with Height-Constrained Variables from Semistructured Web Documents

Y. Suzuki, T. Miyahara, T. Shoudai, T. Uchida, et al.

International Workshop on Challenges in Web Information Retrieval and Integration > 104 - 112

Proceedings. International Workshop on Challenges in Web Information Retrieval and Integration

In order to realize Web information retrieval using characteristic tree structured patterns in semistructured Web documents, methods for discovering frequent patterns or common characteristics in semistructured documents become more and more important. We have studied methods for discovering maximally frequent tree structured patterns in semistructured Web documents. A tag tree pattern is an edge...

Filter options

Data set:
ieee
Keywords:
DATA MINING
INFORMATION RETRIEVAL
INTERNET
TREE DATA STRUCTURES

Content availability

Available (8)
None (3)

Keywords

WEB PAGES (5)
HTML (4)
ACCURACY (2)
DATABASES (2)
DOCUMENT HANDLING (2)
INDEXES (2)
LAYOUT (2)
PEDIATRICS (2)
SEARCH ENGINES (2)
SECURITY (2)
WEB PAGE (2)
WEB SERVER (2)
WEB SITES (2)
XML (2)
ACCESS CONTROL (1)
ACCESS PATTERNS (1)
ACCESS PROTOCOLS (1)
ACLS (1)
ARTIFICIAL INTELLIGENCE (1)
ATTRIBUTE EXTRACTION (1)
AUTHENTICATION (1)
AUTHORISATION (1)
AUTOMATIC TABLE STRUCTURE DISCOVERY (1)
BACK-END DATABASE (1)
BLOCK IMPORTANCE (1)
BOOKS (1)
CATALOG PAGE (1)
CATALOGS (1)
CATALOGUING (1)
CENTRAL URL EXTRACTION (1)
CHARACTERISTIC TREE STRUCTURED PATTERNS (1)
CLASSIFICATION ALGORITHMS (1)
CLIENT (1)
CLUSTERING ALGORITHMS (1)
CONSISTENT TEMPLATE (1)
CONSTRUCTION INDUSTRY (1)
CONTENT-BASED FEATURE (1)
COVER FILE RETRIEVAL (1)
DATA EXTRACTION (1)
DATA HANDLING (1)
DATA OBJECT (1)
DATA RETRIEVAL (1)
DATA SECURITY (1)
DATABASE (1)
DEDICATED PROXY (1)
DIRECTORY INFORMATION TREE (1)
DOCUMENT OBJECT MODEL (1)
DOM (1)
DOM TREE (1)
DOM TREE BASED ANALYSIS (1)
EDGE LABELED TREE (1)
ENGINES (1)
FASTWRAP WRAPPER GENERATION (1)
FREQUENT PATTERN DISCOVERY (1)
FREQUENT WEB ACCESS PATTERN MINING (1)
HEIGHT-CONSTRAINED VARIABLES (1)
HEURISTIC ALGORITHMS (1)
HEURISTIC RULE (1)
HUMAN INTERVENTION (1)
HUMANS (1)
IMPLICIT REGULARITY (1)
INFORMATION EXTRACTION (1)
INFORMATION LABELING (1)
INFORMATION RETRIEVAL METHOD (1)
INFORMATIVE BLOCK (1)
LABELING (1)
LDAP (1)
LDIF (1)
LEARNING (ARTIFICIAL INTELLIGENCE) (1)
LIGHTWEIGHT DIRECTORY ACCESS PROTOCOL (1)
LINEAR TIME ALGORITHM (1)
MACHINE LEARNING (1)
MACHINE LEARNING CLASSIFICATION (1)
MAXIMALLY FREQUENT TAG TREE PATTERNS (1)
MESSAGE AUTHENTICATION (1)
NAVIGATION (1)
NEWS SITE (1)
ONLINE FRONT-ENDS (1)
ONTOLOGIES (1)
ONTOLOGIES (ARTIFICIAL INTELLIGENCE) (1)
ONTOLOGY (1)
PATTERN CLASSIFICATION (1)
PATTERN CLUSTERING (1)
PATTERN MATCHING (1)
PATTERN MINING (1)
PATTERN TREE (1)
PERSONALIZATION (1)
PROBABILITY DENSITY FUNCTION (1)
PROXY (1)
RAS (1)
RECOMMENDER SYSTEMS (1)
REGULAR EXPRESSION GENERATION (1)
REVERSE METHOD (1)
SCHEMAS (1)
SEARCH ENGINE (1)
SEMI STRUCTURED WEB PAGE (1)

INFONA - science communication portal
