Search results for: Xiaolong Wang

Items from 1 to 5 out of 5 results

chapter

A Block Segmentation Based Approach for Web Information Extraction

Chanwei Wang, Chengjie Sun, Lei Lin, Xiaolong Wang

2010 International Conference on Asian Language Processing > 154 - 157

2010 International Conference on Asian Language Processing (IALP 2010)

This paper addresses the issue of web information extraction to support automatic teacher information management. We propose an effective approach based on block segmentation. First, the teacher introduction web pages are divided into independent blocks, with HTML tags and punctuation marks used as segmentation criteria. Then a CRF model is employed to label the text. We apply this approach on...

chapter

Extracting Event Temporal Information Based on Web

Bo Yuan, Qingcai Chen, Xiaolong Wang, Liwei Han

2009 Second International Symposium on Knowledge Acquisition and Modeling > 1 > 346 - 350

2009 Second International Symposium on Knowledge Acquisition and Modeling (KAM 2009)

Temporal information is an important characteristic of an event. It can be used in the information retrieval process to organize the returned results. In Chinese, time expressions take very complex forms, which makes it difficult both to recognize a time expression accurately and to connect it precisely with a given event in a Web page that contains multiple events. To address these problems,...

chapter

STRank: A SiteRank algorithm using semantic relevance and time frequency

Hongzhi Guo, Qingcai Chen, Xiaolong Wang, Zhiyong Wang, et al.

2009 IEEE International Conference on Systems, Man and Cybernetics > 4876 - 4881

2009 IEEE International Conference on Systems, Man and Cybernetics. SMC 2009

Most research on Web information processing concentrates on Web pages and the hyperlinks among them. The important fact that a Web page is just one building block of a whole Website has been ignored. The situation has gradually changed in recent years, however, owing to the need for Website reputation calculation, high-level Website structure mining, etc. It causes the Website ranking...

chapter

Basic semantic units based web page content extraction

Jingqi Wang, Qingcai Chen, Xiaolong Wang, Hongzhi Guo

2008 IEEE International Conference on Systems, Man and Cybernetics > 1489 - 1494

2008 IEEE International Conference on Systems, Man and Cybernetics (SMC 2008)

Web page content extraction can be achieved by node-based or segmentation-based algorithms on top of the document object model (DOM). However, the node-based algorithm often removes content embedded as anchor text, while the segmentation-based approach cannot distinguish irrelevant text from content text when both fall into the same segment. The two kinds of algorithms don't keep...

chapter

Semantic feature reduction in chinese document clustering

Xianjun Meng, Qingcai Chen, Xiaolong Wang

2008 IEEE International Conference on Systems, Man and Cybernetics > 3721 - 3726

2008 IEEE International Conference on Systems, Man and Cybernetics (SMC 2008)

Text clustering techniques are usually used to structure text documents into topic-related groups, which help users gain a comprehensive understanding of a corpus or of the results from an information retrieval system. Most existing text clustering algorithms, which derive from traditional formatted-data clustering, rely heavily on term analysis methods and adopt the vector space model (VSM) as...

Filter options

Keywords:
WEB PAGES

Keywords

DATA MINING (4)
INFORMATION RETRIEVAL (4)
INTERNET (3)
HTML (2)
NATURAL LANGUAGE PROCESSING (2)
TAGGING (2)
TEXT ANALYSIS (2)
TIME FREQUENCY ANALYSIS (2)
ACCURACY (1)
ALGORITHM DESIGN AND ANALYSIS (1)
ANCHOR TEXT (1)
AUTOMATIC TEACHER INFORMATION MANAGEMENT (1)
BASIC SEMANTIC UNIT (1)
BLOCK SEGMENTATION (1)
BUILDINGS (1)
CHINESE DOCUMENT CLUSTERING (1)
CLUSTERING ALGORITHMS (1)
CLUSTERING METHOD (1)
CONFERENCES (1)
CONTENT BASED FEATURES (1)
CONTENT EXTRACTION (1)
CONTENT MANAGEMENT (1)
CRF (1)
CRF MODEL (1)
DAMPING (1)
DICTIONARIES (1)
DISTANCE MEASUREMENT (1)
DOCUMENT OBJECT MODEL TREE (1)
DOCUMENT REPRESENTATION (1)
EDUCATIONAL ADMINISTRATIVE DATA PROCESSING (1)
EDUCATIONAL INSTITUTIONS (1)
EVENT TEMPORAL INFORMATION EXTRACTION (1)
FEATURE EXTRACTION (1)
FEATURE SELECTION (1)
GRAPH THEORY (1)
HEURISTIC RULE (1)
HIDDEN MARKOV MODELS (1)
HTML TAG (1)
HYPERLINKS (1)
INFORMATION EXTRACTION (1)
INFORMATION RETRIEVAL PROCESS (1)
INFORMATION RETRIEVAL SYSTEM (1)
INFORMATION SERVICES (1)
INNOVATIVE EVENT TIME EXTRACTION MODEL (1)
KENDALL'S TAU DISTANCE (1)
LAYOUT (1)
LINE BREAK TAG (1)
NAVIGATION (1)
NODE-BASED ALGORITHM (1)
PAGE SEGMENTATION (1)
PART-OF-SPEECH (1)
PART-OF-SPEECH TAGS (1)
PARTITIONING ALGORITHMS (1)
PATTERN CLUSTERING (1)
PEDIATRICS (1)
PROBABILITY (1)
PROBABILITY DENSITY FUNCTION (1)
PUNCTUATION MARK (1)
SEARCH ENGINES (1)
SEGMENTATION-BASED ALGORITHM (1)
SEMANTIC FEATURE REDUCTION (1)
SEMANTIC RELEVANCE (1)
SEMANTIC UNIT (1)
SEMANTIC WEB (1)
SILICON (1)
SITE RANKING (1)
SITERANK ALGORITHM (1)
SPEARMAN'S FOOTRULE DISTANCE (1)
STRANK (1)
SYNONYM (1)
TERM ANALYSIS (1)
TEXT CLUSTERING (1)
TIME FREQUENCY (1)
TRANSITION PROBABILITY (1)
TREE DATA STRUCTURES (1)
UNSUPERVISED LEARNING (1)
UPDATING FREQUENCY (1)
VECTOR SPACE MODEL (1)
WEB INFORMATION EXTRACTION (1)
WEB INFORMATION PROCESSING (1)
WEB PAGE (1)
WEB PAGE CONTENT EXTRACTION (1)
WEB SITE LINK GRAPHS (1)
WEB SITE RANKING (1)
WEB SITE REPUTATION CALCULATION (1)
WEB SITE STRUCTURE MINING (1)
WEB SITES (1)

INFONA - science communication portal
