Search results

Items from 1 to 20 out of 35 results

chapter

Clustering search engine suggests by integrating a topic model and word embeddings

Tian Nie, Yi Ding, Chen Zhao, Youchao Lin, et al.

2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) > 581 - 586

2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)

This paper addresses how to get an overview of the knowledge associated with a given query keyword. In particular, we focus on the concerns of those who search for Web pages with that keyword. The Web search information needs for a given query keyword are collected through search engine suggests. Given a query keyword, we collect up to around 1,000 suggests, many of which are redundant. We cluster...

chapter

Towards Solving Comprehensibility-Relevance Trade-off in Information Retrieval

Kouichi Akamatsu, Adam Jatowt, Katsumi Tanaka

2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) > 1 > 1 - 8

2015 IEEE / WIC / ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)

Comprehensibility is an important quality aspect of documents. Incomprehensible documents are of little utility to readers even if they are relevant. However, for many difficult queries such as technical ones, the topically relevant documents tend to be characterized by poor comprehensibility. This makes it difficult for users to satisfy their information needs when searching for documents about difficult...

chapter

Combining Attributes and Links: Finding Homepage for Entity Searching

Junsan Zhang, Haoliang Sun, Qinghua Lu, Aiyan Zhang

2015 International Conference on Computational Intelligence and Communication Networks (CICN) > 1386 - 1390

2015 International Conference on Computational Intelligence and Communication Networks (CICN)

Web entities contain a wealth of information. Users would rather get a list of relevant entities than a list of web pages when they submit a query to a search engine, so research on related entity finding (REF) is meaningful work. In this paper we investigate the last task of REF: Entity Homepage Finding. In this paper, we propose a combining multi-attributes (five attributes)...

chapter

Static Analysis of JavaScript Web Applications in the Wild via Practical DOM Modeling (T)

Changhee Park, Sooncheol Won, Joonho Jin, Sukyoung Ryu

2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) > 552 - 562

2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)

We present SAFEWapp, an open-source static analysis framework for JavaScript web applications. It provides a faithful (partial) model of web application execution environments of various browsers, based on empirical data from the main web pages of the 9,465 most popular websites. A main feature of SAFEWapp is the configurability of DOM tree abstraction levels to allow users to adjust a trade-off between...

chapter

A Hybrid Model for Experts Finding in Community Question Answering

Hai Li, Songchang Jin, Shudong Li

2015 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery > 176 - 185

2015 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)

As a means to share knowledge, the community question answering (CQA) service provides users a chance to obtain or provide help by raising or answering questions. After a question is posted, the system must find an appropriate individual to answer this question. Several approaches have recently been proposed to find experts in CQA. In this paper, a new method to find experts in CQA is proposed by...

chapter

Link Analysis of Wikipedia Documents Using MapReduce

Vasa Hardik, Vasudevan Anirudh, Palanisamy Balaji

2015 IEEE International Conference on Information Reuse and Integration > 582 - 588

2015 IEEE International Conference on Information Reuse and Integration (IRI)

Wikipedia, a collaborative and user-driven encyclopedia, is considered to be the largest content thesaurus on the web, expanding into a massive database housing a huge amount of information. In this paper, we present the design and implementation of a MapReduce-based Wikipedia link analysis system that provides a hierarchical examination of document connectivity in Wikipedia and captures the semantic...

chapter

Enhancing Efficiency of Web Search Engines through Ontology Learning from Unstructured Information Sources

Eslam Amer

2015 IEEE International Conference on Information Reuse and Integration > 542 - 549

2015 IEEE International Conference on Information Reuse and Integration (IRI)

With the fast growth of information available through the World Wide Web, search engines' ranking has become too limited to deal with such an enormous amount of information. Web search engines should be enriched with methodologies that enable them to understand the content of Web pages and then align pages to the query category that best matches their content. In this paper, a proposed system is...

chapter

An Information Classification Approach Based on Knowledge Network

Huakang Li, Guozi Sun, Bei Xu, Li Li, et al.

2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs > 3 - 8

2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs (MCSoC)

Numerous critical Internet applications with high-quality services, such as Web directories, search engines, Web crawlers, recommendation systems and user profile detectors, nearly all depend on an efficient and accurate web page classification system. Traditional supervised or semi-supervised machine learning methods find it increasingly difficult to adapt to the explosive growth of Internet information. This...

chapter

User Interest Profile Identification Using Wikipedia Knowledge Database

Huakang Li, Longbin Lai, Xiaofeng Xu, Yao Shen, et al.

2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing > 2362 - 2367

2013 IEEE International Conference on High Performance Computing and Communications (HPCC) & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (EUC)

The interesting, targeted, relevant advertisement is considered as one of the most honest proceeds for personalizing recommendation. Topic identification is the most important technique for the unstructured web pages. Conventional content classification approaches based on bag of words are difficult to process massive web pages. In this paper, Wikipedia Category Network (WCN) nodes are used to identify...

chapter

A Bookmark Recommender System Based on Social Bookmarking Services and Wikipedia Categories

Takumi Yoshida, Ushio Inoue

2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing > 409 - 413

2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)

Social bookmarking services allow users to add bookmarks of web pages with freely chosen keywords as tags. Personalized recommender systems recommend new and useful bookmarks added by other users. We propose a new method to find similar users and to select relevant bookmarks in a social bookmarking service. Our method is lightweight, because it uses a small set of important tags for each user to...

chapter

Statistical Model for Content Extraction

Pir Abdul Rasool Qureshi, Nasrullah Memon

2011 European Intelligence and Security Informatics Conference > 129 - 134

2011 European Intelligence and Security Informatics Conference (EISIC)

We present a statistical model for content extraction from HTML documents. The model operates on Document Object Model (DOM) tree of the corresponding HTML document. It evaluates each tree node and associated statistical features to predict significance of the node towards overall content of the document. The model exploits feature set including link densities and text distribution across the nodes...

chapter

Automatic Annotation of Non-English Web Content

Jakub Ševcech, Mária Bieliková

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology > 3 > 281 - 284

2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)

Nowadays we are facing the daily information overload. It is thus difficult to get exactly the information we need. It often happens that while reading, we find a word we do not understand and we would need an explanation or some additional information about this word. For this purpose annotations in the Web environment are created and attached to such words. In this paper we propose a method for...

chapter

A Fast and Accurate Approach for Main Content Extraction Based on Character Encoding

Hadi Mohammadzadeh, Thomas Gottron, Franz Schweiggert, Gholamreza Nakhaeizadeh

2011 22nd International Workshop on Database and Expert Systems Applications > 167 - 171

2011 22nd International Conference on Database and Expert Systems Applications (DEXA)

This paper presents a novel approach for extracting the main content from Web documents written in languages not based on the Latin alphabet. In practice, the HTML tags are based on the English language and, certainly, the English character set is encoded in the interval [0,127] of the Unicode character set. On the other hand, many languages, such as the Arabic language, use a different interval for...

chapter

Cost-Optimal Validation Mechanisms and Cheat-Detection for Crowdsourcing Platforms

Matthias Hirth, Tobias Hoßfeld, Phuoc Tran-Gia

2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing > 316 - 321

2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS)

Crowdsourcing is becoming more and more important for commercial purposes. With the growth of crowdsourcing platforms like MTurk or Microworkers, a huge work force and a large knowledge base can be easily accessed and utilized. But due to the anonymity of the workers, they are encouraged to cheat the employers in order to maximize their income. Thus, this paper presents two crowd-based approaches...

chapter

Non-invasive Browser Based User Modeling Towards Semantically Enhanced Personlization of the Open Web

K Koidl, O Conlan, Lai Wei, A M Saxton

2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications > 35 - 40

2011 25th IEEE International Conference on Advanced Information Networking and Applications Workshops (WAINA 2011)

Currently the user's web search is disjoint from the resources which are subsequently browsed. Specifically, the related instances of the search are not displayed on the following pages. This lack of continuity between the actual search and the web sites displayed may lead to skimming by the user to identify what is relevant on the pages. This paper presents an approach to the continuous modeling of...

chapter

Accessing dynamic web page in users language

M K Sharma, P K Saha, S Sarcar, S Ghosh, et al.

IEEE Technology Students' Symposium > 35 - 38

2011 IEEE Students' Technology Symposium (TechSym)

In recent years, there has been rapid advancement in Information and Communication Technology (ICT). However, the explosive growth of ICT and its many applications in education, health, agriculture etc. are confined to a limited number of privileged people who have both language and digital literacy. At present the repositories on the Internet are mainly in English; as a consequence, users unfamiliar with English...

chapter

Using Wikipedia's Content for Cross-Website Page Recommendations that Consider Serendipity

Pei-Chia Chang, L M Quiroga

2010 International Conference on Technologies and Applications of Artificial Intelligence > 293 - 298

2010 International Conference on Technologies and Applications of Artificial Intelligence (TAAI 2010)

A majority of web personalization research concentrates on customizing a single website. On the contrary, recommending web pages across websites is the focus of this study. We emphasize that eliciting user interests among different topics within a domain is an important concern in cross-website page recommendations. Enhancing Wikipedia's categorization system through heuristic information extraction,...

chapter

Multi-Type Web Relation Extraction Based on Bootstrapping

Xiaojiang Liu, Nenghai Yu

2010 WASE International Conference on Information Engineering > 2 > 24 - 27

2010 WASE International Conference on Information Engineering (ICIE 2010)

Web-scale relation extraction is crucial to building Web people search engines. Previous extraction models, such as Snowball, focus only on single-type extraction, while real applications always require as many types of relation as possible. In this paper, we propose a novel Web-scale relation extraction framework, Multi-Type Snowball (MultiSnowball). MultiSnowball targets extracting multiple...

chapter

Country of origin determination via Web mining techniques

Markus Schedl, Cornelia Schiketanz, Klaus Seyerlehner

2010 IEEE International Conference on Multimedia and Expo > 1451 - 1456

2010 IEEE International Conference on Multimedia and Expo (ICME)

The origin of a music artist or a band is an important kind of musical meta-data as it usually influences his/her/its music. In this paper, we propose three approaches to automatically determine the country of origin of a person or institution, which we apply to music artists and bands. The first approach investigates estimates of page counts returned for specific queries to Web search engines. The...

chapter

Combining the Missing Link: An Incremental Topic Model of Document Content and Hyperlink

Huifang Ma, Zhixin Li, Zhongzhi Shi

2010 12th International Asia-Pacific Web Conference > 229 - 235

2010 12th Asia Pacific Web Conference (APWEB 2010)

The content and structure of linked information such as sets of web pages or research paper archives are dynamic and keep on changing. Even though different methods are proposed to exploit both the link structure and the content information, no existing approach can effectively deal with this evolution. We propose a novel joint model, called Link-IPLSI, to combine texts and links in a topic modeling...

Data set:
ieee
Keywords:
INTERNET
ENCYCLOPEDIAS
WEB PAGES
Publication type:
book

Content availability

Available (34)
None (1)

Keywords

ELECTRONIC PUBLISHING (24)
DATA MINING (14)
SEARCH ENGINES (8)
INFORMATION RETRIEVAL (7)
INFORMATION SERVICES (6)
TEXT ANALYSIS (5)
WIKIPEDIA (5)
ONTOLOGIES (4)
TRAINING (4)
ADAPTATION MODEL (3)
DICTIONARIES (3)
DOCUMENT HANDLING (3)
FEATURE EXTRACTION (3)
HTML (3)
NATURAL LANGUAGE PROCESSING (3)
SEMANTICS (3)
WEB SITES (3)
WORLD WIDE WEB (3)
ACCURACY (2)
BROWSERS (2)
COMPUTATIONAL MODELING (2)
CRAWLERS (2)
ENTITY RANKING (2)
FUZZY SET THEORY (2)
IMAGE RETRIEVAL (2)
INDEXES (2)
INFORMATION FILTERING (2)
LANGUAGE TRANSLATION (2)
LINK ANALYSIS (2)
META DATA (2)
ONLINE FRONT-ENDS (2)
ONTOLOGIES (ARTIFICIAL INTELLIGENCE) (2)
QUERY PROCESSING (2)
RECOMMENDER SYSTEMS (2)
SEMANTIC SEARCH (2)
WEB 2.0 (2)
WEB DOCUMENTS (2)
WEB PAGE (2)
WEB SEARCH ENGINE (2)
WEB SITE (2)
ADAPTIVE ANNOTATIONS (1)
ALGORITHM (1)
ALGORITHM DESIGN AND ANALYSIS (1)
ANALYTICAL MODELS (1)
APPROXIMATION METHODS (1)
ARTIFICIAL NEURAL NETWORKS (1)
ASCII AND NON-ASCII CHARACTER SET (1)
AUTOMATIC CLASSIFICATION (1)
AVATARS (1)
BAYES METHODS (1)
BEAM SEARCH (1)
BENCHMARK TESTING (1)
BOOK REVIEWS (1)
BOOTSTRAPPING (1)
BROWSER BASED PLUGIN (1)
BROWSING SUPPORT (1)
CAMERAS (1)
CANDIDATE QUERY (1)
CATEGORIZATION SYSTEM (1)
CHEAT-DETECTION MECHANISM (1)
CLASSIFICATION (1)
CLASSIFICATION ALGORITHMS (1)
CLUSTERING (1)
CLUSTERING ALGORITHMS (1)
COGNITIVE ERROR (1)
COMMENT FUNCTIONALITIES (1)
COMMUNITY QUESTION ANSWERING (1)
COMPUTER BOOTSTRAPPING (1)
COMPUTERS (1)
CONCEPT DESCRIPTION VECTORS (1)
CONTENT EXTRACTION (1)
CONTENT-BASED METHODS (1)
CONTENT-BASED RETRIEVAL (1)
CONTEXT (1)
CONTEXT-BASED METHODS (1)
CONTINENTS (1)
CORPUS CONSTRUCTION (1)
COUNTRY OF ORIGIN DETECTION (1)
COUNTRY-OF-ORIGIN DETERMINATION (1)
COURSEWARE (1)
CRAWLING (1)
CROSS-LANGUAGE (1)
CROSS-LANGUAGE PROCESSING (1)
CROSS-LINGUAL POPULATION/ANNOTATION SYSTEM (1)
CROSS-WEB SITE PAGE RECOMMENDATION (1)
CROWDSOURCING (1)
CYBERSPACE (1)
DATA MODELS (1)
DICTIONARY (1)
DIGITAL LIBRARIES (1)
DIGITAL LIBRARY (1)
DIGITAL LITERACY (1)
DISCRIMINATIVE IMAGE MODEL (1)
DOCUMENT CONTENT (1)
DOCUMENT IMAGE PROCESSING (1)
DYNAMIC SAMPLING (1)
DYNAMIC WEB PAGE ACCESS (1)

INFONA - science communication portal
