Search results

Items from 1 to 12 out of 12 results

chapter

Data Preprocessing in SVM-Based Keywords Extraction from Scientific Documents

Chunguo Wu, M. Marchese, Yufei Wang, M. Krapivin, et al.

2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC) > 810 - 813

2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC 2009)

Scientific documents are unstructured data consisting of natural language and are hard for scientists to read and manage. Keywords help scientists find related documents and grasp their contents quickly. In this paper we investigate a kind of data preprocessing technique used in SVM

chapter

Words Clustering Based on Keywords Indexing from Large-scale Categorization Corpora

Liu Hua

2009 Fifth International Conference on Information Assurance and Security > 1 > 407 - 410

2009 Fifth International Conference on Information Assurance and Security (IAS)

Keywords are indexed automatically for large-scale categorization corpora. Indexed keywords of more than 20 documents are selected as seed words, thus overcoming subjectivity of selecting seed words in clustering; at the same time, clustering is limited to particular category corpora and keywords indexed feature

chapter

Weighted Feature Subset Non-negative Matrix Factorization and Its Applications to Document Understanding

Dingding Wang, Tao Li, Chris Ding

2010 IEEE International Conference on Data Mining > 541 - 550

2010 10th IEEE International Conference on Data Mining (ICDM 2010)

Keyword (Feature) selection enhances and improves many Information Retrieval (IR) tasks such as document categorization, automatic topic discovery, etc. The problem of keyword selection is usually solved using supervised algorithms. In this paper, we propose an unsupervised approach that combines keyword selection and

chapter

Document space dimension reduction by Latent Semantic Analysis and Hebbian neural network

I. Mokris, L. Skovajsova

2008 6th International Symposium on Intelligent Systems and Informatics > 1 - 4

2008 6th International Symposium on Intelligent Systems and Informatics (SISY 2008)

This paper presents the comparison of the text document space dimension reduction and the text document clustering and also the keyword space dimension reduction and keyword clustering by the latent semantic analysis and by the Hebbian neural network with Oja learning rule. Results of this neural network are compared

chapter

A k-Nearest-Neighbour Method for Classifying Web Search Results with Data in Folksonomies

Ching-man Au Yeung, N. Gibbins, N. Shadbolt

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 1 > 70 - 76

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

Traditional Web search engines mostly adopt a keyword-based approach. When the keyword submitted by the user is ambiguous, the search result usually consists of documents related to various meanings of the keyword, while the user is probably interested in only one of them. In this paper we attempt to provide a solution to

chapter

Clustering Web Retrieval Results Accompanied by Removing Duplicate Documents

Xinye Li, Qinhai Yang, LinNa Zeng

2010 International Conference on Web Information Systems and Mining > 1 > 259 - 261

2010 International Conference on Web Information Systems and Mining (WISM 2010)

Since keyword-based search engines usually return a large number of results, among which are many unrelated documents and many documents with the same content, automatic clustering technology is used to classify the retrieval results. When there is a large number of Web retrieval results, the clustering process usually

chapter

Compiling Remote Files: Redefining Electronic Document Management System Infrastructure (CReED)

K.G. Alberto, C.M. Abella, M.G.C.E. Sicat, J.D. Niguidula, et al.

2009 International Conference on Information and Multimedia Technology > 347 - 350

2009 International Conference on Information and Multimedia Technology (ICIMT 2009)

Remote Electronic Document (CReED) provided an access control to all documents that will grant different privileges to each user of the system. It also utilized a keyword analyzer and result matcher that will make searching and retrieving of documents faster and easier. CReED used a scanner device and file importing tool to

chapter

Real-time unsupervised classification of web documents

A. Sigogne, M. Constant

2009 International Multiconference on Computer Science and Information Technology > 281 - 286

2009 International Multiconference on Computer Science and Information Technology (IMCSIT)

This paper addresses the problem of clustering dynamic collections of web documents. We show an iterative algorithm based on a fine-grained keyword extraction (simple, compound words and proper nouns). Each new document inserted in the collection is either assigned to an existing class containing documents of the same

chapter

News Contents Recommendation Model Based on Feedback of Web Usage

Ping Ni, Jianxin Liao, Xiaomin Zhu, Keyan Ren

2009 WRI World Congress on Computer Science and Information Engineering > 4 > 431 - 435

2009 WRI World Congress on Computer Science and Information Engineering, CSIE

In this paper, reclassification for the current classification through K-means would be implemented based on the feedback of Web usage mining in order to improve the accuracy of news recommendation and convergence of classification. It could extract most relative keywords and eliminate the disturbance of multi-vocal

chapter

Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Khalid Elgazzar, Ahmed E Hassan, Patrick Martin

2010 IEEE International Conference on Web Services > 147 - 154

2010 IEEE International Conference on Web Services (ICWS)

user's request, the user has to construct the request using the keywords that best describe the user's objective and match correctly with the Web Service name or location. Clustering Web services based on function similarities would greatly boost the ability of Web services search engines to retrieve the most relevant Web

chapter

Self-organising map for document categorization using latent semantic analysis

B. Mahalakshmi, K. Duraiswamy

2010 International Conference on Innovative Computing Technologies (ICICT) > 1 - 6

2010 International Conference on Innovative Computing Technologies (ICICT)

With the increasing amount of unstructured content available electronically on the web, content categorization becomes very important for efficient information retrieval. The basic approaches for information retrieval in text documents are searching using keywords, categorization of the documents and filtering out the

chapter

Query expansion using information scent

S. Chawla, P. Bedi

2008 International Symposium on Information Technology > 3 > 1 - 8

2008 International Symposium on Information Technology

The Web has grown into a huge mass of information resources and is diverse in content. To search such a rich source of information, one has to be very precise in using keywords in queries to retrieve the relevant documents. Most of the queries issued to search engines are short and have ambiguous context. One way to produce

Filter options

Keywords:
CLUSTERING ALGORITHMS
DOCUMENT HANDLING

Keywords

INFORMATION RETRIEVAL (7)
DATA MINING (5)
PATTERN CLUSTERING (5)
INTERNET (4)
SEARCH ENGINES (4)
ARTIFICIAL NEURAL NETWORKS (3)
CLASSIFICATION ALGORITHMS (3)
INDEXING (3)
MATRIX DECOMPOSITION (3)
ACCURACY (2)
CLASSIFICATION (2)
CLUSTERING (2)
CLUSTERING METHODS (2)
COMPUTER SCIENCE (2)
FEATURE EXTRACTION (2)
HISTORY (2)
LATENT SEMANTIC ANALYSIS (2)
PATTERN CLASSIFICATION (2)
SEMANTICS (2)
VOCABULARY (2)
2D GRID (1)
ACCESS CONTROL (1)
ALGORITHM IMPLEMENTATION (1)
ALGORITHM THEORY (1)
AMBIGUOUS CONTEXT (1)
ARCHITECTURE (1)
ARCHIVING (1)
ASSOCIATION RULES (1)
AUTHORISATION (1)
AUTOMATIC CLUSTERING (1)
AUTOMATIC QUERY EXPANSION (1)
BOOTSTRAP (1)
BRIDGES (1)
CATEGORIZATION CORPORA (1)
CLASS CONTAINING DOCUMENTS (1)
CLASSIFY DOCUMENT (1)
CLASSIFYING WEB SEARCH ENGINE (1)
CLEANING (1)
CLICKED DOCUMENTS (1)
CLUSTERING METHOD (1)
CLUSTERING PROCESS (1)
CLUSTERING WSDL DOCUMENTS (1)
COLLABORATIVE TAGGING (1)
COLLABORATIVE TAGGING SYSTEM (1)
COMPILING REMOTE ELECTRONIC DOCUMENT (1)
COMPOUNDS (1)
COMPUTER BOOTSTRAPPING (1)
CONCEPT OCCURRENCES (1)
CONCEPT RELATIONSHIPS (1)
DATA ANALYSIS (1)
DATA CLUSTERING (1)
DATA COLLECTION (1)
DATA PREPROCESSING (1)
DATA SECURITY (1)
DATA VISUALIZATION (1)
DOCUMENT ACCESS HISTORY (1)
DOCUMENT CATEGORIZATION (1)
DOCUMENT CLUSTERING (1)
DOCUMENT RETRIEVAL (1)
DOCUMENT SEARCHING (1)
DOCUMENTS RETRIEVAL (1)
DOMANIAL WORDS (1)
DUPLICATE DOCUMENTS (1)
DYNAMIC COLLECTIONS WEB DOCUMENTS (1)
ELECTRON TUBES (1)
ELECTRONIC DOCUMENT MANAGEMENT SYSTEM (EDMS) (1)
ELECTRONIC DOCUMENT MANAGEMENT SYSTEM INFRASTRUCTURE (1)
FEATURE SELECTION (1)
FILE IMPORTING TOOL (1)
FINE GRAINED KEYWORD EXTRACTION (1)
FINGERPRINT RECOGNITION (1)
FOLKSONOMY (1)
FREQUENCY ESTIMATION (1)
FREQUENCY MEASUREMENT (1)
GLOBAL TECHNIQUES (1)
GROUPWARE (1)
HEBBIAN LEARNING (1)
HEBBIAN NEURAL NETWORK (1)
HEURISTIC ALGORITHMS (1)
HIDDEN MARKOV MODELS (1)
HIGH DIMENSIONAL DOCUMENT VECTOR (1)
HIGH LEVEL LANGUAGES (1)
INFORMATION INTELLIGENCE (1)
INFORMATION NEED (1)
INFORMATION RESOURCES (1)
INFORMATION SCENT (1)
INPUT QUERY (1)
ITERATIVE ALGORITHM BASED (1)
K-MEANS (1)
K-MEANS CLASSIFICATION (1)
K-NEAREST-NEIGHBOUR METHOD (1)
KEYWORD AMBIGUITY (1)
KEYWORD ANALYZER (1)
KEYWORD BASED SEARCH ENGINE (1)
KEYWORD CLUSTERING (1)
KEYWORD EXTRACTION (1)
KEYWORD SELECTION (1)
KEYWORD SPACE DIMENSION REDUCTION (1)

INFONA - science communication portal