Search results

Items from 1 to 8 out of 8 results

chapter

Research on Text Anti-plagiarism Algorithm in Chinese-English Mixed Context

Liu Yong, Huang Junhua

2010 Third International Conference on Intelligent Networks and Intelligent Systems > 357 - 361

2010 3rd International Conference on Intelligent Networks and Intelligent Systems (ICINIS 2010)

This paper proposes a multi-pattern matching algorithm, the APT (Anti-plagiarism Trie) algorithm, for Chinese-English mixed text, based on a text anti-plagiarism detector. The APT algorithm adopts the structural idea of the multi-pattern matching algorithm with an Absolute Hash Trie tree, uses a similarity measurement method in the string matching, combines the strategy of skip characters and adding condition...

chapter

Minimum edit distance-based text matching algorithm

Yu Zhao, Huixing Jiang, Xiaojie Wang

Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010) > 1 - 4

2010 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE 2010)

This paper proposes a measure, based on Minimum Edit Distance (MED), of the similarity between two sets of MultiWord Expressions (MWEs), which we use to calculate the matching degree between two documents. We test the matching algorithm in the position searching system. Experiments show that the new measure performs better than the cosine distance.
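For readers unfamiliar with the underlying measure, the following is a minimal sketch of the classical minimum edit distance between two strings, assuming unit costs for insertion, deletion, and substitution. It is not the authors' method: the abstract does not say how the distance is lifted to sets of MWEs, and the function name `min_edit_distance` is chosen here for illustration only.

```python
def min_edit_distance(a: str, b: str) -> int:
    """Classical edit distance with unit costs (an assumption, not from the paper)."""
    # prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string is i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(min_edit_distance("kitten", "sitting"))
```

The sketch keeps only two rows of the dynamic-programming table, so memory use grows with the length of the second string rather than with the product of both lengths.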

chapter

Word length based zero-watermarking algorithm for tamper detection in text documents

Z Jalil, A M Mirza, H Jabeen

2010 2nd International Conference on Computer Engineering and Technology > 6 > V6-378 - V6-382

2010 2nd International Conference on Computer Engineering and Technology (ICCET)

Copyright protection and authentication of digital content have become a major concern in the current digital era. Plain text is a widely used means of information exchange on the Internet, and it is essential to verify the authenticity of information in any form of communication. There are very limited techniques available for plain text watermarking, authentication, and tamper detection. This paper...

chapter

An Efficient Bit-Parallel Multi-Patterns Word Searching Algorithm through Splitting the Text

I. Yadav, B. Singh, S. Agarwal, R. Prasad

2009 International Conference on Advances in Recent Technologies in Communication and Computing > 406 - 410

2009 International Conference on Advances in Recent Technologies in Communication and Computing. ARTCom 2009

The word matching problem is to find all the occurrences of a pattern P[0...m-1] in the text T[0...n-1], where P neither contains any white space nor is preceded and followed by a space. In the multi-pattern word matching problem, all the occurrences of multiple words P₀, P₁, P₂, ..., Pᵣ₋₁ (r ≥ 1) in the given text T are to be reported. In the present discussion, we assume that all the patterns have equal size...
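As background for the bit-parallel approach named in the title, here is a minimal sketch of the classical single-pattern Shift-Or algorithm, followed by a whole-word filter. This is not the paper's text-splitting, multi-pattern variant, which the truncated abstract does not describe. The names `shift_or_search` and `whole_word_occurrences` are hypothetical, and treating the text boundaries as word boundaries is an assumption made here.

```python
def shift_or_search(pattern: str, text: str) -> list[int]:
    """Return start indices of all occurrences of pattern in text (classical Shift-Or)."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    all_ones = (1 << m) - 1
    # masks[c] has bit i cleared exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, all_ones) & ~(1 << i)
    state = all_ones
    hits = []
    for pos, c in enumerate(text):
        # Shift in a 0 bit, then OR in the mask; a 0 at bit m-1 means a full match.
        state = ((state << 1) | masks.get(c, all_ones)) & all_ones
        if state & (1 << (m - 1)) == 0:
            hits.append(pos - m + 1)
    return hits


def whole_word_occurrences(pattern: str, text: str) -> list[int]:
    """Keep only matches delimited by whitespace or the text boundaries (an assumption)."""
    m = len(pattern)
    return [
        s for s in shift_or_search(pattern, text)
        if (s == 0 or text[s - 1].isspace())
        and (s + m == len(text) or text[s + m].isspace())
    ]


if __name__ == "__main__":
    print(shift_or_search("ab", "ab cab ab"))
    print(whole_word_occurrences("ab", "ab cab ab"))
```

Python integers are unbounded, so the state is masked to m bits explicitly; in a language with fixed-width words the same loop is limited to patterns no longer than the word size.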

chapter

Research on Improved Algorithm for Chinese Word Segmentation Based on Markov Chain

Pang Baomao, Shi Haoshan

2009 Fifth International Conference on Information Assurance and Security > 1 > 236 - 238

2009 Fifth International Conference on Information Assurance and Security (IAS)

Chinese word segmentation is an important technique for Chinese Web data mining. After a study of some current Chinese word segmentation methods, an improved algorithm is proposed in this paper. The algorithm updates the dictionary by using a two-way Markov chain, and performs word segmentation by applying an improved forward maximum matching method based on word frequency statistics. The simulation shows...

chapter

A simplified application of regular expressions: With the extraction of Chinese cultural terms as an example

Yao Zhenjun, Ji Xiangyu

2009 ISECS International Colloquium on Computing, Communication, Control, and Management > 1 > 439 - 442

2009 ISECS International Colloquium on Computing, Communication, Control, and Management (CCCM)

This article aims to solve the problem of extracting cultural terms and their corresponding English translations from the heterogeneous literature of translations of the ancient Chinese classics. As a text processing tool, regular expressions can help to realize matching in patterned text. This research focuses on designing target-oriented regular expressions to fit the pattern...

chapter

Automatic extraction of definitions

Chunxia Zhang, Peng Jiang

2009 2nd IEEE International Conference on Computer Science and Information Technology > 364 - 368

2009 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2009)

The task of definition extraction aims to acquire definitions of terms from texts. This task is a subtask of terminology extraction, ontology construction, semantic relation learning, question answering, and so on. This paper presents a bootstrapping approach to automatically extracting definitions of domain-specific terms from unannotated Chinese free texts. Experimental results in three domains of...

chapter

Extracting Part-Whole Relations from Unstructured Chinese Corpus

Xinyu Cao, Cungen Cao, Shi Wang, Han Lu

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 175 - 179

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

An important problem in text mining is the automatic extraction of semantic relations. The paper provides a domain-independent method for automatic extraction of part-whole relations in Chinese corpora. The method consists of three phases. First, a set of lexico-syntactical patterns for part-whole relations is designed using known pairs of concepts encoding part-whole relations as seeds, and manually...

Filter options

Keywords:
COMPUTERS
PATTERN MATCHING


INFONA - science communication portal
