Achieving Linguistic Provenance via Plagiarism Detection

Nwokedi Idika; Harry Phan; Mayank Varia

doi:10.1109/ICDAR.2013.133

Source

2013 12th International Conference on Document Analysis and Recognition, pp. 648-652

Abstract

To go beyond what current provenance systems can capture for natural language text documents, we propose the Lincoln Laboratory Plagiarism for Provenance System (LLPla) as an approach for capturing linguistic provenance. Linguistic provenance infers the origin of text based on its linguistic structure. We take a plagiarism detection approach to this task as identifying similar sections of text is fundamental to linguistic provenance and central to LLPla's performance. Thus, to determine the most viable plagiarism detection algorithm for use in LLPla, we evaluate three state-of-the-art plagiarism detection algorithms. Moreover, we propose extensions to the best-performing algorithm that improve its precision with negligible effects on recall.

Identifiers

Book ISSN: 1520-5363
Book e-ISBN: 978-0-7695-4999-6
DOI: 10.1109/ICDAR.2013.133

Keywords

Plagiarism; Detection algorithms; Pragmatics; Probabilistic logic; Conferences; Generators; Laboratories; graphs; provenance; plagiarism detection

Additional information

Data set: ieee

Publisher

IEEE
