A new approach for searching translated plagiarism
Pataki, Máté (2012) A new approach for searching translated plagiarism. In: 5th International Plagiarism Conference.
Text: 20120712_PlagiarismConference_NewApproachForSearchingTranslatedPlagiarism.pdf (Published Version, 308kB)
Text: 20120712_PlagiarismConference_NewApproachForSearchingTranslatedPlagiarism_Slides.pdf (Presentation, 1MB)
Abstract
In 2010 we started a one-year research project on searching for cases of translational plagiarism. Most current approaches use machine translation to detect similarity between texts written in different languages, but this was not feasible for our goal, which was to develop an algorithm that also works effectively between Hungarian and English documents. Compared with other (European) languages, Hungarian poses three main obstacles: a) loose word order, b) conjugation, c) a significantly different grammar. These are also the reasons, along with the small size of the available parallel corpora, that machine translation to and from Hungarian is of little use for serious applications; its output is often not even understandable by humans. The new algorithm defines a dictionary-based distance function between sentences, which is evaluated in multiple steps to enable both a fast candidate search and a precise comparison between possible translations. In effect, it searches for all possible translations instead of relying on the single one given by an automatic translator. This approach has proved effective, and it eliminates the need to use word-sense disambiguation first (at the machine translation stage) and then synonyms in the next step of the system.
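The abstract does not spell out the distance function, so the following Python sketch only illustrates the general idea of a dictionary-based sentence distance evaluated in two steps. Everything in it is an assumption made for illustration: the toy dictionary, the function names `possible_translations`, `is_candidate` and `sentence_distance`, the `min_shared` threshold, the scoring formula, and the premise that both sentences have already been reduced to lemmas. It is not the algorithm from the paper.

```python
from typing import Dict, List, Set

# Hypothetical toy Hungarian-to-English dictionary (not from the paper).
# Each lemma maps to ALL of its listed translations, so no single sense
# has to be chosen in advance.
TOY_DICTIONARY: Dict[str, Set[str]] = {
    "kutya": {"dog", "hound"},
    "fut": {"run", "flee"},
    "kert": {"garden", "yard"},
    "gyors": {"fast", "quick", "rapid"},
}


def possible_translations(source: List[str], dictionary: Dict[str, Set[str]]) -> Set[str]:
    """Union of every dictionary translation of every source lemma."""
    bag: Set[str] = set()
    for lemma in source:
        bag |= dictionary.get(lemma, set())
    return bag


def is_candidate(source: List[str], target: List[str],
                 dictionary: Dict[str, Set[str]], min_shared: int = 1) -> bool:
    """Cheap first step: does the target share enough words with the bag?"""
    shared = possible_translations(source, dictionary) & set(target)
    return len(shared) >= min_shared


def sentence_distance(source: List[str], target: List[str],
                      dictionary: Dict[str, Set[str]]) -> float:
    """Precise step: share of known source lemmas with no translation in target.

    Returns a value between 0.0 (every known lemma matched) and 1.0 (none).
    Word order is ignored, which suits a language with loose word order.
    """
    known = [lemma for lemma in source if lemma in dictionary]
    if not known:
        return 1.0
    target_words = set(target)
    matched = sum(1 for lemma in known if dictionary[lemma] & target_words)
    return 1.0 - matched / len(known)


# Lemmatized example sentences (hypothetical).
source = ["gyors", "kutya", "fut", "kert"]
close = ["garden", "quick", "dog", "run"]   # different word order, same content
partial = ["fast", "hound", "sleep"]
far = ["plagiarism", "detection", "be", "hard"]

for target in (close, partial, far):
    if is_candidate(source, target, TOY_DICTIONARY):
        print(target, sentence_distance(source, target, TOY_DICTIONARY))
    else:
        print(target, "rejected by the candidate filter")
```

Because each source lemma is compared against every one of its dictionary translations, the sketch needs neither a word-sense disambiguation step nor a separate synonym lookup, which mirrors the motivation given in the abstract. A real system would also need morphological analysis for Hungarian, which is left out here.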
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | external, translational, plagiarism detection, algorithm |
Subjects: | Q Science > QA Mathematics and Computer Science > QA75 Electronic computers. Computer science / számítástechnika, számítógéptudomány |
Divisions: | Department of Distributed Systems |
Depositing User: | Máté Pataki |
Date Deposited: | 12 Dec 2012 08:40 |
Last Modified: | 06 Feb 2014 14:48 |
URI: | https://eprints.sztaki.hu/id/eprint/6539 |