Increasingly, unethical authors and predatory publishers are learning new tricks to make it more difficult to detect plagiarism in their writings and published articles. Here are five methods they are using to defeat automated plagiarism detection programs.
1. PDF files are made up of layers. One layer is the visual layer, and another is the text layer. It is possible to alter the unseen text layer in a PDF file by changing all the letters to mojibake (garbled characters). For example, when you highlight the text in the article below, press Control + C to copy it, and paste it into Notepad, the result is only garbage characters:
This makes automated reading of the text impossible. Unfortunately for the authors, it also means their papers cannot be indexed by Google, so they cannot be found using the search engine.
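A screening tool could flag a suspicious text layer after the text has been extracted from the PDF. The following is a minimal sketch, assuming an English-language article and text that some other tool has already extracted. The function names, the set of "expected" characters, and the 0.8 threshold are all assumptions made for illustration. A garbling scheme that maps letters to other ordinary letters would not be caught by this check.

```python
import string

# Characters we would expect in ordinary English prose (an assumption).
EXPECTED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")

def readable_share(text):
    """Return the fraction of characters that belong to the expected set."""
    if not text:
        return 0.0  # an empty text layer is treated as unreadable
    return sum(ch in EXPECTED for ch in text) / len(text)

def looks_garbled(text, threshold=0.8):
    """Flag text whose share of expected characters falls below the threshold."""
    return readable_share(text) < threshold
```

A flagged file would still need a human to compare the visible page against the copied text.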
2. Some letters in the Latin character set closely resemble letters in other character sets. For example, the Latin letter e looks almost exactly like the Cyrillic letter ie:
Latin e = e
Cyrillic ie = е
Latin a = a
Cyrillic a = а
Latin o = o
Cyrillic o = о
There are additional matches between Latin letters and letters in the Greek character set. To exploit these similarities and defeat plagiarism detection, someone would use a “find and replace” function to swap Latin letters for similar-looking letters from other character sets. Some plagiarism detection programs are designed to deal with this hack, but not all are, so the trick may succeed in some systems.
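The look-alike substitution described above can be countered in two ways: by flagging words that mix scripts, or by mapping the look-alikes back to Latin before comparison. The following is a minimal sketch of both. The `HOMOGLYPHS` table is a small illustrative sample and not a complete list, and the function names are hypothetical.

```python
import unicodedata

# Small sample of Cyrillic/Greek look-alikes (an assumption, not exhaustive).
HOMOGLYPHS = {
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}

def mixed_script_words(text):
    """Return words that combine Latin letters with letters from another script."""
    flagged = []
    for word in text.split():
        scripts = set()
        for ch in word:
            if ch.isalpha():
                # The first word of a Unicode character name is its script,
                # e.g. "LATIN", "CYRILLIC", "GREEK".
                scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
        if "LATIN" in scripts and len(scripts) > 1:
            flagged.append(word)
    return flagged

def fold_homoglyphs(text):
    """Replace known look-alike characters with their Latin counterparts."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
```

Folding should only be applied to text expected to be in a Latin-script language, since it would corrupt genuine Cyrillic or Greek passages.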
3. Another trick is to use the find-and-replace feature to convert all spaces in a document to a character from a foreign character set, and then use find-and-replace again to change the color of that character to white, so it appears as a space again. Example:
Colorado is a U.S. state that encompasses most of the Southern Rocky Mountains as well as the northeastern portion of the Colorado Plateau and the western edge of the Great Plains. Colorado is part of the Western United States, the Southwestern United States, and the Mountain States. 
Text with spaces changed to another [Chinese] character:
Text with foreign character changed to white:
This technique does leave some spacing problems that could be taken care of manually. You can select the text above to see the hidden characters.
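Because the hidden character remains in the underlying text, it can in principle be spotted by its frequency. The following is a minimal sketch, assuming English text in which ordinary spaces have been replaced by a single non-ASCII character. The function names and the 10% threshold are assumptions chosen for illustration.

```python
from collections import Counter

def likely_space_substitute(text, min_share=0.1):
    """Return the most frequent non-ASCII character if it is common enough
    to be standing in for spaces, otherwise None."""
    counts = Counter(ch for ch in text if ord(ch) > 127 and not ch.isspace())
    if not counts:
        return None
    ch, n = counts.most_common(1)[0]
    return ch if n / len(text) >= min_share else None

def restore_spaces(text):
    """Replace the suspected substitute character with ordinary spaces."""
    sub = likely_space_substitute(text)
    return text.replace(sub, " ") if sub else text
```

Text that only contains the occasional accented letter falls below the threshold and is returned unchanged.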
4. The next method is called thesaural substitution. This simply means changing words in the text to words with the same meaning. This can be done using a manual or automated process. For example, re-using the Colorado paragraph above, we might take the original text and massage it into a similar text:
Original: Colorado is a U.S. state that encompasses most of the Southern Rocky Mountains as well as the northeastern portion of the Colorado Plateau and the western edge of the Great Plains. Colorado is part of the Western United States, the Southwestern United States, and the Mountain States. 
Edited: Colorado is a United States state that includes most of the Southern Rocky Mountains and also the northeastern section of the Colorado Plateau and the western part of the Great Plains. Therefore, Colorado is a component of the Western United States, the Mountain States, and the Southwestern United States.
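Synonym swaps break exact string matching, but much of the vocabulary survives. The following is a minimal sketch that compares two passages by word-set Jaccard similarity (shared words divided by total distinct words). This measure is used here only for illustration and is not claimed to be what any particular detection program does. The function names are hypothetical.

```python
import re

def word_set(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Return the Jaccard similarity of the word sets of two passages."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

original = "Colorado is a U.S. state that encompasses most of the Southern Rocky Mountains"
edited = "Colorado is a United States state that includes most of the Southern Rocky Mountains"

exact_match = original in edited      # False: the wording was changed
overlap = jaccard(original, edited)   # still high: most words are shared
```

A high overlap score is only a signal for closer review, since two independent texts on the same topic also share many common words.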
5. The last trick is to find an article that is written only in a foreign language and then translate it into English, either manually or with an automatic translator. Because the plagiarism detection software’s database is unlikely to contain the article in its original language, it will not detect the article as plagiarized.
Authors who commit plagiarism want to hide evidence of their plagiarism. Predatory publishers who knowingly publish articles containing plagiarism want to prevent the plagiarism from being detected. In both cases, they sometimes use tricks to avoid detection by automated plagiarism detection programs.
The tricks can prevent open-access articles from being properly indexed in search engines. They can facilitate the publication of work that should never be published. Not all plagiarism detection software is the same, and we hope that software developers are able to defeat plagiarists’ tricks.
1. Gillam, L., Marinuzzi, J., Ioannou, P. “TurnItOff: defeating plagiarism detection systems.” In: 11th Higher Education Academy-ICS Annual Conference, University of Durham, 24–26 Aug 2010, UK.
2. Text taken from the Wikipedia article for Colorado.