Board Meta
Merits 1 from 1 user
Re: "Multiple Accounts" / Copy-pasta detection scripts/bots
by Piggy on 19/09/2018, 19:27:54 UTC
⭐ Merited by LoyceV (1)
A few thoughts about spun texts:

If the spun text does not use synonyms, it may help to prepare the data before running any check, for example by reordering all the words of each sentence in alphabetical order. Two posts that differ only in word order then become identical.
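
Here is a rough sketch of that preparation step in Python. The function name `normalize_word_order` and the sample sentences are made up for illustration, and stripping punctuation and case is my own assumption about what "preparing the data" should include.

```python
import re

def normalize_word_order(sentence: str) -> str:
    """Lowercase the sentence, drop punctuation, and sort its words alphabetically."""
    # Assumption: words are runs of ASCII letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return " ".join(sorted(words))

# Two hypothetical posts that differ only in word order and punctuation.
a = normalize_word_order("Bitcoin is the best coin, I think.")
b = normalize_word_order("I think the best coin is Bitcoin!")
print(a == b)
```

This only catches reshuffled sentences. Any added, removed, or replaced word still produces a different result.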

If it does use synonyms, it becomes quite complicated: you need to be able to identify that two different words actually mean the same thing. Perhaps, as you read the sentences, you should substitute each word with a code which corresponds to a set of synonyms, then use these cleaned sentences to run the checks.
Maybe there are dictionaries ready for this sort of thing. In any case, comparing one message with all the previous messages, perhaps running multiple checks, can be quite expensive to perform.
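
To make the synonym idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the tiny `SYNONYM_GROUPS` list is hand-made (a real check would need a proper thesaurus), and the names `canonicalize`, `fingerprint`, and `check_and_store` are hypothetical. To keep the cost down, it swaps the "compare with every previous message" step for a hash lookup, which only finds posts that are identical after cleaning, not ones that are merely similar.

```python
import hashlib
import re

# Hypothetical, hand-made synonym groups; each group gets one code (S0, S1, ...).
SYNONYM_GROUPS = [
    {"good", "great", "nice"},
    {"coin", "token", "currency"},
    {"buy", "purchase"},
]
WORD_TO_CODE = {
    word: f"S{i}" for i, group in enumerate(SYNONYM_GROUPS) for word in group
}

def canonicalize(text: str) -> str:
    """Replace each word by its synonym-group code (if any), then sort the words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(sorted(WORD_TO_CODE.get(w, w) for w in words))

def fingerprint(text: str) -> str:
    """Hash the cleaned text so it can be used as a dictionary key."""
    return hashlib.sha256(canonicalize(text).encode("utf-8")).hexdigest()

# Maps fingerprint -> id of the first post seen with that fingerprint.
seen = {}

def check_and_store(post_id, text):
    """Return the id of an earlier matching post, or None if the text is new."""
    key = fingerprint(text)
    earlier = seen.get(key)
    if earlier is None:
        seen[key] = post_id
    return earlier

print(check_and_store(1, "This is a good coin to buy"))
print(check_and_store(2, "To purchase this is a great token"))
```

With the lookup table, each new message costs one hash and one dictionary access instead of a pass over all earlier messages. The price is that a spun text with even one extra word slips through, so a fuzzy comparison would still be needed for those cases.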