Maybe don't rush with that. Ignoring quotes could be a feature, at least an optional one. Most of the time I would probably want something the user posted themselves, not something they quoted.
I already plan to do that. However, the index I had created was not only removing the quotes but also removing parts of the original post.
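For illustration only, here is a minimal sketch of quote stripping that avoids cutting into the author's own text. It assumes posts are stored with BBCode-style [quote]...[/quote] markers, which may not match how the scraped data actually looks; strip_quotes is a hypothetical helper, not the code used for the real index.

```python
import re

# Assumption: quotes are delimited by BBCode-style tags, possibly nested.
_QUOTE_TAG = re.compile(r"\[quote[^\]]*\]|\[/quote\]", re.IGNORECASE)

def strip_quotes(body: str) -> str:
    """Return only the text that sits outside every quote block."""
    kept = []
    depth = 0  # how many quote blocks we are currently inside
    pos = 0    # end of the last tag seen
    for match in _QUOTE_TAG.finditer(body):
        if depth == 0:
            # Text between the previous tag and this one is the author's own.
            kept.append(body[pos:match.start()])
        if match.group().startswith("[/"):
            depth = max(depth - 1, 0)  # tolerate a stray closing tag
        else:
            depth += 1
        pos = match.end()
    # Always keep the tail: with an unclosed quote this over-indexes
    # rather than dropping part of the original post.
    kept.append(body[pos:])
    return "".join(kept).strip()
```

Tracking nesting depth, rather than using a single greedy or non-greedy pattern, is one way to keep the text after the last closing tag intact.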
However, searching for partial words would be great. E.g. "bitcoin" should find "bitcoins". Or perhaps it should be an option too, for those cases where you don't want "ninja" to find "tryninja".
You do get results for "bitcoins" if you search for "bitcoin". The search treats most plural words the same as their singular forms. "Ninja" and "TryNinja" are too far apart to match, and matching them would probably result in a lot of false positives (like in this case).
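As a rough sketch of that behaviour, the snippet below reduces plurals to a common form before comparing whole tokens. It is a deliberately naive stand-in, not the stemmer the search actually uses; naive_stem and matches are hypothetical names, and the suffix rules are an assumption that only covers simple English plurals.

```python
import re

def naive_stem(word: str) -> str:
    """Reduce simple English plurals to a singular-like form."""
    w = word.lower()
    if len(w) > 4 and w.endswith("ies"):
        return w[:-3] + "y"   # "wallets" is untouched here; "parties" -> "party"
    if len(w) > 3 and w.endswith("es") and w[-3] in "sxz":
        return w[:-2]         # "addresses" -> "address", "boxes" -> "box"
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]         # "bitcoins" -> "bitcoin"
    return w

def matches(query: str, text: str) -> bool:
    """True if any whole token in text has the same stem as the query."""
    target = naive_stem(query)
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return any(naive_stem(token) == target for token in tokens)
```

Because whole tokens are compared after stemming, "bitcoin" matches "bitcoins", while "ninja" does not match "TryNinja", which is a different token rather than a plural.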
Let me know if you'd like some help with the unknown titles. I can give you a dump of post IDs and titles that could significantly reduce the number of posts you'd need to re-scrape.
That would be great.
Since it's all in a DB, it would be possible to associate a user with all of the addresses they've posted, no? I see the opposite being available and I can't help but think of also searching by user.
Not exactly; it depends on the database schema. I'm already working on this, but I still haven't found a solution that is both fast and effective.
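To make the schema point concrete, here is one possible layout, sketched with SQLite in memory. The table names, column names, and sample rows are all hypothetical and are not the actual database; whether this stays fast at the real data size is exactly the open question.

```python
import sqlite3

def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (
            post_id INTEGER PRIMARY KEY,
            author  TEXT NOT NULL
        );
        -- One row per (post, address) pair found in that post.
        CREATE TABLE post_addresses (
            post_id INTEGER NOT NULL REFERENCES posts(post_id),
            address TEXT NOT NULL,
            PRIMARY KEY (post_id, address)
        );
        -- Indexes so lookups work in both directions.
        CREATE INDEX idx_posts_author ON posts(author);
        CREATE INDEX idx_post_addresses_address ON post_addresses(address);
    """)
    # Placeholder data, not real users or addresses.
    conn.executemany("INSERT INTO posts VALUES (?, ?)",
                     [(1, "alice"), (2, "alice"), (3, "bob")])
    conn.executemany("INSERT INTO post_addresses VALUES (?, ?)",
                     [(1, "addr_A"), (2, "addr_A"), (2, "addr_B"), (3, "addr_A")])
    return conn

def addresses_for_user(conn: sqlite3.Connection, author: str) -> list[str]:
    """All distinct addresses that appear in posts by the given author."""
    rows = conn.execute(
        "SELECT DISTINCT pa.address FROM posts p "
        "JOIN post_addresses pa ON pa.post_id = p.post_id "
        "WHERE p.author = ? ORDER BY pa.address", (author,)).fetchall()
    return [row[0] for row in rows]

def users_for_address(conn: sqlite3.Connection, address: str) -> list[str]:
    """All distinct authors who posted the given address."""
    rows = conn.execute(
        "SELECT DISTINCT p.author FROM post_addresses pa "
        "JOIN posts p ON p.post_id = pa.post_id "
        "WHERE pa.address = ? ORDER BY p.author", (address,)).fetchall()
    return [row[0] for row in rows]
```

With a join table like this, both directions (address to users, user to addresses) are the same join filtered on a different indexed column.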