Re: Talksearch.io - Advanced Bitcointalk Search Engine

An update to the enhanced search feature:

A new dataset is being uploaded to Elasticsearch. It is richer than the current unprocessed posts and includes additional metadata such as the lock type, scrape time and check time. These will be used along with other parameters to determine the order in which topics are checked for updates and how often they are checked.
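
To give a rough idea of how that metadata could drive the update schedule, here is a minimal sketch. It is not the actual Talksearch logic: the Topic fields, the boolean "locked" flag (a simplified stand-in for the lock type), the two intervals and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Topic:
    topic_id: int
    locked: bool            # assumption: simplified stand-in for the lock type
    scrape_time: datetime   # when the topic was last scraped
    check_time: datetime    # when the topic was last checked for updates


def check_interval(topic):
    # Hypothetical policy: locked topics are assumed to change rarely,
    # so they are checked far less often than open ones.
    return timedelta(days=30) if topic.locked else timedelta(hours=6)


def next_check(topic):
    # Earliest moment at which the topic should be checked again.
    return topic.check_time + check_interval(topic)


def due_topics(topics, now):
    # Keep only topics whose next check is due, most overdue first.
    # Ties are broken by the oldest scrape time.
    due = [t for t in topics if next_check(t) <= now]
    return sorted(due, key=lambda t: (next_check(t), t.scrape_time))
```

With this policy, an open topic last checked 7 hours ago is due again, while a locked topic checked at the same moment is skipped for about a month.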

In an effort to remove irrelevant data such as quotes from the search results, posts are now divided into chunks, delimited by the presence of a quote or a line separator.
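
The chunking idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code Talksearch runs: it assumes the post body is available as BBCode-style text with [quote ...]...[/quote] blocks and [hr] separators, and the name split_post is made up for the example.

```python
import re

# Assumed markup: BBCode-style quote tags and [hr] line separators.
TOKEN = re.compile(r"\[quote[^\]]*\]|\[/quote\]|\[hr\]", re.IGNORECASE)


def split_post(text):
    """Return the author's own text as a list of chunks, dropping quoted text."""
    chunks = []
    depth = 0  # current quote nesting level
    pos = 0    # start of the text not yet consumed
    for m in TOKEN.finditer(text):
        # Text before this delimiter is a chunk only when outside any quote.
        if depth == 0:
            piece = text[pos:m.start()].strip()
            if piece:
                chunks.append(piece)
        tag = m.group(0).lower()
        if tag.startswith("[quote"):
            depth += 1
        elif tag == "[/quote]":
            depth = max(depth - 1, 0)
        # An [hr] tag changes no state; it only ends the current chunk.
        pos = m.end()
    # Trailing text after the last delimiter, unless a quote was left open.
    if depth == 0:
        tail = text[pos:].strip()
        if tail:
            chunks.append(tail)
    return chunks
```

Each returned chunk could then be indexed as its own record, so a search hit points at what the poster actually wrote rather than at text quoted from someone else.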

This upload process was started yesterday, and about 4 million records have been indexed so far, out of a total estimated to be around 120 million.

The v2 indices contain the data that Talksearch will use for searching in the future. Local-language posts are also categorized to make local search easier.

I am still working on automatic scraping support. In the meantime, the v2 dataset is more recent than the original and contains posts up to March 2025.

New translated ANN links will be added shortly.