Re: Viewing unedited posts and deleted posts, view per post, per user or per topic
by LoyceV on 04/06/2020, 08:07:48 UTC
Millions more posts added:
I have now archived the first 35.5 million posts, all of which are available online. The archive currently takes up 43 GB.
Example: my first post!

See this quote on how to use it:
Sneak preview: http://loyce.club/archive/oldposts/
How to use:
  • Find the msgID you need. Let's use 28228
  • Remove the last 5 digits from the msgID to get the directory name (if the msgID has 5 digits or fewer, use 0): 0
  • Replace the last 2 digits of the msgID with xx, and add .html (if there are fewer than 5 digits, use 0xx): 282xx.html
  • Add "#msg" and the msgID: #msg28228
  • Put everything together and go to http://loyce.club/archive/oldposts/0/282xx.html#msg28228
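
The steps above can be sketched as a small Python function. This is only an illustration of the recipe as written, not the script used to build the archive. The function name archive_url is made up for this example, and the branch for msgIDs with fewer than 5 digits simply follows the wording of the steps and has not been checked against the archive.

```python
BASE_URL = "http://loyce.club/archive/oldposts/"


def archive_url(msg_id: int) -> str:
    """Build the archive URL for a msgID, following the steps listed above."""
    if msg_id < 0:
        raise ValueError("msgID must be non-negative")
    digits = str(msg_id)

    # Directory name: drop the last 5 digits; if nothing is left, use "0".
    directory = digits[:-5] or "0"

    # File name: replace the last 2 digits with "xx".
    # For msgIDs with fewer than 5 digits the steps say to use "0xx"
    # (taken from the wording of the post, not verified against the archive).
    if len(digits) < 5:
        page = "0xx"
    else:
        page = digits[:-2] + "xx"

    # Append the anchor so the browser jumps to the post itself.
    return f"{BASE_URL}{directory}/{page}.html#msg{msg_id}"


print(archive_url(28228))
# http://loyce.club/archive/oldposts/0/282xx.html#msg28228
```

For a longer msgID such as 12345678, the same rules give directory 123 and file 123456xx.html.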

Limitations
  • Currently, the first 2.1 million posts are available.
  • I'll scrape the first 5.21 million topics and all posts in there.
  • That means I'll archive 53.36 million posts, which partially overlaps with my scraper for new posts.
  • This is a one-time thing: I won't update it with newer posts (I scrape unedited versions for those).
  • The time "scraped on" is Amsterdam time.

If no username is mentioned, it's either "Anonymous" or "random". I forgot those existed when I started scraping, and it's not important enough to start over.

This bug is not fixed yet:
I found a bug (which I'm posting here as a reminder to myself): posts on the עברי (Hebrew) board don't show up. Example: this post is missing from the archive, even though it exists.
I'll see if I can add them later. I think it has something to do with the right-to-left writing; even selecting text on that board doesn't work as expected.
Update: عربية (Arabic) has the same problem.
I'll re-scrape these boards after finishing scraping all posts.