Hey LoyceV
Have you ever considered scraping the BBCode of posts instead of the HTML, or at least scraping both? This would solve some problems, one of which is accurately determining whether an image in a post has changed.
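To show what I mean, here is a rough sketch in Python. It assumes the raw BBCode of a post is already available as a string (how to fetch it is not covered here). The function names are made up for illustration, and the regex only handles plain [img] tags with optional attributes, so treat it as a starting point rather than a finished solution.

```python
import re

# Assumption: images appear as [img]URL[/img] or [img width=.. height=..]URL[/img].
# Other ways of embedding images are not handled by this sketch.
IMG_TAG = re.compile(r"\[img[^\]]*\](.*?)\[/img\]", re.IGNORECASE | re.DOTALL)


def image_urls(bbcode: str) -> list[str]:
    """Return the image URLs found in a post's BBCode, in order of appearance."""
    return [url.strip() for url in IMG_TAG.findall(bbcode)]


def images_changed(old_bbcode: str, new_bbcode: str) -> bool:
    """Report whether the list of image URLs differs between two versions of a post."""
    return image_urls(old_bbcode) != image_urls(new_bbcode)
```

Comparing the URLs taken from the BBCode means only an actual edit to an [img] tag counts as a change, and edits to the surrounding text are ignored.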
This idea might be naive, but let me know your thoughts either way.