The Scrape Escape

I hope you had a lovely weekend filled with some great coffee. Mine was a bit less lovely, because I spent a good chunk of it learning just how many websites now block scraping and crawling thanks to the flood of AI bots.

This matters for Coffee Management, because when you paste a coffee URL into Visualizer and ask it to import the details with AI, I want that to work as often as possible.

With the new changes, the scraper still tries the old, simple approach first. But when that fails, Visualizer now falls back to Crawlbase, which can fetch the page through a real browser. That is much more reliable for blocked sites, but it can also take much longer. The longer runtime broke my existing approach, because the request would time out. So I was forced to switch the whole approach to WebSockets via Action Cable.

The good news is that this actually made the whole feature much better. Instead of quietly doing mysterious things in the background while you watch the little animated logo, Visualizer now tells you exactly what it is working on. You can see it step through fetching the page, retrying in browser mode if needed, extracting the useful bits, and finally applying the results.
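
For the curious, here is a rough sketch of that flow in Python. It is not Visualizer's actual code (Visualizer does this with Action Cable). Every name in it is made up for illustration: `import_coffee`, `FetchBlocked`, the fetcher and extractor arguments, and the `report` callback that stands in for pushing a status message over the websocket.

```python
from typing import Callable, Dict

# Hypothetical step labels, mirroring the progress messages described above.
STEP_FETCH = "Fetching the page"
STEP_BROWSER = "Retrying in browser mode"
STEP_EXTRACT = "Extracting the useful bits"
STEP_APPLY = "Applying the results"


class FetchBlocked(Exception):
    """Raised by a fetcher when the site refuses the request (assumed convention)."""


def import_coffee(
    url: str,
    simple_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
    extract: Callable[[str], Dict[str, str]],
    report: Callable[[str], None],
) -> Dict[str, str]:
    """Try the cheap fetch first, fall back to the slow one, report each step."""
    report(STEP_FETCH)
    try:
        html = simple_fetch(url)
    except FetchBlocked:
        # The simple request was blocked, so use the slower browser-based fetch.
        report(STEP_BROWSER)
        html = browser_fetch(url)

    report(STEP_EXTRACT)
    details = extract(html)

    report(STEP_APPLY)
    return details
```

Passing `print` as `report` is enough to watch the steps go by. In a real app the callback would broadcast each message to the open page instead, which is why a slow fallback no longer has to fit inside a single request.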

Another nice quality-of-life improvement: the PWA (what you get when you save Visualizer to your phone's home screen) now supports pull to refresh on the shot list. Small thing, but very handy if you use Visualizer on your phone and want that app-like feel.

I also spent some time rewriting the homepage and Premium messaging. No new features there, but I wanted the site to better reflect what Visualizer actually is: a single place to keep useful coffee data, learn from it, and brew better coffee because of it.

As always, if you run into bugs or have ideas for improvements, please keep opening GitHub issues. I really do read them all, and they very often end up shipped. And if you want all the nitty-gritty details, here is the full diff.

Thanks for reading, and have a great week ahead! ☕