Show HN: An API that takes a URL and returns a file with browser screenshots

(github.com)

208 points | by gkamer8 419 days ago

25 comments

xnx 419 days ago
For anyone who might not be aware, Chrome also has the ability to save screenshots from the command line using: chrome --headless --screenshot="path/to/save/screenshot.png" --disable-gpu --window-size=1280,720 "https://www.example.com"
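A minimal sketch of driving that same command from a script, using only the flags quoted above. The binary name `chrome` and both helper names are assumptions for illustration; on many systems the executable is called something else (for example `google-chrome` or `chromium`).

```python
import subprocess


def chrome_screenshot_cmd(url, out_path, width=1280, height=720, binary="chrome"):
    """Build the argv list for a headless Chrome screenshot.

    Only the flags from the comment above are used. `binary` is an
    assumption and varies by platform.
    """
    return [
        binary,
        "--headless",
        f"--screenshot={out_path}",
        "--disable-gpu",
        f"--window-size={width},{height}",
        url,
    ]


def take_screenshot(url, out_path, **kwargs):
    """Run Chrome; raises CalledProcessError if it exits non-zero."""
    cmd = chrome_screenshot_cmd(url, out_path, **kwargs)
    subprocess.run(cmd, check=True, timeout=60)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when the URL contains characters like `&` or `?`.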
- cmgriffing 419 days ago
  Quick note: when trying to do full page screenshots, Chrome does a screenshot of the current view, then scrolls and does another screenshot. This can cause some interesting artifacts when rendering pages with scroll behaviors.
  Firefox does a proper full page screenshot and even allows you to set a higher DPR (device pixel ratio) value. I use this a lot when making video content.
  Check out some of the args in FF using: `:screenshot --help`
  - wereHamster 419 days ago
    That's not the behavior I'm seeing (with Puppeteer). Any elements positioned relative to the viewport stay within the area specified by screen size (eg. 1200x800) which is usually the top of the page. If the browser would scroll down these would also move down (and potentially appear multiple times in the image). Also intersection observers which are further down on the page do not trigger when I do a full-page screenshot (eg. an element which starts animation when it enters into the viewport).
    - genewitch 419 days ago
      bravo for puppeteer, i guess? "singlefile" is the only thing i've ever seen not do weird artifacts in the middle of some site renders, or, like on reddit, just give up rendering comments and render blank space instead until the footer.
      anyhow i've been doing this exact thing for a real long time, e.g.
      https://raw.githubusercontent.com/genewitch/opensource/refs/...
      using bash to return json to some stupid chat service we were running
  - ranger_danger 419 days ago
    Where would you type that command in?
    - pavel_lishin 419 days ago
      Firefox developer console.
  - xg15 419 days ago
    I mean, if you have some of those annoying "hijack scrolling and turn the page into some sort of interactive animation experience" sites, I don't think "full page" would even be well-defined.
    - sixothree 419 days ago
      Pretty sure this refers to sticky headers. They have caused me many headaches when trying to get a decent screenshot.
- input_sh 419 days ago
  Firefox equivalent:
```
    firefox -screenshot file.png https://example.com --window-size=1280,720
```
  A bit annoyingly, it won't work if you have Firefox already open.
  - UnlockedSecrets 419 days ago
    Does it work if you use a different profile with -p?
    - paulryanrogers 419 days ago
      Maybe with --no-remote
  - genewitch 419 days ago
    on my firefox if i right click on a part of the page the website hasn't hijacked, it gives the option to "take screenshot" - which i think required enabling a setting somewhere. I hope it wasn't in about:config or wherever the dark-art settings are. I use that feature of FF to screenshot youtube videos with the subtitles moved and the scrub bar cropped out, i feel like it's a cleaner and smaller clipboard copy than using win+shift+s. Microsoft changed a lot about how windows handles ... files ... internally and screenshots are huge .png now, making me miss the days of huge .bmp.
    also as mentioned above, if you need entire sites backed up the firefox extension "singlefile" is the business. if image-y things? bulk image downloader (costs money but 100% worth; you know it if you need it: BID); and yt-dlp + ffmpeg for video, in powershell (get 7.5.0 do yourself a favor!)
```powershell
$userInput = Read-Host -Prompt '480 video download script enter URL'
Write-Output "URL:`t`t$userInput"
c:\opt\yt-dlp.exe `
-f 'bestvideo[height<=480]+bestaudio/best[height<=480]' `
--write-auto-subs --write-subs `
--fragment-retries infinite `
$userInput
```
  - blueflow 419 days ago
    > it won't work if you have Firefox already open
    now try and go ahead how you could isolate these instances so they cannot see each other. this leads into a rabbit hole of bad design.
    - yjftsjthsd-h 419 days ago
      > now try and go ahead how you could isolate these instances so they cannot see each other. this leads into a rabbit hole of bad design.
      Okay, done:
      PROFILEDIR="$(mktemp -d)"
      firefox --no-remote --profile "$PROFILEDIR" --screenshot "$PWD/output.png" https://xkcd.com
      rm -r "$PROFILEDIR"
      What's the rabbit hole?
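      A rough Python equivalent of the throwaway-profile approach, for anyone scripting it. It uses only the Firefox flags shown in the comment above; the function names and the `binary` default are assumptions for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def firefox_screenshot_cmd(profile_dir, out_path, url, binary="firefox"):
    """Build the argv list; the flags are the ones quoted in the thread."""
    return [
        binary,
        "--no-remote",
        "--profile", str(profile_dir),
        "--screenshot", str(out_path),
        url,
    ]


def take_isolated_screenshot(url, out_path):
    """Screenshot `url` using a temporary profile that is deleted afterwards."""
    out_path = Path(out_path).resolve()
    # TemporaryDirectory plays the role of `mktemp -d` plus `rm -r`.
    with tempfile.TemporaryDirectory() as profile_dir:
        cmd = firefox_screenshot_cmd(profile_dir, out_path, url)
        subprocess.run(cmd, check=True, timeout=120)
    return out_path
```

      The context manager removes the profile directory even if Firefox fails, which the shell version would only do with an extra `trap`.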
      - blueflow 419 days ago
        Whats with the dbus interface?
        - yjftsjthsd-h 418 days ago
          What?
          (If you're trying to point out that two firefoxes are capable of talking to each other via system IPC, then yes, fully isolating any two programs on the same machine requires at least containers but probably full VMs, which has nothing to do with Firefox itself, and you'd need to explain why in this situation we should care)
  - amelius 419 days ago
    > A bit annoyingly, it won't work if you have Firefox already open.
    I hate it when applications do this.
  - cmgriffing 419 days ago
    LOL, you and I posted very similar replies at the same time.
- azhenley 419 days ago
  Very nice, I didn't know this. I used pyppeteer and selenium for this previously which seemed excessive.
- martinbaun 419 days ago
  Oh man, I needed this so many times didn't even think of doing it like this. I tried using Selenium and all different external services. Thank you!
  Works in chromium as well.
- antifarben 416 days ago
  Does anyone know whether this would also be possible with Firefox, including explicit extensions (i.e. uBlock) and explicit configured block lists or other settings for these extensions?
- hulitu 417 days ago
  > Chrome also has the ability to save screenshots
  Too bad that no browser is able to print a web page.
- Onavo 419 days ago
  What features won't work without GPU?
  - kylecazar 419 days ago
    This flag isn't valid anymore in the new chrome headless. Disable GPU doesn't exist unless you're on the old version (and then, it was meant as a workaround for Windows users only).
    I've used this via selenium not too long ago
  - xnx 419 days ago
    [flagged]
    - dingnuts 419 days ago
      oh good an AI summary with none of the facts checked, literally more useless than the old lmgtfy and somehow more rude
      "here's some output that looks relevant to your question but I couldn't even be arsed to look any of it up, or copy paste it, or confirm its validity"
jot 419 days ago
If you’re worried about the security risks, edge cases, maintenance pain and scaling challenges of self hosting there are various solid hosted alternatives:
- https://browserless.io - low level browser control
- https://scrapingbee.com - scraping specialists
- https://urlbox.com - screenshot specialists*
They’re all profitable and have been around for years so you can depend on the businesses and the tech.
* Disclosure: I work on this one and was a customer before I joined the team.
- ALittleLight 419 days ago
  Looking at your urlbox - pretty funny language around the quota system.
  >What happens if I go over my quota?
  >No need to worry - we won't cut off your service. We automatically upgrade you to the next tier so you benefit from volume discounts. See the pricing page for more details.
  So... If I go over the quota you automatically charge me more? Hmm. I would expect to be rejected in this case.
  - jot 419 days ago
    I’m sure we can do better here.
    In my experience our customers are more worried about having the service stop when they hit the limit of a tier than they are about being charged a few more dollars.
    - ALittleLight 419 days ago
      Maybe I'm misreading. It sounds like you're stepping the user up a pricing tier - e.g. going from 50 a month to 100 and then charging at the better rate.
      I would also worry about a bug on my end that fires off lots of screenshots. I would expect a quota or limit to protect me from that.
      - jot 419 days ago
        That’s right. On our standard self-service plans we automatically charge a better rate as volume increases. You only pay the difference between tiers as you move through them.
        It’s rare that anyone makes that kind of mistake. It probably helps that our rate limits are relatively low compared to other APIs and we email you when you get close to stepping up a tier. If you did make such a mistake we would, like all good dev tools, work with you to resolve. If it happened a lot we might introduce some additional controls.
        We’ve been in this business for over 12 years and currently have over 700 customers so we’re fairly confident we have the balance right.
        - ALittleLight 418 days ago
          I'm not a customer, so don't take what I say too seriously, but to me it seems like you are unilaterally making a purchasing decision on my behalf. That is, I agreed to pay you 50 dollars a month and you are deciding I should pay 100 (or more) - to "upgrade" my service. My intuition is that this is probably not legal, and, if I were a customer, I would not pay for a charge that I didn't explicitly agree to - if you tried to charge me I would reject it at the credit card level.
          If I sign up for a service to pay X and get Y, then I expect to pay X and get Y - even if my automated tools request more than Y - they should be rejected with a failure message (e.g. "quota limit exceeded").
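          The "reject instead of upgrade" behaviour described above could be sketched like this. It is a toy in-memory counter, not how urlbox or any other service actually meters usage; `HardQuota` and `QuotaExceeded` are hypothetical names.

```python
class QuotaExceeded(Exception):
    """Raised when a request would go past the purchased quota."""


class HardQuota:
    """In-memory counter that refuses requests beyond a fixed limit."""

    def __init__(self, monthly_limit):
        self.monthly_limit = monthly_limit
        self.used = 0

    def consume(self, n=1):
        """Record `n` requests and return how many remain.

        Raises QuotaExceeded (and records nothing) if the limit would be
        passed, so a runaway client fails loudly instead of being billed.
        """
        if self.used + n > self.monthly_limit:
            raise QuotaExceeded("quota limit exceeded")
        self.used += n
        return self.monthly_limit - self.used
```

          An API built this way would typically translate the exception into an error response that tells the client the quota is exhausted.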
- edm0nd 419 days ago
  https://www.scraperapi.com/ is good too. Been using them to scrape via their API on websites that have a lot of captchas or anti scraping tech like DataDome.
- rustdeveloper 419 days ago
  Happy to suggest another web scraping API alternative I rely on: https://scrapingfish.com
  - xeornet 419 days ago
    What’s the chance you’re affiliated? Almost every one of your comments links to it. And curiously similar interest in Rust from the official HN page and yours. No need to be sneaky.
- bbor 419 days ago
  Do these services respect norobot manifests? Isn't this all kinda... illegal...? Or at least non-consensual?
  - basilgohar 419 days ago
    robots.txt isn't legally binding. I am interested to know if and how services even interact with it. It's more like a clue about where the interesting content for scrapers is on your site. This is how I imagine it goes:
    "Hey, don't scrape the data here."
    "You know what? I'll scrape it even harder!"
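    For services that do want to honour it, Python's standard library ships a parser. A minimal sketch follows; the robots.txt content and the user-agent string `example-screenshot-bot` are made up for illustration, and nothing here says what the hosted services mentioned above actually do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example needs no network.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, url, user_agent="example-screenshot-bot"):
    """Return True if the rules allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)
```

    A screenshot service could run a check like `may_fetch` before loading a page and refuse the job when it returns False.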
    - bbor 419 days ago
      Soooo nonconsensual.
      Maybe bluesky is right… are we the baddies?
    - tonyhart7 419 days ago
      it is legally binding if your company is based in SV (only California implements this law) and they can prove it
  - fc417fc802 419 days ago
    [dead]
- theogravity 419 days ago
  there's also our product, Airtop (https://www.airtop.ai/), which is under the scraping specialist / browser automation category that can generate screenshots too.
  - kevinsundar 419 days ago
    Hey I'm curious what your thoughts are: do you need a full-blown agent that moves the mouse and clicks to extract content from webpages, or is a simpler tool that just scrapes pages, takes screenshots, and passes them through an LLM generally pretty effective?
    I can see niche cases like videos or animations being better understood by an agent though.
    - theogravity 418 days ago
      Airtop is designed to be flexible: you can use it as part of a full-blown agent that interacts with webpages or as a standalone tool for scraping and screenshots.
      One of the key challenges in scraping is dealing with anti-bot measures, CAPTCHAs, and dynamic content loading. Airtop abstracts much of this complexity while keeping it accessible through an API. If you're primarily looking for structured data extraction, passing pages through an LLM can work well, but for interactive workflows (e.g., authentication, multi-step navigation), an agent-based approach might be better. It really depends on the use case.
jchw 419 days ago
One thing to be cognizant of: if you're planning to run this sort of thing against potentially untrusted URLs, the browser might be able to make requests to internal hosts in whatever network it is on. It would be wise, on Linux, to use network namespaces, and block any local IP range in the namespace, or use a network namespace to limit the browser to a wireguard VPN tunnel to some other network.
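As a first line of defence in application code, a sketch like the following could reject URLs whose host resolves to an internal address before the browser is ever launched. The function names are assumptions for illustration. It is deliberately weaker than the network-namespace approach above: it does not cover redirects, subresource requests made by the page, or DNS rebinding between this check and the browser's own lookup.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_internal_ip(ip_text):
    """True for loopback, private, link-local and other non-public addresses."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def url_targets_internal_host(url):
    """Resolve the URL's host and report whether any address is internal.

    Unparseable or unresolvable hosts are treated as internal (fail closed).
    """
    host = urlsplit(url).hostname
    if host is None:
        return True
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True
    # Strip any IPv6 scope id ("fe80::1%eth0") before parsing.
    return any(is_internal_ip(info[4][0].split("%")[0]) for info in infos)
```

Note that `169.254.169.254`, the metadata address used by several cloud providers, is link-local and is caught by this check.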
- leptons 419 days ago
  This is true for practically every web browser anyone uses on any site that they don't personally control.
  - jchw 419 days ago
    This is true, although I think in a home environment, there aren't as many interesting things to hit, and you're limited by Same Origin Policy, as well as certain mitigations that web browsers deploy against attacks like DNS Rebinding. However, if you're running this on a server, there's a much greater likelihood that interesting services are behind the firewall, e.g. maybe the Kubernetes API server. Code execution could potentially be a form post away.
- remram 419 days ago
  Very important note! This is called Server-Side Request Forgery (SSRF).
- anonzzzies 419 days ago
  Is there a self hosted version that does this properly?
- jot 419 days ago
  Too many developers learn this the hard way.
  It’s one of the top reasons larger organisations prefer to use hosted services rather than doing it themselves.
morbusfonticuli 419 days ago
Similar project: gowitness [1].
A really cool tool i recently discovered. Besides scraping websites, taking screenshots, and saving them in multiple formats (including sqlite3), it can grab and save the headers, console logs & cookies, and it has a super cool web GUI to access all the data and compare e.g. the different records.
I'm planning to build my personal archive.org/waybackmachine-like web-log tool via gowitness in the not-so-distant future.
[1] https://github.com/sensepost/gowitness
quink 419 days ago
> SCREENSHOT_JPEG_QUALITY
Not two words that should be near each other, and JPEG is the only option.
Almost like it’s designed to nerd-snipe someone into a PR to change the format based on Accept headers.
- gkamer8 419 days ago
  > Almost like it's designed to nerd-snipe someone into a PR to change the format based on Accept headers
  pls
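  For whoever takes the bait, the Accept-header negotiation could look roughly like this. It is a simplified sketch, not the project's code: the supported-format list and the function name are assumptions, and it ignores some details of real content negotiation (specificity ordering, explicit `q=0` exclusions).

```python
def pick_image_format(
    accept_header,
    supported=("image/jpeg", "image/png", "image/webp"),
    default="image/jpeg",
):
    """Pick the supported image type with the highest q-value.

    Falls back to `default` when nothing in the header matches.
    """
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        media = fields[0].strip().lower()
        q = 1.0  # q defaults to 1 when not given
        for field in fields[1:]:
            name, _, value = field.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        for candidate in supported:
            # Wildcards match the first supported type.
            if media in (candidate, "image/*", "*/*") and q > best_q:
                best, best_q = candidate, q
                break
    return best or default
```

  Keeping JPEG as the default preserves the current behaviour for clients that send no Accept header at all.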
westurner 419 days ago
simonw/shot-scraper has a number of cli args, a GitHub actions repo template, and docs: https://shot-scraper.datasette.io/en/stable/
From https://news.ycombinator.com/item?id=30681242 :
> Awesome Visual Regression Testing > lists quite a few tools and online services: https://github.com/mojoaxel/awesome-regression-testing
> "visual-regression": https://github.com/topics/visual-regression
hedora 419 days ago
It'd be nice if it produced a list of bounding boxes + the URLs you'd get if you clicked on each bounding box.
Then it'd be close to my dream of a serverless web browser service, where the client just renders a clickmap .png or .webp, and the requests go to a farm of "one request per page load" ephemeral web browser instances. The web browsers could cache the images + clickmaps they return in an S3 bucket.
Assuming the farm of browsers had a large number of users, this would completely defeat fingerprinting + cookies. It'd also provide an archive (as in durable, not as in high quality) of the browsed static content.
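A sketch of what the client side of such a clickmap might look like: a list of link rectangles delivered next to the image, plus a lookup from click coordinates to URL. The `LinkBox` shape and `resolve_click` are hypothetical; the project discussed here does not return this data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LinkBox:
    """A clickable rectangle in screenshot pixel coordinates."""

    x: int
    y: int
    width: int
    height: int
    url: str

    def contains(self, px, py):
        return (
            self.x <= px < self.x + self.width
            and self.y <= py < self.y + self.height
        )


def resolve_click(boxes: Sequence[LinkBox], px: int, py: int) -> Optional[str]:
    """Return the URL under the click, or None if it hit no link.

    Boxes later in the list are assumed to be drawn on top, so the
    search runs in reverse.
    """
    for box in reversed(boxes):
        if box.contains(px, py):
            return box.url
    return None
```

The client would then send the resolved URL back to the browser farm as the next page request.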
mlunar 419 days ago
Similar one I wrote a while ago using Puppeteer for low-power IoT displays. Neat trick is that it learns the refresh interval, so that it takes a snapshot just before it's requested :) https://github.com/SmilyOrg/website-image-proxy
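One way the "learn the refresh interval" trick could work, as a guess at the idea rather than a description of that repo's implementation: estimate the interval from the gaps between past requests and start rendering slightly before the next one is due. All names here are hypothetical.

```python
from statistics import median


def predict_next_request(request_times):
    """Estimate when the next request arrives (seconds, same clock as input).

    Uses the median gap between past requests so one late or early
    request does not skew the estimate. Returns None with too little data.
    """
    if len(request_times) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(request_times, request_times[1:])]
    return request_times[-1] + median(gaps)


def next_render_time(request_times, render_seconds):
    """When to start rendering so the image is fresh as the request lands."""
    predicted = predict_next_request(request_times)
    if predicted is None:
        return None
    # Never schedule in the past relative to the last observed request.
    return max(request_times[-1], predicted - render_seconds)
```

With requests seen at 0, 60, 120 and 181 seconds and a 5 second render, the median gap is 60, so rendering would start at 236 for a request expected at 241.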
rpastuszak 419 days ago
Cool! I'm using sth similar on my site to generate screenshots of tweets (for privacy purposes):
https://untested.sonnet.io/notes/xitterpng-privacy-friendly-...
manmal 419 days ago
Being a bit frustrated with Linkwarden’s resource usage, I’ve thought about making my own self hosted bookmarking service. This could be a low effort way of loading screenshots for these links, very cool! It'll be interesting to see how many concurrent requests this can process.
- OptionOfT 419 days ago
  Have you looked into Wallabag?
  - manmal 417 days ago
    Thanks for the tip, this looks interesting. The iOS app seems not well designed, not sure I'd use this over the My Links app (which I could use if I made a Linkwarden compatible API).
codenote 419 days ago
I thought the amount of code was small enough that it could have been included in Abbey. https://github.com/US-Artificial-Intelligence/abbey
Was the motivation for separating it based on security considerations, as stated in the "Security Considerations"? https://github.com/US-Artificial-Intelligence/ScrapeServ?tab...
- gkamer8 419 days ago
  Yes, sort of - that and scaling reasons. It's actually in that same repo now but in a different service. I'd like to remove it from the Abbey repo entirely eventually.
  - codenote 416 days ago
    Thank you! I’ll also try to focus on building a scalable architecture.
kevinsundar 419 days ago
I'm looking for something similar that can also extract the diff of content on the page over time, in addition to screenshots. Any suggestions?
I have a homegrown solution using an LLM and scrapegraphai for https://getchangelog.com but would rather offload that to a service that does a better job rendering websites. There's some websites that I get error pages from using playwright, but they load fine in my usual Chrome browser.
- arnoldcjones 419 days ago
  Good point on offloading it. Given the amount of work required to set up a wrapper for something like Puppeteer or Playwright that also works with what is probably a quite specific setup, I've found the best way to get a quality image consistently is to just subscribe to one of the many SaaS products out there that already do this well. Some of the comments above suggest some decent screenshot-as-a-service products.
  Really depends on how valuable your time is over your (or your company's) money. I prefer going for the quality (and more $) solution rather than the solution that boasts cheap prices, as I tend to avoid headaches of unreliable services. Sam Vimes "Boots" theory and all that.
  For image comparison I've always found using pixelmatch by Mapbox works well for PNGs
  https://github.com/mapbox/pixelmatch
- caelinsutch 419 days ago
  The easiest solution to this is probably extracting / formatting the content, then running a diff on that. Otherwise you could use snapshot testing algorithms as a diffing method. We use browserbase and olostep which both have strong proxies (first one gives you a playwright instance, second one just screenshot + raw HTML).
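  The "extract the content, then diff it" approach can be sketched with the standard library alone. This assumes the page text has already been extracted and stored per visit; `content_diff` is a hypothetical helper name.

```python
import difflib


def content_diff(old_text, new_text):
    """Return unified-diff lines between two snapshots of extracted page text.

    An empty list means the content did not change.
    """
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )
```

  Diffing extracted text rather than raw HTML avoids noise from markup changes such as rotated class names or injected script tags.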
joshstrange 419 days ago
This is cool but at this point MCP is the clear choice for exposing tools to LLMs, I'm sure someone will write a wrapper around this to provide the same functionality as an MCP-SSE server.
I want to try this out though and see how I like it compared to the MCP Puppeteer I'm using now (which does a great job of visiting pages, taking screenshots, interacting with the page, etc).
mpetrovich 419 days ago
Reminds me of this open source library I wrote to do the same thing: https://github.com/nextbigsoundinc/imagely
It uses puppeteer and chrome headless behind the scenes.
nottorp 418 days ago
Why is the repo called artificial intelligence when it just runs browsers?
tantaman 419 days ago
us ai?
- bangaladore 419 days ago
  The website [1] is very strange. What does U.S. stand for? If I were to stumble on this I'd assume it was a phishing / scam website trying to impersonate the government. Bad vibes all around.
  [1] - https://us.ai/
  - johnmaguire 419 days ago
    Unrelated, but I continue to be confused by "ClaudeMind" a JetBrains Plugin by "73signals": https://plugins.jetbrains.com/plugin/25082-claudemind
    Their website doesn't even mention 73signals: https://claudemind.com/
    Surely Anthropic must have an issue with this use of their trademark? And 73signals seems so similar to 37signals as to be intentional.
  - gkamer8 419 days ago
    I'll try to improve the vibes :(
    I've been working at this startup for almost two years now and that page and branding etc has been changing a lot as you can imagine ...
    - bangaladore 419 days ago
      But what is the branding?
      United States AI?
      Like the premise of the company name is bad. Real bad.
    - bbor 419 days ago
      A) Thanks for sharing your OSS with the world!!
      B) I'm also a little confused. Surely that domain cost(s) $$$ -- why not go with a cute "us" branding rather than "U.S."? Unless you're looking to sell in other countries where maybe U.S. expertise is a selling point, this definitely comes across like you're pretending to be part of the government.
      EDIT: For comparison, we.ai costs $500,000/y (!!!)
      EDIT2: It looks like you're positioning yourself as a defense/govt contractor, thus the branding? That's certainly cool, but IMHO, if I were you and owned that domain, I'd offer it to Palantir for $$$$$ and just go with your second choice. They're currently starting in on a whole genocide/global war thing, so they have cash to burn!
      - gkamer8 419 days ago
        Hi thanks! The domain actually used to be a redirect link to U.S. Automotive Industries (a trade publication). I reached out to them and got a deal, so it was a lot for me but not, like, we.ai expensive lol.
        The name was always a corporate placeholder and I liked the idea of US Steel or General Electric type names. Some startups have done similar things, and many people actually like the name a ton. But I know it's controversial and so any products I made have their own names and branding that's pretty separate (see: Abbey).
        Over the past few months I've gone the gov contracting route and the name actually made some sense, so I've used it raw. Still, the plan is to get a DBA in the near future and switch it up. Thanks for the advice!
        - bbor 419 days ago
          Ok that’s actually kinda hilarious — hopefully some blogger picks up that tidbit. I bet there aren’t many people using “ai” for “automotive industry” anymore!
    - sgerenser 419 days ago
      Props for using Garamond Condensed and giving me flashbacks to 1990s Apple.
  - tolerance 419 days ago
    The similarities with WhiteHouse.gov’s design can’t be much help either, I imagine.
    - bangaladore 419 days ago
      I thought the same, but I didn't double check so I did not mention it.
  - xp84 419 days ago
    Just a totally normal domain for an Anguillan perspective on all things America
- wildzzz 419 days ago
  It's one guy running his little AI startup fresh out of college. Claims to be a former national security analyst but makes no such claim on his LinkedIn.
  - gkamer8 419 days ago
    Thanks for the catch on my LinkedIn, I really should have that there now. It was originally something I kept private.
    - wildzzz 419 days ago
      Thanks, I'm always on the lookout for people with suspicious or over-exaggerated credentials cough-Lex Friedman-cough. Is the national security paper public? Is it something about Ufimtsev?
      - gkamer8 419 days ago
        Hi, it is unfortunately not public and cannot be made so to my understanding. It was frustrating to talk about in job interviews for that reason and therefore was not on the LinkedIn.
      - throwaway314155 419 days ago
        > cough-Lex Friedman-cough
        Oh please elaborate!
        - standardly 419 days ago
          listen to any episode and it's evident
          - throwaway314155 418 days ago
            Yeah I mean I gathered that, still would be interesting to know what specifically he lied about.
- ge96 419 days ago
  the very same
aspeckt-112 419 days ago
I’m looking forward to giving this a go. Great idea!
robertclaus 419 days ago
We developed a service like this internally at a previous company. It was nice to have a generic "preview" generating service.
cchance 419 days ago
The fact the github doesn't have a screenshot seems... like a sad omission
_nolram 419 days ago
I'm working on a project that requires automated website screenshots, and I've hit the cookie banner problem. I initially tried a brute-force approach, cataloging common button classes and text to simulate clicks, but the sheer variety of implementations makes it unmanageable. So many different classes, button texts etc. I've resorted to "https://screenshotone.com", because it takes a perfect screenshot every time, never had a single cookie banner visible on the screenshots.
I would really like to know how this is handled. Maybe there is someone here that can share some knowledge.
synthomat 419 days ago
That's nice and everything but what to do about the EU cookie banners? Does hosting outside of the EU help?
- cess11 419 days ago
  No. Tell the services you're using to stop with the malicious compliance.
- busymom0 419 days ago
  Would recommend using SeleniumBase's CDP mode to search for those substrings, click accept on those cookie banners and then take screenshot.
- gkamer8 419 days ago
  Yeah the EU cookie banners are annoying, I'm hoping to do some automation to click out of them before taking the screenshots
  - cjr 419 days ago
    There are browser extensions you could run like consent-o-matic to try to click and hide the cookies from your screenshots:
    https://chromewebstore.google.com/detail/consent-o-matic/mdj...
    Otherwise using a combination of well-known class names, ‘accept’ strings, and heuristics such as z-index, position: fixed/sticky etc can also narrow down the number of likely elements that could be modals/banners.
    You could also ask a vision model whether a screenshot has a cookie banner, and ask for co-ordinates to remove it, although this could get expensive at scale!
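    The class-name / text / z-index heuristics could be combined into a simple score, along these lines. The hint lists, weights and threshold are made-up starting points rather than tuned values, and `ElementInfo` stands in for whatever your browser automation layer reports about each element.

```python
from dataclasses import dataclass

# Illustrative hint lists; real-world banners need a much longer catalogue.
BANNER_CLASS_HINTS = ("cookie", "consent", "gdpr")
ACCEPT_TEXT_HINTS = ("accept", "agree", "allow all")


@dataclass
class ElementInfo:
    """The few properties the heuristic looks at (hypothetical shape)."""

    class_names: str
    text: str
    position: str  # computed CSS `position`
    z_index: int


def banner_score(el):
    """Higher score means more likely to be a cookie banner or modal."""
    score = 0
    if any(hint in el.class_names.lower() for hint in BANNER_CLASS_HINTS):
        score += 2
    if any(hint in el.text.lower() for hint in ACCEPT_TEXT_HINTS):
        score += 2
    if el.position in ("fixed", "sticky"):
        score += 1
    if el.z_index >= 1000:
        score += 1
    return score


def likely_banners(elements, threshold=4):
    """Keep elements whose combined evidence reaches the threshold."""
    return [el for el in elements if banner_score(el) >= threshold]
```

    Requiring several signals at once is what keeps ordinary content that merely contains the word "accept" from being removed.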
    - gkamer8 419 days ago
      Thanks, that's a great idea! I was originally going to go the vision model route because I'd also like people to be able to send instructions to sign in with some credentials (like when visiting the nytimes or something).
    - artur_makly 419 days ago
      yeah that's what we basically did here at https://VisualSitemaps.com, but it can also quickly become over-the-top, and you may end up removing important content. That's why in the end we added a second option to just manually enter CSS classes.
s09dfhks 418 days ago
hmm cant defeat cloudflare unfortunately, otherwise not bad
ranger_danger 419 days ago
No license?
- gkamer8 419 days ago
  Oh wow, totally forgot. Just added MIT.
Mani_Pathak 419 days ago
[dead]