7 comments

  • BugsJustFindMe 125 days ago
    > Google Search is now making it easier than ever to access the past.

    I don't know about easier than ever. Easier than it has been since Google killed their own version of this, which worked well until they killed it, sure.

    • RajT88 125 days ago
      From what I recall, the cache only held fairly recent versions. IA was always better for archived versions.
      • postalrat 125 days ago
        It's infuriating when there is text in the Google preview in search results that isn't in the linked document.

        My guess is that Google doesn't want to share the version of each page they get served to index.

        • dietr1ch 124 days ago
          Well, no one can ensure that the same text they found will still be there at a link. It can simply change after the last time you saw it.

          Now, building an index means that once you are done you can drop the data you had and use that space for newer data, so you might not even have it anymore. It's probably smart to keep the data so the crawler can be less aggressive about re-downloading the whole internet, but that doesn't mean the data is globally replicated and ready to serve. It might sit on just 1.5+ hard disk drives that are not publicly reachable because they are dedicated to running the indexing service, not serving end users.

          Could Google do it? Yeah, sure, but they are busy doing LLM things.

    • xnx 125 days ago
      cache:url still works (for now), but it's true there's no link to access it
      • p0358 125 days ago
        True, although only if the link is actually cached, which I find to be the case far more rarely than before. But it does work and it's still helpful...
  • xnx 125 days ago
    Cool to see the Internet Archive getting some official promotion from a megacorp like Google. I hope there's also proportional financial (and legal?) support.
  • lxgr 125 days ago
    Doesn't work for me – probably one of those things that Google rolls out over the course of months without any way of telling whether the feature flag is already enabled for you.

    In the meantime, I'll keep using this handy bookmarklet: https://gist.github.com/n-st/0dd03b2323e7f9acd98e (which obviously only works for pages that are still available; for others, it requires copy-pasting the URL).

    Also, Google/IA and I seem to have very different definitions of "easy":

    > [...] conduct a search on Google as usual. Next to each search result, you’ll find three dots—clicking on these will bring up the “About this Result” panel. Within this panel, select “More About This Page” to reveal a link to the Wayback Machine page for that website.

    The only thing that's missing is the "Beware of the Leopard" sign.

    • jonah-archive 125 days ago
      When this was posted it was rolled out for a subset of users. At this time it should be fully rolled out for all users.
      • lxgr 125 days ago
        I can see it now, thank you!

        That said, it's one full page scroll down, and that's after two clicks on not-too-obvious menus. Not exactly in your face in terms of discoverability.

        Hopefully this is just a first step – keeping archived sites indexed and/or automatically forwarding people to IA links once the original goes down would be amazing!

        Within reason, of course; there's possibly a point where it would be a bit "too discoverable" and make people exclude their site from archiving as a general precaution. (I could see people being generally fine with being archived, but not with being google-able forever.)

  • creer 125 days ago
    Internet Archive is amazing, and vulnerable: it's a single entity in a single jurisdiction that is unfriendly to this kind of effort. Are there efforts to duplicate it? For the book side, and some magazines, there is libgen and such. Is there something for the web side? Music, photos, software? Any current effort by Internet Archive themselves?

    A quick look now at archive.org didn't find much.

    There's a hint that "partner institutions" can maintain local copies, some of which might be one-off copies that aren't kept up to date.

    There was an IA.BAK project.

    There was a useful discussion 4 years ago, here:

    https://old.reddit.com/r/DataHoarder/comments/h02jl4/lets_sa...

  • ahmedfromtunis 125 days ago
    It was a bummer when Google removed access to their cached version of webpages.

    This is a step in the right direction, even though navigating the Wayback Machine often results in the tab crashing.

  • elektor 125 days ago
    Kagi does the same thing

    https://imgur.com/a/z4D8aDo

    • p0358 125 days ago
      At least here it's right there in the spotlight; in Google it's buried so deep that I doubt anyone will ever notice it.
      • msephton 117 days ago
        It really is hidden away, isn't it? What a waste of time.
  • terrycody 124 days ago
    But is there a way to check when a specific webpage was first posted online? I doubt it...
    • msephton 117 days ago
      You can ask for the oldest version of any page that is on the Wayback Machine. That may or may not be the first posting, depending on the date. The further back in time you go, the spottier the archive becomes; coverage of the late 1990s and early 2000s is quite patchy. Just enter the URL into the Wayback Machine and you'll see two dates (first and last capture) and a timeline of all the archived versions.
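
For anyone who would rather script the "oldest version" lookup than click through the timeline, here is a rough Python sketch. It assumes the Wayback Machine's public CDX endpoint returns captures oldest-first, so `limit=1` should give the earliest one, and that JSON output puts a header row first. The helper names and the sample rows are made up for illustration, and nothing here touches the network; it only builds the URLs and parses a response shaped the way I'd expect.

```python
from urllib.parse import urlencode

# Public CDX endpoint of the Wayback Machine (behaviour assumed as described above).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def oldest_capture_query(page_url: str) -> str:
    """Build a CDX query URL asking for a single capture of page_url.

    Assumption: results come back in ascending timestamp order, so the
    single row returned is the oldest capture the archive holds.
    """
    params = {"url": page_url, "limit": 1, "output": "json"}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def first_capture(rows):
    """Turn a JSON CDX response (header row + data rows) into a dict.

    Returns None when the archive has no capture for the page.
    """
    if len(rows) < 2:
        return None
    header, first = rows[0], rows[1]
    return dict(zip(header, first))


def capture_url(timestamp: str, page_url: str) -> str:
    """Build the replay URL for a capture taken at the given timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


# Hypothetical response, hand-written for illustration (not real archive data).
sample_rows = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,example)/", "19970101000000", "http://example.com/", "text/html", "200", "ABC", "123"],
]

capture = first_capture(sample_rows)
if capture is not None:
    print(capture_url(capture["timestamp"], capture["original"]))
```

As said above, the oldest capture is only a lower bound on how long the page has been archived, not proof of when it was first published.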