14 comments

  • ghm2199 18 minutes ago
    > Building a comparable one from scratch is like building a parallel national railroad..

    Not to be pedantic here but I do have a noob question or two here:

    1. One is building the index, which is a lot harder without a google offering its own API to boot. If other tech companies really wanted to break this monopoly, why can't they just do it — like they did with LLM training for base models with the infamous "pile" dataset — because the upshot of offering this index for public good would break not just google's own monopoly but also other monopolies like android, which will introduce a breath of fresh air into a myriad of UX(mobile devices, browsers, maps, security). So, why don't they just do this already?

    2. The other question is about "control", which the DoJ has provided guidance for but not yet enforced. IANAL, but why can't a state's attorney general enforce this?

  • WhyNotHugo 27 minutes ago
    The statistics in this article sound like garbage to me.

    Google used by 90% of the world?

    ~20% of the human population lives in countries where Google is blocked.

    OTOH, Baidu is the #1 search engine in China, which has over 15% of the world’s population… but doesn’t reach 1%?

    These stats are made measuring US-based traffic, rather than “worldwide” as they claim.

    • lolc 11 minutes ago
      I guess they'd argue that the people in China don't count, because people in China don't get to choose Google. But yeah, the stats they use from "StatCounter" are clearly not representative of what the world uses.
    • 0x1ch 20 minutes ago
      Google is only blocked in places where it would already be hard for a company with morals to work in, if not outright blocked as well. This probably represents traffic globally, excluding those places.
  • the_arun 11 minutes ago
    If google is serving 90% traffic & others are unable to enter - Doesn't that mean google is doing something right for the customer and others are unable to outcompete it? Isn't this how life works?
    • CGMthrowaway 6 minutes ago
      Google is allowed to be big, be better and win users. But having happy customers is not the full test of monopolization. The real question is, "Could a meaningfully better search engine realistically displace Google today?" If the answer is no, then competition is broken.
    • rafterydj 5 minutes ago
      This is a woefully naive view on the nature of monopolies. You could have made the same argument for Standard Oil.
  • ajdude 46 minutes ago
    Does anyone else use the phrase "I'm going to google XYZ" while referring to actually searching it up on Kagi, DDG, or another search engine?
    • kqr 5 minutes ago
      I used to. Even when I actually used DDG. Now that I use Kagi (and thus am on the second web search service after I stopped using Google) it started to feel silly so I say "search the web" these days.
    • shervinafshar 13 minutes ago
      I've been using Kagi for the past few years, but I try to use a brand-agnostic language talking about web search; e.g. "I'm gonna search [the web] for it"; "Use your favorite search engine to look it up".
    • eli 24 minutes ago
      Ironically this is a bad thing for Google from a legal standpoint. If a term becomes "genericized" then it can lose trademark protection.

      "Aspirin" is a famous example. It used to be a brand name for acetylsalicylic acid medication, but became such a common way to refer to it that in the US any company can now use it.

    • pixl97 21 minutes ago
      Yes, but more in the past than now, simply because almost everybody seems to use google itself.

      For example I'd hear people say "I'll Google that", then use Yahoo when they were still a major search engine.

    • jeremyjh 33 minutes ago
      Yes, it’s like Xerox or Kleenex except it’s actually still a monopoly. I’m a happy Kagi user but I know hardly anyone else is.
    • dijksterhuis 33 minutes ago
      nope, i say “i’m going to search for XYZ” or similar
    • chroma205 35 minutes ago
      > Does anyone else use the phrase "I'm going to google XYZ" while referring to actually searching it up on Kagi, DDG, or another search engine?

      Not me. I only use Google.

      Never used Kagi or DDG. Don’t care enough.

  • nige123 3 minutes ago
    The user data (anonymised) and analytics also need to be shared.
  • xnx 56 minutes ago
    > Because direct licensing isn’t available to us on compatible terms, we - like many others - use third-party API providers for SERP-style results

    Crazy for a company to admit: "Google won't let us whitelabel their core product so we steal it and resell it."

    • eli 19 minutes ago
      Seems like an open question as to whether that violates any laws.

      Another way to look at it is that if you publish a service on the web, you have limited rights to restrict what people do with it.

      Isn't that the logic Google search relies on in the first place? I didn't give permission for Google to crawl and index and deep link to my site (let alone summarize and train LLMs on it). They just did it anyway, because it's on a public website.

    • techjamie 31 minutes ago
      What's the alternative? Building a competing search index as a relative nobody on the web is very difficult from the outset, and is made more difficult by sites now taking extra measures to stop bots in general.

      Google's crawler is given special privileges in this regard and can bypass basically all bot checks. Anyone else has to just wade through the mud and accept they can't index much of the web.

    • direwolf20 55 minutes ago
      Pretty standard business practice though. There's no ethics in making money.
    • shadowgovt 38 minutes ago
      But in this current climate, they can admit it and then dare Google to tell them to stop... After Google has just had an antitrust ruling against it for dominating the search market.

      Google doesn't really have a leg to stand on and they know it.

    • Ar-Curunir 42 minutes ago
      Strange to pick on Kagi when there's much bigger companies on that list.
  • ares623 0 minutes ago
    Kagi should start building an index of sites that are trying to escape the current slop internet. I know they have the Small Web thing. But I’d like to see an index of a “neo internet” that blocks Google et al.
  • user3939382 1 minute ago
    For anyone not acquainted, Kagi is excellent and the people who work there strike me as nice, competent people. I’m a harsh critic usually. Highly recommended.
  • direwolf20 56 minutes ago
    I hope they cache search results to further reduce the number of calls to Google.

    And Marginalia Search was not mentioned? Marginalia Search says they are licensing their index to Kagi. Perhaps it's counted under "Our own small-web index" which is highly misleading if true.

    • packetlost 53 minutes ago
      The index is not necessarily the code, but the dataset. IMO it would be better to be more open about the technical stack, but I don't think this feels dishonest to me.
  • whs 1 hour ago
    >Google: Google does not offer a public search API. The only available path is an ad-syndication bundle with no changes to result presentation - the model Startpage uses. Ad syndication is a non-starter for Kagi’s ad-free subscription model.[^1]

    >Because direct licensing isn’t available to us on compatible terms, we - like many others - use third-party API providers for SERP-style results (SERP meaning search engine results page). These providers serve major enterprises (according to their websites) including Nvidia, Adobe, Samsung, Stanford, DeepMind, Uber, and the United Nations.

    The customer list matches what is listed on SerpAPI's page (interestingly, DeepMind is on Kagi's list while they're a Google company...). I suppose Kagi needs to pen this because if SerpAPI shuts down they may lose access to Google, but they may already utilize multiple providers. In the past, Kagi employees have said that they have access to a Google API, but it seems that was not the case?

    As a customer, the major implication of this is that even if Kagi's privacy policy says they try to not log your queries, it is sent to Google and still subject to Google's consumer privacy policy. Even if it is anonymized, your queries can still end up contributing to Google Trends.

  • jeffbee 6 minutes ago
    "We will simply access the index" has always struck me as wild hand-waving that would instantly crumble at first contact with technical reality. "At marginal cost" is doing a huge amount of work in this article.
  • yomismoaqui 25 minutes ago
    One thing I have discovered after using AI chats that include a websearch tool is that I don't want to delve into different blogs, Medium posts, Stack Overflow threads with passive-aggressive mod comments, dismissing cookie banners... Sorry, I just want the info I'm looking for, I don't care for your personal expression or need to monetize your content.

    There are other times (usually not work related) when I want to explore the web and discover some nice little blog or special corner of the net. This is what my RSS feed reader is for.

    • kqr 3 minutes ago
      With Kagi you can opt in to an LLM summary of the search result by appending a question mark to the query. It's a neat mechanism when it works!
  • hsuduebc2 31 minutes ago
    It is even worse that Google search has become shit in recent years. So they gatekeep the relevant information for themselves and don't even use it to improve search quality. As always, if you have no competition, your innovation goes only towards cost reduction, not product improvement.
  • OGEnthusiast 55 minutes ago
    Sounds like we need a nationalized search engine company then?
    • browningstreet 42 minutes ago
      I wouldn't trust a nationalized search engine company.

      That said, there are projects like Common Crawl and in Europe, Ecosia + Qwant.

      I personally would like to see a search engine PaaS and a music streaming library PaaS that would let others hook up and pay direct usage fees.

      • NitpickLawyer 15 minutes ago
        > and in Europe, Ecosia

        I tried. It's just not good enough. Quick example: yesterday I set up a workstation with Ubuntu, wanting to try out wayland. One of the things I wanted was to run an app (w/ gui) from another (unprivileged) user under my own user. Ecosia gave me bad old stuff. Tried for a few minutes, nothing useful. Switched to google, one of the first results was about waypipe. Searched waypipe on ecosia. 1 and a half pages of old content. Glaringly, not one of those results was the ubuntu.manpages entry on waypipe. shrug

      • shadowgovt 38 minutes ago
        An interoperable search index access standard might work. We've done something similar for peering and the backbone of the IP-layer interconnects themselves.