Show HN: Free tool to find RSS feeds, even if not linked on the page

I developed a small tool to find RSS feeds for websites. You can try it out here: https://lighthouseapp.io/tools/feed-finder

In >90% of cases the standard way of checking meta tags is enough to find the feeds. But my goal for this tool is that it finds feeds regardless of whether they're linked anywhere, so that if this feed finder doesn't find a feed, no feed exists.

It's a big goal and admittedly the tool isn't there yet, but it does a few things that are a step in that direction.

* Checks meta tags of parent pages (sometimes the article itself doesn't have the meta tag, but the main blog page does)

* Checks common suffixes like /rss, /index.xml and many others (sometimes the feed exists but isn't linked)

* Checks the sitemap

* Checks all links on the page

* Checks 3rd party feeds (OpenRSS for now, when I find more such repositories I'll add them too)
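
To make the first two checks concrete, here is a minimal sketch of how the URLs to try could be generated. It is not the tool's actual code: the function names are hypothetical and the suffix list is only a small illustrative sample.

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative sample only (an assumption, not the tool's real suffix list).
COMMON_SUFFIXES = ["rss", "rss.xml", "feed", "index.xml", "atom.xml"]


def parent_pages(url: str) -> list[str]:
    """Return the page itself followed by each parent path up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    pages = []
    # Drop one trailing path segment at a time: /blog/post -> /blog -> /
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        pages.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return pages


def candidate_feed_urls(url: str) -> list[str]:
    """Append every common suffix to the page and to each of its parent pages."""
    candidates = []
    for page in parent_pages(url):
        base = page.rstrip("/")
        for suffix in COMMON_SUFFIXES:
            candidates.append(f"{base}/{suffix}")
    return candidates
```

Each candidate would then be fetched and kept only if the response actually parses as a feed. With even a short suffix list the request count grows quickly, since every parent page multiplies it.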

There are a couple of additional ideas I have, like checking search engines and crawling the entire domain (highly inefficient, but possible).

Would love it if you could try it, and even more if you post sites where it doesn't work.

152 points | by domysee 126 days ago

28 comments

  • rollcat 126 days ago
    Quick rant about websites that go to all the trouble of having an RSS feed but not linking to it in the <head>... I don't want to go hunting for the cute orange button, I want to copy and paste "https://example.com" into my feed reader and let the computer handle the work.

    If you maintain any website with a news feed, go right now and check that you have this in your <head>:

        <link rel="alternate" type="application/rss+xml" href="/rss.xml" title="News feed" />
                                                               ^^^^^^^^ change! ^^^^^^^^^
    
    (Also note whether and where you need to use application/rss+xml, application/atom+xml, or application/feed+json; JSON Feed 1.0 used application/json.)
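
    For anyone who wants to check this programmatically, here is a rough sketch of how a feed reader might discover such a tag, using only Python's standard library. The class and function names are made up for illustration, and real readers are likely more lenient about the markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch. "application/json" is included
# for JSON Feed 1.0, although it can also match non-feed JSON alternates.
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/feed+json",
    "application/json",
}


class FeedLinkParser(HTMLParser):
    """Collect href values of <link rel="alternate"> tags with a feed MIME type."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.feeds: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower().strip()
        if "alternate" in rel and mime in FEED_TYPES and attr.get("href"):
            # Resolve relative hrefs such as "/rss.xml" against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def find_feed_links(html: str, base_url: str) -> list[str]:
    """Return absolute feed URLs advertised by the page's <link> tags."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

    Calling find_feed_links(page_html, "https://example.com/") on a page that contains the tag above should return the absolute URL of /rss.xml.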
    • awanderingmind 124 days ago
      Thanks for this comment, it encouraged me to go and add this to the <head> of my blog.
  • jcul 126 days ago
    This is great, it's hard to believe sites can have RSS feeds but make it so difficult to find.

    I suspect some sites are just running some framework that enables it and don't even realize they have one.

    I have used this site in the past to find feeds: https://www.rsssearchhub.com/

    In the past I was looking for a feed for https://ra.co, but could not find it, though I had seen old posts referencing an RSS feed.

    I ended up emailing them and, to my delight, they let me know they still have an unsupported RSS feed here:

    https://ra.co/xml/rss_news.xml

    Just for feedback, this tool doesn't find the feed, though it doesn't look like a standard URL to me.

    • domysee 125 days ago
      Definitely not a standard path, but good to know for testing, thank you!
  • LorenDB 126 days ago
    If I can't find an RSS link directly, I generally copy the root URL into archive.org and search for all URLs matching "xml", which includes content type, not just URL names.
  • superkuh 126 days ago
    This is 100% a feature that should be in the browser, not a third-party tool. I still use a very old version of Firefox for this. Too bad Mozilla decided auto-discovery wasn't necessary in 2016 and removed it. Then two years later they claimed no one was aware of RSS/Atom feeds or used them (I wonder why?!?). All so they could try to replace it with their profit/adware that is Pocket, and we all know how that went.

    >Mozilla is working on alternatives such as Pocket or Reader Mode, and on improving WebExtensions which could provide features related to RSS/Atom feeds without the toll on maintenance. (ref: https://www.ghacks.net/2018/07/25/mozilla-plans-to-remove-rs...)

  • AiAi 126 days ago
    Interesting. Recently I was trying to subscribe to some blogs, and they didn’t have an RSS button on their pages, so I had to inspect the page to find the feed URL. Not sure why you'd keep an RSS feed but hide it from visitors. It could be that they expected the feed reader to be able to identify it, but since I was using Thunderbird it did not.
    • domysee 126 days ago
      Most feed readers find at least feeds that are linked with a link tag in the header, if it's <link rel="alternate" type="application/rss+xml" ... />

      They're probably expecting people to just paste the website URL into the feed reader and have the reader identify the feed. But it would be nice to see the RSS URL linked somewhere.

    • Klonoar 126 days ago
      Some of these cases are sites that are built on a CMS that exposes RSS by default, but people don’t consider showing a link/button/whatever in their design.
  • account42 126 days ago
    > Application error: a client-side exception has occurred (see the browser console for more information).

    Ok then.

    Also, this would make more sense as a browser extension. Especially if it brought back the RSS icon in the address bar to indicate when a feed is available (although maybe you don't want it to do all of the checks until prompted).

    • domysee 126 days ago
      Which URL did you try?

      Yeah, the checks are quite expansive; depending on the URL it might make more than a hundred requests.

      A browser extension would make sense. Guess I have another project :D

      • djbusby 126 days ago
        100!? I have a tool to find feeds from sites - checks like 4 things.
        • mdp2021 126 days ago
          Well, it must miss many then. My list alone already has these (and omits a few variations, e.g. with 'atom'):

            .../rss , .../rss.xml , .../.rss , .../rss_full.xml , .../feed , .../rss-feed , .../feed/all/ , .../MySection.xml , .../MySection.atom , feedserver.example.com/section/index
  • sodality2 126 days ago
    Great idea. I tried it with my personal site (https://matthew.science) and it didn't find any feeds. Admittedly the site doesn't have any meta tags, but the feed is linked in the footer at https://matthew.science/atom.xml, which was the default feed URL for my SSG. I'd recommend adding this to the common suffix list.
    • domysee 125 days ago
      This I must check, it looks standard enough that the tool should've found it. Thanks for the feedback!
  • Cieric 126 days ago
    Tried the hacker news front page (https://news.ycombinator.com/news) and when clicking on OpenRSS I get this error:

    TypeError: URL constructor: is not a valid URL. [NextJS] (5603-cb6f1c5a9761f9d0.js:14:5466)

    Browser is Firefox 130.0 on Windows.

    Would be really nice to see this working really well since I search for RSS feeds a lot for a bunch of different things. Whether the RSS feed is good is always another question.

    • domysee 125 days ago
      I don't get the error on my machine, but there probably is a timing issue somewhere. Thanks for letting me know!
  • DamonHD 126 days ago
    FYI it's only finding one (Atom) feed at earth.org.uk, even though there are several feeds, Atom and RSS.

    Your method described above should have found at least two feeds I think.

    • domysee 126 days ago
      Interesting, I'll check that, thanks for letting me know!
  • freetonik 125 days ago
    I've been using an NPM package called rss-url-finder [1] in my blog search engine project to find the RSS link. It works relatively well, but still fails sometimes. For now I end up manually searching the HTML source of the page for a .xml or similar link.

    [1] https://www.npmjs.com/package/rss-url-finder

  • Circlecrypto2 126 days ago
    I am very grateful for this actually. I still read RSS and when I find a good news site I tend to spend 15 minutes or more looking for their feed.
  • jayemar 126 days ago
    Are you opposed to this being used programmatically? I've been working on a site [0] that replays feeds, but the initial step is to first find the feed given a website, and it's not always able to find it. I'd be interested in using your service to try to find the feed when I'm unable to do so.

    [0] https://refeed.to

    • pogue 126 days ago
      Can you explain what the purpose of replaying a feed is?
      • jayemar 126 days ago
        My initial use case was for reading content from blogs that had been published before I'd subscribed to their feed. I could visit their site and read their previous posts, but I much prefer the slow drip of an RSS feed. So I created refeed.to to be able to add 1 post per day from the blog to my feed starting from their first post.

        Since creating it I also use it to inject a few extra cartoons into my feed (xkcd every day!) and have also had fun with tech flashbacks from trustedreviews.com. So it's just a way to add a little variation to my feed.

    • domysee 125 days ago
      Sure, email me at dominik at lighthouseapp.io
  • snthd 126 days ago
    • domysee 126 days ago
      This is great, thank you!
  • nanna 126 days ago
    Great work! I've stopped using Twitter but I managed to taper from it by following things using RSS feeds drawn from Nitter. Don't know if that still works but could be an idea?
    • domysee 126 days ago
      Twitter feeds would definitely be great to have, will check Nitter to see how I can get them. Thanks for the suggestion!
  • validatori 125 days ago
    Also add .feed to the common suffixes, for example: https://wiadomosci.onet.pl/.feed
  • chuanliang 125 days ago
    Great tool.

    I always use RSSHub Radar, and your tool supports more websites than RSSHub Radar does.

    Detection of /feed could be added; most WordPress-powered sites have this suffix.

  • cranberryturkey 126 days ago
    Cool. I wrote a script to search Google and find sites with RSS feeds so I can create a collection on a particular topic.
    • domysee 126 days ago
      That's awesome. Is there any specific search text you used to find the feeds? I know Bing has a command to do that but don't know about Google.
      • djbusby 126 days ago
        Don't forget DDG and Kagi - they might have some tools too
  • richardbui95 126 days ago
    I tried it on my website, ebookany.com, but it didn't find anything. So sad :(( But your idea is quite interesting.
    • domysee 126 days ago
      That's good to know, thank you, it helps me debug
  • stuaxo 124 days ago
    I bet this finds some feeds that sites don't know or have forgotten they even have.
  • oidar 126 days ago
    The tool misses reddit rss feeds.
    • domysee 126 days ago
      Thanks for the hint, will fix that!
  • AIPodNav-Team 123 days ago
    Can't find the Lex Fridman podcast's feed: https://lexfridman.com/
  • asddubs 126 days ago
    my suggestion is a way to have users of the extension suggest a feed URL if it doesn't find one
  • GavCo 126 days ago
    Cool. I'm a big fan of RSS feeds.

    Wondering if it's necessary to continue with the other checks if you find a feed in the meta tags?

    • domysee 126 days ago
      Probably not, but I'm trying to find all feeds.

      I guess the best option is to show results as soon as they are found, without waiting for everything to complete.
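
      A rough sketch of that idea, assuming each check can run independently. The three check functions are hypothetical stand-ins that return canned results instead of making requests, and Python threads are used purely for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


# Hypothetical stand-ins for the real checks; each returns the feeds it found.
def check_meta_tags(url):
    return [f"{url}/rss.xml"]


def check_common_suffixes(url):
    time.sleep(0.05)  # simulate a slower check that makes many requests
    return [f"{url}/feed"]


def check_sitemap(url):
    time.sleep(0.02)
    return []


def find_feeds(url, checks):
    """Yield (check name, feed URL) pairs as soon as each check finishes."""
    seen = set()
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = {pool.submit(check, url): check.__name__ for check in checks}
        for future in as_completed(futures):
            for feed in future.result():
                if feed not in seen:  # several checks may find the same feed
                    seen.add(feed)
                    yield futures[future], feed


if __name__ == "__main__":
    checks = [check_meta_tags, check_common_suffixes, check_sitemap]
    for name, feed in find_feeds("https://example.com", checks):
        print(f"{name}: {feed}")
```

      The caller can render each result the moment it is yielded, so a fast meta tag hit shows up immediately while the slower checks keep running.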

  • cxr 126 days ago
    [deleted]
    • domysee 126 days ago
      That's super interesting, will definitely try it, thank you!
  • dotBen 126 days ago
    RIP Google Reader