Show HN: I built a full-text search for your browsing history

(chromewebstore.google.com)

41 points | by peterpelles 380 days ago

12 comments

KomoD 379 days ago
Gives me really bad vibes, sending all your browsing history, page info, etc. to some server, including screenshots of your tabs.
You also say it doesn't save in incognito but I can't find anything in the source code to support that claim.
I see a bunch of other red flags too. There's no obvious monetization. The privacy policy says it was updated in 2021, but a little further down it says updated in 2019 (neither makes sense, as the domain was registered in 2023). The privacy policy sometimes says Nision Research LLC and sometimes Nision Research Kft. The Chrome Web Store page has an email for "gethaystack.com" but the privacy policy says [email protected] (and that gethaystack site has a policy saying "info@localhost"). Both of the reviews on the Chrome Web Store are by people clearly affiliated with Browspilot. The FAQ says you can't delete things from your history. And the privacy policy doesn't say where your data gets sent (I see mentions of Firestore and Firebase in the code, but they do not appear in the privacy policy).
Also this claim on your home page "Your data will only be read and used by you." doesn't align with what your privacy policy says.
- pedalpete 379 days ago
  This was my initial concern. Though this is something I want, and I'm surprised it isn't built into modern browsers. I'm less interested in having a cache of images, but a useful search of my history, which I think could be stored locally, would be helpful. I wouldn't think the search needs to be particularly clever. Even a straight string search of cached text from the webpages I've viewed would be valuable. I don't need all the images, video, JavaScript, etc. cached.
  - tarasglek 378 days ago
    I proposed this at Firefox when I was there, but it was deemed too niche a feature... I'm still sad that I can't have it.
- yjftsjthsd-h 378 days ago
  > You also say it doesn't save in incognito but I can't find anything in the source code to support that claim.
  Doesn't Chrome default to disabling extensions in incognito? So the extension wouldn't actually have to do anything itself for that to be true.
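A minimal sketch of the "straight string search of cached text" idea raised in this thread, kept entirely local. The `CachedPage` type, the `search_history` function, and the sample pages are hypothetical names made up for illustration; nothing here reflects how Browspilot or any browser actually stores history.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    url: str
    title: str
    text: str  # visible page text only; no images, video, or scripts


def search_history(pages, query):
    """Return pages whose title or text contains the query, ignoring case."""
    needle = query.casefold()
    return [
        page
        for page in pages
        if needle in page.title.casefold() or needle in page.text.casefold()
    ]


# Hypothetical sample data standing in for a locally stored text cache.
pages = [
    CachedPage("https://example.com/sqlite", "SQLite notes", "Full-text search with FTS5."),
    CachedPage("https://example.com/bread", "Sourdough", "Feed the starter twice a day."),
]

for hit in search_history(pages, "starter"):
    print(hit.url)
```

A linear scan like this is probably fine for a few thousand pages; beyond that an index would likely be worth the extra setup.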
NVI 379 days ago
Why do I need to log in with Google to use it? I'm also experiencing a bug where after logging in, I see the same login popup over and over again.
- mef 378 days ago
  I too had this issue, and it persisted even after I uninstalled the extension. I had to restart the browser entirely.
peterpelles 380 days ago
I’d love to hear your feedback – please share your thoughts and suggestions in the comments!
- dotcoma 379 days ago
  Was browsepilot.com not available?
  (brows looks weird to me)
  - KomoD 379 days ago
    They have both browse and brows.
    - dotcoma 378 days ago
      So... do you know why they chose BrowsPilot and not BrowsePilot?
      - KomoD 377 days ago
        No clue, I too prefer Browsepilot over Browspilot
bradrn 379 days ago
This reminds me… some time ago I made my own Firefox extension to do full-text search of all my webpages. It’s in three parts: a server running in the background to interface with an SQLite database, a minimal extension to send text to that server, and a little GUI to query the database.
Unfortunately, all this makes it an utter pain to set up. It’s also somewhat specialised to my own very minimal needs. When I’ve mentioned it in the past, people have suggested open-sourcing it, but for these reasons I’ve resisted it. This post now makes me wonder if I should look into ways to improve it…
- KomoD 379 days ago
  > When I’ve mentioned it in the past, people have suggested open-sourcing it, but for these reasons I’ve resisted it.
  Could just open-source it "as is", there's probably some people that would be interested in just messing around with it or using it as a base
  - bradrn 379 days ago
    Yeah, you’re probably right. They’d have to be familiar with the rather eclectic mix of languages I used (Haskell, C++ and JavaScript), but then again that’s no reason not to publish it. To be honest, I’m not quite sure why I haven’t just put it online… sheer laziness, I guess.
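For readers curious what the database part of such a setup might look like, here is a rough sketch using SQLite's FTS5 full-text extension from Python. It is not bradrn's code (that is described as Haskell, C++ and JavaScript and is unpublished); the table layout and the `open_index`, `add_page`, and `search` helpers are assumptions chosen for illustration, and it assumes the local SQLite build ships with FTS5 enabled.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a full-text index; requires SQLite built with FTS5."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages "
        "USING fts5(url UNINDEXED, title, body)"
    )
    return conn


def add_page(conn, url, title, body):
    """Store one page's text; an extension would send this to the server."""
    conn.execute(
        "INSERT INTO pages (url, title, body) VALUES (?, ?, ?)",
        (url, title, body),
    )
    conn.commit()


def search(conn, query, limit=10):
    """Return (url, title) pairs matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


conn = open_index()
add_page(conn, "https://example.com/a", "Search setup", "A server written in Haskell talks to SQLite.")
add_page(conn, "https://example.com/b", "Gardening", "Tomatoes need plenty of sun.")
print(search(conn, "haskell"))
```

Note that the query string is passed to FTS5's own query syntax, so raw user input containing quotes or operators would need escaping before being used this way.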
beeboobaa3 379 days ago
Where is my data stored?
Leftium 380 days ago
One of the features I wish Kagi had was the ability to search through my previous search queries. More details here: https://kagifeedback.org/d/4065-query-personal-search-histor...
Maybe Browspilot could fill this gap!
One thing I noticed is my search history is like a zero-effort personal journal. It gave me a detailed glimpse of what I was doing/thinking on a certain day from several years ago.
- peterpelles 380 days ago
  I read your feature request on Kagifeedback. With our tool, Browspilot, you can currently recall pages you have visited based on keyword matches. Because you can also search the body of a page, and sometimes the comments too, it's already quite useful, and I'm confident you will be able to find most of the things you're looking for most of the time. It was almost surprising to us, too, how easily one can find things just by typing in words that appear somewhere on the page, as opposed to clicking a bunch of times and navigating through apps, messages, or emails to find a link again. It helps that you are searching a limited dataset, your own search history, rather than everything, as you do in Google.
  However, I believe that once we introduce the advanced vector search, which we are already testing in our beta version, you should be able to find the page you are looking for in Browspilot with absolute certainty just by typing words related in meaning into the search box, so you won't even need to remember your exact search queries.
  We will also be adding image search capabilities soon.
- peterpelles 380 days ago
  Thanks for this. Very useful!
PostOnce 378 days ago
Browsers stagnate because of their monoculture. This feature should've been part of a browser (and local to the machine) 20 years ago, and here we are adding WebMIDI and WebUSB support when we can't even find shit we looked at 3 days ago.
future10se 378 days ago
I've been looking for something like this, but as a desktop app that runs locally. So far I've only found two:
1. HistoryHound - https://www.stclairsoft.com/HistoryHound/index.html
2. BrowserParrot - https://www.browserparrot.com/ (sadly seems abandoned)
Wish I could find something like these but open-source. Both of them parse your browser history, fetch the pages, and build their own index. Would be a "safer" and more space/cpu-efficient alternative to apps like Windows Recall and Rewind.ai.
- bbkane 378 days ago
  There's https://github.com/go-shiori/shiori?tab=readme-ov-file . It works on bookmarks and uses SQLite to enable full text search. It's also a CLI, so I think you can write a script that parses your history file and loads it into this.
- hamsterbase 378 days ago
  You could try hamsterbase. All functions are offline and data is stored locally.
  If you need to save all the pages you've seen, you can use singlefile, an open source plugin that works directly with hamsterbase.
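The "script that parses your history file" idea from this thread could start out roughly like the sketch below. It assumes a Chromium-style history database: an SQLite file with a `urls` table holding `url`, `title`, and `last_visit_time` (microseconds since 1601-01-01). That layout is an assumption to verify against your own browser, and `read_history` is a hypothetical helper name. Browsers usually keep the file locked while running, so work on a copy.

```python
import sqlite3
from datetime import datetime, timedelta

# Chromium-style timestamps count microseconds from this date (assumption).
CHROME_EPOCH = datetime(1601, 1, 1)


def read_history(db_path, limit=100):
    """Return (url, title, last_visit) tuples, most recently visited first."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT url, title, last_visit_time FROM urls "
            "ORDER BY last_visit_time DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()
    return [
        (url, title, CHROME_EPOCH + timedelta(microseconds=timestamp))
        for url, title, timestamp in rows
    ]
```

Each returned URL could then be handed to whatever indexer or bookmark tool you prefer, for example by looping over `read_history("History-copy.db")`, where the file name is only a placeholder.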
janice1999 379 days ago
How do you plan to make money? The obvious answer would be selling people's data. What is your alternative?
- purple-leafy 379 days ago
  As someone who builds chrome extensions, and is very familiar with monetisation of most extensions…
  Ding ding ding! All your data are belong to us.
  I explicitly never save user data (apart from auth and subscription status) in my extensions, and the only call back “home” is to check whether the authenticated user has an active subscription.
  I’m also going to make my more powerful chrome extensions “source available” so people can see exactly what they do with your data, and on your machine. Not “open source” because I don’t want contributions.
  - yjftsjthsd-h 378 days ago
    > Not “open source” because I don’t want contributions
    Minor nit: You can make something FOSS by publishing the code under a FOSS license. You don't have to accept PRs, you don't have to take bug reports or feature requests, you don't have to foster a community. Open Source can be as simple as "here is a tarball of source code that you can use", full stop. (As an extreme case in multiple senses, sqlite famously is public domain, and also generally doesn't take any contributions - https://www.sqlite.org/copyright.html )
    Of course, you are fully entitled to go Source-available too, and if you want to facilitate audits without actually giving anyone else the right to use your code then that's the way to go. I just want to point out that there are options between "not FOSS" and "community-centric development".
  - beeboobaa3 378 days ago
    > I’m also going to make my more powerful chrome extensions “source available” so people can see exactly what it does with your data, and on your machine. Not “open source” because I don’t want contributions
    You should also encourage/teach users how to check the source of extensions they've installed locally. It's actually pretty easy. If you're not obfuscating, you may not even need to make the source explicitly available.
    https://gist.github.com/paulirish/78d6c1406c901be02c2d
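As a starting point for checking installed extensions locally, the sketch below lists what each extension's manifest declares. It assumes the extensions folder is laid out as `<extension id>/<version>/manifest.json`, which is how Chrome profiles appear to be organized but should be confirmed on your own machine; `summarize_extensions` is a hypothetical helper name. The manifest keys it reads (`name`, `version`, `permissions`, `host_permissions`) are standard manifest fields.

```python
import json
from pathlib import Path


def summarize_extensions(extensions_dir):
    """Collect name, version, and declared permissions from each manifest."""
    summaries = []
    # Assumed layout: <extensions_dir>/<extension id>/<version>/manifest.json
    for manifest_path in sorted(Path(extensions_dir).glob("*/*/manifest.json")):
        # utf-8-sig tolerates an optional byte-order mark at the start.
        manifest = json.loads(manifest_path.read_text(encoding="utf-8-sig"))
        summaries.append(
            {
                "id": manifest_path.parent.parent.name,
                "name": manifest.get("name", ""),
                "version": manifest.get("version", ""),
                "permissions": manifest.get("permissions", []),
                "host_permissions": manifest.get("host_permissions", []),
            }
        )
    return summaries
```

Broad host permissions or a `history` permission are not proof of anything by themselves, but they do tell you which extensions deserve a closer read of their scripts.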
iansinnott 378 days ago
[flagged]
- bcjordan 378 days ago
  Does this have semantic search of some form? May be possible to implement all client-side with local browser models soon
  - iansinnott 378 days ago
    It does not. I did look into it though [1] and at the time didn't find a good client-side vector search lib. I wanted to avoid in-memory vector search since the size of the data can be significant depending on browsing habits. It is definitely possible though. I got a proof of concept working with victor [2] and client-side embeddings but it wasn't good enough IMO to ship.
    [1]: https://github.com/iansinnott/full-text-tabs-forever/issues/...
    [2]: https://github.com/not-pizza/victor
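To make the in-memory concern above concrete, here is a toy brute-force vector search. It is not how full-text-tabs-forever or victor work; the three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` and `nearest` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec, index, k=3):
    """Score every stored vector against the query and return the top k URLs."""
    scored = [(cosine_similarity(query_vec, vec), url) for url, vec in index.items()]
    scored.sort(reverse=True)
    return [url for _, url in scored[:k]]


# Toy vectors; a real index would hold one embedding per page or per chunk.
index = {
    "https://example.com/a": [0.9, 0.1, 0.0],
    "https://example.com/b": [0.0, 1.0, 0.2],
    "https://example.com/c": [0.5, 0.5, 0.0],
}

print(nearest([1.0, 0.0, 0.0], index, k=2))
```

Every query touches every stored vector and the whole index sits in memory, which is the cost that grows with browsing history.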
alexliu518 378 days ago
Browspilot is a neat tool built by Peter and his team to help you find anything you've seen online with just a clue or by scrolling through your past activity. It's super handy for pulling up frequently used pages or digging up old stuff without keeping a bunch of tabs open.
Whether you're a student or a busy professional, just type in a bit of what you remember, and it’s there. Plus, exciting features are on the way, like searching across different apps and finding things based on meaning with advanced tech.
Overall, Browspilot makes finding online content a breeze!