Show HN: Viral Potential Predictor

(hn-ph.vercel.app)

31 points | by salebanolow 3 hours ago

9 comments

  • minimaxir 59 minutes ago
    As someone who has spent an embarrassing amount of time researching Hacker News title trends over the years, I was excited to look at the methodology (https://hn-ph.vercel.app/analysis), but after reading it, I am calling shenanigans.

    That's not a methodology paper and it doesn't explain how the model being advertised works in the spirit of open machine learning research; given that the startup is an AI startup, I assume that the actual model is more sophisticated. As Section 8 notes: "This analysis is descriptive and intended to summarize empirical patterns."

    It's an exploratory data analysis that not only fails to explain how the model is constructed, but also makes a number of assumptions implying that the people who wrote it lack proper context on how Hacker News works:

    1. The extreme right skew of the data should have raised a very large number of flags in the statistical methodology and calculations, but the analysis mostly ignores it. The mean values are effectively useless, the p-values even more useless. It doesn't point out that the negative performing terms are likely spam.

    2. It does not question why there are so few submissions with a title >80 characters (answer: 80 characters is the max for a HN submission).

    3. The analysis separates day of the week and hour: you can't do that. They're intrinsically linked and weekend behavior with respect to activity is far different than on weekdays.

    4. "Title length has a weak relationship with score (Pearson r = -0.017, Spearman r = 0.048, n = 100k)". No statistician would call that a weak correlation; those values are effectively no correlation.
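
    A rough sketch of points 1, 3 and 4, using synthetic heavy-tailed numbers rather than the actual dataset. The Pareto shape, the function names, and the sample rows are all assumptions for illustration; only the two correlation values come from the analysis quoted above.

```python
import random
import statistics
from collections import defaultdict


def variance_explained(r: float) -> float:
    """Square of a correlation coefficient: the share of variance it accounts for."""
    return r * r


def synthetic_scores(n: int, seed: int = 0) -> list[float]:
    """Heavy-tailed stand-in for HN scores (hypothetical data, not the real dataset)."""
    rng = random.Random(seed)
    return [rng.paretovariate(1.1) for _ in range(n)]


def median_by_day_and_hour(rows):
    """Median score per (weekday, hour) cell, keeping the two dimensions joined."""
    cells = defaultdict(list)
    for weekday, hour, score in rows:
        cells[(weekday, hour)].append(score)
    return {cell: statistics.median(values) for cell, values in cells.items()}


# Point 1: on right-skewed data a few huge values drag the mean far above the median.
scores = synthetic_scores(10_000)
mean_score = statistics.fmean(scores)
median_score = statistics.median(scores)
print(f"mean={mean_score:.2f} median={median_score:.2f}")

# Point 3: summarize per (weekday, hour) pair instead of two separate marginals.
sample_rows = [(5, 14, 3.0), (5, 14, 1.0), (1, 14, 40.0)]  # (weekday, hour, score)
print(median_by_day_and_hour(sample_rows))

# Point 4: squaring the reported coefficients shows how little they account for.
print(f"Pearson r=-0.017 -> {variance_explained(-0.017):.4%} of variance")
print(f"Spearman r=0.048 -> {variance_explained(0.048):.4%} of rank variance")
```

    The median and the joint grouping are just one reasonable choice here; the point is that the summary statistic and the grouping should respect the shape of the data.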

    There is also no person tied to this paper, just the "Memvid Research Team", which raises further questions.

    • leohonexus 49 minutes ago
      I think it would have been much more appreciated as a dataset paper (and titled accordingly), rather than a "viral potential predictor".
  • delichon 2 hours ago
    Here is the result for this username, this title and this description:

    https://hn-ph.vercel.app/results/ZT06GF

    It got a 62, a C+, predicting that this won't be very viral. So you either didn't test this submission on your own product, or you did, but didn't feel that the low score was a handicap? You don't seem to be dogfooding. If this post does well it would be evidence against its own accuracy. If it fizzles out, congratulations on being correct.

    • baobun 1 hour ago
      Uncharitable and presumptuous about the goals. I prefer submissions to not be hyper-optimized for virality.
  • tverbeure 1 hour ago
    Current nr 3 in the leaderboard: "Show HN: I built a Rust compiler in Rust with Rust"

    Could use some more Rust to boost it to nr 1.

    • Frotag 38 minutes ago
      Show HN: I built a Rust compiler in Python with JavaScript using Java on Android
    • baobun 1 hour ago
      I'm calling it: Some AI controversy in Rust core will be in the top 5 of 2026.
  • andr3wV 1 hour ago
    The analysis they ran in their research paper found most surface features don’t meaningfully separate viral from non‑viral outcomes. So the tool isn't actually predicting if your launch title will go viral, it's more like checking for heuristics and descriptive patterns.

    Cool idea though! And they're on the front page lol

  • higginsniggins 54 minutes ago
    According to your research paper you should have made this post a "Tell HN:" rather than a "Show HN:", lol
  • amitav1 1 hour ago
    This tool: "Avoid keyword stuffing; make the title read naturally."

    Also this tool: "Show HN (AI): I built GPT 6 in Rust Using Claude Gemini Grok OpenAI NVIDIA Google" - #1

    (No hate to the creators obviously. Just really funny.)

  • codybontecou 1 hour ago
    Well, he made it to the front page so there’s that.
  • simonw 1 hour ago
    (Replaced my original comment here which was a little unkind.)

    Question for OP, who created Memvid (the .mv2 file format that's used to distribute this data). Are you still taking text, chunking it and then storing those chunks as QR codes in a video file? That seems like an inherently inefficient storage mechanism to me compared with something like SQLite or Parquet - do you have concrete numbers or a demo that shows that your file format really is more effective for storing data for "AI agents" than those existing solutions?

  • mitexleo 2 hours ago
    Let's see if this goes viral
    • asciii 1 hour ago
      o7 see you in the 1% someday