Perhaps obviously this is the same technique that enables ACR on TVs.
It occurs to me that Shazam has a much better reputation online because the intent and consent of the user are honored.
It makes me wonder if there couldn’t be an implementation on TVs that is similar and actually is a net positive for consumers. Basically would customers actually like TV ACR if the data wasn’t just going to sell more ads?
So the value-add would be the consumer would get to find out the name of the show or movie that’s playing, the same info that also pops up if they hit the pause button?
Recognizing a recording isn't hard, because within the same recording the chords follow each other with precisely repeatable timing. That capability has been around for well over a decade. Recognizing a different recording of the same song, say a cover version, is much more work.
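To make the "repeatable timing" point concrete, here is a rough Python sketch of the peak-pairing idea as I understand it from the Shazam paper linked elsewhere in this thread. It is a toy under stated assumptions, not Shazam's actual code: the function names (`fingerprint`, `build_index`, `best_match`), the `fan_out` parameter, and the hand-written `(time, frequency)` peak lists are all made up for illustration. A real system would first extract those peaks from a spectrogram.

```python
from collections import Counter, defaultdict


def fingerprint(peaks, fan_out=3):
    """Pair each peak with the next few peaks in time.

    Each hash is (f1, f2, dt): two peak frequencies and the time gap
    between them. The anchor time t1 is kept alongside the hash.
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            hashes.append(((f1, f2, t2 - t1), t1))
    return hashes


def build_index(tracks):
    """Map every hash to the (track name, anchor time) pairs that produced it."""
    index = defaultdict(list)
    for name, peaks in tracks.items():
        for h, t in fingerprint(peaks):
            index[h].append((name, t))
    return index


def best_match(index, sample_peaks):
    """Vote on (track, time offset); a true match piles up on one offset."""
    votes = Counter()
    for h, t_sample in fingerprint(sample_peaks):
        for name, t_track in index.get(h, []):
            votes[(name, t_track - t_sample)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Toy data (hypothetical): (time step, frequency in Hz) peaks per track.
tracks = {
    "song_a": [(0, 440), (1, 660), (2, 550), (3, 880), (4, 330), (5, 770)],
    "song_b": [(0, 300), (1, 500), (2, 700), (3, 400), (4, 600), (5, 800)],
}
index = build_index(tracks)

# An excerpt of song_a starting at time step 2, re-timed to start at 0.
excerpt = [(0, 550), (1, 880), (2, 330), (3, 770)]
excerpt_result = best_match(index, excerpt)

# A "cover" of song_a: same notes, played at half speed.
cover = [(0, 440), (2, 660), (4, 550), (6, 880), (8, 330), (10, 770)]
cover_result = best_match(index, cover)

print(excerpt_result)
print(cover_result)
```

The excerpt matches because every time gap inside it is identical to the original, so all the votes land on the same offset. The half-speed "cover" has the same notes but different gaps, so none of its hashes exist in the index, which is roughly why covers need a different (and more expensive) approach.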
Audible Magic claims to be able to recognize multiple performances of the same songs, and even parodies.[1] Using, of course, "AI technology" and much more compute.
Forgive my ignorance, but what does SCP mean in this context? (my normal go-to of 'secure copy' doesn't fit).
Thanks for the other links, the question in this title is one I've day-dreamily thought about on occasion, but never dug into. Will have a read of all three.
Shows could synchronize additional content that'd be visible when Shazam mode is enabled.
- OG Shazam paper https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf (he has a talk on YouTube btw, look it up if you really care)
- https://news.ycombinator.com/item?id=18069968 Shazam employee blogpost
- https://news.ycombinator.com/item?id=38538996 Shazam cofounder endorsed explainer
- go algo repro https://news.ycombinator.com/item?id=41127726
as with all ML things... the code is much less % of the value than the data...
[1] https://www.audiblemagic.com/2024/02/07/identifying-cover-so...
From CameronMacLeod (2022), a much more complete analysis (587 points, 2023, 155 comments) https://news.ycombinator.com/item?id=38531428
Or Slate (2009) (50 points, 16 comments) https://news.ycombinator.com/item?id=893353
[1] https://scp-wiki.wikidot.com/glossary-of-terms
I think it'll take me longer to understand WTF SCP is than it will to understand how Shazam works.
https://hn.algolia.com/?q=royvanrijn