The most technical rundown of how Shazam (and presumably part of Soundhound) works. I prefer to call it, magic.
“This data is then sent to the server side (Shazam). Let’s take the same assumption than before (300 time-frequency points in the filtered spectrogram of the 10-second record and the size of the target zone of 5 points), it means there are approximately 1500 data sent to Shazam.
Each address from the record is used to search in the fingerprint database for the associated couples [“absolute time of the anchor in the song”;”Id of the song”]. In terms of time complexity, assuming that the fingerprint database is in-memory, the cost is the search is proportional to the number of address sent to Shazam (1500 in our case). This search returns a big amount of couples, let’s say for the rest of the article it returns M couples.”