The Million Song Dataset was released in collaboration with The Echo Nest and
uses Echo Nest identifiers to refer to each track. The metadata that comes with
the dataset includes the names of tracks and artists, but the Echo Nest shut
down their API in June 2016, leaving no service available that understands the IDs.
A valuable service provided by the Echo Nest API was Rosetta Stone, a
mapping between Echo Nest IDs and IDs from other music services.
We performed a lookup of all Echo Nest Song IDs present in the Million
Song Dataset, obtaining mappings to IDs of other services where
available.
Method
We queried the /song/profile endpoint for each Song ID. The queries included
the following Rosetta Stone buckets:
millionsongdataset_echonest.tar.bz2 (461M): The result of looking up /song/profile in the Echo Nest API for all Song IDs in the Million Song Dataset.
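The archive can be read without unpacking it to disk. The following is a minimal sketch, assuming only that the archive is a bzip2-compressed tar file whose JSON members have names ending in .json; the function name iter_song_records is hypothetical, and nothing is assumed about the fields inside each JSON document.

```python
import json
import tarfile


def iter_song_records(archive_path):
    """Yield (member_name, parsed_json) for each .json file in the archive."""
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        for member in archive:
            # Skip directories and anything that is not a JSON result file.
            if not (member.isfile() and member.name.endswith(".json")):
                continue
            with archive.extractfile(member) as handle:
                yield member.name, json.load(handle)
```

Streaming the members this way avoids creating one small file per Song ID on disk, at the cost of reading the archive sequentially on every pass.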
File contents and accuracy
Note that the track list in these files does not include the Million Song
Dataset track ID. Use the MSD SQLite database file to map Song IDs to Track IDs.
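The Song ID to Track ID lookup can be sketched as follows. This is an illustration under assumptions, not a documented interface: it assumes the MSD SQLite file contains a table named songs with song_id and track_id columns, and the function name track_ids_for_song is hypothetical. Check the schema of your copy of the database before relying on it.

```python
import sqlite3


def track_ids_for_song(conn, song_id):
    """Return every Track ID recorded for the given Echo Nest Song ID."""
    # Assumption: a table `songs` with `song_id` and `track_id` columns.
    rows = conn.execute(
        "SELECT track_id FROM songs WHERE song_id = ? ORDER BY track_id",
        (song_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

A connection would be opened with sqlite3.connect() on the path to the database file. The function returns a list rather than a single value so that a Song ID with no match, or with more than one matching track, is handled without special cases.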
The archive contains one JSON file per Song ID, holding the result of looking
up that ID. The files are grouped into directories named after the 3rd and 4th
characters of the Song ID, i.e., XX/SOXXnnnnnnnnnn.json.
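The XX/SOXXnnnnnnnnnn.json pattern can be turned into a small path helper. This is a sketch: the function name song_json_path and the root argument (the directory the archive was extracted into) are hypothetical. The directory name is taken from the two characters that follow the "SO" prefix, which are at 0-based indices 2 and 3.

```python
from pathlib import Path


def song_json_path(root, song_id):
    """Return the expected location of the JSON file for a Song ID."""
    if not song_id.startswith("SO") or len(song_id) < 5:
        raise ValueError(f"not an Echo Nest Song ID: {song_id!r}")
    # The directory is named after the two characters following "SO".
    return Path(root) / song_id[2:4] / f"{song_id}.json"
```

For example, song_json_path("msd", "SOCWJDB12A58A776AF") gives the path msd/CW/SOCWJDB12A58A776AF.json.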
We have not validated any of the data in the archive.
Here is a truncated and annotated version of the file CW/SOCWJDB12A58A776AF.json: