The Million Song Dataset started as a collaborative project between The Echo Nest and LabROSA. It is a freely available collection of audio features and metadata for a million contemporary popular music tracks. The entire dataset is 280 GB, and you can also download a subset (10,000 songs) that is 1.8 GB in size.

There are several experiments you can try with the dataset. A couple of examples:

1. Calculate the song density for each song and list the top 10 high-density songs.
2. List the top 10 hottest songs closest to where you live, using the artist's latitude and longitude.

Two things get in the way of using the dataset as is:

Size: even the subset (10,000 songs) is 1.8 GB. What if we want a 200 MB dataset, or one even smaller?

Format: the files in the dataset are in HDF5 format. We need to convert them to tab-delimited (or otherwise delimited) text files to work with Hadoop.
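To make the "hottest songs near where you live" experiment concrete, here is a minimal sketch in plain Python. It assumes the HDF5 files have already been converted to tab-delimited rows. The four-column layout (title, artist latitude, artist longitude, hotness), the helper names, and the sample rows are all made up for illustration and are not taken from the dataset. Distance is computed with the haversine formula, which is one reasonable choice and not necessarily the only one.

```python
import csv
import io
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def hottest_nearby(lines, home_lat, home_lon, radius_km, limit=10):
    """Return up to `limit` (title, hotness, distance_km) tuples for songs
    whose artist location is within `radius_km`, hottest first.

    Assumed (hypothetical) row layout: title, latitude, longitude, hotness.
    """
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 4:
            continue  # skip malformed rows
        title = row[0]
        try:
            lat, lon, hot = (float(v) for v in row[1:])
        except ValueError:
            continue  # skip rows with missing or non-numeric values
        if any(math.isnan(v) for v in (lat, lon, hot)):
            continue  # skip rows where a value is recorded as NaN
        dist = haversine_km(home_lat, home_lon, lat, lon)
        if dist <= radius_km:
            hits.append((title, hot, dist))
    hits.sort(key=lambda h: h[1], reverse=True)
    return hits[:limit]


# Made-up sample rows, for illustration only.
SAMPLE = (
    "Song A\t40.71\t-74.01\t0.80\n"
    "Song B\t34.05\t-118.24\t0.95\n"
    "Song C\t40.73\t-73.99\t0.60\n"
    "Song D\t\t\t0.90\n"
)

if __name__ == "__main__":
    for title, hot, dist in hottest_nearby(io.StringIO(SAMPLE), 40.75, -73.98, 50):
        print(f"{title}\t{hot:.2f}\t{dist:.1f} km")
```

The same filter-then-sort logic maps naturally onto a Hadoop job: the distance check can run in the mapper, and the final top-10 selection in a single reducer.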