Music Genre Clustering #3 – Analyzing Music Genres


Okay, so the last post was great and all, but what exactly did I get out of it that will help me identify something to analyze? I saw librosa’s spectrogram, chromagram, and tempogram capabilities, which help us somewhat identify the instruments, key, and tempo respectively. These will absolutely change and evolve as we listen to different types of music, so why don’t we just check this out really quickly for a few genres of songs?

Music Genres

I think my objective is starting to shape itself a little bit, both out of my initial exploration and my personal musical tastes. I like to think that I listen to a variety of genres of music. That in itself is a subjective statement, so I’ll say I listen to more genres than your average person.

This is not too difficult, as the average person probably does have a life and really is not prioritizing the listening and exploration of music like I do. A lot of my life revolves around music… When I wake up, when I’m walking to work, when I’m working (usually), when I’m walking home, when I’m cooking, when I’m going to bed… and a few hours every week I’ll usually just spend sitting there and downloading songs that I’ve heard over the past few days. I’ll Shazam pretty much anything these days… podcasts, radio, random songs in TV shows or movies, random songs on the street as I walk, songs that other people recommend to me… After a few days, the songlist starts to add up and I have to sit down and collect all the songs of the week to add to my iTunes collection.

A few hours a week is also spent just djing for fun. I bought a pair of CDJs and just dj in my living room purely out of enjoyment and when I need a break from work. It’s also great when you have guests over and they question your life decisions because you spent as much money as a used automobile on something you’ll probably never get a return out of.

All this to say… I REALLY ENJOY MUSIC. As you could imagine, the songs I’m putting on when I wake up are probably not the same ones I put on when I try to dj (well, actually, sometimes they are…). In the morning I’ll generally listen to some slower stuff, right? R&B, jazz, soft rock, pop. When I’m trying to dj, I’ll select more disco, house, and techno. When I’m working, it’ll kind of span all those genres but you’ll also hear a bit more rap and soul in there as well. In the olden days, I used to listen to a lot of R&B, but nowadays, I’m finding myself grooving to more house and disco. It’s been an interesting progression, and I really do love everything.

I think maybe what I’ll start off this post by doing is just exploring the spectrograms, chromagrams, and tempograms of some of my favourite songs across different genres. But wait… which genres…? If we check out this applet here, am I going to go through all of these? Disco, dub, new wave, garage, speed garage, 2-step garage, eurodance, NRG, stupid. I don’t know about you, but I’m not sure if I’ve ever heard of a genre called stupid. Why does this thing have to demoralize me after I just bragged about how much music I listen to… jesus…

Let’s break it down a bit… The first result from Google for “music genres” gives us this list:

  • Alternative Music
  • Blues
  • Classical Music
  • Country Music
  • Dance Music
  • Easy Listening
  • Electronic Music
  • European Music (Folk / Pop)
  • Hip Hop / Rap
  • Indie Pop
  • Inspirational (incl. Gospel)
  • Asian Pop (J-Pop, K-pop)
  • Jazz
  • Latin Music
  • New Age
  • Opera
  • Pop (Popular music)
  • R&B / Soul
  • Reggae
  • Rock
  • Singer / Songwriter (inc. Folk)
  • World Music / Beats

This list is alright, but, once again, it slaps me in the face a bit as to how much music I really listen to. There’s some stuff in here that I probably don’t have any of in my library… I’d have to say I’m short on classical, country, inspirational, Latin, opera, reggae, and world music. There are also a few in here that aren’t necessarily genres so much as they are cultural generalizations, like Asian pop, Latin music, and European music. I’ll use this list as a framework and add my own genres as I go along.

Here are my hand-picked genres:

  • Ambient
  • Dance
  • Easy Listening
  • Folk
  • Pop
  • Hip Hop
  • Jazz
  • R&B
  • Rock

That might be too many, but I’ll stick with this for now and see where it takes me.

-- 2 days later --

Well that was fun. I just went through some 4,000-odd songs in my iTunes library and standardized the “genres” metadata field to one of these 9 categories. Thank god for food and TV! Just kidding, as bad as it sounds, it was actually kind of nostalgic going through all the music I haven’t listened to in ages… all those songs that I skip during shuffle over and over and over again.

I’d say about half of the list was already tagged with something in the genre field, and probably most of those were tagged with the correct genre or a standardized enough value that I could just do a mass search and replace. The rest of them, I more or less had to go 1 by 1 and just type the values in. iTunes autocomplete helped A LOT. Actually, now that I think about it, some of the songs that were pre-populated with a genre had the wrong genre, in the sense that an entire album was tagged with “R&B” but certain songs from there were Jazz or Dance… etc. I guess this is a good time to dive a bit more into my methodology:

Genre Classification Methodology

I’m going to start off this section by very clearly stating that there is no right answer.

How can I be so sure there is no right answer? Because I’m fighting with myself on even forming my own opinion. Some songs (especially modern day music) are so clearly a mix of 2 or 3 genres… I mean, I guess that’s why that whole list exists on that Google search I made. Music is an expression of human creativity, with the mantra that the only limit on music composition is the human brain itself. This literally goes against what we’re trying to do, which is to put some structure around a concept we made up, boxing together things that don’t belong together in theory. So what makes a Jazz song Jazz and an R&B song R&B? That is basically in the eyes, or ears rather, of the beholder. When I was younger, I would listen to a bunch of R&B, right? Nowadays, I’d probably classify some of those songs as Hip Hop and some as Jazz because I’ve now actually listened to real Hip Hop and Jazz songs. I just didn’t know enough about Jazz to even know what Jazz really sounded like, so how could I possibly classify it as Jazz? That situation right there explains clearly that this is, at the end of the day, a subjective discussion. But now that that’s over with, let’s make it objective haha.

Rather than explaining my methodology with words, I’ll explain it primarily with some music itself:


Ambient

I’m going in alphabetical order here, but this is a really weird one to start with. This should almost be last because this was kinda the genre that I put things in when they didn’t really fit into any other genre…

Ambient includes songs like Laraaji – I Am Sky and Gino Soccio – Closer, which generally don’t have prominent drums, but it also has tracks like Nese Karabocek – Yali Yali (Todd Terje Edit), which has thumping drums for sure… but where else would you put this? Rock? Dance? I dunno…

The general feel I was going for was something that was generally slow and smooth, something you could perhaps just close your eyes and sit there to listen to.


Dance

Through this exercise, I learned that I basically am not at all diverse in my musical taste, as probably 40-50% of my library turned out to be Dance haha… gotta love Dance music man…

In my own defense, I actually lumped a few categories into Dance itself. It includes the likes of Disco, House, Techno, and Trance, among others. One of the hardest distinctions I had to make was between Disco and R&B, which would categorize to Dance and R&B respectively. It was almost a BPM split for me, as anything 120 BPM or higher tended to “sound” more Disco-ish and anything under sounded R&B-ish. This is not the best criterion, nor can I even explain my own true criteria in words, but maybe I’ll let some sounds do the talking:

I have to say that most of my songs come from the Disco and House genres. I thought about separating Disco, but then again… House… Disco… R&B… so much of it melds together these days (and even the Google music genre page has Disco under R&B). I’m finding myself listening to a lot more Disco re-edits and remixes that have a House backbone to them, so I’m putting it all under Dance for that thumping effect of the drums and bass. I can see a world where the melodies of Disco get confused with the lack of melody in Techno, but perhaps that could be said about House and Trance as well.
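To make the lumping a bit more concrete, here’s a rough sketch of the rule of thumb described above. The function names, the sub-genre spellings, and treating 120 BPM as a hard cutoff are all simplifications made up for illustration; the actual labelling was done by ear, not by code.

```python
# Sub-genres that get folded into the single "Dance" label.
# (The exact spellings are assumptions for illustration.)
DANCE_SUBGENRES = {"disco", "house", "techno", "trance"}

# Rough Disco-vs-R&B cutoff from the text; a hard threshold is a
# simplification of what was really a judgment call by ear.
BPM_CUTOFF = 120


def lump_genre(raw_genre):
    """Fold a raw genre tag into the coarse label used for the library."""
    tag = raw_genre.strip().lower()
    return "Dance" if tag in DANCE_SUBGENRES else raw_genre.strip()


def disco_or_rnb(bpm):
    """Tie-break a borderline Disco / R&B track on tempo alone."""
    return "Dance" if bpm >= BPM_CUTOFF else "R&B"


print(lump_genre("House"))   # folded into Dance
print(disco_or_rnb(96))      # slower track falls on the R&B side
```

A tempo-only tie-break will obviously mislabel slow disco and fast R&B, which is exactly why it only ever served as a loose guide.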

I just want to quickly take a look at some of the spectrum charts of some of these hits. Let’s load some libraries and define a general function that we can use because I’ll be exploring a few songs here.

In [2]:
# Enable plots in the notebook
%matplotlib inline
import matplotlib.pyplot as plt

# Seaborn makes our plots prettier
import seaborn
seaborn.set(style = 'ticks')

# Import the audio playback widget
from IPython.display import Audio

import numpy as np
import pandas as pd
import librosa
import librosa.display

If any of these librosa functions are foreign, or you’re wondering what librosa is in general, check out my last post.

In [3]:
# Define a function that takes the path of a song, loads it with librosa, and plots the
#    CQT spectrogram, beat-synced chromagram, and tempogram
def generate_song_plots(song_path):
    # Load song (librosa resamples to 22,050 Hz mono by default)
    y, sr = librosa.load(song_path)
    # Plot CQT spectrogram (the CQT is complex-valued, so take its magnitude first)
    plt.subplot(3, 1, 1)
    cqt = np.abs(librosa.cqt(y, sr = sr))
    librosa.display.specshow(librosa.amplitude_to_db(cqt, ref = np.max), sr = sr, x_axis = 'time', y_axis = 'cqt_note')
    plt.colorbar(format = '%+2.0f dB')
    # Plot chromagram, aggregated (median) between detected beats
    plt.subplot(3, 1, 2)
    chroma_cqt = librosa.feature.chroma_cqt(y = y, sr = sr)
    tempo, beat_f = librosa.beat.beat_track(y = y, sr = sr, trim = False)
    # Newer librosa versions may return the tempo as a 1-element array; make it a plain float
    tempo = float(np.atleast_1d(tempo)[0])
    beat_f = librosa.util.fix_frames(beat_f, x_max = chroma_cqt.shape[1])
    cqt_sync = librosa.util.sync(chroma_cqt, beat_f, aggregate = np.median)
    beat_t = librosa.frames_to_time(beat_f, sr = sr)
    librosa.display.specshow(cqt_sync, y_axis = 'chroma', x_axis = 'time', x_coords = beat_t)
    # Plot tempogram with the estimated global tempo overlaid
    plt.subplot(3, 1, 3)
    tgram = librosa.feature.tempogram(y = y, sr = sr)
    librosa.display.specshow(tgram, sr = sr, x_axis = 'time', y_axis = 'tempo')
    plt.axhline(tempo, color = 'w', linestyle = '--', alpha = 1, label = 'Estimated tempo={:g}'.format(tempo))
    plt.legend(frameon = True, framealpha = 0.75)
    # Return signal and sample rate
    return y, sr

Carrie Lucas – Dance With You

In [22]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Carrie Lucas/Unknown Album/Dance With You (12\' Extended Mix).mp3'
y, sr = generate_song_plots(song_path)

Eats Everything feat. Tiga vs Audion – Dancing (Again!)

Let’s see what a techno song looks like:

In [24]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Eats Everything ft. Tiga & Audion/Unknown Album/Dancing (Again!).mp3'
y, sr = generate_song_plots(song_path)

Pretty cool. One thing I can kinda see right off the bat is how a techno song is synthesized digitally. Look at that intro of the techno song. It’s got no bass, at all! Not surprising, having listened to songs created synthetically and getting a bit into djing and production myself. I think even a hi-hat, naturally, has some footprint in the lower octaves simply from natural vibrations, but in Ableton or any other production software, you can filter out all those frequencies. This is definitely what we hear in the beginning of Dancing (Again!).

Dancing (Again!) also doesn’t have much of a harmonic footprint. It’s just concentrated around D# / E / F! I guess a hard banging synthesized sound has about that frequency? At least this one does! In Dance With You, we see notes everywhere, which I can only assume make up the key of the song.

Easy Listening

Easy Listening vs Ambient… Who knows man… This is the way I’ve defined it: Ambient has a more mysterious mood whereas I’ll never get anxious or scared listening to Easy Listening. Easy Listening can be on in the background and I should never notice it going about my day. It’s generally happy, and doesn’t involve much effort in terms of emotional investment. Anything with a lot of raw emotion that grabs your attention I’d end up classifying as Ambient or R&B… very slow R&B haha.

Some examples include Julie London – Cry Me A River and MYMP – For All Of My Life.

Very light, if even at all, on the drums and bass.

Julie London – Cry Me A River

In [4]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Julie London/Julie Is Her Name_Julie Is Her Name, Vol. 2/01.Cry Me A River.mp3'
y, sr = generate_song_plots(song_path)

What a contrast eh? I can tell this song flows as one song, and isn’t so distinctly broken down into a verse, chorus, break… etc. There’s no clear “build up” or “drop” like you would find in a traditional house or techno song. The notes are a bit more distinct as well, so a key is more decipherable than in the techno track.

The tempogram is such a contrast to the house and techno tracks as well. For techno, you see such distinct lines @ 64, 128, 256… etc BPM (other than at the buildups when it’s kinda just white noise and there isn’t such a distinct beat), but this Julie London song doesn’t quite have such distinct lines, although we can see traces here and there. The tempo that librosa actually chose doesn’t even seem to sit on one of the distinct tempo marks in the tempogram. As far as I understand it, beat_track settles on a single tempo for the whole song and, by default, leans towards estimates near 120 BPM, so on a track without a strong pulse the number shouldn’t be taken too literally.
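Those evenly spaced lines in the techno tempogram are the same pulse counted at half and double speed. Here’s a small sketch of that relationship, plus one common way to fold an estimate back into a preferred range. The function names and the 70–140 BPM range are assumptions picked for illustration, and this is not what librosa does internally.

```python
def tempo_octaves(bpm, n_down=1, n_up=1):
    """Return the tempo together with its half-time / double-time relatives."""
    return [float(bpm) * 2.0 ** k for k in range(-n_down, n_up + 1)]


def fold_tempo(bpm, low=70.0, high=140.0):
    """Halve or double a tempo estimate until it lands in [low, high).

    Assumes high >= 2 * low so that every tempo has a home in the range.
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    while bpm < low:
        bpm *= 2
    while bpm >= high:
        bpm /= 2
    return bpm


print(tempo_octaves(128))  # [64.0, 128.0, 256.0]
print(fold_tempo(256))     # 128.0
```

Any estimator that has to return one number is effectively choosing among these relatives, which is part of why a reported tempo can look off by a factor of two.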


Folk

I have so few true folk songs that this category doesn’t really have a big enough sample size. I’ll probably end up taking this category out or lumping it in with Rock, but for now, this genre includes Feist – Mushaboom and Sufjan Stevens – Death With Dignity.

Sufjan Stevens – Death With Dignity

In [5]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Sufjan Stevens/Carrie & Lowell/01 Death with Dignity.mp3'
y, sr = generate_song_plots(song_path)

A bit more of a rhythm here from the tempogram. Even though the track is pretty much guitar driven, the guitar player clearly sticks to a certain tempo throughout the entire song up till the last 30 seconds or so.


Pop

Like the Dance category, Pop also has a very very wide range of songs… Your classic pop songs like Backstreet Boys – I Want It That Way, to your more dancey pop Kylie Minogue – In Your Eyes (which could almost be Dance tbh…), to your soft pop Lianne La Havas – Elusive, to your more indie pop Chairlift – Amanaemonesia (GREAT VIDEO…), to your more contemporary pop Kishi Bashi – Say Yeah.

It almost feels wrong to have all these songs in one genre, but I’m not about to create a classification model with 80 classes.

Because there’s so much variety with pop, I’m going to explore the makeup of a few songs here.

Backstreet Boys – I Want It That Way

In [6]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Backstreet Boys/Best of/03 I Want It Like That (I Want It That Way).mp3'
y, sr = generate_song_plots(song_path)

Very similar footprint to Sufjan Stevens in the sense of very discernible tones. There is more structure in the tempogram as well, likely due to the fact that the beat is actually being driven by percussion.

Lianne La Havas – Elusive

In [7]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Lianne La Havas/Is Your Love Big Enough_ (Deluxe Edition)/08 Elusive.mp3'
y, sr = generate_song_plots(song_path)

Not much that my untrained eye hasn’t already seen. Clear tones, and a consistent tempo structure. One thing that is cool about this one though is that, at times, there is just a voice and a light bassline. Around the 1:00 mark, we can see a very very distinct bassline in the C1 – C2 octave. In the same phase of this song, I guess I can infer that a lower female voice lies within the C4 – C5 octave!
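To put rough numbers on those octaves, note names can be converted to frequencies with the standard equal-temperament formula (A4 = 440 Hz). This is a minimal sketch: the `note_to_hz` below is a hand-rolled helper written for illustration, although librosa ships its own `librosa.note_to_hz` that does the same job.

```python
# Semitone offset of each natural note from C within an octave.
_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(note, a4=440.0):
    """Convert a note name such as 'C1' or 'D#4' to Hz (equal temperament)."""
    letter, rest = note[0].upper(), note[1:]
    offset = _SEMITONES[letter]
    if rest.startswith("#"):
        offset, rest = offset + 1, rest[1:]
    elif rest.startswith("b"):
        offset, rest = offset - 1, rest[1:]
    octave = int(rest)
    midi = 12 * (octave + 1) + offset      # C4 -> MIDI note 60
    return a4 * 2.0 ** ((midi - 69) / 12)  # A4 (MIDI 69) -> 440 Hz


for name in ("C1", "C2", "C4", "C5"):
    print(name, round(note_to_hz(name), 1))
```

So the bassline sits at roughly 33–65 Hz, and the vocal at roughly 262–523 Hz.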

Chairlift – Amanaemonesia

In [8]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Chairlift/Something/07 Amanaemonesia.mp3'
y, sr = generate_song_plots(song_path)

More upbeat song here with many more instruments than the last two songs. That’s reflected in how busy the spectrogram and chromagram are. Nothing out of the ordinary here, though.

Hip Hop

This one was perhaps the easiest to classify. Whether or not it’s right or wrong, if the main vocal form of the song was rap, then it went to hip hop. Hip Hop included the likes of Black Star – Astronomy (8th Light), Migos – Hannah Montana (straight ratchet), and Twista ft. Kanye West – Overnight Celebrity which, in another world with different vocals, could almost be an R&B song.

Different feels for all 3 songs; hopefully the model can pull out the rapping rhythm from these.

Black Star – Astronomy (8th Light)

In [9]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Black Star/Mos Def & Talib Kweli Are Black Star/02 Astronomy (8th Light).mp3'
y, sr = generate_song_plots(song_path)

Okay, Hip Hop is pretty interesting, or rather, this specific song is. In all the songs so far other than Julie London, we’ve seen more distinct phases… the verses, the breaks, the choruses, the hooks… Astronomy is such a great song more or less without any of these! It’s just Mos Def and Talib flowing throughout the entire song. In the back, we just hear a bassline which is very prevalent in the spectrogram! In the chromagram, we see a variety of prominent notes here… I wonder if it has anything to do with rapping in general. Rappers aren’t trying to hit notes necessarily, it’s more about flow and rhythm. The words kinda matter too haha.

Migos – Hannah Montana

In [10]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Migos/Young Rich Ni__as/06 Hannah Montana.mp3'
y, sr = generate_song_plots(song_path)

Migos… oh man… ratchet af rap. More modern hip hop is kinda interesting because a lot of it is also created on a computer. A lot of synthesized sounds and quantized rhythms. This song is very bass heavy as well (no surprise here, it’s basically made to blow a club up), and the spectrogram intensity shows this (the C1 – C2 octave is very strong). The chromagram also shows that the kick drums are basically an E, but more importantly, there really is no other presence other than that. Because I’m pretty sure this track was produced on a computer, I thought the tempogram would be a bit cleaner, but now listening to the song again, the percussion really isn’t that clean in terms of being 4 / 4. There are some off-beat hi-hats (I think) and claps, and they’ve got that mid-2010s hip hop flow of rapping in triplets (@ around the 1:00 mark, “Han-nah-Mon Ta-Na-I’m Sel-Ling-Them Brick-Out-The Phan-Tom”… that 1-2-3 1-2-3 1-2-3 1-2-3 1-2 flow).
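To see why a triplet flow muddies the picture, it helps to compare where the syllables land against a straight subdivision of the beat. This is just a sketch: `subdivision_onsets` is a made-up helper, and the 120 BPM below is a hypothetical tempo chosen to keep the numbers round, not the measured tempo of this track.

```python
def subdivision_onsets(bpm, per_beat, n_beats=1):
    """Onset times (seconds) when each beat is split into `per_beat` equal parts."""
    beat = 60.0 / bpm
    return [round(i * beat / per_beat, 3) for i in range(per_beat * n_beats)]


bpm = 120  # hypothetical tempo, not taken from the song
print(subdivision_onsets(bpm, 2))  # straight eighths: [0.0, 0.25]
print(subdivision_onsets(bpm, 3))  # triplets:         [0.0, 0.167, 0.333]
```

At that tempo, eighth notes imply a pulse of 240 events per minute while triplets imply 360, so vocals in triplets over straighter percussion put energy at two competing rates.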


Jazz

Jazz was also tough. Is it Dance? Is it R&B? Is it Jazz? The age-old questions… and by age-old I mean questions I kept asking myself over the last 48 hours.

Jazz has a certain feel. In certain ways it’s not as bumpin as R&B, but it’s got more character than certain R&B songs. We got smooth jazz like Sade – Is It A Crime which has a ton of emotion, but generally takes it slower than most R&B songs. We also got more upbeat jazz that can bring people to their feet to shake their you-know-whats: Takako Mamiya – Morning Flight. Then we blur the lines of even pop with stuff like Amy Winehouse – You Know I’m No Good.

Honestly, all 3 of these could be debated into other genres, but trust me, I have pretty traditional Jazz in here as well: Nat King Cole – L-O-V-E. I don’t have much instrumental jazz, but I have a few like Stan Getz & Gerry Mulligan – I Didn’t Know What Time It Was as well.

Sade – Is It A Crime

In [11]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Sade/Promise/01 - Sade - Is It A Crime.mp3'
y, sr = generate_song_plots(song_path)

Man, I’m so surprised to see such a structured tempogram… I’ve listened to this song so many times and it seems like the tempo just wavers to whatever the band feels like. I know that they’re not just playing whatever, but I feel the whole point of the song is that she goes in and out of these emotional phases and tempo is one thing that dictates that. You see wide bands (e.g. 0:00 – 0:50) that are a bit messier, and maybe those parts are what I notice, but there still is a consistent 123 BPM throughout the entire song.

Stan Getz & Gerry Mulligan – I Didn’t Know What Time It Was

In [12]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Stan Getz & Gerry Mulligan/Unknown Album/Stan Getz & Gerry Mulligan-I Didn\'t Know What Time It Was .mp3'
y, sr = generate_song_plots(song_path)

Instrumental jazz… look at all those distinct notes in the chromagram. Could literally be mapped to sheet music! The tempogram is a bit messier though, and that’s what I thought I’d see from Is It A Crime as well. I never grew up playing jazz or listening to jazz, so I won’t pretend like I know even the first thing about the technicalities of jazz. My understanding is that the bass player generally keeps time, while the 2nd and 3rd legs of a trio (drums, sax / piano / guitar) are free to kind of improvise around that and “feel” the music a bit more. I’d say that story maybe can be told in the tempogram because we obviously see some sort of backbone, but the scatteredness of the intensities indicates that the sax is probably not playing right in time with the bass and drums.


R&B

Man… R&B… the memories. Of course I gotta kick this off with some USHA USHA… Usher – U Remind Me. Most of my R&B collection has that vibe… the Beyonces, TLCs, Janet Jacksons, Maxwells, Alicia Keys, D’Angelos, and Erykah Badus of the world. Not exactly like U Remind Me, but a nice melody, heavy bassline, heavy kicks and snares… etc.

I’ll go into some of the stuff that teeters on the edge of other categories as well… Some were Jazz / R&B mixes like Stevie Wonder – Sunny, whose label I’m honestly still considering changing right now, or Dance / R&B mixes like Surface – Falling In Love, which I am, again, considering changing to Dance as I type this… It’s got that Dance vibe but it’s just slow and sexy like an R&B song. In general, I lumped the likes of funk, soul, and slower dance tracks into R&B as well.

Anyways, I promised myself that I wouldn’t touch the labels anymore so I’m done.

Usher – U Remind Me

In [13]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Usher/Greatest Hits Disc 1/1-10 U Remind Me.mp3'
y, sr = generate_song_plots(song_path)

Man, looking at some of these spectrograms, it’s quite apparent how repetitive some of these 90s / 00s pop songs are. The spectrogram and the chromagram barely show any signs of differentiation. I’ll caveat it with the thought that the key of a song would give us a better indication of differentiation, and that the chromagram doesn’t convey the key information directly (to an untrained eye, e.g. mine), but knowing U Remind Me really well does help confirm the fact that, yes, it’s a super super repetitive song in terms of keys and rhythm. The break at around the 1:50 mark is the only part of the song that shows any kind of distinction. Not trying to rag on U Remind Me, because it is such a good song, but more so a commentary on the era and genre of music.
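One way to put a number on “barely any signs of differentiation” would be to compare chroma frames with each other. The sketch below is plain Python on made-up 12-bin vectors rather than the real beat-synced chromagram; `repetitiveness` and the toy data are assumptions for illustration, not something that was run on the actual song.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def repetitiveness(frames):
    """Mean pairwise cosine similarity; closer to 1 means the frames look alike."""
    pairs = [(i, j) for i in range(len(frames)) for j in range(i + 1, len(frames))]
    if not pairs:
        return 1.0
    return sum(cosine(frames[i], frames[j]) for i, j in pairs) / len(pairs)


# Toy 12-bin chroma frames (C, C#, ..., B); the values are invented.
same_chord = [[1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]] * 4
moving = [
    [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
]

print(repetitiveness(same_chord) > repetitiveness(moving))
```

A score like this only captures pitch-class content, so two songs in different keys could both look equally “repetitive” by it.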

Stevie Wonder – Sunny

In [14]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Stevie Wonder/For Once In My Life/07 Sunny.mp3'
y, sr = generate_song_plots(song_path)

Sunny by Stevie Wonder, off his 1968 album For Once In My Life, exhibits the opposite characteristics of what I just said about U Remind Me. Sunny is a bit more sporadic in its key changes. The verses, bridges, and choruses all have different feels. This is reflected in the chromagram.


Rock

I am NOT a rock person, so I think this category is actually decently unique. Generally for rock I’m listening for that electric guitar and steady beat. Again, I don’t listen to too much rock, but you have to listen to Radiohead and marvel… Radiohead – Creep. When that electric guitar comes in, you’re like “oh yeah that’s rock”.

I also have a bunch of… I dunno, even more alternative stuff lumped in here? Portishead – Roads I lumped into there and a bunch of Pink Floyd as well (do I even pick one song? Every song is so different lol).

Also got some traditional (?) rock in there like Fleetwood Mac – You Make Loving Fun.

Radiohead – Creep

In [17]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Radiohead/Pablo Honey/02 - Creep.m4a'
y, sr = generate_song_plots(song_path)

The chromagram here shows how repetitive Creep is in its tones. The spectrogram shows us there’s more than meets the eye. The chorus @ 1:00 shows the boost in frequencies that the electric guitar provides in the song. The tempogram also shows how intensely the electric guitar masks any semblance of rhythm and tempo from the drums. Despite that, it’s still quite clear what the tempo is throughout.

Portishead – Roads

In [16]:
song_path = '/Users/chiwang/Documents/iTunes 20160601 copy/iTunes Media/Music/Portishead/Dummy/08 Roads.mp3'
y, sr = generate_song_plots(song_path)

This is the last song I’m going to look at. Nothing really that we haven’t seen yet, but look at how cool that bassline in the spectrogram looks…

The tempogram seems to be a bit confused in the first 50 seconds or so. The oscillation in that intro synth doesn’t actually match up with the tempo of the song itself (it’s oscillating in triplets), so it seems that the tempogram is picking up on that oscillation because there are no drums whatsoever in the intro to latch on to.


So if you were INSANE enough to actually listen to all of those songs, or even half of those songs, or haven’t judged me so much by my taste in music that you’ve already stopped reading, you can probably see that it wasn’t easy for me to categorize some of the songs. Some caveats are that by nature I listen to a lot of cross-genre type of stuff, and I only took two days (although it felt like an eternity) to categorize them. If I were trying to make a research paper out of this, I’d probably pull a list of songs off a database that experts have labelled, or take way more time to actually go through and put more of a method to it than “I feel like it”, but it is my collection after all and I should be able to at least put a decent list together off a 2-day skim.

So I guess I have a list of songs that I want to scan… Now to feature build… Actually this is probably enough for this post so far. I’ll explore feature building next time!
