In light of the fresh Cosmogramma reissue from Vinyl Me, Please, I was having a conversation with VMP’s Matt Hessler about Flying Lotus, his legacy as a genre-straddling producer/composer, and how he helped shape a very particular corner of the music world with the Brainfeeder label (hello Kamasi Washington and Thundercat).
During the course of things, Hessler observed that FlyLo’s fanbase seems more diverse than those of most electronic artists when it comes to the styles of music they consume. While totally anecdotal, that idea sounded right to me for a host of reasons. Then I realized we didn’t have to rely on inference and informal reasoning, thanks to the massive amount of collection data at my fingertips in the Discogs database.
I’m always jazzed to figure out new and interesting angles for using Discogs’ data, so coming up with a strategy for definitively proving whether Flying Lotus has a more heterogeneous fanbase than his electronic peers seemed like a ridiculous and perfect project. This would be like sabermetrics for baseball nerds, except for music nerds. Discographies Specialist Brent Greissle and I set out to make it happen.
Warning: If you get bored easily by numbers and math talk, skip to the “What We Found” section right now. You should, for real. It won’t hurt my feelings. Promise. But if you’re interested in the nuts and bolts of how we chose the lineup for this experiment or how we developed the shiny new metric of “Total Diversity Index,” then hunker down.
Picking The Players
To make everything manageable, we settled on four other artists for comparison. All but one had roughly the same “popularity,” which we defined as a combination of wants and haves among users. Since massively collected artists might skew the numbers (after all, everyone of a certain age owns a copy of Oxygène), this was the easiest way to make sure we were comparing apples to apples.
We chose each act for different reasons. One was a noted electronic artist with a somewhat well-defined sound (Surgeon), one had a more circuitous decades-long career (David Sylvian), one was a contemporary with some crossover success (Bonobo), and we rounded things out with Aphex Twin. He’s more popular, but his reputation as a genre-bending IDM experimentalist most closely matches Flying Lotus in essence, if not style. And since Aphex Twin has attained serious crossover success, audience diversity would seem to follow. So it made sense to see how he matched up.
Finding The Total Diversity Index
The Problem Of Genre Vs. Style
Each genre in the database can have dozens of styles within. That means diversity in genre and style hint at similar — but still distinct — implications. Let’s say Lisa and Jamal each have collections with 15 records in them, and every record is a different style. They might look equally diverse at first glance, but if Lisa has 15 genres and Jamal has one genre, are they really as diverse? Lisa’s would be more diverse, right?
By the same token, let’s say Jamal’s buddy Frank has 15 records. All of Frank’s records are the same style (ergo all from one genre). Even though Frank and Jamal’s collections have the same number of genres, Jamal still has a more diverse collection.
As you can see, this might present issues when trying to determine the relative significance and weight of what we’ll call “Genre Diversity” (GD) and “Style Diversity” (SD) from here on out. We’ll deal with that problem soon, but let’s table it for now.
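For the code-minded, the Lisa, Jamal, and Frank comparison can be sketched in a few lines of Python. This is only an illustration: the Record type, the function names, and the genre and style labels are made up for the example and are not how the Discogs database stores anything.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """A hypothetical, stripped-down record: one genre and one style."""
    genre: str
    style: str


def genre_diversity(collection):
    """Raw Genre Diversity (GD): the number of distinct genres."""
    return len({record.genre for record in collection})


def style_diversity(collection):
    """Raw Style Diversity (SD): the number of distinct styles."""
    return len({record.style for record in collection})


# Made-up collections mirroring the three examples above.
lisa = [Record(f"genre-{i}", f"style-{i}") for i in range(15)]
jamal = [Record("Electronic", f"style-{i}") for i in range(15)]
frank = [Record("Electronic", "Techno")] * 15

for name, collection in (("Lisa", lisa), ("Jamal", jamal), ("Frank", frank)):
    print(name, genre_diversity(collection), style_diversity(collection))
```

Lisa and Jamal tie on styles but not on genres, and Jamal and Frank tie on genres but not on styles, which is exactly why the two counts can't be treated as interchangeable.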
Collecting Collection Numbers
We looked at user collections that contained one or more releases from an artist. Larger collections will be more variegated by default, so collection size needed to be accounted for (which also lets us look at the data on a more granular level later on). To accomplish this, all collections were broken down into size segments: Small (1-9), Medium (10-99), Large (100-999), and Huge (1000+).
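As a rough sketch, the bucketing might look like the function below. The name size_segment is hypothetical, and it assumes a collection of exactly 10 records counts as Medium.

```python
def size_segment(n_records: int) -> str:
    """Map a collection's record count to a size segment.

    Assumption: a 10-record collection falls in Medium.
    """
    if n_records < 1:
        raise ValueError("a collection needs at least one record")
    if n_records <= 9:
        return "Small"
    if n_records <= 99:
        return "Medium"
    if n_records <= 999:
        return "Large"
    return "Huge"


for count in (6, 42, 500, 3000):
    print(count, size_segment(count))
```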
We averaged every collection in each segment to come up with GD and SD for Small, Medium, Large, and Huge. For example, a Small collection with at least one Surgeon record on average contains 1.36 different genres, hence a GD of 1.36. Similarly, Surgeon’s SD is 4.98 for a Small collection.
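Here is a minimal sketch of that averaging step, assuming each collection has already been boiled down to a (segment, genre count, style count) tuple. The function name and the sample numbers are invented for illustration; they are not real Discogs figures.

```python
from collections import defaultdict
from statistics import mean


def average_by_segment(collections):
    """Average raw genre and style counts within each size segment.

    `collections` is an iterable of (segment, genre_count, style_count).
    Returns {segment: (average GD, average SD)}.
    """
    grouped = defaultdict(list)
    for segment, genre_count, style_count in collections:
        grouped[segment].append((genre_count, style_count))
    return {
        segment: (
            mean([g for g, _ in counts]),
            mean([s for _, s in counts]),
        )
        for segment, counts in grouped.items()
    }


# Made-up sample: two Small collections and one Medium collection.
sample = [("Small", 1.0, 4.0), ("Small", 2.0, 6.0), ("Medium", 3.0, 20.0)]
print(average_by_segment(sample))
```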
To understand the relative value of these numbers, we grabbed the average GD and SD for all user collections on Discogs. We were then able to tell how much each GD and SD deviated from the mean. This is where we came up with the concept of a “Diversity Index.” It works like this: The Genre Diversity Index (GDI) of each size segment across all of Discogs is 1. An artist’s score above 1 means collections in that segment are more diverse than the site-wide average, and a score below 1 means they are less diverse. The Style Diversity Index (SDI) works the same way.
Think of the difference between GD/SD and GDI/SDI as the raw number and the significance of the number. 152 styles sounds like a lot, but it’s actually the site-wide SD for a Huge collection. So if an artist’s SD is 154, that’s not a very diverse fanbase in the Huge segment. With an SDI score of 1.01, you can tell that easily.
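The 154-versus-152 example is consistent with reading the index as a simple ratio of an artist's number to the site-wide number for the same segment. Treat that as an assumption rather than a spec; the function name below is hypothetical too.

```python
def diversity_index(artist_value: float, sitewide_value: float) -> float:
    """Assumed formula: artist GD (or SD) divided by the site-wide
    GD (or SD) for the same size segment."""
    if sitewide_value <= 0:
        raise ValueError("site-wide average must be positive")
    return artist_value / sitewide_value


# The Huge-segment style example from the text: 154 against a site-wide 152.
print(round(diversity_index(154, 152), 2))  # 1.01
```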
The smaller the size segment, the more the Diversity Index is amplified (Surgeon’s Small SDI is 1.41 compared to a Huge SDI of 1.09). This allows us to deal with the issue of larger collections being more diverse by default when trying to see the overall picture of style and genre diversity, which is the next step.
All size segment GDIs are averaged together to come up with the Total Genre Diversity Index (TGDI); same thing with SDIs to arrive at TSDI. Continuing on with Surgeon as an example, we can see that his TGDI is 0.79, and his TSDI is 1.15.
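Rolling the four segments into one total is a plain, unweighted mean. The sketch below uses placeholder index values, not Surgeon's actual per-segment numbers, and the names are hypothetical.

```python
from statistics import mean

SEGMENTS = ("Small", "Medium", "Large", "Huge")


def total_index(segment_indexes):
    """Unweighted mean of the four per-segment indexes (TGDI or TSDI)."""
    missing = [s for s in SEGMENTS if s not in segment_indexes]
    if missing:
        raise ValueError(f"missing segments: {missing}")
    return mean(segment_indexes[s] for s in SEGMENTS)


# Placeholder values for illustration only.
example_gdi = {"Small": 1.5, "Medium": 1.2, "Large": 1.1, "Huge": 1.0}
print(round(total_index(example_gdi), 2))
```

Because every segment counts equally, the Small segment (where the index swings the most) has as much say in the total as the Huge one.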
Putting It All Together
Now onto the pesky problem of Genre Diversity and Style Diversity having different implications. In order to properly find total diversity, we need to reflect the significance of Genre Diversity over Style Diversity (remember, Lisa had the most diverse collection of all!).
We divided site-wide GD by site-wide SD for each size segment. We took those numbers, averaged them together, and used the mean to determine how much weight to give TGDI over TSDI. From there, we averaged the TGDI and adjusted TSDI to finally come up with an authoritative way to answer our initial question (drum roll): Total Diversity Index!
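One plausible way to read that last step is as a weighted average in which the style index counts for only a fraction of the genre index. The sketch below is that reading and nothing more: the exact formula isn't spelled out here, the function names are hypothetical, and every site-wide number except the Huge SD of 152 is a placeholder.

```python
from statistics import mean


def genre_style_weight(site_gd, site_sd):
    """Mean of the per-segment ratios of site-wide GD to site-wide SD."""
    return mean(site_gd[segment] / site_sd[segment] for segment in site_gd)


def total_diversity_index(tgdi, tsdi, style_weight):
    """Assumed formula: TGDI at full weight, TSDI scaled by style_weight."""
    return (tgdi + style_weight * tsdi) / (1 + style_weight)


# Placeholder site-wide averages (only the Huge SD of 152 comes from the text).
site_gd = {"Small": 2.0, "Medium": 5.0, "Large": 10.0, "Huge": 15.0}
site_sd = {"Small": 4.0, "Medium": 20.0, "Large": 80.0, "Huge": 152.0}

weight = genre_style_weight(site_gd, site_sd)
# Surgeon's TGDI and TSDI from the text, combined with the placeholder weight.
print(round(total_diversity_index(0.79, 1.15, weight), 2))
```

Under this assumed formula, a style weight of 0.125 would turn a TGDI of 0.79 and a TSDI of 1.15 into 0.83, but the weight actually used isn't stated, so don't read too much into that.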
Surgeon has a TDI of 0.83, which showcases what our intuition might tell us: An artist who is decently (but not massively) popular in his chosen genre and has a relatively well-defined sound probably won’t have the most diverse audience compared to the average record collector. But now we don’t have to rely on intuition.
Before we get into our findings, I want to be clear about something that might seem obvious. TDI is not an indication of quality. And since artists in our study had roughly the same number of wants and haves in collection, but a big variance in TDI, it’s not an indication of popularity either.
What We Found
If you skipped ahead, welcome back. If you stuck it out through our methodology, big high five! It might’ve seemed like a lot, but we’re dedicated to getting data right, and we wanted to show you how dedicated. This is Discogs, for God’s sake. It’s what we do. Below are the topline results of our study:
It’s blatantly obvious that our initial suspicions were correct. Flying Lotus has a higher Total Diversity Index than your average electronic musician. In fact, he has a higher TDI than at least one way-above-average electronic musician, beating Aphex Twin by a noticeable margin. That speaks volumes about the diversity of Flying Lotus’ fanbase. Don’t forget, we included Aphex Twin because we assumed his longevity, popularity, and mainstream success would lead to a more multifarious audience.
If for some reason you aren’t convinced by the topline numbers, digging into the granular data is eye opening as well. When comparing GDI for Medium, Large, and especially Huge collections, you can see tight competition between Flying Lotus, Bonobo, and David Sylvian.
FlyLo really flexes in the Small segment, though. As we mentioned earlier, and as you can see from the chart below showing GDI from all segments, Diversity Index is weighted to favor collections with fewer pieces.
This is important from a statistical standpoint, because the fewer records someone owns, the more of a commitment it is to buy new pieces. That means a collector with a small but diverse collection is making a more concerted effort to expand their horizons. In other words, it would be commonplace for a collection of 3,000 records to contain six genres, but a collection of six records where each one was a different genre would be surprising. Hence the weighting.
Paying special attention to Small collections also helps us glean anecdotal “common sense” ideas about an artist’s importance in broader pop culture. If an artist has a high GDI for collections of fewer than 10 records, it likely means their audience in that segment is choosing only “staple” records from various genres. In Flying Lotus’ case, an average small-time collector may think, “If I need to own one electronic album on vinyl, it’s probably Cosmogramma.”
It could also signify that Flying Lotus is seen as a prime gateway into electronic music, thanks to his connections and collaborations in the jazz and hip-hop worlds or his curation of Brainfeeder. It’s hard to argue against that idea. When an IDM neophyte notices him collaborating with Kendrick Lamar or realizes he put out those Thundercat albums, why wouldn’t they think, “This guy is doing all kinds of cool stuff. Maybe there’s something to this experimental dance music thing.”
While trying to figure out the why and how of it is entirely speculative, we now have a way to find out whether an artist is beloved by a deep but narrow set of rabid fans or a cross section of heads from all over the map. We can now say with some certainty that FlyLo is in the latter category.
PS: Developing this incredibly nerdy set of metrics was so much fun. We want to do more of these and continue finding new ways to cut up the treasure trove of data in the Discogs filing cabinets. If you have ideas, leave them in the comments or even tweet at me (@mrseancannon). Come on, who doesn’t want to develop a VORP or DVOA for music? Right? Right???