google-api freebase mql

How to find related songs or artists using Freebase MQL?


I have a Freebase mid such as /m/0mgcr, which is The Offspring.

What's the best way to use MQL to find related artists?

Or say I have a song mid such as /m/0l_f7f, which is Original Prankster by The Offspring.

What's the best way to use MQL to find related songs?


Solution

  • So, the revised question is, given a musical artist, find all other musical artists who share all of the same genres assigned to the first artist.

    MQL doesn't have any operators that work across parts of the query tree, so this can't be done in a single query. But given that you're likely doing this from a programming language, it can be done pretty simply in two steps.

    First, we'll get all genres for our subject artist, sorted by the number of artists that they contain, using this query (the sort isn't strictly necessary):

    [{
      "id": "/m/0mgcr",
      "name": null,
      "/music/artist/genre": [{
        "name": null,
        "id": null,
        "artists": {
          "return": "count"
        },
        "sort": "artists.count"
      }]
    }]
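
    As a rough sketch of driving this first step from Python: the snippet below only builds the query and orders the genres from an already-fetched result, with no network call. The function names are made up for illustration, the sample result is hand-written with invented counts, and it assumes each genre entry comes back with its count as a plain integer under "artists".

```python
import json

def build_genre_query(artist_mid):
    """Build the step-one MQL query: all genres of one artist, each with its artist count."""
    return [{
        "id": artist_mid,
        "name": None,
        "/music/artist/genre": [{
            "name": None,
            "id": None,
            "artists": {"return": "count"},
            "sort": "artists.count",
        }],
    }]

def genres_most_specific_first(result):
    """Return genre ids ordered by ascending artist count.

    Assumption: each genre entry holds its count as an integer under "artists".
    Sorting again here keeps the helper correct even if the query omits "sort".
    """
    genres = result[0]["/music/artist/genre"]
    return [g["id"] for g in sorted(genres, key=lambda g: g["artists"])]

# Hand-written stand-in for a query result; the counts and the placeholder
# genre id are invented for illustration, not real Freebase data.
sample_result = [{
    "id": "/m/0mgcr",
    "name": "The Offspring",
    "/music/artist/genre": [
        {"id": "/en/ska_punk", "name": None, "artists": 300},
        {"id": "/en/placeholder_genre", "name": None, "artists": 50},
        {"id": "/en/melodic_hardcore", "name": None, "artists": 200},
    ],
}]

print(json.dumps(build_genre_query("/m/0mgcr"), indent=2))
print(genres_most_specific_first(sample_result))
```

    The first id in the returned list is the most selective genre, and the rest are candidates for the extra constraints in the second step.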
    

    Then, using the genre with the smallest number of artists for maximum selectivity, we'll add in the other genres to make it even more specific. Here's a version of the query with the artists that match on the three most specific genres (the base genre plus two more):

    [{
      "id": "/m/0mgcr",
      "name": null,
      "/music/artist/genre": [{
        "name": null,
        "id": null,
        "artists": {
          "return": "count"
        },
        "sort": "artists.count",
        "limit": 1,
        "a:artists": [{
          "name": null,
          "id": null,
          "a:genre": {
            "id": "/en/ska_punk"
          },
          "b:genre": {
            "id": "/en/melodic_hardcore"
          }
        }]
      }]
    }]
    

    Which gives us: Authority Zero, Millencolin, Michael John Burkett, NOFX, Bigwig, Huelga de Hambre, Freygolo, The Vandals

    There are two things to note about this query. First, this fragment:

        "sort": "artists.count",
        "limit": 1,
    

    limits our initial genre selection to the single genre with the fewest artists (i.e. Skate Punk). Second, the prefix notation:

          "a:genre": {"id": "/en/ska_punk"},
          "b:genre": {"id": "/en/melodic_hardcore"}
    

    gets around the JSON limitation that an object can't usefully contain more than one key with the same name. The prefixes are ignored and just need to be unique (this is the same reason for the a:artists elsewhere in the query).
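
    If you generate the second query from code, the prefixes can be assigned automatically. The following is a minimal sketch, not part of any Freebase client library: build_related_query is a hypothetical helper, and it simply hands out single-letter prefixes in order. The last lines show why the prefixes are needed, using Python's json module, which keeps only the last value when a key is repeated.

```python
import json
import string

def build_related_query(artist_mid, extra_genre_ids):
    """Build the step-two query, giving each extra genre constraint a unique prefix."""
    if len(extra_genre_ids) > len(string.ascii_lowercase):
        raise ValueError("too many genres for single-letter prefixes")

    # One clause per matching artist; "a:genre", "b:genre", ... all constrain
    # the same property, the prefix only keeps the JSON keys distinct.
    artist_clause = {"name": None, "id": None}
    for prefix, genre_id in zip(string.ascii_lowercase, extra_genre_ids):
        artist_clause[f"{prefix}:genre"] = {"id": genre_id}

    return [{
        "id": artist_mid,
        "name": None,
        "/music/artist/genre": [{
            "name": None,
            "id": None,
            "artists": {"return": "count"},
            "sort": "artists.count",
            "limit": 1,
            "a:artists": [artist_clause],
        }],
    }]

query = build_related_query("/m/0mgcr", ["/en/ska_punk", "/en/melodic_hardcore"])
print(json.dumps(query, indent=2))

# Without prefixes the two constraints share one key, and a JSON parser
# such as Python's keeps only the last one.
collapsed = json.loads(
    '{"genre": {"id": "/en/ska_punk"}, "genre": {"id": "/en/melodic_hardcore"}}'
)
print(collapsed)
```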

    So, having worked through that whole little exercise, I'll close by saying that there are probably better ways of doing this. Instead of an absolute match, you may get better results with a scoring function that looks at % overlap for the most specific genres or some other metric. Things like common band members, collaborations, contemporaneous recording history, etc. could also be factored into your scoring. Of course, this is all beyond the capabilities of raw MQL, and you'd probably want to load the Freebase data for the music domain (or some subset) into a graph database to run these scoring algorithms.

    In point of fact, both last.fm and Google think a better list would include bands like Sum 41, blink-182, Bad Religion, Green Day, etc.
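
    For what it's worth, the % overlap idea mentioned above could be sketched like this. It uses Jaccard similarity (shared genres divided by all genres across both artists), which is just one possible metric and my own choice here. The function names, artist names, and genre labels are all placeholders, not Freebase data.

```python
def genre_overlap(genres_a, genres_b):
    """Jaccard similarity of two genre collections, between 0.0 and 1.0."""
    a, b = set(genres_a), set(genres_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_by_overlap(subject_genres, candidates):
    """Return (artist, score) pairs, highest overlap first.

    `candidates` maps an artist name to that artist's genres.
    """
    scored = [
        (name, genre_overlap(subject_genres, genres))
        for name, genres in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Placeholder data for illustration only.
subject = {"g1", "g2", "g3"}
candidates = {
    "Artist A": {"g1"},
    "Artist B": {"g1", "g2", "g3"},
    "Artist C": {"g1", "g2", "g4"},
}

for name, score in rank_by_overlap(subject, candidates):
    print(f"{name}: {score:.2f}")
```

    Unlike the all-genres-must-match query, this still ranks artists that share only some of the genres, which is closer to how a "related artists" list tends to behave.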