Grab Soundcloud artwork with plain PHP


I want to grab the URL of the artwork associated with a Soundcloud track using plain PHP, without using their API. The HTML page has an og:image meta tag, which fits my needs perfectly.

For example, the meta tag for the track https://soundcloud.com/dengue/sets/nuevos-sonidos looks like this:

<meta property="og:image" content="https://i1.sndcdn.com/artworks-000077991135-u5nvu1-t500x500.jpg">

The problem is that the HTTP request returns a 301 Moved Permanently status code, so the loadHTMLFile method of the DOMDocument class fails with an error.


Solution

  • If you really don't want to use their API (which seems like a bad call, because you don't need to do any auth; it's completely open), there are some easy hacks.

    I'm not getting any redirects from cURL:

    ~ $ curl -v https://soundcloud.com/dengue/sets/nuevos-sonidos
    *   Trying 68.232.44.127...
    * Connected to soundcloud.com (68.232.44.127) port 443 (#0)
    * TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
    * Server certificate: *.soundcloud.com
    * Server certificate: GlobalSign Domain Validation CA - SHA256 - G2
    * Server certificate: GlobalSign Root CA
    > GET /dengue/sets/nuevos-sonidos HTTP/1.1
    > Host: soundcloud.com
    > User-Agent: curl/7.43.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Cache-Control: private, max-age=0
    < Content-Type: text/html
    < Date: Sat, 07 May 2016 03:42:20 GMT
    < Server: am/2
    < Set-Cookie: sc_anonymous_id=363279-961735-991413-425081; path=/; expires=Tue, 05 May 2026 03:42:20 GMT; domain=.soundcloud.com
    < Via: sssr
    < X-Frame-Options: SAMEORIGIN
    < Content-Length: 47003
    <
    

    But if you are, you just have to set this option before you execute the cURL request from PHP:

    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    

    If you're seriously into the hacking business, why don't you just do this:

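    <?php
    
    // Shell out via backticks: curl -L follows redirects, grep keeps the
    // og:image line, and sed extracts the value of its content attribute.
    $url = `curl -L https://soundcloud.com/dengue/sets/nuevos-sonidos 2>/dev/null | grep 'og:image' | sed 's/.*og:image" content="\\([^"]*\\).*/\\1/'`;
    
    echo $url;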
    

    Which prints:

    ~/Code/stack-overflow $ php hack.php
    https://i1.sndcdn.com/artworks-000077991135-u5nvu1-t500x500.jpg
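
For a version that stays inside PHP, the CURLOPT_FOLLOWLOCATION option mentioned above can be combined with DOMDocument. The following is only a minimal sketch: the helper names fetchHtml and extractOgImage are hypothetical, the redirect cap of 5 is an arbitrary choice, and it assumes the curl and dom extensions are enabled and that the page still carries an og:image meta tag.

```php
<?php

// Hypothetical helper: download a page with cURL, following redirects such as a 301.
function fetchHtml(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);         // arbitrary cap to avoid redirect loops
    $body = curl_exec($ch);

    return $body === false ? null : $body;
}

// Hypothetical helper: return the og:image URL found in an HTML string, or null.
function extractOgImage(string $html): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if ($meta->getAttribute('property') === 'og:image') {
            return $meta->getAttribute('content');
        }
    }

    return null;
}
```

A call might look like `$html = fetchHtml($url); echo $html === null ? 'request failed' : extractOgImage($html);`. Fetching the page with cURL first and then passing the string to loadHTML sidesteps loadHTMLFile, which is where the question reports the error.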