I want to grab the URL of the artwork associated with a SoundCloud track using plain PHP, without using their API. The HTML page has an og:image
meta tag property that fits my needs perfectly.
For example, the meta property of the track https://soundcloud.com/dengue/sets/nuevos-sonidos looks like this:
<meta property="og:image" content="https://i1.sndcdn.com/artworks-000077991135-u5nvu1-t500x500.jpg">
The problem is that the HTTP request returns a 301 Moved Permanently
status code, so the loadHTMLFile function of the DOMDocument class
fails with an error.
If you really don't want to use their API (which seems like a bad call, because you don't need to do ANY auth; it's completely open), you can do some easy hacks.
I'm not getting any redirects from cURL:
~ $ curl -v https://soundcloud.com/dengue/sets/nuevos-sonidos
* Trying 68.232.44.127...
* Connected to soundcloud.com (68.232.44.127) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
* Server certificate: *.soundcloud.com
* Server certificate: GlobalSign Domain Validation CA - SHA256 - G2
* Server certificate: GlobalSign Root CA
> GET /dengue/sets/nuevos-sonidos HTTP/1.1
> Host: soundcloud.com
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: private, max-age=0
< Content-Type: text/html
< Date: Sat, 07 May 2016 03:42:20 GMT
< Server: am/2
< Set-Cookie: sc_anonymous_id=363279-961735-991413-425081; path=/; expires=Tue, 05 May 2026 03:42:20 GMT; domain=.soundcloud.com
< Via: sssr
< X-Frame-Options: SAMEORIGIN
< Content-Length: 47003
<
But if you are, you just have to set this option before you execute the cURL request from PHP:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
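To put that option in context, here is a minimal sketch of the whole round trip: fetch the page with cURL (following redirects), then hand the HTML string to DOMDocument::loadHTML instead of loadHTMLFile. The helper names fetchHtml and extractOgImage are made up for illustration, and the redirect limit and timeout are arbitrary assumptions, not anything SoundCloud requires.

```php
<?php
// Hypothetical helper: download a page body, following 301/302 redirects.
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);         // assumed limit, to avoid redirect loops
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed timeout in seconds
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $html;
}

// Hypothetical helper: return the og:image URL found in an HTML string, or null.
function extractOgImage(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of emitting them
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//meta[@property="og:image"]/@content');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

// Usage (performs a network request, so it is left commented out):
// echo extractOgImage(fetchHtml('https://soundcloud.com/dengue/sets/nuevos-sonidos'));
```

Keeping the download and the parsing separate means the parsing part can be tried on any saved HTML without touching the network.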
If you're seriously into the hacking business, why don't you just do this:
<?php
$url = `curl -L https://soundcloud.com/dengue/sets/nuevos-sonidos 2>/dev/null | grep 'og:image' | sed 's/.*og:image" content="\\([^"]*\\).*/\\1/'`;
echo $url;
Which prints this:
~/Code/stack-overflow $ php hack.php
https://i1.sndcdn.com/artworks-000077991135-u5nvu1-t500x500.jpg
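If shelling out isn't an option, the same grep/sed idea can be sketched in pure PHP with preg_match. The function name ogImageFromHtml is hypothetical, and the pattern is just as brittle as the sed expression: it assumes the attributes appear in exactly the order and quoting shown in the question's meta tag.

```php
<?php
// Hypothetical helper: regex equivalent of the sed expression above.
// Assumes the tag is written as: property="og:image" content="..."
function ogImageFromHtml(string $html): ?string
{
    $pattern = '/og:image" content="([^"]*)"/';
    // preg_match returns 1 on a match; $m[1] holds the captured URL.
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Example with the tag from the question:
$sample = '<meta property="og:image" content="https://i1.sndcdn.com/artworks-000077991135-u5nvu1-t500x500.jpg">';
echo ogImageFromHtml($sample), "\n";
```

It still needs the page HTML as a string, for example from a cURL request with CURLOPT_FOLLOWLOCATION enabled.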