I am trying to retrieve the metadata for a given link (URL). I have implemented the following steps:
$url = "url is here";
$html = file_get_contents($url);
$crawler = new Crawler($html); // Symfony library
$description = $crawler->filterXPath("//meta[@name='description']")->extract(['content']);
With this, I manage to retrieve the metadata for some URLs but not for all of them. For some URLs, file_get_contents($url) returns special characters like (x1F‹\x08\x00\x00\x00\x00\x00\x04\x03ì½}{ãÆ‘/ú÷øSÀœ'\x1E)! ‘z§¬qlÇI..........), which is why I cannot retrieve the metadata.
Note that I am using the same website for all $url values and only passing different slugs (different blog URLs like https://www.example.com/blog-1).
Attempts:
Any thoughts on why file_get_contents sometimes returns special characters and sometimes returns correctly formatted HTML?
I have solved the issue by appending the following URL-encoded query parameters to the URL passed to file_get_contents:
private const EMBED_URL_APPEND = '?tab=3&object=%s&type=subgroup';
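private const EMBED_URL_ENCODE = 'CM_949A11_1534_1603_DAG_DST_50_ÖVRIGT_1_1';
// Encode only the parameter value; $url stays out of the sprintf format string so a literal % in it is not misread.
$urlEncoded = $url . sprintf(self::EMBED_URL_APPEND, rawurlencode(self::EMBED_URL_ENCODE));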
$html = file_get_contents($urlEncoded);
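As a side note, the bytes quoted in the question start with 0x1F 0x8B 0x08, which is the gzip signature, so those responses look like gzip-compressed HTML rather than random characters. If that is what is happening, decoding the body before handing it to the Crawler may also help. Below is a minimal sketch, assuming PHP's zlib extension is available; decodeIfGzipped is a hypothetical helper name that is not part of the original code, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): returns the body
// unchanged unless it starts with the gzip magic bytes 0x1F 0x8B.
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        // gzdecode() comes from the zlib extension and returns false on failure.
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Illustration without any network access: compress sample markup,
// then check that the helper restores it.
$sample = '<meta name="description" content="Example description">';
$compressed = gzencode($sample);
echo decodeIfGzipped($compressed), PHP_EOL;
```

In the code from the question, this would be used as $html = decodeIfGzipped(file_get_contents($url)); before creating the Crawler.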