This question is related to "Character encoding Microsoft.XmlHttp in Vbscript", but differs in one respect: the national characters are in the domain name, not only in the arguments.
The task is: download a page from the given URL.
I have already solved the problem of passing a UTF-8 string into VBScript by reading it from a UTF-8 encoded file through ADO.
But now, when I try to open that URL, MSXML2.ServerXMLHTTP returns the error: The URL is invalid.
Here is the VBScript code:
Set objStream = CreateObject("ADODB.Stream")
objStream.CharSet = "utf-8"
objStream.Open
objStream.LoadFromFile("fileWithURL.txt")
url = objStream.ReadText()
objStream.Close
Set XMLHttpReq = CreateObject("MSXML2.ServerXMLHTTP")
XMLHttpReq.Open "GET", url, False
XMLHttpReq.send
WEBPAGE = XMLHttpReq.responseText
If you put something like hxxp://россия.рф/main/page5.html into the UTF-8 encoded fileWithURL.txt, the script raises an error, while it works fine with hxxp://google.com.
The workaround is to use the ASCII representation of the domain name, but I haven't yet found a Punycode encoder for VBScript (apart from Chilkat, which is overkill for my task).
I would appreciate your help with either the main problem or the workaround.
I made an amazing journey into the depths of my hard drive and found code written by / for Jesper Høy. It was the source code of SimpleDNS Plus' IDN Conversion Tool at that time.
Archive.org page snapshot: http://www.simpledns.com/idn-convert.asp
Archive.org file snapshot: idn-convert-asp.zip
You can also copy the whole code from this gist.
Create a function to convert URLs.
Function DummyPuny(ByVal url)
    Dim rSegments : rSegments = Split(url, "/")
    If UBound(rSegments) > 1 Then
        rSegments(2) = DomainPunyEncode(rSegments(2))
    End If
    DummyPuny = Join(rSegments, "/")
End Function
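To see why index 2 is the host, here is a small standalone sketch of how Split breaks up an absolute URL. The example.com URL is only an illustrative value, and the script assumes it is run with cscript so that WScript.Echo prints to the console.

```vbscript
Option Explicit

Dim parts, i
' Illustrative URL only; any absolute http(s) URL splits the same way.
parts = Split("http://example.com/main/page5.html", "/")

For i = 0 To UBound(parts)
    WScript.Echo i & ": [" & parts(i) & "]"
Next
' 0: [http:]
' 1: []
' 2: [example.com]
' 3: [main]
' 4: [page5.html]
```

The empty element at index 1 comes from the double slash after the scheme. This also means the function assumes the URL carries a scheme: a string like example.com/a/b would have its third segment (b) treated as the host.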
Then convert your URL before making the request.
XMLHttpReq.Open "GET", DummyPuny(url), False
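One caveat: the third segment is the whole authority part of the URL, so with an explicit port it also contains something like :8080. I have not checked how DomainPunyEncode treats that, so the following is only a sketch of separating the host from the port first. SplitAuthority is a hypothetical helper name, not part of the IDN tool's code, and it deliberately ignores user:password@ prefixes and IPv6 literals.

```vbscript
Option Explicit

' Hypothetical helper: splits "host:port" into a two-element array (host, ":port").
' The second element is an empty string when no port is present.
Function SplitAuthority(ByVal authority)
    Dim pos : pos = InStrRev(authority, ":")
    If pos > 0 Then
        SplitAuthority = Array(Left(authority, pos - 1), Mid(authority, pos))
    Else
        SplitAuthority = Array(authority, "")
    End If
End Function

Dim hp
hp = SplitAuthority("example.com:8080")
WScript.Echo hp(0)   ' host part, the only piece that should go to the encoder
WScript.Echo hp(1)   ' port suffix including the colon, appended back unchanged

' Inside DummyPuny this would look roughly like:
'   hp = SplitAuthority(rSegments(2))
'   rSegments(2) = DomainPunyEncode(hp(0)) & hp(1)
```

An ASCII host is used in the literal on purpose, since non-ASCII string literals in a .vbs file depend on the file's encoding, which is why the URL is read from a UTF-8 file in the first place.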