html, xcode, utf-8, swift4, arabic

Getting an error when trying to decode an HTML page as UTF-8 using Swift 4


I use this code to get HTML content, and it works for most sites. It fails for the site below (which I need!), and I don't know why. The code gives me:

Error: Error Domain=NSCocoaErrorDomain Code=261 "The file “d-0002.htm” couldn’t be opened using text encoding Unicode (UTF-8)." UserInfo={NSURL=http://www.mktbtk.com/dir/nab/2/d-0002.htm, NSStringEncoding=4}

let myURLString = "http://www.mktbtk.com/dir/nab/2/d-0002.htm"

guard let myURL = URL(string: myURLString) else {
    print("Error: \(myURLString) doesn't seem to be a valid URL")
    return
}

do {
    let myHTMLString = try String(contentsOf: myURL, encoding: .utf8)
    print("HTML : \(myHTMLString)")
} catch let error {
    print("Error: \(error)")
}

Note: it works when I use ASCII encoding, but the content is in Arabic, so I need UTF-8. Can anyone help?


Solution

  • The page you have shown responds with this header:

    Content-Type: text/html; charset=windows-1256

    It's not in UTF-8, but in Windows-1256.

    String.Encoding has no built-in constant for Windows-1256, so first define one:

    extension String.Encoding {
        static let windows1256 = String.Encoding(rawValue:
            CFStringConvertEncodingToNSStringEncoding(
                CFStringEncoding(CFStringEncodings.windowsArabic.rawValue)
            )
        )
    }
    

    Then use .windows1256 instead of .utf8:

    let myURLString = "http://www.mktbtk.com/dir/nab/2/d-0002.htm"
    
    guard let myURL = URL(string: myURLString) else {
        print("Error: \(myURLString) doesn't seem to be a valid URL")
        return
    }
    
    do {
        let myHTMLString = try String(contentsOf: myURL, encoding: .windows1256) //<- not .utf8
        print("HTML : \(myHTMLString)")
    } catch let error {
        print("Error: \(error)")
    }
    

    I do not read Arabic, so I'm not sure this really is the right solution. But I believe it's worth trying.
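
    One way to check the encoding without touching the network is to decode a few known bytes in memory. The sketch below assumes an Apple platform (it relies on the same CoreFoundation constants as the extension above) and uses a hypothetical sample: the word "مرحبا" written as Windows-1256 bytes, one byte per letter.

    ```swift
    import Foundation

    // Same extension as above, repeated so the snippet is self-contained.
    extension String.Encoding {
        static let windows1256 = String.Encoding(rawValue:
            CFStringConvertEncodingToNSStringEncoding(
                CFStringEncoding(CFStringEncodings.windowsArabic.rawValue)
            )
        )
    }

    // Hypothetical sample data: "مرحبا" encoded as Windows-1256.
    let sampleBytes = Data([0xE3, 0xD1, 0xCD, 0xC8, 0xC7])

    // 0xE3 starts a three-byte UTF-8 sequence, but 0xD1 is not a valid
    // continuation byte, so UTF-8 decoding fails and returns nil.
    let asUTF8 = String(data: sampleBytes, encoding: .utf8)

    // The same bytes decoded as Windows-1256.
    let asWindows1256 = String(data: sampleBytes, encoding: .windows1256)

    print(asUTF8 ?? "nil")
    print(asWindows1256 ?? "nil")
    ```

    If the second line prints readable Arabic while the first prints nil, that mirrors the error in the question: the bytes are fine, they are just not UTF-8.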


    By the way, you should not call String.init(contentsOf:encoding:) on the main thread. It is a synchronous call that may block the main thread while the page loads, and that may get your app rejected.
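
    A minimal sketch of moving the download off the main thread with URLSession, assuming an Apple platform. The helpers stringEncoding(forCharset:) and fetchHTML(from:completion:) are hypothetical names, not Foundation API. Falling back to UTF-8 when the server declares no charset (or an unknown one) is also an assumption you may want to change.

    ```swift
    import Foundation

    /// Maps an IANA charset name (e.g. "windows-1256") to a String.Encoding.
    /// Hypothetical helper: falls back to UTF-8 if the name is missing or unknown.
    func stringEncoding(forCharset name: String?) -> String.Encoding {
        guard let name = name else { return .utf8 }
        let cfEncoding = CFStringConvertIANACharSetNameToEncoding(name as CFString)
        guard cfEncoding != kCFStringEncodingInvalidId else { return .utf8 }
        return String.Encoding(
            rawValue: CFStringConvertEncodingToNSStringEncoding(cfEncoding)
        )
    }

    /// Hypothetical helper: downloads the page asynchronously and decodes it
    /// with the charset taken from the response's Content-Type header.
    /// The completion handler runs on a URLSession background queue.
    func fetchHTML(from url: URL, completion: @escaping (String?, Error?) -> Void) {
        let task = URLSession.shared.dataTask(with: url) { data, response, error in
            if let error = error {
                completion(nil, error)
                return
            }
            let encoding = stringEncoding(forCharset: response?.textEncodingName)
            guard let data = data,
                  let html = String(data: data, encoding: encoding) else {
                // Same error code (261) as in the question.
                completion(nil, CocoaError(.fileReadInapplicableStringEncoding))
                return
            }
            completion(html, nil)
        }
        task.resume()
    }

    // Usage (not executed here, since it needs network access):
    // if let url = URL(string: "http://www.mktbtk.com/dir/nab/2/d-0002.htm") {
    //     fetchHTML(from: url) { html, error in
    //         DispatchQueue.main.async { print(html ?? "Error: \(String(describing: error))") }
    //     }
    // }
    ```

    Reading the charset from the response means the same code can handle pages that really are UTF-8 as well as this Windows-1256 one. Hop back to the main queue before updating any UI from the completion handler.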