Tags: json, swift, html-entities

How do I decode HTML entities in Swift?


I am pulling a JSON file from a site and one of the strings received is:

The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi

How can I convert things like &#8216; into the correct characters?

I've made an Xcode Playground to demonstrate it:

import UIKit

var error: NSError?
let blogUrl: NSURL = NSURL.URLWithString("http://sophisticatedignorance.net/api/get_recent_summary/")
let jsonData = NSData(contentsOfURL: blogUrl)

let dataDictionary = NSJSONSerialization.JSONObjectWithData(jsonData, options: nil, error: &error) as NSDictionary

var a = dataDictionary["posts"] as NSArray

println(a[0]["title"])

Solution

  • This answer was last revised for Swift 5.2 and the iOS 13.4 SDK.


    There's no straightforward way to do that, but NSAttributedString's HTML import can do the decoding for you and keeps the process fairly painless (be warned that this approach strips all HTML tags as well).

    Remember to initialize NSAttributedString from the main thread only. Its HTML importer uses WebKit underneath, hence the requirement.
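
One way to honor the main-thread requirement when the JSON arrives on a background queue is a small hop-to-main helper. This is only a sketch: `runOnMainThread` is a hypothetical name, not an SDK function, while `Thread.isMainThread` and `DispatchQueue.main.sync` are the real Foundation and Dispatch calls it relies on.

```swift
import Foundation

/// Hypothetical helper: runs `work` on the main thread and returns its result.
/// Caution: if the main thread is itself blocked waiting on the calling
/// thread, the `sync` call below deadlocks.
func runOnMainThread<T>(_ work: () -> T) -> T {
    if Thread.isMainThread {
        // Already on the main thread; calling main.sync here would deadlock.
        return work()
    }
    return DispatchQueue.main.sync(execute: work)
}

// Stand-in workload; in an app the closure would build the NSAttributedString.
let length = runOnMainThread { "The Weeknd".count }
```

If blocking the caller is not acceptable, `DispatchQueue.main.async` with a completion closure is the usual alternative.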

    // This is a[0]["title"] in your case
    let htmlEncodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"
    
    guard let data = htmlEncodedString.data(using: .utf8) else {
        return
    }
    
    let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
        .documentType: NSAttributedString.DocumentType.html,
        .characterEncoding: String.Encoding.utf8.rawValue
    ]
    
    guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
        return
    }
    
    // The Weeknd ‘King Of The Fall’
    let decodedString = attributedString.string

    The same conversion can be wrapped in a failable String initializer for reuse:

    extension String {
    
        init?(htmlEncodedString: String) {
    
            guard let data = htmlEncodedString.data(using: .utf8) else {
                return nil
            }
    
            let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ]
    
            guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
                return nil
            }
    
            self.init(attributedString.string)
    
        }
    
    }
    
    let encodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"
    // The initializer is failable, so decodedString is a String?
    let decodedString = String(htmlEncodedString: encodedString)
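
If stripping tags or the main-thread restriction is a problem, a different technique is to decode the entities by hand. The following is a minimal sketch of that alternative, not the NSAttributedString approach above: the function names are hypothetical, and it only handles numeric references (`&#8216;`, `&#x2018;`) plus five named entities, which is an assumption about what the feed contains.

```swift
import Foundation

// Hypothetical helpers; neither function is a Foundation API.

/// Maps the text between "&" and ";" to a character, or nil if unrecognized.
func character(forEntityBody body: Substring) -> Character? {
    // Deliberately tiny table; the full HTML named-entity list is much longer.
    let named: [String: Character] = [
        "amp": "&", "lt": "<", "gt": ">", "quot": "\"", "apos": "'"
    ]
    if let known = named[String(body)] { return known }

    // Numeric references look like "#8216" (decimal) or "#x2018" (hex).
    guard body.hasPrefix("#") else { return nil }
    let isHex = body.hasPrefix("#x") || body.hasPrefix("#X")
    let digits = body.dropFirst(isHex ? 2 : 1)
    guard let code = UInt32(digits, radix: isHex ? 16 : 10),
          let scalar = Unicode.Scalar(code) else { return nil }
    return Character(scalar)
}

/// Replaces recognized entities and leaves everything else, tags included, as-is.
func decodingBasicHTMLEntities(in input: String) -> String {
    var result = ""
    var rest = Substring(input)

    while let amp = rest.firstIndex(of: "&") {
        // Text before the "&" is copied unchanged.
        result.append(contentsOf: rest[..<amp])
        rest = rest[amp...]

        // Only look a short distance ahead for the closing ";".
        if let semi = rest.prefix(12).firstIndex(of: ";"),
           let decoded = character(forEntityBody: rest[rest.index(after: amp)..<semi]) {
            result.append(decoded)
            rest = rest[rest.index(after: semi)...]
        } else {
            // Not a recognized entity; keep the "&" literally and move on.
            result.append("&")
            rest = rest.dropFirst()
        }
    }
    result.append(contentsOf: rest)
    return result
}

let title = decodingBasicHTMLEntities(in: "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>")
print(title) // The Weeknd <em>‘King Of The Fall’</em>
```

Unlike the attributed-string route, this keeps the `<em>` tags in place and has no threading constraints, at the cost of not knowing entities such as `&nbsp;` unless they are added to the table.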