How to get all characters of the font with CTFontCopyCharacterSet() in Swift?


How does one get all characters of a font with CTFontCopyCharacterSet() in Swift on macOS?

The issue occurred when implementing, in Swift, the approach from an answer to "OSX: CGGlyph to UniChar".

func createUnicodeFontMap() {
    // Get all characters of the font with CTFontCopyCharacterSet().
    let cfCharacterSet: CFCharacterSet = CTFontCopyCharacterSet(ctFont)

    // Debug output: print the character set's description.
    print("CFCharacterSet: \(cfCharacterSet)")

    // Map all Unicode characters to corresponding glyphs
    var unichars = [UniChar](…NYI…) // NYI: lacking unichars for CFCharacterSet
    var glyphs = [CGGlyph](repeating: 0, count: unichars.count)
    guard CTFontGetGlyphsForCharacters(
        ctFont, // font: CTFont
        &unichars, // characters: UnsafePointer<UniChar>
        &glyphs, // UnsafeMutablePointer<CGGlyph>
        unichars.count // count: CFIndex
        )
        else {
            return
    }

    // For each Unicode character and its glyph, 
    // store the mapping glyph -> Unicode in a dictionary.
    // ... NYI
}

How to retrieve the actual characters from the CFCharacterSet has been elusive. Autocompletion on the cfCharacterSet instance offers no relevant methods.

And the Core Foundation > CFCharacterSet documentation appears to have functions for creating another CFCharacterSet, but nothing that provides an array, list, or string of unichars from which to create a mapped dictionary.


Note: I'm looking for a solution that is not specific to iOS, unlike "Get all available characters from a font", which uses UIFont.


Solution

  • CFCharacterSet is toll-free bridged with the Cocoa Foundation counterpart NSCharacterSet, and can be bridged to the corresponding Swift value type CharacterSet:

    let charset = CTFontCopyCharacterSet(ctFont) as CharacterSet
    

    Then the approach from NSArray from NSCharacterSet can be used to enumerate all Unicode scalar values of that character set (including non-BMP points, i.e. Unicode scalar values greater than U+FFFF).
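
To see the enumeration step on its own, here is a minimal sketch that walks the 17 Unicode planes of a plain CharacterSet. The helper name `scalars(in:)` and the hand-built sample set are assumptions for illustration; no font is involved, so it can be tried without Core Text.

```swift
import Foundation

/// Collects every Unicode scalar contained in `charset`, plane by plane.
/// (Hypothetical helper, not part of Foundation.)
func scalars(in charset: CharacterSet) -> [Unicode.Scalar] {
    var result = [Unicode.Scalar]()
    // Skip planes that contain no members at all.
    for plane: UInt8 in 0...16 where charset.hasMember(inPlane: plane) {
        let start = UInt32(plane) << 16
        for value in start ..< start + 0x10000 {
            // Unicode.Scalar(_:) returns nil for surrogate code points,
            // so those are filtered out here.
            if let scalar = Unicode.Scalar(value), charset.contains(scalar) {
                result.append(scalar)
            }
        }
    }
    return result
}

// A tiny sample set: three BMP characters plus one scalar from plane 1.
let emoji: Unicode.Scalar = "\u{1F600}"
var sample = CharacterSet(charactersIn: "abc")
sample.insert(emoji)

let found = scalars(in: sample)
print(found.map { "U+" + String($0.value, radix: 16, uppercase: true) })
```

The plane check is only an optimization: it avoids testing 65,536 code points for planes the set does not touch.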

    CTFontGetGlyphsForCharacters() expects non-BMP characters as surrogate pairs, i.e. as an array of UTF-16 code units.
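
As a small illustration of the surrogate-pair point, the sketch below converts one BMP scalar and one non-BMP scalar to UTF-16 code units using the standard library's `utf16` view on `Unicode.Scalar`. The two sample scalars are arbitrary choices.

```swift
let bmp: Unicode.Scalar = "A"             // U+0041, inside the BMP
let nonBMP: Unicode.Scalar = "\u{1F600}"  // U+1F600, outside the BMP

let bmpUnits = Array(bmp.utf16)        // a single UTF-16 code unit
let nonBMPUnits = Array(nonBMP.utf16)  // high surrogate, then low surrogate

print(bmpUnits.map { String($0, radix: 16) })     // prints ["41"]
print(nonBMPUnits.map { String($0, radix: 16) })  // prints ["d83d", "de00"]
```

This is why the glyph buffer passed to CTFontGetGlyphsForCharacters() has to be sized by the number of UTF-16 code units rather than assumed to be 1.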

    Putting it together, the function would look like this:

    import CoreText
    import Foundation

    func createUnicodeFontMap(ctFont: CTFont) -> [CGGlyph : UnicodeScalar] {
    
        let charset = CTFontCopyCharacterSet(ctFont) as CharacterSet
    
        var glyphToUnicode = [CGGlyph : UnicodeScalar]() // Start with empty map.
    
        // Enumerate all Unicode scalar values from the character set:
        for plane: UInt8 in 0...16 where charset.hasMember(inPlane: plane) {
            for unicode in UTF32Char(plane) << 16 ..< UTF32Char(plane + 1) << 16 {
                if let uniChar = UnicodeScalar(unicode), charset.contains(uniChar) {
    
                    // Get glyph for this `uniChar` ...
                    let utf16 = Array(uniChar.utf16)
                    var glyphs = [CGGlyph](repeating: 0, count: utf16.count)
                    if CTFontGetGlyphsForCharacters(ctFont, utf16, &glyphs, utf16.count) {
                        // ... and add it to the map.
                        glyphToUnicode[glyphs[0]] = uniChar
                    }
                }
            }
        }
    
        return glyphToUnicode
    }
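
One design point worth noting: the dictionary keeps a single scalar per glyph, so if two scalars happen to map to the same glyph, the later assignment overwrites the earlier one. If that matters, a variant could collect every scalar per glyph. The following is only a sketch under that assumption; the function name `createGroupedUnicodeFontMap` and the "Helvetica" font name are hypothetical choices for illustration, not part of the original answer.

```swift
import CoreText
import Foundation

// Hypothetical variant: keeps every scalar that maps to a glyph,
// instead of only the last one encountered.
func createGroupedUnicodeFontMap(ctFont: CTFont) -> [CGGlyph: [Unicode.Scalar]] {
    let charset = CTFontCopyCharacterSet(ctFont) as CharacterSet
    var glyphToScalars = [CGGlyph: [Unicode.Scalar]]()

    for plane: UInt8 in 0...16 where charset.hasMember(inPlane: plane) {
        let start = UInt32(plane) << 16
        for value in start ..< start + 0x10000 {
            guard let scalar = Unicode.Scalar(value),
                  charset.contains(scalar) else { continue }

            // One code unit for BMP scalars, a surrogate pair otherwise.
            let utf16 = Array(scalar.utf16)
            var glyphs = [CGGlyph](repeating: 0, count: utf16.count)
            if CTFontGetGlyphsForCharacters(ctFont, utf16, &glyphs, utf16.count) {
                glyphToScalars[glyphs[0], default: []].append(scalar)
            }
        }
    }
    return glyphToScalars
}

// Usage: "Helvetica" is only an example font name.
let font = CTFontCreateWithName("Helvetica" as CFString, 12, nil)
let grouped = createGroupedUnicodeFontMap(ctFont: font)
let shared = grouped.filter { $0.value.count > 1 }
print("\(grouped.count) glyphs, \(shared.count) shared by several scalars")
```

Whether a given font actually shares glyphs between scalars depends on the font, so the printed counts will vary.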