
Converting Docx Files To Text In Swift


I have a .docx file in my temporary storage:

    let location: NSURL = NSURL.fileURLWithPath(NSTemporaryDirectory())
    let file_Name = location.URLByAppendingPathComponent("5 November 2016.docx")

What I now want to do is extract the text inside this document. But I cannot seem to find any converters or methods of doing this.

I have tried this:

    let file_Content = try? NSString(contentsOfFile: String(file_Name), encoding: NSUTF8StringEncoding)
    print(file_Content)

However it prints nil.

So how do I read the text in a docx file?


Solution

  • Your initial issue is with how you get the file path from the URL. String(file_Name) is not the correct way to convert a file URL into a file path: it gives you a textual description of the URL (including the file:// scheme), not a path. The proper way is to use the URL's path property.

    let location = NSURL.fileURLWithPath(NSTemporaryDirectory())
    let fileURL = location.URLByAppendingPathComponent("My File.docx")
    let fileContent = try? NSString(contentsOfFile: fileURL.path, encoding: NSUTF8StringEncoding)
    

    Note the other changes as well: the variables follow Swift naming conventions (lowerCamelCase, no underscores) and have clearer names.

    Now here's the thing: this still won't work, because a .docx file is a zipped collection of XML and other files, so it cannot be decoded directly into an NSString. You would need to load the raw bytes with NSData, unzip the archive, and then parse the XML parts to find the text you want (the main body normally lives in word/document.xml). That is far from trivial and well beyond the scope of a single Stack Overflow post.
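
To give a rough idea of the last step only, here is a minimal sketch that pulls plain text out of the XML once the archive has already been unzipped. It is written in current Swift syntax rather than the Swift 2 style used above. It assumes you have obtained the bytes of word/document.xml some other way (Foundation has no public unzip API that I know of, so that usually means a third-party zip library), and that the document uses the conventional `w:` namespace prefix. `DocxTextCollector` and `plainText(fromDocumentXML:)` are hypothetical names made up for this illustration, not part of any library.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

/// Collects the characters inside every <w:t> run and appends a newline
/// at the end of every <w:p> paragraph. Assumes the usual "w:" prefix.
final class DocxTextCollector: NSObject, XMLParserDelegate {
    private(set) var text = ""
    private var insideTextRun = false

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String]) {
        if elementName == "w:t" { insideTextRun = true }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Only keep characters that belong to a text run.
        if insideTextRun { text += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "w:t" { insideTextRun = false }
        if elementName == "w:p" { text += "\n" }
    }
}

/// Hypothetical helper: takes the already-unzipped contents of
/// word/document.xml and returns its plain text, or nil if parsing fails.
func plainText(fromDocumentXML data: Data) -> String? {
    let collector = DocxTextCollector()
    let parser = XMLParser(data: data)
    parser.delegate = collector
    return parser.parse() ? collector.text : nil
}

// Usage with a tiny hand-written stand-in for word/document.xml.
let sample = """
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body><w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p></w:body>
</w:document>
"""
print(plainText(fromDocumentXML: Data(sample.utf8)) ?? "parse failed")
```

This ignores tabs, line breaks, tables, headers, footers and footnotes, which are stored in other elements or other files in the archive, so treat it as a starting point rather than a complete converter.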