I need to unzip an .epub file in Swift so that I can read its contents myself. I know how to parse the contents of an EPUB once I have them (I've written a working example in Python), but SSZipArchive apparently will not unzip .epub files. It does, however, work fine on a dummy .zip file; only .epub is a problem. As far as I can tell, no question on Stack Overflow asks how to do this by hand; the answers simply point people to Objective-C projects that do it for you, with lots of overhead (which I don't understand or need), and that defeats the purpose of what I need to do. Below is my current attempt. The EPUB in question comes from Project Gutenberg: http://www.gutenberg.org/ebooks/158.epub.noimages. When I run this, the print statement emits "true, true, true, false" (that is, the files and paths all exist, but the archive won't unzip):
import Foundation
// Assumes the SSZipArchive library is already available to the project.

class EpubExtractor: NSObject, SSZipArchiveDelegate {
    let fName: String // stored for later use; the paths below are still hard-coded

    init(fileName: String) {
        fName = fileName
        super.init()
    }

    func getEpubInfo() {
        let paths = NSSearchPathForDirectoriesInDomains(NSSearchPathDirectory.DocumentDirectory, NSSearchPathDomainMask.UserDomainMask, true)
        let documentsDir = paths[0]
        let zipPath = documentsDir.stringByAppendingString("/MyZipFiles") // My folder name in the Documents directory
        let fileManager = NSFileManager.defaultManager()
        let success1 = fileManager.fileExistsAtPath(zipPath)
        if !success1 {
            print("no directory")
            do {
                try fileManager.createDirectoryAtPath(zipPath, withIntermediateDirectories: true, attributes: nil)
            } catch {
                // Report the failure instead of crashing, as try! would
                print("could not create directory: \(error)")
            }
        }
        let archivePath = zipPath.stringByAppendingString("/emma.epub") // The EPUB that should be unzipped
        let success2 = fileManager.fileExistsAtPath(archivePath)
        let destPath = zipPath.stringByAppendingString("/Hello") // Folder to unzip into
        let success3 = fileManager.fileExistsAtPath(destPath)
        let worked = SSZipArchive.unzipFileAtPath(archivePath, toDestination: destPath, delegate: self)
        print(success1, success2, success3, worked)
    }
}
EDIT
Below is proof-of-concept code written in Python, in which I CAN get the very same EPUB to be recognized as a zip file and read its container content:
import zipfile
epub_dir = "sampleData/epubs/"
file_name = "emma.epub"
epub_path = epub_dir + file_name
# Check whether the file is a zip archive (this returns True, though in Swift the unzip fails)
print(zipfile.is_zipfile(epub_path))
# Read the container file (this is what I need Swift to be able to do)
with zipfile.ZipFile(epub_path) as archive:
    txt = archive.read('META-INF/container.xml').decode('utf-8')  # assuming the XML is UTF-8
print(txt)  # This successfully prints the container content text
I figured it out after many, many hours of reading. It turns out the solution is extremely simple, if non-obvious.
The "fileName.epub" file needs to be renamed to "fileName.zip". That's it!
After that, either SSZipArchive or Zip will unzip the file into its META-INF folder, mimetype file, and OEBPS folder, all inside a folder called "fileName" (at least as the default name).
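To illustrate the rename-then-unzip idea in the same language as the proof of concept in the question, here is a minimal Python sketch. It is not the Swift code itself: the helper names `build_sample_epub` and `unzip_epub_via_rename` are hypothetical, and the tiny archive it builds is a stand-in so the example runs without downloading anything. Python's `zipfile` does not look at the file extension, so the copy to `.zip` here only mirrors the workaround described above; it makes a copy rather than renaming so the original .epub is left untouched.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def build_sample_epub(path):
    """Write a tiny stand-in EPUB (placeholder content) so the sketch is self-contained."""
    with zipfile.ZipFile(path, "w") as archive:
        # An EPUB stores `mimetype` first and uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", "<container/>")
        archive.writestr("OEBPS/content.opf", "<package/>")


def unzip_epub_via_rename(epub_path):
    """Copy `name.epub` to `name.zip`, then extract it into a sibling folder `name`."""
    epub_path = Path(epub_path)
    zip_path = epub_path.with_suffix(".zip")
    shutil.copyfile(epub_path, zip_path)  # copy, so the original file survives
    dest_dir = epub_path.with_suffix("")  # e.g. .../emma
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return dest_dir


# Usage: build the stand-in archive in a temporary folder and extract it.
work_dir = Path(tempfile.mkdtemp())
sample = work_dir / "emma.epub"
build_sample_epub(sample)
extracted = unzip_epub_via_rename(sample)
top_level = sorted(entry.name for entry in extracted.iterdir())
print(top_level)
```

In Swift the same two steps would be a file copy followed by the unzip call on the `.zip` copy; whether the extension is really what SSZipArchive trips over is the observation reported here, not something this sketch verifies.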
Hope this helps anyone struggling with this. Of course, if there is another way to do this, please let me know in the comments.