I have to read a file character by character in Swift. My approach is to read a chunk from a FileHandle and return the first character of the resulting string.
This is my code so far:
    /// Return next character, or nil on EOF.
    func nextChar() -> Character? {
        precondition(fileHandle != nil, "Attempt to read from closed file")
        if atEof {
            return nil
        }
        if self.stored.characters.count > 0 {
            let c: Character = self.stored.characters.first!
            stored.remove(at: self.stored.startIndex)
            return c
        }
        let tmpData = fileHandle.readData(ofLength: (4096))
        print("\n---- file read ---\n" , terminator: "")
        if tmpData.count == 0 {
            return nil
        }
        self.stored = NSString(data: tmpData, encoding: encoding.rawValue) as String!
        let c: Character = self.stored.characters.first!
        self.stored.remove(at: stored.startIndex)
        return c
    }
My problem is that returning each character is very slow. This is my test implementation:
    if let aStreamReader = StreamReader(path: file) {
        defer {
            aStreamReader.close()
        }
        while let char = aStreamReader.nextChar() {
            print("\(char)", terminator: "")
            continue
        }
    }
Even without the print, it took ages to read the file to the end.
For a 1.4 MB sample file, it took more than six minutes to finish the task:
time ./.build/debug/read a.txt
real 6m22.218s
user 6m13.181s
sys 0m2.998s
Do you have any suggestions for speeding up this part?
    let c: Character = self.stored.characters.first!
    stored.remove(at: self.stored.startIndex)
    return c
Thanks a lot. ps
++++ UPDATED FUNCTION ++++
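    func nextChar() -> Character? {
        //precondition(fileHandle != nil, "Attempt to read from closed file")
        if atEof {
            return nil
        }
        // Serve the next character from the current chunk if any remain.
        if stored_cnt > (stored_idx + 1) {
            stored_idx += 1
            return stored[stored_idx]
        }
        let tmpData = fileHandle.readData(ofLength: chunkSize)
        if tmpData.count == 0 {
            atEof = true
            return nil
        }
        // Decode the chunk once and keep its characters in an array.
        guard let s = NSString(data: tmpData, encoding: encoding.rawValue) as String?, !s.isEmpty else {
            // Decoding failed or produced nothing: stop instead of returning stale data.
            atEof = true
            return nil
        }
        stored = s.characters.map { $0 }
        stored_idx = 0
        stored_cnt = stored.count
        return stored[0]
    }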
Your implementation of nextChar is terribly inefficient.
You create a String and then call characters over and over, and you update that set of characters over and over.
Why not create the String and then only store a reference to its characters? Then track an index into characters. Instead of updating it over and over, simply increment the index and return the next character. No need to update the string over and over.
Once you get to the last character, read the next piece of the file. Create a new string, reset the characters and the index.
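The idea above could look roughly like the following. This is only a minimal sketch, not the original StreamReader: the names ChunkedCharReader, refill and pending are made up for illustration, it assumes the file is valid UTF-8, and it uses Array(s), which needs Swift 4 or later (in Swift 3 that would be Array(s.characters)). Because a chunk can end in the middle of a multi-byte UTF-8 sequence, up to three trailing bytes are held back and decoded together with the next chunk. A grapheme cluster that straddles a chunk boundary (for example "\r\n") would still come back as two separate characters.

```swift
import Foundation

/// Hypothetical reader: decodes each chunk once, then hands out characters by index.
final class ChunkedCharReader {
    private let fileHandle: FileHandle
    private let chunkSize: Int
    private var chars: [Character] = []   // characters of the current chunk
    private var index = 0                 // position of the next character in `chars`
    private var pending = Data()          // bytes read but not yet decoded
    private var atEof = false

    init?(path: String, chunkSize: Int = 4096) {
        guard let handle = FileHandle(forReadingAtPath: path) else { return nil }
        self.fileHandle = handle
        self.chunkSize = chunkSize
    }

    deinit { fileHandle.closeFile() }

    /// Returns the next character, or nil on EOF.
    func nextChar() -> Character? {
        // Only touch the file when the current chunk is used up.
        while index == chars.count {
            guard refill() else { return nil }
        }
        let c = chars[index]
        index += 1   // no string mutation, just advance the index
        return c
    }

    /// Reads and decodes the next chunk. Returns false when nothing more can be decoded.
    private func refill() -> Bool {
        while !atEof {
            let data = fileHandle.readData(ofLength: chunkSize)
            if data.isEmpty { atEof = true }
            pending.append(data)
            if pending.isEmpty { break }
            // A UTF-8 sequence is at most 4 bytes long, so at most 3 trailing
            // bytes can belong to a character that continues in the next chunk.
            for cut in 0...min(3, pending.count - 1) {
                let head = pending.prefix(pending.count - cut)
                if let s = String(data: head, encoding: .utf8) {
                    chars = Array(s)
                    index = 0
                    pending = Data(pending.suffix(cut))   // keep the undecoded tail
                    return true
                }
            }
            // Nothing decodable yet: loop to read more bytes, or stop at EOF.
        }
        return false
    }
}
```

With this shape, each call to nextChar is an array lookup plus an increment, and the file is only touched once per chunk. Usage would be the same loop as in the question: while let c = reader.nextChar() { ... }.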