I'm trying to read text from a file and put it in a TextView. FileInputStream has read_bytes, and TextBuffer has set_text, which takes a ustring, but there seems to be no way to go from one to the other.
Among InputStream's child classes I found DataInputStream, which does have read_line_utf8 and gives you a std::string (better than nothing), but DataInputStream sits on a separate branch of the class hierarchy from FileInputStream.
Of course, it is theoretically possible to loop over the array of bytes returned by read_bytes and turn them into characters, but I refuse to believe there is no ready-made function that I'm overlooking.
Ultimately I'm looking for a function that takes a Glib::RefPtr<Glib::Bytes> and returns a Glib::ustring.
OK, after searching far and wide, I could not find a ready-made way to do this within the confines of the gtkmm library. This seems pretty strange to me, but there it is.
So here is how to read the file with the normal tools, convert what you've read, and display it in the TextView:
I assume here that you have already opened the file chooser dialog and connected everything that needs connecting. If you have a Controller class, you will end up with something along the lines of:
    fh = dialog->get_file();
    fh->read_async(sigc::mem_fun(*this, &Controller::on_file_read_complete));
Make sure that Glib::RefPtr<Gio::File> fh; is a private data member and not a local variable, because the callback needs it again. You will then need a function on_file_read_complete:
    void Controller::on_file_read_complete(Glib::RefPtr<Gio::AsyncResult>& res)
    {
        // Finish the asynchronous open; this yields the input stream for the file.
        Glib::RefPtr<Gio::InputStream> fin = fh->read_finish(res);
        // Read up to 8192 bytes; the empty Cancellable means the read cannot be cancelled.
        Glib::RefPtr<Glib::Bytes> fbytes = fin->read_bytes(8192, Glib::RefPtr<Gio::Cancellable>());
        Glib::ustring str = bytesToUstring(fbytes);
        Gtk::TextView *textview = NULL;
        refGlade->get_widget("textviewUser", textview);
        assert(textview != NULL);
        textview->get_buffer()->set_text(str);
    }
This function is called once the file has been opened for reading, at which point you can safely talk to the FileInputStream. Use read_bytes, which is inherited from the parent class InputStream. Here I ask for 8192 bytes; you can ask for more, and the call may return fewer bytes than requested. A Cancellable reference must be provided, but it can be empty, as is the case above. Now the tricky part: take the Glib::RefPtr<Glib::Bytes> and convert it with a function that had to be written for the purpose:
    // Needs <iostream> for std::cerr, plus the glibmm headers for Glib::Bytes and Glib::ustring.
    Glib::ustring bytesToUstring(Glib::RefPtr<Glib::Bytes> data)
    {
        Glib::ustring result = "";
        gsize s = 0;
        // Get the raw byte pointer and the length from the underlying GBytes.
        const unsigned char* d = static_cast<const unsigned char*>(g_bytes_get_data(data->gobj(), &s));
        unsigned char c;
        wchar_t wc = L'\0';        // character being assembled; assumes a 32-bit wchar_t (not the case on Windows)
        unsigned short toread = 0; // continuation bytes still expected
        for(gsize i=0; i<s; ++i)
        {
            c = d[i];
            if((c >> 7) == 0b0) // 0xxxxxxx: single-byte (ASCII) character
            {
                if(toread!=0)
                {
                    std::cerr << "Help. I lost my place in the stream" << std::endl;
                    toread = 0; // drop the unfinished character and resynchronise
                }
                wc = (wchar_t)c;
            }
            else if((c >> 6) == 0b10) // 10xxxxxx: continuation byte
            {
                if(toread==0)
                {
                    std::cerr << "Help. I lost my place in the stream" << std::endl;
                    continue; // stray continuation byte: skip it
                }
                wc <<= 6; // 6 more bits are coming in
                wc |= (c & 0b00111111);
                --toread;
            }
            else // we can be sure that we have something starting with at least 2 set bits
            {
                if(toread!=0)
                {
                    std::cerr << "Help. I lost my place in the stream" << std::endl;
                }
                if((c >> 5) == 0b110) // 110xxxxx: 2-byte sequence
                {
                    wc = c & 0b00011111;
                    toread = 1;
                }
                else if((c >> 4) == 0b1110) // 1110xxxx: 3-byte sequence
                {
                    wc = c & 0b00001111;
                    toread = 2;
                }
                else if((c >> 3) == 0b11110) // 11110xxx: 4-byte sequence
                {
                    wc = c & 0b00000111;
                    toread = 3;
                }
                else // 5- and 6-byte forms are no longer valid UTF-8 (RFC 3629), nor are 0xFE and 0xFF
                {
                    std::cerr << "Help! Something is probably not UTF-8 at all" << std::endl;
                    for(int j=(8*(int)sizeof c) - 1; j>=0; --j)
                    {
                        std::cerr << (char)('0'+ (char)((c >> j) & 1));
                    }
                    std::cerr << std::endl;
                    toread = 0;
                    continue; // skip the invalid byte instead of appending garbage
                }
            }
            if(toread == 0) // the character is complete
            {
                result += (gunichar)wc;
                wc = L'\0';
            }
        }
        return result;
    }
First and foremost, we have to grab the real pointer to the bytes, since Glib::Bytes will refuse to give you the tools that you need. Then you can start converting into wchar_t. The process isn't that difficult and is described well enough in the Wikipedia article on UTF-8. Luckily, wchar_t can be converted to gunichar, and that in turn can be appended to a Glib::ustring.
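To experiment with the bit arithmetic outside of gtkmm, the same decoding can be sketched with only the standard library. This is a minimal sketch under assumptions, not a gtkmm API: decodeUtf8 is a hypothetical name, it returns code points in a std::u32string, it handles only the 1- to 4-byte forms, and it substitutes U+FFFD for malformed input instead of printing a warning. It does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of glibmm): decodes UTF-8 bytes into code points.
// Malformed input is replaced with U+FFFD rather than reported.
std::u32string decodeUtf8(const std::string& bytes)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size())
    {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // continuation bytes expected after the lead byte

        if (lead < 0x80)              { cp = lead;        extra = 0; } // 0xxxxxxx
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; } // 110xxxxx
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; } // 1110xxxx
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; } // 11110xxx
        else
        {
            // Stray continuation byte or invalid lead byte.
            out += U'\uFFFD';
            ++i;
            continue;
        }

        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k)
        {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= bytes.size() ||
                (static_cast<unsigned char>(bytes[i]) >> 6) != 0x02)
            {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
            ++i;
        }
        out += ok ? cp : U'\uFFFD';
    }
    return out;
}
```

Each resulting code point could then be cast to gunichar and appended to a Glib::ustring, the same way the function above appends its characters.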
So the path that we must take is:
Dialog -> Gio::File -> Glib::Bytes -> gconstpointer -> char -> (combining several chars) wchar_t -> gunichar -> Glib::ustring -> (add to TextView's TextBuffer)
Note: This is not ready-to-use code yet. It only reads 8192 bytes, and simply reading more will not help, because there is no guarantee that a character was not split across two reads. Maybe I'll update the code a little later.
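One possible way around the split-character problem is to hold back the incomplete bytes at the end of each chunk and prepend them to the next one. The following is a sketch using only the standard library; incompleteTailLength and takeCompletePrefix are hypothetical names, the chunks are assumed to arrive as std::string, and only the 1- to 4-byte UTF-8 forms are considered.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns how many bytes at the end of `buf` belong to a
// UTF-8 sequence that has started but is not complete yet.
std::size_t incompleteTailLength(const std::string& buf)
{
    const std::size_t n = buf.size();
    // A sequence is at most 4 bytes long, so the lead byte of an incomplete
    // sequence sits at most 3 bytes from the end.
    for (std::size_t back = 1; back <= 3 && back <= n; ++back)
    {
        const unsigned char c = static_cast<unsigned char>(buf[n - back]);
        if ((c >> 6) == 0x02)
            continue; // continuation byte: keep looking for its lead byte

        std::size_t need = 1; // total length announced by this byte
        if ((c >> 5) == 0x06)      need = 2;
        else if ((c >> 4) == 0x0E) need = 3;
        else if ((c >> 3) == 0x1E) need = 4;
        return need > back ? back : 0;
    }
    // Empty buffer, a complete 4-byte character, or malformed input:
    // nothing to hold back, the decoder deals with it.
    return 0;
}

// Appends `chunk` to `pending`, returns the part that ends on a character
// boundary, and leaves any incomplete tail in `pending` for the next call.
std::string takeCompletePrefix(std::string& pending, const std::string& chunk)
{
    pending += chunk;
    const std::size_t tail = incompleteTailLength(pending);
    const std::size_t completeLength = pending.size() - tail;
    std::string complete = pending.substr(0, completeLength);
    pending.erase(0, completeLength);
    return complete;
}
```

Each returned prefix can be decoded on its own. If pending is still non-empty once the stream is exhausted, the file ended in the middle of a character.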