I ported some code from C to C++ and have just found a problem with paths that contain em-dash, e.g. "C:\temp\test—1.dgn". A call to fstream::open() will fail, even though the path displays correctly in the Visual Studio 2005 debugger.
The weird thing is that the old code that used the C library fopen() function works fine. I thought I'd try my luck with the wfstream class instead, and then found that converting my C string using mbstowcs() loses the em-dash altogether, meaning it also fails.
I suppose this is a locale issue, but why isn't em-dash supported in the default locale? And why can't fstream handle an em-dash? I would have thought any byte character supported by the Windows filesystem would be supported by the file stream classes.
Given these limitations, what is the correct way to handle opening a file stream that may contain valid Windows file names that doesn't just fall over on certain characters?
Posting this solution for others who run into this. The problem is that Windows assigns the "C" locale on startup by default, and em-dash (0x97) is defined in the "Windows-1252" codepage but is unmapped in the normal ASCII table used by the "C" locale. So the simple solution is to call:
setlocale ( LC_ALL, "" );
Prior to fstream::open. This sets the current codepage to the OS-defined codepage. In my program, the file I wanted to open with fstream was defined by the user, so it was in the system-defined codepage (Windows-1252).
So while fiddling with unicode and wide chars may be a solution to avoid unmapped characters, it wasn't the root of the problem. The actual problem was that the input string's codepage ("Windows-1252") didn't match the active codepage ("C") used by default in Windows programs.