c compilation legacy

C code compiled in 1990: the old executable runs, but recompiled now it gets read errors on old files


I have a C program, last compiled in 1990, that reads and writes some binary files. The old executable still works, reading and writing them perfectly. I need to recompile the source, add some features, and then use the new build to read in some of the old data and write it back out with additional information.

When I recompile the code with no changes and run it, it fails when reading the old files: I get segmentation faults when I try to process the data read into an area of memory. I believe the problem may be that the binary files written earlier used 4-byte integers (with 8-bit bytes), 8-byte longs, and 4-byte floats. The architecture of my current machine uses 64-bit words instead of 32-bit ones. Thus, when I extract an integer from the data read in, it is aligned incorrectly and yields an array index that is out of range for the program space.

I am on Mac OS X 10.12.6, using its C compiler, which appears to be:

Apple LLVM version 8.0.0 (clang-800.0.33.1)
Target: x86_64-apple-darwin16.7.0

Is there a compiler switch that would set the compiled lengths of integers and floats to the above values? If not, how do I approach getting the code to correctly read the data?


Solution

  • Welcome to the world of portability headaches!

    If your program was compiled in 1990, there is a good chance it uses 4-byte longs, and it is even possible that it uses 2-byte ints, depending on the architecture it was compiled for.

    The sizes of the basic C types are heavily system-dependent, and that is only one of a number of more subtle portability issues. long is now 64-bit on both 64-bit Linux and 64-bit OS X, but still 32-bit on Windows (for both 32-bit and 64-bit versions!).

    When reading binary files, you must also deal with endianness, which changed from big-endian on 1990 Mac OS to little-endian on today's OS X, and is still big-endian on some other systems.

    To make matters worse, the C language has evolved over this long period, and some non-trivial semantic changes occurred between pre-ANSI C and Standard C. Some old syntax is no longer supported either...

    There is no magic flag to address these issues. You will need to dive into the C code, understand what it does, and modernize it so that it is more portable and independent of the target architecture. You can use the fixed-width types from <stdint.h> (int32_t, ...) to ease this process.

    People answering C questions on Stack Overflow are usually careful to post portable code that works correctly on all target architectures, even some purposely vicious ones such as the DS9K (a fictitious computer that does everything in correct but unexpected ways).
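
The fixed-width approach mentioned above could look roughly like the sketch below. It assumes, purely for illustration, that the old files are big-endian and that each record holds a 32-bit integer followed by a 32-bit IEEE 754 float. The names legacy_record, load_be32 and decode_record are hypothetical, and the real record layout has to be recovered from the original source. The idea is to read raw bytes and assemble each field explicitly, so the result no longer depends on the host's sizeof(int), sizeof(long) or byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record layout, assumed for illustration only:
 *   bytes 0-3 : 32-bit signed integer, big-endian
 *   bytes 4-7 : 32-bit IEEE 754 float, big-endian
 * The real layout must be worked out from the original 1990 source. */
enum { LEGACY_RECORD_SIZE = 8 };

struct legacy_record {
    int32_t index;   /* exactly 32 bits on every platform */
    float   value;   /* assumed to be a 32-bit IEEE 754 float on the host */
};

/* Assemble a 32-bit value from four big-endian bytes.  The result does not
 * depend on the host's byte order or on the size of int or long. */
static uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode one record from a raw byte buffer of LEGACY_RECORD_SIZE bytes. */
static struct legacy_record decode_record(const unsigned char *buf)
{
    struct legacy_record r;
    uint32_t bits;

    /* Converting to int32_t assumes two's complement, which holds on all
     * mainstream compilers. */
    r.index = (int32_t)load_be32(buf);

    /* Copy the bit pattern into the float instead of casting pointers. */
    bits = load_be32(buf + 4);
    assert(sizeof r.value == sizeof bits);
    memcpy(&r.value, &bits, sizeof r.value);
    return r;
}
```

In use, each record would be read with fread into an unsigned char buffer of LEGACY_RECORD_SIZE bytes and passed to decode_record, rather than calling fread directly on a struct whose field sizes and padding depend on the compiler. If the old files turn out to be little-endian, only the shift order in load_be32 changes.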