Tags: c++, c, emulation, libc, mips32

How does libc work?


I'm writing a MIPS32 emulator and would like to make it possible to use the whole Standard C Library (maybe with the GNU extensions) when compiling C programs with gcc.

As I understand it at this point, I/O is handled by syscalls on the MIPS32 architecture. To successfully run a program using libc/glibc, how can I tell which syscalls I need to emulate (without trial and error)?

Edit: See this for an example of what I mean by syscalls.

(You can check out the project here if you are interested, any feedback is welcome. Keep in mind that it's in a very early stage)


Solution

  • Very Short Answer

    Read the much longer answer.

    Short Answer

    If you intend to provide a custom libc that uses some feature of your emulator to have the host OS execute your system calls, you have to implement all of the system calls that libc relies on.

    Much Longer Answer

    Step back for a minute and look at the way things are typically layered in a real (non-emulated) system:

    1. The peripherals have some I/O interface (e.g., numbered ports or memory mapping) that the CPU can tickle to make them do whatever they do.
    2. The CPU runs software that understands how to manipulate the hardware. This can be a single-purpose program or an operating system that runs other programs. Since libc is in the picture, let's assume there's an OS and that it's something Unix-y.
    3. Userspace programs run by the OS use a defined interface between themselves and the OS to ask for certain "system" functions to be carried out.

    What you're trying to accomplish takes place between layers 3 and 2, where a function in libc or user code does whatever the OS defines as triggering a system call. This opens up numerous cans of worms:

    • What the OS defines as triggering a system call differs from OS to OS and (rarely) between versions of the same OS. This problem is mitigated on "real" systems by providing a dynamically-linkable libc that takes care of hiding those details. That aside, if you have a MIPS32 binary you want to run, does it use a system call convention that your emulator supports?

    • You would need to provide a custom libc that signals each system call in a way your emulator can recognize and then carry out. Any program you wish to run will have to be cross-compiled to MIPS32 and statically linked with that libc, as would any other libraries the program requires (libm comes to mind). Alternatively, your emulator package will need to provide a simulation of a dynamic linker plus dynamically-linkable copies of all required libraries, because opening those on the host won't work. If you have enough source to recompile the program from scratch, porting might be better than emulation.

    • Any code that makes assumptions about paths to files on a particular system or other assumptions about what they'll find in certain devices (which are themselves files) won't run correctly.

    • If you're providing layer 2, you're signing yourself up to provide a complete, correct simulation of the behavior of one particular version of an entire operating system. Some calls like read() and write() would be easy to deal with; others like fork(), uselib() and ioctl() would be much more difficult. There also isn't necessarily a one-to-one mapping of calls and behaviors your program uses with those your host OS provides. All of this assumes the host is Unix and the target program is, too. If the target is compiled for some other environment, all bets are off.
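    To make the "easy" case concrete, here is a minimal sketch of an emulator recognizing the SYSCALL instruction and forwarding write() to a Unix host. It assumes the guest follows the Linux o32 convention for MIPS32 (call number in $v0, arguments in $a0..$a2, result in $v0, $a3 nonzero on error); your target may use something else. The names Guest, is_syscall, guest_to_host, set_result and handle_syscall are made up for illustration, and guest file descriptors are passed straight to the host, which a real emulator probably would not want.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical emulator state for this sketch: registers plus flat guest RAM. */
typedef struct {
    uint32_t gpr[32];
    uint8_t *ram;        /* host buffer backing guest RAM, guest address 0 = ram[0] */
    uint32_t ram_size;
} Guest;

/* MIPS register numbers used by the (assumed) Linux o32 syscall convention. */
enum { REG_V0 = 2, REG_A0 = 4, REG_A1 = 5, REG_A2 = 6, REG_A3 = 7 };

/* Linux o32 syscall numbers start at 4000; only write() is handled here. */
enum { SYS_O32_WRITE = 4004 };

/* errno values as numbered by Linux on MIPS; ENOSYS differs from x86 hosts. */
enum { GUEST_EIO = 5, GUEST_EFAULT = 14, GUEST_ENOSYS = 89 };

/* SYSCALL is a SPECIAL-class instruction: opcode bits 31..26 are 0 and
 * funct bits 5..0 are 0x0C. Bits 25..6 are a code field that is ignored here. */
bool is_syscall(uint32_t insn) {
    return (insn & 0xFC00003Fu) == 0x0000000Cu;
}

/* Translate a guest address range to a host pointer, or NULL if it does not
 * lie entirely inside guest RAM. */
uint8_t *guest_to_host(const Guest *g, uint32_t addr, uint32_t len) {
    if (addr > g->ram_size || len > g->ram_size - addr) {
        return NULL;
    }
    return g->ram + addr;
}

/* Write the result back the way o32 expects: value (or errno) in $v0,
 * error flag in $a3. */
void set_result(Guest *g, uint32_t value, bool failed) {
    g->gpr[REG_V0] = value;
    g->gpr[REG_A3] = failed ? 1u : 0u;
}

/* Called by the execute loop when is_syscall() matches the fetched word. */
void handle_syscall(Guest *g) {
    assert(g != NULL && g->ram != NULL);
    switch (g->gpr[REG_V0]) {
    case SYS_O32_WRITE: {
        uint32_t fd = g->gpr[REG_A0];
        uint32_t addr = g->gpr[REG_A1];
        uint32_t len = g->gpr[REG_A2];
        const uint8_t *buf = guest_to_host(g, addr, len);
        if (buf == NULL) {
            set_result(g, GUEST_EFAULT, true);
            return;
        }
        ssize_t n = write((int)fd, buf, len);   /* forwarded to the host OS */
        if (n < 0) {
            /* Simplification: a real emulator would translate the host errno
             * into the guest's numbering instead of always reporting EIO. */
            set_result(g, GUEST_EIO, true);
        } else {
            set_result(g, (uint32_t)n, false);
        }
        return;
    }
    default:
        /* Every call not listed above is simply unsupported in this sketch. */
        set_result(g, GUEST_ENOSYS, true);
        return;
    }
}
```

    Even this one call needs an address translation, a bounds check and an errno mapping, and the default branch is where every other call a program makes would have to be filled in.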

    That last point is why most emulators provide just a CPU and the hardware behaviors of some target system (i.e., everything in layer 1). With those in place, you can run an original system's boot ROM, OS and user programs, all unaltered. There are a number of existing MIPS32 emulators that do just this and can run unaltered versions of the operating systems that ran on the hardware they emulate.
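    For contrast, a layer-1 emulator never sees "system calls" as such; it only sees loads and stores, some of which land on devices. A rough sketch of that idea follows, assuming a made-up board with 4 KiB of RAM and a single write-only serial transmit register. The address UART_TX_ADDR and the names Board and bus_store8 are invented for the example and do not correspond to any real machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_SIZE      0x1000u
#define UART_TX_ADDR  0x1F000000u   /* hypothetical address, chosen for this sketch */

/* Hypothetical board: a little RAM plus a log of bytes "sent" by the UART. */
typedef struct {
    uint8_t ram[RAM_SIZE];
    char    uart_out[64];
    size_t  uart_len;
} Board;

/* Perform a one-byte store issued by the emulated CPU.
 * Returns 0 on success, -1 if nothing is mapped at the address (a real
 * emulator would raise a bus error exception in the guest here). */
int bus_store8(Board *b, uint32_t addr, uint8_t value) {
    assert(b != NULL);
    if (addr < RAM_SIZE) {
        b->ram[addr] = value;            /* ordinary memory */
        return 0;
    }
    if (addr == UART_TX_ADDR) {
        /* Memory-mapped device: the store is the I/O operation. The byte is
         * recorded here; a real emulator might print it on the host. */
        if (b->uart_len < sizeof b->uart_out) {
            b->uart_out[b->uart_len++] = (char)value;
        }
        return 0;
    }
    return -1;
}
```

    With enough devices modeled this way, the guest's own kernel does the system call handling, and the emulator does not need to know which calls libc makes.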

    HTH and best of luck on your project.