Porting AT&T inline-asm inb / outb wrappers to work with gcc -masm=intel

I am currently working on my x86 OS. I tried implementing the inb function from here and it gives me Error: Operand type mismatch for `in'.

This may also be the same with outb or io_wait.

I am using Intel syntax (-masm=intel) and I don't know what to do.

Code:

#include <stdint.h>
#include "ioaccess.h"

uint8_t inb(uint16_t port)
{
    uint8_t ret;
    asm volatile ( "inb %1, %0"
                   : "=a"(ret)
                   : "Nd"(port) );
    return ret;
}

With AT&T syntax this does work.

For outb I'm having a different problem after reversing the operands:

void io_wait(void)
{
    asm volatile ( "outb $0x80, %0" : : "a"(0) );
}

Error: operand size mismatch for `out'

Solution

If you need to use -masm=intel, you will need to ensure that your inline assembly is in Intel syntax. Intel syntax is dst, src (AT&T syntax is the reverse). This somewhat related answer has some useful information on the differences between NASM's Intel variant¹ (not GAS's variant) and AT&T syntax:

Information on how to translate NASM Intel syntax to GAS's AT&T syntax can be found in this Stack Overflow answer, and a lot of useful information is provided in this IBM article.

[snip]

In general the biggest differences are:

With AT&T syntax the source is on the left and the destination is on the right; Intel is the reverse.

With AT&T syntax register names are prepended with a %

With AT&T syntax immediate values are prepended with a $

Memory operands are probably the biggest difference. NASM uses [segment:disp+base+index*scale] instead of GAS's syntax of segment:disp(base, index, scale).
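To make these differences concrete, here is a small user-space sketch. The function names scaled_offset and add_five are made up for illustration, and lea/add stand in for the port instructions because in/out cannot run outside the kernel. It relies on GCC's { AT&T | Intel } template alternatives, so each template shows both spellings side by side. The plain C fallback is only there so the file still builds on non-x86 targets.

```c
#include <stdint.h>

/* Returns base + idx*8 + 4, computed by a single lea instruction. */
static inline uintptr_t scaled_offset(uintptr_t base, uintptr_t idx)
{
#if defined(__i386__) || defined(__x86_64__)
    uintptr_t out;
    /* AT&T:  lea 4(base,idx,8), out       disp(base, index, scale), dst last
       Intel: lea out, [base + idx*8 + 4]  bracketed expression, dst first   */
    __asm__ ( "lea {4(%[base],%[idx],8), %[out] | %[out], [%[base] + %[idx]*8 + 4]}"
              : [out] "=r"(out)
              : [base] "r"(base), [idx] "r"(idx) );
    return out;
#else
    return base + idx * 8 + 4;   /* fallback for non-x86 targets */
#endif
}

/* Returns x + 5; shows the $ prefix on AT&T immediates. */
static inline uint32_t add_five(uint32_t x)
{
#if defined(__i386__) || defined(__x86_64__)
    /* AT&T:  addl $5, x    immediate needs $, source first
       Intel: add  x, 5     bare immediate, destination first */
    __asm__ ( "add{l $5, %[x] | %[x], 5}"
              : [x] "+r"(x)
              :
              : "cc" );
    return x;
#else
    return x + 5;                /* fallback for non-x86 targets */
#endif
}
```

GCC prints the register names itself, with a % prefix in AT&T mode and without one in Intel mode, so only the operand order, the $ on immediates, and the memory-operand layout have to be written twice.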

The problem in your code is that source and destination operands have to be reversed from the original AT&T syntax you were working with. This code:

asm volatile ( "inb %1, %0"
               : "=a"(ret)
               : "Nd"(port) );

Needs to be:

asm volatile ( "inb %0, %1"
               : "=a"(ret)
               : "Nd"(port) );

Regarding your update: the problem is that in Intel syntax immediate values are not prepended with a $. This line is a problem:

asm volatile ( "outb $0x80, %0" : : "a"(0) );

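It should be the following. The b modifier in %b0 makes GCC print the byte register (al); the literal 0 has type int, so a plain %0 would be printed as eax, which does not match the byte-sized outb:

asm volatile ( "outb 0x80, %b0" : : "a"(0) );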

If you had a proper outb function you could do something like this instead:

#include <stdint.h>
#include "ioaccess.h"

uint8_t inb(uint16_t port)
{
    uint8_t ret;
    asm volatile ( "inb %0, %1"
                   : "=a"(ret)
                   : "Nd"(port) );
    return ret;
}

void outb(uint16_t port, uint8_t byte)
{
    asm volatile ( "outb %1, %0"
                   :
                   : "a"(byte),
                     "Nd"(port) );
}

void io_wait(void)
{
    outb (0x80, 0);
}

A slightly more complex version can support both the AT&T and Intel dialects. The GCC documentation describes the mechanism:

Multiple assembler dialects in asm templates

On targets such as x86, GCC supports multiple assembler dialects. The -masm option controls which dialect GCC uses as its default for inline assembler. The target-specific documentation for the -masm option contains the list of supported dialects, as well as the default dialect if the option is not specified. This information may be important to understand, since assembler code that works correctly when compiled using one dialect will likely fail if compiled using another. See x86 Options.

If your code needs to support multiple assembler dialects (for example, if you are writing public headers that need to support a variety of compilation options), use constructs of this form:

{ dialect0 | dialect1 | dialect2... }

On x86 and x86-64 targets there are two dialects. Dialect0 is AT&T syntax and Dialect1 is Intel syntax. The functions could be reworked this way:

#include <stdint.h>
#include "ioaccess.h"

uint8_t inb(uint16_t port)
{
    uint8_t ret;
    asm volatile ( "inb {%[port], %[retreg] | %[retreg], %[port]}"
                   : [retreg]"=a"(ret)
                   : [port]"Nd"(port) );
    return ret;
}

void outb(uint16_t port, uint8_t byte)
{
    asm volatile ( "outb {%[byte], %[port] | %[port], %[byte]}"
                   :
                   : [byte]"a"(byte),
                     [port]"Nd"(port) );
}

void io_wait(void)
{
    outb (0x80, 0);
}

I have also given the constraints symbolic names rather than using %0 and %1, to make the inline assembly easier to read and maintain. According to the GCC documentation, each constraint has the form:

[ [asmSymbolicName] ] constraint (cvariablename)

Where:

asmSymbolicName

Specifies a symbolic name for the operand. Reference the name in the assembler template by enclosing it in square brackets (i.e. ‘%[Value]’). The scope of the name is the asm statement that contains the definition. Any valid C variable name is acceptable, including names already defined in the surrounding code. No two operands within the same asm statement can use the same symbolic name.

When not using an asmSymbolicName, use the (zero-based) position of the operand in the list of operands in the assembler template. For example if there are three output operands, use ‘%0’ in the template to refer to the first, ‘%1’ for the second, and ‘%2’ for the third.
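As a sketch of the two ways to refer to operands, the following user-space functions compute a - b with the same sub instruction, once with positional references and once with symbolic names. The names sub_positional, sub_named, lhs, and rhs are invented for this example, and sub is used only because in/out cannot be executed outside the kernel. The C fallback exists so the file also builds on non-x86 targets.

```c
#include <stdint.h>

/* Positional form: %0 is the first operand listed (a), %1 the second (b). */
static inline uint32_t sub_positional(uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ( "sub{l %1, %0 | %0, %1}"
              : "+r"(a)          /* operand 0: read and written */
              : "r"(b)           /* operand 1: read only        */
              : "cc" );          /* sub updates the flags       */
    return a;
#else
    return a - b;                /* fallback for non-x86 targets */
#endif
}

/* Named form: same instruction, but the template no longer depends on
   the order in which the operands are listed. */
static inline uint32_t sub_named(uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ( "sub{l %[rhs], %[lhs] | %[lhs], %[rhs]}"
              : [lhs] "+r"(a)
              : [rhs] "r"(b)
              : "cc" );
    return a;
#else
    return a - b;                /* fallback for non-x86 targets */
#endif
}
```

With names, adding or reordering an operand later does not silently change what the template refers to, which is the maintenance benefit mentioned above.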

This version should work² whether you compile with -masm=intel or -masm=att.

Footnotes

¹Although NASM's Intel dialect and GAS's (GNU Assembler) Intel syntax are similar, there are some differences. One is that NASM Intel syntax uses [segment:disp+base+index*scale], where a segment can be specified inside the [], while GAS's Intel syntax requires the segment outside: segment:[disp+base+index*scale].
²Although the code will work, you should place all these basic functions directly in the ioaccess.h file and remove them from the .c file that contains them. Because you placed these basic functions in a separate .c file (external linkage), the compiler can't optimize them as well as it could. You can change the functions to static inline and place them in the header directly. The compiler can then optimize the code by removing function-call overhead and reducing the need for extra loads and stores. You will want to compile with optimizations higher than -O0. Consider -O2 or -O3.
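A header-only layout along those lines might look like the sketch below. It reuses the dual-dialect functions from the answer unchanged apart from static inline. The include guard name IOACCESS_H is an assumption, and __asm__ is used in place of asm only so that the header also compiles under strict modes such as -std=c99, where the plain asm keyword is not available. It targets x86 only.

```c
/* ioaccess.h (sketch): port I/O helpers defined directly in the header */
#ifndef IOACCESS_H
#define IOACCESS_H

#include <stdint.h>

/* Read one byte from an I/O port. */
static inline uint8_t inb(uint16_t port)
{
    uint8_t ret;
    __asm__ volatile ( "inb {%[port], %[retreg] | %[retreg], %[port]}"
                       : [retreg] "=a"(ret)
                       : [port] "Nd"(port) );
    return ret;
}

/* Write one byte to an I/O port. */
static inline void outb(uint16_t port, uint8_t byte)
{
    __asm__ volatile ( "outb {%[byte], %[port] | %[port], %[byte]}"
                       :
                       : [byte] "a"(byte),
                         [port] "Nd"(port) );
}

/* Short delay: write to port 0x80, as in the answer's io_wait. */
static inline void io_wait(void)
{
    outb(0x80, 0);
}

#endif /* IOACCESS_H */
```

Every .c file that includes the header gets its own inlinable copy, so with optimizations on a call such as outb(0x80, 0) can collapse to the out instruction itself, with the constant port encoded as an immediate thanks to the N constraint.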

Special Notes Regarding OS Development:

1. There are many toy OSes (examples, tutorials, and even code on the OSDev Wiki) that do not work with optimizations on. Many failures are due to poor inline assembly or reliance on undefined behaviour. Inline assembly should be used as a last resort. If your kernel doesn't run with optimizations on, it is probably not a compiler bug (possible, just not likely).
2. Heed the advice in @PeterCordes's answer regarding port access that may trigger DMA reads.