Determine types from a variadic function's arguments in C


I'd like a step-by-step explanation of how to parse the arguments of a variadic function so that, when calling va_arg(ap, TYPE), I pass the correct data TYPE for the argument being read.

Currently I'm trying to code printf. I am only looking for an explanation, preferably with simple examples, but not the solution to printf, since I want to solve it myself.

Here are three examples which look like what I am looking for:

  1. https://stackoverflow.com/a/1689228/3206885
  2. https://stackoverflow.com/a/5551632/3206885
  3. https://stackoverflow.com/a/1722238/3206885

I know the basics of what typedef, struct, enum and union do, but I can't figure out practical applications like the examples in the links.

What do they really mean? I can't wrap my brain around how they work. How can I pass the data type from a union to va_arg, as in the linked examples? How does it get matched: with a conversion specifier like %d or %i, or with the data type of a parameter?

Here's what I've got so far:

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include "my.h"

typedef struct s_flist
{
    char c;
    (*f)();
}              t_flist;

int     my_printf(char *format, ...)
{
    va_list ap;
    int     i;
    int     j;
    int     result;
    int     arg_count;
    char    *cur_arg = format;
    char    *types;
    t_flist flist[] = 
    {
        { 's',  &my_putstr  },
        { 'i',  &my_put_nbr },
        { 'd',  &my_put_nbr }
    };

    i = 0;
    result = 0;
    types = (char*)malloc( sizeof(*format) * (my_strlen(format) / 2 + 1) );
    fparser(types, format);
    arg_count = my_strlen(types);

    while (format[i])
    {
        if (format[i] == '%' && format[i + 1])
        {
            i++;
            if (format[i] == '%')
                result += my_putchar(format[i]);
            else
            {
                j = 0;
                va_start(ap, format);
                while (flist[j].c)
                {
                    if (format[i] == flist[j].c)
                        result += flist[i].f(va_arg(ap, flist[i].DATA_TYPE??));
                    j++;
                }
            }
        }
        result += my_putchar(format[i]);
        i++;
    }

    va_end(ap);
    return (result);
}

char    *fparser(char *types, char *str)
{
    int     i;
    int     j;

    i = 0;
    j = 0;
    while (str[i])
    {
        if (str[i] == '%' && str[i + 1] &&
            str[i + 1] != '%' && str[i + 1] != ' ')
        {
            i++;
            types[j] = str[i];
            j++;
        }
        i++;
    }
    types[j] = '\0';
    return (types);
}

Solution

  • You can't get actual type information from va_list. You can get what you're looking for from format. What you may not be expecting is that none of the arguments carry their actual types; format represents the caller's idea of what the types should be. (Perhaps a further hint: what would the actual printf do if a caller gave it format specifiers that didn't match the varargs passed in? Would it notice?)

    Your code would have to parse the format string for "%" format specifiers and use those specifiers to branch into reading the va_list with specific hardcoded types. For example (pseudocode): if (fspec was "%s") { char* str = va_arg(ap, char*); print out str; }. I'm not giving more detail because you explicitly said you didn't want a complete solution.
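
The branching described above can be sketched on something much smaller than printf. This is only an illustration under assumptions: sum_args and its one-letter type codes ('i' for int, 'd' for double) are hypothetical names invented for this example, not part of any library. The caller's string is the only source of type information, and each code leads to its own hardcoded va_arg call.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up its variadic arguments.
 * `types` holds one code per argument: 'i' = int, 'd' = double. */
double sum_args(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, types);            /* once, before reading any argument */
    for (; *types != '\0'; types++)
    {
        if (*types == 'i')
            total += va_arg(ap, int);       /* type written out literally */
        else if (*types == 'd')
            total += va_arg(ap, double);    /* a separate call site per type */
        else
            break;  /* unknown code: stop, the next type cannot be known */
    }
    va_end(ap);                     /* once, after the last argument */
    return total;
}
```

A call such as sum_args("id", 3, 2.5) reads an int and then a double. If the string said "di" for the same arguments, nothing inside the function could detect the mismatch; the behavior would simply be undefined.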


    You will never have a type as a piece of runtime data that you can pass to va_arg as a value. The second argument to va_arg must be a type name written out literally in the source, so that it is known at compile time. (Note that va_arg is a macro that gets expanded at compile time, not a function that gets executed at runtime; you couldn't have a function taking a type as an argument.)
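
One practical consequence of the type being fixed at compile time: the type you write has to be the one the argument has after C's default argument promotions. A char or short passed through "..." is read as int, and a float is read as double. A small sketch, where average_floats is a hypothetical name made up for this example:

```c
#include <stdarg.h>

/* Hypothetical helper: averages `count` values that the caller passes as float. */
double average_floats(int count, ...)
{
    va_list ap;
    double sum = 0.0;
    int i;

    if (count <= 0)
        return 0.0;
    va_start(ap, count);
    for (i = 0; i < count; i++)
        sum += va_arg(ap, double);  /* float arguments arrive promoted to double */
    va_end(ap);
    return sum / count;
}
```

Writing va_arg(ap, float) here would be undefined behavior, even though the caller wrote 1.0f and 2.0f.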

    A couple of your links suggest keeping track of types via an enum, but this is only for the benefit of your own code being able to branch based on that information; it is still not something that can be passed to va_arg. You have to have separate pieces of code saying literally va_arg(ap, int) and va_arg(ap, char*) so there's no way to avoid a switch or a chain of ifs.
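
To make the enum idea concrete, here is a minimal sketch. The names arg_kind, kind_of and weigh_args are hypothetical, and the "specs" string is a bare list of specifier characters rather than a real format string. The enum value is ordinary runtime data that drives a switch; the types themselves still appear only as literal text in each case.

```c
#include <stdarg.h>
#include <string.h>

/* Hypothetical tags for the argument kinds this sketch understands. */
typedef enum { ARG_INT, ARG_STRING, ARG_UNKNOWN } arg_kind;

/* Maps a specifier character to a tag. */
arg_kind kind_of(char spec)
{
    switch (spec)
    {
        case 'd':
        case 'i': return ARG_INT;
        case 's': return ARG_STRING;
        default:  return ARG_UNKNOWN;
    }
}

/* Hypothetical helper: ints contribute their value, strings their length.
 * Returns -1 as soon as an unknown specifier is seen. */
long weigh_args(const char *specs, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, specs);
    for (; *specs != '\0'; specs++)
    {
        switch (kind_of(*specs))
        {
            case ARG_INT:
                total += va_arg(ap, int);
                break;
            case ARG_STRING:
                total += (long)strlen(va_arg(ap, char *));
                break;
            default:
                va_end(ap);
                return -1;
        }
    }
    va_end(ap);
    return total;
}
```

Note that the enum never reaches va_arg; it only decides which of the hardcoded va_arg lines runs.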

    The solution you want to make, using the unions and structs, would start from something like this:

    typedef union {
      int i;
      char *s;
    } PRINTABLE_THING;
    
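    int print_integer(PRINTABLE_THING pt) {
      // format and print pt.i
      return 0; // placeholder: return the number of characters printed
    }
    int print_string(PRINTABLE_THING pt) {
      // format and print pt.s
      return 0; // placeholder: return the number of characters printed
    }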
    

    The two specialized functions would work fine on their own by taking explicit int or char* params. The reason we make the union is to let both functions formally take the same parameter type, so that they share one signature and we can define a single type that means "pointer to that kind of function":

    typedef int (*print_printable_thing)(PRINTABLE_THING);
    

    Now your code can have an array of function pointers of type print_printable_thing, or an array of structs that have print_printable_thing as one of the structs' fields:

    typedef struct {
      char format_char;
      print_printable_thing printing_function;
    } FORMAT_CHAR_AND_PRINTING_FUNCTION_PAIRING;
    
    FORMAT_CHAR_AND_PRINTING_FUNCTION_PAIRING formatters[] = {
      { 'd', print_integer },
      { 's', print_string }
    };
    int formatter_count = sizeof(formatters) / sizeof(FORMAT_CHAR_AND_PRINTING_FUNCTION_PAIRING);
    

    (Yes, the names are all intentionally super verbose. You'd probably want shorter ones in the real program, or even anonymous types where appropriate.)

    Now you can use that array to select the correct formatter at runtime:

    for (int i = 0; i < formatter_count; i++)
      if (current_format_char == formatters[i].format_char)
        result += formatters[i].printing_function(current_printable_thing);
    

    But the process of getting the correct thing into current_printable_thing is still going to involve branching to get to a va_arg(ap, ...) with the correct hardcoded type. Once you've written it, you may find yourself deciding that you didn't actually need either the union or the array of structs.
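
Here is one way the pieces could fit together, deliberately not as printf. Assumptions: the handlers only measure how many characters a value would occupy instead of printing it, the names measure_thing, measure_integer, measure_string, measurers and measure_args are hypothetical, and "specs" is a bare list of specifier characters such as "ds" with no '%' parsing. The standard snprintf is used only to count digits.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef union {
    int i;
    char *s;
} PRINTABLE_THING;

typedef int (*measure_thing)(PRINTABLE_THING);

/* Hypothetical handlers: return how many characters the value would occupy. */
static int measure_integer(PRINTABLE_THING pt)
{
    return snprintf(NULL, 0, "%d", pt.i);   /* size 0: nothing written, only counted */
}

static int measure_string(PRINTABLE_THING pt)
{
    return (int)strlen(pt.s);
}

static const struct {
    char format_char;
    measure_thing measuring_function;
} measurers[] = {
    { 'd', measure_integer },
    { 's', measure_string }
};

/* Returns the total character count, or -1 on an unknown specifier. */
int measure_args(const char *specs, ...)
{
    va_list ap;
    int total = 0;
    size_t k;

    va_start(ap, specs);
    for (; *specs != '\0'; specs++)
    {
        PRINTABLE_THING thing;

        /* The unavoidable branch: one literal va_arg per type. */
        switch (*specs)
        {
            case 'd': thing.i = va_arg(ap, int);    break;
            case 's': thing.s = va_arg(ap, char *); break;
            default:  va_end(ap); return -1;
        }
        /* The table then picks the handler for the value already fetched. */
        for (k = 0; k < sizeof measurers / sizeof measurers[0]; k++)
            if (measurers[k].format_char == *specs)
                total += measurers[k].measuring_function(thing);
    }
    va_end(ap);
    return total;
}
```

In this sketch the switch already knows which type it fetched, so each case could call its handler directly. The table lookup that follows adds nothing the switch doesn't already know, which is why the union and the array may turn out to be unnecessary.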