I want to create an API in C. My goal is to implement abstractions to access and mutate struct
variables that are defined in the API.
API's header file:
#ifndef API_H
#define API_H
struct s_accessor {
struct s* s_ptr;
};
void api_init_func(struct s_accessor *foo);
void api_mutate_func(struct s_accessor *foo, int x);
void api_print_func(struct s_accessor *foo);
#endif
API's implementation file:
#include <stdio.h>
#include "api.h"
struct s {
int internal;
int other_stuff;
};
void api_init_func(struct s_accessor* foo) {
foo->s_ptr = NULL;
}
void api_print_func(struct s_accessor *foo)
{
printf("Value of member 'internal' = %d\n", foo->s_ptr->internal);
}
void api_mutate_func(struct s_accessor *foo, int x)
{
static struct s bar; // static storage, so the pointer kept in foo stays valid after this function returns
foo->s_ptr = &bar;
foo->s_ptr->internal = x;
}
Client-side program that uses the API:
#include <stdio.h>
#include "api.h"
int main()
{
struct s_accessor foo;
api_init_func(&foo); // set s_ptr to NULL
api_mutate_func(&foo, 123); // change value of member 'internal' of an instance of struct s
api_print_func(&foo); // print member of struct s
}
I have the following questions regarding my code:
Is there a direct (non-hackish) way to hide the implementation of my API?
Is this the proper way to create abstractions for the client-side to use my API? If not, how can I improve this to make it better?
"Accessor" isn't good terminology here. In object-oriented programming, the term denotes a kind of method.
The structure type struct s_accessor is in fact something called a handle. It contains a pointer to the real object. A handle is a doubly indirect pointer: the application passes around pointers to handles, and the handles contain pointers to the objects.
An old adage says that "any problem in computer science can be solved by adding another layer of indirection", of which handles are a prime example. Handles allow objects to be moved from one address to another or to be replaced. Yet, to the application, the handle address represents the object, and so when the implementation object is relocated or replaced, as far as the application is concerned, it is still the same object.
With a handle we can do things like:
- have a vector object that can grow
- have OOP objects that can apparently change their class
- relocate variable-length objects such as buffers and strings to compact their memory footprint
all without the object changing its memory address and thus identity. Because the handle stays the same when these changes occur, the application does not have to hunt down every copy of the object pointer and replace it with a new one; the handle effectively takes care of that in one place.
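As a rough illustration of the first item, here is a minimal sketch of a growable vector behind a handle. All names (int_vec, int_vec_handle, the vec_handle_* functions) are hypothetical and chosen only for this example. The underlying object may move when realloc grows it, but the handle's own address never changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
struct int_vec {
    size_t len;
    size_t cap;
    int data[];              /* flexible array member: elements live inside the object */
};

struct int_vec_handle {
    struct int_vec *vec;     /* may change whenever the vector is reallocated */
};

/* Start with an empty vector; returns 0 on success, -1 on allocation failure. */
int vec_handle_init(struct int_vec_handle *h)
{
    h->vec = malloc(sizeof *h->vec);
    if (h->vec == NULL)
        return -1;
    h->vec->len = 0;
    h->vec->cap = 0;
    return 0;
}

/* Append a value, growing (and possibly moving) the underlying object. */
int vec_handle_push(struct int_vec_handle *h, int value)
{
    struct int_vec *v = h->vec;
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        struct int_vec *moved =
            realloc(v, sizeof *v + new_cap * sizeof v->data[0]);
        if (moved == NULL)
            return -1;       /* the old object is still valid and unchanged */
        moved->cap = new_cap;
        h->vec = v = moved;  /* the one place the object pointer is updated */
    }
    v->data[v->len++] = value;
    return 0;
}

size_t vec_handle_len(const struct int_vec_handle *h)
{
    return h->vec->len;
}

int vec_handle_get(const struct int_vec_handle *h, size_t i)
{
    assert(i < h->vec->len);
    return h->vec->data[i];
}

void vec_handle_destroy(struct int_vec_handle *h)
{
    free(h->vec);
    h->vec = NULL;
}
```

Any code holding a pointer to the struct int_vec_handle keeps working across reallocations, because it always reaches the elements through h->vec instead of caching the object's address.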
In spite of all of that, handles tend to be unusual in C APIs, in particular lower-level ones. Given an API that does not use handles, you can whip up handles around it. Even if you think that the users of your object will benefit from handles, it may be good to split the API into two: an internal one that deals only with struct s, and an external one that works with struct s_handle.
If you're using threads, then handles require careful concurrent programming. From the application's point of view, being able to change the handle-referenced object is convenient, but it requires synchronization. Say we have a vector object referenced by a handle. Application code is working with it, so we can't just suddenly replace the vector with a pointer to a different one (in order to resize it): another thread may be in the middle of working with the original pointer. The operations that access the vector or store values into it through the handle must be synchronized with the replacement operation. Even if all of that is done right, it's going to add a lot of overhead, and application people may then notice performance problems and ask for escape hatches in the API, such as a function to "pin" a handle so that the object cannot move while an efficient operation works directly with the struct s object inside it.
For that reason, I would tend to stay away from designing a handle API, and make that sort of thing the application's problem. It may well be easier for a multi-threaded application to just use a well-designed "just the s please" API correctly than to write a completely thread-safe, robust, efficient struct s_handle layer.
- Is there a direct (non-hackish) way to hide the implementation of my API?
Basically, "rule #1" of hiding the implementation of an API in C is not to allow an init operation whereby the client application declares some memory and your API initializes it. That said, it is possible, like this:
typedef struct opaque opaque_t;
#ifndef OPAQUE_IMPL
struct opaque {
int dummy[42]; // big enough for all future extension
};
#endif
void opaque_init(opaque_t *o);
In this declaration, we have revealed nothing to the client, other than that our objects are buffers of memory that require int alignment, and are at least 42 ints wide.
In actual fact, the objects are smaller; we have just added a reserve amount for future growth. We can make our actual object larger without having to recompile the clients, as long as it does not require more than sizeof (int [42]) bytes or stricter alignment than int.
The reason for the #ifndef is that the implementation code will do something like this:
#define OPAQUE_IMPL // suppress the fake definition in the header
#include "opaque.h"
// actual definition
struct opaque {
int whatever;
char *name;
};
This kind of thing plays it loose with the "law" of ISO C, because the client and the implementation are effectively using different definitions of the struct opaque type.
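One way to at least catch an outgrown reserve at build time is a pair of C11 static assertions in the implementation file. The following is only a sketch under assumptions: it is a single-file stand-in in which the header's contents are inlined, and the union opaque_storage helper (with a max_align_t member to widen the alignment) and the opaque_whatever getter are hypothetical additions, not part of the header shown above. A placeholder made of a bare int array only guarantees int alignment, which may be too weak for a member such as char *name on some platforms.

```c
/* opaque.c: single-file sketch; the header's contents are inlined below. */
#define OPAQUE_IMPL          /* suppress the placeholder definition */

#include <assert.h>
#include <stddef.h>          /* NULL, max_align_t (C11) */

/* ---- begin: what opaque.h might contain ---- */
typedef struct opaque opaque_t;

union opaque_storage {       /* hypothetical helper, an assumption of this sketch */
    int dummy[42];           /* the reserve: room for 42 ints */
    max_align_t align;       /* alignment suitable for any object type */
};

#ifndef OPAQUE_IMPL
struct opaque {              /* placeholder seen only by clients */
    union opaque_storage reserved;
};
#endif

void opaque_init(opaque_t *o);
int opaque_whatever(const opaque_t *o);   /* hypothetical getter */
/* ---- end: header contents ---- */

/* actual definition */
struct opaque {
    int whatever;
    char *name;
};

/* Fail the build if the real object outgrows what clients reserve. */
_Static_assert(sizeof(struct opaque) <= sizeof(union opaque_storage),
               "struct opaque no longer fits in the reserved storage");
_Static_assert(_Alignof(struct opaque) <= _Alignof(union opaque_storage),
               "struct opaque needs stricter alignment than reserved");

void opaque_init(opaque_t *o)
{
    assert(o != NULL);
    o->whatever = 0;
    o->name = NULL;
}

int opaque_whatever(const opaque_t *o)
{
    assert(o != NULL);
    return o->whatever;
}
```

The assertions do not make the two differing definitions conforming; they only turn a silent buffer overrun into a compile error when the real structure grows past the reserve.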
Allowing clients to allocate the objects themselves yields certain efficiencies, because allocating objects in automatic storage (i.e. declaring them as local variables) can place them on the stack with very little overhead compared to dynamic memory allocation.
The more common approach for opaqueness is not to provide an init operation at all, only operations for allocating a new object and destroying it:
typedef struct opaque opaque_t; // incomplete struct
opaque_t *opaque_create(/* args .... */);
void opaque_destroy(opaque_t *o);
Now the caller knows nothing, other than that an "opaque" object is represented as a pointer, the same pointer over its entire lifetime.
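A minimal sketch of how the create/destroy approach could look when filled in, reusing the two members of struct s from the question. The function names opaque_set_internal and opaque_get_internal are hypothetical, and the header and implementation parts are shown in one block only for brevity; in practice they would live in separate files.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what the public header would declare ---- */
typedef struct opaque opaque_t;                /* incomplete struct */

opaque_t *opaque_create(int internal);         /* returns NULL on failure */
void opaque_destroy(opaque_t *o);
void opaque_set_internal(opaque_t *o, int x);  /* hypothetical mutator */
int opaque_get_internal(const opaque_t *o);    /* hypothetical accessor */

/* ---- what only the implementation file would contain ---- */
struct opaque {
    int internal;
    int other_stuff;
};

opaque_t *opaque_create(int internal)
{
    opaque_t *o = malloc(sizeof *o);
    if (o != NULL) {
        o->internal = internal;
        o->other_stuff = 0;
    }
    return o;
}

void opaque_destroy(opaque_t *o)
{
    free(o);    /* free(NULL) is a no-op, so destroying NULL is safe */
}

void opaque_set_internal(opaque_t *o, int x)
{
    assert(o != NULL);
    o->internal = x;
}

int opaque_get_internal(const opaque_t *o)
{
    assert(o != NULL);
    return o->internal;
}
```

A client would call opaque_create, check the result for NULL, use the set and get functions, and finish with opaque_destroy. Because the client never sees the structure's members or size, the implementation can add or reorder members without the client being recompiled.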
Total opaqueness may not be worth it for an API which is internal to an application or application framework. It's useful for an API that has external clients, like application developers in a different team or organization.
Ask yourself the question: would the client of this API, and its implementation, ever be shipped and upgraded separately? If the answer is no, then that diminishes the need for total opaqueness.