Wednesday, June 29, 2011

Writing a C library, part 3

This is part three in a series of blog-posts about best practices for writing C libraries. Previous installments: part one, part two.

Modularity and namespaces

The C programming language does not support the concept of namespaces (as used in e.g. C++ or Python) so it is usually emulated simply by using naming conventions. The main reason of namespaces is to avoid naming collisions - consider both libwoot and libkool providing a function called get_all_objects() - which one should be used if a program links to both libraries? Namespacing is an important part of a naming strategy and applies to variables, function names, type names (including structs, unions, enums and typedefs) and macros.

The standard convention is to use a short identifier, e.g. for libnm-glib you will see nm_ and NM being used, for Clutter it's clutter and Clutter and for libpolkit-agent-1, it's polkit_agent and PolkitAgent. For libraries that don't use CamelCase for its types, the same prefix is normally used for functions and types - for example, libudev the prefix used is simply udev.

Code that isn't using namespaces properly is not only hard to integrate into other libraries and applications (the chance of symbol collisions is high), there's also a chance that it will collide with future additions to the standard C library or POSIX standards.

One benefit of using namespaces in C (one that ironically is not present in a language with proper support for namespaces), is that it's a lot easier to pinpoint what the code is doing by just looking at a fragment of the source code - e.g. when you see an item being added to a container, you are usually not in doubt whether the programmer meant to invoke GtkContainer's add() method or ClutterContainer's add() method because of how C namespacing forces the programmer to be explicit, for better or worse.

In addition to choosing a good naming strategy, note that the visibility of what symbols (typically variables and functions) a library export can be fine-tuned, see these notes for why this is desirable.

On the topic of naming, it is usually a good idea to avoid C++ keywords (such as "class") for variable names, at least in header files that you except C++ code to include using e.g. extern "C". Additionally, generally avoid names of functions in the C standard library / POSIX for variable names such as "interface" or "index" because these functions can (and on Linux, actually is) be defined as macros.

Checklist

  • Choose a naming convention - and stick to it.
  • Do not export symbols that are not public API.

Error handling

If there's one statement that adequately describes error handling in the C programming language, it's perhaps that it's something that people rarely agree on. Most programmers, however, would agree that errors can be broken down into two categories 1) programmer errors; 2) run-time errors.

A programmer error is when the programmer isn't using a function correctly - e.g. passing a non-UTF-8 string to a function expecting a valid UTF-8 string such as g_variant_new_string() (if unsure, validate with g_utf8_validate() before calling the function) or passing an invalid D-Bus name to g_bus_own_name() (if unsure, validate with g_dbus_is_name() and g_dbus_is_unique_name() before calling).

Most libraries have undefined behavior in the presence of being used incorrectly - in the GLib case the macros g_return_if_fail() / g_return_val_if_fail() are used, see e.g. the checks in g_variant_new_string() and the checks in g_dbus_own_name().  Additionally, for performance, these checks can be disabled by defining the macro G_DISABLE_CHECKS when building either GLib itself or applications using GLib (but usually aren't). Not all parameters may be checked, however, and the check might not cover all cases because checks can be expensive. Combined with the G_DEBUG flag, it's even easy to trap this in debugger by running the program in an environment where G_DEBUG=fatal-warnings.

Having g_return_if_fail()-style checks is usually a trade-off - for example, GLib didn't initially have the UTF-8 check in g_variant_new_string() - it was only added when it became apparent that a considerable amount of users passed non-UTF-8 data which caused errors in unrelated code that was extremely hard to track down - see the commit message for details. If this cost is unacceptable, the programmer can easily use the g_variant_new_from_data() function passing TRUE as the trusted parameter.

Even with a library doing proper parameter validation (to catch programmer errors early on), if you pass garbage to a function you usually end up with undefined behavior and undefined behavior can mean anything including formatting your hard disk or evaporating all booze in a five-mile radius (oh noz). That's why some libraries simply calls abort() instead of carrying on pretending nothing happened. In general, a C library can never guarantee that it won't blow up no matter what data is passed - for example the user may pass a pointer to invalid data and, boom, SIGSEGV is raised when the library tries to accesses it. Of course the library could try to recover, longjmp(3) style, but since it's a library it can't mess around with process-wide state like signal handlers. Unfortunately, even smart people sometime fail to realize that the caller has a responsibility and instead blames the library instead of its user (for the record, libdbus-1 is fine which is why process 1 is able to use it without any problems). In most cases, problems like these are solved by just throwing documentation at the problem.

To conclude, when it comes to programmer errors, one key take away is that it's a good idea to document exactly what kind of input a function accepts. As the saying goes, "trust is good, control is better", it is also a good idea to verify that the programmer gets it right by using g_return_if_fail() style checks (and possibly provide API that does no such checks). Also, if your code does any kinds of checks, make sure that the functions used for checking (if non-trivial) are public so e.g. language bindings have a chance to validate input before calling the function (see also: notes on errors in libdbus).

A run-time error is e.g. if fopen(3) returns NULL (for example the file to be opened does not exist or the calling process is not privileged to open it), g_socket_client_connect() returns FALSE (the network might not be up) or g_try_malloc() returns NULL (might not have enough address space for a 8GiB array). By definition, run-time errors are recoverable although the code you are using might treat some (like malloc(3) failing) as irrecoverable because handling some run-time errors (such as OOM) would complicate the API not only on the function level (possibly by taking an error parameter), but also by requiring transactional semantics (e.g. rollback) on most data types (see also: write-up on why handling OOM is hard and a good explanation of Linux's overcommit feature).

For simple libraries just using libc's errno is often simplest approach to handling run-time errors (since it's thread-safe and every C programmer knows it) but note that some functions including asprintf(3) does not set errno to ENOMEM if e.g. failing to allocate memory. If you are basing your code on a library like GLib, use its native error type, e.g. GError, for run-time errors. An interesting approach to handling errors is the one used by the cairo 2D graphics library where (non-trivial) object instances track the error state (see e.g. cairo_status() and cairo_device_status()). There are many many other ways to convey run-time errors - as always, the important thing when writing a C library is to be consistent.

Checklist

  • Document valid and invalid value ranges for parameters (if any) and provide facilities to validate parameters (unless trivial) for programmers and language bindings
  • Try to validate incoming parameters at public API boundaries
  • Establish a policy on how to deal with programmer errors (e.g. undefined behavior or abort()).
  • Establish a policy on how to deal with run-time errors (e.g. use errno or GError)
  • Ensure the way you handle run-time errors map to common exception handling systems.

Encapsulation and OO design

While C as programming language does not have built-in support for object-oriented programming lots of C programmers use C that way - in many ways it's almost hard not to. In fact, many C programmers regard the simplicity of C (compared to, say, C++) as a feature insofar that you are not bound to any one object model - for example, the kernel uses various OO techniques and the GLib/GTK+ stack has its own dynamic type system called GType on which the GObject base class (that many classes are derived from) is built.

There's of course a price to pay for defining your own object model - it typically involves more typing (identifiers are longer) and, especially for GObject, involves actual function calls to register properties, add private instance data and so on (example). On the other hand, such a dynamic type system often offer some level of type introspection so it's possible to easy link the property for whether a check-button widget is active with whether an text-entry widget should use password mode using the g_object_bind_property() function (screenshot). Polymorphism in GObject is provided by embedding a virtual method table in the class struct (example) and providing a C functions that uses the function pointer (example) - note that derived types can access the class struct to chain up (example).

One important feature of object-oriented design in C is that it usually promotes encapsulation and data hiding through the use of opaque data types - this is desirable as it allows extending the data type (e.g. adding more properties or methods) without breaking or requiring a recompile existing programs using the library (a future installment will discuss API and ABI and what it means wrt. API design). In an opaque data type, fields that would usually be in the C struct are hidden from the user and instead are made available via a getter (example) and/or setter (example) - additionally, if the object model support properties, the member may also be made available as a property (example) - for example, this is useful for notifying when the property changes.

Of course, not every single data structure need to be a full-blown GObject - for example, in some cases data hiding might not be desirable (sometimes it's awkward to use a C getter function) or maybe it's too slow to do from an inner loop (direct struct access is without a doubt faster). Also, for simple data structures it is sometimes desirable to initialize struct instances directly in the code.

Even when a full-blown object model (like GType and GObject) isn't used, it's never a bad idea to use opaque data structures and getters/setters. As an interesting alternative to this, note that some libraries explicitly allows extending a C structure without considering it an ABI change - while there's no easy way to enforce this (the user may allocate the structure on the stack), at least the library author can always tell the programmer that he shouldn't have done so (which may or may not be useful).

Checklist

  • Establish an object model for your library (if applicable).
  • Hide as many implementation details as is practical without impacting performance
  • Ensure that you can extend your library and types without breaking API or ABI.
  • If possible, build on top of an established and well-understood object system (such as the GLib one)

8 comments:

  1. This is a really nice resource, thanks!

    ReplyDelete
  2. >[errno] (since it's thread-safe and every C programmer knows it)

    It may be safe from other threads due to TLS, but it is still a object with global scope, which results in having to longwindingly save and restore errno across calls to other opaque library functions. The Linux kernel in contrast shows how to do without such a global.

    ReplyDelete
  3. >The Linux kernel in contrast shows how to do without such a global.

    Is there an article on this I can read? Please post if so. It sounds interesting.

    ReplyDelete
  4. Very nice articles - thank you!

    When it comes error handling I have made it a principle that _all_ functions calling abort() should have an accompanying _validate_input() function which the timid user can call first.

    ReplyDelete
  5. One lesson learned on the topic of namespaces for me was to *not* use abbreviations for library names. In my pet project buzztard I initially named a library libbtcore and that happened to later on clash with a libbtcore from ktorrent. I realized that 'bt' could mean lots of things from buzztard, bittorrent, bluetooth and decided to rename my lib to libbuzztard-core. The ktorrent guy did not rename theirs upon request though.

    ReplyDelete
  6. > One benefit of using namespaces in C (one that
    > ironically is not present in a language with proper
    > support for namespaces), is that it's a lot easier to
    > pinpoint what the code is doing by just looking at a
    > fragment of the source code
    actually it is a feature of the language with namespace support. C programmers maybe never heard of it in their whole life, but for completeness its called "abstraction" ;)

    Its useful for instance when you port some code from gtk to clutter. Then all you have to do is change the type of my_container form GtkContainer to ClutterContainer instead of changing all the calls on my_container.
    Of course this only works if you got the abstraction of your own code right in the first place.

    ReplyDelete
  7. @rotjberg: the point is that the code is easily grep(1)'able - also mentally grep'able which is what I tried to convey in the snippet you posted. If you've ever worked on e.g. a large c++ or Java codebase you will know what I'm talking about.

    ReplyDelete
  8. @Phillip Fry (Is there an article on [linux kernel way of errno] I can read?): Not that I know of, but http://jengelh.medozas.de/2011/0705-lkerrno.php for a start.

    ReplyDelete