Wednesday, July 6, 2011

Writing a C library, intro, conclusion and errata

This is a series of blog-posts about best practices for writing C libraries. See below for each part and the topics covered.

Table of contents

The entire series about best practices for writing C libraries covered 15 topics and was written in five parts posted over the course of approximately one week. Feel free to hotlink directly to each topic, but please keep in mind that the content (like any other content on this blog) is copyrighted by its author and may not be reproduced without his consent (if you are friendly towards free software, like e.g. LWN, just ask and I will probably give you permission):

Topics not covered

Some topics relevant for writing a C library aren't (yet?) covered in this series, either because I'm not an expert on the topic, because the topic is still in development, or for other reasons:
  • Networking
    You would think IP networking is easy, but it really isn't, and the low-level APIs that are part of POSIX (e.g. BSD Sockets) are not all that helpful since they only do part of what you need. Difficult things here include name resolution, service resolution, proxy server handling, dual-stack addressing and transport security (including handling certificates for authentication).

    If you are using modern GLib networking primitives (such as GSocketClient or GSocketService), all of these problems are taken care of for you without much work on your part (a minimal sketch appears after this list); if not, well, talking to people (or at least reading the blogs of people) such as Dan Winship, Dan Williams or Lennart Poettering is probably your best bet.

  • Build systems
    This is a topic that continues to make me sad, so I decided not to really cover it in the series; the best guidance I can give is to just copy/paste whatever other projects are doing - see e.g. the GLib source tree for how to nicely integrate unit testing (see Makefile.decl) and documentation (see the docs/reference sub-directories) into the build system.

    Ideally we would have a single great IDE for developing Linux libraries and applications (integrating testing, documentation, distribution, package building and so on - see e.g. Sami's libhover video), but even if we did, most existing Linux programmers probably wouldn't use it because they are so used to e.g. emacs or vi (if you build it, they will come?). There are a couple of initiatives in this area, including Eclipse CDT, Anjuta, KDevelop and MonoDevelop.

  • Bundling libraries/resources
    The traditional way of distributing applications on Linux is through so-called Linux distributions - the four most well-known being Debian, Fedora, openSUSE and Ubuntu (in alphabetical order!). These guys, basically, take your source code, compile it against some version of the other software it depends on (usually a different version than the one you, the developer, used), and then ship binary packages to users using dynamic linking.

    There's a couple of problems with this legacy model of distributing software (this list is not exhaustive):
      a) it can take one or two distribution release cycles (6-12 months) before your software is available to end users;
      b) user X can't give a copy of the software to user Y - he can only tell him where to get it (and it might not be available on user Y's distro);
      c) it's all a hodgepodge of version skew - the final product your users are running is, most likely, using different versions of different libraries, so who knows if it works;
      d) the software is sometimes changed in ways that you, the original author, weren't expecting or don't approve of (for example, by removing credits);
      e) the distribution might not forward you bug reports, or may forward you bug reports that are caused by downstream patches;
      f) there's peer pressure not to depend on too-new libraries, because distributions want to ship your software in old versions of their OS - for example, Mozilla wants to be able to run on a system with just GTK+ 2.8 installed (and hence won't use features from GTK+ 2.10 or later except via dlopen()-techniques, sketched after this list), and similarly for e.g. Google Chrome (maybe with a newer GTK+ version though).

    These problems are virtually unknown to developers on other platforms such as Microsoft Windows, Mac OS X or even some of the smartphone platforms such as iOS or Android - they all have fancy tools that bundle things up nicely so the developers won't have to worry about such things.

    There's a couple of interesting initiatives in this area - see e.g. bockbuild, glick and the proposal to add a resource system to GLib. Note that it's very, very hard to do this properly since it depends not only on fixing a lot of libraries so they are relocatable (see the last sketch after this list), but also on identifying exactly what kind of run-time requirements each library in question has. The latter includes the kernel/udev version, the libc version (unless bundled or statically linked), the X11 server version (and that of its extensions, e.g. RENDER), the presence of one or more message buses and so on. With modern techniques such as direct rendering this becomes even harder if you want to take advantage of hardware acceleration, since you must assume that the host OS provides recent enough versions of e.g. the OpenGL or cairo libraries (you don't want to bundle hardware drivers). And even after all this, you still need to deal with how each distribution patches core components. In some circumstances it might end up being easier to just ship a kernel+runtime along with the application, virtualized.
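
To make the networking point above a bit more concrete, here is a minimal sketch (not part of the series proper) of the GSocketClient approach; the host name and port are just placeholders:

    #include <gio/gio.h>

    int
    main (int argc, char *argv[])
    {
      GSocketClient *client;
      GSocketConnection *connection;
      GError *error = NULL;

      /* (on GLib older than 2.36 you would need g_type_init() here first) */
      client = g_socket_client_new ();

      /* ask GIO to negotiate TLS - certificate validation is done for us */
      g_socket_client_set_tls (client, TRUE);

      /* this one call covers name resolution, proxy handling and
       * dual-stack (IPv4/IPv6) connection attempts */
      connection = g_socket_client_connect_to_host (client,
                                                    "www.example.com", 443,
                                                    NULL /* cancellable */,
                                                    &error);
      if (connection == NULL)
        {
          g_printerr ("Failed to connect: %s\n", error->message);
          g_error_free (error);
          return 1;
        }

      g_print ("Connected\n");
      g_object_unref (connection);
      g_object_unref (client);
      return 0;
    }

Build against gio-2.0 (e.g. via pkg-config) and compare this to the amount of code you would need to get the same behavior out of raw BSD sockets.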
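
The dlopen()-technique mentioned in item f) is also worth a small illustration. This is only a sketch of the general idea, not Mozilla's actual code: probe for a symbol that only exists in the newer library version (gtk_status_icon_new() appeared in GTK+ 2.10) and fall back gracefully if it isn't there:

    #include <dlfcn.h>
    #include <stdio.h>

    int
    main (void)
    {
      /* dlopen(NULL, ...) searches the symbols already mapped into this
       * process - an application linked against GTK+ 2.8 would already
       * have the library loaded */
      void *self = dlopen (NULL, RTLD_LAZY);
      void *sym = dlsym (self, "gtk_status_icon_new");

      if (sym != NULL)
        printf ("GTK+ >= 2.10 - status icons are available\n");
      else
        printf ("GTK+ 2.8 - falling back to the old behavior\n");

      dlclose (self);
      return 0;
    }

(Link with -ldl; in real code you would of course cast sym to the proper function pointer type before calling it.)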
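
And as for making libraries relocatable: the usual trick is to compute resource paths at run-time, relative to where the shared object was actually loaded from, instead of baking in the configure-time prefix. A rough sketch - foo_get_data_dir() is a hypothetical helper and the lib-to-share mapping is just an assumed install layout (dladdr() is a glibc extension, hence _GNU_SOURCE, and you need to link with -ldl):

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <libgen.h>
    #include <limits.h>
    #include <stdio.h>
    #include <string.h>

    /* hypothetical helper: derive the data directory from the path the
     * object was loaded from, e.g. $PREFIX/lib/libfoo.so -> $PREFIX/share/foo,
     * instead of hard-coding /usr/share/foo at configure time */
    static const char *
    foo_get_data_dir (void)
    {
      static char data_dir[PATH_MAX] = "";
      Dl_info info;

      if (data_dir[0] == '\0')
        {
          /* dladdr() reports which object a given symbol lives in */
          if (dladdr ((void *) foo_get_data_dir, &info) != 0 &&
              info.dli_fname != NULL)
            {
              char so_path[PATH_MAX];

              strncpy (so_path, info.dli_fname, sizeof so_path - 1);
              so_path[sizeof so_path - 1] = '\0';
              snprintf (data_dir, sizeof data_dir, "%s/../share/foo",
                        dirname (so_path));
            }
          else
            {
              /* fall back to the compiled-in default */
              strncpy (data_dir, "/usr/share/foo", sizeof data_dir - 1);
            }
        }
      return data_dir;
    }

    int
    main (void)
    {
      printf ("data dir: %s\n", foo_get_data_dir ());
      return 0;
    }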

The series is set up so that it can be extended at a later point - if there is demand for one or more popular topics about writing a C library, I might write another blog entry and add it to this page, as this page is considered the canonical location for the entire series.

Errata

Please send me feedback and I will fix up the section in question and credit you here (I already have a couple of corrections lined up that I will add later).

10 comments:

  1. Amazing series, it was very interesting to read. Thanks. :)

  2. As for the build system, there seems to be a consensus on CMake. KDE uses it, as do some scientific packages such as VTK, and MySQL. I also know some Windows-centric closed-source applications that use it. It might not be beautiful, but it does the job while remaining readable.

    Regarding shared libraries on Linux, I would strongly disagree with you. Yes, you have to plan more carefully which libraries are available on your deployment system, but you get the dependency handling for free.
    Basically this is what allows you to depend on whatever you want when you have a package manager, while each dependency is a pain if you try to bundle it with your app.
    (The package manager also fixes bugs in the libraries you depend on by updating the .so.)

  3. rojtberg: I don't think there's any consensus on CMake just because those projects use it (and I personally don't think CMake is a step up from autotools ... again: YMMV).

    As for shared libraries and bundling, there's really no consensus here... but from working in this problem space for the last 7+ years (for a prominent Linux distro vendor), I have found that treating e.g. glibc/kernel/udev and e.g. firefox/chrome/gedit the same way when it comes to software distribution is not useful at all (the former group are core OS elements, the latter group are apps). There is definitely room for improvement here.

  4. > the package manager also fixes bugs in the libraries you depend on through updating

    Btw, this is a fallacy - it might just as well introduce a bug in your app by fixing the very library bug the app depends on. The point I was trying to make in the blog entry is that the application developer QAs the entire app against a specific set of libraries with specific versions. All that QA effort goes the way of the dodo if you start using other versions of libraries.

  5. @davidz: You really think Open Source app developers QA their apps? In my younger days as an app dev, I just made releases and expected the end users to report bugs from building it on their distros, and it worked surprisingly well. But I guess it's not the same for Firefox or other apps that provide binary builds.

  6. @ocrete: Yeah, I think projects like Firefox and Google Chrome are excellent examples of good applications - very high quality apps with a good amount of QA including beta- and devel-streams.

    In fact, we should encourage other apps to emulate this behavior - including giving them tools à la bockbuild for creating bundles for easy deployment (both Google Chrome and Firefox bundle a lot of libraries too). I wish GNOME would ship an SDK that easily enabled something like this (tarballs != SDK).

  7. We're working to fix the build and packaging problem with Apters (early work in progress at the moment). We've designed it specifically to solve several of the issues you mention, such as controlling version skew, and making it easier for upstream to provide useful builds without making life difficult for downstream distributions. We'd greatly appreciate any input you might have.

  8. Argh, the link seems to have gotten eaten. That should have said: "We're working to fix the build and packaging problem with Apters http://apters.com/ (early work in progress at the moment)."

  9. Thanks for this very informative series - really a nice check list.

    Also, the "Bundling libraries/resources" section is a pearl on its own; good to have some important facts collected here.

  10. I miss the filesystem label from the previous version for the selected device.
