Comments on inactivity log for davidz: Writing a C library, part 1
Feed updated 2024-02-11T03:28:39.770-05:00. Blog author: davidz (http://www.blogger.com/profile/18166813552495508964). 8 comments.

[2011-06-28T09:12:57.285-04:00]

This article had all types of goodies, but you really have to know what you're looking for to find them. To avoid being the stereotypical internet pessimist... here are some real comments I have on the article (without peer review.) ;-)

As a footnote:
I'd advise you to get a peer review on any follow-up parts to this series to make the language flow a little better. Other than the roughness of the article, it did have some good advice.

Thanks,
Chenz

Comment by Crazy (https://www.blogger.com/profile/15205285877309169745)

[2011-06-28T09:12:24.847-04:00]
"Docume...-- Multiple Threads and Processes --<br /><br />"Document if and how the library can be used from multiple threads."<br /><br />Agreed, documentation should include trivial and non-trivial examples and a list of potential pitfalls.<br /><br />"Document what steps need to be taken after fork() or if the library is now unusable."<br /><br />You need to understand that your library is now duplicated but looking at the same file descriptors and streams as another as well as having a different pid. To handle this case gracefully, you'll need advanced locking, or IPC techniques. I agree that it'd be safer to just exec() if possible.<br /><br />"Document if the library is creating private worker threads."<br />And don't forget about child processes.<br /><br />Some extra advised that I've struggled with in the past with my own libraries include:<br />- Another multi-threading pitfall I've found is knowing where/when to create a new reference to memory. In short, you should always create a reference that a thread will use outside and before the thread execution (if applicable.) The issue is when you have multiple threads executing on a structure, at any time a thread can "unref" the memory potentially causing it to be freed. But as long as your current scope has a valid reference, reference counted memory should prevent it from being freed.Crazyhttps://www.blogger.com/profile/15205285877309169745noreply@blogger.comtag:blogger.com,1999:blog-5268847417417953349.post-75520915841041475182011-06-28T09:11:59.634-04:002011-06-28T09:11:59.634-04:00-- Memory management --
"Provide a free() or...-- Memory management --<br /><br />"Provide a free() or unref() function for each type your library introduces."<br /><br />Agreed. But IMHO:<br />- type_init - sets and allocated type to sane values<br />- type_new - allocates a type and inits the new allocation<br />- type_free - deallocates a type<br />- type_ref, type_getref, type_get - creates a new reference (increments reference count)<br />- type_unref, type_release, type_put - removes reference (decrements reference count)<br /><br />Another one I like to include (separate from type_free()) is:<br />void type_destroy(type_t ** t)<br />Usually when freeing memory, you should always<br /><br /> free(ptr);<br /> ptr = NULL;<br /><br />I like to simplify that to a one liner that looks like:<br /><br /> type_destroy(&ptr);<br /><br />type_destroy exists to decrement reference count, deallocate memory if reference count is zero, and sets pointer to NULL so subsequent "if (ptr)" checks operate as intended.<br /><br />"Ensure that memory handling consistent across your library."<br /><br />Memory management should be complete and well defined. As always, it should be clearly described in documentation with trivial and non-trivial examples.<br /><br />"Note that multi-threading may impose certain kinds of API."<br /><br />This should be inherit in the programmers skill set, but I agree. OO programming with instanced variables/references and locking mechanisms lend toward a more friendly multi-threaded experience. Static variables, global variables, lend to a less thread safe experience. <br /><br />My rule of thumb for this is usually:<br />If there no non-constant variables defined globally, the library is _capable_ of being thread safe. 
The effort to make the library thread safe depends on the locking or concurrency logic overhead required.

"Make sure the documentation is clear on how memory is managed."

Documentation should always have trivial AND non-trivial examples and a list of potential pitfalls.

"Abort on OOM unless there are very good reasons for handling OOM."

Disagree. There may be valid exceptions to this, and one involves allocations that require _HUGE_ amounts of memory. If such an allocation fails, it is likely caused by user input and not by the system being low on memory. A prime example is loading a >4GB file into memory on a 32-bit system.

Some extra advice (stuff I've struggled with in the past with my own libraries):
- Minimize the usage of "user-defined" void* types.
- When doing memory management, don't forget to think about whether the memory your pointers point to lives on the heap or on the stack. Nastiness can occur if you free stack memory or don't free heap memory.
- C does not do reference counting, so having a reference counting mechanism for C in a multi-threaded environment should be an absolute requirement. Realistically, you won't always be able to track when to free an allocated type without reference counting. This should especially be considered when using lists or trees to store references to memory.

Comment by Crazy (https://www.blogger.com/profile/15205285877309169745)

[2011-06-28T09:11:15.602-04:00]
"A...-- Library initialization and shutdown --<br /><br />"Avoid init() / shutdown() routines - if you can’t avoid them, do make sure they are idempotent, thread-safe and reference-counted."<br /><br />This is not clear enough. Global init and shutdown should be avoided or treated as non-thread/fork safe. A structure that holds all the state information for a library should have corresponding init() and shutdown() calls.<br /><br />"Use environment variables for library initialization parameters, not argc and argv."<br /><br />At a low level, a library should have all parameters set with API calls. One level up would include a call such as get_env_opts() to grab environement variable settings and get_cmd_opts(char *) to parse a well defined command line argument list. Then the library user or executable can decide how to handle the library configuration.<br /><br />"You can easily have two unrelated library users in the same process - often without the main application knowing about the library at all. Make sure your library can handle that."<br /><br />Uh... this makes little to no sense. But I'll supplement it with... make sure to clearly document the capabilities and behavior of your library in a multi-threaded or multi-process environment. Simply making something "thread-safe" is pointless without context and usage guidance.<br /><br />"Avoid unsafe API like atexit(3) and, if portability is a concern, unportable constructs like library constructors and destructors (e.g. gcc’s __attribute__ ((constructor)) and __attribute__ ((destructor)))."<br /><br />OK... so to put this more simple, if portability (of source code) is of high concern, know your dependencies! Highly portable code should by default avoid compiler dependent extensions, and non-POSIX functions. 
The exception is allowing for tweaked code to be enabled with CPP (pre-processor macros).

Comment by Crazy (https://www.blogger.com/profile/15205285877309169745)

[2011-06-28T08:39:58.990-04:00]

>"Don’t reinvent basic data-types (unless performance is a concern)."

glib violates this very simple rule right from the start. gpointer is just as bad as PVOID. “Just fuckin write void *.”

>Avoid init() / shutdown() routines [...] Avoid unsafe API like atexit(3) and, if portability is a concern, unportable constructs like library constructors and destructors (e.g. gcc’s __attribute__ ((constructor)) and __attribute__ ((destructor))).

An explicit destructor seems better. libgcrypt has a track record with all those issues (see http://comments.gmane.org/gmane.comp.encryption.gpg.libgcrypt.devel/2126), where it seems that explicit init/exit (refcounted of course) is the superior solution.

>if you don’t call them from where they are used, you are possible forcing the application to call a init() function in main(), just because some library deep down in the dependency chain is using the library without initializing it.

Fix the intermediate component in question, not main.

Comment by j.eng (https://www.blogger.com/profile/09615346206411435754)

[2011-06-28T07:51:51.907-04:00]

Hey Aater, just to be clear, the suggestion was only to use atomic ops for reference counting, not for any kind of add. The typical use is like this

 http://git.gnome.org/browse/glib/tree/gio/gfileattribute.c?id=2.29.8#n923

using g_atomic_int_inc() and g_atomic_int_dec_and_test() for atomic operations.
See their docs here

 http://developer.gnome.org/glib/unstable/glib-Atomic-Operations.html#glib-Atomic-Operations.description

in particular the implementation details. Now, implementation-wise, with recent GLib libraries and recent gcc compilers, these are implemented as macros that expand using gcc built-ins, see

 http://git.gnome.org/browse/glib/tree/glib/gatomic.h?id=2.29.8#n72

which IIRC results in just a couple of assembler instructions on a modern CPU (such as x86_64) and falls back to a slower path otherwise. See http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html#Atomic-Builtins and http://gcc.gnu.org/wiki/Atomic

Comment by davidz (https://www.blogger.com/profile/18166813552495508964)

[2011-06-28T00:11:46.116-04:00]

I think that this is a very tricky requirement. If you make all adds atomic, it's a large amount of overhead. I don't see how you can prevent their modification by multiple threads without adding a critical section or severely limiting functionality. Perhaps I am misunderstanding your point. Can you please clarify?

Aater (i.e., the futurechips guy)

Comment by Aater Suleman (https://www.blogger.com/profile/07068003544808755975)

[2011-06-27T18:49:37.837-04:00]

If your library does do proper synchronization for multiple threads, but you don't want to pay the overhead of pthreads for single-threaded programs, or you want to preserve portability to platforms without pthreads, try libpthread-stubs.

Comment by Josh Triplett (https://www.blogger.com/profile/02593171817329248190)
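
For readers following the reference-counting discussion in this thread, here is a minimal sketch of how the pieces might fit together: an atomic increment for ref, an atomic decrement-and-test for unref, and a destroy helper that also NULLs the caller's pointer. It is only an illustration under assumptions. The widget_* names and the widget_t type are hypothetical, and it uses C11 <stdatomic.h> rather than the GLib g_atomic_int_* macros mentioned in the comments, so that it compiles without GLib.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted type, following the type_* naming
 * convention suggested in the comments (not taken from any real library). */
typedef struct widget {
    atomic_int refcount;  /* starts at 1; that reference belongs to the creator */
    int value;
} widget_t;

/* Allocate and initialise a widget; returns NULL if malloc fails. */
widget_t *widget_new(int value)
{
    widget_t *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    atomic_init(&w->refcount, 1);
    w->value = value;
    return w;
}

/* Take an additional reference (atomic increment). */
widget_t *widget_ref(widget_t *w)
{
    atomic_fetch_add(&w->refcount, 1);
    return w;
}

/* Drop one reference; free the widget when the last one goes away.
 * Returns 1 if the widget was freed, 0 otherwise. */
int widget_unref(widget_t *w)
{
    /* atomic_fetch_sub returns the value held before the subtraction,
     * so a previous value of 1 means this was the last reference. */
    if (atomic_fetch_sub(&w->refcount, 1) == 1) {
        free(w);
        return 1;
    }
    return 0;
}

/* Drop the caller's reference and NULL the caller's pointer, so later
 * "if (ptr)" checks behave as intended. */
void widget_destroy(widget_t **wp)
{
    if (wp == NULL || *wp == NULL)
        return;
    widget_unref(*wp);
    *wp = NULL;
}

/* Small usage walk-through; returns 0 when everything behaved as expected. */
int widget_demo(void)
{
    widget_t *a = widget_new(42);
    if (a == NULL)
        return -1;

    /* Take the second reference before handing the object to another
     * user (for example, before starting a thread that will use it). */
    widget_t *b = widget_ref(a);

    widget_destroy(&a);           /* count 2 -> 1, a is now NULL */
    assert(a == NULL);

    int v = b->value;             /* still valid: b holds a reference */
    int freed = widget_unref(b);  /* count 1 -> 0, memory released */

    return (v == 42 && freed == 1) ? 0 : 1;
}
```

Only the counter itself is atomic here; any other mutable field shared between threads would still need its own locking, which is the distinction davidz draws in his reply to Aater.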