Generated by spec.rb at: Thu Mar 27 22:15:39 +0100 2008

Why I LOVE Apache httpd hooks and pools

Intro

This article discusses what the Apache httpd dev team did GREAT in Apache 2.0.x in terms of extensibility, and what other people (PHP developers included) can learn from them.

Main matter

So, you're a fairly proficient sysadmin/developer. You're used to patching various software packages to mould them to your needs. It's a tedious process at times, but the results are rewarding.

Sometimes, when you have to patch things like PHP, it feels a bit like swallowing razors. The internals are unreadable, there's almost no documentation and the code style is that of a one-year-old. You manage to patch the code, but you'll never be the same.

Well, that is definitely not the case with Apache httpd. I've never seen a large project written in C that's as easy to understand as Apache. But it doesn't end there! Thanks to the Apache hook framework, it's brain-dead simple to alter even core functionality like vhost lookup. And if that wasn't enough, the whole pool-based memory allocation makes you squeak in ecstasy. (Or is it just me? Who knows.)

Hook framework

OK, let's cut to the chase... what's so special about Apache hooks?

Well, normally when you want to extend something and there's no interface you could use for that change, you're out of luck. Unless you want to create an ugly patch that ties your modification to the main code. And if you want to disable your patch for a while or under certain conditions, you have to create a config variable. Essentially, you have to jump all over the code to get the job done.

While some patchwork is still required when patching Apache 2.0.x, it's so simple you'll love it at first sight.

Let's look at a real-world example. Say you need some control over which variables end up in the subprocess environment, the one that is accessible from CGI scripts and/or PHP's $_SERVER array. After a little bit of digging you find out that it is set up by the ap_add_common_vars call. Unfortunately, there's no way to influence that function in the standard distribution.

Well, Apache hooks to the rescue! Following the Apache hooks documentation, you add one line of code (plus comments!) to the appropriate header:

Index: include/util_script.h
===================================================================
--- include/util_script.h	(revision 3)
+++ include/util_script.h	(working copy)
@@ -135,6 +135,18 @@
 				       int (*getsfunc) (char *, int, void *),
 				       void *getsfunc_data);
 
+/* Hooks */
+
+/**
+ * Gives modules a chance to override variables in subprocess_env table during
+ * ap_add_common_vars call. You should modify "t" and not even think about messing
+ * with r->subprocess_env directly.
+ * @param r The current request
+ * @param t The current subprocess_env table that was built
+ * @ingroup hooks
+ */
+AP_DECLARE_HOOK(void,common_vars_override,(request_rec *r, apr_table_t *t))
+
 #ifdef __cplusplus
 }
 #endif

and a few lines in the function itself:

Index: server/util_script.c
===================================================================
--- server/util_script.c	(revision 3)
+++ server/util_script.c	(working copy)
@@ -43,6 +43,12 @@
 #include <os2.h>
 #endif
 
+APR_HOOK_STRUCT(
+    APR_HOOK_LINK(common_vars_override)
+)
+
+AP_IMPLEMENT_HOOK_VOID(common_vars_override, (request_rec *r, apr_table_t *t), (r, t))
+
 /*
  * Various utility functions which are common to a whole lot of
  * script-type extensions mechanisms, and might as well be gathered
@@ -277,6 +283,8 @@
     if (e != r->subprocess_env) {
       apr_table_overlap(r->subprocess_env, e, APR_OVERLAP_TABLES_SET);
     }
+
+    ap_run_common_vars_override(r, r->subprocess_env);
 }
 
 /* This "cute" little function comes about because the path info on

and you're done patching Apache!

Then, when the time comes that you want to override something in subprocess_env, you just add a callback to your module:

/* override subprocess_env during ap_add_common_vars call */
static void common_vars_override(request_rec *r, apr_table_t *t) /* {{{1 */
{
	DD(r->pool, "%d/%d/%d: common_vars_override begin", r, r->main, getpid());
	apr_table_set(t, "HELLO_WORLD", "from_common_vars_override");
	DD(r->pool, "%d/%d/%d: common_vars_override end", r, r->main, getpid());
}
/* }}}1 */

hook it up in register_hooks (which is mandatory for all modules):

/* register module hooks */
static void register_hooks(apr_pool_t *p) /* {{{1 */
{
	/* [...] */
	ap_hook_common_vars_override(common_vars_override, NULL, NULL, APR_HOOK_FIRST);
	/* [...] */
}
/* }}}1 */

and voilà! It works like a charm.

Now, if you know of ANY easier way to patch C code so that you can pull your modifications out by simply disabling some extension module, I'd love to hear it. But so far NOTHING I've come across beats Apache hook functions.
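
If you're wondering what the hook plumbing boils down to, here is a minimal sketch in plain C. It is NOT how APR implements hooks (the real AP_DECLARE_HOOK / AP_IMPLEMENT_HOOK_VOID macros generate this kind of boilerplate for you), and every name in it (env_table, hook_override, run_override, add_common_vars, hello_override) is made up for illustration. The idea it tries to show: the core keeps a list of callbacks, modules register into it, and the core function runs the list at one well-defined point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS  8
#define MAX_HOOKS 8

/* Hypothetical stand-in for apr_table_t: a tiny fixed-size key/value table. */
typedef struct {
    const char *key[MAX_VARS];
    const char *val[MAX_VARS];
    int count;
} env_table;

/* Signature every registered callback must have. */
typedef void (*override_fn)(env_table *t);

typedef struct {
    override_fn fn;
    int order;   /* lower runs earlier, loosely modelled on APR_HOOK_FIRST..APR_HOOK_LAST */
} hook_entry;

static hook_entry hooks[MAX_HOOKS];
static int hook_count = 0;

/* Set a variable, replacing any existing value for the same key. */
void env_set(env_table *t, const char *key, const char *val)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->key[i], key) == 0) {
            t->val[i] = val;
            return;
        }
    }
    assert(t->count < MAX_VARS);
    t->key[t->count] = key;
    t->val[t->count] = val;
    t->count++;
}

/* Look a variable up; returns NULL when it is not set. */
const char *env_get(const env_table *t, const char *key)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->key[i], key) == 0) {
            return t->val[i];
        }
    }
    return NULL;
}

/* What a module calls from its register_hooks(): insert the callback,
 * keeping the list sorted by order. */
void hook_override(override_fn fn, int order)
{
    assert(hook_count < MAX_HOOKS);
    int i = hook_count++;
    while (i > 0 && hooks[i - 1].order > order) {
        hooks[i] = hooks[i - 1];
        i--;
    }
    hooks[i].fn = fn;
    hooks[i].order = order;
}

/* What the core calls: run every registered callback in order. */
void run_override(env_table *t)
{
    for (int i = 0; i < hook_count; i++) {
        hooks[i].fn(t);
    }
}

/* Toy "core" function with a single hook point at the end. */
void add_common_vars(env_table *t)
{
    env_set(t, "SERVER_SOFTWARE", "toy-httpd");
    run_override(t);
}

/* Toy "module" callback. */
void hello_override(env_table *t)
{
    env_set(t, "HELLO_WORLD", "from_common_vars_override");
}
```

With no callback registered, add_common_vars() behaves exactly as if the hook point wasn't there, which is the whole charm: once a module calls hook_override(hello_override, 0), the extra variable shows up, and when the module is gone, so is the modification.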

I should thank the Apache developers for creating such a wonderful framework, which made my patchwork a breeze. (Btw, did you know that the name "Apache" came from "A PAtCHy sErver"?)

But the great news doesn't end there.

Memory allocation framework

Allow me to briefly disclose one more thing I love about Apache internals, and that's the pool-based memory allocation. If you've ever spent a few hours hunting down a memory leak in your code, you would kill for this feature in other projects.

Essentially, there's a central memory allocation framework that wraps malloc() calls and allows you to ignore free() altogether. That statement should raise some red flags in your head. :-) You might be thinking: "What about free(), man?"

Well, the thing is that Apache is request-oriented, so for the most part you need the memory only until the request (or connection) finishes, and then you don't care about it any longer. Which means you allocate most of your memory from either the "connection" pool or the "request" pool, and Apache takes care of the free()ing for you when the right time comes.

And even if you do need longer-term memory, you just tap into a different keg (eh, I meant "pool") and you get what you ask for.
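
To make the idea concrete, here is a toy pool in plain C. Take it as a sketch under loud assumptions: it is not the APR implementation (as far as I know, real APR pools carve allocations out of larger blocks instead of calling malloc() for each one), and all the names (toy_pool, toy_palloc, toy_pstrdup, toy_pool_destroy, handle_request) are invented for this example. What it tries to show is the pattern: allocate freely from the request's pool, copy whatever has to survive into a longer-lived pool, and release everything else in one go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation; the union keeps the
 * user memory that follows it suitably aligned. */
typedef union block {
    union block *next;
    max_align_t align;
} block;

/* Hypothetical toy pool: just a linked list of allocations. */
typedef struct {
    block *head;
} toy_pool;

toy_pool *toy_pool_create(void)
{
    toy_pool *p = malloc(sizeof *p);
    if (p != NULL) {
        p->head = NULL;
    }
    return p;
}

/* Allocate from the pool; the caller never frees the result directly. */
void *toy_palloc(toy_pool *p, size_t size)
{
    block *b = malloc(sizeof *b + size);
    if (b == NULL) {
        return NULL;
    }
    b->next = p->head;
    p->head = b;
    return b + 1;   /* user memory starts right after the header */
}

/* Duplicate a string into the pool. */
char *toy_pstrdup(toy_pool *p, const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = toy_palloc(p, n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Free every allocation plus the pool itself; returns how many
 * allocations were released. */
size_t toy_pool_destroy(toy_pool *p)
{
    size_t released = 0;
    block *b = p->head;
    while (b != NULL) {
        block *next = b->next;
        free(b);
        b = next;
        released++;
    }
    free(p);
    return released;
}

/* Toy request handler: scratch data lives in a per-request pool, and the
 * one string that must outlive the request is copied into long_lived. */
size_t handle_request(toy_pool *long_lived, const char *uri, const char **kept)
{
    toy_pool *request_pool = toy_pool_create();
    assert(request_pool != NULL);

    char *copy = toy_pstrdup(request_pool, uri);   /* per-request scratch */
    char *buf = toy_palloc(request_pool, 64);      /* more scratch */
    assert(copy != NULL && buf != NULL);

    *kept = toy_pstrdup(long_lived, copy);         /* survives the request */

    /* One call releases all per-request memory; no individual free(). */
    return toy_pool_destroy(request_pool);
}
```

Notice that handle_request() contains no free() for copy or buf; destroying the request pool takes care of both, while the string stored through kept stays valid until the long-lived pool is destroyed.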

And the price you pay for this goodness?

That's nothing compared to the benefits, IMO.

Final summation

Apache httpd is the best piece of software I've come across when it comes to extensibility. If you ever need to write a custom module for a webserver, I encourage you to give Apache a shot. You'll be glad you did.

The "hook" and "pool" frameworks bring back joy to C programming. You no longer need to take care of freeing your allocated memory (unless you want to) AND you can tap into any core functionality of Apache with a patch that's a few lines long. Of course, that also means you get more than enough rope to hang yourself, if you're so inclined. :-)

I'd like to thank all Apache httpd developers for making such a great piece of software. It made my life MUCH easier and for that I'm grateful.

-- Michal S.

PS: If you like the article, consider digging it.