Since I did mention these recently, I might as well lay out in blog form whatever notes I have gathered so far, for future reference. Comments are more than welcome, of course, although as mentioned in previous instances, I doubt that this will be useful to anybody but myself and the proverbial alien anthropologist.
For the sake of both exercise and rigour, let us take some space to define the item under examination: a device driver is a piece of software, usually (but not necessarily!) delivered as part of an operating system, that mediates interaction between a user and a so-called "device". Said device may be a logical entity, such as a database (say, a FAT32 partition[1]), or a physical object such as a peripheral (say, a communication bus or a disk). These examples were not chosen arbitrarily; rather, we use them to illustrate that drivers interact with one another: the system requires a communication bus to interact with the disk[2]; and it requires the disk in order to provide persistent storage for the database. And furthermore, the system requires some sort of user-facing application to expose the database to the human user -- but let us not be fooled, the driver, although supposedly a mechanism, is in fact an application for its user, i.e. the aforementioned user-facing application.
To sum this up: a driver is just another fancy name for an application that drives some underlying mechanism. When applied to Unix systems, the architectural specifics of these "driver applications" only serve to (irritatingly) drive us down a rabbit hole. While a few select Unix flavours[3] treat all applications the same, most Unix systems enforce a more or less arbitrary divide between a so-called "kernel space" and "user space". The latter (supposedly) runs user applications, while the former runs drivers, although in all fairness, the arbitrariness of this divide leads to situations where device drivers such as filesystems or GPU drivers end up running in user space[4] and, why not, vice versa, although no examples come to mind.
While in the Linux source tree device drivers reside in the "drivers" directory, in OpenBSD they may be found in sys/dev[5]. Confusingly enough, however, basic definitions pertaining to device drivers do not reside there: since (some) drivers may also be accessed directly by user space applications[6], and the user-kernel interface is hosted by sys/sys (which consists entirely of header files), device driver definitions are placed in sys/sys/device.h. A quick glance reveals this classification:
/*
 * Minimal device structures.
 * Note that all ``system'' device types are listed here.
 */
enum devclass {
	DV_DULL,		/* generic, no special info */
	DV_CPU,			/* CPU (carries resource utilization) */
	DV_DISK,		/* disk drive (label, etc) */
	DV_IFNET,		/* network interface */
	DV_TAPE,		/* tape device */
	DV_TTY			/* serial line interface (???) */
};
So by and large OpenBSD supports the following types of devices: generic[7], CPU[8], disk, networking, tape[9] and serial/console. Devices are modelled through the following data structure:
struct device {
	enum	devclass dv_class;	/* this device's classification */
	TAILQ_ENTRY(device) dv_list;	/* entry on list of all devices */
	struct	cfdata *dv_cfdata;	/* config data that found us */
	int	dv_unit;		/* device unit number */
	char	dv_xname[16];		/* external name (name + unit) */
	struct	device *dv_parent;	/* pointer to parent device */
	int	dv_flags;		/* misc. flags; see below */
	int	dv_ref;			/* ref count */
};
The less obvious fields of struct device are dv_cfdata, dv_parent and dv_flags. There is indeed only one (generic) flag, DVF_ACTIVE, which signals device activation. The parent field, on the other hand, immediately brings to mind a hierarchical object-oriented model: devices do not live in a flat world, rather each one hangs off a parent device. Finally, cfdata is a record of its own:
/*
 * Configuration data (i.e., data placed in ioconf.c).
 */
struct cfdata {
	const struct	cfattach *cf_attach;	/* config attachment */
	struct	cfdriver *cf_driver;	/* config driver */
	short	cf_unit;		/* unit number */
	short	cf_fstate;		/* finding state (below) */
	long	*cf_loc;		/* locators (machine dependent) */
	int	cf_flags;		/* flags from config */
	short	*cf_parents;		/* potential parents */
	int	cf_locnames;		/* start of names */
	short	cf_starunit1;		/* 1st usable unit number by STAR */
};
I'm not sure how to describe the whole thing in a single (plain English) sentence, but let's try: any OpenBSD device has a configuration that combines a bunch of elements, the foremost among them being a so-called "config attachment" and a "config driver". In short, the former specifies under which conditions a driver may be "attached" to its underlying device (i.e. placed into operation), while the latter specifies information about the device instances that are driven. In more detail:
struct cfattach {
	size_t	ca_devsize;		/* size of dev data (for malloc) */
	cfmatch_t ca_match;		/* returns a match level */
	void	(*ca_attach)(struct device *, struct device *, void *);
	int	(*ca_detach)(struct device *, int);
	int	(*ca_activate)(struct device *, int);
};

struct cfdriver {
	void	**cd_devs;		/* devices found */
	char	*cd_name;		/* device name */
	enum	devclass cd_class;	/* device classification */
	int	cd_mode;		/* device type subclassification */
	int	cd_ndevs;		/* size of cd_devs array */
};
At this point I'm not sure how one would go about writing a driver based on this information alone. In addition to this, writing a Unix -- or, in particular, a BSD, or, in particular, an OpenBSD -- driver requires access to priors which are only available to the OS designer. In other words, if one isn't privileged enough to have been there when the original Unix was created, or otherwise to have learned from one of those folks, one is advised to (re)view the synthetic materials provided by Tanenbaum and Silberschatz. Intuitively, we may state that: on one hand, the user[10] has some degree of control over what applications he or she wishes to load in their system, be they of the "userland" or the "kernelland" variety; while on the other hand, not all hardware is created equal, and inasmuch as the kernel can "decide" things, it must be able to load a driver, or otherwise refuse to load it, when faced with this decision.
I guess that at this point the best way to go about this is by means of example. I have arbitrarily selected the generic General-Purpose Input/Output (GPIO) driver in sys/dev/gpio/gpio.c for this purpose:
struct gpio_softc {
	struct device sc_dev;
	gpio_chipset_tag_t sc_gc;	/* GPIO controller */
	gpio_pin_t *sc_pins;		/* pins array */
	int sc_npins;			/* number of pins */
	int sc_opened;
	LIST_HEAD(, gpio_dev) sc_devs;	/* devices */
	LIST_HEAD(, gpio_name) sc_names;	/* named pins */
};

int	gpio_match(struct device *, void *, void *);
int	gpio_submatch(struct device *, void *, void *);
void	gpio_attach(struct device *, struct device *, void *);
int	gpio_detach(struct device *, int);
int	gpio_search(struct device *, void *, void *);
int	gpio_print(void *, const char *);
int	gpio_pinbyname(struct gpio_softc *, char *gp_name);
int	gpio_ioctl(struct gpio_softc *, u_long, caddr_t, int);

const struct cfattach gpio_ca = {
	sizeof (struct gpio_softc),
	gpio_match,
	gpio_attach,
	gpio_detach
};

struct cfdriver gpio_cd = {
	NULL, "gpio", DV_DULL
};
I shall be brief regarding GPIO: some machines[11] expose a set of programmable pins that the user may employ for whatever purpose he wishes, but in general in order to control some peripheral piece of equipment such as an audio processor or a robotic arm. While there is no specific standard defining this "GPIO" thing, the implementations all pretty much work the same, in that they eat or produce a logic level or a digital signal or in any case, some data or the other. The hardware-specific implementation details are covered in separate device drivers[12] which sit below this "generic" driver.
A few observations on the generic GPIO driver.
First, the gpio_softc structure encompasses sc_dev. This is yet again reminiscent of object-orientation, in that the GPIO device is a subclass and a superset of the generic device. This idiom, or design pattern if you will, is not accidental.
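To make the idiom concrete, here is a minimal sketch of the "softc embeds the device" pattern. It is not OpenBSD code: toy_device, toy_softc and the two helper functions are hypothetical stand-ins that I made up for illustration, and the only assumption carried over from gpio.c is that the generic device is the first member of the driver's structure.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for the kernel's struct device. */
struct toy_device {
	char	dv_xname[16];	/* external name (name + unit) */
	int	dv_unit;	/* device unit number */
};

/* Hypothetical driver state; the generic device must come first. */
struct toy_softc {
	struct toy_device sc_dev;	/* the "base class", at offset 0 */
	int	sc_npins;		/* driver-specific data */
};

/* Upcast: any softc can be handed around as a generic device. */
struct toy_device *
toy_as_device(struct toy_softc *sc)
{
	return &sc->sc_dev;
}

/* Downcast: only valid because sc_dev sits at offset 0 of the softc. */
struct toy_softc *
toy_as_softc(struct toy_device *dev)
{
	assert(offsetof(struct toy_softc, sc_dev) == 0);
	return (struct toy_softc *)dev;
}
```

Since the embedded member is the first one, a pointer to the softc and a pointer to its device are the same address, which is what lets generic code pass plain device pointers around while the driver casts them back to its own type.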
Second, and furthermore, the gpio_{match,attach,detach} functions all receive as argument a device structure, which puts them in their place as methods in this object-oriented model -- in particular, in gpio_attach (or in any other "attach" method) the first device structure is the parent, while the second one is the "self", i.e. an equivalent of the C++ "this" keyword. At this point the reader is encouraged to look further into the definitions of these methods.
Third, the aforementioned methods are encapsulated in a cfattach object. Intuitively, when a driver is loaded, "match" is first called, and if it "succeeds" (whatever that means), then "attach" loads the driver, and subsequently "detach" unloads the very same. These methods are called in different places throughout the kernel code -- the core is in sys/kern/subr_autoconf.c if you're curious -- I won't go into details.
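The match/attach/detach sequence can be sketched in plain userland C. This is a toy model under stated assumptions, not the logic in subr_autoconf.c: every toy_* name is hypothetical, match is reduced to a yes/no answer (the real field is documented as returning a match level), and calloc/free stand in for whatever the kernel does with ca_devsize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Cut-down, hypothetical stand-ins for struct device and struct cfattach. */
struct toy_device {
	struct toy_device *dv_parent;	/* pointer to parent device */
	int	dv_active;		/* set once attach has run */
};

struct toy_cfattach {
	size_t	ca_devsize;		/* size of dev data (for malloc) */
	int	(*ca_match)(struct toy_device *, void *, void *);
	void	(*ca_attach)(struct toy_device *, struct toy_device *, void *);
	int	(*ca_detach)(struct toy_device *, int);
};

/* A toy driver whose softc embeds the generic device at offset 0. */
struct toy_gpio_softc {
	struct toy_device sc_dev;
	int	sc_npins;
};

static int
toy_gpio_match(struct toy_device *parent, void *match, void *aux)
{
	(void)parent;
	(void)match;
	return aux != NULL;	/* "succeeds" only if the bus gave us data */
}

static void
toy_gpio_attach(struct toy_device *parent, struct toy_device *self, void *aux)
{
	struct toy_gpio_softc *sc = (struct toy_gpio_softc *)self;

	(void)parent;
	sc->sc_npins = *(int *)aux;	/* aux carries the pin count here */
	self->dv_active = 1;
}

static int
toy_gpio_detach(struct toy_device *self, int flags)
{
	(void)flags;
	self->dv_active = 0;
	return 0;
}

const struct toy_cfattach toy_gpio_ca = {
	sizeof(struct toy_gpio_softc),
	toy_gpio_match,
	toy_gpio_attach,
	toy_gpio_detach
};

/* Hypothetical helper: ask the driver, then allocate and attach. */
struct toy_device *
toy_config_found(const struct toy_cfattach *ca, struct toy_device *parent,
    void *aux)
{
	struct toy_device *self;

	assert(ca->ca_devsize >= sizeof(struct toy_device));
	if (ca->ca_match(parent, NULL, aux) == 0)
		return NULL;		/* the driver declined */
	self = calloc(1, ca->ca_devsize);
	if (self == NULL)
		return NULL;
	self->dv_parent = parent;
	ca->ca_attach(parent, self, aux);
	return self;
}

/* Hypothetical helper: let the driver clean up, then release the memory. */
int
toy_config_detach(const struct toy_cfattach *ca, struct toy_device *self,
    int flags)
{
	int error = 0;

	if (ca->ca_detach != NULL)
		error = ca->ca_detach(self, flags);
	if (error == 0)
		free(self);
	return error;
}
```

Note how ca_devsize earns its place in the structure: the generic code never needs to know about toy_gpio_softc, it merely allocates as many bytes as the driver asked for and hands the result over as a plain device.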
Fourth and finally, a cfdriver object is defined, which comes with some, but not many pre-populated fields. In particular the author provides cd_name and cd_class -- in this particular case, "gpio" and DV_DULL, that is, the class denoting a generic driver.
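As for the fields that are not pre-populated: standard C zero-initializes whatever a brace initializer leaves out, so cd_mode and cd_ndevs start at 0. The sketch below copies the two definitions quoted earlier so that it compiles on its own; gpio_cd_named and cfdriver_equal are hypothetical additions of mine, the former spelling the same initializer with C99 designated initializers (which is not how gpio.c writes it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum devclass { DV_DULL, DV_CPU, DV_DISK, DV_IFNET, DV_TAPE, DV_TTY };

struct cfdriver {
	void	**cd_devs;		/* devices found */
	char	*cd_name;		/* device name */
	enum	devclass cd_class;	/* device classification */
	int	cd_mode;		/* device type subclassification */
	int	cd_ndevs;		/* size of cd_devs array */
};

/* Positional form, as quoted from gpio.c: the last two fields are omitted. */
struct cfdriver gpio_cd = {
	NULL, "gpio", DV_DULL
};

/* Hypothetical equivalent using designated initializers. */
struct cfdriver gpio_cd_named = {
	.cd_devs = NULL,
	.cd_name = "gpio",
	.cd_class = DV_DULL
};

/* Hypothetical helper: field-by-field comparison of two cfdriver records. */
int
cfdriver_equal(const struct cfdriver *a, const struct cfdriver *b)
{
	assert(a != NULL && b != NULL);
	return a->cd_devs == b->cd_devs &&
	    strcmp(a->cd_name, b->cd_name) == 0 &&
	    a->cd_class == b->cd_class &&
	    a->cd_mode == b->cd_mode &&
	    a->cd_ndevs == b->cd_ndevs;
}
```

Given the field comments, cd_devs and cd_ndevs are presumably filled in later, as device instances are found, which would explain why the driver author only supplies the name and the class.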
Other than the intuition that "attach" is a sort of "module init" and "detach" is a sort of "module exit", I'll skip any parallels with Linux. Long story short, these are the basic elements required to add a driver in OpenBSD. It isn't as gruesome as some may think, so who knows, I may go a tad deeper in the future.
-
What is the so-called filesystem other than a hierarchical object-oriented database, where the leaves are most commonly regular files and the intermediate nodes are directories, along with more perverted additions such as symbolic links? As for the object-oriented aspect: well, each file has various attributes -- such as an owner or the access time -- does it not? ↩
-
Oftentimes it also requires a block I/O controller, but let us not get ahead of ourselves. ↩
-
Tanenbaum's MINIX, for example. ↩
-
Most often for ideological, e.g. "licensing" reasons.
I am indeed using "ideological" pejoratively, for reasons mentioned in the previous article in this series, or elsewhere, or elsewhere. ↩
-
More precisely: in /usr/src/sys/dev ↩
-
For example, a terminal emulator using a console driver. ↩
-
It's not precisely clear what this means, right? Well, a grep for DV_DULL reveals quite a few drivers in this class. I'll leave the exploration as an exercise to the reader. ↩
-
This doesn't seem to be used anywhere. ↩
-
In case you're wondering, tapes are heavily used in some environments nowadays. Just ask the CERN folks.
The weirder part -- and mayhaps some OpenBSD aficionado can illuminate me here -- is that I don't really see the difference between "tape" and "disk" drivers. You can lseek in both cases, right? Even though seek operations make no sense whatsoever in the context of SSDs. So why don't SSDs have their own class?
Before you say anything: fuck "historical reasons" with a hot poker, this is supposed to make sense and you can leave history lessons in the comments, or in the changelog, or the versioning tree or wherever else, only not in a file which is supposed to give the reader crucial information about the current system organization.
So unless someone changes my mind, I'll surmise that this classification is not very well thought out. ↩
-
Bear in mind that here by "user" I mean something completely different from the current-day meaning, as in e.g. "Android user". I'm sorry to be the one who (yet again) breaks it to the reader, but if an operating system doesn't provide the user with complete control over what applications he runs, then it's not much of a "general-purpose" system. It may make a perfect Nintendo-style contraption with which there is absolutely nothing wrong, but that Nintendo will be a gaming console or what have you, not a general-purpose computer.
In other sad news, this also means that the actual users of your smartphone are Apple or Google. Again, nothing particularly problematic about that, at least not if this is your sort of thing. But this is not the particular kind of usage that is under discussion here. ↩
-
Say, the very popular Raspberry Pi single-board computer. ↩
-
Since we've used the RPi as an example so far, let us continue: in OpenBSD, the GPIO implementation for this particular board is defined in sys/dev/fdt/bcm2835_gpio.c. See bcmgpio for details. ↩
Well technically speaking, tape seeking is O(N) and disk seeking is ~O(1), or if you wanna be really pedantic, O(N^-k) with k between 2..4 :P Then again, this distinction is of interest to the driver's upstream users, not necessarily to the driver itself.
Historical reasons might indeed be the best explanation for that classification tho. It looks like a list of the devices that would have been available in the early 90s. And indeed, the file history shows the list was in this form around 1993..1994 and never modified since.
I'm not sure what weight algorithmic complexity bears in all this.
A quick grep for DV_TAPE reveals that it's used in sys/kern/subr_autoconf.c (autoconf glue code for hibernation? not sure) and otherwise the only driver defined as "tape" is the SCSI tape driver in sys/scsi/st.c. Not to underplay the role of tape devices, but my guess is they could have given up the tape classification in st.c and refactored the piece in subr_autoconf.c (which doesn't seem to have a place there anyway), it just wasn't a priority... and probably the next OpenBSDer will invite me to have a shot at it, which is a fair point after all, as the thing is "open sores".
Anyway, I think the only meaningful distinction is between block and char devices, and that's defined somewhere else. I guess that'd make a good next post in this series.