Saturday, July 9, 2011

Anticipating the Post-ALTQ World

A peek into exciting new features in the upcoming OpenBSD 5.0 release and beyond, plus how to teach our favorite operating system better?

One of the more exciting pieces of news to come out of the Edmonton OpenBSD hackathon this week was that the face of OpenBSD traffic shaping is about to change.

This change is by no means the only news to come out of Edmonton (read source-changes@, either by subscribing or via public archives such as marc.info and, if you're at all BSD geekish, feel your excitement levels rise), but the new prio keyword that was added to PF grammar and announced via a message to the tech@ mailing list titled new small, fast, always on priority queueing is both a major new feature and a prime example of how the OpenBSD project works, avoiding sudden "revolutionary" steps in favor of well planned evolutionary steps. The stepwise approach ensures, among other things, that changes are properly tested, and is visible in the fact that the OpenBSD source tree is as close as doesn't matter to always being in a buildable state, even during hackathons.

ALTQ, the ALTernate Queueing subsystem, was integrated into the PF codebase for OpenBSD version 3.3 (mainly by the efforts of the very same Henning Brauer who posted the message linked in the previous paragraph), and from that release onward, managing your filtering and your traffic shaping via your pf.conf file has been a highly appreciated feature of OpenBSD and other systems that have adopted the PF networking toolset.

Although much appreciated by its users everywhere, the integration was never quite an easy fit and the developers have been privately discussing how the queuing should be rewritten to avoid the separateness of ALTQ compared to the rest of the PF features.

There have been other concerns too. Not all of the features outlined in Cho's original USENIX paper were ever fully implemented, and the code had begun showing its age in several respects. By amazing coincidence another mailing list thread appeared in the same week as the new queueing system was announced, pointing out that ALTQ's total bandwidth parameter is a 32-bit value, meaning in practical terms that the maximum bandwidth ALTQ will be able to manage is in the 4Gbit range. Just changing the relevant variables to a wider type apparently breaks the code in other interesting ways (if you're adventurous, you could try playing with FreeBSD developer Ermal Luci's patch (available among other places here) and see what happens). The results could be hackishly interesting, but with the new queueing system set for gradual introduction, this may not be the optimal time to submit ALTQ patches.

Once again, the 32 bit value is a sign of the code's age. At the time ALTQ was originally written, a 32-bit value for bandwidth, and by extension an absolute cap at four gigabits per second, was a no less reasonable choice than the choice of 32 bits as the length of IP addresses some years earlier. And while we are still in the ALTQ world, the obvious workaround in settings where you have bandwidth that comes in increments of ten or forty gigabit is to do your filtering and traffic shaping at network locations where bandwidth is a bit more scarce. In the post-ALTQ world, things may become significantly different.

The early outline of the new world comes in the form of in-ruleset traffic classification, at the face of it not that different from ALTQ when seen from a user perspective. In the upcoming OpenBSD 5.0 release onwards, you will be able to skip the familiar ALTQ queue definitions, but still do traffic classification like this:

pass in proto tcp to port ssh prio 6

meaning that incoming ssh traffic will pass with a relatively high priority (the range is 0 through 7), hopefully making it through earlier than the rest.

The new scheme even lets you duplicate the old speedup trick of setting aside a separate subqueue for ACKs and other lowdelay packets, like so:

pass in proto tcp to port ssh prio (5, 6)


This has yet to reach downloadable binary snapshots, but if you have the required bits in place, you can get an early start by building your own OpenBSD-current. See the relevant pieces of The FAQ, supplemented by man release.

It is important to note that in the upcoming release, priority rules will coexist with the existing ALTQ infrastructure. Over time, however, the plan is to replace ALTQ with a largely rewritten system where the internal workings of the HFSC and CBQ queues, for example, are not all that different. We are also likely to see the total bandwidth definition for an interface move out ouf your pf.conf and re-emerge as an ifconfig option.

Once a traffic shaping infrastructure that's functionally equivalent to ALTQ (or it is hoped, superior in all ways including performance) is in place, ALTQ will eventually be removed. In the meantime, we have already seen commits that set default priority for certain traffic types, and reading source-changes@ along with testing snapshots will continue to be a fun and exciting experience in the weeks and months ahead.

Over time, it might even become necessary to do more book work, if so you will hear about it first here.

Whatever happens on the book front, I'll more than likely get back to those new features in more detail in upcoming PF tutorial sessions (See here, here and here for announcements about the ones scheduled so far), and I'm trying to figure out a way to improve the quality of the tutorial experiences.

The lecture plus demonstrations format of the tutorial sessions has so far allowed for very limited interactivity with little or no hands-on experience for attendees during the sessions themselves. I would be very happy to hear ideas for and input on practical implementation of changes that would improve the experience. One possibility is setting up a set of virtual labs, hosted somewhere reachable from the conference locations. Suggestions are welcome via email or the comments field.


Update 2011-07-10: The prio code is in snapshots dated July 10, 2011 or newer. If you're upgrading from a previous version, the installer will now also offer to run sysmerge(8) and it will even offer to upgrade any non-free firmware it finds installed on your system.

Update 2011-09-14: Chris Cappucio wrote in, adding some more useful detail to ALTQ's history. Chris writes:

I did the initial altq port to OpenBSD and Kenjiro Cho (author of ALTQ) was invited to the next hackathon (I believe it was c2k2) along with myself to integrate it into the tree. I declined to participate for personal reasons but Kenjiro did turn my patch into an in-tree implementation, one or two years later the PF guys replaced the ALTQ packet classifier and configuration utility with PF itself. Now they are replacing the ALTQ scheduler framework completely with a simpler framework that still uses the PF classifier and configurator :)
Update 2015-04-02: The new queueing subsystem finally appeared in OpenBSD 5.5, and was the main motivation for revising The Book of PF into its 2014 third edition. If you follow the book links in this article, you will find the edition that describes both the new queueing system and the legacy ALTQ system.

2 comments:

  1. For now, the new prio code only works with the *interface* bandwidth, so it's not yet an alternative to altq priq for the common adsl/cablemodem use case where the available bandwidth is low but the interface connects at 100Mb.

    ReplyDelete
  2. This is why I love OpenBSD so much and am always happy to buy CD's, T-Shirts, other items and donate when I can.

    The evolution is so careful and well thought out and each new release brings more awesome features and/or performance improvements.

    I love OpenBSD's pf/altq as it currently stands and to think that the traffic shaping is going to be getting even better is just brilliant!

    Can't wait for my pre-ordered CD to arrive!

    ReplyDelete

Note: Comments are moderated. On-topic messages will be liberated from the holding queue at semi-random (hopefully short) intervals.

I invite comment on all aspects of the material I publish and I read all submitted comments. I occasionally respond in comments, but please do not assume that your comment will compel me to produce a public or immediate response.

Please note that comments consisting of only a single word or only a URL with no indication why that link is useful in the context will be immediately recycled so those poor electrons get another shot at a meaningful existence.

If your suggestions are useful enough to make me write on a specific topic, I will do my best to give credit where credit is due.