Friday, April 19, 2013

You've Installed It. Now What? Packages!

Once you've installed your OpenBSD system, packages are there to make your life easier. A works for me/life is good guide for your weekend reading.

Installing OpenBSD is easy, and takes you maybe 20 minutes. Most articles and guides you find out there will urge you to take a look at the files in /etc/ and explore the man pages to make the system do what you want. With a modern BSD, the base system is full featured enough that you can in fact get a lot done right away just by editing the relevant files and perhaps starting or restarting one or more services. If all you want to do is set up something like a gateway for your network with basic-to-advanced packet filtering, everything you need is already there in the basic install.

Then again, all the world is not a firewall, and it is likely you will want to use, for example, a web browser (we used to have the venerable lynx in base, and ftp will retrieve web pages, but does not allow browsing) or editing tools that are not vi or mg. That's where packages and package systems come in. I'll skip a little ahead of myself and make a confession: The machine I'm writing this piece on reports that it has some 381 packages installed.

Before we move on to the guts of this article, some ceremonial words of advice: If you're new to OpenBSD or it's your first time in a while on a freshly installed system, you could do a lot worse than spending a few minutes reading man afterboot. That man page serves as a handy checklist of things you should at least take a peek at to ensure that your system is in good working order.

Some packages will write important information, such as strings or stanzas to put in your rc.conf.local, rc.local or sysctl.conf files, to your terminal. If you're not totally confident what to do after the package install finishes, it may be a good idea to run your ports and packages installs in a script(1) session. See man script for details.

When dinosaurs roamed the Earth ...
The story of the ports and packages goes back to the early days of free software when we finally found ourselves with complete operating systems that were free and hackers^H^H^H^H^H^H system administrators found that even with full featured operating systems such as the BSDs, there were sometimes things you would want to do that were not already in there.

The way to get that something else was usually to fetch the source code, see if it would compile, make some changes (or a lot) to make it compile, possibly introduce the odd #ifdef block and keep at it until the software would compile, install and run. In the process you most likely found out what, if any, other software (tools or libraries) needed to be installed to complete the process. At that point, you could claim to have ported the software to your platform. If you had been careful and saved a copy of the original source files somewhere, you could use the diff(1) utility to create a patch you could then send to the program maintainer and hope that he or she would then incorporate your changes in the next release.

But then, why wait for the next release? Why not share those diffs with others? How about putting it into a CVS repository that would be available to everyone? That idea was tossed around on relevant mailing lists for a while, and the first version of the ports system appeared in FreeBSD 1.0 in December 1993.

The other BSD systems adopted the basic idea and framework soon after, with small variations. On NetBSD, the term port was already in use for ports of the operating system itself to specific hardware platforms, so on that operating system, the ports tree is referred to as 'package source', or pkgsrc for short. The ports and packages tools are still actively maintained and developed on all BSDs, and most notably Marc Espie rewrote the pkg_* tools for OpenBSD's 3.5 release. Marc and other OpenBSD developers have been refining the package tools with every release since then.

Parallel development has led to some differences in the package handling on the various BSDs, and some of the operations I describe here from an OpenBSD perspective may not be identical on other operating systems.

Around the same time the BSDs started including a ports tree and packages, people on the Linux side of the fence started developing package systems too. With distributed development taken to the point where the kernel, basic system tools and libraries are maintained separately, perhaps the need there was even greater than on the BSDs.

In fact, some Linux distributions such as the Debian based ones have taken the package management to the point where 'everything is a package' - every component on a running system is a package that is maintained via the package system, including basic system tools, libraries and the operating system kernel. In contrast, the BSDs tend to treat the base system as a whole, with the package management tools intended solely for managing software that does not come as a part of the default install.

The anatomy of ports and packages
The ports system consists of a set of 'recipes' to build third party software to run on your system. Each port supplies its own Makefile, whatever patches are needed in order to make the software build and optionally package message files with information that will be displayed when the software has been installed.

So to build and install a piece of software using the ports system, you follow a slightly different procedure than the classical fetch - patch - compile cycle. You will need to install the ports tree, either by unpacking ports.tar.gz from your CD set or by checking out an updated version via cvs.

With a populated ports tree in hand, you can go to the port's directory, say

$ cd /usr/ports/misc/screen

to see about installing screen, the popular GNU multi-screen window manager.

On a typical OpenBSD system, that directory contains the following files:

$ ls -l
total 20
drwxr-xr-x  2 root  wheel   512 Mar 31 16:46 CVS
-rw-r--r--  1 root  wheel  1047 Mar 28 17:34 Makefile
-rw-r--r--  1 root  wheel   283 Apr  5  2007 distinfo
drwxr-xr-x  3 root  wheel   512 Jun 26  2012 patches
drwxr-xr-x  3 root  wheel   512 Mar 11  2012 pkg

Here, the Makefile is the main player. If you open it now in a text editor or viewer such as less, you will see that the syntax is quite straightforward. What it does is mainly to define a number of variables such as the package name, where to fetch the necessary source files, which programs are required for the compile to succeed and which libraries the resulting program will need to have present in order to run correctly. The file defines a few other variables too, and you can look up the exact meaning of each in the man pages, starting with man ports and man bsd.port.mk.

With all relevant variables set, at the very end the file uses the line

.include <bsd.port.mk>

to pull in the common infrastructure it shares with all other ports. This is what makes the common targets work, so for example, typing

$ sudo make install 

(probably the most common port-related make command for end users and administrators) in the port directory will start the process to install the software.

But before you type that command and press Enter, you may want to consider this: This command will generate a lot of output, most likely more than will fit in the terminal's buffer. If the build fails, it is likely that the message about the first thing that went wrong will have scrolled off the top of your screen and out of the terminal buffer. For that reason, it is good sysadmin practice to create a record of lengthy operations such as building a port by using the script command. Typing script in a shell will give you a subshell where everything displayed on the screen will be saved in a file. Escape sequences, asterisk-style progress bars and 'twirling batons' will end up a bit garbled, but that essential message you are looking for will be there too. man script will give you the details, and unless you're an incurable packrat, do remember to delete the typescript file afterwards.
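Once the build has finished and you have left the script session, the typescript can be searched for the first sign of trouble. The following is only a minimal sketch: the function name first_error and the search pattern are assumptions made for illustration, and real build failures may be worded differently.

```shell
#!/bin/sh
# Sketch: show the first line in a script(1) typescript that mentions an error.
# The function name and the pattern 'error' are illustrative assumptions.
first_error() {
    # $1: typescript file (script(1) writes to ./typescript by default)
    file=${1:-typescript}
    # -n prefixes the line number, -i ignores case; head keeps the first hit
    grep -in 'error' "$file" | head -n 1
}
```

Running first_error typescript in the directory where you started script prints the line number and text of the first match, which is usually a better place to start reading than the end of the file.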

The make install process starts with checking dependencies, goes on to download the source archive and checks that the fetched file matches the cryptographic checksums stored in the distinfo file. If the checksums match, the source code is extracted to a working directory, the patches from the patches/ directory are applied, and the compilation starts. If the dependency check finds that one or more pieces are missing, you will see that the process fetches, configures and installs the required package before continuing with the build process for the original package.

After a while, the package build most likely succeeds and the install completes. At this point you will have a new piece of software installed on your system. You should be able to run the program, and the installed package will turn up in the package listings output by pkg_info, such as

$ pkg_info | grep screen
screen-4.0.3p3      multi-screen window manager

This information is taken from the package's subdirectory in /var/db/pkg, where the information about currently installed packages is stored.
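To show that the package database really is just a directory with one subdirectory per installed package, here is a small sketch that lists matching names by reading that directory. The helper name installed_like is hypothetical, and pkg_info remains the proper tool for the job; this only illustrates where the names come from.

```shell
#!/bin/sh
# Sketch: list installed package names by reading the package database
# directory directly. installed_like is a hypothetical helper name.
installed_like() {
    # $1: substring to look for, $2: package db directory (default /var/db/pkg)
    pattern=$1
    pkgdb=${2:-/var/db/pkg}
    for dir in "$pkgdb"/*/; do
        [ -d "$dir" ] || continue      # skip if the glob matched nothing
        name=$(basename "$dir")
        case $name in
            *"$pattern"*) echo "$name" ;;
        esac
    done
}
```

On a system with screen installed, installed_like screen should print the same package name that pkg_info reports.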

If you paid close attention during the make install process, you may have noticed that the install step was performed from a binary package. This is one of the distinctive features of the OpenBSD version of the package system. The package build always generates an installable package based on a 'fake' install to a private directory, and software is always installed on the target system from a package.

And now we should mention that on a typical modern OpenBSD system, you wouldn't want to install GNU Screen at all. Since the OpenBSD 4.6 release, equivalent (or better!) functionality has been included in the OpenBSD base system via tmux(1).

But you don't need to do that!

This means several things. If you have built and installed a package by typing make install in the relevant ports directory and later run make deinstall or pkg_delete to remove the software, any subsequent install of the software will take place from the package file stored in a subdirectory of /usr/ports/packages.

But more importantly, in most cases you can keep your system's packages up to date without a ports tree on the machine. The main exceptions to the rule that precompiled packages are available from the mirrors are software with licenses that do not allow redistribution or require the end user to do specific things such as go to a web site and click a specific button to formally accept a set of conditions. In those cases it can't be helped, and you will need to go via the ports system to create a package locally and install that.

For each release, a full set of packages is built and made available on the OpenBSD mirrors, and by the time you read this, there is reason to hope that running updates to -stable packages will be available for supported releases too.

The way to make good use of this is to set the PKG_PATH variable to include the packages directory for your release on one or more mirrors close to you and/or a local directory, and then run pkg_add with the -u flag.

My laptop runs -current and I'm based in Europe, so the PKG_PATH is set to

PKG_PATH=http://ftp.eu.openbsd.org/pub/OpenBSD/snapshots/packages/`uname -m`/

On a more conservatively run system, you may want to set it to something like

PKG_PATH=http://ftp.eu.openbsd.org/pub/OpenBSD/`uname -r`/packages/`uname -m`/
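If you look after machines running different releases or architectures, it can be handy to build the value from its parts. The sketch below follows the URL layout of the two examples above; the helper name pkg_path_for is an assumption made for illustration, not part of the pkg_* tools.

```shell
#!/bin/sh
# Sketch: build a PKG_PATH value for a given mirror, release and architecture.
# pkg_path_for is a hypothetical helper; the URL layout follows the examples
# in the article.
pkg_path_for() {
    # $1: mirror base URL
    # $2: release number, or "snapshots" for -current (default: uname -r)
    # $3: architecture (default: uname -m)
    mirror=$1
    release=${2:-$(uname -r)}
    arch=${3:-$(uname -m)}
    echo "$mirror/pub/OpenBSD/$release/packages/$arch/"
}

# Example: point PKG_PATH at the packages for the running release
PKG_PATH=$(pkg_path_for http://ftp.eu.openbsd.org)
export PKG_PATH
```

Passing snapshots as the second argument gives the -current variant, since the snapshots directory sits where the release number would otherwise go.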

If you want to find out what packages are available at your favorite mirror, you can get a listing of package names by fetching the file $PKG_PATH/index.txt. Another nice resource is openports.se, which offers a nice clickable interface.

Once your PKG_PATH is set to something sensible, you can use pkg_add and the package base name to install packages, so a simple

$ sudo pkg_add screen

would achieve the same thing as the 'make install' command earlier (minus the lengthy compilations, and still assuming that you would want to install the package instead of getting to know tmux(1), which is included in the base system), and most likely a lot faster too.

Once you have a set of packages installed, and keeping in mind that you need a meaningful PKG_PATH, you can keep them up to date using pkg_add -u. If you want more detailed information about the package update process and want pkg_add to switch to interactive mode when necessary, you can use something like this command:

$ sudo pkg_add -vui

I have at times tended to run my pkg_add -u with some of the -F flags in order to force resolution of certain types of conflict, but given the quality of the work that goes into the packages, most of the -F options are rarely needed. pkg_add and its siblings in the pkg_* tools collection have a number of options we have not covered here, all intended to make your package management on OpenBSD as comfortable and flexible as possible. The tools come with readable man pages, and may very well be the topic of future articles. You should also be aware that Michael W Lucas's Absolute OpenBSD, 2nd Edition is available from better bookstores with a more in-depth treatment of the package system than what I've presented here. Look at the end of the article for further links.

How do I make a package then?
That is a large question, and the first question you should ask if you think you want to port a particular piece of software is, "Has this already been ported?". There are several ways to check. If you are thinking of creating a port, you most likely already have the ports tree installed, so using the ports infrastructure's search facility is the obvious first step. Simply go to the /usr/ports directory and run the command

$ make search key=mykeyword

where mykeyword is a program name or keyword related to the software you are looking for. One other option with even more flexible search possibilities is to install databases/sqlports. And of course, searching the ports mailing list archives (http://marc.info/?l=openbsd-ports) or asking the mailing list works too. When you have determined that the software you want to port is not already available as a package, you can go on to prepare for the porting effort. Porting and package making is the subject of much Usenet folklore and rumor, but in addition you have several man pages with specific information on how to proceed. These are ports(7), package(5), packages(7), packages-specs(7), library-specs(7) and bsd.port.mk(5).

Read those and use your familiarity with the code you are about to port to find your way. The OpenBSD web offers quite a bit of information too. You could start with re-reading the main ports and packages page at http://www.openbsd.org/faq/faq15.html, and follow up with the pages about the porting process at http://www.openbsd.org/faq/ports/, testing the port at http://www.openbsd.org/faq/ports/testing.html and finally the checklist for a sound port at http://www.openbsd.org/faq/ports/guide.html#PortsChecklist.

All the while, try first to figure out the solution to any problems that pop up, read the supplied documentation, and only then ask port maintainers via the ports mailing list for help. Port maintainers are generally quite busy, but if you show signs of having done your homework first, there is no better resource available for helping you succeed in your porting or port maintenance efforts.

One fine resource for the aspiring porter is Bernd Ahlers' ports tutorial from OpenCon 2007 (hm, doesn't that need a refresh?). You can look up Bernd's slides at http://www.openbsd.org/papers/opencon07-portstutorial/index.html, and it is possible he can be persuaded to repeat the tutorial at a conference near you. And for some recent advances in the OpenBSD ports and packages system, see Marc Espie's EuroBSDCon 2012 presentation Advances in packages and ports in OpenBSD.

More information on the net
The main source of information about the OpenBSD ports and packages system is to be found on the OpenBSD project's web site. The FAQ's ports and packages section at http://www.openbsd.org/faq/faq15.html has more information about all the issues covered in this article, and goes into somewhat more detail than space allows here. If you encounter problems while installing or managing your packages, it is more than likely that you will find a solution or a good explanation there. And of course, if nothing else works or you can't figure it out, there is always the option of asking the good people at misc@openbsd.org or ports@openbsd.org (do read the OpenBSD Mailing Lists page before just butting in) or search the corresponding mailing list archives.


An earlier version of this article appeared in BSD Magazine 2/2008. You can now also find this updated version featured at OpenBSD Journal (aka undeadly.org), the primary OpenBSD news site.

If you're interested in OpenBSD in general, you have a real treat coming up in the form of Michael W. Lucas' Absolute OpenBSD, 2nd edition. If a firewall or other networking is closer to your heart, you could give my own The Book of PF and the PF tutorial it grew out of a try. You can even support the OpenBSD project by buying the books from them at the same time you buy your CD set; see the OpenBSD Orders page for more information.

Upcoming talks: I'll be speaking at BSDCan 2013, on The Hail Mary Cloud And The Lessons Learned, with a preview planned for the BLUG meeting a couple of weeks before the conference. There will be no PF tutorial at this year's BSDCan; fortunately my staple tutorial item was crowded out by new initiatives from some truly excellent people.

Sunday, April 14, 2013

Maintaining A Publicly Available Blacklist - Mechanisms And Principles

When you publicly assert that somebody sent spam, you need to ensure that your data is accurate. Your process needs to be simple and verifiable, and to compensate for any errors, you want your process to be transparent to the public with clear points of contact and line of responsibility. Here are some pointers from the operator of the bsdly.net greytrap-based blacklist.

Regular readers will be aware that bsdly.net started sharing our locally generated blacklist of known spam senders back in July 2007, and that we've offered hourly updates for free since then.

The mechanics of maintaining a list boil down to a few simple steps, as described in the original article and the various web pages it references as well as several followups, but probably the most informative recipe for how it's all done was this one, written in May 2012 in response to (as usual) a heated exchange on openbsd-misc.

As I've explained in earlier articles, once the basic spamd(8) setup is in place, maintaining the blacklist starts with defining your list of known bad, never to become deliverable addresses in domains you control. It is worth noting that you can run spamd on any OpenBSD computer even if you do not run a real mail service (several of my correspondents do, and do evil things like crank up the time between response bytes to 10 seconds for entertainment), but as it happens we have a few real mail servers behind our spamd-equipped gateways, so it seemed natural to restrict our pool of trap addresses to the domains that are actually served by our kit here.

Collecting addresses for the spamtraps list started with a totally manual process of fishing out addresses from the mail server logs, grepping for log entries for delivery attempts to non-existent addresses in our domains. Spammers would do (as they still do) Joe jobs on one or more of our domains, making up or generating fake addresses to use as From: or Reply-to: addresses on their spam messages, and messages that for one reason or the other were not deliverable would end up generating bounce messages that our mail service would need to deal with. But a manual process is error prone and we're bound to have missed a few, so not too long after I'd written the script that generates the downloadable blacklist, I had it checking the active greylist for any addresses not already in the pool of known bad addresses.
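For readers who want to try something similar, here is a rough sketch of the greylist check. It is not the script that runs at bsdly.net. It assumes the 'GREY|ip|helo|from|to|...' line layout that spamdb(8) documents for its output, and the function name trap_candidates and the skip file are hypothetical. The skip file must list every valid address in the domain as well as the addresses already in the trap list, or you will end up trapping your own users.

```shell
#!/bin/sh
# Sketch: pick new spamtrap candidates out of greylist entries.
# Assumes the 'GREY|ip|helo|from|to|...' line layout described in spamdb(8).
# trap_candidates and both of its arguments are hypothetical names.
trap_candidates() {
    # $1: domain we control
    # $2: file of addresses to skip (already trapped plus all valid ones),
    #     one per line
    # stdin: spamdb-style output
    awk -F'|' -v dom="$1" '
        $1 == "GREY" {
            addr = tolower($5)
            gsub(/[<>]/, "", addr)              # strip angle brackets
            n = split(addr, parts, "@")
            if (n == 2 && parts[2] == dom) print addr
        }' | sort -u | grep -v -x -F -f "$2"
}
```

Something like spamdb | trap_candidates bsdly.net skiplist would then print the addresses worth a closer look. Review the output by hand before adding anything as a spamtrap; spamdb(8) describes how to do the adding.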

This is the process that has helped generate the current list of 'imaginary friends', now 24,324 entries long. The list usually grows by a handful of entries per day (there have been whole days without a single new entry), but in rare cases by up to a few hundred whenever the script runs. I assume there will be more entries arriving as I write and post this article, but right now the latest entry, received 13 Apr 2013 15:10 CEST, was pfpeter@bsdly.net (which mildly suggests that somebody is having a bit of fun with my address and obvious keywords -- if you get the trap address list, you'll see that grep peter@bsdly.net sortlist turns up close to a hundred entries, mostly combinations of well-known keywords and my email address).

You could argue that fishing bounce-to addresses out of the greylist quickly for trapping purposes runs the risk of unfairly penalizing innocent third parties with badly configured mail services, and I must admit that risk exists. However, my experiment of planting my own made-up addresses in the spamtraps list reveals that the list is indeed read and used by spammers, and after all, early sufferers would be blacklisted from here only for 24 hours after their last attempt at bouncing back the worthless stuff to us.

And once an address is in the spamtrap list, attempts at delivering mail to that address turn up in logs something like this:

Apr 14 15:19:22 skapet spamd[1733]: (GREY) 201.215.127.126: <switchbackiwh0@google.com> -> <aramforbess@bsdly.net>
Apr 14 15:19:22 skapet spamd[31358]: Trapping 201.215.127.126 for tuple 201.215.127.126 pc-126-127-215-201.cm.vtr.net <switchbackiwh0@google.com> <aramforbess@bsdly.net>
Apr 14 15:19:22 skapet spamd[31358]: sync_trap 201.215.127.126

The sync_trap line indicates that this spamd is set up to synchronize with a sister site, like I described in the In The Name Of Sane Email... article. When the miscreant returns, it looks something like this:

Apr 14 15:28:01 skapet spamd[30256]: 201.215.127.126: connected (3/3), lists: spamd-greytrap
Apr 14 15:30:15 skapet spamd[30256]: 201.215.127.126: disconnected after 134 seconds. lists: spamd-greytrap

most likely with repeat attempts until the sender gives up.
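If you keep your logs around, a few lines of shell will summarize who got trapped. The sketch below assumes syslog lines shaped like the "Trapping" sample above; the function name trapped_hosts is hypothetical, and where your spamd log lives depends on your syslog configuration.

```shell
#!/bin/sh
# Sketch: list the IP addresses spamd reported as trapped, with a count each.
# Assumes log lines shaped like the "Trapping ..." sample in the article;
# trapped_hosts is a hypothetical helper name.
trapped_hosts() {
    # stdin: syslog lines from spamd
    # Fields: month day time host spamd[pid]: Trapping ip ...
    awk '$5 ~ /^spamd\[/ && $6 == "Trapping" { count[$7]++ }
         END { for (ip in count) print count[ip], ip }' | sort -k2
}
```

Feed it the log on standard input, for example trapped_hosts < /var/log/spamd if that is where your setup writes spamd's messages, and you get one line per trapped address with the number of times it was trapped.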


That's the basic mechanism. Now for the principles. I outlined some of the operating principles in a kind of terms of service statement elsewhere, but I'll offer a rehash here with a tiny sprinkling of tweaks I've made to the process in order to make the quality of the data I offer better.

First, as I already pointed out in the introduction, you want your process to be simple and verifiable. Run of the mill spamd greytrapping passes the first test with flying colors; after all, any host that ends up in the blacklist verifiably tried to send mail to a known bad address. Keep your logs around for a while, and you should be in good shape to verify what happened.

You also want your data to be accurate, with each entry representing a host that verifiably sent spam. This means watching out for errors of any kind, including but not limited to finding and removing false positives. The automatic 24-hour expiry that's part of the whole greytrapping experience helps a lot here. Any perpetrator or unlucky victim will be out of harm's or blockage's way within 24 hours of the last undesired action we register from their side. There is no requirement that the system administrator track down a web form and swear on their grandmother's pituitary gland that they have 'cleaned up the system'. We (perhaps naively) assume that anyone we don't hear from is no longer our problem.

However, spamd was designed to be a solution to a relatively simple and limited set of problems. Every day some spam messages will manage to get past the outer defenses and face the content filtering that in most cases makes the right decision and drops any spam messages that reach it on the floor. And there is a small, but not entirely non-existent body of messages that are spam of some kind that will end up in users' inboxes.

For the case where messages are dropped by the content filtering, I found that it was fairly simple to extract the IP addresses of the last hop before entering our network from the logs generated by the content filtering, and at regular intervals these IP addresses are collected from the mail servers with the content filtering in place, and fed into the local greytrap via spamdb(8). It took more than a few dry runs before I trusted the process, but setting up something similar for your environment should be within any sysadmin's scripting skills. We use spamassassin and clamav here, but you should be able to extract fairly easily the information you need to fit the behavior of your particular combination of software. We also offer our users the option of saving messages in spam and not-spam folders on a network drive to train spamassassin's Bayesian engine, indirectly helping the quality of the generated blacklist via more accurate detection of spam. In addition, a so-minded administrator can even extract IP addresses from any headers the user had a mind to conserve and use spamdb(8) to manually insert offending IP addresses in the local greytrap list.
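The last step, feeding collected IP addresses into the local greytrap, might look something like the sketch below. It is not the script in use here. It assumes you already have one IPv4 address per line on standard input, however you extracted them, and the function name feed_greytrap and the DRYRUN variable are made up for this example. In the spirit of those dry runs, it only prints the spamdb(8) commands unless you explicitly tell it otherwise.

```shell
#!/bin/sh
# Sketch: turn a list of IP addresses into spamdb(8) commands that add
# TRAPPED entries. Only prints the commands unless DRYRUN=no is set.
# feed_greytrap and DRYRUN are hypothetical names.
feed_greytrap() {
    # stdin: one IPv4 address per line, as extracted from your filter logs
    while read -r ip; do
        case $ip in
            # skip blank lines and anything that is not digits and dots
            # (this crude check also skips IPv6 addresses)
            ''|*[!0-9.]*) continue ;;
        esac
        if [ "${DRYRUN:-yes}" = "no" ]; then
            spamdb -t -a "$ip"
        else
            echo "spamdb -t -a $ip"
        fi
    done
}
```

Run it a few times with the default setting and compare the output with your logs before switching to DRYRUN=no, and make sure your own mail servers and other known good hosts can never end up in the input.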

And finally to compensate for any errors, you want your process to be transparent to the public with clear points of contact and line of responsibility. In other words, make sure that you have people in place who are indeed accessible and responsive when somebody tries to contact you via any of the RFC 2142 required addresses. And post something like this article to somewhere reachable. At bsdly.net and associated domains, it's a distinct advantage that contact attempts happen from hosts not currently in the blacklists, but as far as I am aware any errors in the published list have been dealt with before anybody else noticed, and we have avoided being party to the blocklist vendettas and web forum flame wars that have plagued other blacklist maintainers (it has been suggested that the December 2012 DDOS incident could have been part of somebody's revenge, but we do not have sufficient evidence to point any fingers).

In short, you need to keep things simple, act responsibly and be responsive to anyone contacting you about your (mostly automatically generated) work product.

Good night and good luck.

2013-04-15 update: Clarified that manual spamdb(8) manipulation can be used to insert IP addresses in the blacklist too.

2013-04-16 update: It is also possible to fetch the hourly dump from the NUUG mirror here: http://home.nuug.no/~peter/bsdly.net.traplist. In fact, fetching from there should under most circumstances be faster than getting it from the original location. The file is copied at 15 minutes past the hour, while the generating starts at 10 past the hour.

In addition to the techniques described here, it is useful to know that OpenBSD developer Peter Hessler is working on distributing spamd data via BGP, as described in his AsiaBSDCon 2012 paper. It is not part of the base distribution yet, but work continues, and it could come in useful in addition to the batch import of exported lists like the bsdly.net hourly dump.

If you're interested in setting up your own spamd, your main source of information is included in your OpenBSD (or FreeBSD or NetBSD) installation: the man pages such as the one I refer to here. Recommended secondary sources include my own The Book of PF and the PF tutorial it grew out of. You can even support the OpenBSD project by buying the book from them at the same time you buy your CD set; see the OpenBSD Orders page for more information.

If you're interested in OpenBSD in general, you have a real treat coming up in the form of Michael W. Lucas' Absolute OpenBSD, 2nd edition, also available from the OpenBSD site, and for a few hours more the auction of the first copy printed is running. Surely you can top USD 1145? With your boss' credit card, perhaps?

Upcoming talks: I'll be speaking at BSDCan 2013, on The Hail Mary Cloud And The Lessons Learned, with a preview planned for the BLUG meeting a couple of weeks before the conference. There will be no PF tutorial at this year's BSDCan; fortunately my staple tutorial item was crowded out by new initiatives from some truly excellent people. But you can lobby other organizers to host one.