That grumpy BSD guy: Twenty-plus years on, SMTP callbacks are still pointless and need to die

A rarely used legacy misfeature of the main Internet email protocol creeps back from irrelevance as a minor annoyance. You should ask your mail and antispam provider about their approach to 'SMTP callbacks'. Be wary of any assertion that is not backed by evidence.

Even if you are an IT professional and run an email system, you could be forgiven for not being immediately aware that there is such a thing as SMTP callbacks, also referred to as callback verification. As you will see from the Wikipedia article, the feature was never widely adopted, and for all too understandable reasons.

If you do run a mail system, you have probably heard about that feature's predecessor, the still-required but rarely used SMTP VRFY and EXPN commands. Those commands offer a way to verify whether an address is valid and to show the component addresses that a mailing list resolves to, respectively.

Back when all things inter-networking were considered experimental and it was generally thought that information should flow freely in and between those experimental networks, it was quite common for mail servers to offer VRFY and EXPN service to all comers.

I'm old enough to remember using VRFY by hand, telnet-ing to port 25 on a mail server and running VRFY $user@$domain.$tld commands to check whether an email address was indeed valid. I've forgotten which domains and persons were involved, but I imagine the reason why was that I wanted to contact somebody who had said something interesting in a post to a USENET news group.

But networkers trying to make contact with each other were not the only ones who discovered the VRFY and EXPN commands. Soon spammers were using those commands to actively harvest actually! valid! deliverable! addresses, and by 1999 the RFC2505 best practices document recommended disabling the features altogether. After all, there would usually be some other way available to find somebody's email address (there was even a FAQ, a longish Frequently Asked Questions document with apparent USENET origins written and maintained on the subject, a copy of which can be found here).

In roughly the same time frame, somebody came up with the idea of SMTP callbacks. The idea was that all domains on the Internet need to publish the address of their mail exchangers via DNS MX (mail exchanger) records. The logical next step is then that when a piece of mail arrives over SMTP, the receiving end should be able to contact the sender domain's known mail exchanger to check that the sender address is indeed valid. If you by now hear the echoes of VRFY and EXPN, you're right. There are indications that some early implementations did in fact use VRFY for that purpose.

But then the world changed, and you could not rely on VRFY being available in the post-RFC2505 world.

In the post-RFC2505 world, the other side would most likely not offer up any useful information in response to VRFY commands, and you would most likely be limited to the short interchange that the Wikipedia entry quotes,

HELO <verifier host name>
MAIL FROM:<>
RCPT TO:<the address to be tested>
QUIT

which a perceptive reader would identify as only verifying in a very limited sense that the domain's mail exchanger was indeed equipped with a functional SMTP service.

It is worth noting, as many have over the years, that the MX records only specify where a domain expects to receive mail, not where valid mail from the domain is supposed to originate. Several mechanisms to help identify valid mail senders for a domain have been devised in the intervening years, but none existed at the time SMTP callbacks were considered even remotely useful.

For reasons that are not entirely clear, some developers kept working on SMTP callback code and several mail server implementations available today (2017) still contain code that looks like it was intended to support information-rich callbacks, if the system was configured to enable the feature at all. The default configurations in general do not enable the SMTP callback feature, and mail admins rarely bother to even learn much about the largely disused and (in my opinion at least) not too well thought out feature.

This all happened back in the 1990s, but quite recently an incident occurred that indicates that in some pockets of the Internet, SMTP callbacks are still in use, and in at least some cases data from the callbacks are used for generating blacklists and block mail delivery. The last part should raise a few eyebrows at least.

Jumping forward from the distant 1990s to the present day, regular readers of this column will be aware that bsdly.net and cooperating domains run SMTP service with OpenBSD spamd(8) doing greylisting service, and that the spamd(8) setup produces a greytrapping-based blacklist which is available for download, dumped to a file (available here and here) once per hour.

Maintaining the mail system and the blacklist also involves keeping an eye on mail-related activities, and invalid addresses in our domains that turn up in the greylist are usually added to the list of spamtrap addresses within not too many hours after they first appear. The process of actually adding spamtrap addresses is a manual one, but based on the output of pathetically simple shell scripts that run as cron jobs.

The list of spamtraps has grown over the years to more than 38 000 entries. Most of the entries have local parts that are pure generated gibberish, some entries are probably degraded versions of earlier spamtrap addresses and some again seem to conform with specific patterns, including but not limited to SMTP or NNTP message IDs.

On August 19th and 20th 2017 I noticed a different, but yet familiar pattern in some of the new entries.

The entry that caught my eye had the MAIL FROM: part as

mx42.antispamcloud.com-1503146097-testing@bsdly.com

The local part pattern was somewhat familiar, and breaks down to

$localhostname-$epochtime-testing

with @targetdomain.$tld (in our case, bsdly.com) appended. I had at this point totally forgotten about SMTP callbacks, but I decided to check the logs for any traces of activity involving that host. The only trace I could find in the logs was at the spamd-serving firewall in front of the bsdly.com domain's secondary mail exchanger:

Aug 19 14:35:27 delilah spamd[26915]: 207.244.64.181: connected (25/24)
Aug 19 14:35:38 delilah spamd[26915]: (GREY) 207.244.64.181: <> -> <mx42.antispamcloud.com-1503146097-testing@bsdly.com>
Aug 19 14:35:38 delilah spamd[15291]: new entry 207.244.64.181 from <> to <mx42.antispamcloud.com-1503146097-testing@bsdly.com>, helo mx18-12.smtp.antispamcloud.com
Aug 19 14:35:38 delilah spamd[26915]: 207.244.64.181: disconnected after 11 seconds.

Essentially a normal first contact: spamd at our end answers slowly, one byte per second, but the greylist entry is created in the expectation that any caller with a valid message to deliver would try again within a reasonable time. The spamd synchronization between the hosts in our group of greylisting hosts would see to that an entry matching this sequence would appear in the greylist on all participating hosts.

But the retry never happened, and even if it had, that particular local-part would anyway have produced an "Unknown user" bounce. But at that point I decided to do a bit of investigation and dug out what seemed to be a reasonable point of contact for the antispamcloud.com domain and sent an email with a question about the activity.

That message bounced, with the following explanation in the bounce message body:

DOMAINS@ANTISPAMCLOUD.COM

host filter10.antispamcloud.com [31.204.155.103]

SMTP error from remote mail server after end of data:

550 The sending IP (213.187.179.198) is listed on https://spamrl.com as a source of dictionary attacks.

As you have probably guessed, 213.187.179.198 is the IPv4 address of the primary mail exchanger for bsdly.net, bsdly.com and a few other domains under my care.

If you go to the URL quoted in the bounce, you will notice that the only point of contact is via an email adress in an unrelated domain.

I did fire off a message to that address from an alternate site, but before the answer to that one arrived, I had managed to contact another of their customers and got confirmation that they were indeed running with an exim setup that used SMTP callbacks.

The spamrl.com web site states clearly that they will not supply any evidence in support of their decision to blacklist. Somebody claiming to represent spamrl.com did respond to my message, but as could be expected from their published policy was not willing to supply any evidence to support the claim stated in the bounce.

In my last message to spamrl.com before starting to write this piece, I advised

I remain unconvinced that the description of that problem is accurate, but investigation at this end can not proceed without at least some supporting evidence such as times of incidents, addresses or even networks affected.

If there is a problem at this end, it will be fixed. But that will not happen as a result of handwaving and insults. Actual evidence to support further investigation is needed.

Until verifiable evidence of some sort materializes, I will assume that your end is misinterpreting normal greylisting behavior or acting on unfounded or low-quality reports from less than competent sources.

The domain bsdly.com was one I registered some years back mainly to fend off somebody who offered to help the owner of the bsdly.net domain acquire the very similar bsdly.com domain at the price of a mere few hundred dollars.

My response was to spend something like ten dollars (or was it twenty?) to register the domain via my regular registrar. I may even have sent back a reply about trying to sell me what I already owned, but I have not bothered to dig that far back into my archives.

The domain does receive mail, but is otherwise not actively used. However, as the list of spamtraps can attest (the full list does not display in a regular browser, since some of the traps are interpreted as html tags, if you want to see it all, fetch the text file instead), others have at times tried to pass off something or other with from addresses in that domain.

But with the knowledge that this outfit's customers are believers in SMTP callbacks as a way to identify spam, here is my hypothesis on what actually happened:

On August 19th 2017, my greylist scanner identified the following new entries referencing the bsdly.com domain:

anecowuutp@bsdly.com
pkgreewaa@bsdly.com
eemioiyv@bsdly.com
keerheior@bsdly.com
mx42.antispamcloud.com-1503146097-testing@bsdly.com
vbehmonmin@bsdly.com
euiosvob@bsdly.com
otjllo@bsdly.com
akuolsymwt@bsdly.com

I'll go out on a limb and guess that mx42.antispamcloud.com was contacted by any of the roughly 5000 hosts blacklisted at bsdly.net at the time, with an attempt to deliver a message with a MAIL FROM: of either anecowuutp@bsdly.com, pkgreewaa@bsdly.com, eemioiyv@bsdly.com or perhaps most likely keerheior@bsdly.com, which appears as a bounce-to address in the same hourly greylist dump where mx42.antispamcloud.com-1503146097-testing@bsdly.com first appears as a To: address.

The first seen time in epoch notation for keerheior@bsdly.com is

1503143365

which translates via date -r to

Sat Aug 19 13:49:25 CEST 2017

while mx42.antispamcloud.com-1503146097-testing@bsdly.com is first seen here at epoch 1503146138, which translates to Sat Aug 19 14:35:38 CEST 2017.

The data indicate that this initial (and only) attempt to contact was aimed at the bsdly.com domain's secondary mail exchanger, and was intercepted by the greylisting spamd that sits in the incoming signal path to there. The other epoch-tagged callbacks follow the same pattern, as can be seen from the data preserved here.

Whatever action or address triggered the callback, the callback appears to have followed the familiar script:

register attempt to deliver mail
look up the domain stated in the MAIL FROM: or perhaps even the HELO or EHLO
contact the domain's mail exchangers with the rump SMTP dialog quoted earlier
with no confirmation or otherwise of anything other than the fact that the domain's mail exchangers do listen on the expected port, proceed to whatever the next step is.

The known facts at this point are:

a mail system that is set up for SMTP callbacks received a request to deliver mail from keerheior@bsdly.com
the primary mail exchanger for bsdly.com has the IPv4 address 213.187.179.198

Both of these are likely true. The second we know for sure, and the first is quite likely. What is missing here is any consideration of where the request to deliver came from.

From the data we have here, we do not have any indication of what host contacted the system that initiated the callback. In a modern configuration, it is reasonable to expect that a receiving system checks for sender validity via any SPF, DKIM or DMARC records available, or for that matter, greylist and wait for the next attempt (in fact, greylisting before performing any other checks - as an OpenBSD spamd(8) setup would do by default - is likely to be the least resource intensive approach).

We have no indication that the system performing the SMTP callout used any such mechanism to find an indication as to whether the communication partner was in fact in any way connected to the domain it was trying to deliver mail for.

My hypothesis is that whatever code is running on the SMTP callback adherents' systems does not check the actual sending IP address, but assumes that any message claiming to be from a domain must in fact involve the primary mail exchanger of that domain and since the code likely predates the SPF, DKIM and DMARC specifications by at least a decade, it will not even try to check those types of information. Given the context it is a little odd but perhaps within reason that in all cases we see here, the callback is attempted not to the domain's primary mail exchanger, but the secondary.

With somebody or perhaps even several somebodies generating nonsense addresses in the bsdly.com domain at an appreciable rate (see the record of new spamtraps starting May 20th, 2017, alternate location here) and trying to deliver using those fake From: addresses to somewhere doing SMTP callback, it's not much of a stretch to assume that the code was naive enough to conclude that the purported sender domain's primary mail exchanger was indeed performing a dictionary attack.

The most useful lesson to take home from this sorry affair is likely to be that you need to kill SMTP callback setups in any system where you may find them. In today's environment, SMTP callbacks do not in fact provide useful information that is not available from other public sources, and naive use of results from those lookups is likely to harm unsuspecting third parties.

So,

If you are involved in selling or operating a system that behaves like the one described here and are in fact generating blacklists based on those very naive assumptions, you need to stop doing so right away.

Your mistaken assumptions help produce bad data which could lead to hard to debug problems for innocent third parties.

Or as we say in the trade, you are part of the problem.
If you are operating a system that does SMTP callbacks but doesn't do much else, you are part of a small problem and likely only inconveniencing yourself and your users.

The fossil record (aka the accumulated collection of spamtrap addresses at bsdly.net) indicates that the callback variant that includes epoch times is rare enough (approximately 100 unique hosts registered over a decade) that callback activity in total volume probably does not rise above the level of random background noise.

There may of course be callback variants that have other characteristics, and if you see a way to identify those from the data we have, I would very much like to hear from you.
If you are a customer of somebody selling antispam products, you have reason to demand an answer to just how, if at all, your antispam supplier utilizes SMTP callbacks. If they think it's a fine and current feature, you have probably been buying snake oil for years.
If you are the developer or maintainer of mail server code that contains the SMTP callbacks feature, please remove the code. Leaving it disabled by default is not sufficient. Removing the code is the only way to make sure the misfeature will never again be a source of confusing problems.

For some hints on what I consider a reasonable and necessary level of transparency in blacklist maintenance, please see my April 2013 piece Maintaining A Publicly Available Blacklist - Mechanisms And Principles.

The data this article is based on still exists and will be available for further study as long as the request comes wit a reasonable justification. I welcome comments in the comment field or via email (do factor in any possible greylist delay, though).

Any corrections or updates that I find necessary based on your responses will be appended to the article.

Update 2017-09-05: Since the article was originally published, we've seen a handful of further SMTP callback incidents. The last few we've handled by sending the following to the addresses that could be gleaned from whois on the domain name and source IP address (with mx.nxdomain.nx and 192.0.2.74 inserted as placeholders here to protect the ignorant):

Hi,

I see from my greylist dumps that the host identifying as

mx.nxdomain.nx, IP address 192.0.2.74

is performing what looks like SMTP callbacks, with the (non-existent of course) address

mx.nxdomain.nx-1504629949-testing@bsdly.com

as the RCPT TO: address.

It is likely that this activity has been triggered by spam campaigns using made up addresses in one of our little-used domains as from: addresses.

A series of recent incidents here following the same pattern are summarized in the article

http://bsdly.blogspot.com/2017/08/twenty-plus-years-on-smtp-callbacks-are.html

Briefly, the callbacks do not work as you expect. Please read the article and then disable that misfeature. Otherwise you will be complicit in generating false positives for your SMTP blacklist.

If you have any questions or concerns, please let me know.

Yours sincerely,

Peter N. M. Hansteen

If you've received a similarly-worded notice recently, you know why and may be closer to having a sanely run mail service.

Update 2017-11-02: It looks like the spamrl.com service still has customers that believe in the same claim made in the bounce messages quoted earlier in this article and perform SMTP callbacks exactly like they did when this article was first written. If you have any verifiable information on that outfit and their activities, I would very much like to hear from you.

9 comments:

AnonymousAugust 30, 2017 at 1:50 AM
Actually, callback code is useful in some situations. For example, exim allows you to make a callout for the recipient, which is extremely useful in the case of a mail relay where you don't know which users exist on the backend. But I agree with you it's almost always a bad idea when used to verify the sender.
AnonymousSeptember 10, 2017 at 1:14 AM
OpenBsd with i3wm is a great optiom,try it!
XyoreOctober 18, 2017 at 9:02 PM
Very nice post, thank you for sharing
grinNovember 30, 2017 at 12:23 PM
I am not entirely sure that you'd appreciate my comment, and hesitating to write it. Anyway.

Your post is a bit misguided, based on lots of guesswork and firm opinions. There are other possible chains of events where your results aren't matching the facts. Let's see some.

First, greylisting is ineffective, and it has to die!

Nice ain't it? Well, technically, it has its bright and dark sides and often causes so much trouble without the gains that it really should be exterminated. However I wouldn't write a blog post about it and definitely wouldn't give it such title since it'd only show my ignorance and opinionated approach. I live with greylisting servers around, and try to counteract their braindeadness. However, this is not my main topic, I just wanted to show _your_ approach to SAV (sender address verification).

So, first, what are you observing?

You see the part where the external server checks whether you'd accept _any_ email as valid: it tries to send you a patent nonsense address and see whether you accept it as valid or not. Either way it gets cached and you're not bothered about it for a longer while. You can defeat it by _not_ replying, then there's nothing to cache, and the checks will be repeatedly seen. (Unless the admin specifically modifies the caching to actually neg-cache non-responding and smtp-invalid hosts.) You decide how useful it is for you.

Then, as you have correctly guessed, this information is used for SAV: if the server is known to accept any garbage as valid then there's no point to go on with verification, and you won't see any of such; when the server properly rejects the rubbish the check will verify whether the sender is deliverable, and act on it.

(to be continued)
grinNovember 30, 2017 at 12:24 PM
Let's see if you'd reached this point in your analysis (theoretically), what could be the reason behind SAV?

You would guess "filtering spam". Partially, yes, since - despite your guessing - there are plenty of spam using invalid (random) senders, and plenty of once-valid senders are disabled by the original provider noticing that it's been used for spamming, so the email nicely gets rejected. I mean, in reality, not in theory (that's "fact based").

But the main reason for many people are _deliverability_. The SAV ensures one important thing: the email coming in contain a Return-path with would accept the reply, DSN, bounce, or similar.

What happens when there's no SAV, and the email gets accepted, and later it turns out that the Return-Path is bogus? Well, you have an email, which cannot be delivered to the recipient (for various reasons, including policy, full mailbox, timeouts, whatever) and cannot be delivered to the sender (since it isn't deliverable).

What then? Either you keep it forever (it's not really feasible on the long term, apart from it being completely pointless) or you detroy it, both means "you lose email".

You said you've been around for a long time. You may remember those RFCs and discussions, when people said "never, ever, EVER lose email; deliver it this way or that way, don't lose it; reliability is king". Yep. It's been a long time ago. The North still remembers. ;-)

So, yes, reliability. Greylist causes (valid) mail to be delayed or bounced, but not to be lost: greylist is not evil. SAV ensures mail not to be lost: SAV is not evil. Receiving mail, and then deleting it is evil: it's lost without a useful trace (in the world of non-responding mail relay admins). Maybe when DMARC will be universal and those detroyed mail will be traceable (which I sincerely doubt ever would happen, and it'd use resources a few magnitudes more than it'd use now), maybe then SAV would lose its ground. Until then it serves a specific purpose and while it may be your opinion to dislike it, technically you are possibly able to understand the utility of it.

(And let me point out that killing off VRFY haven't offered much gain since the same just happens by RCPT TO, so the criminals are just fine while the valid users suffer. Neither "solution" have stopped spam flowing, nor did greylist or SAV or BATV or SPF or ARC or ....)

Sidenote: discussion using this submicro comment window is extremely painful.

Sidenote2: it was rejected as "can't be longer than 4096 chars" while it's 3819.
AnonymousOctober 9, 2018 at 8:35 PM
Behind spamrl.com is Spam Experts. It is the only software that can access this list (spamrl.com is not publicly available). On top of this, if you check Spam Expert's knowledge base, they make it clear that spamrl.com is part of their commercial operations.
mamtaJuly 25, 2020 at 4:03 PM
I am really grateful to read this article, this is nice and informative thanks for sharing.
J. E. KlasekDecember 15, 2022 at 12:14 PM
Just a very minor note on date command and epoch time conversion: with GNU tools I would expect only "date -d @EPOCHTIMEINSECONDS" to work ... the stated "date -r SECONDS" won't generally work (only on BSD environments?). Or did I miss something?

Note: Comments are moderated. On-topic messages will be liberated from the holding queue at semi-random (hopefully short) intervals.

I invite comment on all aspects of the material I publish and I read all submitted comments. I occasionally respond in comments, but please do not assume that your comment will compel me to produce a public or immediate response.

Please note that comments consisting of only a single word or only a URL with no indication why that link is useful in the context will be immediately recycled so those poor electrons get another shot at a meaningful existence.

If your suggestions are useful enough to make me write on a specific topic, I will do my best to give credit where credit is due.

That grumpy BSD guy

Sunday, August 27, 2017

Twenty-plus years on, SMTP callbacks are still pointless and need to die

9 comments:

About Me

Buy The Book of PF

Post, tweet, follow

Upcoming Talks

Blog Archive

Follow me on Twitter

Twitter Updates

Friends

Links to other nice sites

Popular Posts

Total Pageviews

Amazon Deals (#ad)

Amazon Deals (#ad)

That grumpy BSD guy

Sunday, August 27, 2017

Twenty-plus years on, SMTP callbacks are still pointless and need to die

9 comments:

About Me

Buy The Book of PF

Post, tweet, follow

Upcoming Talks

Blog Archive

Subscribe To Posts From Here

Follow me on Twitter

Twitter Updates

Friends

Links to other nice sites

Popular Posts

Total Pageviews

Amazon Deals (#ad)

Amazon Deals (#ad)