Last month DomainKeys Identified Mail (DKIM) was accepted as a chartered IETF effort, and since then it’s been touted as the next big thing in anti-spam. To clear up my own confusion about the framework, I went straight to the source and dove headfirst into RFC4871 and DKIM.org to get a good look at it.
If you’re not familiar with DKIM already, it’s a mechanism that allows Mail Transfer Agents (MTAs) and Mail User Agents (MUAs) to cryptographically sign outgoing e-mails so that the receiving mail agents can verify both the signer and the integrity of the message. It does this using public/private key pairs and a key server (currently implemented using DNS).
A quick glance at the top of the RFC shows that the DKIM standard is backed by some pretty important industry names: Sendmail, Inc., PGP Corporation, Yahoo! Inc., and Cisco Systems, Inc. Widespread industry support is extremely important when adopting a new standard, especially one relating to anti-spam tech. So I also took time to scan the list of organizations on dkim.org that publicly support DKIM. Some of the more significant names include AOL, EarthLink, eBay/PayPal and qmail.org.
After determining the kind of support DKIM has already received, I delved into the RFC abstract and noticed a fairly important message that seems almost slipped in at the end:
"Protection of email identity may assist in the global control of 'spam' and 'phishing'." (RFC4871)
Read that again: “… may assist in the global control of …” In other words, DKIM is not specifically an anti-spam technology. In fact, if you read the “DKIM: Introduction and Overview” article on DKIM.org, you’ll see it specifically stated on the “Overview of DKIM” slide. Instead, it provides a mechanism for attempting to verify that the message sender is really who they say they are.
When a message is signed with DKIM, the signing mail agent adds the signature to the header of the message. The signature contains the signer’s domain; a selector to use in the public key lookup; the cryptographic algorithm used to generate the digest; the digest itself; a public-key lookup mechanism (currently only DNS is supported); and a number of configuration parameters.
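To make those fields a little more concrete, here is a minimal Python sketch that splits a DKIM-Signature header value into its tags. The tag names (v, a, c, d, s, q, h, bh, b) are the ones RFC4871 defines; everything else is an assumption for illustration: the function name parse_dkim_tags is hypothetical, and the sample domain, selector and base64 strings are made-up placeholders, not a real signature.

```python
def parse_dkim_tags(header_value: str) -> dict[str, str]:
    """Split a DKIM-Signature header value into a {tag: value} dict."""
    tags: dict[str, str] = {}
    for part in header_value.split(";"):
        name, sep, value = part.partition("=")
        if not sep:
            continue  # skip empty pieces such as the one after a trailing ";"
        # Drop folding whitespace, which may appear inside long values.
        tags[name.strip()] = "".join(value.split())
    return tags


# Placeholder header value: the base64 strings are NOT a real hash or signature.
SAMPLE_SIGNATURE = (
    "v=1; a=rsa-sha256; c=relaxed/simple; d=example.com; s=mail2007; "
    "q=dns/txt; h=from:to:subject:date; "
    "bh=cGxhY2Vob2xkZXI=; b=cGxhY2Vob2xkZXI="
)

tags = parse_dkim_tags(SAMPLE_SIGNATURE)
print(tags["d"], tags["s"], tags["a"])  # signing domain, selector, algorithm
```

Here d is the signing domain, s the selector, a the algorithm, q the lookup mechanism, h the list of signed header fields, bh the hash of the body, and b the signature data. A real verifier does far more than this (canonicalization, hashing, the actual signature check), so treat it as a reading aid only.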
On the other end of the pipeline, the receiving agent uses the signature fields to create a DNS query that allows the message header and contents to be verified against the sending domain’s public key. Public keys are looked up via domain name, and it can be assumed that the owner of the domain is responsible for their DNS records, hence there’s no need for a centralized authority to maintain and verify the public keys.
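As a rough illustration of that lookup, the sketch below builds the DNS name a verifier would query and parses the kind of TXT record it might get back. The "selector._domainkey.domain" naming and the v, k and p tags come from RFC4871; the function names, the sample selector and the key value are hypothetical placeholders, and no real DNS query is made here.

```python
def dkim_query_name(selector: str, domain: str) -> str:
    """Build the DNS name under which a signer publishes its public key."""
    return f"{selector}._domainkey.{domain}"


def parse_key_record(txt: str) -> dict[str, str]:
    """Parse a 'tag=value; tag=value' key record into a dict."""
    record: dict[str, str] = {}
    for part in txt.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            record[name.strip()] = "".join(value.split())
    return record


# Placeholder TXT record: the p= value is NOT a real public key.
SAMPLE_TXT = "v=DKIM1; k=rsa; p=cGxhY2Vob2xkZXI="

name = dkim_query_name("mail2007", "example.com")
record = parse_key_record(SAMPLE_TXT)
print(name)
print(record["k"], record["p"])
```

In a real verifier the TXT record would come from a DNS query for that name, and the p value would be decoded into a public key and used to check the signature from the message header.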
I’ve grossly over-simplified the mechanism in my description, but I don’t want to go too deep into the actual mechanics of DKIM (that’s what RFCs are for). I do want to point out a few things about using the standard in an anti-spam capacity.
First of all, as previously stated, DKIM is not an anti-spam mechanism itself. There is no reason a spammer could not simply set up DKIM for their domains, sign their messages, and have the receiving mail agents happily report that the message is validly signed.
Of course, a user could always just block any messages received from the signing domain as soon as it’s determined to be spam, but what if the senders start using arbitrary domain names to send a single mass mailing? After all, “asdfghjkl.com” can be signed just as easily as “iamaspammer.com.” Once a single mailing is sent out on asdfghjkl.com, the spammer could discard the domain and move on to asdfghjkm.com and so on.
The eventual goal is to have all legitimate mail signed. At that point, any unsigned messages could be dropped as “bad,” forcing the spammers to use DKIM as well. That goal is still a long way off, so for now DKIM suggests that unsigned messages simply fall through to regular spam filters for processing.
However, this goal means that a lot of DKIM’s worth is wrapped into the idea of senders being able to build a reputation. Signers who frequently send “good” or “bad” messages will build a reputation for doing so. Spam filters can then be configured to be more lenient on DKIM signers who have “good” reputations and more strict on those who have “bad” reputations.
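Here is one way that idea might look inside a filter, purely as a sketch. Nothing in it comes from the DKIM specification: the reputation scores (from -1.0 for “bad” to 1.0 for “good”), the thresholds and the function names are all assumptions made up for illustration. Unsigned mail and mail from signers with no history get the ordinary threshold, which matches the fall-through behavior described above.

```python
from typing import Mapping, Optional

BASE_THRESHOLD = 5.0   # hypothetical content score at which mail is treated as spam
MAX_ADJUSTMENT = 2.0   # hypothetical limit on how far reputation can move the threshold


def spam_threshold(signing_domain: Optional[str],
                   reputations: Mapping[str, float]) -> float:
    """Return the spam threshold to apply to a message.

    signing_domain is the verified DKIM signing domain, or None when the
    message is unsigned or its signature failed verification.
    """
    if signing_domain is None:
        return BASE_THRESHOLD  # unsigned mail falls through to normal filtering
    # Unknown signers (for example a throwaway domain) start out neutral.
    reputation = reputations.get(signing_domain, 0.0)
    reputation = max(-1.0, min(1.0, reputation))  # clamp to [-1.0, 1.0]
    return BASE_THRESHOLD + MAX_ADJUSTMENT * reputation


def is_spam(content_score: float,
            signing_domain: Optional[str],
            reputations: Mapping[str, float]) -> bool:
    """Apply the reputation-adjusted threshold to a content filter score."""
    return content_score >= spam_threshold(signing_domain, reputations)


# Made-up reputation data for illustration only.
REPUTATIONS = {"example.com": 1.0, "iamaspammer.com": -1.0}

print(is_spam(6.0, "example.com", REPUTATIONS))
print(is_spam(6.0, None, REPUTATIONS))
```

A good reputation raises the bar a message has to clear before it is called spam, and a bad one lowers it. Note that a brand-new domain is treated the same as an unsigned message, which is exactly why throwaway domains are a problem for this approach.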
Herein lies the problem … once all mail is being signed, how do we determine which senders are sending “good” messages and which are sending “bad” messages? Certainly the messages may still be passed through regular spam filtering mechanisms, but what happens if a message makes it through them? How do we utilize DKIM as a reputation system? The only way I can think of is to give the end-user receiving the message a way to report it as “bad.”
This brings us back to the “spam” button at the top of your e-mail console, along with a new host of questions. For example, should the reputation system be a centralized global system, or should ISPs set up a system for use by only their users? Does every message that’s reported as spam degrade the signer’s reputation? How do MUAs and MTAs query the reputation system? There are a lot of open-ended questions regarding what to do after everyone (including the spammers) is properly signing their messages.
In conclusion, as a standard for signing and verifying messages, DKIM certainly fits the bill, but there are too many open questions about how it would work as an anti-spam tool once it’s widely deployed as a verification tool. These questions need to be answered before DKIM can really gain traction as an anti-spam heuristic.
Maybe I’ll write that RFC …
- Tekkie
Resources: DKIM home page - RFC4871