Fixing the SPAM problem once and for all
White paper

Revision 1.1
George Ou
Copyright August 12, 2003

Contents:
Introduction: What is SPAM?
Fundamental weaknesses in email: Understanding the fundamental weaknesses of the email and SMTP protocol
Partial solutions and schemes: Things that will never work in the long run
Defining the solution: Requirements for a long term solution that is practical and enforceable
Fixing the foundation: Securing the SMTP servers, the most deployable solution
Client side control: Client specified server side filters and client side filters
Standardized abuse statistics: Abuse statistics and applying formulas to calculate server scores
Kick starting the solution: How the major ISPs alone can jump start support for such a standard
Final words: The purpose of this paper


Introduction:
Although spam is defined as any piece of unsolicited email, it more commonly consists of scams that flood millions of mailboxes, and it is rapidly becoming the scourge of the Internet. Every known scheme in the con man's and snake oil salesman's book is now in electronic mail format, hoping to catch a few gullible victims out of thousands and in the process causing massive collateral damage to our email infrastructure. Spam traffic has now easily surpassed legitimate email and has rendered many mailboxes virtually unusable. Even mobile devices that have to pay by the byte for each message they receive are beginning to get spam. This threatens not only the growth and viability of mobile electronic messaging, but email in general as a valuable communications tool.


Fundamental weaknesses in email:
The SMTP (Simple Mail Transfer Protocol) email protocol is fundamentally flawed because it was never designed to be secure in the first place and lacks any authentication of the source of an email. Simply put, SMTP is based on the honor system, with no way to confirm the authenticity of the sender, let alone track the sender. What this means is that anyone can send email under any assumed identity from anywhere in the world. I can say I'm the CEO of your company or I can say I'm the Pope when I send you an email, and there is no way to confirm or deny its legitimacy. Although it may come as a shock that we are standing on such a shoddy foundation, it really isn't that difficult to understand: at the time of its invention, SMTP was only used among the few academics and researchers of the world, and security was never an issue. A lesser but still important issue is that SMTP servers used to be open relays, which anyone could use, abuse, or spam from as they pleased. As a result, most email administrators now lock down their SMTP servers so that they no longer behave in this promiscuous way, and configure them to relay email only from their own network or from users that have authenticated with the server. These lock down efforts, however, have not reduced the capacity of spammers to spam, because there is nothing to prevent you from putting up your own SMTP server. Just about any modern Windows, Unix, or Linux operating system has its own built in SMTP server capabilities. Even viruses as small as 10 kilobytes can contain their own SMTP server to spread themselves. Because of this, spammers can send hundreds of thousands of unsolicited messages in a few hours from abundant unsecured wired or wireless networks with virtual impunity and no traceability. Laws prohibiting this are all but useless because spammers are already engaging in felonious fraud and think nothing of breaking another law that can't even be enforced.
The fundamental problem with email is that SMTP has no mechanism for authenticating the source of the message, and therefore has no way to enforce proper behavior against those who would abuse it even within our own borders let alone offshore. 
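
To make the honor system concrete, here is a minimal sketch using Python's standard `email` library. The addresses and subject line are made-up placeholders, and nothing is actually sent. The point is only that the From header is an ordinary text field filled in by whoever composes the message, and nothing in the message format ties it to a verified identity.

```python
from email.message import EmailMessage


def build_message(claimed_sender: str, recipient: str, body: str) -> EmailMessage:
    """Build a message whose From header is whatever the caller claims."""
    msg = EmailMessage()
    msg["From"] = claimed_sender  # plain text; no identity check happens here
    msg["To"] = recipient
    msg["Subject"] = "Quarterly results"  # placeholder subject
    msg.set_content(body)
    return msg


# Placeholder addresses; the library accepts any claimed sender.
msg = build_message("ceo@example.com", "employee@example.com", "Please see attached.")
print(msg["From"])
```

A receiving SMTP v1 server sees the same unverified text, which is why the rest of this paper treats sender authentication as the foundation to fix first.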


Partial solutions and schemes:
As with any type of conflict, there are measures and counter measures. The problem with the current spam conflict is that we are fighting an uphill battle because of the fundamental weaknesses in SMTP. Any solution devised to fight under the current email standard is effective at most while it is used by so few people that spammers don't care to target them. The minute any solution establishes a large enough user base, its users become targets and its defenses are easily overcome. This section lists some of the most common anti-spam techniques and how they can all be circumvented.

Filtering (keyword matching or heuristic algorithms):
Keyword matching is one of the most common solutions out there and is only somewhat effective. Heuristic statistical analysis algorithms are more sophisticated, but they still cannot be perfect. Just about any type of email filtering system will have a certain percentage of false positives and false negatives. Case in point: in my own experience, these filters have had false positive rates of only 0.01%, so you would lose 10 legitimate emails for every 100,000 pieces of spam that are collected during that time. Although 10 out of 100,000 messages seems like a really good score, you wouldn't be so happy if one of those 10 messages was yours or if one was extremely important. Even if you accept those kinds of statistics, spammers still hold the ultimate ace up their sleeves with anti-parsing techniques such as bad spelling or padding a word like sex as S.E.X with symbols between the characters. Any filtering system you can buy can also be bought by the spammer to "test" their payload, and they will simply modify the message until it passes the filter. No automated algorithm stands a chance against these tactics, and even if one did, spammers can resort to sending spam in Flash or handwritten bitmaps, which eliminates any chance of software analyzing the message. Filtering will never work unless you have a trained professional human to read through every message, which is not possible without a bottomless pit of money.
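
The two points above can be sketched in a few lines of Python. This is an illustrative toy, not any real product's filter: the word list and function names are invented for the example. It shows a naive keyword matcher missing a padded word, and the arithmetic behind the 0.01% false positive figure.

```python
import re

# Hypothetical block list for illustration only.
BLOCKED_WORDS = {"sex", "viagra"}


def naive_filter(text: str) -> bool:
    """Return True if any whole word in the text is on the block list."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in BLOCKED_WORDS for word in words)


def expected_false_positives(messages: int, false_positive_rate: float) -> float:
    """Expected number of messages wrongly discarded at a given rate."""
    return messages * false_positive_rate


print(naive_filter("Cheap sex pills"))    # matched: "sex" appears as a whole word
print(naive_filter("Cheap S.E.X pills"))  # missed: the padding splits it into s, e, x
print(expected_false_positives(100_000, 0.0001))  # about 10 at a 0.01% rate
```

A smarter filter could strip the punctuation first, but the spammer can then test against that filter too and move on to the next disguise, which is the arms race described above.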

If you look closely at the message shown to the right, that is not HTML. That is a bitmap which I received in my MSN inbox, which normally filters a lot of the spam. The reason it came through is that it can't be read by text filters. In order to filter something like this, the filter would have to have an OCR (Optical Character Recognition) engine built in. Anyone with any experience with OCR is well aware of how inaccurate it can be if someone just accidentally uses a fancy font. Now imagine if someone didn't want you to OCR their message. This is why filters will never work no matter how "intelligent" the filtering software purports to be.
This isn't your father's spam!  This is bitmap spam designed to defeat any anti-spam filtering system.


Blacklisting or Whitelisting:
Blacklisting is where you block known spammers or even the IP addresses of known spammers. Blocking known email addresses will only work if you can verify the authenticity of the sender. Since verifying the sender is impossible with the current SMTP protocol, blacklisting is useless and often results in collateral damage to legitimate messages. This is exactly what the spammers intend to happen if you attempt to block them in this manner. Blocking IP addresses has also become a popular practice, but it has a nasty habit of sometimes punishing the innocent while the spammer flees to another IP address. You have to keep in mind that with the scarcity of IPv4 addresses, many of us poor souls have to share our IP addresses with other domains. A good system should punish the offending SMTP server, not the IP addresses it came through. Whitelisting is exactly the same as blacklisting, only it contains a list of known good senders instead of known bad ones, and like blacklisting it suffers from the same fundamental shortcoming: lack of verification.

Copyright enforcement schemes:
One of the most ridiculous schemes I've ever seen is where a company offers you your own copyrighted poem to stamp at the end of each message. The company promises to sue any person or business that uses your poem to stamp their own messages. As a result, that poem could be used as a form of "digital signature" (I use this term loosely) which could be used to put you in a whitelist. This scheme would be perfect if spammers operated within the law in this country, let alone in some 3rd world country. The fact that someone can spam from anywhere on the planet with virtual impunity and stealth means this scheme is absolutely useless. The fact that truly strong digital signature technology such as PGP or the standard X.509 digital signature has long existed makes one wonder how such a company can get so much support behind it.

Do-not-spam lists:
I have written extensively about this topic and you can read it here. To quickly summarize, this is absolutely the dumbest thing you can do right now. Such a list would be a spammer's paradise. Spammers spend significant resources to find lists of legitimate email addresses, and now you are just going to hand them the biggest one of all for free. Any do-not-spam list is effectively a please-spam-me list.

Go after the beneficiary legal tactics:
Recently on Charlie Rose's PBS program, they had a panel of experts from industry, press, and government. One of the suggestions brought up was something to the effect that even though you cannot track down the person that actually sent the spam, you can always track down the source of the spam if you simply go after the product or service being advertised. Well, I'm no lawyer, but it would seem to me that people or organizations are innocent until proven guilty. Given that it is so easy to send spam with impunity and that there is no way to track down the person who sent the spam, how do you prove that the spammer was hired by the person or business that allegedly benefited from the spam? A company could simply pay a so called "marketing agency" to "advertise wink-wink" for them when in fact the agency is really just a front for spammers. How do you prove the agency is resorting to spam if you can never catch them in the act? The minute you start bringing these cases to court, if I were the defendant in such a case, I would simply plead that I never paid or hired anyone to spam for me. Not only can you not prove that I paid or hired the spammer, but I could prove to you that it was extremely easy for someone with a grudge against me to frame me. Taking this proposal to its logical conclusion, I could conceivably sue any large company for allegedly sending spam to me and others when in fact I was the one who secretly spammed myself and others on behalf of that company to facilitate a frivolous lawsuit. I don't see how such a law could ever be enforceable or fair, because you could never prove a company paid a spammer to spam for them, nor could you prove their marketing agencies are resorting to spam. Remember, you're dealing with professional scam artists who already operate outside the law. Under the current unauthenticated email system, laws are useless.

All of these methods are doomed to failure because the rules of the game give an infinite advantage to the spammer. You can put titanium alloy reinforcement on a building only to watch it collapse if the foundation is weak, and this is exactly what we face with the current spam epidemic.


Defining the solution:  

  • The solution must be universally compatible, and could adopt multiple solutions as part of the standard.
  • The solution must first fix the SMTP protocol itself, primarily the spoofing problem.
  • The next version of SMTP must be downwards compatible.
  • Any new SMTP protocol must be simple and cheap to implement.
  • Once such a protocol is devised, an intelligent and seamless transition phase must be devised.
  • The initial pace of the transition phase must be in the control of each end user.
  • A standardized public abuse statistic database must be devised.
  • Users must be able to set their own parameters for rejecting mail before or after an email is ever transmitted.
  • Mail that is allowed through can be stamped with a standard header to facilitate priority sorting.


Fixing the foundation:
The only way to level the playing field against spam is to upgrade the SMTP protocol beyond the honor system and make spoofing nearly impossible. For the remainder of this document, I will call the new protocol SMTP v2 and the existing SMTP protocol SMTP v1. Unlike some who are suggesting a new SMTP protocol altogether, which could never be implemented except at the barrel of a gun, SMTP v2 should be downwards compatible with the existing protocol to facilitate a seamless migration. To make SMTP v2 possible, a new set of security extensions for the existing SMTP protocol would need to be developed. Following existing naming conventions, such a set of extensions could simply be called SMTPSEC, as in IPSEC or DNSSEC, all of which use strong authentication and strong encryption. In searching the news groups and Google, I didn't find a single reference to the word SMTPSEC used in this manner, so I am proposing the use of the word SMTPSEC. SMTP plus SMTPSEC will be the new SMTP v2 protocol.

To make SMTP v2 spoof proof, we can consider the following candidates:

  • Digital certificates
        SMTP server digital certificates
        End-user digital certificates
  • IP validation:
        DNS Reverse lookup
        Create new DNS field type to define valid SMTP relay servers
        Require DNSSEC (authenticated DNS)

Digital certificates:
Only one technology can truly fulfill those requirements, and that is public key cryptography. ITU (International Telecommunication Union) standard X.509 digital certificates are widely deployed for applications that require strong authentication. There are two ways they can be used to strengthen the SMTP protocol. You either require X.509 digital certificates for each and every end user, or you require digital certificates on just the SMTP servers for SMTP server to SMTP server authentication. Since end users outnumber SMTP servers by at least 100 to 1, the only feasible solution is the latter. You will never be able to convince each and every user to go through the hassle of getting their own public digital certificate, and requiring one also makes it impossible for them to use web-based email from public consoles, because users don't carry their digital certificate everywhere they go. This is exactly why you don't see client side certificate deployment for E-Commerce sites and why the vast majority of SSL implementations only use server side certificates. Even ignoring the costs, management of all those end user certificates is a massive undertaking because of the physical verification process for all those users. Although strong SMTP server authentication only validates the identity of an SMTP server's domain and not each and every one of its end users, it does make it extremely likely that the end user (sender) is in fact who they say they are, provided their SMTP server enforces strict mail relay from only authenticated users and protects its users' credentials with SSL (Secure Sockets Layer) based email clients. Fortunately, the same digital certificate used to prove the SMTP server's identity to other SMTP servers can also be used to facilitate secure SSL encrypted communications server-to-server and server-to-end user with SSL enabled POP, IMAP, and Web-based email clients.
Not only is this technology nothing new, but the concept of using only server side certificates is widely deployed everywhere from online shopping to Wireless LAN security because of its ease of implementation and relatively strong security.

This proposed standard doesn't exclude end users from getting their own digital certificate for true end-to-end authentication and cryptography; it simply doesn't require them, for cost and deployment considerations. Although end-to-end authentication is a laudable and perhaps ideal goal, mandating its universal implementation will only serve to hamper any kind of solution for the foreseeable future. However, there is no reason end user certificates can't be integrated into the new SMTPSEC standard as an optional enhanced security level in addition to the mandated SMTP server certificates. There is room enough in the proposed standard for both certificate based approaches. The digital certificate would verifiably bind the SMTP server's 1024-bit public key to its domain, and for an optional higher level of authentication it could also include individual name, business or organization name, business license, business address, and other binding information. The certificate is verified and then digitally signed by a publicly trusted CA (Certificate Authority) such as Verisign, Thawte, or some other public CA. The only possible way you can fake an X.509 certificate is if you can steal the private key of the possessor of the certificate, and you can forget about stealing a public CA's private keys because they guard them with their lives. Key stealing is extremely impractical because even if you manage to steal a private key, you can only use it a few times before that digital certificate is revoked. As for brute forcing a 1024-bit digital certificate, the best estimate is that even a world record attempt by a supercomputer cluster is at least 30 years away, and the certificate is only good for 1 to 5 years anyway. Another important feature of this solution is that you can use your SMTP server's private key to sign other 3rd party SMTP servers' digital certificates so that they can relay on behalf of your SMTP server for a specified amount of time.
This is an important feature because SMTP relay is an important part of the email system. Because of all these benefits, digital certificates are the ideal candidate for SMTPSEC and SMTP v2, as they are with just about any other "SEC" extension.
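
SMTPSEC as proposed here does not exist yet, so the following is only an analogy built on technology that does: the STARTTLS extension to SMTP, as exposed by Python's standard `smtplib` and `ssl` modules. The sketch checks whether a mail server presents a certificate that chains to a trusted CA and matches the host name we connected to, which is the server side certificate check this section argues for. The function names are my own, and no particular host is assumed.

```python
import smtplib
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Context that requires a CA-signed certificate matching the host name."""
    # The default context turns on certificate verification and host name checks.
    return ssl.create_default_context()


def server_presents_valid_certificate(host: str, port: int = 25,
                                      timeout: float = 10.0) -> bool:
    """Return True if the server completes STARTTLS with a verifiable certificate."""
    context = strict_tls_context()
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            smtp.ehlo()
            # Raises ssl.SSLError if the certificate cannot be verified,
            # or an SMTPException if the server does not offer STARTTLS.
            smtp.starttls(context=context)
            return True
    except (ssl.SSLError, smtplib.SMTPException, OSError):
        return False
```

Note the difference in direction: STARTTLS as sketched lets the connecting side verify the receiving server, whereas the proposal above also needs the receiving server to verify the sending server's certificate. The certificate checking logic is the same in both directions.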

Once you look at the billions of dollars used to combat spam with little or no success, you can move beyond the $300/domain mail server certificate.  Not only do you end spoofing, but that same mail server certificate can also be used to facilitate SSL enabled POP3, IMAP, and Webmail.  Now isn't that worth $300 for an entire domain?


SMTP server IP validation:

IP validation is another solution proposed by some in the industry, but it is nowhere near as secure or flexible as the certificate based approach described above. One of the current methods that can be employed is DNS reverse lookup. This method basically compares the IP address of the sending SMTP server against the publicly listed owner of the IP address. Unfortunately, most people and businesses don't own their own IP address blocks and instead "rent" their IP blocks from their Internet Service Provider. Although most IP renters can control their DNS forward lookup zones, few have the luxury of controlling their public reverse lookup zones. This leaves DNS reverse lookup out of reach for all but the ISPs and large corporations that own their own IP blocks, which automatically makes DNS reverse lookup a poor solution because it lacks universal compatibility.

Another proposed method is to upgrade the DNS specification with a new field type that defines the valid SMTP relay servers of a domain, perhaps called an "RX" DNS field for "Relay Exchange", whereas the existing "MX" (Mail Exchange) field is used to define the inbound mail server for a domain. Some will criticize this approach as spoofable because IP address headers can be forged, and while there is some merit to that concern, the reality is that you cannot easily spoof an IP address where a connection oriented TCP session such as SMTP is involved. IP spoofing is not as simple as just having raw sockets in your operating system; you must also be able to hijack Internet routing or the routers that control it. Sure, you can just send off some IP packets with forged IP source headers, but the instant the other side responds, the response packet will go to the rightful owner of the IP address and not to the spoofer, making it extremely difficult to have the back and forth dialog required in a connection oriented session. Nevertheless, there are some rare instances where a blind timed attack can spoof an entire TCP session, and therefore IP validation is nowhere near as secure as using SMTPSEC. Even if the newly proposed DNSSEC were employed, that would only serve to guarantee that the IP lookup from that DNSSEC server for the "RX" record is accurate. However, that doesn't help you if someone can spoof that "accurate" IP address, and you're right back to requiring SMTPSEC per domain to truly secure SMTP v2. This is exactly why digital certificates on Web servers can be successfully used to secure E-Commerce sites without the use of DNSSEC, whereas the converse is not true. That same logic also applies to SMTP. Of course, I'm not downplaying the need for DNSSEC, since it is important for other security reasons, just not for these.
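
For readers who want to see what the "RX" idea amounts to, here is a minimal sketch. No such DNS record type exists, so an in-memory table stands in for the DNS lookup, and the domains and addresses are reserved documentation examples. The receiving server simply asks whether the connecting IP address is on the list the sender's domain has published.

```python
import ipaddress

# Hypothetical stand-in for "RX" DNS records: domain -> permitted relay addresses.
RX_RECORDS = {
    "example.com": ["192.0.2.10", "192.0.2.11"],
    "example.org": ["198.51.100.0/28"],
}


def is_authorized_relay(sender_domain: str, connecting_ip: str) -> bool:
    """Return True if the connecting IP is listed as a relay for the sender's domain."""
    ip = ipaddress.ip_address(connecting_ip)
    for entry in RX_RECORDS.get(sender_domain.lower(), []):
        # A bare address is treated as a single-host network.
        if ip in ipaddress.ip_network(entry):
            return True
    return False


print(is_authorized_relay("example.com", "192.0.2.10"))   # listed relay
print(is_authorized_relay("example.com", "203.0.113.5"))  # not listed
```

The check is only as trustworthy as the connecting IP address itself, which is the weakness discussed in this section.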

Aside from its weaker security, DNS validation also runs into problems because of its requirement for every domain to have exclusive use of its own IP address. Oftentimes a single IP address is shared by multiple organizations and their SMTP servers. Although DNS validation at least means you don't need to own your own IP block outright, it does mean you cannot share your IP address with anyone else, and this can be a problem for some. The biggest problem with DNS validation is that it requires an upgrade to the DNS infrastructure in addition to the SMTP infrastructure. Why require an upgrade to DNS as well as SMTP when we're trying to solve the email problem, especially since SMTPSEC is much more secure and only requires an upgrade to the SMTP protocol and not DNS? Therefore, I believe IP validation should not be employed in SMTPSEC for SMTP v2, and that digital certificates are the only way to go.


Client side control:
Once the foundation of email is secured by SMTPSEC under SMTP v2, we can begin the transition phase away from SMTP v1. One of the biggest problems with current anti-SPAM filtering technology, aside from the fact that it doesn't work well, is that the damage is already partially done even if the filter works. What I mean by that is that the bandwidth is already wasted before the filter even gets a chance to do its job. Under the SMTP v2 protocol, users could be given the chance by their ISP to opt out of SMTP v1 outright through a simple user preferences database which the user controls. If the user chooses to accept SMTP v1 (most will during the initial transition phase), each message could be stamped with an SMTP v1 or v2 header before it is given to the end user's email client. That email client could have two inboxes, one for SMTP v1 unauthenticated email and one for SMTP v2 authenticated email. The actual sender is implicitly trusted as likely accurate if the SMTP server verifies its clients before sending email on their behalf. Existing techniques for filtering can still be applied to the legacy inbox, although with less and less efficacy. This would basically set up a two tier class system for email, verified and unverified. As SMTP v1 becomes inundated by spam, users will be less and less inclined to trudge through their massively polluted SMTP v1 inbox, if that isn't already happening now. If you want your email to be read first (or even read at all), you will insist that your ISP supports SMTP v2, or you will find a new ISP that does, so that your messages end up in the SMTP v2 inbox. No one will want to use second class email where delivery is questionable. Once critical mass for SMTP v2 is achieved, the industry can begin to negotiate a termination date for SMTP v1. Of course, I am not claiming that SMTP v2 itself will automatically end spam; it only makes its senders accountable and traceable.
Some of those techniques that had very little meaning under SMTP v1 like new laws, blacklists, and whitelists will all of a sudden have meaning under SMTP v2.  SMTP v2 only fixes the foundation, but now we can begin to rebuild the house on this new solid foundation with meaningful laws, filters, whitelists, and blacklists.
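
The two inbox idea is easy to sketch. The header name `X-SMTP-Version` below is invented for illustration, since the proposal does not define one, and the addresses are placeholders. Messages stamped as version 2 go to the authenticated inbox, everything else goes to the legacy inbox, and a user who has opted out of SMTP v1 never sees the legacy mail.

```python
from email.message import EmailMessage

VERSION_HEADER = "X-SMTP-Version"  # hypothetical header stamped by the receiving server


def make_message(sender: str, version: str) -> EmailMessage:
    """Build a sample message stamped with the (hypothetical) version header."""
    msg = EmailMessage()
    msg["From"] = sender
    msg[VERSION_HEADER] = version
    msg.set_content("hello")
    return msg


def sort_into_inboxes(messages, accept_v1=True):
    """Split messages into authenticated (v2) and legacy (v1) inboxes."""
    inboxes = {"authenticated": [], "legacy": []}
    for msg in messages:
        # A message with no stamp is treated as unauthenticated SMTP v1 mail.
        if msg.get(VERSION_HEADER, "1").strip() == "2":
            inboxes["authenticated"].append(msg)
        elif accept_v1:
            inboxes["legacy"].append(msg)
        # Otherwise the user has opted out of SMTP v1 and the message is dropped.
    return inboxes


mail = [make_message("alice@example.com", "2"), make_message("bulk@example.net", "1")]
boxes = sort_into_inboxes(mail)
print(len(boxes["authenticated"]), len(boxes["legacy"]))
```

In the proposal itself the opt out would ideally be enforced by the ISP before the message is ever transmitted, so that the bandwidth is never spent; the sketch only models the sorting that happens once mail has arrived.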


Statistical tracking and scoring:
There are currently many services that blacklist entire blocks of IP addresses, but they are having a devastating effect on legitimate email while not having much of an effect on spam. Unfortunately, the only way for current blacklists to ever stop all spam is to block out 0.0.0.0/0, which is technical jargon for the entire address range of the entire Internet and is obviously ludicrous. It is just plain silly to block out IP addresses when you should only be blocking out illegitimate outbound mail servers. What is really needed is more accuracy and granularity, with a numeric scoring system instead of just black or white, reject or accept criteria. The score can be an effective criterion for sorting users' inboxes, and the user can even specify a cutoff for the minimum score. Once SMTP v2 is implemented, legacy SMTP servers will be scored lowest, since their identity cannot be confirmed. SMTP servers with certificates guaranteeing domain name identity are the minimum standard for SMTP v2, and more specific information such as individual name, business name, address, and/or license # would boost the SMTP server to a much higher default score. Anyone can buy a new domain name, but it's a lot harder to acquire new business licenses and physical addresses, and harder still to get a new personal identity. The more information a digital certificate guarantees, the higher its base rank and the higher it is sorted in the user's inbox by default. If mail coming from a certain domain is often abusive, points could be deducted from its base score. The score will ultimately determine whether the message will be read first or read at all by the email recipient. An accurate public database of domains and companies that abuse email can be created as a deterrent to sending unsolicited email and as a way to block those spammers out across the board for anyone who uses that database.
SMTP v2 will greatly reduce collateral damage while also making spammers accountable because for the first time they can be easily identified.
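
One possible shape for such a scoring formula is sketched below. Every number in it is an assumption chosen for illustration: the paper does not specify base scores, penalty sizes, or level names. The structure is what matters: a base score set by how much the certificate guarantees, deductions for recorded abuse, and a user chosen cutoff.

```python
# Hypothetical base scores by how much identity the certificate guarantees.
BASE_SCORES = {
    "legacy": 0,             # SMTP v1, identity cannot be confirmed
    "domain": 50,            # certificate guarantees the domain name only
    "business": 75,          # plus business name, address, and license
    "verified_identity": 90, # plus individual identity
}
ABUSE_PENALTY = 5  # hypothetical points deducted per recorded abuse report


def server_score(cert_level: str, abuse_reports: int) -> int:
    """Base score for the certificate level minus deductions for abuse."""
    return max(0, BASE_SCORES[cert_level] - ABUSE_PENALTY * abuse_reports)


def filter_and_sort(messages, cutoff: int):
    """Drop senders scoring below the cutoff and sort the rest, best first."""
    scored = [(server_score(level, reports), sender)
              for sender, level, reports in messages]
    kept = [item for item in scored if item[0] >= cutoff]
    return sorted(kept, key=lambda item: item[0], reverse=True)


incoming = [
    ("news@example.com", "business", 2),
    ("bulk@example.net", "domain", 8),
    ("old@example.org", "legacy", 0),
]
print(filter_and_sort(incoming, cutoff=10))
```

With these made-up numbers the business certificate with two abuse reports scores 65, the domain certificate with eight reports scores 10, and the legacy server scores 0 and falls below the cutoff.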


Kick starting the solution:
To start the process of fixing the email system, the Internet community would need to ratify the new SMTPSEC extensions and the SMTP v2 protocol, perhaps not exactly as I have proposed them but hopefully meeting the primary design goal of eliminating spoofing. Once such a protocol is ratified, we could see rapid acceptance of the new SMTP v2 protocol as soon as major ISPs such as AOL, MSN/Hotmail, Earthlink, Netzero, and other large ISPs adopt it themselves. Compared to the billions already spent to combat spam, my proposed SMTPSEC extensions would be very cheap to implement because they only require SMTP server side software upgrades along with the purchase of a single SMTP domain certificate that can be used by all of the SMTP relay servers of a single domain. All of the client technologies, such as SSL browsers and SSL enabled POP3/IMAP applications, already exist. Although there are already efforts under way by the major ISPs, any solution must be universal in order to fix email as a whole. As David Berlind of Jamspam.org said on the Charlie Rose panel discussion on SPAM, we must have a unified effort from the entire industry. Email belongs to everyone, and its problems must be solved by everyone. Only when the majority of people are using SMTP v2 authenticated email can we ever hope to see the end of spam.


Final words
I am a freelance writer and the owner of this website, www.LANArchitect.net. I have been working on the frontlines in IT for over a decade, and my purpose for writing this paper is to fix the email system, because I truly feel that I have something to contribute to the community based on my experience and knowledge of network infrastructure. I hope to influence the future of email in a way that saves it from impending doom. I hope as many people as possible will read this and pass it on to as many people as possible. I intend to submit this paper to any business or organization with a vested interest in email infrastructure and hope that they will take the time to consider my proposals. If any website or media organization wishes to reprint or republish the contents of this paper in its entirety, you can contact me and I can grant you permission to freely do so. Questions are also welcome. Linking is always welcome, but do let me know since I like the feedback.

Sincerely,
George Ou