It’s just data

Inbox Failure

Mark Pilgrim: My attempts at compartmentalization have failed. There is only one inbox.

Nearly three decades ago, I had an opportunity to witness the effect of an email sent to a select group being widely forwarded.  I no longer remember the details, but I do remember the decision that I made at that time.  Ironically, it is not a decision that I widely publicize, as I do not condone breaking netiquette.  But it was that decision that led me to open source, open standards, and blogging.

But this post isn’t about that.  It is about a failure that has begun to affect me:

From xxxx@gmail.com  Sun Mar 28 15:25:52 2010
Return-Path: <xxxx@gmail.com>
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on vanadium.sabren.com
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.8 required=5.0 tests=DNS_FROM_OPENWHOIS,
    FH_DATE_PAST_20XX autolearn=disabled version=3.2.5
X-Spam-Report:
    *  2.4 DNS_FROM_OPENWHOIS RBL: Envelope sender listed in bl.open-whois.org.
    *  3.4 FH_DATE_PAST_20XX The date is grossly in the future.

I don’t understand DNS_FROM_OPENWHOIS RBL or why that would affect everything sent from Gmail, but I do recognize FH_DATE_PAST_20XX as bug 6269.  Apparently, when combined (2.4 + 3.4 = 5.8, which is over the required 5.0), this means that I didn’t see anything in the past week or so from a number of sources, including not only Gmail but Yahoo and my publisher.

Update: I need to clarify that this does not affect the apache secretary email address.


Sam, the DNS_FROM_OPENWHOIS is https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6157

And I actually check (and clean) the spam folder in Gmail several times a day, just in case.

Posted by Thomas Broyer at

“Sam, the DNS_FROM_OPENWHOIS is https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6157”

Thanks!

“And I actually check (and clean) the spam folder in Gmail several times a day, just in case.”

I actually have a two-layer system: SpamAssassin is run on the server to eliminate obvious spam, and then Thunderbird’s Bayesian filter is run on the client to process the rest.  I do periodically check my Junk folder, but I had pretty much forgotten about the server checks because, until recently, they “just worked” and I had never found any non-spam in the emails that were filtered out by that process.

Posted by Sam Ruby at

Bug 6269 doesn’t seem to affect me. Is that because I’m so lazy about updating and am still using SpamAssassin 3.1.7-deb (2006-10-05)?

Posted by Lee Phillips at

Apparently, scores can be adjusted in .spamassassin/user_prefs:

score FH_DATE_PAST_20XX 0
score DNS_FROM_OPENWHOIS 0

I’ve also added a few whitelist_from lines as failsafes.

Posted by Sam Ruby at

“I’ve also added a few whitelist_from lines as failsafes.”

Are you using procmail or some other sorting/forwarding step before spamassassin? If so, I think it would be better to put whitelisting, blacklisting, etc. there, to avoid a call to the expensive spamassassin process. (And this, of course, is applied after the filtering done by your SMTP daemon, which avoids calls to the expensive procmail process; a burst of a few thousand spam emails hitting your server more or less all at once can really slow things down if each one gets processed by spamassassin.)

Posted by Lee Phillips at

I can’t imagine optimizing for the case where I get a few thousand emails more or less all at once originating from sites I whitelisted, but sure, it does make sense to skip running spamassassin entirely for sites I’ve whitelisted.  For future reference, here’s the syntax:

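# Discard anything from a blacklisted domain
:0
* ^From:.*@blacklisted\.com
/dev/null

# Deliver whitelisted mail straight to the default mailbox (trailing colon = use a lock file)
:0:
* ^From:.*@whitelisted\.com
${DEFAULT}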

These lines need to appear before the call out to spamc.
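
As a rough sketch of that ordering, assuming spamc is on the PATH and is invoked with the common filter recipe (the domain is a placeholder, and your own spamc recipe may differ):

# Whitelisted senders are delivered first; a delivering recipe ends processing for that message
:0:
* ^From:.*@whitelisted\.com
${DEFAULT}

# Everything else is piped through spamc as a filter (f), waiting for its exit code (w)
:0fw
| spamc

Because the first recipe delivers the message, whitelisted mail never reaches the spamc recipe at all.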

Posted by Sam Ruby at
