Page last updated:
2003 Oct 26 09:16

The following is an archive of a post made to our 'vox mailing list' by one of its subscribers.

[vox] dealing with old unscanned mail mbox and spamassassin...

keywords: spamassassin, mbox, archive, rescan

I don't know about you, but I have been bothered by how much of my
old archived mail (in mbox format) is using up space on my HD. When you keep
well over 10 years of e-mail, a lot of that space ends up wasted on spam.
I recently decided to play around with processing my old mbox files
through spamassassin to weed out the spam.

The solution is not the "obvious" cat-ing of your old mbox through
spamassassin. Spamassassin appears to expect messages to come through one
at a time, so catting the whole mbox does not seem to have the desired
result. Also, for large mboxes, you can expect a lot of memory to be used
during the process. (I was using well over 800MB of RAM on some of my
smaller mbox files while testing this method.)
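
It helps to remember that an mbox is just many messages concatenated, each one starting with a "From " separator line. A minimal sketch for checking how many messages a file holds before you start (count_messages is a made-up helper name, not part of any package, and it assumes body lines starting with "From " were escaped as ">From " when the mbox was written):

```shell
# count_messages: hypothetical helper, not part of any package.
# Counts the "From " separator lines that start each message in an mbox.
# Assumes body lines beginning with "From " are escaped as ">From ".
count_messages() {
    grep -c '^From ' "$1"
}
```

Run it as `count_messages INPUTFILE`; the number printed should match what your mailer reports for that folder.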

Here are some useful methods that work for me:
(reformail is part of many courier-imap packages)
(grepmail is part of another package)
(mboxgrep is a package by itself)

$ reformail -s spamc < INPUTFILE > OUTPUTFILE

(I suppose if you do not use spamd/spamc but instead call spamassassin
once per message, you could do this:)

$ reformail -s spamassassin < INPUTFILE > OUTPUTFILE
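
If you want a safety net around either command, something like the following might do. It is only a sketch: rescan_mbox is a hypothetical wrapper name, and the check assumes the filter leaves the "From " separator lines intact, which is worth verifying on a small test mbox first.

```shell
# rescan_mbox: hypothetical wrapper around the reformail command above.
# Usage: rescan_mbox INPUTFILE OUTPUTFILE [FILTER]   (FILTER defaults to spamc)
rescan_mbox() {
    in=$1 out=$2 filter=${3:-spamc}
    # reformail -s splits the mbox and runs the filter once per message
    reformail -s "$filter" < "$in" > "$out" || return 1
    # Sanity check (assumption: scoring never adds or drops "From " lines)
    before=$(grep -c '^From ' "$in")
    after=$(grep -c '^From ' "$out")
    if [ "$before" -ne "$after" ]; then
        echo "message count changed: $before -> $after" >&2
        return 1
    fi
    echo "rescanned $after messages"
}
```

For example, `rescan_mbox INPUTFILE OUTPUTFILE spamassassin` runs the second variant. The original file is never modified, so a failed check costs you nothing but time.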

What you end up with is another mbox in which every message has been
run through spamassassin for scoring.

You can then use your favorite mailer to filter based on header
information, or use procmail, or maybe check out "mboxgrep" and use
the -H flag to search through the mbox, pull out only the messages that are
not marked with "^X-Spam-Status: Yes", and dump those into a new mbox file
that is clean.
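
If none of those tools is handy, the same filtering can be approximated with plain awk. This is a stand-in for the mboxgrep approach, not a description of it; keep_ham is a made-up name, and the sketch assumes a well-formed mbox where only headers (not bodies) should be checked for the spam marker.

```shell
# keep_ham: hypothetical awk-based filter, a stand-in for mboxgrep.
# Prints only messages whose headers have no line starting with
# "X-Spam-Status: Yes". Buffers one message at a time.
keep_ham() {
    awk '
        function emit() { if (msg != "" && !spam) printf "%s", msg; msg = ""; spam = 0 }
        /^From / { emit(); inhdr = 1 }                  # new message starts
        inhdr && /^$/ { inhdr = 0 }                     # blank line ends the headers
        inhdr && /^X-Spam-Status: Yes/ { spam = 1 }     # marked as spam
        { msg = msg $0 "\n" }                           # buffer the current line
        END { emit() }
    ' "$1"
}
```

Usage would be `keep_ham OUTPUTFILE > CLEANFILE`, where CLEANFILE is whatever name you pick for the cleaned mbox.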

Why do I keep these old messages in mbox format?
Since I seldom even look at them, I can compress the mbox file, and it
compresses much better than the many separate files in a maildir.
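
A quick way to see what compression buys you on a given file, sketched with gzip (compress_mbox is a hypothetical helper name; any compressor would do, and the original file is kept):

```shell
# compress_mbox: hypothetical helper.
# Writes FILE.gz at maximum gzip compression, keeps FILE, prints both sizes.
compress_mbox() {
    gzip -9 -c "$1" > "$1.gz" || return 1
    echo "original:   $(wc -c < "$1") bytes"
    echo "compressed: $(wc -c < "$1.gz") bytes"
}
```

Once you are happy with the result, delete the uncompressed copy yourself; `gzip -dc FILE.gz` gives the mbox back whenever you need to read it.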

Anyway, I found the above useful in weeding spam from my old mboxes and
will probably be able to reclaim 800MB-1000MB of space by doing this. :-D

So, why else do this? If you are on an older, slower machine, you could
copy your mail out to a faster and better machine, process it, and then
transfer it back. (Woohoo!)
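
One way to avoid the separate copy-out and copy-back steps is to stream the mbox over ssh. This is only a sketch under assumptions: rescan_remote is a made-up name, and it presumes reformail and a working spamd/spamc setup are installed on the faster host.

```shell
# rescan_remote: hypothetical helper.
# Streams an mbox to a faster host over ssh, scores it there, and writes
# the scored mbox locally. Assumes reformail and spamc exist on that host.
rescan_remote() {
    host=$1 in=$2 out=$3
    ssh "$host" 'reformail -s spamc' < "$in" > "$out"
}
```

Usage would look like `rescan_remote fasthost INPUTFILE OUTPUTFILE`, with fasthost standing in for your own machine's name.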

Maybe your gateway and/or mail server is weak, but your desktop machine is
more powerful. (ding, ding ding!)

I'm sure you can come up with other reasons.

HTH someone out there,


Hosting provided by:
Sunset Systems
Sunset Systems offers preconfigured Linux systems, remote system administration and custom software development.

LUGOD: Linux Users' Group of Davis
PO Box 2082, Davis, CA 95617
Contact Us

LUGOD is a 501(c)7 non-profit organization
based in Davis, California
and serving the Sacramento area.
"Linux" is a trademark of Linus Torvalds.
