Re: [vox-tech] bogofilter newbie
--On Tuesday, September 23, 2003 07:26:08 -0700 p@dirac.org wrote:
> 1. update bogofilter's wordlists with every incoming message, using the
> -u option. if i understand it, -u will first classify the spam, then
> update bogofilter's wordlist. that seems like asking for trouble.
> if you filter to /dev/null based on bogofilter's output, how do you
> correct mistakes? and it seems like mistakes here will cause more
> mistakes in the future.
> i assume you do this with:
> :0fw
> | bogofilter -f -p -u -l -e -v
> also, shouldn't there be a "c" in the procmail colon line? how does
> mail get past this recipe? isn't it considered "delivered" when an
> email matches a recipe unless you use ":0c"?
A procmail recipe tagged with "f" is a filtering recipe. Procmail pipes
the message through the specified program, then continues on using the
filtered version of the message. It's not a delivering recipe, so "c"
isn't needed.
I seeded bogofilter just like you did. I use maildirs for my email, so
every message is in a separate file. I built a list of every message
less than a year old, divided the list into spam and non-spam, and piped
each set into bogofilter.
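For anyone who wants to script the seeding step, here is a minimal sketch. It assumes the messages have already been sorted into two plain directories, one file per message; the directory names in the usage comment and the train_dir helper are hypothetical, not what I actually ran. The -s and -n options register a message as spam or non-spam respectively.

```shell
#!/bin/sh
# Sketch only: seed bogofilter from two directories of one-message-per-file mail.
# BOGOFILTER can be overridden, e.g. to point at a full path.
BOGOFILTER="${BOGOFILTER:-bogofilter}"

# train_dir FLAG DIR
#   Feed every regular file in DIR to bogofilter with FLAG
#   (-s registers as spam, -n registers as non-spam).
#   Prints the number of messages fed.
train_dir() {
    flag=$1
    dir=$2
    count=0
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue
        status=0
        "$BOGOFILTER" "$flag" < "$msg" || status=$?
        # bogofilter documents exit status 3 as an I/O or other error
        if [ "$status" -eq 3 ]; then
            echo "bogofilter failed on $msg" >&2
            return 3
        fi
        count=$((count + 1))
    done
    echo "$count"
}

# Usage (hypothetical paths):
#   train_dir -s "$HOME/sorted/spam"
#   train_dir -n "$HOME/sorted/nonspam"
```

Feeding one message per invocation is slow for a large archive, but it works the same way regardless of mailbox format.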
Incoming mail is piped through this set of rules:
:0 fw
| /usr/bin/bogofilter -u -2 -p -e
# Spam? Save it in the spam folder
:0
* ^X-Bogosity: (yes|spam)
$SPAM
It's a good idea to collect your spam rather than deleting it. You might
want to delete your wordlist one day and build a new one; you'll need a
collection of current spam to do that. More important, any time
bogofilter makes a mistake you need to correct it, whether it was a false
positive or false negative. I can't remember the last time I found
non-spam in my spam folder, but it does happen from time to time.
You'll need to find a method of feeding mail back into bogofilter that
works for you. I copy the mail into a special mailbox that's swept by a
cron job several times per day. These messages are fed back into procmail
using a special set of rules:
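# Messages labelled spam. Tell bogofilter it's not, and save to INBOX
:0
* ^X-Bogosity: (Spam|Yes)
{
    # Unregister as spam, register as non-spam
    :0 c
    | /usr/bin/bogofilter -Sn
    :0
    $DEFAULT
}
# Messages not labelled spam. Tell bogofilter it is, and save to the spam folder
:0 E
{
    # Unregister as non-spam, register as spam
    :0 c
    * ^X-Bogosity: (ham|no)
    | /usr/bin/bogofilter -Ns
    :0
    $SPAM
}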
Note that I'm not using bogofilter as a filter this time. Without -p
(passthrough mode) it won't output a new copy of the message with the
corrected spam header.
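The cron sweep of the correction mailbox could look roughly like the sketch below. The mailbox location, the rcfile name, and the sweep_retrain helper are all hypothetical; the only procmail feature relied on is -m, which runs procmail as a general mail filter against the rcfile named on the command line. It assumes the rcfile holds the correction rules above and sets DEFAULT and SPAM itself.

```shell
#!/bin/sh
# Sketch only: feed each message in a "corrections" maildir back through
# procmail using a dedicated rcfile. All paths are assumptions.
RETRAIN_DIR="${RETRAIN_DIR:-$HOME/Maildir/.retrain}"
RETRAIN_RC="${RETRAIN_RC:-$HOME/.procmailrc.retrain}"
PROCMAIL="${PROCMAIL:-procmail}"

# sweep_retrain
#   Pipe every message in the maildir's cur/ and new/ directories to
#   "procmail -m RCFILE" and remove the original once procmail succeeds.
sweep_retrain() {
    for msg in "$RETRAIN_DIR"/cur/* "$RETRAIN_DIR"/new/*; do
        [ -f "$msg" ] || continue
        if "$PROCMAIL" -m "$RETRAIN_RC" < "$msg"; then
            rm -f -- "$msg"
        else
            echo "procmail failed on $msg, leaving it in place" >&2
        fi
    done
}

# Usage: call sweep_retrain at the end of this script and run the script
# from cron several times per day.
```

Leaving a message in place when procmail fails means the next sweep retries it instead of losing the correction.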
--
"We actually do 100,000 pages or more a day in Bork"
-- Marissa Mayer, Google
Kenneth Herron Kherron@newsguy.com 916-366-7338
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech