autolearn=yes but sa-learn dump magic shows no new spam

Brett Millett

2008-08-01 21:28:47 UTC

Hi,

I've been googling quite a bit today to explain something I'm seeing
on my mail server, but I can't find a definitive answer. My mail logs
show a number of autolearn=spam entries, yet when I run
"sa-learn --dump magic", nspam does not increment. If I run sa-learn
manually, nspam does increment. Is this normal, or should each
autolearn=spam increment nspam by one?

Here is my local.cf file as it pertains to bayes and autolearn:

use_bayes 1
bayes_auto_learn 1
bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 5.0
use_auto_whitelist 1
bayes_use_hapaxes 1
bayes_min_ham_num 150
bayes_min_spam_num 150
score BAYES_00 -3
score BAYES_05 -1
score BAYES_95 6
score BAYES_99 9
score BAYES_20 -0.8
score BAYES_40 0
score BAYES_50 1.567
score BAYES_60 3.515
score BAYES_80 3.608

Thanks,

Brett

Karsten Bräckelmann

2008-08-01 21:58:47 UTC

Post by Brett Millett
Hi,
I've been googling quite a bit today to find the answer to what I'm
seeing that is happening on my mail server. However, I just can't seem
to find a definitive answer. When looking at my mail logs I see a number
of autolearn=spam, however when I run "sa-learn --dump magic" nspam does
not increment. If I run sa-learn manually, nspam increments. Is this
normal or should each autolearn=spam indicate that nspam should
increment by one.

Could it be that the site-wide spamassassin user, or the user on whose
behalf spamassassin was called, is not the user you are running
sa-learn --dump magic as?

Just a guess. :)
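
One way to test this guess is to run the dump as the account that does the learning. The following is a minimal sketch, not a definitive recipe: the function name `dump_magic_as` is hypothetical, the account name in the usage comment is a placeholder, and it assumes sudo is available and that the Bayes files live in that account's default per-user location.

```shell
# dump_magic_as USER: print the Bayes counters as seen by USER.
# -H asks sudo to set HOME to USER's home directory, so sa-learn looks
# in that account's ~/.spamassassin rather than the caller's.
dump_magic_as() {
    sudo -H -u "$1" sa-learn --dump magic
}

# Example usage (placeholder account name):
#   dump_magic_as spamassassin
```

If nspam differs between this output and the output seen as root, the two runs are most likely reading different Bayes databases.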

guenther

--
char *t="\10pse\0r\0dtu\***@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Brett Millett

2008-08-01 23:48:18 UTC

Great guess! I was running it as root before (via sudo). Here are the results when I run the command as the site-wide user.

sa-learn --dump magic

0.000 0 3 0 non-token data: bayes db version
0.000 0 329 0 non-token data: nspam
0.000 0 42903 0 non-token data: nham
0.000 0 158973 0 non-token data: ntokens
0.000 0 1204946608 0 non-token data: oldest atime
0.000 0 1205337646 0 non-token data: newest atime
0.000 0 1217631298 0 non-token data: last journal sync atime
0.000 0 1205335826 0 non-token data: last expiry atime
0.000 0 417421 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
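
To compare counters before and after a message is autolearned, it can help to extract a single value from this dump. This is a minimal sketch that assumes the layout shown above (value in the third column, label after "non-token data:"); the function name `magic_value` is hypothetical.

```shell
# magic_value NAME: read `sa-learn --dump magic` output on stdin and
# print the value (third column) of the non-token entry labelled NAME.
magic_value() {
    awk -v name="$1" '
        # Lines look like: 0.000 0 329 0 non-token data: nspam
        $5 == "non-token" && $6 == "data:" {
            # Rebuild the label, which may contain spaces.
            label = $7
            for (i = 8; i <= NF; i++) label = label " " $i
            if (label == name) { print $3; exit }
        }'
}

# Example usage, run as the user whose database you want to inspect:
#   sa-learn --dump magic | magic_value nspam
```

Running it once before and once after a message is logged with autolearn=spam shows whether that particular database is the one being updated.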

Thanks!

Also, I'm just using the flat files (non-sql.)

I'm still confused though... running "ps aux | grep spam" reveals

/usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir -x --virtual-config-dir=/etc/mail/spamassassin --username spamassassin -d --pidfile=/var/run/spamd.pid

Of course, my manual sa-learn runs were altering the bayes files under /etc/mail/spamassassin, whereas the bayes files spamd was writing to were under /home/spamassassin/.spamassassin. So I changed the home directory for spamassassin to /etc/spamassassin. That created a hidden directory .spamassassin under /etc/spamassassin, and running "sa-learn --dump magic" gave errors (because spamd was still writing to the bayes files one directory up). So I made symlinks under .spamassassin pointing to the same bayes files one directory up, and everything seems to be working.
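
A quick way to confirm that such symlinks really resolve to the files in the other directory is the shell's `-ef` file test. This is a minimal sketch under stated assumptions: the function name `check_bayes_links` is hypothetical, and `bayes_toks` and `bayes_seen` are assumed to be the database file names (the usual ones for the flat-file store); adjust both to match what is actually on disk.

```shell
# check_bayes_links LINK_DIR REAL_DIR
# For each Bayes file, report whether LINK_DIR/<file> and REAL_DIR/<file>
# resolve to the same file on disk. Returns non-zero if any pair differs.
check_bayes_links() {
    link_dir=$1
    real_dir=$2
    rc=0
    # Assumed file names; change them if your store uses others.
    for f in bayes_toks bayes_seen; do
        if [ "$link_dir/$f" -ef "$real_dir/$f" ]; then
            echo "ok: $f"
        else
            echo "MISMATCH: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage, with the directories mentioned in this post:
#   check_bayes_links /etc/spamassassin/.spamassassin /etc/spamassassin
```

A mismatch means sa-learn and spamd could still be working on separate copies of the database.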

My question is: is that the best way to do it? Did I miss something?

Thanks for all who have helped.

Brett

Duane Hill

2008-08-01 22:26:32 UTC

Post by Brett Millett
I've been googling quite a bit today to find the answer to what I'm
seeing that is happening on my mail server. However, I just can't seem
to find a definitive answer. When looking at my mail logs I see a number
of autolearn=spam, however when I run "sa-learn --dump magic" nspam does
not increment. If I run sa-learn manually, nspam increments. Is this
normal or should each autolearn=spam indicate that nspam should
increment by one.

I can only speculate that this has something to do with the user
autolearn is running as. I'm going to assume you are using MySQL, since
you did not say. Have you tried going into MySQL and running:

select count(*) from bayes_vars;

to see how many usernames are in the table?

Perhaps you can post the startup parameters spamd is using. Also how spamc
is being called (if that is the case). That may shed more light on why you
are getting the results you are.

Post by Brett Millett
use_bayes 1
bayes_auto_learn 1
bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status

You do not need to include the X-Spam-* header fields as they are
stripped before learning.

Post by Brett Millett
bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 5.0
use_auto_whitelist 1
bayes_use_hapaxes 1
bayes_min_ham_num 150
bayes_min_spam_num 150
score BAYES_00 -3
score BAYES_05 -1
score BAYES_95 6
score BAYES_99 9
score BAYES_20 -0.8
score BAYES_40 0
score BAYES_50 1.567
score BAYES_60 3.515
score BAYES_80 3.608

-d

Brett Millett

2008-08-01 23:53:19 UTC

As you can see from the response I just posted, I'm not using MySQL for
bayes (although maybe I should be; that seems very convenient).

Post by Duane Hill
You do not need to include the X-Spam-* header fields as they are
stripped before learning.

Thanks. I'll pull those out.
