DSpam report

Michael Torrie torriem at chem.byu.edu
Fri Aug 25 21:32:42 MDT 2006

On Fri, 2006-08-25 at 21:25 -0600, Ryan Simpkins wrote:
> On Fri, August 25, 2006 20:40, Michael Torrie wrote:
> > Now the not so good.  On my account, I get somewhere around 100 legit
> > e-mails a day and only 5-20 spam messages (I used to get up to 50 spam a
> > day until I started using grey-listing to kill 80% of the spam right at
> > the connection).  After three days of training, the spam catch rate is
> > still only 60%.  I hope it will eventually improve.  The wide variety of
> > ham I receive makes the learning process take a lot longer (the "good"
> > characteristics far outnumber the "bad").
> This reflects my experience as well. I received a low volume of spam, so it took a
> LONG time to train the filter (weeks). After 1000 messages and still only ~60%
> accuracy I got frustrated and set the filter sensitivity to 'most sensitive.' This
> was the single best thing I did. I have yet to see a single false positive at the
> maximum sensitivity. Maybe it's just not sensitive enough. Now after about 4000
> messages (still at max sensitivity) I get about a 96% filter rate.

I started out really conservatively.  I'm boosting the sensitivity and
we'll see what happens over the weekend.
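
For a rough sense of what those percentages mean per day, here is a
back-of-the-envelope sketch in Python.  The function names are made up for
illustration, and applying Ryan's ~96% figure to my spam volume is purely an
assumption (his number comes from his own mailbox):

```python
def spam_after_greylisting(spam_per_day, greylist_kill_rate):
    """Spam that survives greylisting and actually reaches the filter."""
    return spam_per_day * (1 - greylist_kill_rate)


def missed_spam(spam_reaching_filter, catch_rate):
    """Spam that slips past the filter and lands in the inbox."""
    return spam_reaching_filter * (1 - catch_rate)


# Figures quoted in this thread: up to 50 spam/day before greylisting,
# with greylisting killing about 80% at connection time.
reaching = spam_after_greylisting(50, 0.80)

# Catch rate after three days of training (60%), and the rate Ryan
# reports at maximum sensitivity (96%, assumed here to carry over).
missed_now = missed_spam(reaching, 0.60)
missed_tuned = missed_spam(reaching, 0.96)

print(f"reaching filter: {reaching:.1f}/day")
print(f"missed at 60%:   {missed_now:.1f}/day")
print(f"missed at 96%:   {missed_tuned:.1f}/day")
```

So on a worst-case day the difference is a handful of missed spam versus
well under one, assuming the higher rate holds for my mail mix.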

> The nicest thing about dspam in my opinion is the ultra easy web interface to view
> quarantined messages, statistics, graphs, and message history. (Michael, did you set
> your user to be the admin user so you can view global stats?) There are probably
> better solutions out there now that both have an informative web interface, and
> catch more spam. Perhaps someone can post alternatives.

The web interface is nice, but I'm not crazy about how some things need
disk storage.  I'd rather have everything in SQL. Then I could put the
dspam webui on a different machine than the mail server.  I don't really
want to run a web server on my mail server.  With the preferences SQL
extension on, most things are in SQL, including user preferences.  But
the dspam data directory for quarantine is still a problem.


> My only beef remains that at max sensitivity I have zero false positives. I'd love
> to kick it up a few more notches to see if the accuracy goes up.
> -Ryan
