[Dspam-user] What do these dspam_train results mean

Alan Chandler
I have just set up a fresh dspam database and want to train it.  I have
maildir folders and decided to try this:

dspam_train [hidden email] --client .Junk/ .Debian/

where .Junk is the maildir holding all the (Thunderbird-classified) junk
mail and .Debian is sorted mail from the Debian mailing list.

However, I don't really understand the results; it appears something is
wrong:

Taking Snapshot...
[hidden email]  TP:     0 TN:     4 FP:     0 FN:     1
SC:     0 NC:     0
Training .Debian/ / .Junk/ corpora...
[test: nonspam] .Debian//maildirfolder           result: BROKEN result!!
[test: spam   ] .Junk//maildirfolder             result: BROKEN result!!
[test: nonspam] .Debian//dovecot.index.log       result: PASS
[test: spam   ] .Junk//dovecot.index.log         result: FAIL (Innocent)
[test: nonspam] .Debian//cur                     result: Can't call
method "as_string" without a package or object reference at
/usr/bin/dspam_train line 185.

What do these various messages mean?



--
Alan Chandler
http://www.chandlerfamily.org.uk


_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user

Re: [Dspam-user] What do these dspam_train results mean

Tom Hendrikx
On 13-03-14 08:55, Alan Chandler wrote:
> I have just setup a fresh dspam database and want to train it.  I
> have maildir folders and decided to try this
>
> dspam_train [hidden email] --client .Junk/ .Debian/
>
> where .Junk is the maildir holding all the (thunderbird classified)
> junk mail and .Debian is sorted mail from the debian mailing list

If it's actually maildir format, then you should probably be using
'.Junk/cur' for spam input and '.Debian/cur' for ham. In your output
below you'll see that it tries to handle every entry directly below the
path you pass in, files and directories alike.

In the man page, "maildir formatted" means one message per input file.
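
As a rough sketch of the advice above, a small helper can turn a Maildir
path into its cur/ subdirectory before anything is handed to the training
tool. The function name maildir_cur and the DSPAM_USER variable are
made up for illustration (DSPAM_USER stands in for the address hidden in
this thread); the argument order simply copies the command quoted above.
The demo only prints the command; it does not run dspam_train.

```shell
#!/bin/sh
# maildir_cur MAILDIR: print the path of MAILDIR's cur/ subdirectory,
# or return non-zero if MAILDIR has no cur/ (so is probably not a Maildir).
maildir_cur() {
  dir=${1%/}                       # drop one trailing slash, if any
  if [ -d "$dir/cur" ]; then
    printf '%s\n' "$dir/cur"
  else
    printf 'not a Maildir (no cur/): %s\n' "$1" >&2
    return 1
  fi
}

# Demo on a throwaway layout; nothing here touches real mail.
demo=$(mktemp -d)
mkdir -p "$demo/.Junk/cur" "$demo/.Debian/cur"
spam=$(maildir_cur "$demo/.Junk/") &&
  ham=$(maildir_cur "$demo/.Debian/") &&
  echo "dspam_train \"\$DSPAM_USER\" --client $spam $ham"
rm -rf "$demo"
```

Failing early when cur/ is missing avoids feeding the tool a directory
that holds something other than one message per file.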

>
> however I don't really understand the results - it appears
> something is wrong
>
> Taking Snapshot...
> [hidden email]  TP:     0 TN:     4 FP:     0 FN:     1
> SC:     0 NC:     0
> Training .Debian/ / .Junk/ corpora...
> [test: nonspam] .Debian//maildirfolder           result: BROKEN result!!
> [test: spam   ] .Junk//maildirfolder             result: BROKEN result!!

These files don't return a valid classification.

> [test: nonspam] .Debian//dovecot.index.log       result: PASS
> [test: spam   ] .Junk//dovecot.index.log         result: FAIL (Innocent)

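Training worked as intended for these too: the first was classified
correctly, and the second was misclassified as innocent, so it will be
retrained as spam. But these are Dovecot index files, not mail, so
there is nothing useful to learn from them...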
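
One way to read the result column is sketched below. This is only an
illustration of the logic described in this reply, not the code in
/usr/bin/dspam_train. The function name judge is made up, "Innocent" is
the verdict shown in the log, and "Spam" as the name of the opposite
verdict is an assumption.

```shell
#!/bin/sh
# judge EXPECTED VERDICT: print the result column for one training file.
#   EXPECTED is "spam" or "nonspam" (the label in the "[test: ...]" column).
#   VERDICT is what the classifier answered; anything other than
#   "Spam" (assumed name) or "Innocent" counts as no valid classification.
judge() {
  case "$2" in
    Spam)     got=spam ;;
    Innocent) got=nonspam ;;
    *)        echo "BROKEN result!!"; return 0 ;;
  esac
  if [ "$got" = "$1" ]; then
    echo "PASS"
  else
    echo "FAIL ($2)"    # a miss like this is what gets retrained
  fi
}

judge nonspam Innocent   # prints: PASS
judge spam Innocent      # prints: FAIL (Innocent)
judge spam ""            # prints: BROKEN result!!
```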

> [test: nonspam] .Debian//cur                     result: Can't call
>  method "as_string" without a package or object reference at
> /usr/bin/dspam_train line 185.

The training tool goes nuts when it tries to use a directory as if it's a
file ;)
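
A quick check before training can show which entries would trip it up.
The sketch below lists everything directly under a path that is not a
regular file; the function name not_plain_files is made up for
illustration, and the demo runs against a throwaway directory.

```shell
#!/bin/sh
# not_plain_files DIR: print every entry directly below DIR that is not a
# regular file, i.e. anything that cannot be read as a single message.
not_plain_files() {
  for entry in "${1%/}"/*; do
    [ -e "$entry" ] || continue       # unmatched glob: DIR is empty
    [ -f "$entry" ] || printf '%s\n' "$entry"
  done
}

# Demo: a Maildir root holds cur/, new/ and tmp/ next to marker files.
demo=$(mktemp -d)
mkdir "$demo/cur" "$demo/new" "$demo/tmp"
: > "$demo/maildirfolder"
not_plain_files "$demo"    # lists cur, new and tmp, but not maildirfolder
rm -rf "$demo"
```

An empty listing means every entry is at least a plain file; it says
nothing about whether those files are actually mail messages.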


Tom


Re: [Dspam-user] What do these dspam_train results mean

Alan Chandler
On 13/03/14 20:56, Tom Hendrikx wrote:

> If it's actually maildir format, then you should probably be using
> '.Junk/cur' for spam input, and '.Debian/cur' for ham. In your output
> below you'll see that it tries to handle all files directly below the
> path you pass in.
>
> Maildir formatted in the man page means: 1 message per input file.
>
It is Maildir formatted, so your suggestion worked a treat.

Thanks


--
Alan Chandler
http://www.chandlerfamily.org.uk

