[Dspam-user] Upgrading 3.6.8 to 3.10.2

Jason J. W. Williams
Hi,

Has anyone attempted an upgrade of 3.6.8 to 3.10.2? We've got a
sizable token database. Our configuration is:

Algorithm: graham burton
P-Value: graham

MySQL backend. Ubuntu 10.04


Just curious if there are any gotchas to watch out for. Thank you in advance.

-J

------------------------------------------------------------------------------
Slashdot TV.  
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user

Re: [Dspam-user] Upgrading 3.6.8 to 3.10.2

ktm@rice.edu
On Thu, Aug 21, 2014 at 12:27:17PM -0700, Jason J. W. Williams wrote:

> Hi,
>
> Has anyone attempted an upgrade of 3.6.8 to 3.10.2? We've got a
> sizable token database. Our configuration is:
>
> Algorithm: graham burton
> P-Value: graham
>
> MySQL backend. Ubuntu 10.04
>
>
> Just curious if there are any gotchas to watch out for. Thank you in advance.
>
> -J
>

Hi Jason,

One of the biggest improvements is the ability to use OSB (orthogonal
sparse bigram) tokens, which are considerably more accurate, along with
the new tokenizer's support for HTML email. You really ought to consider
some sort of parallel training process to bring the new tokenizer/OSB
online rather than just using the old data as is.
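
For readers unfamiliar with OSB, here is a minimal sketch of the general idea: each word is paired with each of the few words that follow it, and the gap between the two is marked with placeholders. The function name `osb_tokens`, the window size of 5, and the `<skip>` marker are assumptions made for illustration. This is not DSPAM's actual tokenizer code or its on-disk token format.

```python
def osb_tokens(words, window=5, skip="<skip>"):
    """Generate orthogonal sparse bigrams from a list of words.

    Each word is paired with each of the next (window - 1) words.
    Skipped positions between the pair are filled with a placeholder,
    so "a b" and "a <skip> b" remain distinct features.
    """
    features = []
    for i, first in enumerate(words):
        for distance in range(1, window):
            j = i + distance
            if j >= len(words):
                break  # ran past the end of the message
            gap = [skip] * (distance - 1)
            features.append(" ".join([first, *gap, words[j]]))
    return features


if __name__ == "__main__":
    # Hypothetical sample text, for illustration only.
    for feature in osb_tokens("buy cheap pills now".split()):
        print(feature)
```

Because these features are word pairs rather than single words, a database trained with a different tokenizer will not contain them, which is one reason to train the new tokenizer in parallel instead of reusing old data directly.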

Regards,
Ken


Re: [Dspam-user] Upgrading 3.6.8 to 3.10.2

Jason J. W. Williams
Hi Ken,

That's not a bad idea, but for the time being we just want to upgrade
to fix the period escaping issue. Can 3.10.2 use the existing database
without any changes in accuracy or classification?

-J

On Thu, Aug 21, 2014 at 12:59 PM, [hidden email] <[hidden email]> wrote:

> On Thu, Aug 21, 2014 at 12:27:17PM -0700, Jason J. W. Williams wrote:
>> Hi,
>>
>> Has anyone attempted an upgrade of 3.6.8 to 3.10.2? We've got a
>> sizable token database. Our configuration is:
>>
>> Algorithm: graham burton
>> P-Value: graham
>>
>> MySQL backend. Ubuntu 10.04
>>
>>
>> Just curious if there are any gotchas to watch out for. Thank you in advance.
>>
>> -J
>>
>
> Hi Jason,
>
> One of the biggest improvements is the ability to use the OSB tokens and
> their greatly increased accuracy as well as that due to the new tokenizer
> support for HTML Email. You really ought to consider using some sort of
> parallel training process to bring the new tokenizer/OSB online and not
> just use the old data as is.
>
> Regards,
> Ken


Re: [Dspam-user] Upgrading 3.6.8 to 3.10.2

ktm@rice.edu
Hi Jason,

I think that tokens for non-HTML messages are the same, so you would see
minimal effects on accuracy there. Since HTML is processed differently,
your accuracy could change depending on which tokens are currently
being used to categorize a message. You could run a comparison
by using the same tokens with the new version over already processed
messages and see how the results compare. I would want to validate that
before going live.
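
One way to summarize such a comparison is sketched below. It assumes you have already collected one verdict per message from each version (for example by classifying the same saved messages against a copy of the token database) and stored them as dictionaries keyed by a message identifier. The function name `compare_results`, the verdict strings, and the sample data are hypothetical; how the verdicts are gathered is left out.

```python
from collections import Counter


def compare_results(old, new):
    """Compare per-message verdicts from two classification runs.

    old and new map a message identifier to a verdict string.
    Only messages present in both runs are compared.
    Returns (agreement_rate, changes), where changes counts each
    (old_verdict, new_verdict) pair that differs.
    """
    shared = set(old) & set(new)
    if not shared:
        raise ValueError("no messages in common between the two runs")
    changes = Counter()
    for msg_id in shared:
        if old[msg_id] != new[msg_id]:
            changes[(old[msg_id], new[msg_id])] += 1
    agreed = len(shared) - sum(changes.values())
    return agreed / len(shared), changes


if __name__ == "__main__":
    # Hypothetical verdicts, for illustration only.
    old_run = {"m1": "Spam", "m2": "Innocent", "m3": "Spam", "m4": "Innocent"}
    new_run = {"m1": "Spam", "m2": "Spam", "m3": "Spam", "m4": "Innocent"}
    rate, changes = compare_results(old_run, new_run)
    print(f"agreement: {rate:.0%}")
    for (before, after), count in changes.items():
        print(f"{before} -> {after}: {count}")
```

Breaking the disagreements down by direction matters here, since a message that flips from Innocent to Spam is usually more costly than the reverse.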

Regards,
Ken

On Thu, Aug 21, 2014 at 01:09:25PM -0700, Jason J. W. Williams wrote:
> Hi Ken,
>
> That's not a bad idea, but for the time being we just want to upgrade
> to fix the period escaping issue. Can 3.10.2 use the existing database
> without any changes in accuracy or classification?
>
> -J
>


Re: [Dspam-user] Upgrading 3.6.8 to 3.10.2

Jason J. W. Williams
Hi Ken,

That pretty much tells me what I need to know. Thank you!

-J

On Thu, Aug 21, 2014 at 1:33 PM, [hidden email] <[hidden email]> wrote:

> Hi Jason,
>
> I think that tokens for non-HTML are the same so you would have
> minimal effects on accuracy. Since HTML is processed differently,
> your accuracy could change based on what tokens are currently
> being used to categorize a message. You could run a comparison
> by using the same tokens with the new version over already processed
> messages and see how they compare. I would want to validate that
> before going live.
>
> Regards,
> Ken
>
> On Thu, Aug 21, 2014 at 01:09:25PM -0700, Jason J. W. Williams wrote:
>> Hi Ken,
>>
>> That's not a bad idea, but for the time being we just want to upgrade
>> to fix the period escaping issue. Can 3.10.2 use the existing database
>> without any changes in accuracy or classification?
>>
>> -J
>>
