[Dspam-user] DSPAM Mysql low performances and locks

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Dspam-user] DSPAM Mysql low performances and locks

Marco
Hello,
  I have a slow performance on my dspam-mysql conf. The purge nightly  
script take 7-12 hours to complete. During this time I see a lot of  
locks

[07/02/2014 09:28:35] 32082: Lock wait timeout exceeded; try  
restarting transaction: UPDATE dspam_token_data SET  
last_hit=CURRENT_DATE(),innocent_hits=innocent_hits+1 WHERE uid=715  
AND token IN  
('5719708924817070673','15748594722301702351','12523852096982625759','15747193994780569807','7072984635321818198','8566666067201058025','3214617105374076928','667817407838676578','2746530540300936818','7855920995652397412','5465147151345906752','5252602293815960809','12797608577848435519','7966374547679573945','12661687167322745101','1476367451104814719','15905813578389094591','7592741750694013197','15748943523967104959',

During these locks DSPAM fails to analize the mails:

2014-07-02T09:28:14.901941+02:00 02ucas amavis[32751]: (32751-06)  
spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-1.275  
tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7,  
RCVD_IN_MSPIKE_H2=-0.676, SPF_PASS=-0.001] autolearn=disabled
2014-07-02T09:28:30.486044+02:00 01ucas amavis[21712]: (21712-18)  
spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-4.93  
tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_DNSWL_HI=-5,  
RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001,  
T_RP_MATCHES_RCVD=-0.01] autolearn=disabled
2014-07-02T09:28:39.428388+02:00 01ucas amavis[28266]: (28266-03)  
spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-0.617  
tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
FREEMAIL_FORGED_FROMDOMAIN=0.001, FREEMAIL_FROM=0.001,  
HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_MESSAGE=0.001,  
RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01,  
RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled
2014-07-02T09:28:44.233983+02:00 03ucas amavis[25593]: (25593-09)  
spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=2.8  
tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_IMAGE_ONLY_28=0.726,  
HTML_IMAGE_RATIO_02=0.805, HTML_MESSAGE=0.001, MIME_HTML_ONLY=1.105,  
RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=-1,  
RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,  
T_RP_MATCHES_RCVD=-0.01, URIBL_GREY=1.084] autolearn=disabled
2014-07-02T09:28:45.563818+02:00 03ucas amavis[25462]: (25462-11)  
spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-0.814  
tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
MISSING_DATE=1.396, RCVD_IN_DNSWL_MED=-2.3, T_RP_MATCHES_RCVD=-0.01]  
autolearn=disabled

(dspam check the mail and a plugin of spamassassin assigns a score).

I have about 2 thousand of accounts. DSPAM have osb, tum.
This is the mysql schema:

MySQL [dspam]> desc dspam_token_data;
+---------------+---------------------+------+-----+---------+-------+
| Field         | Type                | Null | Key | Default | Extra |
+---------------+---------------------+------+-----+---------+-------+
| uid           | int(10) unsigned    | NO   | PRI | NULL    |       |
| token         | bigint(20) unsigned | NO   | PRI | NULL    |       |
| spam_hits     | bigint(20) unsigned | NO   |     | NULL    |       |
| innocent_hits | bigint(20) unsigned | NO   |     | NULL    |       |
| last_hit      | date                | NO   |     | NULL    |       |
+---------------+---------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)

MySQL [dspam]> desc dspam_signature_data;
+------------+------------------+------+-----+---------+-------+
| Field      | Type             | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+-------+
| uid        | int(10) unsigned | NO   | PRI | NULL    |       |
| signature  | char(32)         | NO   | PRI | NULL    |       |
| data       | longblob         | NO   |     | NULL    |       |
| length     | int(10) unsigned | NO   |     | NULL    |       |
| created_on | date             | NO   | MUL | NULL    |       |
+------------+------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)

MySQL [dspam]> desc dspam_virtual_uids;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| uid      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| username | varchar(128)     | YES  | UNI | NULL    |                |
+----------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

MySQL [dspam]> desc dspam_stats;
+------------------------+---------------------+------+-----+---------+-------+
| Field                  | Type                | Null | Key | Default  
| Extra |
+------------------------+---------------------+------+-----+---------+-------+
| uid                    | int(10) unsigned    | NO   | PRI | NULL    
|       |
| spam_learned           | bigint(20) unsigned | NO   |     | NULL    
|       |
| innocent_learned       | bigint(20) unsigned | NO   |     | NULL    
|       |
| spam_misclassified     | bigint(20) unsigned | NO   |     | NULL    
|       |
| innocent_misclassified | bigint(20) unsigned | NO   |     | NULL    
|       |
| spam_corpusfed         | bigint(20) unsigned | NO   |     | NULL    
|       |
| innocent_corpusfed     | bigint(20) unsigned | NO   |     | NULL    
|       |
| spam_classified        | bigint(20) unsigned | NO   |     | NULL    
|       |
| innocent_classified    | bigint(20) unsigned | NO   |     | NULL    
|       |
+------------------------+---------------------+------+-----+---------+-------+
9 rows in set (0.01 sec)

MySQL [dspam]> desc dspam_preferences;
+------------+------------------+------+-----+---------+-------+
| Field      | Type             | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+-------+
| uid        | int(10) unsigned | NO   | PRI | NULL    |       |
| preference | varchar(32)      | NO   | PRI | NULL    |       |
| value      | varchar(64)      | NO   |     | NULL    |       |
+------------+------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

Indexes:
+----------------------+----------------------+
| TABLE_NAME           | INDEX_NAME           |
+----------------------+----------------------+
| dspam_preferences    | id_preferences_01    |
| dspam_signature_data | id_signature_data_01 |
| dspam_signature_data | id_signature_data_02 |
| dspam_stats          | PRIMARY              |
| dspam_token_data     | id_token_data_01     |
| dspam_virtual_uids   | PRIMARY              |
| dspam_virtual_uids   | id_virtual_uids_01   |
+----------------------+----------------------+

This is the size:
+----------------------+------------+
| Tables               | Size in MB |
+----------------------+------------+
| dspam_token_data     |   23929.03 |
| dspam_signature_data |    6936.11 |
| dspam_virtual_uids   |       0.30 |
| dspam_stats          |       0.25 |
| dspam_preferences    |       0.02 |
+----------------------+------------+


If I try to count tokens I take very log time:
MySQL [dspam]> select count(*) from dspam_token_data;

+-----------+
| count(*)  |
+-----------+
| 203125948 |
+-----------+
1 row in set (*1 hour 24 min 52.41 sec*)

MySQL [dspam]> select count(*) from dspam_signature_data;
Current database: dspam

+----------+
| count(*) |
+----------+
|   228701 |
+----------+
1 row in set (12.21 sec)


This is my dspam.conf:

Home /var/lib/dspam
StorageDriver /usr/lib64/dspam/libmysql_drv.so
DeliveryHost          127.0.0.1
DeliveryPort          10024
DeliveryIdent         localhost
DeliveryProto         LMTP
OnFail error
Trust dspam
DebugOpt process classify spam fp inoculation corpus
TrainingMode tum
TestConditionalTraining on
Feature noise
Feature whitelist
Algorithm graham burton
Tokenizer osb
PValue bcr
WebStats on
ImprobabilityDrive on
Preference "trainingMode=TUM"
Preference "spamAction=deliver"
Preference "spamSubject=[SPAM]"         # { string } -> default:[SPAM]
Preference "statisticalSedation=0"      # { 0 - 10 } -> default:0
Preference "enableBNR=on"               # { on | off } -> default:off
Preference "enableWhitelist=on"         # { on | off } -> default:on
Preference "signatureLocation=headers"
Preference "tagSpam=off"                # { on | off }
Preference "tagNonspam=off"             # { on | off }
Preference "showFactors=off"            # { on | off } -> default:off
Preference "optIn=on"                   # { on | off }
Preference "optOut=off"                 # { on | off }
Preference "whitelistThreshold=10"      # { Integer } -> default:10
Preference "makeCorpus=off"             # { on | off } -> default:off
Preference "storeFragments=off"         # { on | off } -> default:off
Preference "localStore="                # { on | off } -> default:username
Preference "processorBias=on"           # { on | off } -> default:on
Preference "fallbackDomain=off"         # { on | off } -> default:off
Preference "trainPristine=off"          # { on | off } -> default:off
Preference "optOutClamAV=off"           # { on | off } -> default:off
Preference "ignoreRBLLookups=off"       # { on | off } -> default:off
Preference "RBLInoculate=off"           # { on | off } -> default:off
Preference "notifications=off"          # { on | off } -> default:off
MySQLServer             OMISSIS
MySQLPort               3333
MySQLUser               dspam
MySQLPass               <password>
MySQLDb                 dspam
MySQLCompress           true
MySQLReconnect          true
MySQLConnectionCache    100
MySQLUIDInSignature     on
HashRecMax              98317
HashAutoExtend          on
HashMaxExtents          0
HashExtentSize          49157
HashPctIncrease         10
HashMaxSeek             10
HashConnectionCache     10
ExtLookup               on                              # Turns on/off  
external lookup
ExtLookupMode           map                             # available  
modes are 'verify', 'map' and 'strict'.
                                                         # 'strict'  
enforces both verify and map
ExtLookupDriver ldap                            # Currently only ldap  
and program are supported.
                                                         # There are  
plans to support both MySQL and Postgres.
ExtLookupServer OMISSIS                # Can either be a database  
hostname or the full path to
                                                         # an  
executable lookup program and its arguments.
ExtLookupPort           989                             # Desired port  
when connecting to the lookup database.
ExtLookupDB             "ou=base"  # Can either be an LDAP search base  
or a database name (TODO).
ExtLookupQuery          
"(&(|(mail=%u)(mailalternateaddress=%u))(|(&(objectclass=mailrecipient)(mailUserStatus=active))(objectclass=mailgroup)))"       # Can either be an LDAP search filter or an SQL query  
(TODO)
ExtLookupLDAPAttribute  "uid"                           # Attribute to  
be used when ExtLookupDriver is 'ldap'
                                                         # and  
ExtLookupMode 'map' or 'strict'
ExtLookupLDAPScope      sub                             # Can be set  
to 'base', 'sub' or 'one'. Only used when ExtLookupDriver is 'ldap'.
ExtLookupLDAPVersion    3                               # Sets the  
LDAP protocol version (1, 2 or 3)
ExtLookupLogin          "uid=dspam,cc=oo"      # Login to be used when  
connecting to any direct database backend.
ExtLookupPassword       <password>                 # Password to use  
with ExtLookupLogin.
Notifications   off
PurgeSignatures 14      # Stale signatures
PurgeNeutral    90      # Tokens with neutralish probabilities
PurgeUnused     90      # Unused tokens
PurgeHapaxes    30      # Tokens with less than 5 hits (hapaxes)
PurgeHits1S     15      # Tokens with only 1 spam hit
PurgeHits1I     15      # Tokens with only 1 innocent hit
LocalMX 127.0.0.1
SystemLog       on
UserLog         on
Opt in
TrackSources spam nonspam
ServerPort              20024
ServerQueueSize         40
ServerPID               /var/run/dspam/dspam.pid
ServerMode auto
ServerPass.ucmailfarm1  "pass"
ServerPass.ucmailfarm2  "pass"
ServerPass.CSICyrus     "pass"
ServerParameters        "--deliver=innocent,spam -d %u"
ServerIdent             "localhost.localdomain"
ProcessorURLContext on
ProcessorBias on
StripRcptDomain off


In purge-4.1.sql I set:
SET @TrainingMode    = 'TUM';      -- Default training mode
SET @PurgeSignatures = 14;          -- Stale signatures
SET @PurgeUnused     = 90;          -- Unused tokens
SET @PurgeHapaxes    = 30;          -- Tokens with less than 5 hits (hapaxes)
SET @PurgeHits1S     = 15;          -- Tokens with only 1 spam hit
SET @PurgeHits1I     = 15;          -- Tokens with only 1 innocent hit
SET @today           = to_days(current_date());



Could you help me to improve db performances?

Thank you very much
Best Regards
Marco


------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

Reindl Harald


Am 02.07.2014 10:00, schrieb Marco:

> I have a slow performance on my dspam-mysql conf. The purge nightly  
> script take 7-12 hours to complete. During this time I see a lot of  
> locks
>
> [07/02/2014 09:28:35] 32082: Lock wait timeout exceeded; try  
> restarting transaction: UPDATE dspam_token_data SET  
> last_hit=CURRENT_DATE(),innocent_hits=innocent_hits+1 WHERE uid=715  
> AND token IN  
> ('5719708924817070673','15748594722301702351','12523852096982625759','15747193994780569807','7072984635321818198','8566666067201058025','3214617105374076928','667817407838676578','2746530540300936818','7855920995652397412','5465147151345906752','5252602293815960809','12797608577848435519','7966374547679573945','12661687167322745101','1476367451104814719','15905813578389094591','7592741750694013197','15748943523967104959',
>
> Could you help me to improve db performances?
http://mysqltuner.com/ would be a good start

in case of innodb you should have a lot of RAM and in the best case
your whole dataset fit into innodb_buffer_pool_size


------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user

signature.asc (254 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

Phil Stracchino
In reply to this post by Marco
On 07/02/14 04:00, Marco wrote:
> Hello,
>   I have a slow performance on my dspam-mysql conf. The purge nightly  
> script take 7-12 hours to complete. During this time I see a lot of  
> locks

First and most obvious question:  Are your tables InnoDB or MyISAM?
MyISAM should not be used in any production DB unless you absolutely
require MyISAM features not supported by InnoDB.


--
  Phil Stracchino
  Babylon Communications
  [hidden email]
  [hidden email]
  Landline: 603.293.8485

------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

ktm@rice.edu
In reply to this post by Marco
On Wed, Jul 02, 2014 at 10:00:52AM +0200, Marco wrote:

> Hello,
>   I have a slow performance on my dspam-mysql conf. The purge nightly  
> script take 7-12 hours to complete. During this time I see a lot of  
> locks
>
> [07/02/2014 09:28:35] 32082: Lock wait timeout exceeded; try  
> restarting transaction: UPDATE dspam_token_data SET  
> last_hit=CURRENT_DATE(),innocent_hits=innocent_hits+1 WHERE uid=715  
> AND token IN  
> ('5719708924817070673','15748594722301702351','12523852096982625759','15747193994780569807','7072984635321818198','8566666067201058025','3214617105374076928','667817407838676578','2746530540300936818','7855920995652397412','5465147151345906752','5252602293815960809','12797608577848435519','7966374547679573945','12661687167322745101','1476367451104814719','15905813578389094591','7592741750694013197','15748943523967104959',
>
> During these locks DSPAM fails to analize the mails:
>
> 2014-07-02T09:28:14.901941+02:00 02ucas amavis[32751]: (32751-06)  
> spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-1.275  
> tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
> FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7,  
> RCVD_IN_MSPIKE_H2=-0.676, SPF_PASS=-0.001] autolearn=disabled
> 2014-07-02T09:28:30.486044+02:00 01ucas amavis[21712]: (21712-18)  
> spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-4.93  
> tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
> HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_DNSWL_HI=-5,  
> RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001,  
> T_RP_MATCHES_RCVD=-0.01] autolearn=disabled
> 2014-07-02T09:28:39.428388+02:00 01ucas amavis[28266]: (28266-03)  
> spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-0.617  
> tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
> FREEMAIL_FORGED_FROMDOMAIN=0.001, FREEMAIL_FROM=0.001,  
> HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_MESSAGE=0.001,  
> RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01,  
> RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled
> 2014-07-02T09:28:44.233983+02:00 03ucas amavis[25593]: (25593-09)  
> spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=2.8  
> tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
> HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_IMAGE_ONLY_28=0.726,  
> HTML_IMAGE_RATIO_02=0.805, HTML_MESSAGE=0.001, MIME_HTML_ONLY=1.105,  
> RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=-1,  
> RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,  
> T_RP_MATCHES_RCVD=-0.01, URIBL_GREY=1.084] autolearn=disabled
> 2014-07-02T09:28:45.563818+02:00 03ucas amavis[25462]: (25462-11)  
> spam-tag, <OMISSIS@OMISSIS> -> <OMISSIS@OMISSIS>, No, score=-0.814  
> tagged_above=-999 required=4.5 tests=[DSPAM_ERROR=0.1,  
> MISSING_DATE=1.396, RCVD_IN_DNSWL_MED=-2.3, T_RP_MATCHES_RCVD=-0.01]  
> autolearn=disabled
>

Hi Marco,

Your best bet is to increase the granularity of the SQL statements that
you are running to make the time taken per piece short enough to prevent
scanning problems. We currently process by first letter of the account
to break it into 26 smaller pieces. This is with DSPAM 3.6.8 w/ MySQL and
MYISM tables. We are moving to DSPAM 3.10.2 w/ PostgreSQL and a partitioned
set of backend tables to allow us to process each piece independently. You
may be able to use partitioning with MySQL, but I am not familiar enough
with it to be certain. Breaking the query up will definitely work since we
do that here for the same reason.

Regards,
Ken

------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

Marco
In reply to this post by Phil Stracchino

> First and most obvious question:  Are your tables InnoDB or MyISAM?
> MyISAM should not be used in any production DB unless you absolutely
> require MyISAM features not supported by InnoDB.

Ops, yes: all tables are InnoDB.
Thanks
Marco




------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

Marco
In reply to this post by ktm@rice.edu

> Your best bet is to increase the granularity of the SQL statements that
> you are running to make the time taken per piece short enough to prevent
> scanning problems. We currently process by first letter of the account
> to break it into 26 smaller pieces. This is with DSPAM 3.6.8 w/ MySQL and
> MYISM tables. We are moving to DSPAM 3.10.2 w/ PostgreSQL and a partitioned
> set of backend tables to allow us to process each piece independently. You
> may be able to use partitioning with MySQL, but I am not familiar enough
> with it to be certain. Breaking the query up will definitely work since we
> do that here for the same reason.

Hello Ken,

  thank you for these useful hints.
This leave me a little confused. I can use partitioning with MySQL,  
but I think that it is more efficient with MyISAM. However INNODB is  
strongly recommended in production DB.
I would like to know if partitioning with INNODB could mitigate the  
locks problem.

Do you use only partitioning, or do you have also modified the query  
of dspam crontab script in order to group statements?


I'm also wondering if it could be a good choice to switch to TOE mode.  
I already use amavis+spamassassin, and dspam should be used only to  
add a positive score to mails trained as spam by users... I don't know  
if this could reduce size on my large dspam_token_data table...

  Thanks a lot
Best Regards
Marco


------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

ktm@rice.edu
On Thu, Jul 03, 2014 at 02:22:36PM +0200, Marco wrote:

>
> > Your best bet is to increase the granularity of the SQL statements that
> > you are running to make the time taken per piece short enough to prevent
> > scanning problems. We currently process by first letter of the account
> > to break it into 26 smaller pieces. This is with DSPAM 3.6.8 w/ MySQL and
> > MYISM tables. We are moving to DSPAM 3.10.2 w/ PostgreSQL and a partitioned
> > set of backend tables to allow us to process each piece independently. You
> > may be able to use partitioning with MySQL, but I am not familiar enough
> > with it to be certain. Breaking the query up will definitely work since we
> > do that here for the same reason.
>
> Hello Ken,
>
>   thank you for these useful hints.
> This leave me a little confused. I can use partitioning with MySQL,  
> but I think that it is more efficient with MyISAM. However INNODB is  
> strongly recommended in production DB.
> I would like to know if partitioning with INNODB could mitigate the  
> locks problem.
>
> Do you use only partitioning, or do you have also modified the query  
> of dspam crontab script in order to group statements?
>
>
> I'm also wondering if it could be a good choice to switch to TOE mode.  
> I already use amavis+spamassassin, and dspam should be used only to  
> add a positive score to mails trained as spam by users... I don't know  
> if this could reduce size on my large dspam_token_data table...
>
>   Thanks a lot
> Best Regards
> Marco
>

Hi Marco,

First, you really, really, REALLY need to only use TOE. The biggest cause
of poor performance for our users is the result of using any other training
method. You need to disallow changing it from TOE and default to it. We
currently use a SQL purge that is adjusted to only process a subset of the
users per query. We use the first letter of the user name, but anything that
partitions the query into smaller pieces will work. The actual DB partitioning
allows maintenance to work on smaller pieces at a time, but that will be an
optimization we are including when we update.

Regards,
Ken

------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

P.V.Anthony
In reply to this post by Marco
On 07/03/2014 08:22 PM, Marco wrote:

> I'm also wondering if it could be a good choice to switch to TOE mode.

I am using TOE for about a year now. It is great.

P.V.Anthony


------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

P.V.Anthony
In reply to this post by ktm@rice.edu
On 07/03/2014 09:01 PM, [hidden email] wrote:

> First, you really, really, REALLY need to only use TOE. The biggest cause
> of poor performance for our users is the result of using any other training
> method. You need to disallow changing it from TOE and default to it.

+1

P.V.Anthony

------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Dspam-user] DSPAM Mysql low performances and locks

Marco
In reply to this post by ktm@rice.edu
ktm <at> rice.edu wrote:
> We currently use a SQL purge that is adjusted to only process a subset of the
> users per query. We use the first letter of the user name, but anything that
> partitions the query into smaller pieces will work. The actual DB  
> partitioning
> allows maintenance to work on smaller pieces at a time, but that will be an
> optimization we are including when we update.

Hello,

  we made a table optimization and now purge-maintenance has greatly improved.

I also switch to TOE, it seems this is really good and I will evaluate  
how db size will change.

In future we could consider to:

* reduce range of users per query.
* implement native mysql partitioning. But in this case it seems we  
should back to MyISAM to improve partitioning. Uhm...
* limit dspam to users who really want to retrain.

Thank you very much for all hints
Marco


------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
Dspam-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspam-user
Loading...