Mail::SpamAssassin::BayesStore::SQL
Mail::SpamAssassin::BayesStore::SQL(3)  User Contributed Perl Documentation  Mail::SpamAssassin::BayesStore::SQL(3)
NAME
Mail::SpamAssassin::BayesStore::SQL - SQL Bayesian Storage Module
Implementation
SYNOPSIS
DESCRIPTION
This module implements an SQL-based Bayesian storage module.
METHODS
new
public class (Mail::SpamAssassin::BayesStore::SQL) new
(Mail::SpamAssassin::Bayes $bayes)
Description: This method creates a new instance of the
Mail::SpamAssassin::BayesStore::SQL object. It expects to be passed an
instance of the Mail::SpamAssassin::Bayes object, which is passed into
the Mail::SpamAssassin::BayesStore parent object.
This method sets up the database connection and determines the
username to use in queries.
tie_db_readonly
public instance (Boolean) tie_db_readonly ()
Description: This method ensures that the database connection is
properly set up and working. If necessary, it will initialize a user’s
bayes variables so that they can begin using the database immediately.
tie_db_writable
public instance (Boolean) tie_db_writable ()
Description: This method ensures that the database connection is
properly set up and working. If necessary, it will initialize a user’s
bayes variables so that they can begin using the database immediately.
untie_db
public instance () untie_db ()
Description: This method is unused for the SQL based implementation.
calculate_expire_delta
public instance (%) calculate_expire_delta (Integer $newest_atime,
Integer $start,
Integer $max_expire_mult)
Description: This method performs a calculation on the data to deter-
mine the optimum atime for token expiration.
token_expiration
public instance (Integer, Integer,
Integer, Integer) token_expiration(\% $opts,
Integer $newdelta,
@ @vars)
Description: This method performs the database specific expiration of
tokens based on the passed in $newdelta and @vars.
sync_due
public instance (Boolean) sync_due ()
Description: This method determines if a database sync is currently
required.
Unused for SQL based implementation.
seen_get
public instance (String) seen_get (string $msgid)
Description: This method retrieves the stored value, if any, for
$msgid. The return value is the stored string (’s’ for spam and ’h’
for ham) or undef if $msgid is not found.
seen_put
public instance (Boolean) seen_put (string $msgid, char $flag)
Description: This method records $msgid as the type given by $flag.
$flag is one of two values: ’s’ for spam or ’h’ for ham.
seen_delete
public instance (Boolean) seen_delete (string $msgid)
Description: This method removes $msgid from the database.
get_storage_variables
public instance (@) get_storage_variables ()
Description: This method retrieves the various administrative vari-
ables used by the Bayes process and database.
The values returned in the array are in the following order:
0: scan count base
1: number of spam
2: number of ham
3: number of tokens in db
4: last expire atime
5: oldest token in db atime
6: db version value
7: last journal sync
8: last atime delta
9: last expire reduction count
10: newest token in db atime
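The ordering above can be unpacked into named variables in one list assignment. The following is a minimal sketch only: the sample values and variable names are hypothetical, and the array stands in for what a real call to get_storage_variables would return.

```perl
use strict;
use warnings;

# Hypothetical sample values standing in for the list returned by
# get_storage_variables(); positions follow the documented order 0..10.
my @vars = (0, 120, 340, 15000, 1700000000, 1690000000,
            3, 0, 43200, 250, 1700500000);

# Variable names are illustrative, not taken from the module.
my ($scan_base, $nspam, $nham, $ntokens, $last_expire,
    $oldest_atime, $db_version, $last_journal_sync,
    $last_atime_delta, $last_expire_reduce, $newest_atime) = @vars;

printf "spam=%d ham=%d tokens=%d version=%d\n",
    $nspam, $nham, $ntokens, $db_version;
# prints: spam=120 ham=340 tokens=15000 version=3
```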
dump_db_toks
public instance () dump_db_toks (String $template, String $regex,
Array @vars)
Description: This method loops over all tokens, computing the
probability for each token and then printing it out according to the
passed-in template.
set_last_expire
public instance (Boolean) set_last_expire (Integer $time)
Description: This method sets the last expire time.
get_running_expire_tok
public instance (String $time) get_running_expire_tok ()
Description: This method determines if an expire is currently running
and returns the last time set.
There can be multiple times, so we just pull the greatest (most
recent) value.
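Picking the most recent of several recorded times can be sketched with List::Util. This is an illustration of the selection step only; the sample times are hypothetical and the module itself may well do this inside its SQL query.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical start times recorded by several expire runs.
my @running_times = (1700000100, 1700000450, 1700000300);

# max() returns undef for an empty list, which in this sketch
# stands for "no expire is marked as running".
my $latest = max(@running_times);

print defined $latest
    ? "expire running since $latest\n"
    : "no expire running\n";
# prints: expire running since 1700000450
```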
set_running_expire_tok
public instance (String $time) set_running_expire_tok ()
Description: This method sets the time that an expire starts running.
remove_running_expire_tok
public instance (Boolean) remove_running_expire_tok ()
Description: This method removes the row in the database that
indicates that an expire is currently running.
tok_get
public instance (Integer, Integer, Integer) tok_get (String $token)
Description: This method retrieves a specified token ($token) from
the database and returns its spam_count, ham_count and last access
time.
tok_get_all
public instance (\@) tok_get_all (@ $tokens)
Description: This method retrieves the specified tokens ($tokens) from
storage and returns an array ref of arrays, each holding the spam
count, ham count and last access time.
tok_count_change
public instance (Boolean) tok_count_change (Integer $spam_count,
Integer $ham_count,
String $token,
String $atime)
Description: This method takes a $spam_count and $ham_count, adds
them to $token, and updates the token’s atime with $atime.
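The behaviour described above can be modelled with an in-memory hash in place of the database. This is a sketch under stated assumptions: count_change and %tokens are hypothetical names, the atime is set unconditionally, and the real module may guard against negative counts or older atimes.

```perl
use strict;
use warnings;

# In-memory stand-in for the token table: token => [spam, ham, atime].
my %tokens = (abcde => [2, 1, 1700000000]);

# Hypothetical helper: add the deltas to the token's counts and record
# the new atime. A token that is not yet present starts from zero counts.
sub count_change {
    my ($spam_delta, $ham_delta, $token, $atime) = @_;
    my $row = $tokens{$token} ||= [0, 0, 0];
    $row->[0] += $spam_delta;
    $row->[1] += $ham_delta;
    $row->[2]  = $atime;
    return 1;
}

count_change(1, 0, 'abcde', 1700000500);   # existing token, one more spam
count_change(0, 1, 'fghij', 1700000600);   # new token, one ham

printf "%s: spam=%d ham=%d atime=%d\n", $_, @{ $tokens{$_} }
    for sort keys %tokens;
```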
multi_tok_count_change
public instance (Boolean) multi_tok_count_change (Integer $spam_count,
Integer $ham_count,
\% $tokens,
String $atime)
Description: This method takes a $spam_count and $ham_count and adds
them to all of the tokens in the $tokens hash ref, along with updating
each token’s atime with $atime.
nspam_nham_get
public instance ($spam_count, $ham_count) nspam_nham_get ()
Description: This method retrieves the total number of spam and the
total number of ham learned.
nspam_nham_change
public instance (Boolean) nspam_nham_change (Integer $num_spam,
Integer $num_ham)
Description: This method updates the number of spam and the number of
ham in the database.
tok_touch
public instance (Boolean) tok_touch (String $token,
String $atime)
Description: This method updates the atime of the given token ($token).
The assumption is that the token already exists in the database.
tok_touch_all
public instance (Boolean) tok_touch_all (\@ $tokens,
String $atime)
Description: This method does a mass update of the given list of
tokens, $tokens, where the existing token atime is < $atime.
The assumption is that the tokens already exist in the database.
We should never be touching more than N_SIGNIFICANT_TOKENS, so we can
make some assumptions about how to handle the data (i.e., there is no
need to batch as we do in tok_get_all).
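The "only if the existing atime is older" rule can be sketched against an in-memory hash. The names touch_all and %atime_of are hypothetical, and skipping unknown tokens is an assumption that mirrors the stated expectation that tokens already exist.

```perl
use strict;
use warnings;

# In-memory stand-in for the stored atimes: token => atime.
my %atime_of = (abcde => 100, fghij => 300, klmno => 200);

# Hypothetical helper: raise the atime of each listed token to
# $new_atime, but only when the stored value is older. Tokens that are
# not stored are skipped rather than created.
sub touch_all {
    my ($tokens, $new_atime) = @_;
    for my $tok (@$tokens) {
        next unless exists $atime_of{$tok};
        $atime_of{$tok} = $new_atime if $atime_of{$tok} < $new_atime;
    }
    return 1;
}

touch_all([qw(abcde fghij zzzzz)], 250);

print "$_ => $atime_of{$_}\n" for sort keys %atime_of;
# abcde is raised to 250; fghij (300) and klmno (200) are unchanged.
```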
cleanup
public instance (Boolean) cleanup ()
Description: This method performs any cleanup necessary before moving
on to the next operation.
get_magic_re
public instance (String) get_magic_re ()
Description: This method returns a regexp which indicates a magic
token.
Unused in SQL implementation.
sync
public instance (Boolean) sync (\% $opts)
Description: This method performs a sync of the database.
perform_upgrade
public instance (Boolean) perform_upgrade (\% $opts)
Description: Performs an upgrade of the database from one version to
another. Not currently used in this implementation.
clear_database
public instance (Boolean) clear_database ()
Description: This method deletes all records for a particular user.
Callers should be aware that any errors returned by this method could
cause the database to be inconsistent for the given user.
backup_database
public instance (Boolean) backup_database ()
Description: This method will dump the user’s database in a
machine-readable format.
restore_database
public instance (Boolean) restore_database (String $filename, Boolean
$showdots)
Description: This method restores a database from the given filename,
$filename.
Callers should be aware that any errors returned by this method could
cause the database to be inconsistent for the given user.
db_readable
public instance (Boolean) db_readable()
Description: This method returns a boolean value indicating if the
database is in a readable state.
db_writable
public instance (Boolean) db_writable ()
Description: This method returns a boolean value indicating if the
database is in a writable state.
Private Methods
_connect_db
private instance (Boolean) _connect_db ()
Description: This method connects to the SQL database.
_get_db_version
private instance (Integer) _get_db_version ()
Description: Gets the current version of the database from the special
global vars table.
_initialize_db
private instance (Boolean) _initialize_db ()
Description: This method will check to see if a user has had their
bayes variables initialized. If not then it will perform this initial-
ization.
_put_token
private instance (Boolean) _put_token (string $token,
integer $spam_count,
integer $ham_count,
string $atime)
Description: This method performs the work of either inserting or
updating a token in the database.
_put_tokens
private instance (Boolean) _put_tokens (\% $tokens,
integer $spam_count,
integer $ham_count,
string $atime)
Description: This method performs the work of either inserting or
updating tokens in the database.
_get_oldest_token_age
private instance (Integer) _get_oldest_token_age ()
Description: This method finds the atime of the oldest token in the
database.
The use of min(atime) in the SQL is ugly, but it is really the most
efficient way of getting the oldest_token_age after we’ve done a mass
expire. It should only be called at expire time.
_get_num_hapaxes
private instance (Integer) _get_num_hapaxes ()
Description: This method gets the total number of hapaxes (spam_count
+ ham_count == 1) in the token database for a user.
_get_num_lowfreq
private instance (Integer) _get_num_lowfreq ()
Description: This method gets the total number of lowfreq tokens
(spam_count < 8 and ham_count < 8) in the token database for a user.
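The hapax and lowfreq conditions given above can be checked against a small in-memory sample. This sketch only illustrates the two predicates; the data is made up, and the module itself presumably counts rows in SQL rather than in Perl.

```perl
use strict;
use warnings;

# Sample data only: token => [spam_count, ham_count].
my %tokens = (
    aaaaa => [1, 0],
    bbbbb => [0, 1],
    ccccc => [3, 4],
    ddddd => [9, 2],
    eeeee => [8, 8],
);

# grep in scalar context returns the number of matching elements.
my $hapaxes = grep { $_->[0] + $_->[1] == 1 }     values %tokens;
my $lowfreq = grep { $_->[0] < 8 && $_->[1] < 8 } values %tokens;

print "hapaxes=$hapaxes lowfreq=$lowfreq\n";
# prints: hapaxes=2 lowfreq=3
```

Note that every hapax also satisfies the lowfreq condition, so the two counts overlap.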
_token_select_string
private instance (String) _token_select_string ()
Description: This method returns the string to be used in SELECT
statements to represent the token column.
The default is to use the RPAD function to pad the token out to 5
characters.
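The effect of padding a token out to 5 characters can be mimicked in Perl with pack. This is a sketch only: pad_token is a hypothetical helper, and the SQL fragment shown, including the column name "token", is an assumption about the shape of the returned string rather than a quotation from the module.

```perl
use strict;
use warnings;

# pack's "A5" template space-pads a string to 5 characters and
# truncates anything longer.
sub pad_token {
    my ($token) = @_;
    return pack('A5', $token);
}

# Assumed shape of the SELECT fragment (column name is an assumption).
my $select_expr = "RPAD(token, 5, ' ')";

printf "[%s]\n", pad_token('ab');        # prints: [ab   ]
printf "[%s]\n", pad_token('abcde');     # prints: [abcde]
print  "SELECT $select_expr FROM ...\n";
```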
perl v5.8.8 2008-Mail::SpamAssassin::BayesStore::SQL(3)