Tie::RDBM

TriggerTek Logo
abcdefghijklmnopqrstuvwxyz_
Tie::RDBM(3)	     User Contributed Perl Documentation	 Tie::RDBM(3)



NAME
       Tie::RDBM - Tie hashes to relational databases

SYNOPSIS
	 use Tie::RDBM;
	 tie %h,’Tie::RDBM’,’mysql:test’,{table=>’Demo’,create=>1,autocommit=>0};
	 $h{’key1’} = ’Some data here’;
	 $h{’key2’} = 42;
	 $h{’key3’} = { complex=>[’data’,’structure’,’here’],works=>’true’ };
	 $h{’key4’} = new Foobar(’Objects work too’);
	 print $h{’key3’}->{complex}->[0];
	 tied(%h)->commit;
	 untie %h;

DESCRIPTION
       This module allows you to tie Perl associative arrays (hashes) to SQL
       databases using the DBI interface.  The tied hash is associated with a
       table in a local or networked database.	One field of the table
       becomes the hash key, and another becomes the value.  Once tied, all
       the standard hash operations work, including iteration over keys and
       values.

       If you have the Storable module installed, you may store arbitrarily
       complex Perl structures (including objects) into the hash and later
       retrieve them.  When used in conjunction with a network-accessible
       database, this provides a simple way to transmit data structures
       between Perl programs on two different machines.

TIEING A DATABASE
	  tie %VARIABLE,Tie::RDBM,DSN [,\%OPTIONS]

       You tie a variable to a database by providing the variable name, the
       tie interface (always "Tie::RDBM"), the data source name, and an
       optional hash reference containing various options to be passed to the
       module and the underlying database driver.

       The data source may be a valid DBI-style data source string of the
       form "dbi:driver:database_name[:other information]", or a previously-
       opened database handle.	See the documentation for DBI and your DBD
       driver for details.  Because the initial "dbi" is always present in
       the data source, Tie::RDBM will automatically add it for you.

       The options array contains a set of option/value pairs.	If not pro-
       vided, defaults are assumed.  The options are:

       user [’’]
	   Account name to use for database authentication, if necessary.
	   Default is an empty string (no authentication necessary).

       password [’’]
	   Password to use for database authentication, if necessary.
	   Default is an empty string (no authentication necessary).

       db [’’]
	   The data source, if not provided in the argument.  This allows an
	   alternative calling style:

	      tie(%h,Tie::RDBM,{db=>’dbi:mysql:test’,create=>1};

       table [’pdata’]
	   The name of the table in which the hash key/value pairs will be
	   stored.

       key [’pkey’]
	   The name of the column in which the hash key will be found.	If
	   not provided, defaults to "pkey".

       value [’pvalue’]
	   The name of the column in which the hash value will be found.  If
	   not provided, defaults to "pvalue".

       frozen [’pfrozen’]
	   The name of the column that stores the boolean information indi-
	   cating that a complex data structure has been "frozen" using
	   Storable’s freeze() function.  If not provided, defaults to
	   "pfrozen".

	   NOTE: if this field is not present in the database table, or if
	   the database is incapable of storing binary structures, Storable
	   features will be disabled.

       create [0]
	   If set to a true value, allows the module to create the database
	   table if it does not already exist.	The module emits a CREATE TA-
	   BLE command and gives the key, value and frozen fields the data
	   types most appropriate for the database driver (from a lookup ta-
	   ble maintained in a package global, see DATATYPES below).

	   The success of table creation depends on whether you have table
	   create access for the database.

	   The default is not to create a table.  tie() will fail with a
	   fatal error.

       drop [0]
	   If the indicated database table exists, but does not have the
	   required key and value fields, Tie::RDBM can try to add the
	   required fields to the table.  Currently it does this by the dras-
	   tic expedient of DROPPING the table entirely and creating a new
	   empty one.  If the drop option is set to true, Tie::RDBM will per-
	   form this radical restructuring.  Otherwise tie() will fail with a
	   fatal error.	 "drop" implies "create".  This option defaults to
	   false.

	   A future version of Tie::RDBM may implement a last radical
	   restructuring method; differences in DBI drivers and database
	   capabilities make this task harder than it would seem.

       autocommit [1]
	   If set to a true value, the "autocommit" option causes the
	   database driver to commit after every store statement.  If set to
	   a false value, this option will not commit to the database until
	   you explicitly call the Tie::RDBM commit() method.

	   The autocommit option defaults to true.

       DEBUG [0]
	   When the "DEBUG" option is set to a true value the module will
	   echo the contents of SQL statements and other debugging informa-
	   tion to standard error.

USING THE TIED ARRAY
       The standard fetch, store, keys(), values() and each() functions will
       work as expected on the tied array.  In addition, the following meth-
       ods are available on the underlying object, which you can obtain with
       the standard tie() operator:

       commit()
	      (tied %h)->commit();

	   When using a database with the autocommit option turned off, val-
	   ues that are stored into the hash will not become permanent until
	   commit() is called.	Otherwise they are lost when the application
	   terminates or the hash is untied.

	   Some SQL databases don’t support transactions, in which case you
	   will see a warning message if you attempt to use this function.

       rollback()
	      (tied %h)->rollback();

	   When using a database with the autocommit option turned off, this
	   function will roll back changes to the database to the state they
	   were in at the last commit().  This function has no effect on
	   database that don’t support transactions.

DATABASES AND DATATYPES
       Perl is a weakly typed language.	 Databases are strongly typed.	When
       translating from databases to Perl there is inevitably some data type
       conversion that you must worry about.  I have tried to keep the
       details as transparent as possible without sacrificing power; this
       section discusses the tradeoffs.

       If you wish to tie a hash to a preexisting database, specify the
       database name, the table within the database, and the fields you wish
       to use for the keys and values.	These fields can be of any type that
       you choose, but the data type will limit what can be stored there.
       For example, if the key field is of type "int", then any numeric value
       will be a valid key, but an attempt to use a string as a key will
       result in a run time error.  If a key or value is too long to fit into
       the data field, it will be truncated silently.

       For performance reasons, the key field should be a primary key, or at
       least an indexed field.	It should also be unique.  If a key is
       present more than once in a table, an attempt to fetch it will return
       the first record found by the SQL select statement.

       If you wish to store Perl references in the database, the module needs
       an additional field in which it can store a flag indicating whether
       the data value is a simple or a complex type.  This "frozen" field is
       treated as a boolean value.  A "tinyint" data type is recommended, but
       strings types will work as well.

       In a future version of this module, the "frozen" field may be turned
       into a general "datatype" field in order to minimize storage.  For
       future compatability, please use an integer for the frozen field.

       If you use the "create" and/or "drop" options, the module will auto-
       matically attempt to create a table for its own use in the database if
       a suitable one isn’t found.  It uses information defined in the pack-
       age variable %Tie::RDBM::Types to determine what kind of data types to
       create.	This variable is indexed by database driver.  Each index con-
       tains a four-element array indicating what data type to use for each
       of the key, value and frozen fields, and whether the database can sup-
       port binary types.  Since I have access to only a limited number of
       databases, the table is currently short:

	  Driver     Key Field	    Value Field	    Frozen Field  Binary?

	  mysq	     varchar(127)   longblob	    tinyint	  1
	  mSQL	     char(255)	    char(255)	    int		  0
	  Sybase     varchar(255)   varbinary(255)  tinyint	  1
	  default    varchar(255)   varbinary(255)  tinyint	  1

       The "default" entry is used for any driver not specifically mentioned.

       You are free to add your own entries to this table, or make correc-
       tions.  Please send me e-mail with any revisions you make so that I
       can share the wisdom.

STORABLE CAVEATS
       Because the Storable module packs Perl structures in a binary format,
       only those databases that support a "varbinary" or "blob" type can
       handle complex datatypes.  Furthermore, some databases have strict
       limitations on the size of these structures.  For example, SyBase and
       MS SQL Server have a "varbinary" type that maxes out at 255 bytes.
       For structures larger than this, the databases provide an "image" type
       in which storage is allocated in 2K chunks!  Worse, access to this
       image type uses a non-standard SQL extension that is not supported by
       DBI.

       Databases that do not support binary fields cannot use the Storable
       feature.	 If you attempt to store a reference to a complex data type
       in one of these databases it will be converted into strings like
       "HASH(0x8222cf4)", just as it would be if you tried the same trick
       with a conventional tied DBM hash.  If the database supports binary
       fields of restricted length, large structures may be silently trun-
       cated.  Caveat emptor.

       It’s also important to realize the limitations of the Storable mecha-
       nism.  You can store and retrieve entire data structures, but you
       can’t twiddle with individual substructures and expect them to persist
       when the process exits.	To update a data structure, you must fetch it
       from the hash, make the desired modifications, then store it back into
       the hash, as the example below shows:

       Process #1:
	  tie %h,’Tie::RDBM’,’mysql:Employees:host.somewhere.com’,
			  {table=>’employee’,user=>’fred’,password=>’xyzzy’};
	  $h{’Anne’} = { office=>’999 Infinity Drive, Rm 203’,
			 age	=>  29,
			 salary =>  32100 };
	  $h{’Mark’} = { office=>’000 Iteration Circle, Rm -123’,
			 age	=>  32,
			 salary =>  35000 };

       Process #2:
	  tie %i,’Tie::RDBM’,’mysql:Employees:host.somewhere.com’,
			  {table=>’employee’,user=>’george’,pass-
       word=>’kumquat2’};
	  foreach (keys %i) {
	     $info = $i{$_};
	     if ($info->{age} > 30) {
		# Give the oldies a $1000 raise
		$info->{salary} += 1000;
		$i{$_} = $info;
	     }
	  }

       This example also demonstrates how two Perl scripts running on differ-
       ent machines can use Tie::RDBM to share complex data structures (in
       this case, the employee record) without resorting to sockets, remote
       procedure calls, shared memory, or other gadgets

PERFORMANCE
       What is the performance hit when you use this module?  It can be sig-
       nificant.  I used a simple benchmark in which Perl parsed a 6180 word
       text file into individual words and stored them into a database,
       incrementing the word count with each store.  The benchmark then read
       out the words and their counts in an each() loop.  The database driver
       was mySQL, running on a 133 MHz Pentium laptop with Linux 2.0.30.  I
       compared Tie::RDBM, to DB_File, and to the same task using vanilla DBI
       SQL statements.	The results are shown below:

		     STORE	 EACH() LOOP
	 Tie::RDBM     28 s	   2.7	s
	 Vanilla DBI   15 s	   2.0	s
	 DB_File	3 s	   1.08 s

       During stores, there is an approximately 2X penalty compared to
       straight DBI, and a 15X penalty over using DB_File databases.  For the
       each() loop (which is dominated by reads), the performance is 2-3
       times worse than DB_File and much worse than a vanilla SQL statement.
       I have not investigated the bottlenecks.

TO DO LIST
	  - Store strings, numbers and data structures in separate
	    fields for space and performance efficiency.

	   - Expand data types table to other database engines.

	   - Catch internal changes to data structures and write them into
	     database automatically.

BUGS
       Yes.

AUTHOR
       Lincoln Stein, lstein@w3.org

COPYRIGHT
	 Copyright (c) 1998, Lincoln D. Stein

       This library is free software; you can redistribute it and/or modify
       it under the same terms as Perl itself.

AVAILABILITY
       The latest version can be obtained from:

	  http://www.genome.wi.mit.edu/~lstein/Tie-DBM/

SEE ALSO
       perl(1), DBI(3), Storable(3)



perl v5.8.8			  2005-03-07			 Tie::RDBM(3)