XML::Twig

Twig(3)		     User Contributed Perl Documentation	      Twig(3)



NAME
       XML::Twig - A Perl module for processing huge XML documents in tree
       mode.

SYNOPSIS
       Small documents (loaded in memory as a tree):

         my $twig=XML::Twig->new();    # create the twig
         $twig->parsefile( 'doc.xml'); # build it
         my_process( $twig);           # use twig methods to process it
         $twig->print;                 # output the twig

       Huge documents (processed in combined stream/tree mode):

         # at most one div will be loaded in memory
         my $twig=XML::Twig->new(
           twig_handlers =>
             { title   => sub { $_->set_gi( 'h2') }, # change title tags to h2
               para    => sub { $_->set_gi( 'p')  }, # change para to p
               hidden  => sub { $_->delete;       }, # remove hidden elements
               list    => \&my_list_process,         # process list elements
               div     => sub { $_[0]->flush;     }, # output and free memory
             },
           pretty_print => 'indented',               # output will be nicely formatted
           empty_tags   => 'html',                   # outputs <empty_tag />
                                );
         $twig->parsefile( 'doc.xml');               # parse, calling the handlers
         $twig->flush;                               # flush the end of the document

       See XML::Twig 101 for other ways to use the module, as a filter for
       example.

       Note that this documentation is intended as a reference to the module.
       A tutorial is available at
       http://www.xmltwig.com/xmltwig/tutorial/index.html and a FAQ is at
       http://www.xmltwig.com/xmltwig/XML-Twig-FAQ.html

DESCRIPTION
       This module provides a way to process XML documents. It is built on
       top of "XML::Parser".

       The module offers a tree interface to the document, while allowing you
       to output the parts of it that have been completely processed.

       It allows minimal resource (CPU and memory) usage by building the tree
       only for the parts of the documents that need actual processing,
       through the use of the "twig_roots" and "twig_print_outside_roots"
       options. The "finish" and "finish_print" methods also help to
       improve performance.

       XML::Twig tries to make simple things easy, so it does its best to
       take care of a lot of the (usually) annoying (but sometimes
       necessary) features that come with XML and XML::Parser.

XML::Twig 101
       XML::Twig can be used either on "small" XML documents (that fit in
       memory) or on huge ones, by processing parts of the document and out-
       putting or discarding them once they are processed.

       Loading an XML document and processing it

         my $t= XML::Twig->new();
         $t->parse( '<d><title>title</title><para>p 1</para><para>p 2</para></d>');
         my $root= $t->root;
         $root->set_gi( 'html');                  # change doc to html
         my $title= $root->first_child( 'title'); # get the title
         $title->set_gi( 'h1');                   # turn it into h1
         my @para= $root->children( 'para');      # get the para children
         foreach my $para (@para)
           { $para->set_gi( 'p'); }               # turn them into p
         $t->print;                               # output the document

       Other useful methods include:

       att: "$elt->att( 'foo')" returns the "foo" attribute for an element,

       set_att: "$elt->set_att( foo => "bar")" sets the "foo" attribute to
       the "bar" value,

       next_sibling: "$elt->next_sibling" returns the next sibling in the
       document (in the example "$title->next_sibling" is the first "para");
       you can also (and actually should) use "$elt->next_sibling( 'para')"
       to get it.

       The document can also be transformed through the use of the cut, copy,
       paste and move methods: "$title->cut; $title->paste( after => $p);"
       for example.

       And much, much more, see Elt.
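
       As a minimal sketch of the cut and paste methods mentioned above, the
       following moves the title after the last para. The sample document
       and the variable names are only illustrative.

```perl
use strict;
use warnings;
use XML::Twig;

my $t = XML::Twig->new();
$t->parse( '<d><title>title</title><para>p 1</para><para>p 2</para></d>');

my $root  = $t->root;
my $title = $root->first_child( 'title');   # the element to move
my $last  = $root->last_child( 'para');     # the element to paste after

$title->cut;                      # detach the title from the tree
$title->paste( after => $last);   # re-attach it after the last para

my $out = $root->sprint;          # serialize the modified tree to a string
print "$out\n";
```

       Using sprint instead of print keeps the result in a variable, which
       is convenient when the output needs to be checked or post-processed.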

       Processing an XML document chunk by chunk

       One of the strengths of XML::Twig is that it lets you work with files
       that do not fit in memory (BTW storing an XML document in memory as a
       tree is quite memory-expensive, the expansion factor being often
       around 10).

       To do this you can define handlers that will be called once a
       specific element has been completely parsed. In these handlers you can
       access the element and process it as you see fit, using the navigation
       and the cut-n-paste methods, plus lots of convenient ones like
       "prefix". Once the element is completely processed you can then
       "flush" it, which will output it and free the memory. You can also
       "purge" it if you don’t need to output it (if you are just extracting
       some data from the document for example). The handler will be called
       again once the next relevant element has been parsed.

         my $t= XML::Twig->new( twig_handlers =>
                                 { section => \&section,
                                   para    => sub { $_->set_tag( 'p'); },
                                 },
                              );
         $t->parsefile( 'doc.xml');
         $t->flush; # don't forget to flush one last time in the end or anything
                    # after the last </section> tag will not be output

         # the handler is called once a section is completely parsed, ie when
         # the end tag for section is found, it receives the twig itself and
         # the element (including all its sub-elements) as arguments
         sub section
           { my( $t, $section)= @_;      # arguments for all twig_handlers
             $section->set_tag( 'div');  # change the tag name, my favourite method...
             # let's use the attribute nb as a prefix to the title
             my $title= $section->first_child( 'title'); # find the title
             my $nb= $title->att( 'nb'); # get the attribute
             $title->prefix( "$nb - ");  # easy isn't it?
             $section->flush;            # outputs the section and frees memory
           }

       There is of course more to it: you can trigger handlers on more elabo-
       rate conditions than just the name of the element, "section/title" for
       example.

         my $t= XML::Twig->new( twig_handlers =>
                                  { 'section/title' => sub { $_->print } }
                              )
                         ->parsefile( 'doc.xml');

       Here "sub { $_->print }" simply prints the current element ($_ is
       aliased to the element in the handler).

       You can also trigger a handler on a test on an attribute:

         my $t= XML::Twig->new( twig_handlers =>
                             { 'section[@level="1"]' => sub { $_->print } }
                              )
                         ->parsefile( 'doc.xml');

       You can also use "start_tag_handlers" to process an element as soon
       as the start tag is found. Besides "prefix" you can also use
       "suffix".

       Processing just parts of an XML document

       The twig_roots mode builds only the required sub-trees from the
       document. Anything outside of the twig roots will just be ignored:

         my $t= XML::Twig->new(
              # the twig will include just the root and selected titles
                  twig_roots   => { 'section/title' => \&print_n_purge,
                                    'annex/title'   => \&print_n_purge
                  }
                             );
         $t->parsefile( 'doc.xml');

         sub print_n_purge
           { my( $t, $elt)= @_;
             print $elt->text;    # print the text (including sub-element texts)
             $t->purge;           # frees the memory
           }

       You can use that mode when you want to process parts of a document
       but are not interested in the rest and you don’t want to pay the
       price, either in time or memory, to build the tree for it.

       Building an XML filter

       You can combine the "twig_roots" and the "twig_print_outside_roots"
       options to build filters, which let you modify selected elements and
       will output the rest of the document as is.

       This would convert prices in $ to prices in Euro in a document:

         my $t= XML::Twig->new(
                  twig_roots   => { 'price' => \&convert, },   # process prices
                  twig_print_outside_roots => 1,               # print the rest
                             );
         $t->parsefile( 'doc.xml');

         sub convert
           { my( $t, $price)= @_;
             my $currency= $price->att( 'currency');           # get the currency
             if( $currency eq 'USD')
               { my $usd_price= $price->text;                  # get the price
                 # %rate is just a conversion table
                 my $euro_price= $usd_price * $rate{usd2euro};
                 $price->set_text( $euro_price);               # set the new price
                 $price->set_att( currency => 'EUR');          # don't forget this!
               }
             $price->print;                                    # output the price
           }

Simplifying XML processing
       Whitespaces
           Whitespace that looks non-significant is discarded. This
           behaviour can be controlled using the "keep_spaces",
           "keep_spaces_in" and "discard_spaces_in" options.

       Encoding
           You can specify that you want the output in the same encoding as
           the input (provided you have valid XML, which means you have to
           specify the encoding either in the document or when you create the
           Twig object) using the "keep_encoding" option.

       Comments and Processing Instructions (PI)
           Comments and PIs can be hidden from the processing, but still
           appear in the output (they are carried by the "real" element
           closest to them).

       Pretty Printing
	   XML::Twig can output the document pretty printed so it is easier
	   to read for us humans.

       Surviving an untimely death
	   XML parsers are supposed to react violently when fed improper XML.
	   XML::Parser just dies.

           XML::Twig provides the "safe_parse" and the "safe_parsefile"
           methods which wrap the parse in an eval and return either the
           parsed twig or 0 in case of failure.
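
           A minimal sketch of that behaviour, assuming two small inline
           documents (both made up for illustration). Only the truth of the
           return value is tested, and the error message is read from $@ as
           with any eval:

```perl
use strict;
use warnings;
use XML::Twig;

# a well-formed document: safe_parse returns a true value (the twig)
my $good = XML::Twig->new->safe_parse( '<doc><p>ok</p></doc>');

# a document with a missing end tag: the parser dies inside the eval,
# so safe_parse returns a false value and the error is left in $@
my $twig  = XML::Twig->new;
my $bad   = $twig->safe_parse( '<doc><p>not closed</doc>');
my $error = $@;

print "first document parsed\n"           if $good;
print "second document rejected: $error"  unless $bad;
```

           Copying $@ right after the call avoids losing the message if some
           later code runs its own eval.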

       Private attributes
	   Attributes with a name starting with # (illegal in XML) will not
	   be output, so you can safely use them to store temporary values
	   during processing.
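
           As a sketch (the attribute name "#seen" and the sample document
           are made up), a private attribute can be read back like any other
           attribute but does not show up in the serialized element:

```perl
use strict;
use warnings;
use XML::Twig;

my $t = XML::Twig->new();
$t->parse( '<doc><item id="a"/></doc>');

my $item = $t->root->first_child( 'item');
$item->set_att( '#seen' => 1);           # private: the name starts with '#'

my $in_memory = $item->att( '#seen');    # still available to the program
my $out       = $t->root->sprint;        # serialized form of the tree

print "in memory: $in_memory\n";
print "output:    $out\n";
```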

CLASSES
       XML::Twig uses a very limited number of classes. The ones you are most
       likely to use are "XML::Twig" of course, which represents a complete
       XML document, including the document itself (the root of the document
       itself is "root"), its handlers, its input or output filters... The
       other main class is "XML::Twig::Elt", which models an XML element.
       Element here has a very wide definition: it can be a regular element,
       but also text, with an element "tag" of "#PCDATA" (or "#CDATA"), an
       entity (tag is "#ENT"), a Processing Instruction ("#PI") or a comment
       ("#COMMENT").

       Those are the 2 commonly used classes.

       You might want to look at the "elt_class" option if you want to
       subclass "XML::Twig::Elt".

       Attributes are just attached to their parent element, they are not
       objects per se. (Please use the provided methods "att" and "set_att"
       to access them, if you access them as a hash, then your code becomes
       implementation dependent and might break in the future).

       Other classes that are seldom used are "XML::Twig::Entity_list" and
       "XML::Twig::Entity".

       If you use "XML::Twig::XPath" instead of "XML::Twig", elements are
       then created as "XML::Twig::XPath::Elt".

METHODS
       XML::Twig

       A twig is a subclass of XML::Parser, so all XML::Parser methods can be
       called on a twig object, including parse and parsefile. "setHandlers"
       on the other hand cannot be used, see "BUGS".

       new This is a class method, the constructor for XML::Twig. Options are
	   passed as keyword value pairs. Recognized options are the same as
	   XML::Parser, plus some XML::Twig specifics.

	   New Options:

	   twig_handlers
               This argument replaces the corresponding XML::Parser argument.
               It consists of a hash "{ expression => \&handler }" where
               expression is a generic_attribute_condition, a
               string_condition, a regexp_condition, an attribute_condition,
               a full_path, a partial_path, a gi, _default_ or _all_.

               The idea is to support a useful but efficient (thus limited)
               subset of XPATH. A fuller expression set will be supported in
               the future, as users ask for more and as I manage to implement
               it efficiently. This will never encompass all of XPATH due to
               the streaming nature of parsing (no lookahead after the
               element end tag).

               A generic_attribute_condition is a condition on an attribute,
               in the form "*[@att="val"]" or "*[@att]". Single quotes can be
               used instead of double quotes and the leading ’*’ is actually
               optional. No matter what the gi of the element is, the handler
               will be triggered either if the attribute has the specified
               value or if it just exists.

               A string_condition is a condition on the content of an
               element, in the form "gi[string()="foo"]". Single quotes can
               be used instead of double quotes; at the moment you cannot
               escape the quotes (this will be added as soon as I dig out my
               copy of Mastering Regular Expressions from its storage box).
               The text returned is, as per what I (and Matt Sergeant!)
               understood from the XPATH spec, the concatenation of all the
               text in the element, excluding all markup. Thus to call a
               handler on the element "<p>text <b>bold</b></p>" the
               appropriate condition is "p[string()="text bold"]". Note that
               this is not exactly conformant to the XPATH spec, it just
               tries to mimic it while being still quite concise.

               An extension of that notation is "gi[string(child_gi)="foo"]"
               where the handler will be called if a child of a "gi" element
               has a text value of "foo". At the moment only direct children
               of the "gi" element are checked. If you need to test on
               descendants of the element let me know. The fix is trivial but
               would slow down the checks, so I’d like to keep it the way it
               is.

               A regexp_condition is a condition on the content of an
               element, in the form "gi[string()=~ /foo/]". This is the same
               as a string condition except that the text of the element is
               matched to the regexp. The "i", "m", "s" and "o" modifiers can
               be used on the regexp.

               The "gi[string(child_gi)=~ /foo/]" extension is also
               supported.
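
               The two kinds of content conditions can be sketched as
               follows. The document and the labels pushed into @hits are
               invented for the illustration, and the exact set of supported
               expressions may differ between versions of the module:

```perl
use strict;
use warnings;
use XML::Twig;

my @hits;   # records which condition fired, in document order

my $t = XML::Twig->new(
  twig_handlers => {
    # string_condition: the whole text of the p, markup excluded
    'p[string()="text bold"]' => sub { push @hits, 'exact';  1 },
    # regexp_condition: the text of the p matched against a regexp
    'p[string()=~ /other/]'   => sub { push @hits, 'regexp'; 1 },
  },
);

$t->parse( '<doc><p>text <b>bold</b></p><p>some other text</p></doc>');

print join( ', ', @hits), "\n";
```

               The first p has the text "text bold" once the b tags are
               ignored, so it satisfies the string condition; the second p
               only satisfies the regexp condition.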

               An attribute_condition is a simple condition on an attribute
               of the current element in the form "gi[@att="val"]" (single
               quotes can be used instead of double quotes, you cannot escape
               quotes either). If several attribute_conditions are true for
               the same element, all the handlers are called in turn (in the
               order in which they were first defined). If the "="val"" part
               is omitted (the condition is then "gi[@att]") then the
               handler is triggered if the attribute actually exists for the
               element, no matter what its value is.

	       A full_path looks like ’/doc/section/chapter/title’, it starts
	       with a / then gives all the gi’s to the element. The handler
	       will be called if the path to the current element (in the
	       input document) is exactly as defined by the "full_path".

	       A partial_path is like a full_path except it does not start
	       with a /: ’chapter/title’ for example. The handler will be
	       called if the path to the element (in the input document) ends
	       as defined in the "partial_path".

               WARNING: (hopefully temporary) at the moment
               "string_condition", "regexp_condition" and
               "attribute_condition" are only supported on a simple gi, not
               on a path.

	       A gi (generic identifier) is just a tag name.

	       #CDATA can be used to call a handler for a CDATA.

	       A special gi _all_ is used to call a function for each ele-
	       ment.  The special gi _default_ is used to call a handler for
	       each element that does NOT have a specific handler.

               The order of precedence to trigger a handler is:
               generic_attribute_condition, string_condition,
               regexp_condition, attribute_condition, full_path, longer
               partial_path, shorter partial_path, gi, _default_.

               Important: once a handler has been triggered, if it returns 0
               then no other handler is called, except a "_all_" handler
               which will be called anyway.

               If a handler returns a true value and other handlers apply,
               then the next applicable handler will be called. Rinse,
               lather, repeat... The exception to that rule is when the
               "do_not_chain_handlers" option is set, in which case only the
               first handler will be called.

               Note that it might be a good idea to explicitly return a
               short true value (like 1) from handlers: this ensures that
               other applicable handlers are called even if the last
               statement for the handler happens to evaluate to false. This
               might also speed up the code by avoiding the result of the
               last statement of the code being copied and passed to the code
               managing handlers. It can really pay to have 1 instead of a
               long string returned.
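
               A small sketch of the chaining rule, assuming a made-up
               document in which one para matches two expressions. The
               partial_path handler has precedence over the gi handler and
               returns 0, so the gi handler should be skipped for that
               element:

```perl
use strict;
use warnings;
use XML::Twig;

my @log;   # records which handler ran for which para

my $t = XML::Twig->new(
  twig_handlers => {
    # partial_path: triggered first for a para inside a section;
    # returning 0 stops the chain for that element
    'section/para' => sub { push @log, 'path:' . $_->text; return 0; },
    # gi: triggered for every para the chain reaches
    'para'         => sub { push @log, 'gi:'   . $_->text; return 1; },
  },
);

$t->parse( '<doc><section><para>a</para></section><para>b</para></doc>');

print join( ', ', @log), "\n";
```

               Changing the first handler to return 1 would let the gi
               handler run for the para inside the section as well.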

               When an element is CLOSED the corresponding handler is called,
               with 2 arguments: the twig and the element. The twig
               includes the document tree that has been built so far, the
	       element is the complete sub-tree for the element. This means
	       that handlers for inner elements are called before handlers
	       for outer elements.
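
               The calling order can be checked with a sketch like this one
               (the document is invented for the example). The section
               handler also shows that the sub-tree is complete when it is
               called:

```perl
use strict;
use warnings;
use XML::Twig;

my @order;        # tag names, in the order their handlers are called
my $title_text;   # filled in by the section handler

my $t = XML::Twig->new(
  twig_handlers => {
    title   => sub { push @order, 'title'; 1 },
    section => sub { my( $twig, $section) = @_;
                     push @order, 'section';
                     # the title child is already part of the section
                     $title_text = $section->first_child( 'title')->text;
                     1;
                   },
  },
);

$t->parse( '<doc><section><title>Intro</title><p>text</p></section></doc>');

print "order: @order, title seen from section: $title_text\n";
```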

	       $_ is also set to the element, so it is easy to write inline
	       handlers like

                 para => sub { $_->set_gi( 'p'); }

	       Text is stored in elements where gi is #PCDATA (due to mixed
	       content, text and sub-element in an element there is no way to
	       store the text as just an attribute of the enclosing element).
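
               For instance, a minimal sketch listing the children of an
               element with mixed content (the document is the one used
               above for string conditions):

```perl
use strict;
use warnings;
use XML::Twig;

my $t = XML::Twig->new();
$t->parse( '<p>text <b>bold</b></p>');

my $p    = $t->root;
my @tags = map { $_->gi } $p->children;   # gi of each child, text included
my $text = $p->text;                      # all the text, markup excluded

print "children: @tags\n";
print "text:     $text\n";
```

               The text before the b element is a child in its own right,
               with a gi of #PCDATA, next to the b element.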

	       Warning: if you have used purge or flush on the twig the ele-
	       ment might not be complete, some of its children might have
	       been entirely flushed or purged, and the start tag might even
	       have been printed (by "flush") already, so changing its gi
	       might not give the expected result.

	       More generally, the full_path, partial_path and gi expressions
	       are evaluated against the input document. Which means that
	       even if you have changed the gi of an element (changing the gi
	       of a parent element from a handler for example) the change
               will not impact the expression evaluation. Attributes in
               attribute_condition are different though. As the initial value
               of an attribute is not stored, the handler will be triggered
               if the current attribute/value pair is found when the element
               end tag is found. Although this can be quite confusing it
               should not impact most users, and allows others to play clever
               tricks with temporary attributes. Let me know if this is a
               problem for you.

	   twig_roots
               This argument lets you build the tree only for those elements
               you are interested in.

                 Example: my $t= XML::Twig->new( twig_roots => { title => 1, subtitle => 1});
                          $t->parsefile( $file);
                          my $t2= XML::Twig->new( twig_roots => { 'section/title' => 1});
                          $t2->parsefile( $file);

               These return twigs containing a document including only the
               selected elements ("title" and "subtitle" in the first case),
               as children of the root element.

               You can use generic_attribute_condition, attribute_condition,
               full_path, partial_path, gi, _default_ and _all_ to trigger
               the building of the twig. string_condition and
               regexp_condition cannot be used, as the content of the
               element, and thus the string, has not yet been parsed when the
               condition is checked.

               WARNING: paths are checked against the document. Even if the
               "twig_roots" option is used they will be checked against the
               full document tree, not the virtual tree created by XML::Twig.

	       WARNING: twig_roots elements should NOT be nested, that would
	       hopelessly confuse XML::Twig ;--(

               Note: you can set handlers (twig_handlers) using twig_roots

                 Example: my $t= XML::Twig->new( twig_roots =>
                                                  { title    => sub { $_[1]->print; },
                                                    subtitle => \&process_subtitle
                                                  }
                                              );
                          $t->parsefile( $file);

	   twig_print_outside_roots
	       To be used in conjunction with the "twig_roots" argument. When
	       set to a true value this will print the document outside of
	       the "twig_roots" elements.

                Example: my $t= XML::Twig->new( twig_roots => { title => \&number_title },
                                               twig_print_outside_roots => 1,
                                              );
                          $t->parsefile( $file);
                          { my $nb;
                            sub number_title
                              { my( $twig, $title)= @_;
                                $nb++;
                                $title->prefix( "$nb ");
                                $title->print;
                              }
                          }

	       This example prints the document outside of the title element,
	       calls "number_title" for each "title" element, prints it, and
	       then resumes printing the document. The twig is built only for
	       the "title" elements.

	       If the value is a reference to a file handle then the document
	       outside the "twig_roots" elements will be output to this file
	       handle:

                 open( OUT, ">out_file") or die "cannot open out file out_file:$!";
                 my $t= XML::Twig->new( twig_roots => { title => \&number_title },
                                        # default output to OUT
                                        twig_print_outside_roots => \*OUT,
                                      );

                 { my $nb;
                   sub number_title
                     { my( $twig, $title)= @_;
                       $nb++;
                       $title->prefix( "$nb ");
                       $title->print( \*OUT);    # you have to print to \*OUT here
                     }
                 }

	   start_tag_handlers
               A hash "{ expression => \&handler }". Sets element handlers
               that are called when the element is open (at the end of the
               XML::Parser "Start" handler). The handlers are called with 2
               params: the twig and the element. The element is empty at that
               point, its attributes are created though.

	       You can use generic_attribute_condition, attribute_condition,
	       full_path, partial_path, gi, _default_  and _all_ to trigger
	       the handler.

               string_condition and regexp_condition cannot be used, as the
               content of the element, and thus the string, has not yet been
               parsed when the condition is checked.
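
               A short sketch of what such a handler sees, using a made-up
               document: the attributes of the element are available, but
               none of its children have been parsed yet.

```perl
use strict;
use warnings;
use XML::Twig;

my @seen;   # one record per section start tag

my $t = XML::Twig->new(
  start_tag_handlers => {
    section => sub {
      my( $twig, $elt) = @_;
      my @children = $elt->children;   # nothing inside has been parsed yet
      push @seen, { id => $elt->att( 'id'), children => scalar @children };
      return 1;
    },
  },
);

$t->parse( '<doc><section id="s1"><p>text</p></section></doc>');

printf "section %s had %d children at its start tag\n",
       $_->{id}, $_->{children} for @seen;
```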

               The main uses for those handlers are to change the tag name
               (you might have to do it as soon as you find the open tag if
               you plan to "flush" the twig at some point in the element),
               and to create temporary attributes that will be used when
               processing sub-elements with "twig_handlers".

	       You should also use it to change tags if you use "flush". If
	       you change the tag in a regular "twig_handler" then the start
	       tag might already have been flushed.

	       Note: "start_tag" handlers can be called outside of
	       "twig_roots" if this argument is used, in this case handlers
	       are called with the following arguments: $t (the twig), $gi
	       (the gi of the element) and %att (a hash of the attributes of
	       the element).

               If the "twig_print_outside_roots" argument is also used then
               the start tag will be printed if the last handler called
               returns a "true" value. If it does not then the start tag will
               not be printed (so you can print a modified string yourself
               for example).

	       Note that you can use the ignore method in "start_tag_han-
	       dlers" (and only there).

	   end_tag_handlers
               A hash "{ expression => \&handler }". Sets element handlers
               that are called when the element is closed (at the end of the
               XML::Parser "End" handler). The handlers are called with 2
               params: the twig and the gi of the element.

	       twig_handlers are called when an element is completely parsed,
	       so why have this redundant option? There is only one use for
	       "end_tag_handlers": when using the "twig_roots" option, to
	       trigger a handler for an element outside the roots.  It is for
	       example very useful to number titles in a document using
	       nested sections:

                 my @no= (0);
                 my $no;
                 my $t= XML::Twig->new(
                         start_tag_handlers =>
                          { section => sub { $no[$#no]++; $no= join '.', @no; push @no, 0; } },
                         twig_roots         =>
                          { title   => sub { $_[1]->prefix( $no); $_[1]->print; } },
                         end_tag_handlers   => { section => sub { pop @no;  } },
                         twig_print_outside_roots => 1
                                     );
                 $t->parsefile( $file);

	       Using the "end_tag_handlers" argument without "twig_roots"
	       will result in an error.

	   do_not_chain_handlers
               If this option is set to a true value, then only one handler
               will be called for each element, even if several satisfy the
               condition.

               Note that the "_all_" handler will still be called regardless.

	   ignore_elts
	       This option lets you ignore elements when building the twig.
	       This is useful in cases where you cannot use "twig_roots" to
	       ignore elements, for example if the element to ignore is a
	       sibling of elements you are interested in.

	       Example:

                 my $twig= XML::Twig->new( ignore_elts => { elt => 1 });
                 $twig->parsefile( 'doc.xml');

	       This will build the complete twig for the document, except
	       that all "elt" elements (and their children) will be left out.

	   char_handler
	       A reference to a subroutine that will be called every time
	       "PCDATA" is found.

	   elt_class
               The name of a class used to store elements. This class should
               inherit from "XML::Twig::Elt" (and by default it is
               "XML::Twig::Elt"). This option is used to subclass the element
               class and extend it with new methods.

	       This option is needed because during the parsing of the XML,
	       elements are created by "XML::Twig", without any control from
	       the user code.

	   keep_atts_order
               Setting this option to a true value causes the attribute hash
               to be tied to a Tie::IxHash object. This means that
               Tie::IxHash needs to be installed for this option to be
               available. It also means that the hash keeps its order, so you
               will get the attributes in order. This allows outputting the
               attributes in the same order as they were in the original
               document.

	   keep_encoding
               This is a (slightly?) evil option: if the XML document is not
               UTF-8 encoded and you want to keep it that way, then setting
               keep_encoding will use the "Expat" original_string method for
               characters, thus keeping the original encoding, as well as the
               original entities in the strings.

	       See the "t/test6.t" test file to see what results you can
	       expect from the various encoding options.

	       WARNING: if the original encoding is multi-byte then attribute
	       parsing will be EXTREMELY unsafe under any Perl before 5.6, as
	       it uses regular expressions which do not deal properly with
	       multi-byte characters. You can specify an alternate function
	       to parse the start tags with the "parse_start_tag" option (see
	       below)

               WARNING: this option is NOT used when parsing with the
               non-blocking parser ("parse_start", "parse_more" and
               "parse_done" methods) which you probably should not use with
               XML::Twig anyway as they are totally untested!

	   output_encoding
               This option generates an output_filter using "Encode",
               "Text::Iconv" or "Unicode::Map8" and "Unicode::String", and
               sets the encoding in the XML declaration. This is the easiest
               way to deal with encodings. If you need more sophisticated
               features, look at "output_filter" below.

	   output_filter
	       This option is used to convert the character encoding of the
	       output document.	 It is passed either a string corresponding
	       to a predefined filter or a subroutine reference. The filter
	       will be called every time a document or element is processed
	       by the "print" functions ("print", "sprint", "flush").

	       Pre-defined filters:

	       latin1
		   uses either "Encode", "Text::Iconv" or "Unicode::Map8" and
		   "Unicode::String" or a regexp (which works only with
		   XML::Parser 2.27), in this order, to convert all charac-
		   ters to ISO-8859-1 (aka latin1)

	       html
                   does the same conversion as "latin1", plus encodes
                   entities using "HTML::Entities" (oddly enough you will
                   need to have HTML::Entities installed for it to be
                   available). This should only be used if the tags and
                   attribute names themselves are in US-ASCII, or they will
                   be converted and the output will not be valid XML any more

	       safe
                   converts the output to ASCII (US) only, plus character
                   entities ("&#nnn;"). This should be used only if the tags
                   and attribute names themselves are in US-ASCII, or they
                   will be converted and the output will not be valid XML any
                   more

	       safe_hex
                   same as "safe" except that the character entities are in
                   hexadecimal ("&#xnnn;")

	       iconv_convert ($encoding)
		   this function is used to create a filter subroutine that
		   will be used to convert the characters to the target
		   encoding using "Text::Iconv" (which needs to be installed,
		   look at the documentation for the module and for the
		   "iconv" library to find out which encodings are available
		   on your system)

                      my $conv = XML::Twig::iconv_convert( 'latin1');
                      my $t = XML::Twig->new(output_filter => $conv);

	       unicode_convert ($encoding)
		   this function is used to create a filter subroutine that
		   will be used to convert the characters to the target
                   encoding using "Unicode::String" and "Unicode::Map8"
		   (which need to be installed, look at the documentation for
		   the modules to find out which encodings are available on
		   your system)

                      my $conv = XML::Twig::unicode_convert( 'latin1');
                      my $t = XML::Twig->new(output_filter => $conv);

               Note that the "text" and "att" methods do not use the filter,
               so their results are always in unicode.

	   output_text_filter
               Same as output_filter, except it doesn’t apply to the brackets
	       and quotes around attribute values. This is useful for all
	       filters that could change the tagging, basically anything that
	       does not just change the encoding of the output. "html",
	       "safe" and "safe_hex" are better used with this option.

	   input_filter
	       This option is similar to "output_filter" except the filter is
	       applied to the characters before they are stored in the twig,
	       at parsing time.

	   parse_start_tag
	       If you use the "keep_encoding" option then this option can be
	       used to replace the default parsing function. You should pro-
	       vide a coderef (a reference to a subroutine) as the argument,
	       this subroutine takes the original tag (given by
	       XML::Parser::Expat "original_string()" method) and returns a
	       gi and the attributes in a hash (or in a list
	       attribute_name/attribute value).

	   expand_external_ents
               When this option is used external entities (that are defined)
               are expanded when the document is output using "print"
               functions such as "print", "sprint", "flush" and "xml_string".
               Note that in the twig the entity will be stored as an element
               with a gi of "#ENT"; the entity will not be expanded there, so
               you might want to process the entities before outputting it.

	   load_DTD
               If this argument is set to a true value, "parse" or
               "parsefile" on the twig will load the DTD information. This
               information can then be accessed through the twig, in a
               "DTD_handler" for example. This will load even an external
               DTD.

	       Note that to do this the module will generate a temporary file
	       in the current directory. If this is a problem let me know and
	       I will add an option to specify an alternate directory.

	       See DTD Handling for more information

	   DTD_handler
	       Set a handler that will be called once the doctype (and the
	       DTD) have been loaded, with 2 arguments, the twig and the DTD.

	   no_prolog
	       Does not output a prolog (XML declaration and DTD)

	   id  This optional argument gives the name of an attribute that can
	       be used as an ID in the document. Elements whose ID is known
	       can be accessed through the elt_id method. id defaults to
	       ’id’.  See "BUGS".

	   discard_spaces
	       If this optional argument is set to a true value then spaces
	       are discarded when they look non-significant: strings contain-
	       ing only spaces are discarded.  This argument is set to true
	       by default.

	   keep_spaces
	       If this optional argument is set to a true value then all
	       spaces in the document are kept, and stored as "PCDATA".
	       "keep_spaces" and "discard_spaces" cannot be both set.

	   discard_spaces_in
	       This argument sets "keep_spaces" to true but will cause the
	       twig builder to discard spaces in the elements listed.

	       The syntax for using this argument is:

		 XML::Twig->new( discard_spaces_in => [ 'elt1', 'elt2']);

	   keep_spaces_in
	       This argument sets "discard_spaces" to true but will cause the
	       twig builder to keep spaces in the elements listed.

	       The syntax for using this argument is:

		 XML::Twig->new( keep_spaces_in => [ 'elt1', 'elt2']);

	   pretty_print
	       Set the pretty print method, amongst ’"none"’ (default),
	       ’"nsgmls"’, ’"nice"’, ’"indented"’, ’"indented_c"’, ’"record"’
	       and ’"record_c"’

	       pretty_print formats:

	       none
		   The document is output as one long string, with no line
		   breaks except those found within text elements

	       nsgmls
		   Line breaks are inserted in safe places: that is within
		   tags, between a tag and an attribute, between attributes
		   and before the > at the end of a tag.

		   This is quite ugly but better than "none", and it is very
		   safe, the document will still be valid (conforming to its
		   DTD).

		   This is how the SGML parser "sgmls" splits documents,
		   hence the name.

	       nice
		   This option inserts line breaks before any tag that does
		   not contain text (so elements with textual content are not
		   broken, as the \n is significant there).

		   WARNING: this option leaves the document well-formed but
		   might make it invalid (not conformant to its DTD). If you
		   have elements declared as

		     <!ELEMENT foo (#PCDATA|bar)>

		   then a "foo" element including a "bar" one will be printed
		   as

		     <foo>
		     <bar>bar is just pcdata</bar>
		     </foo>

		   This is invalid, as the parser will take the line break
		   after the "foo" tag as a sign that the element contains
		   PCDATA, it will then die when it finds the "bar" tag. This
		   may or may not be important for you, but be aware of it!

	       indented
		   Same as "nice" (and with the same warning) but indents
		   elements according to their level

	       indented_c
		   Same as "indented" but a little more compact: the closing
		   tags are on the same line as the preceding text

	       record
		   This is a record-oriented pretty print, that displays data
		   in records, one field per line (which looks a LOT like
		   "indented")

	       record_c
		   Stands for record compact, one record per line
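
	       The styles above can be compared side by side. This is only a
	       minimal sketch: the sample document and the %output hash are
	       made up for illustration, and since the pretty print style is
	       a global setting each call to new overrides the previous one.

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document, not taken from the module
my $xml = '<doc><section><title>Intro</title><para>Some text</para></section></doc>';

my %output;
for my $style (qw(none nice indented)) {
    # pretty_print is global, so the last style set is the one in effect
    my $twig = XML::Twig->new( pretty_print => $style );
    $twig->parse($xml);
    $output{$style} = $twig->root->sprint;
    print "--- $style ---\n$output{$style}\n";
}
```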

	   empty_tags
	       Set the empty tag display style (’"normal"’, ’"html"’ or
	       ’"expand"’).

	   comments
	       Set the way comments are processed: ’"drop"’ (default),
	       ’"keep"’ or ’"process"’

	       Comments processing options:

	       drop
		   drops the comments, they are not read, nor printed to the
		   output

	       keep
		   comments are loaded and will appear on the output, they
		   are not accessible within the twig and will not interfere
		   with processing though

		   Bug: comments in the middle of a text element such as

		     <p>text <!-- comment --> more text</p>

		   are output at the end of the text:

		     <p>text  more text <!-- comment --></p>

	       process
		   comments are loaded in the twig and will be treated as
		   regular elements (their "gi" is "#COMMENT") this can
		   interfere with processing if you expect
		   "$elt->{first_child}" to be an element but find a comment
		   there.  Validation will not protect you from this as com-
		   ments can happen anywhere.  You can use
		   "$elt->first_child( ’gi’)" (which is a good habit anyway)
		   to get where you want.

		   Consider using "process" if you are outputting SAX events
		   from XML::Twig.
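
	       The effect of "process" on navigation can be sketched as
	       follows. The sample document and variable names are
	       assumptions made for the example; the point is only that an
	       unconditional first_child may return the comment, while
	       first_child with a condition skips it.

```perl
use strict;
use warnings;
use XML::Twig;

# comments become elements whose gi is #COMMENT
my $twig = XML::Twig->new( comments => 'process' );
$twig->parse('<doc><!-- a note --><p>text</p></doc>');

my $root    = $twig->root;
my $first   = $root->first_child;         # whatever comes first, here the comment
my $first_p = $root->first_child('p');    # the first p element, comment skipped

print "first child gi: ", $first->gi,   "\n";
print "first p gi:     ", $first_p->gi, "\n";
```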

	   pi  Set the way processing instructions are processed: ’"drop"’,
	       ’"keep"’ (default) or ’"process"’

	       Note that you can also set PI handlers in the "twig_handlers"
	       option:

		 '?'	   => \&handler
		 '?target' => \&handler2

	       The handlers will be called with 2 parameters, the twig and
	       the PI element if "pi" is set to "process", and with 3, the
	       twig, the target and the data if "pi" is set to "keep". Of
	       course they will not be called if "pi" is set to "drop".

	       If "pi" is set to "keep" the handler should return a string
	       that will be used as-is as the PI text (it should look like
	       "<?target data?>", or ’’ if you want to remove the PI).

	       Only one handler will be called, "?target" or "?" if no spe-
	       cific handler for that target is available.

	   Note: I _HATE_ the Java-like name of arguments used by most XML
	   modules.  So in pure TIMTOWTDI fashion all arguments can be writ-
	   ten either as "UglyJavaLikeName" or as "readable_perl_name":
	   "twig_print_outside_roots" or "TwigPrintOutsideRoots" (or even
	   "twigPrintOutsideRoots" {shudder}).	XML::Twig normalizes them
	   before processing them.

       parse (SOURCE [, OPT => OPT_VALUE [...]])
	   This method is inherited from XML::Parser.  The "SOURCE" parameter
	   should either be a string containing the whole XML document, or it
	   should be an open "IO::Handle". Constructor options to
	   "XML::Parser::Expat" given as keyword-value pairs may follow
	   the"SOURCE" parameter. These override, for this call, any options
	   or attributes passed through from the XML::Parser instance.

	   A die call is thrown if a parse error occurs. Otherwise it will
	   return the twig built by the parse. Use "safe_parse" if you want
	   the parsing to return even when an error occurs.
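
	   A minimal sketch of parsing a document held in a string; the
	   sample XML and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical document held entirely in memory
my $xml = '<doc><title>Hello</title><para>World</para></doc>';

my $twig = XML::Twig->new;
$twig->parse($xml);    # dies here if $xml is not well-formed

my $title = $twig->root->first_child('title')->text;
print "$title\n";
```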

       parsestring
	   This is just an alias for "parse" for backwards compatibility.

       parsefile (FILE [, OPT => OPT_VALUE [...]])
	   This method is inherited from XML::Parser.

	   Open "FILE" for reading, then call "parse" with the open handle.
	   The file is closed no matter how "parse" returns.

	   A "die" call is thrown if a parse error occurs. Otherwise it will
	   return the twig built by the parse. Use "safe_parsefile" if you
	   want the parsing to return even when an error occurs.

       parseurl ($url, $optional_user_agent)
	   Gets the data from $url and parses it. Note that the data is piped
	   to the parser in chunks the size of the XML::Parser::Expat buffer,
	   so memory consumption and hopefully speed are optimal.

	   If the $optional_user_agent argument is given then that user agent
	   is used, otherwise a new one is created.

       safe_parse ( SOURCE [, OPT => OPT_VALUE [...]])
	   This method is similar to "parse" except that it wraps the parsing
	   in an "eval" block. It returns the twig on success and 0 on fail-
	   ure (the twig object also contains the parsed twig). $@ contains
	   the error message on failure.

	   Note that the parsing still stops as soon as an error is detected,
	   there is no way to keep going after an error.
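
	   The calling pattern can be sketched as below, assuming a
	   deliberately malformed sample string; the variable names are
	   illustrative only.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig    = XML::Twig->new;
my $bad_xml = '<doc><para>not closed</doc>';    # para is never closed

# safe_parse traps the die raised by the parser
my $result = $twig->safe_parse($bad_xml);
my $error  = $result ? '' : $@;

if ($result) {
    print "parsed OK\n";
}
else {
    print "parse failed: $error\n";
}
```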

       safe_parsefile (FILE [, OPT => OPT_VALUE [...]])
	   This method is similar to "parsefile" except that it wraps the
	   parsing in an "eval" block. It returns the twig on success and 0
	   on failure (the twig object also contains the parsed twig). $@
	   contains the error message on failure.

	   Note that the parsing still stops as soon as an error is detected,
	   there is no way to keep going after an error.

       safe_parseurl ($url, $optional_user_agent)
	   Same as "parseurl" except that it wraps the parsing in an "eval"
	   block. It returns the twig on success and 0 on failure (the twig
	   object also contains the parsed twig). $@ contains the error
	   message on failure.

       parser
	   This method returns the "expat" object (actually the
	   XML::Parser::Expat object) used during parsing. It is useful for
	   example to call XML::Parser::Expat methods on it. To get the line
	   of a tag for example use "$t->parser->current_line".

       setTwigHandlers ($handlers)
	   Set the Twig handlers. $handlers is a reference to a hash similar
	   to the one in the "twig_handlers" option of new. All previous han-
	   dlers are unset.  The method returns the reference to the previous
	   handlers.

       setTwigHandler ($exp, $handler)
	   Set a single Twig handler for elements matching $exp. $handler is
	   a reference to a subroutine. If the handler was previously set
	   then the reference to the previous handler is returned.
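
	   A short sketch of installing a handler after the twig has been
	   created; the sample document and the @titles array are
	   assumptions made for the example.

```perl
use strict;
use warnings;
use XML::Twig;

my @titles;
my $twig = XML::Twig->new;

# the handler receives the twig and the element that was just closed
$twig->setTwigHandler(
    title => sub {
        my ( $t, $elt ) = @_;
        push @titles, $elt->text;
    }
);

$twig->parse('<doc><title>One</title><section><title>Two</title></section></doc>');
print join( ', ', @titles ), "\n";
```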

       setStartTagHandlers ($handlers)
	   Set the start_tag handlers. $handlers is a reference to a hash
	   similar to the one in the "start_tag_handlers" option of new. All
	   previous handlers are unset.	 The method returns the reference to
	   the previous handlers.

       setStartTagHandler ($exp, $handler)
	   Set a single start_tag handler for elements matching $exp.
	   $handler is a reference to a subroutine. If the handler was
	   previously set then the reference to the previous handler is
	   returned.

       setEndTagHandlers ($handlers)
	   Set the EndTag handlers. $handlers is a reference to a hash simi-
	   lar to the one in the "end_tag_handlers" option of new. All previ-
	   ous handlers are unset.  The method returns the reference to the
	   previous handlers.

       setEndTagHandler ($exp, $handler)
	   Set a single EndTag handler for elements matching $exp. $handler
	   is a reference to a subroutine. If the handler was previously set
	   then the reference to the previous handler is returned.

       dtd Return the dtd (an XML::Twig::DTD object) of a twig

       root
	   Return the root element of a twig

       set_root ($elt)
	   Set the root of a twig

       first_elt ($optional_condition)
	   Return the first element matching $optional_condition of a twig,
	   if no condition is given then the root is returned

       elt_id	     ($id)
	   Return the element whose "id" attribute is $id
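
	   For instance, assuming a made-up document that uses the default
	   ’id’ attribute, an element can be fetched directly:

```perl
use strict;
use warnings;
use XML::Twig;

# 'id' is the default name of the ID attribute
my $twig = XML::Twig->new;
$twig->parse('<doc><item id="i1">first</item><item id="i2">second</item></doc>');

my $item = $twig->elt_id('i2');
print $item->text, "\n";
```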

       encoding
	   This method returns the encoding of the XML document, as defined
	   by the "encoding" attribute in the XML declaration (i.e. it is
	   "undef" if the attribute is not defined)

       set_encoding
	   This method sets the value of the "encoding" attribute in the XML
	   declaration.	 Note that if the document did not have a declaration
	   it is generated (with an XML version of 1.0)

       xml_version
	   This method returns the XML version, as defined by the "version"
	   attribute in the XML declaration (i.e. it is "undef" if the
	   attribute is not defined)

       set_xml_version
	   This method sets the value of the "version" attribute in the XML
	   declaration.	 If the declaration did not exist it is created.

       standalone
	   This method returns the value of the "standalone" declaration for
	   the document

       set_standalone
	   This method sets the value of the "standalone" attribute in the
	   XML declaration.  Note that if the document did not have a decla-
	   ration it is generated (with an XML version of 1.0)

       set_doctype ($name, $system, $public, $internal)
	   Set the doctype of the document. If an argument is "undef" (or not
	   present) then its former value is retained; if a false (’’ or 0)
	   value is passed then the former value is deleted.

       entity_list
	   Return the entity list of a twig

       entity_names
	   Return the list of all defined entities

       entity ($entity_name)
	   Return the entity

       change_gi      ($old_gi, $new_gi)
	   Performs a (very fast) global change. All elements $old_gi are now
	   $new_gi.

	   See "BUGS".

       flush		($optional_filehandle, $options)
	   Flushes a twig up to (and including) the current element, then
	   deletes all unnecessary elements from the tree that’s kept in mem-
	   ory.	 "flush" keeps track of which elements need to be
	   open/closed, so if you flush from handlers you don’t have to worry
	   about anything. Just keep flushing the twig every time you’re done
	   with a sub-tree and it will come out well-formed. Once parsing is
	   complete, don’t forget to "flush" one more time to print the end
	   of the document.  The doctype and entity declarations are also
	   printed.

	   "flush" takes an optional filehandle as an argument.

	   options: use the "update_DTD" option if you have updated the
	   (internal) DTD and/or the entity list and you want the updated DTD
	   to be output

	   The "pretty_print" option sets the pretty printing of the docu-
	   ment.

	      Example: $t->flush( Update_DTD => 1);
		       $t->flush( \*FILE, Update_DTD => 1);
		       $t->flush( \*FILE);
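
	   A sketch of flushing from a handler, assuming a small made-up
	   document and an in-memory filehandle ($out, writing to $output)
	   standing in for a real output file:

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = '<doc><section><title>A</title><para>one</para></section>'
        . '<section><title>B</title><para>two</para></section></doc>';

# in-memory filehandle used here instead of a real file
my $output = '';
open my $out, '>', \$output or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    twig_handlers => {
        title   => sub { $_->set_gi('h2') },       # rename title to h2
        section => sub { $_[0]->flush($out) },     # output and free each section
    },
);
$twig->parse($xml);
$twig->flush($out);    # print the end of the document
close $out;

print $output, "\n";
```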

       flush_up_to ($elt, $optional_filehandle, %options)
	   Flushes up to the $elt element. This allows you to keep part of
	   the tree in memory when you "flush".

	   options: see flush.

       purge
	   Does the same as a "flush" except it does not print the twig. It
	   just deletes all elements that have been completely parsed so far.

       purge_up_to ($elt)
	   Purges up to the $elt element. This allows you to keep part of the
	   tree in memory when you "purge".
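
	   When only information is extracted and nothing needs to be
	   output, purging from a handler keeps memory usage low. A
	   minimal sketch, with a made-up log document and counter:

```perl
use strict;
use warnings;
use XML::Twig;

my $ok_count = 0;
my $twig = XML::Twig->new(
    twig_handlers => {
        record => sub {
            my ( $t, $elt ) = @_;
            $ok_count++ if $elt->att('status') eq 'ok';
            $t->purge;    # discard what has been parsed so far, print nothing
        },
    },
);
$twig->parse('<log><record status="ok"/><record status="failed"/><record status="ok"/></log>');

print "records with status ok: $ok_count\n";
```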

       print		($optional_filehandle, %options)
	   Prints the whole document associated with the twig. To be used
	   only AFTER the parse.

	   options: see "flush".

       sprint
	   Return the text of the whole document associated with the twig. To
	   be used only AFTER the parse.

	   options: see "flush".

       ignore
	   This method can only be called in "start_tag_handlers". It causes
	   the element to be skipped during the parsing: the twig is not
	   built for this element, it will not be accessible during parsing
	   or after it. The element will not take up any memory and parsing
	   will be faster.

	   Note that this method can also be called on an element. If the
	   element is a parent of the current element then this element will
	   be ignored (the twig will not be built any more for it and what
	   has already been built will be deleted)
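
	   A sketch of skipping elements at parse time, assuming a made-up
	   document in which "hidden" elements should never be loaded:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new(
    # called when the start tag is found, before the content is parsed
    start_tag_handlers => { hidden => sub { $_[0]->ignore } },
);
$twig->parse(
    '<doc><para>kept</para><hidden><para>secret</para></hidden><para>also kept</para></doc>'
);

my $result = $twig->root->sprint;
print "$result\n";
```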

       set_pretty_print	 ($style)
	   Set the pretty print method, amongst ’"none"’ (default),
	   ’"nsgmls"’, ’"nice"’, ’"indented"’, ’"record"’ and ’"record_c"’

	   WARNING: the pretty print style is a GLOBAL variable, so once set
	   it’s applied to ALL "print"’s (and "sprint"’s). Same goes if you
	   use XML::Twig with "mod_perl". This should not be a problem as
	   the XML that’s generated is valid anyway, and XML processors (as
	   well as HTML processors, including browsers) should not care. Let
	   me know if this is a big problem, but at the moment the perfor-
	   mance/cleanliness trade-off clearly favors the global approach.

       set_empty_tag_style  ($style)
	   Set the empty tag display style (’"normal"’, ’"html"’ or
	   ’"expand"’). As with "set_pretty_print" this sets a global flag.

	   "normal" outputs an empty tag ’"<tag/>"’, "html" adds a space
	   ’"<tag />"’ and "expand" outputs ’"<tag></tag>"’
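
	   The three styles can be compared on a one-element document.
	   This is only a sketch; the sample document and the %rendered
	   hash are made up, and the style is reset at the end because the
	   flag is global.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><br/></doc>');

my %rendered;
for my $style (qw(normal html expand)) {
    $twig->set_empty_tag_style($style);
    $rendered{$style} = $twig->root->sprint;
    print "$style: $rendered{$style}\n";
}

# the style is a global flag, so restore the default
$twig->set_empty_tag_style('normal');
```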

       print_prolog	($optional_filehandle, %options)
	   Prints the prolog (XML declaration + DTD + entity declarations) of
	   a document.

	   options: see "flush".

       prolog	  ($optional_filehandle, %options)
	   Return the prolog (XML declaration + DTD + entity declarations) of
	   a document.

	   options: see "flush".

       finish
	   Call Expat "finish" method.	Unsets all handlers (including inter-
	   nal ones that set context), but expat continues parsing to the end
	   of the document or until it finds an error.	It should finish up a
	   lot faster than with the handlers set.

       finish_print
	   Stop twig processing, flush the twig and proceed to finish print-
	   ing the document as fast as possible. Use this method when modify-
	   ing a document and the modification is done.

       Methods inherited from XML::Parser::Expat
	   A twig inherits all the relevant methods from XML::Parser::Expat.
	   These methods can only be used during the parsing phase (they will
	   generate a fatal error otherwise).

	   Inherited methods are:

	     depth in_element within_element context
	     current_line current_column current_byte position_in_context
	     base current_element element_index
	     recognized_string original_string
	     xpcroak xpcarp
	     xml_escape (this one is broken on some versions of expat/XML::Parser)

       path ($gi)
	   Return the element context in a form similar to XPath’s short
	   form: ’"/root/gi1/../gi"’

       get_xpath  ( $optional_array_ref, $xpath, $optional_offset)
	   Performs a "get_xpath" on the document root (see "XML::Twig::Elt")

	   If the $optional_array_ref argument is used the array must contain
	   elements. The $xpath expression is applied to each element in turn
	   and the result is the union of all results. This way a first query
	   can be refined in further steps.
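
	   A small sketch of a query on the whole document; the sample
	   document and the attribute values are made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
    '<doc><item type="a">1</item><item type="b">2</item>'
  . '<group><item type="a">3</item></group></doc>'
);

# all item elements, at any depth, whose type attribute is "a"
my @items = $twig->get_xpath('//item[@type="a"]');
my @texts = map { $_->text } @items;

print join( ', ', @texts ), "\n";
```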

       find_nodes ( $optional_array_ref, $xpath, $optional_offset)
	   same as "get_xpath"

       findnodes ( $optional_array_ref, $xpath, $optional_offset)
	   same as "get_xpath" (similar to the XML::LibXML method)

       findvalue ( $optional_array_ref, $xpath, $optional_offset)
	   Return the "join" of all texts of the results of applying
	   "get_xpath" to the node (similar to the XML::LibXML method)

       subs_text ($regexp, $replace)
	   subs_text does text substitution on the whole document, similar to
	   perl’s "s///" operator.
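
	   For example, assuming a made-up document, a spelling change can
	   be applied to every text in one call:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p>colour and flavour</p><p>no change</p></doc>');

# replace "our" at the end of a word with "or" in all texts
$twig->subs_text( qr/our\b/, 'or' );

my $result = $twig->root->sprint;
print "$result\n";
```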

       dispose
	   Useful only if you don’t have "Scalar::Util" or "WeakRef"
	   installed.

	   Reclaims properly the memory used by an XML::Twig object. As the
	   object has circular references it never goes out of scope, so if
	   you want to parse lots of XML documents then the memory leak
	   becomes a problem. Use "$twig->dispose" to clear this problem.

       XML::Twig::Elt


       new	    ($optional_gi, $optional_atts, @optional_content)
	   The "gi" is optional (but then you can’t have any content), the
	   $optional_atts argument is a reference to a hash of attributes,
	   the content can be just a string or a list of strings and elements.
	   A content of ’"#EMPTY"’ creates an empty element.

	    Examples: my $elt= XML::Twig::Elt->new();
		      my $elt= XML::Twig::Elt->new( para => { align => 'center' });
		      my $elt= XML::Twig::Elt->new( para => { align => 'center' }, 'foo');
		      my $elt= XML::Twig::Elt->new( br	 => '#EMPTY');
		      my $elt= XML::Twig::Elt->new( 'para');
		      my $elt= XML::Twig::Elt->new( para => 'this is a para');
		      my $elt= XML::Twig::Elt->new( para => $elt3, 'another para');

	   The strings are not parsed, the element is not attached to any
	   twig.

	   WARNING: if you rely on ID’s then you will have to set the id
	   yourself. At this point the element does not belong to a twig yet,
	   so the ID attribute is not known and the element won’t be stored
	   in the ID list.

	   Note that "#COMMENT", "#PCDATA" or "#CDATA" are valid tag names,
	   that will create text elements.

	   To create an element "foo" containing a CDATA section:

		      my $foo= XML::Twig::Elt->new( '#CDATA' => "content of the CDATA section")
					     ->wrap_in( 'foo');

       parse	     ($string, %args)
	   Creates an element from an XML string. The string is actually
	   parsed as a new twig, then the root of that twig is returned.  The
	   arguments in %args are passed to the twig.  As always if the parse
	   fails the parser will die, so use an eval if you want to trap syn-
	   tax errors.

	   As obviously the element does not exist beforehand this method has
	   to be called on the class:

	     my $elt= parse XML::Twig::Elt( "<a> string to parse, with <sub/>
					     <elements>, actually tons of </elements>
			     h</a>");

       print	     ($optional_filehandle, $optional_pretty_print_style)
	   Prints an entire element, including the tags, optionally to a
	   $optional_filehandle, optionally with a
	   $optional_pretty_print_style.

	   The print outputs XML data so base entities are escaped.

       sprint	    ($elt, $optional_no_enclosing_tag)
	   Return the xml string for an entire element, including the tags.
	   If the optional second argument is true then only the string
	   inside the element is returned (the start and end tag for $elt are
	   not).  The text is XML-escaped: base entities (& and < in text, &
	   < and " in attribute values) are turned into entities.

       gi  Return the gi of the element (the gi is the "generic identifier",
	   the tag name in SGML parlance).

	   "tag" and "name" are synonyms of "gi".

       tag Same as "gi"

       name
	   Same as "gi"

       set_gi	      ($gi)
	   Set the gi (tag) of an element

       set_tag	      ($tag)
	   Set the tag (="gi") of an element

       set_name	      ($name)
	   Set the name (="gi") of an element

       root
	   Return the root of the twig in which the element is contained.

       twig
	   Return the twig containing the element.

       parent	     ($optional_condition)
	   Return the parent of the element, or the first ancestor matching
	   the $optional_condition

       first_child   ($optional_condition)
	   Return the first child of the element, or the first child matching
	   the $optional_condition

       has_child ($optional_condition)
	   Return the first child of the element, or the first child matching
	   the $optional_condition (same as first_child)

       has_children ($optional_condition)
	   Return the first child of the element, or the first child matching
	   the $optional_condition (same as first_child)

       first_child_text	  ($optional_condition)
	   Return the text of the first child of the element, or of the first
	   child matching the $optional_condition. If there is no such child
	   then it returns ’’. This avoids getting the child, checking for
	   its existence, then getting the text for trivial cases.

	   Similar methods are available for the other navigation methods:
	   "last_child_text", "prev_sibling_text", "next_sibling_text",
	   "prev_elt_text", "next_elt_text", "child_text", "parent_text"

	   All these methods also exist in a "trimmed" variant:
	   "last_child_trimmed_text", "prev_sibling_trimmed_text",
	   "next_sibling_trimmed_text", "prev_elt_trimmed_text",
	   "next_elt_trimmed_text", "child_trimmed_text",
	   "parent_trimmed_text"

       field	     ($optional_condition)
	   Same method as "first_child_text" with a different name

       trimmed_field	     ($optional_condition)
	   Same method as "first_child_trimmed_text" with a different name
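
	   A sketch of the text shortcuts on a record-like element; the
	   sample document and field names are made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<person><name>  Ada Lovelace </name><born>1815</born></person>');
my $person = $twig->root;

my $born    = $person->first_child_text('born');    # text of the first born child
my $name    = $person->trimmed_field('name');       # leading/trailing spaces removed
my $missing = $person->field('email');              # no such child: empty string

print "name: [$name] born: [$born] email: [$missing]\n";
```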

       first_child_matches   ($optional_condition)
	   Return the element if the first child of the element (if it
	   exists) passes the $optional_condition, "undef" otherwise

	     if( $elt->first_child_matches( 'title')) ...

	   is equivalent to

	     if( $elt->{first_child} && $elt->{first_child}->passes( 'title'))

	   "first_child_is" is another name for this method

	   Similar methods are available for the other navigation methods:
	   "last_child_matches", "prev_sibling_matches",
	   "next_sibling_matches", "prev_elt_matches", "next_elt_matches",
	   "child_matches", "parent_matches"

       is_first_child ($optional_condition)
	   returns true (the element) if the element is the first child of
	   its parent (optionally that satisfies the $optional_condition)

       is_last_child ($optional_condition)
	   returns true (the element) if the element is the last child of
	   its parent (optionally that satisfies the $optional_condition)

       prev_sibling  ($optional_condition)
	   Return the previous sibling of the element, or the previous sib-
	   ling matching $optional_condition

       next_sibling  ($optional_condition)
	   Return the next sibling of the element, or the first one matching
	   $optional_condition.

       next_elt	    ($optional_elt, $optional_condition)
	   Return the next elt (optionally matching $optional_condition) of
	   the element. This is defined as the next element which opens after
	   the current element opens, which usually means the first child of
	   the element.  Counter-intuitive as it might look, this allows you
	   to loop through the whole document by starting from the root.

	   The $optional_elt is the root of a subtree. When the "next_elt" is
	   out of the subtree then the method returns undef. You can then
	   walk a sub tree with:

	     my $elt= $subtree_root;
	     while( $elt= $elt->next_elt( $subtree_root))
	       { # insert processing code here
	       }
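
	   A complete version of the loop above, run on a made-up document
	   so that the visiting order can be seen; the @visited array is
	   only there for the demonstration.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><a><b/><c><d/></c></a><e/></doc>');

# walk the subtree rooted at a, in document order
my $subtree_root = $twig->root->first_child('a');
my @visited;
my $elt = $subtree_root;
while ( $elt = $elt->next_elt($subtree_root) ) {
    push @visited, $elt->gi;
}

print join( ' ', @visited ), "\n";    # e is outside the subtree and is not visited
```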

       prev_elt	    ($optional_condition)
	   Return the previous elt (optionally matching $optional_condition)
	   of the element. This is the first element which opens before the
	   current one.	 It is usually either the last descendant of the pre-
	   vious sibling or simply the parent

       children	    ($optional_condition)
	   Return the list of children of the element (optionally only those
	   matching $optional_condition). The list is in document order.

       children_count ($optional_condition)
	   Return the number of children of the element (optionally only
	   those matching $optional_condition)

       children_text ($optional_condition)
	   Return an array containing the text of the children of the element
	   (optionally only those matching $optional_condition)

       children_copy ($optional_condition)
	   Return a list of elements that are copies of the children of the
	   element (optionally only those matching $optional_condition)

       descendants     ($optional_condition)
	   Return the list of all descendants of the element (optionally only
	   those matching $optional_condition). This is the equivalent of the
	   "getElementsByTagName" of the DOM (by the way, if you are really a
	   DOM addict, you can use "getElementsByTagName" instead)

       descendants_or_self ($optional_condition)
	   Same as "descendants" except that the element itself is included
	   in the list if it matches the $optional_condition
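
	   The difference between children and descendants can be sketched
	   on a made-up document with one nested paragraph:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><sec><p>1</p><note><p>2</p></note><p>3</p></sec></doc>');
my $sec = $twig->root->first_child('sec');

my @child_p = map { $_->text } $sec->children('p');       # direct children only
my @all_p   = map { $_->text } $sec->descendants('p');    # any depth
my $count   = $sec->children_count;                       # p, note, p

print "children: @child_p, descendants: @all_p, child count: $count\n";
```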

       ancestors    ($optional_condition)
	   Return the list of ancestors (optionally matching $optional_condi-
	   tion) of the element.  The list is ordered from the innermost
	   ancestor to the outermost one

	   NOTE: the element itself is not part of the list, in order to
	   include it you will have to use ancestors_or_self

       ancestors_or_self     ($optional_condition)
	   Return the list of ancestors (optionally matching $optional_condi-
	   tion) of the element, including the element (if it matches the
	   condition).  The list is ordered from the innermost ancestor to
	   the outermost one

       att	    ($att)
	   Return the value of attribute $att or "undef"

       set_att	    ($att, $att_value)
	   Set the attribute of the element to the given value

	   You can actually set several attributes this way:

	     $elt->set_att( att1 => "val1", att2 => "val2");

       del_att	    ($att)
	   Delete the attribute for the element

	   You can actually delete several attributes at once:

	     $elt->del_att( 'att1', 'att2', 'att3');
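
	   The three attribute methods together, on a stand-alone element
	   created just for this sketch (attribute names are made up):

```perl
use strict;
use warnings;
use XML::Twig;

my $elt = XML::Twig::Elt->new(
    para => { align => 'center', class => 'intro' },
    'some text'
);

$elt->set_att( align => 'left', lang => 'en' );    # change one, add one
$elt->del_att('class');                            # remove one

my $align = $elt->att('align');
my $class = $elt->att('class');    # undef, the attribute is gone

print "align: $align, class: ", ( defined $class ? $class : 'undef' ), "\n";
```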

       cut Cut the element from the tree. The element still exists, it can be
	   copied or pasted somewhere else, it is just not attached to the
	   tree anymore.

       cut_children ($optional_condition)
	   Cut all the children of the element (or all of those which satisfy
	   the $optional_condition).

	   Return the list of children

       copy	   ($elt)
	   Return a copy of the element. The copy is a "deep" copy: all sub
	   elements of the element are duplicated.

       paste	   ($optional_position, $ref)
	   Paste a (previously "cut" or newly generated) element. Die if the
	   element already belongs to a tree.

	   Position options:

	   first_child (default)
	       The element is pasted as the first child of the element object
	       this method is called on.

	   last_child
	       The element is pasted as the last child of the element object
	       this method is called on.

	   before
	       The element is pasted before the element object, as its previ-
	       ous sibling.

	   after
	       The element is pasted after the element object, as its next
	       sibling.

	   within
	       In this case an extra argument, $offset, should be supplied.
	       The element will be pasted in the reference element (or in its
	       first text child) at the given offset. To achieve this the
	       reference element will be split at the offset.

       move	  ($optional_position, $ref)
	   Move an element in the tree.	 This is just a "cut" then a "paste".
	   The syntax is the same as "paste".
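
	   A sketch combining "cut", "paste" and "move" on a made-up
	   document; the comments show the order of the children after
	   each step.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><a>1</a><b>2</b><c>3</c></doc>');
my $root = $twig->root;

my $first = $root->first_child('a');
$first->cut;                                        # b c
$first->paste( last_child => $root );               # b c a

my $new = XML::Twig::Elt->new( d => 'new' );
$new->paste( before => $root->first_child('c') );   # b d c a

$root->first_child('b')->move( after => $new );     # d b c a

my $result = $root->sprint;
print "$result\n";
```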

       replace	     ($ref)
	   Replaces an element in the tree. Sometimes it is just not possible
	   to "cut" an element then "paste" another in its place, so
	   "replace" comes in handy.  The calling element replaces $ref.

       replace_with   (@elts)
	   Replaces the calling element with one or more elements

       delete
	   Cuts the element and frees its memory.

       prefix	    ($text, $optional_option)
	   Add a prefix to an element. If the element is a "PCDATA" element
	   the text is added to the pcdata, if the element’s first child is a
	   "PCDATA" then the text is added to its pcdata, otherwise a new
	   "PCDATA" element is created and pasted as the first child of the
	   element.

	   If the option is "asis" then the prefix is added asis: it is cre-
	   ated in a separate "PCDATA" element with an "asis" property. You
	   can then write:

	     $elt1->prefix( '<b>', 'asis');

	   to create a "<b>" in the output of "print".

       suffix	    ($text, $optional_option)
	   Add a suffix to an element. If the element is a "PCDATA" element
	   the text is added to the pcdata, if the element’s last child is a
	   "PCDATA" then the text is added to its pcdata, otherwise a new
	   PCDATA element is created and pasted as the last child of the ele-
	   ment.

	   If the option is "asis" then the suffix is added asis: it is cre-
	   ated in a separate "PCDATA" element with an "asis" property. You
	   can then write:

	     $elt2->suffix( '</b>', 'asis');
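
	   Both forms can be sketched on a one-paragraph made-up document:
	   plain text is merged into the existing text, while "asis" text
	   is output without escaping.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p>text</p></doc>');
my $p = $twig->root->first_child('p');

# plain prefix and suffix are added to the text of the element
$p->prefix('Note: ');
$p->suffix(' (end)');
my $plain = $p->sprint;

# asis prefix and suffix are printed as markup, not escaped
$p->prefix( '<b>',  'asis' );
$p->suffix( '</b>', 'asis' );
my $marked = $p->sprint;

print "$plain\n$marked\n";
```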

       simplify (%options)
	   Return a data structure suspiciously similar to XML::Simple’s.
	   Options are identical to XMLin options, see XML::Simple doc for
	   more details (or use Data::Dumper or YAML to dump the data
	   structure)

	   keyattr
	   forcearray
	   noattr
	   content_key
	   variables (%var_hash)
	       %var_hash is a hash { name => value }

	       This option allows variables in the XML to be expanded when
	       the file is read. (there is no facility for putting the vari-
	       able names back if you regenerate XML using XMLout).

	       A ’variable’ is any text of the form ${name} (or $name) which
	       occurs in an attribute value or in the text content of an ele-
	       ment. If ’name’ matches a key in the supplied hashref, ${name}
	       will be replaced with the corresponding value from the
	       hashref. If no matching key is found, the variable will not be
	       replaced.

	   var ($attribute_name)
	       This option gives the name of an attribute that will be used
	       to create variables in the XML:

		 <dirs>
		   <dir name="prefix">/usr/local</dir>
		   <dir name="exec_prefix">$prefix/bin</dir>
		 </dirs>

	       use "var => ’name’" to get $prefix replaced by /usr/local in
	       the generated data structure

	       By default variables are captured by the following regexp:
	       /\$(\w+)/

	   var_regexp (regexp)
	       This option changes the regexp used to capture variables. The
	       variable name should be in $1

	   erase ([<tag1>, <tag2>...])
	       Option used to simplify the structure: the elements listed
	       are left out of the result, and their children are treated as
	       children of the erased element’s parent.

	       If the element is:

		 <config host="laptop.xmltwig.com">
		   <server>localhost</server>
		   <dirs>
		     <dir name="base">/home/mrodrigu/standards</dir>
		     <dir name="tools">$base/tools</dir>
		   </dirs>
		   <templates>
		     <template name="std_def">std_def.templ</template>
		     <template name="dummy">dummy</template>
		   </templates>
		 </config>

	       Then calling simplify with "erase => [ ’dirs’, ’templates’]"
	       makes the data structure be exactly as if the start and end
	       tags for "dirs" and "templates" were not there.

	       A YAML dump of the structure:

		 base: ’/home/mrodrigu/standards’
		 host: laptop.xmltwig.com
		 server: localhost
		 template:
		   - std_def.templ
		   - dummy
		 tools: ’$base/tools’

       split_at	       ($offset)
	   Split a text ("PCDATA" or "CDATA") element in 2 at $offset, the
	   original element now holds the first part of the string and a new
	   element holds the right part. The new element is returned

	   If the element is not a text element then the first text child of
	   the element is split
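
	   A minimal sketch of "split_at", called directly on a text element
	   (the document and variable names are assumptions made for the
	   example):

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical one-element document.
my $twig = XML::Twig->new;
$twig->parse('<p>hello world</p>');

# Work on the PCDATA child itself, so that it is the text that is split.
my $left  = $twig->root->first_child('#PCDATA');
my $right = $left->split_at(5);

# The original element keeps the first 5 characters, the new one the rest.
my @kids = $twig->root->children;
printf "left=[%s] right=[%s] children=%d\n",
       $left->text, $right->text, scalar @kids;
```

	   The serialized document does not change, only the number of text
	   elements holding the string does.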

       split	    ( $optional_regexp, $optional_tag,
       $optional_attribute_ref)
	   Split the text descendants of an element in place, using the
	   regexp. If the regexp includes () then the matched separators
	   are wrapped in $optional_tag, with $optional_attribute_ref
	   attributes.

	   if $elt is "<p>tati tata <b>tutu tati titi</b> tata tati tata</p>"

	     $elt->split( qr/(ta)ti/, ’foo’, {type => ’toto’} )

	   will change $elt to

	     <p><foo type="toto">ta</foo> tata <b>tutu <foo type="toto">ta</foo>
		 titi</b> tata <foo type="toto">ta</foo> tata</p>

	   The regexp can be passed either as a string or as "qr//" (perl
	   5.005 and later), it defaults to \s+ just as the "split" built-in
	   (but this would be quite a useless behaviour without the
	   $optional_tag parameter)

	   $optional_tag defaults to PCDATA or CDATA, depending on the ini-
	   tial element type

	   The list of descendants is returned (including un-touched original
	   elements and newly created ones)
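
	   The example above can be turned into a small runnable script. This
	   is only a sketch: it reuses the sample paragraph and simply counts
	   the wrapped separators in the serialized result.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<p>tati tata <b>tutu tati titi</b> tata tati tata</p>');
my $p = $twig->root;

# The capturing group (ta) is what gets wrapped in <foo type="toto">.
my @parts = $p->split( qr/(ta)ti/, 'foo', { type => 'toto' } );

my $split_out = $p->sprint;
my $wrapped   = () = $split_out =~ m{<foo type="toto">ta</foo>}g;

print "$split_out\n";
print "wrapped separators: $wrapped\n";
```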

       mark	   ( $regexp, $optional_tag, $optional_attribute_ref)
	   This method behaves exactly as split, except only the newly cre-
	   ated elements are returned

       wrap_children ( $regexp_string, $tag, $optional_att, $optional_value)
	   Wrap the children of the element that match the regexp in an ele-
	   ment $tag.  If $optional_att and $optional_value are passed then
	   the new element will have an attribute $optional_att with a value
	   $optional_value.

	   Note that elements might get extra "id" attributes in the process.
	   See add_id.	Use strip_att to remove unwanted id’s.

	   Here is an example:

	   If the element $elt has the following content:

	     <elt>
	      <p>para 1</p>
	      <l_l1_1>list 1 item 1 para 1</l_l1_1>
		<l_l1>list 1 item 1 para 2</l_l1>
	      <l_l1_n>list 1 item 2 para 1 (only para)</l_l1_n>
	      <l_l1_n>list 1 item 3 para 1</l_l1_n>
		<l_l1>list 1 item 3 para 2</l_l1>
		<l_l1>list 1 item 3 para 3</l_l1>
	      <l_l1_1>list 2 item 1 para 1</l_l1_1>
		<l_l1>list 2 item 1 para 2</l_l1>
	      <l_l1_n>list 2 item 2 para 1 (only para)</l_l1_n>
	      <l_l1_n>list 2 item 3 para 1</l_l1_n>
		<l_l1>list 2 item 3 para 2</l_l1>
		<l_l1>list 2 item 3 para 3</l_l1>
	     </elt>

	   Then the code

	     $elt->wrap_children( q{<l_l1_1><l_l1>*} , li => { type => "ul1" });
	     $elt->wrap_children( q{<l_l1_n><l_l1>*} , li => { type => "ul" });

	     $elt->wrap_children( q{<li type="ul1"><li type="ul">+}, "ul");
	     $elt->strip_att( ’id’);
	     $elt->strip_att( ’type’);
	     $elt->print;

	   will output:

	     <elt>
		<p>para 1</p>
		<ul>
		  <li>
		    <l_l1_1>list 1 item 1 para 1</l_l1_1>
		    <l_l1>list 1 item 1 para 2</l_l1>
		  </li>
		  <li>
		    <l_l1_n>list 1 item 2 para 1 (only para)</l_l1_n>
		  </li>
		  <li>
		    <l_l1_n>list 1 item 3 para 1</l_l1_n>
		    <l_l1>list 1 item 3 para 2</l_l1>
		    <l_l1>list 1 item 3 para 3</l_l1>
		  </li>
		</ul>
		<ul>
		  <li>
		    <l_l1_1>list 2 item 1 para 1</l_l1_1>
		    <l_l1>list 2 item 1 para 2</l_l1>
		  </li>
		  <li>
		    <l_l1_n>list 2 item 2 para 1 (only para)</l_l1_n>
		  </li>
		  <li>
		    <l_l1_n>list 2 item 3 para 1</l_l1_n>
		    <l_l1>list 2 item 3 para 2</l_l1>
		    <l_l1>list 2 item 3 para 3</l_l1>
		  </li>
		</ul>
	     </elt>

       subs_text ($regexp, $replace)
	   subs_text does text substitution, similar to perl’s "s///"
	   operator.

	   $regexp must be a perl regexp, created with the "qr" operator.

	   $replace can include "$1, $2"... from the $regexp. It can also be
	   used to create elements and entities, by using "&elt( tag => { att
	   => val }, text)" (similar syntax as "new") and "&ent( name)".

	   Here is a rather complex example:

	     $elt->subs_text( qr{(?<!do not )link to (http://([^\s,]*))},
			      ’see &elt( a =>{ href => $1 }, $2)’
			    );

	   This will replace text like link to http://www.xmltwig.com by see
	   <a href="http://www.xmltwig.com">www.xmltwig.com</a>, but not do
	   not link to...

	   Generating entities (here replacing spaces with &nbsp;):

	     $elt->subs_text( qr{ }, ’&ent( "&nbsp;")’);

	   or, using a variable:

	     my $ent="&nbsp;";
	     $elt->subs_text( qr{ }, "&ent( ’$ent’)");

	   Note that the substitution is always global, as in using the "g"
	   modifier in a perl substitution, and that it is performed on all
	   text descendants of the element.
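
	   Here is a sketch of the link example as a complete script. The
	   input paragraph is made up; the regexp and the replacement are the
	   ones shown above.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical paragraph with one link to convert and one to leave alone.
my $twig = XML::Twig->new;
$twig->parse(
  '<p>link to http://www.xmltwig.com, but do not link to http://example.com</p>'
);
my $p = $twig->root;

# $1 is the whole URL, $2 is the URL without the "http://" part.
$p->subs_text( qr{(?<!do not )link to (http://([^\s,]*))},
               'see &elt( a =>{ href => $1 }, $2)'
             );

my $subs_out = $p->sprint;
print "$subs_out\n";
```

	   Note that the replacement is a single-quoted string, so that $1
	   and $2 are passed to subs_text instead of being interpolated by
	   perl.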

       add_id
	   Add an id to the element.

	   The id is an attribute that allows you to use the elt_id method to
	   get the element easily. The attribute is "id" by default, see the
	   "id" option for XML::Twig "new" to change it. Use an id starting
	   with "#" to get an id that’s not output by print, flush or sprint.

       strip_att ($att)
	   Remove the attribute $att from all descendants of the element
	   (including the element)

       change_att_name ($old_name, $new_name)
	   Change the name of the attribute from $old_name to $new_name. If
	   there is no attribute $old_name nothing happens.
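
	   A short sketch of "strip_att" and "change_att_name" together, on a
	   made-up document:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document with id attributes to remove.
my $twig = XML::Twig->new;
$twig->parse('<doc id="d1"><p id="p1" class="intro">text</p></doc>');
my $root = $twig->root;

# Remove "id" from the element and from all of its descendants.
$root->strip_att('id');

# Rename the "class" attribute of the paragraph to "type".
$root->first_child('p')->change_att_name( class => 'type' );

my $att_out = $root->sprint;
print "$att_out\n";    # <doc><p type="intro">text</p></doc>
```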

       sort_children_on_value( %options)
	   Sort the children of the element in place according to their text.
	   All children are sorted.

	   Return the element, with its children sorted.

	   %options are

	     type  : numeric |	alpha	  (default: alpha)
	     order : normal  |	reverse	  (default: normal)

       sort_children_on_att ($att, %options)
	   Sort the children of the  element in place according to attribute
	   $att.  %options are the same as for "sort_children_on_value"

	   Return the element.

       sort_children_on_field ($gi, %options)
	   Sort the children of the element in place, according to the field
	   $gi (the text of the first child of the child with this gi).
	   %options are the same as for "sort_children_on_value".

	   Return the element, with its children sorted

       sort_children( $get_key, %options)
	   Sort the children of the element in place. The $get_key argument
	   is a reference to a function that returns the sort key when passed
	   an element.

	   For example:

	     $elt->sort_children( sub { $_[0]->att( "nb") + $_[0]->text },
				  type => ’numeric’, order => ’reverse’
				);
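
	   The sort methods can be tried on a small list. This is a sketch
	   only; the "list" and "item" elements and the "n" attribute are
	   invented for the example.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
  '<list><item n="3">pear</item><item n="10">apple</item><item n="2">fig</item></list>'
);
my $list = $twig->root;

# Default options: alphabetical sort, normal order, on the children text.
$list->sort_children_on_value;
my $by_text = join ',', map { $_->text } $list->children;

# Numeric sort on the "n" attribute, largest value first.
$list->sort_children_on_att( 'n', type => 'numeric', order => 'reverse' );
my $by_att = join ',', map { $_->att('n') } $list->children;

print "$by_text\n";    # apple,fig,pear
print "$by_att\n";     # 10,3,2
```

	   With the default "alpha" type the attribute values would be
	   compared as strings, which is why "numeric" is requested here.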

       field_to_att ($cond, $att)
	   Turn the text of the first sub-element matched by $cond into the
	   value of attribute $att of the element. If $att is omitted then
	   $cond is used as the name of the attribute, which makes sense only
	   if $cond is a valid element (and attribute) name.

	   The sub-element is then cut.

       att_to_field ($att, $gi)
	   Take the value of attribute $att and create a sub-element $gi as
	   first child of the element. If $gi is omitted then $att is used as
	   the name of the sub-element.
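
	   The two methods are the reverse of each other, as this sketch on a
	   made-up "person" record shows:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<person><name>Ann</name><age>30</age></person>');
my $person = $twig->root;

# The <name> child is cut and its text becomes the "name" attribute.
$person->field_to_att('name');
my $as_att = $person->sprint;
print "$as_att\n";      # <person name="Ann"><age>30</age></person>

# The attribute is removed and a <name> element becomes the first child.
$person->att_to_field('name');
my $as_field = $person->sprint;
print "$as_field\n";    # <person><name>Ann</name><age>30</age></person>
```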

       get_xpath  ($xpath, $optional_offset)
	   Return a list of elements satisfying the $xpath. $xpath is an
	   XPATH-like expression.

	   A subset of the XPATH abbreviated syntax is covered:

	     gi
	     gi[1] (or any other positive number)
	     gi[last()]
	     gi[@att] (the attribute exists for the element)
	     gi[@att="val"]
	     gi[@att=~ /regexp/]
	     gi[@att1="val1" and @att2="val2"]
	     gi[@att1="val1" or @att2="val2"]
	     gi[string()="toto"] (returns gi elements whose text (as per the text method)
				  is toto)
	     gi[string()=~/regexp/] (returns gi elements whose text (as per the text
				     method) matches regexp)
	     expressions can start with / (search starts at the document root)
	     expressions can start with . (search starts at the current element)
	     // can be used to get all descendants instead of just direct children
	     * matches any gi

	   So the following examples from the XPath recommenda-
	   tion<http://www.w3.org/TR/xpath.html#path-abbrev> work:

	     para selects the para element children of the context node
	     * selects all element children of the context node
	     para[1] selects the first para child of the context node
	     para[last()] selects the last para child of the context node
	     */para selects all para grandchildren of the context node
	     /doc/chapter[5]/section[2] selects the second section of the fifth chapter
		of the doc
	     chapter//para selects the para element descendants of the chapter element
		children of the context node
	     //para selects all the para descendants of the document root and thus selects
		all para elements in the same document as the context node
	     //olist/item selects all the item elements in the same document as the
		context node that have an olist parent
	     .//para selects the para element descendants of the context node
	     .. selects the parent of the context node
	     para[@type="warning"] selects all para children of the context node that have
		a type attribute with value warning
	     employee[@secretary and @assistant] selects all the employee children of the
		context node that have both a secretary attribute and an assistant
		attribute

	   The elements will be returned in the document order.

	   If $optional_offset is used then only one element will be
	   returned, the one with the appropriate offset in the list, start-
	   ing at 0

	   Quoting and interpolating variables can be a pain when the Perl
	   syntax and the XPATH syntax collide, so here are some more exam-
	   ples to get you started:

	     my $p1= "p1";
	     my $p2= "p2";
	     my @res= $t->get_xpath( "p[string( ’$p1’) or string( ’$p2’)]");

	     my $a= "a1";
	     my @res= $t->get_xpath( "//*[\@att=\"$a\"]");

	     my $val= "a1";
	     my $exp= "//p[ \@att=’$val’]"; # you need to use \@ or you will get a warning
	     my @res= $t->get_xpath( $exp);

	   XML::Twig does not provide full XPATH support. If that’s what you
	   want then look no further than the XML::XPath module on CPAN, or
	   even better, the XML::LibXML module.

	   Note that the only supported regexp delimiter is / and that you
	   must backslash all / in regexps AND in regular strings.
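
	   As a starting point, here is a sketch of a few "get_xpath" calls
	   on a made-up document. Single quotes and "qq{}" are used so that
	   the quotes and the @ of the XPath expression do not clash with
	   perl syntax.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
  '<doc><p type="warning">w1</p><p>plain</p>'
  . '<sect><p type="warning">w2</p></sect></doc>'
);
my $root = $twig->root;

# Only the direct children of the context element.
my @children = $root->get_xpath('p[@type="warning"]');

# All matching descendants, in document order.
my @anywhere = $root->get_xpath('//p[@type="warning"]');

# Interpolating a variable: \@ keeps perl from treating @type as an array.
my $type = 'warning';
my @same = $root->get_xpath( qq{//p[\@type="$type"]} );

# With an offset a single element is returned (offsets start at 0).
my $second_p = $root->get_xpath( '//p', 1 );

printf "%d %d %d %s\n",
       scalar @children, scalar @anywhere, scalar @same, $second_p->text;
```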

       find_nodes
	   Same as "get_xpath"

       text
	   Return a string consisting of all the "PCDATA" and "CDATA" in an
	   element, without any tags. The text is not XML-escaped: base enti-
	   ties such as "&" and "<" are not escaped.

       trimmed_text
	   Same as "text" except that the text is trimmed: leading and trail-
	   ing spaces are discarded, consecutive spaces are collapsed

       set_text	       ($string)
	   Set the text for the element: if the element is a "PCDATA", just
	   set its text, otherwise cut all the children of the element and
	   create a single "PCDATA" child for it, which holds the text.
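
	   The difference between "text" and "trimmed_text", and the effect
	   of "set_text", can be seen in this sketch (the paragraph is made
	   up for the example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<p>  Tom &amp;   <b>Jerry</b>   rule  </p>');
my $p = $twig->root;

# text: tags are dropped, spaces are kept, & is not escaped.
my $raw = $p->text;

# trimmed_text: same, with outer spaces removed and inner runs collapsed.
my $trimmed = $p->trimmed_text;
print "[$trimmed]\n";    # [Tom & Jerry rule]

# set_text replaces all the children by a single text child.
$p->set_text('done');
my $after_set = $p->sprint;
print "$after_set\n";    # <p>done</p>
```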

       insert	      ($gi1, [$optional_atts1], $gi2, [$optional_atts2],...)
	   For each gi in the list inserts an element $gi as the only child
	   of the element.  The element gets the optional attributes in
	   $optional_atts<n>.  All children of the element are set as
	   children of the new element.	 The upper level element is returned.

	     $p->insert( table => { border=> 1}, ’tr’, ’td’)

	   puts the content of $p in a table with a visible border, a single
	   "tr" and a single "td" and returns the "table" element:

	     <p><table border="1"><tr><td>original content of p</td></tr></table></p>

       wrap_in	      (@gi)
	   Wrap elements $gi as the successive ancestors of the element,
	   returns the new element.  $elt->wrap_in( ’td’, ’tr’, ’table’)
	   wraps the element as a single cell in a table for example.
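
	   "insert" adds levels inside the element while "wrap_in" adds them
	   around it. The following sketch shows both on two small made-up
	   documents and only looks at the serialized result:

```perl
use strict;
use warnings;
use XML::Twig;

# insert: the new elements go between <p> and its former content.
my $twig_in = XML::Twig->new;
$twig_in->parse('<doc><p>original content of p</p></doc>');
my $p_in = $twig_in->root->first_child('p');
$p_in->insert( table => { border => 1 }, 'tr', 'td' );
my $insert_out = $p_in->sprint;
print "$insert_out\n";

# wrap_in: the new elements become the successive ancestors of <p>.
my $twig_wrap = XML::Twig->new;
$twig_wrap->parse('<doc><p>cell</p></doc>');
my $p_wrap = $twig_wrap->root->first_child('p');
$p_wrap->wrap_in( 'td', 'tr', 'table' );
my $wrap_out = $twig_wrap->root->sprint;
print "$wrap_out\n";
```

	   Note the order of the arguments: "insert" lists the elements from
	   the outside in, "wrap_in" from the inside out.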

       insert_new_elt ($opt_position, $gi, $opt_atts_hashref, @opt_content)
	   Combines a "new" and a "paste": creates a new element using $gi,
	   $opt_atts_hashref and @opt_content which are arguments similar to
	   those for "new", then paste it, using $opt_position or
	   ’first_child’, relative to $elt.

	   Return the newly created element

       erase
	   Erase the element: the element is deleted and all of its children
	   are pasted in its place.
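
	   A sketch of "erase", followed by "insert_new_elt" to add a new
	   child, on a made-up paragraph (the "note" element and its "type"
	   attribute are invented for the example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p>a <b>bold</b> word</p></doc>');
my $p = $twig->root->first_child('p');

# The <b> tags go away, the text they contained stays in place.
$p->first_child('b')->erase;
my $erased = $p->sprint;
print "$erased\n";    # <p>a bold word</p>

# Create a <note> element and paste it as the last child of <p>.
my $note = $p->insert_new_elt( last_child => 'note', { type => 'info' }, 'added' );
my $with_note = $p->sprint;
print "$with_note\n";
```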

       set_content    ( $optional_atts, @list_of_elt_and_strings) (
       $optional_atts, ’#EMPTY’)
	   Set the content for the element, from a list of strings and ele-
	   ments.  Cuts all the element children, then pastes the list ele-
	   ments as the children.  This method will create a "PCDATA" element
	   for any strings in the list.

	   The $optional_atts argument is the ref of a hash of attributes. If
	   this argument is used then the previous attributes are deleted,
	   otherwise they are left untouched.

	   WARNING: if you rely on IDs then you will have to set the id
	   yourself. At this point the element does not belong to a twig yet,
	   so the ID attribute is not known and it won’t be stored in the ID
	   list.

	   A content of ’"#EMPTY"’ creates an empty element.

       namespace
	   Return the URI of the namespace that the name belongs to. If the
	   name doesn’t belong to any namespace, "undef" is returned.

       expand_ns_prefix ($prefix)
	   Return the uri to which the given prefix is bound in the context
	   of the element.  Returns "undef" if the prefix isn’t currently
	   bound. Use ’"#default"’ to find the current binding of the default
	   namespace (if any).

       current_ns_prefixes
	   Return a list of namespace prefixes valid for the element. The
	   order of the prefixes in the list has no meaning. If the default
	   namespace is currently bound, ’"#default"’ appears in the list.

       inherit_att  ($att, @optional_gi_list)
	   Return the value of an attribute inherited from parent tags. The
	   value returned is found by looking for the attribute in the ele-
	   ment then in turn in each of its ancestors. If the
	   @optional_gi_list is supplied only those ancestors whose gi is in
	   the list will be checked.
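
	   For instance, with a "lang" attribute set at several levels of a
	   made-up document, the lookup could go as in this sketch:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc lang="en"><sect><p>x</p><p lang="fr">y</p></sect></doc>');
my ( $p1, $p2 ) = $twig->root->first_child('sect')->children('p');

# No lang on the first <p> or on <sect>: the value comes from <doc>.
my $lang1 = $p1->inherit_att('lang');

# The second <p> has its own lang attribute, which is found first.
my $lang2 = $p2->inherit_att('lang');

# Restricting the search to <doc> elements skips the value on <p>.
my $lang3 = $p2->inherit_att( 'lang', 'doc' );

print "$lang1 $lang2 $lang3\n";    # en fr en
```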

       all_children_are ($optional_condition)
	   Return 1 if all children of the element pass the
	   $optional_condition, 0 otherwise

       level	   ($optional_condition)
	   Return the depth of the element in the twig (root is 0).  If
	   $optional_condition is given then only ancestors that match the
	   condition are counted.

	   WARNING: in a tree created using the "twig_roots" option this will
	   not return the level in the document tree, level 0 will be the
	   document root, level 1 will be the "twig_roots" elements. During
	   the parsing (in a "twig_handler") you can use the "depth" method
	   on the twig object to get the real parsing depth.

       in	    ($potential_parent)
	   Return true if the element is in the potential_parent ($poten-
	   tial_parent is an element)

       in_context   ($gi, $optional_level)
	   Return true if the element is included in an element whose gi is
	   $gi, optionally within $optional_level levels. The returned value
	   is the including element.
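
	   A sketch of "level", "in" and "in_context" on a three-level
	   made-up document:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><sect><p>x</p></sect></doc>');
my $sect = $twig->root->first_child('sect');
my $p    = $sect->first_child('p');

my $depth      = $p->level;            # doc is 0, sect is 1, p is 2
my $sect_depth = $p->level('sect');    # only <sect> ancestors are counted

my $inside = $p->in($sect);            # true: $sect is an ancestor of $p

# in_context returns the including element when there is one ...
my $doc_elt = $p->in_context('doc');
# ... and a false value when it is not found within the given levels.
my $too_far = $p->in_context( 'doc', 1 );

printf "%d %d %s\n", $depth, $sect_depth, $doc_elt->gi;
```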

       pcdata
	   Return the text of a "PCDATA" element or "undef" if the element is
	   not "PCDATA".

       pcdata_xml_string
	   Return the text of a PCDATA element or undef if the element is not
	   PCDATA.  The text is "XML-escaped" (’&’ and ’<’ are replaced by
	   ’&amp;’ and ’&lt;’)

       set_pcdata     ($text)
	   Set the text of a "PCDATA" element.

       append_pcdata  ($text)
	   Add the text at the end of a "PCDATA" element.

       is_cdata
	   Return 1 if the element is a "CDATA" element, returns 0 otherwise.

       is_text
	   Return 1 if the element is a "CDATA" or "PCDATA" element, returns
	   0 otherwise.

       cdata
	   Return the text of a "CDATA" element or "undef" if the element is
	   not "CDATA".

       set_cdata     ($text)
	   Set the text of a "CDATA" element.

       append_cdata  ($text)
	   Add the text at the end of a "CDATA" element.

       remove_cdata
	   Turns all "CDATA" sections in the element into regular "PCDATA"
	   elements. This is useful when converting XML to HTML, as browsers
	   do not support CDATA sections.

       extra_data
	   Return the extra_data (comments and PI’s) attached to an element

       set_extra_data	  ($extra_data)
	   Set the extra_data (comments and PI’s) attached to an element

       append_extra_data  ($extra_data)
	   Append extra_data to the existing extra_data before the element
	   (if no previous extra_data exists then it is created)

       set_asis
	   Set a property of the element that causes it to be output without
	   being XML escaped by the print functions: if it contains "a < b"
	   it will be output as such and not as "a &lt; b". This can be
	   useful to create text elements that will be output as markup. Note
	   that all "PCDATA" descendants of the element are also marked as
	   having the property (they are the ones that are actually impacted
	   by the change).

	   If the element is a "CDATA" element it will also be output asis,
	   without the "CDATA" markers. The same goes for any "CDATA" descen-
	   dant of the element

       set_not_asis
	   Unsets the "asis" property for the element and its text descen-
	   dants.

       is_asis
	   Return the "asis" property status of the element ( 1 or "undef")

       closed
	   Return true if the element has been closed. Might be useful if
	   you are somewhere in the tree, during the parse, and have no idea
	   whether a parent element is completely loaded or not.

       get_type
	   Return the type of the element: ’"#ELT"’ for "real" elements, or
	   ’"#PCDATA"’, ’"#CDATA"’, ’"#COMMENT"’, ’"#ENT"’, ’"#PI"’

       is_elt
	   Return the gi if the element is a "real" element, or 0 if it is
	   "PCDATA", "CDATA"...

       contains_only_text
	   Return 1 if the element does not contain any other "real" element

       contains_only ($exp)
	   Return the list of children if all children of the element match
	   the expression $exp

	     if( $para->contains_only( ’tt’)) { ... }

       contains_a_single ($exp)
	   If the element contains a single child that matches the expression
	   $exp returns that element. Otherwise returns 0.

       is_field
	   same as "contains_only_text"

       is_pcdata
	   Return 1 if the element is a "PCDATA" element, returns 0 other-
	   wise.

       is_empty
	   Return 1 if the element is empty, 0 otherwise

       set_empty
	   Flags the element as empty. No further check is made, so if the
	   element is actually not empty the output will be wrong. The only
	   effect of this method is that the output will be
	   "<gi att="value"/>".

       set_not_empty
	   Flags the element as not empty. If it is actually empty then the
	   element will be output as "<gi att="value"></gi>".

       child ($offset, $optional_condition)
	   Return the $offset-th child of the element, optionally the $off-
	   set-th child that matches $optional_condition. The children are
	   treated as a list, so "$elt->child( 0)" is the first child, while
	   "$elt->child( -1)" is the last child.

       child_text ($offset, $optional_condition)
	   Return the text of a child or "undef" if the child does not
	   exist. Arguments are the same as "child".

       last_child    ($optional_condition)
	   Return the last child of the element, or the last child matching
	   $optional_condition (ie the last of the element children matching
	   the condition).

       last_child_text	 ($optional_condition)
	   Same as "first_child_text" but for the last child.

       sibling	($offset, $optional_condition)
	   Return the next or previous $offset-th sibling of the element, or
	   the $offset-th one matching $optional_condition. If $offset is
	   negative then a previous sibling is returned, if $offset is
	   positive then a next sibling is returned. "$offset=0" returns the
	   element if there is no condition or if the element matches the
	   condition, "undef" otherwise.

       sibling_text ($offset, $optional_condition)
	   Return the text of a sibling or "undef" if the sibling does not
	   exist.  Arguments are the same as "sibling".

       prev_siblings ($optional_condition)
	   Return the list of previous siblings (optionally matching
	   $optional_condition) for the element. The elements are ordered in
	   document order.

       next_siblings ($optional_condition)
	   Return the list of siblings (optionally matching
	   $optional_condition) following the element. The elements are
	   ordered in document order.

       pos ($optional_condition)
	   Return the position of the element in the children list. The first
	   child has a position of 1 (as in XPath).

	   If the $optional_condition is given then only siblings that match
	   the condition are counted. If the element itself does not match
	   the condition then 0 is returned.
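
	   The offsets used by "child", "sibling" and "pos" can be compared
	   in this sketch. The "list", "item" and "note" elements are made up
	   for the example.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
  '<list><item>a</item><item>b</item><note>n</note><item>c</item></list>'
);
my $list  = $twig->root;
my $first = $list->child(0);     # children are counted from 0
my $last  = $list->child(-1);    # negative offsets count from the end

# With a condition only the matching children are counted.
my $third_item = $list->child( 2, 'item' );

# Siblings are counted from the element itself, in either direction.
my $next      = $first->sibling(1);
my $next_item = $first->sibling( 2, 'item' );

# pos starts at 1, and is 0 when the element does not match the condition.
my $pos_all  = $last->pos;
my $pos_item = $last->pos('item');
my $pos_note = $list->first_child('note')->pos('item');

print join( ' ', $first->text, $last->text, $third_item->text,
                 $next->text, $next_item->text,
                 $pos_all, $pos_item, $pos_note ), "\n";
# a c c b c 4 3 0
```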

       atts
	   Return a hash ref containing the element attributes

       set_atts	     ({att1=>$att1_val, att2=> $att2_val... })
	   Set the element attributes with the hash ref supplied as the argu-
	   ment

       del_atts
	   Deletes all the element attributes.

       att_names
	   Return a list of the attribute names for the element

       att_xml_string ($att, $optional_quote)
	   Return the attribute value, where ’&’, ’<’ and the quote character
	   (" by default) are XML-escaped.

	   If $optional_quote is passed then it is used as the quote.

       set_id	    ($id)
	   Set the "id" attribute of the element to the value.	See "elt_id "
	   to change the id attribute name

       id  Gets the id attribute value

       del_id	    ($id)
	   Deletes the "id" attribute of the element and remove it from the
	   id list for the document

       DESTROY
	   Frees the element from memory.

       start_tag
	   Return the string for the start tag for the element, including the
	   "/>" at the end of an empty element tag

       end_tag
	   Return the string for the end tag of an element.  For an empty
	   element, this returns the empty string (’’).

       xml_string
	   Equivalent to "$elt->sprint( 1)", returns the string for the
	   entire element, excluding the element’s tags (but nested element
	   tags are present)

       xml_text
	   Return the text of the element, encoded (and processed by the
	   current "output_filter" or "output_encoding" options), without any
	   tag.

       set_pretty_print ($style)
	   Set the pretty print method, amongst ’"none"’ (default),
	   ’"nsgmls"’, ’"nice"’, ’"indented"’, ’"record"’ and ’"record_c"’

	   pretty_print styles:

	   none
	       the default, no "\n" is used

	   nsgmls
	       nsgmls style, with "\n" added within tags

	   nice
	       adds "\n" wherever possible (NOT SAFE, can lead to invalid
	       XML)

	   indented
	       same as "nice" plus indents elements (NOT SAFE, can lead to
	       invalid XML)

	   record
	       table-oriented pretty print, one field per line

	   record_c
	       table-oriented pretty print, more compact than "record", one
	       record per line

       set_empty_tag_style ($style)
	   Set the method to output empty tags, amongst ’"normal"’ (default),
	   ’"html"’, and ’"expand"’.

       set_indent ($string)
	   Set the indentation for the indented pretty print style (default
	   is 2 spaces)

       set_quote ($quote)
	   Set the quotes used for attributes. Can be ’"double"’ (default) or
	   ’"single"’.

       cmp	 ($elt)
	     Compare the order of the 2 elements in a twig.

	     $a is the <A>..</A> element, $b is the <B>...</B> element

	     document			     $a->cmp( $b)
	     <A> ... </A> ... <B>  ... </B>	-1
	     <A> ... <B>  ... </B> ... </A>	-1
	     <B> ... </B> ... <A>  ... </A>	 1
	     <B> ... <A>  ... </A> ... </B>	 1
	      $a == $b				 0
	      $a and $b not in the same tree   undef

       before	    ($elt)
	   Return 1 if the element starts before $elt, 0 otherwise. If the 2
	   elements are not in the same twig then return "undef".

	       if( $a->cmp( $b) == -1) { return 1; } else { return 0; }

       after	   ($elt)
	   Return 1 if the element starts after $elt, 0 otherwise. If the 2
	   elements are not in the same twig then return "undef".

	       if( $a->cmp( $b) == 1) { return 1; } else { return 0; }
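
	   A sketch of "cmp", "before" and "after" on two sibling elements of
	   a made-up document. The variables are named $first and $second
	   rather than $a and $b, which perl reserves for "sort".

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><A>1</A><B>2</B></doc>');
my $first  = $twig->root->first_child('A');
my $second = $twig->root->first_child('B');

my $order      = $first->cmp($second);     # -1: <A> starts before <B>
my $same       = $first->cmp($first);      #  0: same element
my $is_before  = $first->before($second);  #  1
my $is_after   = $second->after($first);   #  1
my $not_before = $second->before($first);  #  0

print "$order $same $is_before $is_after $not_before\n";    # -1 0 1 1 0
```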

       path
	   Return the element context in a form similar to XPath’s short
	   form: ’"/root/gi1/../gi"’

       xpath
	   Return a unique XPath expression that can be used to find the ele-
	   ment again.

	   It looks like "/doc/sect[3]/title": unique elements do not have an
	   index, the others do.

       private methods
	   Low-level methods on the twig:

	   set_parent	     ($parent)
	   set_first_child   ($first_child)
	   set_last_child    ($last_child)
	   set_prev_sibling  ($prev_sibling)
	   set_next_sibling  ($next_sibling)
	   set_twig_current
	   del_twig_current
	   twig_current
	   flushed
	       This method should NOT be used, always flush the twig, not an
	       element.

	   set_flushed
	   del_flushed
	   flush
	   contains_text

	   Those methods should not be used, unless of course you find some
	   creative and interesting, not to mention useful, ways to do it.

       cond

       Most of the navigation functions accept a condition as an optional
       argument. The first element (or all elements for "children" or
       "ancestors") that passes the condition is returned.

       The condition is a single step of an XPath expression using the XPath
       subset defined by "get_xpath". Additionally, the condition can be:

       #ELT
	   return a "real" element (not a PCDATA, CDATA, comment or pi ele-
	   ment)

       #TEXT
	   return a PCDATA or CDATA element

       regular expression
	   return an element whose gi matches the regexp. The regexp has to
	   be created with "qr//" (hence this is available only on perl 5.005
	   and above)

       code reference
	   applies the code, passing the current element as argument, if the
	   code returns true then the element is returned, if it returns
	   false then the code is applied to the next candidate.

       XML::Twig::XPath

       XML::Twig implements a subset of XPath through the "get_xpath" method.

       If you want to use the whole XPath power, then you can use
       "XML::Twig::XPath" instead. In this case "XML::Twig" uses "XML::XPath"
       to execute XPath queries.  You will of course need "XML::XPath"
       installed to be able to use "XML::Twig::XPath".

       See XML::XPath for more information.

       The methods you can use are:

       findnodes	      ($path)
	   return a list of nodes found by $path.

       findnodes_as_string    ($path)
	   return the nodes found reproduced as XML. The result is not guar-
	   anteed to be valid XML though.

       findvalue	      ($path)
	   return the concatenation of the text content of the result nodes

       XML::Twig::XPath::Elt

       The methods you can use are the same as on "XML::Twig::XPath" ele-
       ments:

       findnodes	      ($path)
	   return a list of nodes found by $path.

       findnodes_as_string    ($path)
	   return the nodes found reproduced as XML. The result is not guar-
	   anteed to be valid XML though.

       findvalue	      ($path)
	   return the concatenation of the text content of the result nodes

       XML::Twig::Entity_list


       new Creates an entity list.

       add	   ($ent)
	   Adds an entity to an entity list.

       delete	  ($ent or $gi)
	   Deletes an entity (defined by its name or by the Entity object)
	   from the list.

       print	  ($optional_filehandle)
	   Prints the entity list.

       XML::Twig::Entity


       new	  ($name, $val, $sysid, $pubid, $ndata)
	   Same arguments as the Entity handler for XML::Parser.

       print	   ($optional_filehandle)
	   Print an entity declaration.

       name
	   Return the name of the entity

       val Return the value of the entity

       sysid
	   Return the system id for the entity (for NDATA entities)

       pubid
	   Return the public id for the entity (for NDATA entities)

       ndata
	   Return true if the entity is an NDATA entity

       text
	   Return the entity declaration text.

EXAMPLES
       See the test file in t/test[1-n].t. Additional examples (and a
       complete tutorial) can be found on the XML::Twig
       Page<http://www.xmltwig.com/xmltwig/>

       To figure out what flush does call the following script with an XML
       file and an element name as arguments

	 use XML::Twig;

	 my ($file, $elt)= @ARGV;
	 my $t= XML::Twig->new( twig_handlers =>
	     { $elt => sub {$_[0]->flush; print "\n[flushed here]\n";} });
	 $t->parsefile( $file, ErrorContext => 2);
	 $t->flush;
	 print "\n";

NOTES
       XML::Twig and various versions of Perl, XML::Parser and expat:

       Before being uploaded to CPAN, XML::Twig 3.12 has been tested under
       the following environments:

       Linux, perl 5.8.2, expat 1.95.7, XML::Parser 2.34
       Solaris, perl 5.6.1, expat ?, XML::Parser 2.31
       Windows 98, perl 5.6.1 (Activestate build 635), XML::Parser 2.27
       Windows 98, perl 5.8.2 (Activestate build 808), XML::Parser 2.34
       Mac OS X, perl 5.6.0, XML::Parser 2.34
       Mac OS X, perl 5.8.3, XML::Parser 2.34
       Dec-OSF1 4.1G, perl 5.6.1, XML::Parser 2.34

       Note that with Windows 98 and Perl 5.6.1 nmake may freeze while trying
       to copy the tools (xml_grep, xml_print and xml_spellcheck), so you
       have to answer no when asked if you want to install them.

       See <http://testers.cpan.org/search?request=dist&dist=XML-Twig> for
       the CPAN testers reports on XML::Twig

       XML::Twig does NOT work with expat 1.95.4 (upgrade to 1.95.5)
       XML::Parser 2.27 does NOT work under perl 5.8.0

       When in doubt, upgrade expat, XML::Parser and Scalar::Util

       DTD Handling

       There are 3 possibilities here.	They are:

       No DTD
	   No doctype, no DTD information, no entity information, the world
	   is simple...

       Internal DTD
	   The XML document includes an internal DTD, and maybe entity decla-
	   rations.

	   If you use the load_DTD option when creating the twig the DTD
	   information and the entity declarations can be accessed.

	   The DTD and the entity declarations will be "flush"’ed (or
	   "print"’ed) either as is (if they have not been modified) or as
	   reconstructed (poorly, comments are lost, order is not kept, due
	   to its content this DTD should not be viewed by anyone) if they
	   have been modified. You can also modify them directly by changing
	   the "$twig->{twig_doctype}->{internal}" field (straight from
	   XML::Parser, see the "Doctype" handler doc)

       External DTD
	   The XML document includes a reference to an external DTD, and
	   maybe entity declarations.

	   If you use the "load_DTD" option when creating the twig the DTD
	   information and the entity declarations can be accessed. The
	   entity declarations will be "flush"’ed (or "print"’ed) either as
	   is (if they have not been modified) or as reconstructed (badly,
	   comments are lost, order is not kept).

	   You can change the doctype through the "$twig->set_doctype" method
	   and print the dtd through the "$twig->dtd_text" or
	   "$twig->dtd_print" methods.

	   If you need to modify the entity list this is probably the easiest
	   way to do it.
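
	   As a rough sketch of the "set_doctype" call mentioned above: the
	   document, the doctype name and the "doc.dtd" system identifier
	   below are made up for illustration, and the argument order (name,
	   then system identifier) is assumed from the method's documentation.
	   The exact text of the generated declaration may vary between
	   versions.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document without any doctype.
my $twig = XML::Twig->new;
$twig->parse('<doc>text</doc>');

# Assumed argument order: doctype name, then system identifier.
$twig->set_doctype('doc', 'doc.dtd');

# sprint returns the document as a string, prolog included.
my $result = $twig->sprint;
print "$result\n";
```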

       Flush

       If you set handlers and use "flush", do not forget to flush the twig
       one last time AFTER the parsing, or you might be missing the end of
       the document.
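
       A minimal sketch of this pattern, assuming a made-up document and
       an in-memory filehandle (so the result can be inspected) passed as
       the optional first argument to "flush":

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document, for illustration only.
my $xml = '<doc><section>one</section><section>two</section></doc>';

# Collect the output in a string rather than on STDOUT.
my $output = '';
open my $out, '>', \$output or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    twig_handlers => {
        # Output everything parsed so far and free the memory it used.
        section => sub { my ($t) = @_; $t->flush($out); },
    },
);
$twig->parse($xml);

# The last flush, AFTER the parsing: it outputs whatever is left,
# here the closing tag of the root element.
$twig->flush($out);
close $out;

print "$output\n";
```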

       Remember that element handlers are called when the element is CLOSED,
       so if you have handlers for nested elements the inner handlers will be
       called first. This makes it trickier than it would seem to, for
       example, number nested clauses.
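
       The calling order can be seen with a small sketch. The "clause"
       element and its "id" attribute are invented for this example:

```perl
use strict;
use warnings;
use XML::Twig;

my @order;    # ids of the clauses, in the order their handler is called

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called when the closing tag of a clause has been parsed.
        clause => sub { my ($t, $clause) = @_; push @order, $clause->att('id'); },
    },
);
$twig->parse('<doc><clause id="outer"><clause id="inner"/></clause></doc>');

print join(', ', @order), "\n";    # prints "inner, outer"
```

       The inner clause is closed first, so its handler runs before the
       handler of the clause that contains it.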

BUGS
       entity handling
	   Due to XML::Parser behaviour, non-base entities in attribute val-
	   ues disappear: "att="val&ent;"" will be turned into "att => val",
	   unless you use the "keep_encoding" argument to "XML::Twig->new"

       DTD handling
	   Basically the DTD handling methods are completely bugged. No one
	   uses them and it seems very difficult to get them to work in all
	   cases, including with several slightly incompatible versions of
	   XML::Parser and of libexpat.

	   So use XML::Twig with standalone documents, or with documents
	   referring to an external DTD, but don’t expect it to properly
	   parse or even output back the DTD.

       memory leak
	   If you use a lot of twigs you might find that you leak quite a lot
	   of memory (about 2 KB per twig). You can use the "dispose" method
	   to free that memory after you are done.

	   If you create elements the same thing might happen, use the
	   "delete" method to get rid of them.

	   Alternatively installing the "Scalar::Util" (or "WeakRef") module
	   on a version of Perl that supports it (>5.6.0) will get rid of the
	   memory leaks automagically.
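
	   A minimal sketch of the "dispose" approach, assuming a made-up
	   list of small documents that each get their own twig:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical documents, for illustration only.
my @docs = (
    '<doc><title>first</title></doc>',
    '<doc><title>second</title></doc>',
);

my @titles;
for my $xml (@docs) {
    my $twig = XML::Twig->new;
    $twig->parse($xml);

    # Copy out what is needed before the twig goes away.
    push @titles, $twig->root->first_child('title')->text;

    # Free the memory used by this twig once done with it.
    $twig->dispose;
}

print join(', ', @titles), "\n";    # prints "first, second"
```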

       ID list
	   The ID list is NOT updated when IDs are modified or elements cut
	   or deleted.

       change_gi
	   This method will not function properly if you do:

		$twig->change_gi( $old1, $new);
		$twig->change_gi( $old2, $new);
		$twig->change_gi( $new, $even_newer);

       sanity check on XML::Parser method calls
	   XML::Twig should really prevent calls to some XML::Parser methods,
	   especially the "setHandlers" method.

       pretty printing
	   Pretty printing (at least using the ’"indented"’ style) is hard to
	   get right!  Only elements that belong to the document will be
	   properly indented. Printing elements that do not belong to the
	   twig makes it impossible for XML::Twig to figure out their depth,
	   and thus their indentation level.
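
	   For elements that do belong to the document, a sketch of the
	   ’"indented"’ style looks like this (the document is made up, and
	   the exact whitespace produced may differ between versions):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new( pretty_print => 'indented' );
$twig->parse('<doc><elt>text</elt><elt>more</elt></doc>');

# Both elt elements are part of the twig, so their depth is known
# and each can be placed on its own indented line.
my $pretty = $twig->sprint;
print $pretty;
```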

	   Also there is an unavoidable bug when using "flush" and pretty
	   printing for elements with mixed content that start with an
	   embedded element:

	     <elt><b>b</b>toto<b>bold</b></elt>

	     will be output as

	     <elt>
	       <b>b</b>toto<b>bold</b></elt>

	   if you flush the twig when you find the "<b>" element.

Globals
       These are the things that can mess up calling code, especially if
       threaded.  They might also cause problems under mod_perl.

       Exported constants
	   Whether you want them or not you get them! These are subroutines
	   to use as constants when creating or testing elements:

	     PCDATA  return ’#PCDATA’
	     CDATA   return ’#CDATA’
	     PI	     return ’#PI’, I had the choice between PROC and PI :--(
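
	   As an illustration of testing elements against these values, the
	   sketch below compares the gi of each child with the literal
	   string ’#PCDATA’ (the value "PCDATA" is documented to return)
	   rather than relying on the exported subroutine. The document is
	   made up:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc>some <b>bold</b> text</doc>');

# Text children carry the special gi '#PCDATA'.
my @kinds = map { $_->gi } $twig->root->children;
my $text_children = grep { $_ eq '#PCDATA' } @kinds;

print join(', ', @kinds), "\n";    # prints "#PCDATA, b, #PCDATA"
print "$text_children text children\n";
```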

       Module scoped values: constants
	   these should cause no trouble:

	     %base_ent= ( ’>’ => ’&gt;’,
			  ’<’ => ’&lt;’,
			  ’&’ => ’&amp;’,
			  "’" => ’&apos;’,
			  ’"’ => ’&quot;’,
			);
	     CDATA_START   = "<![CDATA[";
	     CDATA_END	   = "]]>";
	     PI_START	   = "<?";
	     PI_END	   = "?>";
	     COMMENT_START = "<!--";
	     COMMENT_END   = "-->";

	   pretty print styles

	     ( $NSGMLS, $NICE, $INDENTED, $RECORD1, $RECORD2)= (1..5);

	   empty tag output style

	     ( $HTML, $EXPAND)= (1..2);

       Module scoped values: might be changed
	   Most of these deal with pretty printing, so the worst that can
	   happen is probably that XML output does not look right, but is
	   still valid and processed identically by XML processors.

	   $empty_tag_style can mess up HTML browsers though and changing $ID
	   would most likely create problems.

	     $pretty=0;		  # pretty print style
	     $quote=’"’;	  # quote for attributes
	     $INDENT= ’	 ’;	  # indent for indented pretty print
	     $empty_tag_style= 0; # how to display empty tags
	     $ID		  # attribute used as an id (’id’ by default)

       Module scoped values: definitely changed
	   These 2 variables are used to replace gi’s by an index, thus sav-
	   ing some space when creating a twig. If they really cause you too
	   much trouble, let me know, it is probably possible to create
	   either a switch or at least a version of XML::Twig that does not
	   perform this optimisation.

	     %gi2index;	    # gi => index
	     @index2gi;	    # list of gi’s

TODO
       SAX handlers
	   Allowing XML::Twig to work on top of any SAX parser

       multiple twigs are not well supported
	   A number of twig features are just global at the moment. These
	   include the ID list and the "gi pool" (if you use "change_gi" then
	   you change the gi for ALL twigs).

	   A future version will try to support this while trying not to be
	   too hard on performance (at least when a single twig is used!).

AUTHOR
       Michel Rodriguez <mirod@xmltwig.com>

LICENSE
       This library is free software; you can redistribute it and/or modify
       it under the same terms as Perl itself.

       Bug reports should be sent using:
       RT<http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-Twig>

       Comments can be sent to mirod@xmltwig.com

       The XML::Twig page is at <http://www.xmltwig.com/xmltwig/>. It
       includes the development version of the module, a slightly better
       version of the documentation, examples, and a tutorial, "Processing
       XML efficiently with Perl and XML::Twig":
       <http://www.xmltwig.com/xmltwig/tutorial/index.html>

SEE ALSO
       XML::Parser, XML::Parser::Expat, Encode, Text::Iconv, Scalar::Util



perl v5.8.5			  2005-02-21			      Twig(3)