PCREAPI(3)							   PCREAPI(3)



NAME
       PCRE - Perl-compatible regular expressions

PCRE NATIVE API

       #include <pcre.h>

       pcre *pcre_compile(const char *pattern, int options,
	    const char **errptr, int *erroffset,
	    const unsigned char *tableptr);

       pcre *pcre_compile2(const char *pattern, int options,
	    int *errorcodeptr,
	    const char **errptr, int *erroffset,
	    const unsigned char *tableptr);

       pcre_extra *pcre_study(const pcre *code, int options,
	    const char **errptr);

       int pcre_exec(const pcre *code, const pcre_extra *extra,
	    const char *subject, int length, int startoffset,
	    int options, int *ovector, int ovecsize);

       int pcre_dfa_exec(const pcre *code, const pcre_extra *extra,
	    const char *subject, int length, int startoffset,
	    int options, int *ovector, int ovecsize,
	    int *workspace, int wscount);

       int pcre_copy_named_substring(const pcre *code,
	    const char *subject, int *ovector,
	    int stringcount, const char *stringname,
	    char *buffer, int buffersize);

       int pcre_copy_substring(const char *subject, int *ovector,
	    int stringcount, int stringnumber, char *buffer,
	    int buffersize);

       int pcre_get_named_substring(const pcre *code,
	    const char *subject, int *ovector,
	    int stringcount, const char *stringname,
	    const char **stringptr);

       int pcre_get_stringnumber(const pcre *code,
	    const char *name);

       int pcre_get_stringtable_entries(const pcre *code,
	    const char *name, char **first, char **last);

       int pcre_get_substring(const char *subject, int *ovector,
	    int stringcount, int stringnumber,
	    const char **stringptr);

       int pcre_get_substring_list(const char *subject,
	    int *ovector, int stringcount, const char ***listptr);

       void pcre_free_substring(const char *stringptr);

       void pcre_free_substring_list(const char **stringptr);

       const unsigned char *pcre_maketables(void);

       int pcre_fullinfo(const pcre *code, const pcre_extra *extra,
	    int what, void *where);

       int pcre_info(const pcre *code, int *optptr, int *firstcharptr);

       int pcre_refcount(pcre *code, int adjust);

       int pcre_config(int what, void *where);

       char *pcre_version(void);

       void *(*pcre_malloc)(size_t);

       void (*pcre_free)(void *);

       void *(*pcre_stack_malloc)(size_t);

       void (*pcre_stack_free)(void *);

       int (*pcre_callout)(pcre_callout_block *);

PCRE API OVERVIEW

       PCRE  has  its  own  native  API, which is described in this document.
       There are also some wrapper functions that  correspond  to  the	POSIX
       regular	expression API. These are described in the pcreposix documen-
       tation. Both of these APIs define a set of C  function  calls.  A  C++
       wrapper	is  distributed	 with  PCRE.  It is documented in the pcrecpp
       page.

       The native API C function prototypes are defined in  the	 header	 file
       pcre.h,	and on Unix systems the library itself is called libpcre.  It
       can normally be accessed by adding -lpcre to the command	 for  linking
       an  application	that  uses  PCRE.  The header file defines the macros
       PCRE_MAJOR and PCRE_MINOR to contain the major and minor release	 num-
       bers  for  the library.	Applications can use these to include support
       for different releases of PCRE.

       The  functions  pcre_compile(),	pcre_compile2(),  pcre_study(),	  and
       pcre_exec() are used for compiling and matching regular expressions in
       a Perl-compatible manner. A sample program that demonstrates the	 sim-
       plest  way  of using them is provided in the file called pcredemo.c in
       the source distribution. The pcresample documentation describes how to
       run it.

       A second matching function, pcre_dfa_exec(), which is not Perl-compat-
       ible, is also provided. This uses a different algorithm for the match-
       ing.  The alternative algorithm finds all possible matches (at a given
       point in the subject), and scans the subject just once. However,	 this
       algorithm  does	not  return captured substrings. A description of the
       two matching algorithms and  their  advantages  and  disadvantages  is
       given in the pcrematching documentation.

       In  addition  to	 the main compiling and matching functions, there are
       convenience functions for extracting captured substrings from  a	 sub-
       ject string that is matched by pcre_exec(). They are:

	 pcre_copy_substring()
	 pcre_copy_named_substring()
	 pcre_get_substring()
	 pcre_get_named_substring()
	 pcre_get_substring_list()
	 pcre_get_stringnumber()
	 pcre_get_stringtable_entries()

       pcre_free_substring()  and  pcre_free_substring_list()  are  also pro-
       vided, to free the memory used for extracted strings.

       The function pcre_maketables() is used to build	a  set	of  character
       tables	in   the   current  locale  for	 passing  to  pcre_compile(),
       pcre_exec(), or pcre_dfa_exec(). This is an optional facility that  is
       provided	 for  specialist  use.	Most  commonly, no special tables are
       passed, in which case internal tables that are generated when PCRE  is
       built are used.

       The  function  pcre_fullinfo() is used to find out information about a
       compiled pattern; pcre_info() is an obsolete version that returns only
       some  of the available information, but is retained for backwards com-
       patibility.  The function pcre_version() returns a pointer to a string
       containing the version of PCRE and its date of release.

       The  function  pcre_refcount()  maintains  a reference count in a data
       block containing a compiled pattern. This is provided for the  benefit
       of object-oriented applications.

       The  global  variables pcre_malloc and pcre_free initially contain the
       entry points of the standard malloc() and  free()  functions,  respec-
       tively.	PCRE  calls  the  memory management functions via these vari-
       ables, so a calling program can replace them if it wishes to intercept
       the calls. This should be done before calling any PCRE functions.
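
       As a minimal sketch of such a replacement, the following pair of
       functions counts outstanding blocks while delegating to malloc() and
       free(). The names counting_malloc, counting_free, and live_blocks are
       hypothetical and are not part of PCRE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: blocks handed out and not yet freed. */
long live_blocks = 0;

/* Has the type of the pcre_malloc variable: void *(*)(size_t). */
void *counting_malloc(size_t size)
{
    void *block = malloc(size);
    if (block != NULL)
        live_blocks++;
    return block;
}

/* Has the type of the pcre_free variable: void (*)(void *). */
void counting_free(void *block)
{
    if (block == NULL)
        return;
    assert(live_blocks > 0);   /* a free without a matching allocation */
    live_blocks--;
    free(block);
}

/*
 * In a program that includes <pcre.h>, the wrappers would be installed
 * before the first call to any PCRE function:
 *
 *     pcre_malloc = counting_malloc;
 *     pcre_free   = counting_free;
 */
```

       The counter in this sketch is not protected against concurrent use; a
       multi-threaded program would need to add its own locking.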

       The  global  variables  pcre_stack_malloc and pcre_stack_free are also
       indirections to memory management functions. These  special  functions
       are  used  only	when PCRE is compiled to use the heap for remembering
       data,  instead  of  recursive  function	calls,	 when	running	  the
       pcre_exec()  function.  See the pcrebuild documentation for details of
       how to do this. It is a non-standard way of building PCRE, for use  in
       environments  that  have limited stacks. Because of the greater use of
       memory management, it runs more slowly. Separate	 functions  are	 pro-
       vided so that special-purpose external code can be used for this case.
       When used, these functions are always called in	a  stack-like  manner
       (last obtained, first freed), and always for memory blocks of the same
       size. There is a discussion about PCRE’s stack usage in the  pcrestack
       documentation.
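
       Because the blocks are all of one size and are released in last
       obtained, first freed order, a special-purpose allocator can be very
       simple. The sketch below is an illustration, not code from PCRE: it
       keeps freed blocks on a linked list and hands them out again. The
       names pool_stack_malloc, pool_stack_free, block_cache, and cached_size
       are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A freed block is reused as a link in a singly linked cache. */
typedef struct cached_block {
    struct cached_block *next;
} cached_block;

cached_block *block_cache = NULL;  /* most recently freed block first */
size_t cached_size = 0;            /* the single block size seen so far */

/* Has the type of the pcre_stack_malloc variable: void *(*)(size_t). */
void *pool_stack_malloc(size_t size)
{
    if (size < sizeof(cached_block))
        size = sizeof(cached_block);

    /* Relies on the guarantee that every request has the same size. */
    assert(cached_size == 0 || cached_size == size);
    cached_size = size;

    if (block_cache != NULL) {
        cached_block *block = block_cache;
        block_cache = block->next;
        return block;
    }
    return malloc(size);
}

/* Has the type of the pcre_stack_free variable: void (*)(void *). */
void pool_stack_free(void *block)
{
    cached_block *freed = block;
    if (freed == NULL)
        return;
    freed->next = block_cache;   /* keep the block for the next request */
    block_cache = freed;
}
```

       In this sketch cached blocks are never returned to the system, and the
       cache is not safe for use by several threads at once.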

       The  global  variable  pcre_callout initially contains NULL. It can be
       set by the caller to a "callout" function, which PCRE will  then	 call
       at  specified points during a matching operation. Details are given in
       the pcrecallout documentation.

NEWLINES

       PCRE supports five different conventions for indicating line breaks in
       strings:	 a  single CR (carriage return) character, a single LF (line-
       feed) character, the two-character sequence CRLF,  any  of  the	three
       preceding,  or  any  Unicode  newline  sequence.	 The  Unicode newline
       sequences are the three just mentioned, plus the single characters  VT
       (vertical  tab,	U+000B),  FF  (formfeed,  U+000C),  NEL	 (next	line,
       U+0085), LS (line separator, U+2028),  and  PS  (paragraph  separator,
       U+2029).
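
       The "any Unicode newline sequence" convention can be pictured with the
       following sketch, which works on an array of code points. It is an
       illustration of the list above, not PCRE's own implementation, and the
       function names is_newline_char and newline_length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the code point is a single-character Unicode newline. */
int is_newline_char(unsigned long cp)
{
    switch (cp) {
    case 0x000A: /* LF  */
    case 0x000B: /* VT  */
    case 0x000C: /* FF  */
    case 0x000D: /* CR  */
    case 0x0085: /* NEL */
    case 0x2028: /* LS  */
    case 0x2029: /* PS  */
        return 1;
    default:
        return 0;
    }
}

/*
 * Returns the length, in code points, of the newline sequence that starts
 * at text[pos], or 0 if there is none. CRLF counts as one sequence of
 * length 2.
 */
size_t newline_length(const unsigned long *text, size_t count, size_t pos)
{
    assert(text != NULL && pos <= count);
    if (pos == count)
        return 0;
    if (text[pos] == 0x000D && pos + 1 < count && text[pos + 1] == 0x000A)
        return 2;
    return is_newline_char(text[pos]) ? 1 : 0;
}
```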

       Each of the first three conventions is used by at least one operating
       system as its standard newline sequence. When PCRE is built, a default
       can be specified. If none is specified, the built-in default is LF,
       which is the Unix standard. When PCRE is run, the default can be
       overridden, either when a pattern is compiled or when it is matched.

       At  compile  time,  the	newline	 convention  can  be specified by the
       options argument of pcre_compile(), or it can be specified by  special
       text at the start of the pattern itself; this overrides any other set-
       tings. See the pcrepattern page for details of the  special  character
       sequences.

       In  the	PCRE  documentation  the  word "newline" is used to mean "the
       character or pair of characters	that  indicate	a  line	 break".  The
       choice  of newline convention affects the handling of the dot, circum-
       flex, and dollar metacharacters, the  handling  of  #-comments  in  /x
       mode,  and,  when CRLF is a recognized line ending sequence, the match
       position advancement for a non-anchored pattern. There is more  detail
       about this in the section on pcre_exec() options below.

       The choice of newline convention does not affect the interpretation of
       the \n or \r escape sequences, nor does it  affect  what	 \R  matches,
       which is controlled in a similar way, but by separate options.

MULTITHREADING

       The  PCRE  functions can be used in multi-threading applications, with
       the proviso  that  the  memory  management  functions  pointed  to  by
       pcre_malloc,  pcre_free,	 pcre_stack_malloc,  and pcre_stack_free, and
       the callout function pointed to by pcre_callout,	 are  shared  by  all
       threads.

       The compiled form of a regular expression is not altered during match-
       ing, so the same compiled  pattern  can	safely	be  used  by  several
       threads at once.

SAVING PRECOMPILED PATTERNS FOR LATER USE

       The  compiled form of a regular expression can be saved and re-used at
       a later time, possibly by a different program,  and  even  on  a	 host
       other  than the one on which it was compiled. Details are given in the
       pcreprecompile documentation. However, compiling a regular  expression
       with one version of PCRE for use with a different version is not guar-
       anteed to work and may cause crashes.

CHECKING BUILD-TIME OPTIONS

       int pcre_config(int what, void *where);

       The function pcre_config() makes it possible for a PCRE client to dis-
       cover  which  optional  features	 have  been  compiled  into  the PCRE
       library. The pcrebuild documentation  has  more	details	 about	these
       optional features.

       The  first  argument for pcre_config() is an integer, specifying which
       information is required; the second argument is a pointer to  a	vari-
       able  into  which the information is placed. The following information
       is available:

	 PCRE_CONFIG_UTF8

       The output is an integer that is set to one if UTF-8 support is avail-
       able; otherwise it is set to zero.

	 PCRE_CONFIG_UNICODE_PROPERTIES

       The  output  is	an  integer that is set to one if support for Unicode
       character properties is available; otherwise it is set to zero.

	 PCRE_CONFIG_NEWLINE

       The output is an integer whose value specifies the default character
       sequence that is recognized as meaning "newline". The five values that
       are supported are: 10 for LF, 13 for CR, 3338 for CRLF, -2 for
       ANYCRLF, and -1 for ANY. The default should normally be the standard
       sequence for your operating system.

	 PCRE_CONFIG_BSR

       The  output  is	an  integer  whose  value  indicates  what  character
       sequences  the  \R  escape  sequence  matches by default. A value of 0
       means that \R matches any Unicode line ending sequence; a value	of  1
       means  that  \R matches only CR, LF, or CRLF. The default can be over-
       ridden when a pattern is compiled or matched.

	 PCRE_CONFIG_LINK_SIZE

       The output is an integer that contains the number of  bytes  used  for
       internal	 linkage  in compiled regular expressions. The value is 2, 3,
       or 4. Larger values allow larger regular expressions to	be  compiled,
       at  the	expense	 of slower matching. The default value of 2 is suffi-
       cient for all but the most massive patterns, since it allows the	 com-
       piled pattern to be up to 64K in size.

	 PCRE_CONFIG_POSIX_MALLOC_THRESHOLD

       The  output  is an integer that contains the threshold above which the
       POSIX interface uses malloc() for output vectors. Further details  are
       given in the pcreposix documentation.

	 PCRE_CONFIG_MATCH_LIMIT

       The  output  is an integer that gives the default limit for the number
       of internal matching function calls in a pcre_exec()  execution.	 Fur-
       ther details are given with pcre_exec() below.

	 PCRE_CONFIG_MATCH_LIMIT_RECURSION

       The output is an integer that gives the default limit for the depth of
       recursion when calling the internal matching function in a pcre_exec()
       execution. Further details are given with pcre_exec() below.

	 PCRE_CONFIG_STACKRECURSE

       The output is an integer that is set to one if internal recursion when
       running pcre_exec() is implemented by recursive	function  calls	 that
       use the stack to remember their state. This is the usual way that PCRE
       is compiled. The output is zero if PCRE was compiled to use blocks  of
       data  on	 the  heap instead of recursive function calls. In this case,
       pcre_stack_malloc and pcre_stack_free  are  called  to  manage  memory
       blocks on the heap, thus avoiding the use of the stack.
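
       The integer returned for PCRE_CONFIG_NEWLINE packs the character codes
       of the sequence: 3338 is 13 * 256 + 10, that is, CR followed by LF. A
       caller might decode the value along the following lines. The enum and
       function names are hypothetical; only the numeric values come from the
       description above.

```c
#include <assert.h>

typedef enum {
    NL_LF, NL_CR, NL_CRLF, NL_ANYCRLF, NL_ANY, NL_UNKNOWN
} newline_kind;

/* Maps the value stored by pcre_config(PCRE_CONFIG_NEWLINE, &value). */
newline_kind decode_newline_config(int value)
{
    switch (value) {
    case 10:             return NL_LF;
    case 13:             return NL_CR;
    case (13 << 8) | 10: return NL_CRLF;     /* 3338 */
    case -2:             return NL_ANYCRLF;
    case -1:             return NL_ANY;
    default:             return NL_UNKNOWN;
    }
}

/* Returns a printable name for a decoded value. */
const char *newline_kind_name(newline_kind kind)
{
    static const char *const names[] = {
        "LF", "CR", "CRLF", "ANYCRLF", "ANY", "unknown"
    };
    assert((unsigned)kind <= NL_UNKNOWN);
    return names[kind];
}

/*
 * With <pcre.h> included, the value would be obtained like this:
 *
 *     int value;
 *     if (pcre_config(PCRE_CONFIG_NEWLINE, &value) == 0)
 *         puts(newline_kind_name(decode_newline_config(value)));
 */
```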

COMPILING A PATTERN

       pcre *pcre_compile(const char *pattern, int options,
	    const char **errptr, int *erroffset,
	    const unsigned char *tableptr);

       pcre *pcre_compile2(const char *pattern, int options,
	    int *errorcodeptr,
	    const char **errptr, int *erroffset,
	    const unsigned char *tableptr);

       Either  of  the	functions  pcre_compile()  or  pcre_compile2() can be
       called to compile a pattern into an internal form. The only difference
       between	the  two interfaces is that pcre_compile2() has an additional
       argument, errorcodeptr, via  which  a  numerical	 error	code  can  be
       returned.

       The  pattern  is a C string terminated by a binary zero, and is passed
       in the pattern argument. A pointer to a single block of memory that is
       obtained	 via pcre_malloc is returned. This contains the compiled code
       and related data. The pcre type is defined  for	the  returned  block;
       this  is	 a  typedef for a structure whose contents are not externally
       defined. It is up to the caller to free	the  memory  (via  pcre_free)
       when it is no longer required.

       Although the compiled code of a PCRE regex is relocatable, that is, it
       does not depend on memory location, the complete pcre  data  block  is
       not  fully  relocatable, because it may contain a copy of the tableptr
       argument, which is an address (see below).

       The options argument contains various bit settings that affect the
       compilation. It should be zero if no options are required. The
       available options are described below. Some of them (in particular,
       those that are compatible with Perl) can also be set and unset from
       within the pattern (see the detailed description in the pcrepattern
       documentation). For these options, the contents of the options
       argument specify their initial settings at the start of compilation
       and execution. The PCRE_ANCHORED and PCRE_NEWLINE_xxx options can be
       set at the time of matching as well as at compile time.

       If errptr is NULL, pcre_compile() returns  NULL	immediately.   Other-
       wise,  if compilation of a pattern fails, pcre_compile() returns NULL,
       and sets the variable pointed to by errptr to point to a textual error
       message. This is a static string that is part of the library. You must
       not try to free it. The offset from the start of the  pattern  to  the
       character  where	 the  error  was discovered is placed in the variable
       pointed to by erroffset, which must not be NULL. If it is, an  immedi-
       ate error is given.

       If  pcre_compile2()  is used instead of pcre_compile(), and the error-
       codeptr argument is not NULL, a non-zero error code number is returned
       via this argument in the event of an error. This is in addition to the
       textual error message. Error codes and messages are listed below.

       If the final argument, tableptr, is NULL, PCRE uses a default  set  of
       character  tables  that	are  built  when  PCRE is compiled, using the
       default C locale. Otherwise, tableptr must be an address that  is  the
       result  of  a call to pcre_maketables(). This value is stored with the
       compiled pattern, and used again by pcre_exec(), unless another	table
       pointer	is  passed  to	it.  For  more discussion, see the section on
       locale support below.

       This code fragment shows a typical straightforward call	to  pcre_com-
       pile():

	 pcre *re;
	 const char *error;
	 int erroffset;
	 re = pcre_compile(
	   "^A.*Z",	     /* the pattern */
	   0,		     /* default options */
	   &error,	     /* for error message */
	   &erroffset,	     /* for error offset */
	   NULL);	     /* use default character tables */
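
       When the call fails, the message and the offset can be shown together
       with the pattern. The helper below is one possible way to do that; it
       is a sketch, and format_compile_error is a hypothetical name that is
       not part of PCRE. It assumes a pattern that fits on one line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the message, the offset, the pattern, and a caret line into out.
 * Returns the number of characters written, or -1 if out is too small.
 */
int format_compile_error(char *out, size_t outsize, const char *pattern,
                         const char *message, int erroffset)
{
    size_t length;
    int written;

    assert(out != NULL && pattern != NULL && message != NULL);
    assert(erroffset >= 0);

    /* Keep the caret within the pattern even if the offset is unexpected. */
    length = strlen(pattern);
    if ((size_t)erroffset > length)
        erroffset = (int)length;

    /* "%*s" with an empty string prints erroffset spaces before the caret. */
    written = snprintf(out, outsize, "%s at offset %d\n%s\n%*s^\n",
                       message, erroffset, pattern, erroffset, "");
    if (written < 0 || (size_t)written >= outsize)
        return -1;
    return written;
}

/*
 * Intended use after the fragment above, with <pcre.h> included:
 *
 *     if (re == NULL) {
 *         char text[256];
 *         if (format_compile_error(text, sizeof text, "^A.*Z",
 *                                  error, erroffset) > 0)
 *             fputs(text, stderr);
 *     }
 */
```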

       The  following  names for option bits are defined in the pcre.h header
       file:

	 PCRE_ANCHORED

       If this bit is set, the pattern is forced to be "anchored",  that  is,
       it  is  constrained  to	match only at the first matching point in the
       string that is being searched (the "subject string"). This effect  can
       also  be	 achieved  by  appropriate  constructs in the pattern itself,
       which is the only way to do it in Perl.

	 PCRE_AUTO_CALLOUT

       If this bit  is	set,  pcre_compile()  automatically  inserts  callout
       items,  all  with number 255, before each pattern item. For discussion
       of the callout facility, see the pcrecallout documentation.

	 PCRE_BSR_ANYCRLF
	 PCRE_BSR_UNICODE

       These options (which are	 mutually  exclusive)  control	what  the  \R
       escape sequence matches. The choice is either to match only CR, LF, or
       CRLF, or to match any Unicode newline sequence. The default is  speci-
       fied when PCRE is built. It can be overridden from within the pattern,
       or by setting an option when a compiled pattern is matched.

	 PCRE_CASELESS

       If this bit is set, letters in the pattern match both upper and	lower
       case  letters.  It  is  equivalent  to Perl’s /i option, and it can be
       changed within a pattern by a (?i) option setting. In UTF-8 mode, PCRE
       always understands the concept of case for characters whose values are
       less than 128, so caseless matching is always possible. For characters
       with  higher  values, the concept of case is supported if PCRE is com-
       piled with Unicode property support, but not otherwise. If you want to
       use  caseless  matching	for characters 128 and above, you must ensure
       that PCRE is compiled with Unicode property support as  well  as	 with
       UTF-8 support.

	 PCRE_DOLLAR_ENDONLY

       If this bit is set, a dollar metacharacter in the pattern matches only
       at the end of the subject string. Without this option, a	 dollar	 also
       matches immediately before a newline at the end of the string (but not
       before any other newlines). The PCRE_DOLLAR_ENDONLY option is  ignored
       if  PCRE_MULTILINE  is  set.  There is no equivalent to this option in
       Perl, and no way to set it within a pattern.

	 PCRE_DOTALL

       If this bit is set, a dot metacharacter in the pattern matches all
       characters,  including  those that indicate newline. Without it, a dot
       does not match when the current position is at a newline. This  option
       is equivalent to Perl’s /s option, and it can be changed within a pat-
       tern by a (?s) option setting. A negative class such  as	 [^a]  always
       matches newline characters, independent of the setting of this option.

	 PCRE_DUPNAMES

       If this bit is set, names used to identify capturing subpatterns	 need
       not  be	unique. This can be helpful for certain types of pattern when
       it is known that only one instance of the named subpattern can ever be
       matched.	 There	are more details of named subpatterns below; see also
       the pcrepattern documentation.

	 PCRE_EXTENDED

       If this bit is set, whitespace data  characters	in  the	 pattern  are
       totally	ignored	 except	 when  escaped	or  inside a character class.
       Whitespace does not include the VT character (code 11).	In  addition,
       characters  between  an	unescaped # outside a character class and the
       next newline, inclusive, are  also  ignored.  This  is  equivalent  to
       Perl’s  /x  option,  and	 it can be changed within a pattern by a (?x)
       option setting.

       This option makes it possible to include comments  inside  complicated
       patterns.   Note,  however, that this applies only to data characters.
       Whitespace  characters  may  never  appear  within  special  character
       sequences  in  a	 pattern,  for	example within the sequence (?( which
       introduces a conditional subpattern.

	 PCRE_EXTRA

       This option was invented in order to turn on additional	functionality
       of  PCRE	 that  is incompatible with Perl, but it is currently of very
       little use. When set, any backslash in a pattern that is followed by a
       letter  that  has  no  special meaning causes an error, thus reserving
       these combinations for future expansion. By default,  as	 in  Perl,  a
       backslash followed by a letter with no special meaning is treated as a
       literal. (Perl can, however, be persuaded to give a warning for this.)
       There  are  at present no other features controlled by this option. It
       can also be set by a (?X) option setting within a pattern.

	 PCRE_FIRSTLINE

       If this option is set, an unanchored  pattern  is  required  to	match
       before  or  at  the  first  newline  in the subject string, though the
       matched text may continue over the newline.

	 PCRE_MULTILINE

       By default, PCRE treats the subject string as consisting of  a  single
       line of characters (even if it actually contains newlines). The "start
       of line" metacharacter (^) matches only at the start  of	 the  string,
       while  the  "end of line" metacharacter ($) matches only at the end of
       the  string,  or	 before	 a  terminating	 newline  (unless   PCRE_DOL-
       LAR_ENDONLY is set). This is the same as Perl.

       When PCRE_MULTILINE is set, the "start of line" and "end of line"
       constructs match immediately following or immediately before  internal
       newlines	 in  the subject string, respectively, as well as at the very
       start and end. This is equivalent to Perl’s /m option, and it  can  be
       changed	within	a  pattern  by a (?m) option setting. If there are no
       newlines in a subject string, or no occurrences of ^ or $  in  a	 pat-
       tern, setting PCRE_MULTILINE has no effect.

	 PCRE_NEWLINE_CR
	 PCRE_NEWLINE_LF
	 PCRE_NEWLINE_CRLF
	 PCRE_NEWLINE_ANYCRLF
	 PCRE_NEWLINE_ANY

       These  options override the default newline definition that was chosen
       when PCRE was built. Setting the first or the second specifies that  a
       newline	is  indicated by a single character (CR or LF, respectively).
       Setting PCRE_NEWLINE_CRLF specifies that a newline is indicated by the
       two-character  CRLF  sequence.  Setting PCRE_NEWLINE_ANYCRLF specifies
       that any of the three preceding sequences should be  recognized.	 Set-
       ting  PCRE_NEWLINE_ANY  specifies  that	any  Unicode newline sequence
       should be recognized. The Unicode newline sequences are the three just
       mentioned,  plus	 the  single characters VT (vertical tab, U+000B), FF
       (formfeed, U+000C), NEL	(next  line,  U+0085),	LS  (line  separator,
       U+2028), and PS (paragraph separator, U+2029). The last two are recog-
       nized only in UTF-8 mode.

       The newline setting in the options  word	 uses  three  bits  that  are
       treated	as  a  number, giving eight possibilities. Currently only six
       are used (default plus the five values above). This means that if  you
       set  more  than	one newline option, the combination may or may not be
       sensible. For example, PCRE_NEWLINE_CR with PCRE_NEWLINE_LF is equiva-
       lent  to	 PCRE_NEWLINE_CRLF,  but  other combinations may yield unused
       numbers and cause an error.

       The only time that a line break is specially recognized when compiling
       a  pattern  is  if  PCRE_EXTENDED is set, and an unescaped # outside a
       character class is encountered. This indicates a	 comment  that	lasts
       until after the next line break sequence. In other circumstances, line
       break  sequences	 are  treated  as  literal  data,  except   that   in
       PCRE_EXTENDED  mode,  both CR and LF are treated as whitespace charac-
       ters and are therefore ignored.

       The newline option that is set at compile  time	becomes	 the  default
       that  is used for pcre_exec() and pcre_dfa_exec(), but it can be over-
       ridden.
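
       The remark about combining PCRE_NEWLINE_xxx options can be modelled as
       follows. The field position and the numbering used here are
       assumptions made for the illustration, and the DEMO_ names are
       hypothetical; real code should use the PCRE_NEWLINE_xxx macros from
       pcre.h and set at most one of them.

```c
#include <assert.h>

/* Illustrative three-bit field; not a substitute for the pcre.h macros. */
#define DEMO_NEWLINE_SHIFT   20
#define DEMO_NEWLINE_MASK    (7u << DEMO_NEWLINE_SHIFT)
#define DEMO_NEWLINE_CR      (1u << DEMO_NEWLINE_SHIFT)
#define DEMO_NEWLINE_LF      (2u << DEMO_NEWLINE_SHIFT)
#define DEMO_NEWLINE_CRLF    (3u << DEMO_NEWLINE_SHIFT)
#define DEMO_NEWLINE_ANY     (4u << DEMO_NEWLINE_SHIFT)
#define DEMO_NEWLINE_ANYCRLF (5u << DEMO_NEWLINE_SHIFT)

/* Extracts the newline field as a number in the range 0 to 7. */
unsigned newline_field(unsigned options)
{
    return (options & DEMO_NEWLINE_MASK) >> DEMO_NEWLINE_SHIFT;
}

/* 0 is the default and 1 to 5 are the five conventions; 6 and 7 are unused. */
int newline_field_is_valid(unsigned options)
{
    return newline_field(options) <= 5;
}

/* Returns a printable name for a field number. */
const char *newline_field_name(unsigned field)
{
    static const char *const names[8] = {
        "default", "CR", "LF", "CRLF", "ANY", "ANYCRLF", "unused", "unused"
    };
    assert(field < 8);
    return names[field];
}
```

       In this model CR combined with LF gives the number for CRLF, LF
       combined with ANY gives an unused number, and CR combined with ANY
       gives the number for ANYCRLF, which is accepted but is unlikely to be
       what the caller intended.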

	 PCRE_NO_AUTO_CAPTURE

       If this option is set, it  disables  the	 use  of  numbered  capturing
       parentheses  in	the pattern. Any opening parenthesis that is not fol-
       lowed by ? behaves as if it were followed by ?: but named  parentheses
       can still be used for capturing (and they acquire numbers in the usual
       way). There is no equivalent of this option in Perl.

	 PCRE_UNGREEDY

       This option inverts the "greediness" of the quantifiers so  that	 they
       are not greedy by default, but become greedy if followed by "?". It is
       not compatible with Perl. It can also be set by a (?U) option  setting
       within the pattern.

	 PCRE_UTF8

       This  option causes PCRE to regard both the pattern and the subject as
       strings of UTF-8 characters instead of single-byte character  strings.
       However, it is available only when PCRE is built to include UTF-8 sup-
       port. If not, the use of this option provokes an error. Details of how
       this  option changes the behaviour of PCRE are given in the section on
       UTF-8 support in the main pcre page.

	 PCRE_NO_UTF8_CHECK

       When PCRE_UTF8 is set, the validity of the pattern as a	UTF-8  string
       is  automatically checked. There is a discussion about the validity of
       UTF-8 strings in the main pcre page. If an invalid UTF-8	 sequence  of
       bytes  is  found, pcre_compile() returns an error. If you already know
       that your pattern is valid, and you want to skip this check  for	 per-
       formance	 reasons,  you can set the PCRE_NO_UTF8_CHECK option. When it
       is set, the effect of passing an invalid UTF-8 string as a pattern  is
       undefined.  It  may cause your program to crash. Note that this option
       can also be passed to pcre_exec() and pcre_dfa_exec(), to suppress the
       UTF-8 validity checking of subject strings.

COMPILATION ERROR CODES

       The following table lists the error codes that may be returned by
       pcre_compile2(), along with the error messages that may be returned by
       both compiling functions. As PCRE has developed, some error codes have
       fallen out of use. To avoid confusion, they have not been re-used.

	  0  no error
	  1  \ at end of pattern
	  2  \c at end of pattern
	  3  unrecognized character follows \
	  4  numbers out of order in {} quantifier
	  5  number too big in {} quantifier
	  6  missing terminating ] for character class
	  7  invalid escape sequence in character class
	  8  range out of order in character class
	  9  nothing to repeat
	 10  [this code is not in use]
	 11  internal error: unexpected repeat
	 12  unrecognized character after (?
	 13  POSIX named classes are supported only within a class
	 14  missing )
	 15  reference to non-existent subpattern
	 16  erroffset passed as NULL
	 17  unknown option bit(s) set
	 18  missing ) after comment
	 19  [this code is not in use]
	 20  regular expression too large
	 21  failed to get memory
	 22  unmatched parentheses
	 23  internal error: code overflow
	 24  unrecognized character after (?<
	 25  lookbehind assertion is not fixed length
	 26  malformed number or name after (?(
	 27  conditional group contains more than two branches
	 28  assertion expected after (?(
	 29  (?R or (?[+-]digits must be followed by )
	 30  unknown POSIX class name
	 31  POSIX collating elements are not supported
	 32  this version of PCRE is not compiled with PCRE_UTF8 support
	 33  [this code is not in use]
	 34  character value in \x{...} sequence is too large
	 35  invalid condition (?(0)
	 36  \C not allowed in lookbehind assertion
	 37  PCRE does not support \L, \l, \N, \U, or \u
	 38  number after (?C is > 255
	 39  closing ) for (?C expected
	 40  recursive call could loop indefinitely
	 41  unrecognized character after (?P
	 42  syntax error in subpattern name (missing terminator)
	 43  two named subpatterns have the same name
	 44  invalid UTF-8 string
	 45  support for \P, \p, and \X has not been compiled
	 46  malformed \P or \p sequence
	 47  unknown property name after \P or \p
	 48  subpattern name is too long (maximum 32 characters)
	 49  too many named subpatterns (maximum 10,000)
	 50  [this code is not in use]
	 51  octal value is greater than \377 (not in UTF-8 mode)
	 52  internal error: overran compiling workspace
	 53  internal error: previously-checked referenced subpattern not
	       found
	 54  DEFINE group contains more than one branch
	 55  repeating a DEFINE group is not allowed
	 56  inconsistent NEWLINE options
	 57  \g is not followed by a braced name or an optionally braced
	       non-zero number
	 58  (?+ or (?- or (?(+ or (?(- must be followed by a non-zero number

STUDYING A PATTERN

       pcre_extra *pcre_study(const pcre *code, int options,
	    const char **errptr);

       If a compiled pattern is going to be used several times, it  is	worth
       spending	 more  time  analyzing it in order to speed up the time taken
       for matching. The function pcre_study() takes a pointer to a  compiled
       pattern	as its first argument. If studying the pattern produces addi-
       tional information that will  help  speed  up  matching,	 pcre_study()
       returns a pointer to a pcre_extra block, in which the study_data field
       points to the results of the study.

       The returned  value  from  pcre_study()	can  be	 passed	 directly  to
       pcre_exec().  However,  a  pcre_extra block also contains other fields
       that can be set by the caller before the block is  passed;  these  are
       described below in the section on matching a pattern.

       If studying the pattern does not produce any additional information,
       pcre_study() returns NULL. In that circumstance, if the	calling	 pro-
       gram wants to pass any of the other fields to pcre_exec(), it must set
       up its own pcre_extra block.

       The second argument of pcre_study() contains option bits. At  present,
       no options are defined, and this argument should always be zero.

       The third argument for pcre_study() is a pointer for an error message.
       If studying succeeds (even if no data is returned),  the	 variable  it
       points  to  is  set to NULL. Otherwise it is set to point to a textual
       error message. This is a static string that is part  of	the  library.
       You  must  not  try  to free it. You should test the error pointer for
       NULL after calling pcre_study(), to be sure that it has	run  success-
       fully.

       This is a typical call to pcre_study():

	 const char *error;
	 pcre_extra *pe;
	 pe = pcre_study(
	   re,		   /* result of pcre_compile() */
	   0,		   /* no options exist */
	   &error);	   /* set to NULL or points to a message */

       At  present,  studying  a pattern is useful only for non-anchored pat-
       terns that do not have a single fixed starting character. A bitmap  of
       possible starting bytes is created.

LOCALE SUPPORT

       PCRE  handles caseless matching, and determines whether characters are
       letters, digits, or whatever, by reference to a set of tables, indexed
       by  character  value. When running in UTF-8 mode, this applies only to
       characters with codes less than 128. Higher-valued codes	 never	match
       escapes	such  as \w or \d, but can be tested with \p if PCRE is built
       with Unicode character property support. The use of locales with	 Uni-
       code is discouraged. If you are handling characters with codes greater
       than 128, you should either use UTF-8 and Unicode, or use locales, but
       not try to mix the two.

       PCRE  contains  an internal set of tables that are used when the final
       argument of pcre_compile() is NULL.  These  are	sufficient  for	 many
       applications.   Normally,  the  internal	 tables	 recognize only ASCII
       characters. However, when PCRE is built, it is possible to  cause  the
       internal	 tables	 to be rebuilt in the default "C" locale of the local
       system, which may cause them to be different.

       The internal tables can always be overridden by tables supplied by the
       application  that  calls	 PCRE.	These  may  be created in a different
       locale from the default. As more and more applications change to using
       Unicode, the need for this locale support is expected to die away.

       External	 tables	 are built by calling the pcre_maketables() function,
       which has no arguments, in the relevant locale. The result can then be
       passed  to  pcre_compile()  or  pcre_exec() as often as necessary. For
       example, to build and use tables that are appropriate for  the  French
       locale  (where  accented	 characters  with values greater than 128 are
       treated as letters), the following code could be used:

	 setlocale(LC_CTYPE, "fr_FR");
	 tables = pcre_maketables();
	 re = pcre_compile(..., tables);

       The locale name "fr_FR" is used on Linux and other Unix-like  systems;
       if  you are using Windows, the name for the French locale is "french".

       When pcre_maketables() runs, the tables are built in  memory  that  is
       obtained	 via pcre_malloc. It is the caller’s responsibility to ensure
       that the memory containing the tables remains available for as long as
       it is needed.

       The  pointer  that  is passed to pcre_compile() is saved with the com-
       piled pattern, and the same  tables  are	 used  via  this  pointer  by
       pcre_study()  and  normally also by pcre_exec(). Thus, by default, for
       any single pattern, compilation, studying and matching all  happen  in
       the  same  locale, but different patterns can be compiled in different
       locales.

       It is possible to pass a table pointer or NULL (indicating the use  of
       the  internal  tables)  to pcre_exec(). Although not intended for this
       purpose, this facility could be used to match a pattern in a different
       locale  from  the one in which it was compiled. Passing table pointers
       at run time is discussed below in the section on matching a pattern.

INFORMATION ABOUT A PATTERN

       int pcre_fullinfo(const pcre *code, const pcre_extra *extra,
	    int what, void *where);

       The pcre_fullinfo() function returns information about a compiled
       pattern. It replaces the obsolete pcre_info() function, which is
       nevertheless retained for backwards compatibility (and is documented
       below).

       The  first  argument  for pcre_fullinfo() is a pointer to the compiled
       pattern. The second argument is the result of pcre_study(), or NULL if
       the  pattern was not studied. The third argument specifies which piece
       of information is required, and the fourth argument is a pointer to  a
       variable	 to  receive  the data. The yield of the function is zero for
       success, or one of the following negative numbers:

	 PCRE_ERROR_NULL       the argument code was NULL
			       the argument where was NULL
	 PCRE_ERROR_BADMAGIC   the "magic number" was not found
	 PCRE_ERROR_BADOPTION  the value of what was invalid

       The "magic number" is placed at the start of each compiled pattern as
       a simple check against passing an arbitrary memory pointer. Here is a
       typical call of pcre_fullinfo(), to obtain the length of the compiled
       pattern:

	 int rc;
	 size_t length;
	 rc = pcre_fullinfo(
	   re,		     /* result of pcre_compile() */
	   pe,		     /* result of pcre_study(), or NULL */
	   PCRE_INFO_SIZE,   /* what is required */
	   &length);	     /* where to put the data */

       The  possible values for the third argument are defined in pcre.h, and
       are as follows:

	 PCRE_INFO_BACKREFMAX

       Return the number of the highest back reference in  the	pattern.  The
       fourth  argument	 should point to an int variable. Zero is returned if
       there are no back references.

	 PCRE_INFO_CAPTURECOUNT

       Return the number of capturing subpatterns in the pattern. The  fourth
       argument should point to an int variable.

	 PCRE_INFO_DEFAULT_TABLES

       Return a pointer to the internal default character tables within PCRE.
       The fourth argument should point to an unsigned char * variable.	 This
       information  call  is  provided	for  internal use by the pcre_study()
       function. External callers can cause PCRE to use its  internal  tables
       by passing a NULL table pointer.

	 PCRE_INFO_FIRSTBYTE

       Return  information  about the first byte of any matched string, for a
       non-anchored pattern. The fourth argument should point to an int vari-
       able. (This option used to be called PCRE_INFO_FIRSTCHAR; the old name
       is still recognized for backwards compatibility.)

       If there is a fixed first byte, for example, from a  pattern  such  as
       (cat|cow|coyote), its value is returned. Otherwise, if either

       (a) the pattern was compiled with the PCRE_MULTILINE option, and every
       branch starts with "^", or

       (b) every branch of the pattern starts with ".*"	 and  PCRE_DOTALL  is
       not set (if it were set, the pattern would be anchored),

       -1  is returned, indicating that the pattern matches only at the start
       of a subject string or after any newline within the string.  Otherwise
       -2 is returned. For anchored patterns, -2 is returned.

	 PCRE_INFO_FIRSTTABLE

       If the pattern was studied, and this resulted in the construction of a
       256-bit table indicating a fixed set of bytes for the  first  byte  in
       any  matching  string,  a  pointer to the table is returned. Otherwise
       NULL is returned. The fourth argument should point to an unsigned char
       * variable.
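
       As an illustration of how such a table might be consulted, the sketch
       below tests whether a byte value is in the set. The bit layout (bit
       c&7 of byte c/8) is an assumption about the table's representation
       that is not stated above, and first_byte_allowed() is a hypothetical
       helper, not a PCRE function.

```c
/* Test whether byte value c is marked in a 256-bit (32-byte) table.
   ASSUMPTION: bit (c & 7) of byte (c / 8) stands for value c; the text
   above does not specify the layout. Hypothetical helper, not PCRE API. */
int first_byte_allowed(const unsigned char *table, unsigned char c)
{
    return (table[c / 8] & (1 << (c & 7))) != 0;
}
```

       A caller would pass the pointer obtained with PCRE_INFO_FIRSTTABLE,
       after checking that it is not NULL.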

	 PCRE_INFO_HASCRORLF

       Return  1  if  the  pattern contains any explicit matches for CR or LF
       characters, otherwise 0. The fourth argument should point  to  an  int
       variable. An explicit match is either a literal CR or LF character, or
       \r or \n.

	 PCRE_INFO_JCHANGED

       Return 1 if the (?J) option setting is used in the pattern,  otherwise
       0.  The	fourth	argument  should  point	 to an int variable. The (?J)
       internal option setting changes the local PCRE_DUPNAMES option.

	 PCRE_INFO_LASTLITERAL

       Return the value of the rightmost literal byte that must exist in  any
       matched	string,	 other	than  at  its  start, if such a byte has been
       recorded. The fourth argument should point  to  an  int	variable.  If
       there  is  no such byte, -1 is returned. For anchored patterns, a last
       literal byte is recorded only if	 it  follows  something	 of  variable
       length. For example, for the pattern /^a\d+z\d+/ the returned value is
       "z", but for /^a\dz\d/ the returned value is -1.

	 PCRE_INFO_NAMECOUNT
	 PCRE_INFO_NAMEENTRYSIZE
	 PCRE_INFO_NAMETABLE

       PCRE  supports  the  use	 of  named  as	well  as  numbered  capturing
       parentheses.  The  names are just an additional way of identifying the
       parentheses, which still acquire numbers.  Several  convenience	func-
       tions  such  as pcre_get_named_substring() are provided for extracting
       captured substrings by name. It is also possible to extract  the	 data
       directly,  by first converting the name to a number in order to access
       the correct pointers in the output vector (described with  pcre_exec()
       below).	To do the conversion, you need to use the name-to-number map,
       which is described by these three values.

       The map consists of a number of	fixed-size  entries.  PCRE_INFO_NAME-
       COUNT  gives  the number of entries, and PCRE_INFO_NAMEENTRYSIZE gives
       the size of each entry; both of these return an int value.  The	entry
       size  depends  on  the length of the longest name. PCRE_INFO_NAMETABLE
       returns a pointer to the first entry of the table (a pointer to char).
       The  first  two	bytes  of  each entry are the number of the capturing
       parenthesis, most significant byte first. The rest of the entry is the
       corresponding  name,  zero  terminated.	The names are in alphabetical
       order. When PCRE_DUPNAMES is set, duplicate  names  are	in  order  of
       their parentheses numbers. For example, consider the following pattern
       (assume PCRE_EXTENDED is set, so white space - including newlines - is
       ignored):

	 (?<date> (?<year>(\d\d)?\d\d) -
	 (?<month>\d\d) - (?<day>\d\d) )

       There are four named subpatterns, so the table has four entries, and
       each entry in the table is eight bytes long. The table is as follows,
       with non-printing bytes shown in hexadecimal, and undefined bytes
       shown as ??:

	 00 01 d  a  t	e  00 ??
	 00 05 d  a  y	00 ?? ??
	 00 04 m  o  n	t  h  00
	 00 02 y  e  a	r  00 ??

       When writing code to extract data from  named  subpatterns  using  the
       name-to-number  map, remember that the length of the entries is likely
       to be different for each compiled pattern.
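
       The layout just described can be walked with a few lines of C. The
       following is a minimal sketch, assuming the table pointer, entry count
       and entry size have already been obtained with PCRE_INFO_NAMETABLE,
       PCRE_INFO_NAMECOUNT and PCRE_INFO_NAMEENTRYSIZE. The function name
       find_group_number() is hypothetical and is not part of PCRE.

```c
#include <string.h>

/* Look up a subpattern name in a name-to-number table: `count` entries of
   `entry_size` bytes each. The first two bytes of an entry hold the
   parenthesis number, most significant byte first; the rest of the entry
   is the zero-terminated name.
   Returns the number, or -1 if the name is not in the table.
   Hypothetical helper for illustration, not a PCRE function. */
int find_group_number(const char *table, int count, int entry_size,
                      const char *name)
{
    int i;
    for (i = 0; i < count; i++) {
        const unsigned char *entry =
            (const unsigned char *)table + (size_t)i * entry_size;
        if (strcmp((const char *)entry + 2, name) == 0)
            return (entry[0] << 8) | entry[1];
    }
    return -1;
}
```

       A linear scan keeps the sketch short. With PCRE_DUPNAMES set it
       returns the first (lowest-numbered) entry for a duplicated name.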

	 PCRE_INFO_OKPARTIAL

       Return 1 if the pattern can be used for partial matching, otherwise 0.
       The  fourth  argument should point to an int variable. The pcrepartial
       documentation lists the restrictions that apply to patterns when	 par-
       tial matching is used.

	 PCRE_INFO_OPTIONS

       Return  a copy of the options with which the pattern was compiled. The
       fourth argument should point to an unsigned long int  variable.	These
       option  bits  are those specified in the call to pcre_compile(), modi-
       fied by any top-level option settings at	 the  start  of	 the  pattern
       itself.	In  other  words,  they are the options that will be in force
       when matching starts. For example, if the pattern /(?im)abc(?-i)d/  is
       compiled	 with  the PCRE_EXTENDED option, the result is PCRE_CASELESS,
       PCRE_MULTILINE, and PCRE_EXTENDED.

       A pattern is automatically anchored by PCRE if all  of  its  top-level
       alternatives begin with one of the following:

	 ^     unless PCRE_MULTILINE is set
	 \A    always
	 \G    always
	 .*    if PCRE_DOTALL is set and there are no back
		 references to the subpattern in which .* appears

       For  such  patterns,  the  PCRE_ANCHORED	 bit  is  set  in the options
       returned by pcre_fullinfo().

	 PCRE_INFO_SIZE

       Return the size of the compiled pattern, that is, the value  that  was
       passed  as  the argument to pcre_malloc() when PCRE was getting memory
       in which to place the compiled data. The fourth argument should	point
       to a size_t variable.

	 PCRE_INFO_STUDYSIZE

       Return  the  size of the data block pointed to by the study_data field
       in a pcre_extra block. That is, it is the value	that  was  passed  to
       pcre_malloc()  when  PCRE  was  getting memory into which to place the
       data created by pcre_study(). The fourth argument should	 point	to  a
       size_t variable.

OBSOLETE INFO FUNCTION

       int pcre_info(const pcre *code, int *optptr, int *firstcharptr);

       The  pcre_info() function is now obsolete because its interface is too
       restrictive to return all the available data about a compiled pattern.
       New   programs  should  use  pcre_fullinfo()  instead.  The  yield  of
       pcre_info() is the number of capturing subpatterns, or one of the fol-
       lowing negative numbers:

	 PCRE_ERROR_NULL       the argument code was NULL
	 PCRE_ERROR_BADMAGIC   the "magic number" was not found

       If  the	optptr argument is not NULL, a copy of the options with which
       the pattern was compiled is placed in the integer it  points  to	 (see
       PCRE_INFO_OPTIONS above).

       If  the	pattern	 is not anchored and the firstcharptr argument is not
       NULL, it is used to pass back information about the first character of
       any matched string (see PCRE_INFO_FIRSTBYTE above).

REFERENCE COUNTS

       int pcre_refcount(pcre *code, int adjust);

       The  pcre_refcount() function is used to maintain a reference count in
       the data block that contains a compiled pattern. It  is	provided  for
       the benefit of applications that operate in an object-oriented manner,
       where different parts of the application may be using  the  same	 com-
       piled  pattern, but you want to free the block when they are all done.

       When a pattern is compiled, the reference count field  is  initialized
       to zero.	 It is changed only by calling this function, whose action is
       to add the adjust value (which may be positive or negative) to it. The
       yield  of  the  function	 is  the new value. However, the value of the
       count is constrained to lie between 0 and 65535, inclusive. If the new
       value  is  outside these limits, it is forced to the appropriate limit
       value.
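
       The clamping arithmetic can be modelled as below. This is a sketch of
       the behaviour described above, not PCRE's source code, and
       adjust_refcount() is a hypothetical name. It assumes the current count
       is already within 0 to 65535.

```c
/* Model of the reference-count update: add `adjust` to `current` and
   force the result into the range 0..65535. Written so that the
   addition cannot overflow an int.
   Hypothetical model for illustration, not a PCRE function. */
int adjust_refcount(int current, int adjust)
{
    if (adjust > 65535 - current)
        return 65535;             /* forced to the upper limit */
    if (adjust < -current)
        return 0;                 /* forced to the lower limit */
    return current + adjust;
}
```

       One consequence is that an unbalanced decrement leaves the count at
       zero rather than making it negative.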

       Except when it is zero, the reference count is not correctly preserved
       if  a  pattern  is compiled on one host and then transferred to a host
       whose byte-order is different. (This  seems  a  highly  unlikely	 sce-
       nario.)

MATCHING A PATTERN: THE TRADITIONAL FUNCTION

       int pcre_exec(const pcre *code, const pcre_extra *extra,
	    const char *subject, int length, int startoffset,
	    int options, int *ovector, int ovecsize);

       The function pcre_exec() is called to match a subject string against a
       compiled pattern, which is passed in the code argument. If the pattern
       has  been  studied,  the	 result	 of the study should be passed in the
       extra argument. This function is the main  matching  facility  of  the
       library,	 and  it  operates  in a Perl-like manner. For specialist use
       there is also an alternative matching  function,	 which	is  described
       below in the section about the pcre_dfa_exec() function.

       In most applications, the pattern will have been compiled (and option-
       ally studied) in the same process that calls pcre_exec(). However,  it
       is  possible  to	 save  compiled patterns and study data, and then use
       them later in different processes, possibly even on  different  hosts.
       For a discussion about this, see the pcreprecompile documentation.

       Here is an example of a simple call to pcre_exec():

	 int rc;
	 int ovector[30];
	 rc = pcre_exec(
	   re,		   /* result of pcre_compile() */
	   NULL,	   /* we didn’t study the pattern */
	   "some string",  /* the subject string */
	   11,		   /* the length of the subject string */
	   0,		   /* start at offset 0 in the subject */
	   0,		   /* default options */
	   ovector,	   /* vector of integers for substring information */
	   30);		   /* number of elements (NOT size in bytes) */

   Extra data for pcre_exec()

       If the extra argument is not NULL, it must point to a pcre_extra	 data
       block. The pcre_study() function returns such a block (when it doesn’t
       return NULL), but you can also create one for yourself, and pass addi-
       tional  information in it. The pcre_extra block contains the following
       fields (not necessarily in this order):

	 unsigned long int flags;
	 void *study_data;
	 unsigned long int match_limit;
	 unsigned long int match_limit_recursion;
	 void *callout_data;
	 const unsigned char *tables;

       The flags field is a bitmap that specifies which of the	other  fields
       are set. The flag bits are:

	 PCRE_EXTRA_STUDY_DATA
	 PCRE_EXTRA_MATCH_LIMIT
	 PCRE_EXTRA_MATCH_LIMIT_RECURSION
	 PCRE_EXTRA_CALLOUT_DATA
	 PCRE_EXTRA_TABLES

       Other  flag bits should be set to zero. The study_data field is set in
       the pcre_extra block that is returned by pcre_study(),  together	 with
       the  appropriate	 flag  bit. You should not set this yourself, but you
       may add to the block by setting the other fields and their correspond-
       ing flag bits.

       The  match_limit	 field provides a means of preventing PCRE from using
       up a vast amount of resources when running patterns that are not going
       to match, but which have a very large number of possibilities in their
       search trees. The classic example  is  the  use	of  nested  unlimited
       repeats.

       Internally, PCRE uses a function called match() which it calls repeat-
       edly (sometimes recursively). The limit set by match_limit is  imposed
       on  the	number of times this function is called during a match, which
       has the effect of limiting the amount of backtracking  that  can	 take
       place.  For  patterns  that  are not anchored, the count restarts from
       zero for each position in the subject string.

       The default value for the limit can be set when PCRE is built; if it
       is not, the built-in default is 10 million, which handles all but the
       most extreme cases. You can override the default by supplying
       pcre_exec() with a pcre_extra block in which match_limit is set, and
       PCRE_EXTRA_MATCH_LIMIT is set in the flags field. If the limit is
       exceeded, pcre_exec() returns PCRE_ERROR_MATCHLIMIT.

       The match_limit_recursion field is similar to match_limit, but instead
       of limiting the total number of times that match() is called, it	 lim-
       its  the	 depth	of recursion. The recursion depth is a smaller number
       than the total number of calls, because not all calls to	 match()  are
       recursive.   This  limit	 is  of	 use  only  if it is set smaller than
       match_limit.

       Limiting the recursion depth limits the amount of stack	that  can  be
       used,  or,  when	 PCRE  has  been  compiled  to use memory on the heap
       instead of the stack, the amount of heap memory that can be used.

       The default value for match_limit_recursion can be set when PCRE is
       built; if it is not, the built-in default is the same value as the
       default for match_limit. You can override the default by supplying
       pcre_exec() with a pcre_extra block in which match_limit_recursion is
       set, and PCRE_EXTRA_MATCH_LIMIT_RECURSION is set in the flags field.
       If the limit is exceeded, pcre_exec() returns
       PCRE_ERROR_RECURSIONLIMIT.

       The callout_data field is used in conjunction with the "callout"
       feature, which is described in the pcrecallout documentation.

       The tables field is  used  to  pass  a  character  tables  pointer  to
       pcre_exec(); this overrides the value that is stored with the compiled
       pattern. A non-NULL value is stored with the compiled pattern only  if
       custom  tables  were supplied to pcre_compile() via its tableptr argu-
       ment.  If NULL is passed	 to  pcre_exec()  using	 this  mechanism,  it
       forces  PCRE’s  internal	 tables	 to be used. This facility is helpful
       when re-using patterns that have been saved after  compiling  with  an
       external set of tables, because the external tables might be at a dif-
       ferent address when pcre_exec() is called. See the pcreprecompile doc-
       umentation for a discussion of saving compiled patterns for later use.

   Option bits for pcre_exec()

       The unused bits of the options argument for pcre_exec() must be zero.
       The only bits that may be set are PCRE_ANCHORED, PCRE_BSR_xxx,
       PCRE_NEWLINE_xxx, PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY,
       PCRE_NO_UTF8_CHECK and PCRE_PARTIAL.

	 PCRE_ANCHORED

       The PCRE_ANCHORED option limits pcre_exec() to matching at the first
       matching position. If a pattern was compiled with PCRE_ANCHORED, or
       turned out to be anchored by virtue of its contents, it cannot be made
       unanchored at matching time.

	 PCRE_BSR_ANYCRLF
	 PCRE_BSR_UNICODE

       These options (which are	 mutually  exclusive)  control	what  the  \R
       escape sequence matches. The choice is either to match only CR, LF, or
       CRLF, or to match any Unicode newline sequence. These options override
       the choice that was made or defaulted when the pattern was compiled.

	 PCRE_NEWLINE_CR
	 PCRE_NEWLINE_LF
	 PCRE_NEWLINE_CRLF
	 PCRE_NEWLINE_ANYCRLF
	 PCRE_NEWLINE_ANY

       These  options  override	 the  newline  definition  that was chosen or
       defaulted when the pattern was compiled. For details, see the descrip-
       tion  of	 pcre_compile()	 above.	 During	 matching, the newline choice
       affects the behaviour of the dot, circumflex, and  dollar  metacharac-
       ters. It may also alter the way the match position is advanced after a
       match failure for an unanchored pattern.

       When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or	 PCRE_NEWLINE_ANY  is
       set, and a match attempt for an unanchored pattern fails when the cur-
       rent position is at a CRLF  sequence,  and  the	pattern	 contains  no
       explicit	 matches  for  CR  or  LF  characters,	the match position is
       advanced by two characters instead of one, in other  words,  to	after
       the CRLF.

       The  above  rule is a compromise that makes the most common cases work
       as expected. For example, if the pattern is .+A (and  the  PCRE_DOTALL
       option  is  not	set),  it  does not match the string "\r\nA" because,
       after failing at the start, it skips both the CR	 and  the  LF  before
       retrying. However, the pattern [\r\n]A does match that string, because
       it contains an explicit CR or LF reference, and so  advances  only  by
       one character after the first failure.

       An explicit match for CR or LF is either a literal appearance of one
       of those characters, or one of the \r or \n escape sequences. Implicit
       matches such as [^X] do not count, nor does \s (which includes CR and
       LF in the characters that it matches).
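
       The advance rule can be modelled as follows. This is a sketch of the
       behaviour described above, not PCRE's source code. The function
       next_start_offset() and its two flag parameters are hypothetical, and
       offsets are counted in bytes, so UTF-8 mode is ignored.

```c
/* Where the next match attempt starts after a failure at `offset`.
   crlf_is_newline:   nonzero if the newline setting is CRLF, ANYCRLF or ANY.
   has_explicit_crlf: nonzero if the pattern contains an explicit match
                      for CR or LF (a literal character, \r or \n).
   Hypothetical model for illustration, not a PCRE function. */
int next_start_offset(const char *subject, int length, int offset,
                      int crlf_is_newline, int has_explicit_crlf)
{
    if (crlf_is_newline && !has_explicit_crlf &&
        offset + 1 < length &&
        subject[offset] == '\r' && subject[offset + 1] == '\n')
        return offset + 2;        /* skip the whole CRLF */
    return offset + 1;            /* ordinary single-character advance */
}
```

       For the subject "\r\nA", a pattern such as .+A gives a next offset of
       2, whereas [\r\n]A, which has an explicit CR or LF reference, gives 1.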

       Notwithstanding the above, anomalous effects may still occur when CRLF
       is  a  valid  newline sequence and explicit \r or \n escapes appear in
       the pattern.

	 PCRE_NOTBOL

       This option specifies that the first character of the subject string
       is not the beginning of a line, so the circumflex metacharacter should
       not match before it. Setting this without PCRE_MULTILINE (at compile
       time) causes circumflex never to match. This option affects only the
       behaviour of the circumflex metacharacter. It does not affect \A.

	 PCRE_NOTEOL

       This option specifies that the end of the subject string	 is  not  the
       end  of	a  line,  so the dollar metacharacter should not match it nor
       (except in multiline mode) a newline immediately	 before	 it.  Setting
       this  without  PCRE_MULTILINE (at compile time) causes dollar never to
       match. This option affects only the behaviour of the dollar  metachar-
       acter. It does not affect \Z or \z.

	 PCRE_NOTEMPTY

       An  empty  string is not considered to be a valid match if this option
       is set. If there are alternatives in the pattern, they are  tried.  If
       all  the	 alternatives match the empty string, the entire match fails.
       For example, if the pattern

	 a?b?

       is applied to a string not beginning with "a" or "b", it	 matches  the
       empty string at the start of the subject. With PCRE_NOTEMPTY set, this
       match is not valid, so PCRE  searches  further  into  the  string  for
       occurrences of "a" or "b".

       Perl  has  no  direct  equivalent of PCRE_NOTEMPTY, but it does make a
       special case of a pattern match of the empty string within its split()
       function,  and  when  using the /g modifier. It is possible to emulate
       Perl’s behaviour after matching a null  string  by  first  trying  the
       match  again  at the same offset with PCRE_NOTEMPTY and PCRE_ANCHORED,
       and then if that fails by advancing the starting	 offset	 (see  below)
       and  trying  an	ordinary  match again. There is some code that demon-
       strates how to do this in the pcredemo.c sample program.

	 PCRE_NO_UTF8_CHECK

       When PCRE_UTF8 is set at compile time, the validity of the subject  as
       a  UTF-8	 string	 is  automatically checked when pcre_exec() is subse-
       quently called.	The value of startoffset is also  checked  to  ensure
       that  it	 points to the start of a UTF-8 character. There is a discus-
       sion about the validity of UTF-8 strings in the section on UTF-8	 sup-
       port  in	 the main pcre page. If an invalid UTF-8 sequence of bytes is
       found, pcre_exec() returns the error PCRE_ERROR_BADUTF8. If  startoff-
       set  contains an invalid value, PCRE_ERROR_BADUTF8_OFFSET is returned.

       If you already know that your subject is valid, and you want  to	 skip
       these	checks	 for   performance   reasons,	you   can   set	  the
       PCRE_NO_UTF8_CHECK option when calling pcre_exec(). You might want  to
       do  this for the second and subsequent calls to pcre_exec() if you are
       making repeated calls to find all the  matches  in  a  single  subject
       string.	However,  you  should  be  sure that the value of startoffset
       points to the start of a UTF-8 character. When  PCRE_NO_UTF8_CHECK  is
       set,  the effect of passing an invalid UTF-8 string as a subject, or a
       value of startoffset that does not point to the start of a UTF-8 char-
       acter, is undefined. Your program may crash.
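
       Before setting PCRE_NO_UTF8_CHECK, a caller may want a cheap check
       that startoffset lands on a character boundary. The sketch below does
       only that: it does not validate the subject as a whole, and
       is_utf8_char_start() is a hypothetical helper, not a PCRE function.

```c
/* Nonzero if `offset` can be the start of a UTF-8 character in a subject
   of `length` bytes: it must be in range and must not point at a
   continuation byte (bit pattern 10xxxxxx). The end of the subject is
   accepted. This checks the offset only, not the validity of the string.
   Hypothetical helper for illustration, not a PCRE function. */
int is_utf8_char_start(const char *subject, int length, int offset)
{
    if (offset < 0 || offset > length)
        return 0;
    if (offset == length)
        return 1;
    return ((unsigned char)subject[offset] & 0xC0) != 0x80;
}
```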

	 PCRE_PARTIAL

       This  option  turns  on	the  partial matching feature. If the subject
       string fails to match the pattern, but at some point during the match-
       ing  process  the end of the subject was reached (that is, the subject
       partially matches the pattern and the failure to match  occurred	 only
       because there were not enough subject characters), pcre_exec() returns
       PCRE_ERROR_PARTIAL instead of PCRE_ERROR_NOMATCH. When PCRE_PARTIAL is
       used,  there are restrictions on what may appear in the pattern. These
       are discussed in the pcrepartial documentation.

   The string to be matched by pcre_exec()

       The subject string is passed to pcre_exec() as a pointer in subject, a
       length  in length, and a starting byte offset in startoffset. In UTF-8
       mode, the byte offset must point to the start of	 a  UTF-8  character.
       Unlike  the pattern string, the subject may contain binary zero bytes.
       When the starting offset is zero, the search for a match starts at the
       beginning of the subject, and this is by far the most common case.

       A  non-zero starting offset is useful when searching for another match
       in the same subject by calling pcre_exec() again after a previous suc-
       cess.   Setting startoffset differs from just passing over a shortened
       string and setting PCRE_NOTBOL in the case of a	pattern	 that  begins
       with any kind of lookbehind. For example, consider the pattern

	 \Biss\B

       which finds occurrences of "iss" in the middle of words. (\B matches
       only if the current position in the subject is not a word boundary.)
       When applied to the string "Mississippi" the first call to pcre_exec()
       finds the first occurrence. If pcre_exec() is called again with just
       the remainder of the subject, namely "issippi", it does not match,
       because \B is always false at the start of the subject, which is
       deemed to be a word boundary. However, if pcre_exec() is passed the
       entire string again, but with startoffset set to 4, it finds the
       second occurrence of "iss" because it is able to look behind the
       starting point to discover that it is preceded by a letter.
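
       The difference comes down to whether the character before the starting
       point is visible. The following sketch models the \B test on its own,
       without calling PCRE. It assumes ASCII word characters (letters,
       digits and underscore) as classified by isalnum() in the "C" locale,
       whereas PCRE consults its character tables. Both function names are
       hypothetical.

```c
#include <ctype.h>

/* 1 if position pos lies inside the subject and holds a word character
   (letter, digit or underscore), otherwise 0. */
static int is_word_char(const char *s, int length, int pos)
{
    return pos >= 0 && pos < length &&
           (isalnum((unsigned char)s[pos]) || s[pos] == '_');
}

/* 1 if `offset` is NOT a word boundary, that is, if \B would succeed
   there: the characters on either side are both word characters or both
   not. Anything outside the subject counts as a non-word character.
   Hypothetical model for illustration, not a PCRE function. */
int not_word_boundary(const char *subject, int length, int offset)
{
    return is_word_char(subject, length, offset - 1) ==
           is_word_char(subject, length, offset);
}
```

       Called with the whole subject and an offset of 4, the function can see
       the preceding letter and reports that \B holds. Called with the
       shortened subject and an offset of 0, there is nothing to look back
       at, so the position counts as a word boundary.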

       If a non-zero starting offset is passed when the pattern is  anchored,
       one  attempt  to match at the given offset is made. This can only suc-
       ceed if the pattern does not require the match to be at the  start  of
       the subject.

   How pcre_exec() returns captured substrings

       In general, a pattern matches a certain portion of the subject, and in
       addition, further substrings from the subject may  be  picked  out  by
       parts  of  the  pattern. Following the usage in Jeffrey Friedl’s book,
       this is called "capturing" in what follows, and the phrase  "capturing
       subpattern"  is used for a fragment of a pattern that picks out a sub-
       string. PCRE supports several other kinds of parenthesized  subpattern
       that do not cause substrings to be captured.

       Captured substrings are returned to the caller via a vector of integer
       offsets whose address is passed in ovector. The number of elements  in
       the vector is passed in ovecsize, which must be a non-negative number.
       Note: this argument is NOT the size of ovector in bytes.

       The first two-thirds of the vector is used to pass back captured	 sub-
       strings,	 each substring using a pair of integers. The remaining third
       of the vector is used as workspace by pcre_exec() while matching	 cap-
       turing subpatterns, and is not available for passing back information.
       The length passed in ovecsize should always be a multiple of three. If
       it is not, it is rounded down.

       When  a	match is successful, information about captured substrings is
       returned in pairs of integers, starting at the beginning	 of  ovector,
       and  continuing	up to two-thirds of its length at the most. The first
       element of a pair is set to the offset of the  first  character	in  a
       substring,  and the second is set to the offset of the first character
       after the end of a substring. The first	pair,  ovector[0]  and	ovec-
       tor[1],	identify  the  portion	of  the subject string matched by the
       entire pattern. The next pair is used for the first capturing  subpat-
       tern,  and  so  on. The value returned by pcre_exec() is one more than
       the highest numbered pair that has been set. For example, if two	 sub-
       strings	have  been captured, the returned value is 3. If there are no
       capturing subpatterns, the return value from a successful match is  1,
       indicating that just the first pair of offsets has been set.
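
       As a rough illustration of the pair layout described above, the  fol-
       lowing  sketch  locates  capture number n inside the subject by using
       the offsets directly, without copying. The helper name capture_view()
       is  invented  for  this example and is not part of PCRE; in real use,
       ovector and the return value rc would come from pcre_exec().

```c
#include <stddef.h>

/* Hypothetical helper (not part of PCRE): return a pointer to the start
 * of capture `n` inside `subject` and store its length in *len. No copy
 * is made, so the result is not zero-terminated. `rc` is the positive
 * value returned by pcre_exec(). Returns NULL if pair `n` was not set. */
const char *capture_view(const char *subject, const int *ovector, int rc,
                         int n, int *len)
{
    int start, end;

    if (n < 0 || n >= rc)
        return NULL;            /* beyond the highest pair that was set */
    start = ovector[2 * n];     /* offset of the first character */
    end = ovector[2 * n + 1];   /* offset just past the last character */
    if (start < 0)
        return NULL;            /* unset subpattern (-1, -1) */
    *len = end - start;
    return subject + start;
}
```

       The result can be printed with a  bounded  format  such  as  "%.*s",
       passing the length before the pointer.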

       If  a  capturing subpattern is matched repeatedly, it is the last por-
       tion of the string that it matched that is returned.

       If the vector is too small to hold all the captured substring offsets,
       it  is  used  as far as possible (up to two-thirds of its length), and
       the function returns a value of zero. In particular, if the  substring
       offsets	are  not  of interest, pcre_exec() may be called with ovector
       passed as NULL and ovecsize as zero. However, if the pattern  contains
       back  references	 and  the  ovector  is not big enough to remember the
       related substrings, PCRE has to get additional memory for  use  during
       matching. Thus it is usually advisable to supply an ovector.

       The  pcre_info()	 function  can be used to find out how many capturing
       subpatterns there are in a compiled pattern.  The  smallest  size  for
       ovector	that will allow for n captured substrings, in addition to the
       offsets of the substring matched by the whole pattern, is (n+1)*3.
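
       The sizing rule can be written down as a pair of small helpers.  This
       is  only  a  sketch;  the  function  names  are hypothetical and not
       provided by PCRE.

```c
/* Hypothetical helpers (not part of PCRE). */

/* Smallest ovecsize that can return `n` captured substrings plus the
 * overall match: (n+1) pairs, plus one third extra used as workspace. */
int ovecsize_for_captures(int n)
{
    return (n + 1) * 3;
}

/* Number of offset pairs that can be returned for a given ovecsize.
 * The size is rounded down to a multiple of three, and only the first
 * two-thirds of that is used for pairs, which works out as ovecsize/3. */
int usable_pairs(int ovecsize)
{
    return ovecsize / 3;
}
```

       For example, a pattern with two capturing subpatterns needs an ovector
       of at least 9 elements, and a vector of 10 elements still holds  only
       3 pairs.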

       It is possible for capturing subpattern number n+1 to match some	 part
       of  the	subject when subpattern n has not been used at all. For exam-
       ple, if the string "abc" is matched against  the	 pattern  (a|(z))(bc)
       the  return  from  the  function	 is  4,	 and  subpatterns 1 and 3 are
       matched, but 2 is not. When this happens, both values  in  the  offset
       pairs corresponding to unused subpatterns are set to -1.

       Offset  values that correspond to unused subpatterns at the end of the
       expression are also set to -1. For example, if  the  string  "abc"  is
       matched against the pattern (abc)(x(yz)?)? subpatterns 2 and 3 are not
       matched. The return from the function is 2, because the	highest	 used
       capturing  subpattern  number is 1. However, you can refer to the off-
       sets for the second  and	 third	capturing  subpatterns	if  you	 wish
       (assuming the vector is large enough, of course).
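
       A caller that needs to know whether a particular subpattern took part
       in the match can test its offset pair for -1. The following is a min-
       imal sketch; capture_is_set() is a hypothetical  helper,  not  a PCRE
       function.

```c
/* Hypothetical helper (not part of PCRE): report whether capturing
 * subpattern `n` took part in the match. `pairs` is the number of offset
 * pairs the vector can hold (ovecsize / 3). */
int capture_is_set(const int *ovector, int pairs, int n)
{
    if (n < 0 || n >= pairs)
        return 0;                   /* no room in the vector for pair n */
    return ovector[2 * n] >= 0;     /* unset pairs hold -1, -1 */
}
```

       With the (a|(z))(bc) example above and a 12-element vector, the  off-
       sets  are  0,3  0,1  -1,-1  1,3,  so  the helper reports subpatterns 1
       and 3 as set and subpattern 2 as unset.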

       Some  convenience  functions  are provided for extracting the captured
       substrings as separate strings. These are described below.

   Error return values from pcre_exec()

       If pcre_exec() fails, it returns a negative number. The following  are
       defined in the header file:

	 PCRE_ERROR_NOMATCH	   (-1)

       The subject string did not match the pattern.

	 PCRE_ERROR_NULL	   (-2)

       Either  code  or	 subject  was passed as NULL, or ovector was NULL and
       ovecsize was not zero.

	 PCRE_ERROR_BADOPTION	   (-3)

       An unrecognized bit was set in the options argument.

	 PCRE_ERROR_BADMAGIC	   (-4)

       PCRE stores a 4-byte "magic number" at the start of the compiled code,
       to  catch the case when it is passed a junk pointer and to detect when
       a pattern that was compiled in an environment of one endianness is run
       in  an  environment  with the other endianness. This is the error that
       PCRE gives when the magic number is not present.

	 PCRE_ERROR_UNKNOWN_OPCODE (-5)

       While running the pattern match, an unknown item	 was  encountered  in
       the  compiled  pattern. This error could be caused by a bug in PCRE or
       by overwriting of the compiled pattern.

	 PCRE_ERROR_NOMEMORY	   (-6)

       If a pattern contains back references, but the ovector that is  passed
       to  pcre_exec()	is  not	 big  enough  to remember the referenced sub-
       strings, PCRE gets a block of memory at the start of matching  to  use
       for  this  purpose. If the call via pcre_malloc() fails, this error is
       given. The memory is automatically freed at the end of matching.

	 PCRE_ERROR_NOSUBSTRING	   (-7)

       This error is used by the pcre_copy_substring(), pcre_get_substring(),
       and  pcre_get_substring_list()  functions  (see	below).	 It  is never
       returned by pcre_exec().

	 PCRE_ERROR_MATCHLIMIT	   (-8)

       The backtracking limit, as specified by the  match_limit  field  in  a
       pcre_extra structure (or defaulted), was reached. See the description
       above.

	 PCRE_ERROR_CALLOUT	   (-9)

       This error is never generated by pcre_exec() itself.  It	 is  provided
       for  use	 by  callout functions that want to yield a distinctive error
       code. See the pcrecallout documentation for details.

	 PCRE_ERROR_BADUTF8	   (-10)

       A string that contains an invalid UTF-8 byte sequence was passed as  a
       subject.

	 PCRE_ERROR_BADUTF8_OFFSET (-11)

       The  UTF-8  byte	 sequence that was passed as a subject was valid, but
       the value of startoffset did not point to the  beginning	 of  a	UTF-8
       character.

	 PCRE_ERROR_PARTIAL	   (-12)

       The  subject string did not match, but it did match partially. See the
       pcrepartial documentation for details of partial matching.

	 PCRE_ERROR_BADPARTIAL	   (-13)

       The PCRE_PARTIAL option was used with a	compiled  pattern  containing
       items that are not supported for partial matching. See the pcrepartial
       documentation for details of partial matching.

	 PCRE_ERROR_INTERNAL	   (-14)

       An unexpected internal error has occurred. This error could be  caused
       by a bug in PCRE or by overwriting of the compiled pattern.

	 PCRE_ERROR_BADCOUNT	   (-15)

       This error is given if the value of the ovecsize argument is negative.

	 PCRE_ERROR_RECURSIONLIMIT (-21)

       The internal recursion limit, as specified by  the  match_limit_recur-
       sion field in a pcre_extra structure (or defaulted), was reached.  See
       the description above.

	 PCRE_ERROR_BADNEWLINE	   (-23)

       An invalid combination of PCRE_NEWLINE_xxx options was given.

       Error numbers -16 to -20 and -22 are not used by pcre_exec().
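
       One way a caller might turn these codes into messages is  sketched
       below.  The function exec_error_text() and its message strings are
       invented for this example. The error names are normally supplied by
       <pcre.h>; a few of them are repeated here, with the values listed
       above, only so that the fragment compiles on its own.

```c
#include <stddef.h>

/* These values normally come from <pcre.h>; they are repeated here,
 * with the values listed above, only so the sketch compiles alone. */
#ifndef PCRE_ERROR_NOMATCH
#define PCRE_ERROR_NOMATCH         (-1)
#define PCRE_ERROR_NULL            (-2)
#define PCRE_ERROR_BADOPTION       (-3)
#define PCRE_ERROR_BADMAGIC        (-4)
#define PCRE_ERROR_NOMEMORY        (-6)
#define PCRE_ERROR_MATCHLIMIT      (-8)
#define PCRE_ERROR_BADUTF8         (-10)
#define PCRE_ERROR_PARTIAL         (-12)
#define PCRE_ERROR_BADCOUNT        (-15)
#define PCRE_ERROR_RECURSIONLIMIT  (-21)
#endif

/* Hypothetical helper (not part of PCRE): describe a pcre_exec() return
 * value. Returns NULL when rc is not an error (zero or positive). */
const char *exec_error_text(int rc)
{
    if (rc >= 0)
        return NULL;
    switch (rc) {
    case PCRE_ERROR_NOMATCH:        return "no match";
    case PCRE_ERROR_NULL:           return "NULL argument";
    case PCRE_ERROR_BADOPTION:      return "bad option bit";
    case PCRE_ERROR_BADMAGIC:       return "bad magic number";
    case PCRE_ERROR_NOMEMORY:       return "out of memory";
    case PCRE_ERROR_MATCHLIMIT:     return "match limit reached";
    case PCRE_ERROR_BADUTF8:        return "invalid UTF-8 in subject";
    case PCRE_ERROR_PARTIAL:        return "partial match";
    case PCRE_ERROR_BADCOUNT:       return "negative ovecsize";
    case PCRE_ERROR_RECURSIONLIMIT: return "recursion limit reached";
    default:                        return "other PCRE error";
    }
}
```

       Note that PCRE_ERROR_NOMATCH and PCRE_ERROR_PARTIAL are usually ordi-
       nary outcomes rather than failures, so a caller would typically handle
       them before reporting anything.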

EXTRACTING CAPTURED SUBSTRINGS BY NUMBER

       int pcre_copy_substring(const char *subject, int *ovector,
	    int stringcount, int stringnumber, char *buffer,
	    int buffersize);

       int pcre_get_substring(const char *subject, int *ovector,
	    int stringcount, int stringnumber,
	    const char **stringptr);

       int pcre_get_substring_list(const char *subject,
	    int *ovector, int stringcount, const char ***listptr);

       Captured substrings can be accessed  directly  by  using	 the  offsets
       returned	 by  pcre_exec()  in  ovector. For convenience, the functions
       pcre_copy_substring(),	pcre_get_substring(),	 and	pcre_get_sub-
       string_list()  are provided for extracting captured substrings as new,
       separate, zero-terminated strings. These functions identify substrings
       by  number.  The next section describes functions for extracting named
       substrings.

       A substring that contains a binary zero is correctly extracted and has
       a further zero added on the end, but the result is not, of course, a C
       string.	However, you can process such a string by  referring  to  the
       length  that  is	 returned  by pcre_copy_substring() and pcre_get_sub-
       string().  Unfortunately, the interface	to  pcre_get_substring_list()
       is  not adequate for handling strings containing binary zeros, because
       the end of the final string is not independently indicated.

       The first three arguments are the same for all three  of	 these	func-
       tions:  subject	is the subject string that has just been successfully
       matched, ovector is a pointer to the vector of  integer	offsets	 that
       was passed to pcre_exec(), and stringcount is the number of substrings
       that were captured by the match, including the substring that  matched
       the   entire  regular  expression.  This	 is  the  value	 returned  by
       pcre_exec() if it is greater than zero. If pcre_exec() returned	zero,
       indicating  that	 it  ran out of space in ovector, the value passed as
       stringcount should be the number of elements in the vector divided  by
       three.
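
       The choice of stringcount can be captured in a small helper. This  is
       a  sketch only; stringcount_from_rc() is a hypothetical name, not part
       of PCRE.

```c
/* Hypothetical helper (not part of PCRE): choose the stringcount value
 * to pass to the extraction functions after a call to pcre_exec().
 * `rc` is the return value of pcre_exec(); `ovecsize` is the number of
 * elements in ovector. Returns a negative value if the match failed. */
int stringcount_from_rc(int rc, int ovecsize)
{
    if (rc > 0)
        return rc;              /* all captured pairs fitted */
    if (rc == 0)
        return ovecsize / 3;    /* vector was too small: use what fits */
    return rc;                  /* error: nothing to extract */
}
```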

       The functions pcre_copy_substring() and pcre_get_substring() extract a
       single substring, whose number is given as stringnumber.	 A  value  of
       zero  extracts  the substring that matched the entire pattern, whereas
       higher values extract  the  captured  substrings.  For  pcre_copy_sub-
       string(),  the  string  is  placed in buffer, whose length is given by
       buffersize, while for pcre_get_substring() a new block  of  memory  is
       obtained	 via  pcre_malloc, and its address is returned via stringptr.
       The yield of the function is the length of the string,  not  including
       the terminating zero, or one of these error codes:

	 PCRE_ERROR_NOMEMORY	   (-6)

       The  buffer was too small for pcre_copy_substring(), or the attempt to
       get memory failed for pcre_get_substring().

	 PCRE_ERROR_NOSUBSTRING	   (-7)

       There is no substring whose number is stringnumber.
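
       To make the return values concrete, here is a sketch  that  imitates
       the  documented  behaviour  of  pcre_copy_substring(). It is not the
       library's source code; the function name and the SKETCH_ macros  are
       invented for the example, and the macros merely reuse the two error
       values shown above.

```c
#include <string.h>

#define SKETCH_ERROR_NOMEMORY    (-6)  /* value of PCRE_ERROR_NOMEMORY */
#define SKETCH_ERROR_NOSUBSTRING (-7)  /* value of PCRE_ERROR_NOSUBSTRING */

/* Sketch of the documented behaviour of pcre_copy_substring(); this is
 * not the library's source. Unset substrings yield an empty string. */
int copy_substring_sketch(const char *subject, const int *ovector,
                          int stringcount, int stringnumber,
                          char *buffer, int buffersize)
{
    int start, len;

    if (stringnumber < 0 || stringnumber >= stringcount)
        return SKETCH_ERROR_NOSUBSTRING;    /* no such substring */
    start = ovector[2 * stringnumber];
    len = (start < 0) ? 0 : ovector[2 * stringnumber + 1] - start;
    if (buffersize < len + 1)
        return SKETCH_ERROR_NOMEMORY;       /* no room for text plus zero */
    if (len > 0)
        memcpy(buffer, subject + start, (size_t)len);
    buffer[len] = '\0';
    return len;                             /* length without the zero */
}
```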

       The pcre_get_substring_list() function  extracts	 all  available	 sub-
       strings	and  builds a list of pointers to them. All this is done in a
       single block of memory that is obtained via pcre_malloc.	 The  address
       of  the	memory block is returned via listptr, which is also the start
       of the list of string pointers. The end of the list  is	marked	by  a
       NULL  pointer.  The yield of the function is zero if all went well, or
       the error code

	 PCRE_ERROR_NOMEMORY	   (-6)

       if the attempt to get the memory block failed.

       When any of these functions encounter a substring that is unset, which
       can  happen  when capturing subpattern number n+1 matches some part of
       the subject, but subpattern n has not been used at all, they return an
       empty  string.  This  can  be distinguished from a genuine zero-length
       substring by inspecting the appropriate offset in  ovector,  which  is
       negative for unset substrings.

       The    two    convenience    functions	 pcre_free_substring()	  and
       pcre_free_substring_list() can be used to free the memory returned  by
       a  previous call of pcre_get_substring() or pcre_get_substring_list(),
       respectively. They do nothing more than call the function  pointed  to
       by  pcre_free,  which of course could be called directly from a C pro-
       gram. However, PCRE is used in some situations where it is linked  via
       a  special  interface  to another programming language that cannot use
       pcre_free directly; it is for these cases that the functions are	 pro-
       vided.

EXTRACTING CAPTURED SUBSTRINGS BY NAME

       int pcre_get_stringnumber(const pcre *code,
	    const char *name);

       int pcre_copy_named_substring(const pcre *code,
	    const char *subject, int *ovector,
	    int stringcount, const char *stringname,
	    char *buffer, int buffersize);

       int pcre_get_named_substring(const pcre *code,
	    const char *subject, int *ovector,
	    int stringcount, const char *stringname,
	    const char **stringptr);

       To extract a substring by name, you first have to find the associated
       number. For example, for this pattern

	 (a+)b(?<xxx>\d+)...

       the number of the subpattern called "xxx" is 2. If the name  is	known
       to be unique (PCRE_DUPNAMES was not set), you can find the number from
       the name by calling pcre_get_stringnumber(). The first argument is the
       compiled	 pattern,  and the second is the name. The yield of the func-
       tion is the subpattern number, or PCRE_ERROR_NOSUBSTRING (-7) if there
       is no subpattern of that name.

       Given  the  number, you can extract the substring directly, or use one
       of the functions described in the previous section.  For	 convenience,
       there are also two functions that do the whole job.

       Most    of    the   arguments   of   pcre_copy_named_substring()	  and
       pcre_get_named_substring() are the same as  those  for  the  similarly
       named  functions that extract by number. As these are described in the
       previous section, they are not re-described here. There are  just  two
       differences:

       First,  instead of a substring number, a substring name is given. Sec-
       ond, there is an extra argument,	 given	at  the	 start,	 which	is  a
       pointer	to  the	 compiled  pattern.  This  is needed in order to gain
       access to the name-to-number translation table.

       These functions call pcre_get_stringnumber(), and if it succeeds, they
       then  call pcre_copy_substring() or pcre_get_substring(), as appropri-
       ate. NOTE: If PCRE_DUPNAMES is set and there are duplicate names,  the
       behaviour may not be what you want (see the next section).

DUPLICATE SUBPATTERN NAMES

       int pcre_get_stringtable_entries(const pcre *code,
	    const char *name, char **first, char **last);

       When  a	pattern	 is compiled with the PCRE_DUPNAMES option, names for
       subpatterns are not required to be  unique.  Normally,  patterns	 with
       duplicate  names are such that in any one match, only one of the named
       subpatterns participates. An example is shown in the pcrepattern docu-
       mentation.

       When   duplicates   are	 present,   pcre_copy_named_substring()	  and
       pcre_get_named_substring() return the first substring corresponding to
       the  given  name	 that is set. If none are set, PCRE_ERROR_NOSUBSTRING
       (-7) is returned; no data  is  returned.	 The  pcre_get_stringnumber()
       function returns one of the numbers that are associated with the name,
       but it is not defined which it is.

       If you want to get full details of all captured substrings for a given
       name,  you  must	 use the pcre_get_stringtable_entries() function. The
       first argument is the compiled pattern, and the second  is  the	name.
       The  third  and	fourth are pointers to variables which are updated by
       the function. After it has run, they  point  to	the  first  and	 last
       entries	in  the name-to-number table for the given name. The function
       itself returns the length of  each  entry,  or  PCRE_ERROR_NOSUBSTRING
       (-7)  if there are none. The format of the table is described above in
       the section entitled Information about a pattern.  Given all the rele-
       vant  entries for the name, you can extract each of their numbers, and
       hence the captured data, if any.
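
       The last step might look like the sketch below. It relies  on  the
       entry format given in the section entitled Information about a pat-
       tern: each entry begins with the subpattern number  in  two  bytes,
       most  significant  byte  first,  followed  by  the zero-terminated
       name. The helper entry_numbers() is hypothetical and  not  part  of
       PCRE.

```c
/* Hypothetical helper (not part of PCRE): collect the subpattern numbers
 * stored in the name-table entries from `first` to `last` inclusive, as
 * set by pcre_get_stringtable_entries(). `entrysize` is that function's
 * return value. Returns how many numbers were written to `numbers`. */
int entry_numbers(const char *first, const char *last, int entrysize,
                  int *numbers, int maxnumbers)
{
    const char *entry;
    int count = 0;

    if (entrysize <= 0)
        return 0;               /* error return: there are no entries */
    for (entry = first; entry <= last && count < maxnumbers;
         entry += entrysize) {
        const unsigned char *e = (const unsigned char *)entry;
        /* Two-byte subpattern number, most significant byte first. */
        numbers[count++] = (e[0] << 8) | e[1];
    }
    return count;
}
```

       Each number obtained this way can then be checked in ovector, or passed
       to pcre_copy_substring() or pcre_get_substring().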

FINDING ALL POSSIBLE MATCHES

       The traditional matching function uses a similar	 algorithm  to	Perl,
       which  stops  when it finds the first match, starting at a given point
       in the subject. If you want to  find  all  possible  matches,  or  the
       longest	possible match, consider using the alternative matching func-
       tion (see below) instead. If you cannot use the alternative  function,
       but  still  need to find all possible matches, you can kludge it up by
       making use of the callout facility, which is described  in  the	pcre-
       callout documentation.

       What  you  have	to  do is to insert a callout right at the end of the
       pattern.	 When your callout function is called, extract and  save  the
       current	matched substring. Then return 1, which forces pcre_exec() to
       backtrack and try other alternatives. Ultimately, when it runs out  of
       matches, pcre_exec() will yield PCRE_ERROR_NOMATCH.

MATCHING A PATTERN: THE ALTERNATIVE FUNCTION

       int pcre_dfa_exec(const pcre *code, const pcre_extra *extra,
	    const char *subject, int length, int startoffset,
	    int options, int *ovector, int ovecsize,
	    int *workspace, int wscount);

       The  function  pcre_dfa_exec()  is  called  to  match a subject string
       against a compiled pattern, using a matching algorithm that scans  the
       subject	string	just once, and does not backtrack. This has different
       characteristics to the normal algorithm, and is	not  compatible	 with
       Perl.  Some of the features of PCRE patterns are not supported. Never-
       theless, there are times when this kind of matching can be useful. For
       a discussion of the two matching algorithms, see the pcrematching doc-
       umentation.

       The arguments for the pcre_dfa_exec() function are  the	same  as  for
       pcre_exec(),  plus  two extras. The ovector argument is used in a dif-
       ferent way, and this is described below. The  other  common  arguments
       are  used  in the same way as for pcre_exec(), so their description is
       not repeated here.

       The two additional arguments provide workspace for the  function.  The
       workspace  vector  should contain at least 20 elements. It is used for
       keeping track  of  multiple  paths  through  the	 pattern  tree.	 More
       workspace  will	be needed for patterns and subjects where there are a
       lot of potential matches.

       Here is an example of a simple call to pcre_dfa_exec():

	 int rc;
	 int ovector[10];
	 int wspace[20];
	 rc = pcre_dfa_exec(
	   re,		   /* result of pcre_compile() */
	   NULL,	   /* we didn’t study the pattern */
	   "some string",  /* the subject string */
	   11,		   /* the length of the subject string */
	   0,		   /* start at offset 0 in the subject */
	   0,		   /* default options */
	   ovector,	   /* vector of integers for substring information */
	   10,		   /* number of elements (NOT size in bytes) */
	   wspace,	   /* working space vector */
	   20);		   /* number of elements (NOT size in bytes) */

   Option bits for pcre_dfa_exec()

       The  unused  bits  of the options argument for pcre_dfa_exec() must be
       zero. The only bits that	 may  be  set  are  PCRE_ANCHORED,  PCRE_NEW-
       LINE_xxx, PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, PCRE_NO_UTF8_CHECK,
       PCRE_PARTIAL, PCRE_DFA_SHORTEST, and  PCRE_DFA_RESTART.	All  but  the
       last three of these are the same as for pcre_exec(), so their descrip-
       tion is not repeated here.

	 PCRE_PARTIAL

       This has the same general effect as it does for pcre_exec(),  but  the
       details	 are   slightly	 different.  When  PCRE_PARTIAL	 is  set  for
       pcre_dfa_exec(), the return code PCRE_ERROR_NOMATCH is converted	 into
       PCRE_ERROR_PARTIAL  if  the  end of the subject is reached, there have
       been no complete matches, but there is still  at	 least	one  matching
       possibility. The portion of the string that provided the partial match
       is set as the first matching string.

	 PCRE_DFA_SHORTEST

       Setting the PCRE_DFA_SHORTEST option causes the matching algorithm  to
       stop  as soon as it has found one match. Because of the way the alter-
       native algorithm works, this  is	 necessarily  the  shortest  possible
       match at the first possible matching point in the subject string.

	 PCRE_DFA_RESTART

       When  pcre_dfa_exec()  is  called  with	the  PCRE_PARTIAL option, and
       returns a partial match, it is possible to call it again,  with	addi-
       tional  subject	characters, and have it continue with the same match.
       The PCRE_DFA_RESTART option requests this action; when it is set,  the
       workspace  and  wscount  arguments  must reference the same vector as
       before because data about the match so far is left in it after a partial
       match.  There  is  more discussion of this facility in the pcrepartial
       documentation.

   Successful returns from pcre_dfa_exec()

       When pcre_dfa_exec() succeeds, it may have matched more than one	 sub-
       string  in  the	subject. Note, however, that all the matches from one
       run of the function start at  the  same	point  in  the	subject.  The
       shorter	matches are all initial substrings of the longer matches. For
       example, if the pattern

	 <.*>

       is matched against the string

	 This is <something> <something else> <something further> no more

       the three matched strings are

	 <something>
	 <something> <something else>
	 <something> <something else> <something further>

       On success, the yield of the function is a number greater  than	zero,
       which  is  the number of matched substrings. The substrings themselves
       are returned in ovector. Each string uses two elements; the  first  is
       the  offset  to the start, and the second is the offset to the end. In
       fact, all the strings have the same start offset.  (Space  could	 have
       been saved by giving this only once, but it was decided to retain some
       compatibility with the way pcre_exec() returns data, even  though  the
       meaning of the strings is different.)

       The  strings  are  returned  in	reverse order of length; that is, the
       longest matching string is given first. If there were too many matches
       to fit into ovector, the yield of the function is zero, and the vector
       is filled with the longest matches.
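
       The following sketch reads the match lengths back  out  of  ovector
       after a call to pcre_dfa_exec(). The helper dfa_match_length() is a
       hypothetical name, not a PCRE function. It assumes that a return of
       zero means every pair in the vector (ovecsize / 2 of them) has been
       filled; that reading is an assumption made for this example.

```c
/* Hypothetical helper (not part of PCRE): length of match number `i`
 * among the matches returned by pcre_dfa_exec(), where i = 0 is the
 * longest. `rc` is the function's return value and `ovecsize` the
 * number of elements in ovector. Returns -1 if there is no such match. */
int dfa_match_length(const int *ovector, int rc, int ovecsize, int i)
{
    /* Assumption: when rc is 0 the vector is full (ovecsize / 2 pairs). */
    int count = (rc > 0) ? rc : (rc == 0 ? ovecsize / 2 : 0);

    if (i < 0 || i >= count)
        return -1;
    /* Pair i: start offset, then end offset. */
    return ovector[2 * i + 1] - ovector[2 * i];
}
```

       For the <.*> example above, the three matches all start at offset  8
       and  end  at  offsets  56,  36,  and 19, in that order, giving lengths
       of 48, 28, and 11.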

   Error returns from pcre_dfa_exec()

       The pcre_dfa_exec() function returns a negative number when it  fails.
       Many  of	 the  errors  are  the same as for pcre_exec(), and these are
       described above.	 There are in addition the following errors that  are
       specific to pcre_dfa_exec():

	 PCRE_ERROR_DFA_UITEM	   (-16)

       This return is given if pcre_dfa_exec() encounters an item in the pat-
       tern that it does not support, for instance, the use of \C or  a	 back
       reference.

	 PCRE_ERROR_DFA_UCOND	   (-17)

       This  return  is	 given if pcre_dfa_exec() encounters a condition item
       that uses a back reference for the condition, or a test for  recursion
       in a specific group. These are not supported.

	 PCRE_ERROR_DFA_UMLIMIT	   (-18)

       This  return is given if pcre_dfa_exec() is called with an extra block
       that contains a setting of the match_limit field.  This	is  not	 sup-
       ported (it is meaningless).

	 PCRE_ERROR_DFA_WSSIZE	   (-19)

       This  return  is	 given	if  pcre_dfa_exec()  runs out of space in the
       workspace vector.

	 PCRE_ERROR_DFA_RECURSE	   (-20)

       When a recursive subpattern is processed, the matching function	calls
       itself  recursively,  using private vectors for ovector and workspace.
       This error is given if the output vector is  not	 large	enough.	 This
       should be extremely rare, as a vector of size 1000 is used.

SEE ALSO

       pcrebuild(3), pcrecallout(3),  pcrecpp(3),  pcrematching(3),  pcrepar-
       tial(3), pcreposix(3), pcreprecompile(3), pcresample(3), pcrestack(3).

AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.

REVISION

       Last updated: 11 September 2007
       Copyright (c) 1997-2007 University of Cambridge.


