File:  [ELWIX - Embedded LightWeight unIX -] / embedaddon / pcre / doc / html / pcrejit.html
Mon Jul 22 08:25:57 2013 UTC (11 years, 8 months ago) by misho
Branches: pcre, MAIN
CVS tags: v8_34, v8_33, HEAD
8.33

    1: <html>
    2: <head>
    3: <title>pcrejit specification</title>
    4: </head>
    5: <body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
    6: <h1>pcrejit man page</h1>
    7: <p>
    8: Return to the <a href="index.html">PCRE index page</a>.
    9: </p>
   10: <p>
   11: This page is part of the PCRE HTML documentation. It was generated automatically
   12: from the original man page. If there is any nonsense in it, please consult the
   13: man page, in case the conversion went wrong.
   14: <br>
   15: <ul>
   16: <li><a name="TOC1" href="#SEC1">PCRE JUST-IN-TIME COMPILER SUPPORT</a>
   17: <li><a name="TOC2" href="#SEC2">8-BIT, 16-BIT AND 32-BIT SUPPORT</a>
   18: <li><a name="TOC3" href="#SEC3">AVAILABILITY OF JIT SUPPORT</a>
   19: <li><a name="TOC4" href="#SEC4">SIMPLE USE OF JIT</a>
   20: <li><a name="TOC5" href="#SEC5">UNSUPPORTED OPTIONS AND PATTERN ITEMS</a>
   21: <li><a name="TOC6" href="#SEC6">RETURN VALUES FROM JIT EXECUTION</a>
   22: <li><a name="TOC7" href="#SEC7">SAVING AND RESTORING COMPILED PATTERNS</a>
   23: <li><a name="TOC8" href="#SEC8">CONTROLLING THE JIT STACK</a>
   24: <li><a name="TOC9" href="#SEC9">JIT STACK FAQ</a>
   25: <li><a name="TOC10" href="#SEC10">EXAMPLE CODE</a>
   26: <li><a name="TOC11" href="#SEC11">JIT FAST PATH API</a>
   27: <li><a name="TOC12" href="#SEC12">SEE ALSO</a>
   28: <li><a name="TOC13" href="#SEC13">AUTHOR</a>
   29: <li><a name="TOC14" href="#SEC14">REVISION</a>
   30: </ul>
   31: <br><a name="SEC1" href="#TOC1">PCRE JUST-IN-TIME COMPILER SUPPORT</a><br>
   32: <P>
   33: Just-in-time compiling is a heavyweight optimization that can greatly speed up
   34: pattern matching. However, it comes at the cost of extra processing before the
   35: match is performed. Therefore, it is of most benefit when the same pattern is
   36: going to be matched many times. This does not necessarily mean many calls of a
   37: matching function; if the pattern is not anchored, matching attempts may take
   38: place many times at various positions in the subject, even for a single call.
   39: Therefore, if the subject string is very long, it may still pay to use JIT for
   40: one-off matches.
   41: </P>
   42: <P>
   43: JIT support applies only to the traditional Perl-compatible matching function.
   44: It does not apply when the DFA matching function is being used. The code for
   45: this support was written by Zoltan Herczeg.
   46: </P>
   47: <br><a name="SEC2" href="#TOC1">8-BIT, 16-BIT AND 32-BIT SUPPORT</a><br>
   48: <P>
   49: JIT support is available for all of the 8-bit, 16-bit and 32-bit PCRE
   50: libraries. To keep this documentation simple, only the 8-bit interface is
   51: described in what follows. If you are using the 16-bit library, substitute the
   52: 16-bit functions and 16-bit structures (for example, <i>pcre16_jit_stack</i>
   53: instead of <i>pcre_jit_stack</i>). If you are using the 32-bit library,
   54: substitute the 32-bit functions and 32-bit structures (for example,
   55: <i>pcre32_jit_stack</i> instead of <i>pcre_jit_stack</i>).
   56: </P>
   57: <br><a name="SEC3" href="#TOC1">AVAILABILITY OF JIT SUPPORT</a><br>
   58: <P>
   59: JIT support is an optional feature of PCRE. The "configure" option --enable-jit
   60: (or equivalent CMake option) must be set when PCRE is built if you want to use
   61: JIT. The support is limited to the following hardware platforms:
   62: <pre>
   63:   ARM v5, v7, and Thumb2
   64:   Intel x86 32-bit and 64-bit
   65:   MIPS 32-bit
   66:   PowerPC 32-bit and 64-bit
   67:   SPARC 32-bit (experimental)
   68: </pre>
   69: If --enable-jit is set on an unsupported platform, compilation fails.
   70: </P>
   71: <P>
   72: A program that is linked with PCRE 8.20 or later can tell if JIT support is
   73: available by calling <b>pcre_config()</b> with the PCRE_CONFIG_JIT option. The
   74: result is 1 when JIT is available, and 0 otherwise. However, a simple program
   75: does not need to check this in order to use JIT. The normal API is implemented
   76: in a way that falls back to the interpretive code if JIT is not available. For
   77: programs that need the best possible performance, there is also a "fast path"
   78: API that is JIT-specific.
   79: </P>
   80: <P>
   81: If your program may sometimes be linked with versions of PCRE that are older
   82: than 8.20, but you want to use JIT when it is available, you can test
   83: the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
   84: as PCRE_CONFIG_JIT, for compile-time control of your code.
   85: </P>
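<P>
The compile-time test described above might be sketched as follows. The macro
name APP_PCRE_HAS_JIT and the function name are hypothetical, and the fallback
definitions of PCRE_MAJOR and PCRE_MINOR are there only so that the fragment
compiles on its own; in a real program they come from <b>pcre.h</b>.
</P>

```c
/* In a real program, include <pcre.h> here; it defines PCRE_MAJOR and
   PCRE_MINOR. The values below are stand-ins for this standalone sketch. */
#ifndef PCRE_MAJOR
#define PCRE_MAJOR 8
#define PCRE_MINOR 33
#endif

/* Hypothetical application macro: 1 if the headers are 8.20 or later. */
#if PCRE_MAJOR > 8 || (PCRE_MAJOR == 8 && PCRE_MINOR >= 20)
#define APP_PCRE_HAS_JIT 1
#else
#define APP_PCRE_HAS_JIT 0
#endif

/* Returns nonzero if JIT-specific code paths were compiled in. */
int app_jit_compiled_in(void)
{
    return APP_PCRE_HAS_JIT;
}
```

<P>
A test of this kind only shows that the headers know about JIT. Whether the
linked library was actually built with JIT support is still a run-time question
for <b>pcre_config()</b> with PCRE_CONFIG_JIT.
</P>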
   86: <br><a name="SEC4" href="#TOC1">SIMPLE USE OF JIT</a><br>
   87: <P>
   88: You have to do two things to make use of the JIT support in the simplest way:
   89: <pre>
   90:   (1) Call <b>pcre_study()</b> with the PCRE_STUDY_JIT_COMPILE option for
   91:       each compiled pattern, and pass the resulting <b>pcre_extra</b> block to
   92:       <b>pcre_exec()</b>.
   93: 
   94:   (2) Use <b>pcre_free_study()</b> to free the <b>pcre_extra</b> block when it is
   95:       no longer needed, instead of just freeing it yourself. This ensures that
   96:       any JIT data is also freed.
   97: </pre>
   98: For a program that may be linked with pre-8.20 versions of PCRE, you can insert
   99: <pre>
  100:   #ifndef PCRE_STUDY_JIT_COMPILE
  101:   #define PCRE_STUDY_JIT_COMPILE 0
  102:   #endif
  103: </pre>
  104: so that no option is passed to <b>pcre_study()</b>, and then use something like
  105: this to free the study data:
  106: <pre>
  107:   #ifdef PCRE_CONFIG_JIT
  108:       pcre_free_study(study_ptr);
  109:   #else
  110:       pcre_free(study_ptr);
  111:   #endif
  112: </pre>
  113: PCRE_STUDY_JIT_COMPILE requests the JIT compiler to generate code for complete
  114: matches. If you want to run partial matches using the PCRE_PARTIAL_HARD or
  115: PCRE_PARTIAL_SOFT options of <b>pcre_exec()</b>, you should set one or both of
  116: the following options in addition to, or instead of, PCRE_STUDY_JIT_COMPILE
  117: when you call <b>pcre_study()</b>:
  118: <pre>
  119:   PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE
  120:   PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE
  121: </pre>
  122: The JIT compiler generates different optimized code for each of the three
  123: modes (normal, soft partial, hard partial). When <b>pcre_exec()</b> is called,
  124: the appropriate code is run if it is available. Otherwise, the pattern is
  125: matched using interpretive code.
  126: </P>
  127: <P>
  128: In some circumstances you may need to call additional functions. These are
  129: described in the section entitled
  130: <a href="#stackcontrol">"Controlling the JIT stack"</a>
  131: below.
  132: </P>
  133: <P>
  134: If JIT support is not available, PCRE_STUDY_JIT_COMPILE etc. are ignored, and
  135: no JIT data is created. Otherwise, the compiled pattern is passed to the JIT
  136: compiler, which turns it into machine code that executes much faster than the
  137: normal interpretive code. When <b>pcre_exec()</b> is passed a <b>pcre_extra</b>
  138: block containing a pointer to JIT code of the appropriate mode (normal or
  139: hard/soft partial), it obeys that code instead of running the interpreter. The
  140: result is identical, but the compiled JIT code runs much faster.
  141: </P>
  142: <P>
  143: There are some <b>pcre_exec()</b> options that are not supported for JIT
  144: execution. There are also some pattern items that JIT cannot handle. Details
  145: are given below. In both cases, execution automatically falls back to the
  146: interpretive code. If you want to know whether JIT was actually used for a
  147: particular match, you should arrange for a JIT callback function to be set up
  148: as described in the section entitled
  149: <a href="#stackcontrol">"Controlling the JIT stack"</a>
  150: below, even if you do not need to supply a non-default JIT stack. Such a
  151: callback function is called whenever JIT code is about to be obeyed. If the
  152: execution options are not right for JIT execution, the callback function is not
  153: obeyed.
  154: </P>
  155: <P>
  156: If the JIT compiler finds an unsupported item, no JIT data is generated. You
  157: can find out if JIT execution is available after studying a pattern by calling
  158: <b>pcre_fullinfo()</b> with the PCRE_INFO_JIT option. A result of 1 means that
  159: JIT compilation was successful. A result of 0 means that JIT support is not
  160: available, or the pattern was not studied with PCRE_STUDY_JIT_COMPILE etc., or
  161: the JIT compiler was not able to handle the pattern.
  162: </P>
  163: <P>
  164: Once a pattern has been studied, with or without JIT, it can be used as many
  165: times as you like for matching different subject strings.
  166: </P>
  167: <br><a name="SEC5" href="#TOC1">UNSUPPORTED OPTIONS AND PATTERN ITEMS</a><br>
  168: <P>
  169: The only <b>pcre_exec()</b> options that are supported for JIT execution are
  170: PCRE_NO_UTF8_CHECK, PCRE_NO_UTF16_CHECK, PCRE_NO_UTF32_CHECK, PCRE_NOTBOL,
  171: PCRE_NOTEOL, PCRE_NOTEMPTY, PCRE_NOTEMPTY_ATSTART, PCRE_PARTIAL_HARD, and
  172: PCRE_PARTIAL_SOFT.
  173: </P>
  174: <P>
  175: The only unsupported pattern items are \C (match a single data unit) when
  176: running in a UTF mode, and a callout immediately before an assertion condition
  177: in a conditional group.
  178: </P>
  179: <br><a name="SEC6" href="#TOC1">RETURN VALUES FROM JIT EXECUTION</a><br>
  180: <P>
  181: When a pattern is matched using JIT execution, the return values are the same
  182: as those given by the interpretive <b>pcre_exec()</b> code, with the addition of
  183: one new error code: PCRE_ERROR_JIT_STACKLIMIT. This means that the memory used
  184: for the JIT stack was insufficient. See
  185: <a href="#stackcontrol">"Controlling the JIT stack"</a>
  186: below for a discussion of JIT stack usage. For compatibility with the
  187: interpretive <b>pcre_exec()</b> code, no more than two-thirds of the
  188: <i>ovector</i> argument is used for passing back captured substrings.
  189: </P>
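<P>
The two-thirds rule can be made concrete with a little arithmetic. The helper
name below is hypothetical; the sketch assumes, as for <b>pcre_exec()</b>, that
each returned substring occupies a pair of integers and that a size which is
not a multiple of three is rounded down.
</P>

```c
/* Hypothetical helper: number of (start, end) offset pairs that can be
   passed back in an ovector of ovecsize ints. Only the first two-thirds
   of the vector carries results, and each result needs two ints. */
int max_returned_pairs(int ovecsize)
{
    if (ovecsize < 0)
        return 0;
    return ovecsize / 3;   /* same as (ovecsize * 2 / 3) / 2 for multiples of 3 */
}
```

<P>
With a 30-element <i>ovector</i>, this gives 10 pairs: the whole match plus up
to nine captured substrings.
</P>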
  190: <P>
  191: The JIT code returns PCRE_ERROR_MATCHLIMIT if searching a very large pattern
  192: tree goes on for too long, just as the interpreter does in the same
  193: circumstance, but the details of exactly what is counted are not the
  194: same. The PCRE_ERROR_RECURSIONLIMIT error code is never returned by JIT
  195: execution.
  196: </P>
  197: <br><a name="SEC7" href="#TOC1">SAVING AND RESTORING COMPILED PATTERNS</a><br>
  198: <P>
  199: The code that is generated by the JIT compiler is architecture-specific, and is
  200: also position dependent. For those reasons it cannot be saved (in a file or
  201: database) and restored later like the bytecode and other data of a compiled
  202: pattern. Saving and restoring compiled patterns is not something many people
  203: do. More detail about this facility is given in the
  204: <a href="pcreprecompile.html"><b>pcreprecompile</b></a>
  205: documentation. It should be possible to run <b>pcre_study()</b> on a saved and
  206: restored pattern, and thereby recreate the JIT data, but because JIT
  207: compilation uses significant resources, it is probably not worth doing this;
  208: you might as well recompile the original pattern.
  209: <a name="stackcontrol"></a></P>
  210: <br><a name="SEC8" href="#TOC1">CONTROLLING THE JIT STACK</a><br>
  211: <P>
  212: When the compiled JIT code runs, it needs a block of memory to use as a stack.
  213: By default, it uses 32K on the machine stack. However, some large or
  214: complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
  215: is given when there is not enough stack. Three functions are provided for
  216: managing blocks of memory for use as JIT stacks. There is further discussion
  217: about the use of JIT stacks in the section entitled
  218: <a href="#stackfaq">"JIT stack FAQ"</a>
  219: below.
  220: </P>
  221: <P>
  222: The <b>pcre_jit_stack_alloc()</b> function creates a JIT stack. Its arguments
  223: are a starting size and a maximum size, and it returns a pointer to an opaque
  224: structure of type <b>pcre_jit_stack</b>, or NULL if there is an error. The
  225: <b>pcre_jit_stack_free()</b> function can be used to free a stack that is no
  226: longer needed. (For the technically minded: the address space is allocated by
  227: mmap or VirtualAlloc.)
  228: </P>
  229: <P>
  230: JIT uses far less memory for recursion than the interpretive code,
  231: and a maximum stack size of 512K to 1M should be more than enough for any
  232: pattern.
  233: </P>
  234: <P>
  235: The <b>pcre_assign_jit_stack()</b> function specifies which stack JIT code
  236: should use. Its arguments are as follows:
  237: <pre>
  238:   pcre_extra         *extra
  239:   pcre_jit_callback  callback
  240:   void               *data
  241: </pre>
  242: The <i>extra</i> argument must be the result of studying a pattern with
  243: PCRE_STUDY_JIT_COMPILE etc. There are three cases for the values of the other
  244: two arguments:
  245: <pre>
  246:   (1) If <i>callback</i> is NULL and <i>data</i> is NULL, an internal 32K block
  247:       on the machine stack is used.
  248: 
  249:   (2) If <i>callback</i> is NULL and <i>data</i> is not NULL, <i>data</i> must be
  250:       a valid JIT stack, the result of calling <b>pcre_jit_stack_alloc()</b>.
  251: 
  252:   (3) If <i>callback</i> is not NULL, it must point to a function that is
  253:       called with <i>data</i> as an argument at the start of matching, in
  254:       order to set up a JIT stack. If the return from the callback
  255:       function is NULL, the internal 32K stack is used; otherwise the
  256:       return value must be a valid JIT stack, the result of calling
  257:       <b>pcre_jit_stack_alloc()</b>.
  258: </pre>
  259: A callback function is obeyed whenever JIT code is about to be run; it is not
  260: obeyed when <b>pcre_exec()</b> is called with options that are incompatible with
  261: JIT execution. A callback function can therefore be used to determine whether a
  262: match operation was executed by JIT or by the interpreter.
  263: </P>
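<P>
The three cases can be modelled in a few lines. This is only a sketch of the
selection logic, not PCRE code: the type and function names are hypothetical
stand-ins, a plain <b>void *</b> stands in for a JIT stack, and a NULL result
stands for "use the internal 32K block on the machine stack".
</P>

```c
#include <stddef.h>

/* Stand-in for the callback type: takes the data pointer and returns a
   stack, or NULL to request the internal 32K block. */
typedef void *(*demo_stack_callback)(void *data);

/* Models the three cases: returns the stack that would be used, where
   NULL stands for the internal 32K block on the machine stack. */
void *demo_select_stack(demo_stack_callback callback, void *data)
{
    if (callback != NULL)
        return callback(data);   /* case (3): ask the callback */
    return data;                 /* case (2) if non-NULL, case (1) if NULL */
}

/* Example callback that simply passes its argument through. */
void *demo_passthrough(void *data)
{
    return data;
}
```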
  264: <P>
  265: You may safely use the same JIT stack for more than one pattern (either by
  266: assigning directly or by callback), as long as the patterns are all matched
  267: sequentially in the same thread. In a multithread application, if you do not
  268: specify a JIT stack, or if you assign or pass back NULL from a callback, that
  269: is thread-safe, because each thread has its own machine stack. However, if you
  270: assign or pass back a non-NULL JIT stack, this must be a different stack for
  271: each thread so that the application is thread-safe.
  272: </P>
  273: <P>
  274: Strictly speaking, even more is allowed. You can assign the same non-NULL stack
  275: to any number of patterns as long as they are not used for matching by multiple
  276: threads at the same time. For example, you can assign the same stack to all
  277: compiled patterns, and use a global mutex in the callback to wait until the
  278: stack is available for use. However, this is an inefficient solution, and not
  279: recommended.
  280: </P>
  281: <P>
  282: This is a suggestion for how a multithreaded program that needs to set up
  283: non-default JIT stacks might operate:
  284: <pre>
  285:   During thread initialization
  286:     thread_local_var = pcre_jit_stack_alloc(...)
  287: 
  288:   During thread exit
  289:     pcre_jit_stack_free(thread_local_var)
  290: 
  291:   Use a one-line callback function
  292:     return thread_local_var
  293: </pre>
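<P>
The outline above might look roughly like this in C11. To keep the sketch
compilable without <b>pcre.h</b>, the type <b>demo_jit_stack</b> and the
functions <b>demo_stack_alloc()</b> and <b>demo_stack_free()</b> are
hypothetical stand-ins for <b>pcre_jit_stack</b>,
<b>pcre_jit_stack_alloc()</b> and <b>pcre_jit_stack_free()</b>; the other names
are also made up for illustration.
</P>

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for pcre_jit_stack, so the sketch compiles without <pcre.h>. */
typedef struct demo_jit_stack {
    int start_size;
    int max_size;
} demo_jit_stack;

/* Stand-in for pcre_jit_stack_alloc(): returns NULL on failure. */
static demo_jit_stack *demo_stack_alloc(int start_size, int max_size)
{
    demo_jit_stack *s = malloc(sizeof *s);
    if (s != NULL) {
        s->start_size = start_size;
        s->max_size = max_size;
    }
    return s;
}

/* Stand-in for pcre_jit_stack_free(). */
static void demo_stack_free(demo_jit_stack *s)
{
    free(s);
}

/* One stack per thread (C11 thread-local storage). */
static _Thread_local demo_jit_stack *thread_stack = NULL;

/* During thread initialization; returns 0 if allocation failed. */
int thread_stack_init(void)
{
    thread_stack = demo_stack_alloc(32 * 1024, 512 * 1024);
    return thread_stack != NULL;
}

/* During thread exit. */
void thread_stack_cleanup(void)
{
    demo_stack_free(thread_stack);
    thread_stack = NULL;
}

/* The one-line callback: hand back the calling thread's own stack. */
demo_jit_stack *thread_stack_callback(void *data)
{
    (void)data;   /* unused in this sketch */
    return thread_stack;
}
```

<P>
Because the callback reads a thread-local variable, every thread gets its own
stack without any locking, and a thread that never called the initialization
function gets NULL, which selects the default stack.
</P>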
  294: All the functions described in this section do nothing if JIT is not available,
  295: and <b>pcre_assign_jit_stack()</b> does nothing unless the <b>extra</b> argument
  296: is non-NULL and points to a <b>pcre_extra</b> block that is the result of a
  297: successful study with PCRE_STUDY_JIT_COMPILE etc.
  298: <a name="stackfaq"></a></P>
  299: <br><a name="SEC9" href="#TOC1">JIT STACK FAQ</a><br>
  300: <P>
  301: (1) Why do we need JIT stacks?
  302: <br>
  303: <br>
  304: PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where
  305: the local data of the current node is pushed before checking its child nodes.
  306: Allocating real machine stack on some platforms is difficult. For example, on
  307: PowerPC the stack chain needs to be updated every time the stack is extended.
  308: Although this is possible, the overhead of updating it decreases performance.
  309: So we do the recursion in memory.
  310: </P>
  311: <P>
  312: (2) Why don't we simply allocate blocks of memory with <b>malloc()</b>?
  313: <br>
  314: <br>
  315: Modern operating systems have a nice feature: they can reserve an address space
  316: instead of allocating memory. We can safely allocate memory pages inside this
  317: address space, so the stack could grow without moving memory data (this is
  318: important because of pointers). Thus we can allocate 1M address space, and use
  319: only a single memory page (usually 4K) if that is enough. However, we can still
  320: grow up to 1M anytime if needed.
  321: </P>
  322: <P>
  323: (3) Who "owns" a JIT stack?
  324: <br>
  325: <br>
  326: The owner of the stack is the user program, not the JIT studied pattern or
  327: anything else. The user program must ensure that while a stack is in use by
  328: <b>pcre_exec()</b> (that is, it is assigned to the pattern currently running),
  329: it is not used by any other thread (to avoid overwriting the same
  330: memory area). The best practice for multithreaded programs is to allocate a
  331: stack for each thread, and return this stack through the JIT callback function.
  332: </P>
  333: <P>
  334: (4) When should a JIT stack be freed?
  335: <br>
  336: <br>
  337: You can free a JIT stack at any time, as long as it will not be used by
  338: <b>pcre_exec()</b> again. When you assign the stack to a pattern, only a pointer
  339: is set. There is no reference counting or any other magic. You can free the
  340: patterns and stacks in any order, anytime. Just <i>do not</i> call
  341: <b>pcre_exec()</b> with a pattern pointing to an already freed stack, as that
  342: will cause a segmentation fault. (Also, do not free a stack currently used by
  343: <b>pcre_exec()</b> in another thread.) You can also replace the stack for a
  344: pattern at any time. You can even free the previous stack before assigning a
  345: replacement.
  346: </P>
  347: <P>
  348: (5) Should I allocate/free a stack every time before/after calling
  349: <b>pcre_exec()</b>?
  350: <br>
  351: <br>
  352: No, because this is too costly in terms of resources. However, you could
  353: implement a scheme that releases the stack if it has not been used for, say,
  354: two minutes. The JIT callback can help to achieve this without keeping a
  355: list of the currently JIT studied patterns.
  356: </P>
  357: <P>
  358: (6) OK, the stack is for long term memory allocation. But what happens if a
  359: pattern causes stack overflow with a stack of 1M? Is that 1M kept until the
  360: stack is freed?
  361: <br>
  362: <br>
  363: Especially on embedded systems, it might be a good idea to release memory
  364: sometimes without freeing the stack. There is no API for this at the moment.
  365: A function that returns the amount of memory currently allocated for a
  366: stack, and another that releases memory (shrinking the stack), would probably
  367: be a good idea if someone needs this.
  368: </P>
  369: <P>
  370: (7) This is too much of a headache. Isn't there any better solution for JIT
  371: stack handling?
  372: <br>
  373: <br>
  374: No, thanks to Windows. If POSIX threads were used everywhere, we could throw
  375: out this complicated API.
  376: </P>
  377: <br><a name="SEC10" href="#TOC1">EXAMPLE CODE</a><br>
  378: <P>
  379: This is a single-threaded example that specifies a JIT stack without using a
  380: callback.
  381: <pre>
  382:   int rc, erroffset, ovector[30];
  383:   const char *error;
  384:   pcre *re;
  385:   pcre_extra *extra;
  386:   pcre_jit_stack *jit_stack;
  387: 
  388:   re = pcre_compile(pattern, 0, &error, &erroffset, NULL);
  389:   /* Check for errors */
  390:   extra = pcre_study(re, PCRE_STUDY_JIT_COMPILE, &error);
  391:   jit_stack = pcre_jit_stack_alloc(32*1024, 512*1024);
  392:   /* Check for error (NULL) */
  393:   pcre_assign_jit_stack(extra, NULL, jit_stack);
  394:   rc = pcre_exec(re, extra, subject, length, 0, 0, ovector, 30);
  395:   /* Check results */
  396:   pcre_free(re);
  397:   pcre_free_study(extra);
  398:   pcre_jit_stack_free(jit_stack);
  399: 
  400: </PRE>
  401: </P>
  402: <br><a name="SEC11" href="#TOC1">JIT FAST PATH API</a><br>
  403: <P>
  404: Because the API described above falls back to interpreted execution when JIT is
  405: not available, it is convenient for programs that are written for general use
  406: in many environments. However, calling JIT via <b>pcre_exec()</b> does have a
  407: performance impact. Programs that are written for use where JIT is known to be
  408: available, and which need the best possible performance, can instead use a
  409: "fast path" API to call JIT execution directly instead of calling
  410: <b>pcre_exec()</b> (obviously only for patterns that have been successfully
  411: studied by JIT).
  412: </P>
  413: <P>
  414: The fast path function is called <b>pcre_jit_exec()</b>, and it takes exactly
  415: the same arguments as <b>pcre_exec()</b>, plus one additional argument that
  416: must point to a JIT stack. The JIT stack arrangements described above do not
  417: apply. The return values are the same as for <b>pcre_exec()</b>.
  418: </P>
  419: <P>
  420: When you call <b>pcre_exec()</b>, as well as testing for invalid options, a
  421: number of other sanity checks are performed on the arguments. For example, if
  422: the subject pointer is NULL, or its length is negative, an immediate error is
  423: given. Also, unless PCRE_NO_UTF[8|16|32] is set, a UTF subject string is tested
  424: for validity. In the interests of speed, these checks do not happen on the JIT
  425: fast path, and if invalid data is passed, the result is undefined.
  426: </P>
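<P>
Since the fast path skips these checks, a caller that cannot fully trust its
inputs might do a minimal pre-check of its own before calling
<b>pcre_jit_exec()</b>. The helper below is a hypothetical sketch that covers
only the two argument checks named above; it does not test UTF validity.
</P>

```c
#include <stddef.h>

/* Hypothetical guard: returns 1 if the subject pointer is non-NULL and
   the length is non-negative, otherwise 0. UTF validity is not checked. */
int fast_path_args_ok(const char *subject, int length)
{
    return subject != NULL && length >= 0;
}
```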
  427: <P>
  428: Bypassing the sanity checks and the <b>pcre_exec()</b> wrapping can give
  429: speedups of more than 10%.
  430: </P>
  431: <br><a name="SEC12" href="#TOC1">SEE ALSO</a><br>
  432: <P>
  433: <b>pcreapi</b>(3)
  434: </P>
  435: <br><a name="SEC13" href="#TOC1">AUTHOR</a><br>
  436: <P>
  437: Philip Hazel (FAQ by Zoltan Herczeg)
  438: <br>
  439: University Computing Service
  440: <br>
  441: Cambridge CB2 3QH, England.
  442: <br>
  443: </P>
  444: <br><a name="SEC14" href="#TOC1">REVISION</a><br>
  445: <P>
  446: Last updated: 17 March 2013
  447: <br>
  448: Copyright &copy; 1997-2013 University of Cambridge.
  449: <br>
  450: <p>
  451: Return to the <a href="index.html">PCRE index page</a>.
  452: </p>
