Mercurial > hg > nginx-site
comparison xml/en/docs/dev/development_guide.xml @ 1919:dcfb4f3ac8a7
Added the "Regular expressions" section to the development guide.
author | Vladimir Homutov <vl@nginx.com> |
---|---|
date | Wed, 01 Mar 2017 14:06:46 +0300 |
parents | 8b7c3b0ef1a4 |
children | de5251816480 |
1918:4ecc39397e97 | 1919:dcfb4f3ac8a7 |
---|---|
525 </list> | 525 </list> |
526 </para> | 526 </para> |
527 | 527 |
528 </section> | 528 </section> |
529 | 529 |
530 <section name="Regular expressions" id="regex"> | |
531 | |
532 <para> | |
533 The regular expressions interface in nginx is a wrapper around | |
534 the <link url="http://www.pcre.org">PCRE</link> | |
535 library. | |
536 The corresponding header file is <path>src/core/ngx_regex.h</path>. | |
537 </para> | |
538 | |
539 <para> | |
540 To use a regular expression for string matching, it first needs to be | |
541 compiled, which is usually done at the configuration phase. | |
542 Note that since PCRE support is optional, all code using the interface must | |
543 be protected by the surrounding <literal>NGX_PCRE</literal> macro: | |
544 <programlisting> | |
545 #if (NGX_PCRE) | |
546 ngx_regex_t *re; | |
547 ngx_regex_compile_t rc; | |
548 | |
549 u_char errstr[NGX_MAX_CONF_ERRSTR]; | |
550 | |
551 ngx_str_t value = ngx_string("message (\\d\\d\\d).*Codeword is '(?<cw>\\w+)'"); | |
552 | |
553 ngx_memzero(&rc, sizeof(ngx_regex_compile_t)); | |
554 | |
555 rc.pattern = value; | |
556 rc.pool = cf->pool; | |
557 rc.err.len = NGX_MAX_CONF_ERRSTR; | |
558 rc.err.data = errstr; | |
559 /* rc.options are passed as is to pcre_compile() */ | |
560 | |
561 if (ngx_regex_compile(&rc) != NGX_OK) { | |
562 ngx_conf_log_error(NGX_LOG_EMERG, cf, 0, "%V", &rc.err); | |
563 return NGX_CONF_ERROR; | |
564 } | |
565 | |
566 re = rc.regex; | |
567 #endif | |
568 </programlisting> | |
569 After successful compilation, the <literal>captures</literal> and | |
570 <literal>named_captures</literal> fields of the | |
571 <literal>ngx_regex_compile_t</literal> structure contain the number of all | |
572 captures and of named captures, respectively, found in the regular expression. | |
573 </para> | |
574 | |
575 <para> | |
576 The compiled regular expression can then be used for matching against strings: | |
577 <programlisting> | |
578 ngx_int_t n; | |
579 int captures[(1 + rc.captures) * 3]; | |
580 | |
581 ngx_str_t input = ngx_string("This is message 123. Codeword is 'foobar'."); | |
582 | |
583 n = ngx_regex_exec(re, &input, captures, (1 + rc.captures) * 3); | |
584 if (n >= 0) { | |
585 /* string matches expression */ | |
586 | |
587 } else if (n == NGX_REGEX_NO_MATCHED) { | |
588 /* no match was found */ | |
589 | |
590 } else { | |
591 /* some error */ | |
592 ngx_log_error(NGX_LOG_ALERT, log, 0, ngx_regex_exec_n " failed: %i", n); | |
593 } | |
594 </programlisting> | |
595 The arguments to <literal>ngx_regex_exec()</literal> are the compiled regular | |
596 expression <literal>re</literal>, the string to match <literal>s</literal>, | |
597 an optional array of integers to hold any <literal>captures</literal> that are | |
598 found, and the array's <literal>size</literal>. | |
599 The size of the <literal>captures</literal> array must be a multiple of three, | |
600 as required by the | |
601 <link url="http://www.pcre.org/original/doc/html/pcreapi.html">PCRE API</link>. | |
602 In the example, the size is calculated from the total number of captures plus | |
603 one for the matched string itself. | |
604 </para> | |
605 | |
606 <para> | |
607 If there are matches, captures can be accessed as follows: | |
608 <programlisting> | |
609 u_char *p; | |
610 size_t size; | |
611 ngx_str_t name, value; | |
612 | |
613 /* all captures */ | |
614 for (i = 0; i < n * 2; i += 2) { | |
615 value.data = input.data + captures[i]; | |
616 value.len = captures[i + 1] - captures[i]; | |
617 } | |
618 | |
619 /* accessing named captures */ | |
620 | |
621 size = rc.name_size; | |
622 p = rc.names; | |
623 | |
624 for (i = 0; i < rc.named_captures; i++, p += size) { | |
625 | |
626 /* capture name */ | |
627 name.data = &p[2]; | |
628 name.len = ngx_strlen(name.data); | |
629 | |
630 n = 2 * ((p[0] << 8) + p[1]); | |
631 | |
632 /* captured value */ | |
633 value.data = &input.data[captures[n]]; | |
634 value.len = captures[n + 1] - captures[n]; | |
635 } | |
636 </programlisting> | |
637 </para> | |
638 | |
639 <para> | |
640 The <literal>ngx_regex_exec_array()</literal> function accepts an array of | |
641 <literal>ngx_regex_elt_t</literal> elements (which are just compiled regular | |
642 expressions with associated names), a string to match, and a log. | |
643 The function applies expressions from the array to the string until | |
644 either a match is found or no more expressions are left. | |
645 The return value is <literal>NGX_OK</literal> when there is a match and | |
646 <literal>NGX_DECLINED</literal> otherwise, or <literal>NGX_ERROR</literal> | |
647 in case of error. | |
648 </para> | |
649 | |
650 </section> | |
530 | 651 |
531 </section> | 652 </section> |
532 | 653 |
533 | 654 |
534 <section name="Containers" id="containers"> | 655 <section name="Containers" id="containers"> |
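
The capture handling in the added "Regular expressions" section relies on two data layouts: the offset pairs in the captures array and the entries of the named-capture table. The following is a minimal standalone sketch of both, not code from the changeset. It calls neither nginx nor PCRE. All sketch_ names are hypothetical, and the offsets and the name table are written by hand for the input and pattern used in the section's examples, on the assumption that the table follows the 8-bit PCRE layout (a two-byte big-endian capture number followed by the NUL-terminated name).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for ngx_str_t: a length plus a pointer, no NUL needed. */
typedef struct {
    size_t                len;
    const unsigned char  *data;
} sketch_str_t;

/*
 * Capture k occupies the pair captures[2k], captures[2k + 1]:
 * the start offset and the end offset (exclusive) into the input.
 * Pair 0 describes the whole match.
 */
sketch_str_t
sketch_capture(const unsigned char *input, const int *captures, int k)
{
    sketch_str_t  v;

    assert(k >= 0);

    v.data = input + captures[2 * k];
    v.len = (size_t) (captures[2 * k + 1] - captures[2 * k]);

    return v;
}

/*
 * Each name-table entry is name_size bytes long: a two-byte big-endian
 * capture number followed by the NUL-terminated capture name.
 * Returns the capture number and stores a pointer to the name.
 */
int
sketch_name_entry(const unsigned char *entry, const unsigned char **name)
{
    *name = &entry[2];

    return (entry[0] << 8) + entry[1];
}

/* Hand-written sample data; a real run would obtain these from the regex calls. */
const unsigned char  sketch_input[] =
    "This is message 123. Codeword is 'foobar'.";

/* (1 + 2 captures) * 3 slots; only the first 2 * 3 hold offset pairs. */
const int            sketch_captures[9] = { 8, 41, 16, 19, 34, 40, 0, 0, 0 };

/* One entry with name_size 5: capture number 2, name "cw". */
const unsigned char  sketch_names[] = { 0x00, 0x02, 'c', 'w', '\0' };

/* Resolve the named capture through the table and print "name = value". */
void
sketch_print(void)
{
    int                   k;
    sketch_str_t          v;
    const unsigned char  *name;

    k = sketch_name_entry(sketch_names, &name);
    v = sketch_capture(sketch_input, sketch_captures, k);

    printf("%s = %.*s\n", (const char *) name, (int) v.len,
           (const char *) v.data);
}
```

Calling sketch_print() from a main() function prints the line cw = foobar. The last third of the captures array is left unused here, which is why the section sizes the array as (1 + rc.captures) * 3 while reading only n pairs from it.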