Oniguruma API  Version 4.7.1  2007/07/04

#include <oniguruma.h>


# int onig_init(void)

  Initialize library.

  You don't have to call it explicitly, because it is called in onig_new().


# int onig_error_code_to_str(UChar* err_buf, int err_code, ...)

  Get error message string.
  If this function is used for an error from onig_new(),
  don't call it after the pattern argument of onig_new() is freed.

  normal return: error message string length

  arguments
  1 err_buf:              error message string buffer.
                          (required size: ONIG_MAX_ERROR_MESSAGE_LEN)
  2 err_code:             error code returned by other API functions.
  3 err_info (optional):  error info returned by onig_new().


# void onig_set_warn_func(OnigWarnFunc func)

  Set warning function.

  WARNING:
    '[', '-', ']' in character class without escape.
    ']' in pattern without escape.

  arguments
  1 func:     function pointer.    void (*func)(char* warning_message)


# void onig_set_verb_warn_func(OnigWarnFunc func)

  Set verbose warning function.

  WARNING:
    redundant nested repeat operator.

  arguments
  1 func:     function pointer.    void (*func)(char* warning_message)


# int onig_new(regex_t** reg, const UChar* pattern, const UChar* pattern_end,
            OnigOptionType option, OnigEncoding enc, OnigSyntaxType* syntax,
            OnigErrorInfo* err_info)

  Create a regex object.

  normal return: ONIG_NORMAL

  arguments
  1 reg:         address where the regex object is returned.
  2 pattern:     regex pattern string.
  3 pattern_end: end address of pattern. (pattern + pattern length)
  4 option:      compile time options.

      ONIG_OPTION_NONE               no option
      ONIG_OPTION_SINGLELINE         '^' -> '\A', '$' -> '\Z'
      ONIG_OPTION_MULTILINE          '.' matches newline
      ONIG_OPTION_IGNORECASE         ambiguity match on
      ONIG_OPTION_EXTEND             extended pattern form
      ONIG_OPTION_FIND_LONGEST       find longest match
      ONIG_OPTION_FIND_NOT_EMPTY     ignore empty match
      ONIG_OPTION_NEGATE_SINGLELINE
            clear ONIG_OPTION_SINGLELINE, which is enabled in
            ONIG_SYNTAX_POSIX_BASIC, ONIG_SYNTAX_POSIX_EXTENDED,
            ONIG_SYNTAX_PERL, ONIG_SYNTAX_PERL_NG, ONIG_SYNTAX_JAVA

      ONIG_OPTION_DONT_CAPTURE_GROUP only named groups are captured.
      ONIG_OPTION_CAPTURE_GROUP      named and unnamed groups are captured.

  5 enc:        character encoding.

      ONIG_ENCODING_ASCII         ASCII
      ONIG_ENCODING_ISO_8859_1    ISO 8859-1
      ONIG_ENCODING_ISO_8859_2    ISO 8859-2
      ONIG_ENCODING_ISO_8859_3    ISO 8859-3
      ONIG_ENCODING_ISO_8859_4    ISO 8859-4
      ONIG_ENCODING_ISO_8859_5    ISO 8859-5
      ONIG_ENCODING_ISO_8859_6    ISO 8859-6
      ONIG_ENCODING_ISO_8859_7    ISO 8859-7
      ONIG_ENCODING_ISO_8859_8    ISO 8859-8
      ONIG_ENCODING_ISO_8859_9    ISO 8859-9
      ONIG_ENCODING_ISO_8859_10   ISO 8859-10
      ONIG_ENCODING_ISO_8859_11   ISO 8859-11
      ONIG_ENCODING_ISO_8859_13   ISO 8859-13
      ONIG_ENCODING_ISO_8859_14   ISO 8859-14
      ONIG_ENCODING_ISO_8859_15   ISO 8859-15
      ONIG_ENCODING_ISO_8859_16   ISO 8859-16
      ONIG_ENCODING_UTF8          UTF-8
      ONIG_ENCODING_UTF16_BE      UTF-16BE
      ONIG_ENCODING_UTF16_LE      UTF-16LE
      ONIG_ENCODING_UTF32_BE      UTF-32BE
      ONIG_ENCODING_UTF32_LE      UTF-32LE
      ONIG_ENCODING_EUC_JP        EUC-JP
      ONIG_ENCODING_EUC_TW        EUC-TW
      ONIG_ENCODING_EUC_KR        EUC-KR
      ONIG_ENCODING_EUC_CN        EUC-CN
      ONIG_ENCODING_SJIS          Shift_JIS
      ONIG_ENCODING_KOI8          KOI8
      ONIG_ENCODING_KOI8_R        KOI8-R
      ONIG_ENCODING_BIG5          Big5
      ONIG_ENCODING_GB18030       GB 18030

      or the address of any user-defined OnigEncodingType data.

  6 syntax:     address of pattern syntax definition.

      ONIG_SYNTAX_ASIS              plain text
      ONIG_SYNTAX_POSIX_BASIC       POSIX Basic RE
      ONIG_SYNTAX_POSIX_EXTENDED    POSIX Extended RE
      ONIG_SYNTAX_EMACS             Emacs
      ONIG_SYNTAX_GREP              grep
      ONIG_SYNTAX_GNU_REGEX         GNU regex
      ONIG_SYNTAX_JAVA              Java (Sun java.util.regex)
      ONIG_SYNTAX_PERL              Perl
      ONIG_SYNTAX_PERL_NG           Perl + named group
      ONIG_SYNTAX_RUBY              Ruby
      ONIG_SYNTAX_DEFAULT           default (== Ruby)
                                    see onig_set_default_syntax()

      or the address of any user-defined OnigSyntaxType data.

  7 err_info: address where optional error info is returned.
              Use this value as the 3rd argument of onig_error_code_to_str().



# int onig_new_deluxe(regex_t** reg, const UChar* pattern, const UChar* pattern_end,
                      OnigCompileInfo* ci, OnigErrorInfo* einfo)

  Create a regex object.
  This function is the deluxe version of onig_new().

  normal return: ONIG_NORMAL

  arguments
  1 reg:         address where the regex object is returned.
  2 pattern:     regex pattern string.
  3 pattern_end: end address of pattern. (pattern + pattern length)
  4 ci:          compile time info.

    ci->num_of_elements: number of elements in ci. (current version: 5)
    ci->pattern_enc:     pattern string character encoding.
    ci->target_enc:      target string character encoding.
    ci->syntax:          address of pattern syntax definition.
    ci->option:          compile time option.
    ci->ambig_flag:      character matching ambiguity bit flag for
                         ONIG_OPTION_IGNORECASE mode.

       ONIGENC_AMBIGUOUS_MATCH_NONE:          exact
       ONIGENC_AMBIGUOUS_MATCH_ASCII_CASE:    ignore case for ASCII
       ONIGENC_AMBIGUOUS_MATCH_NONASCII_CASE: ignore case for non-ASCII
       ONIGENC_AMBIGUOUS_MATCH_FULL:          all ambiguity on
       ONIGENC_AMBIGUOUS_MATCH_DEFAULT:       (ASCII | NONASCII)
                                              see onig_set_default_ambig_flag()

  5 einfo:       address where optional error info is returned.
                 Use this value as the 3rd argument of onig_error_code_to_str().


  pattern_enc and target_enc may differ in the following
  combinations only.

    pattern_enc: ASCII, ISO_8859_1
    target_enc:  UTF16_BE, UTF16_LE, UTF32_BE, UTF32_LE

    pattern_enc: UTF16_BE/LE
    target_enc:  UTF16_LE/BE

    pattern_enc: UTF32_BE/LE
    target_enc:  UTF32_LE/BE


# void onig_free(regex_t* reg)

  Free memory used by regex object.

  arguments
  1 reg: regex object.


# int onig_search(regex_t* reg, const UChar* str, const UChar* end, const UChar* start,
                   const UChar* range, OnigRegion* region, OnigOptionType option)

  Search string and return search result and matching region.

  normal return: match position offset (i.e.  p - str >= 0)
  not found:     ONIG_MISMATCH (< 0)

  arguments
  1 reg:    regex object
  2 str:    target string
  3 end:    end address of target string
  4 start:  search start address of target string
  5 range:  search end address of target string
    in forward search  (start <= searched string head < range)
    in backward search (range <= searched string head <= start)
  6 region: address where group match range info is returned (NULL is allowed)
  7 option: search time option

    ONIG_OPTION_NOTBOL        string head (str) is not treated as beginning of line
    ONIG_OPTION_NOTEOL        string end (end) is not treated as end of line
    ONIG_OPTION_POSIX_REGION  region argument is a regmatch_t[] of the POSIX API.


# int onig_match(regex_t* reg, const UChar* str, const UChar* end, const UChar* at,
                 OnigRegion* region, OnigOptionType option)

  Match string and return result and matching region.

  normal return: match length  (>= 0)
  not match:     ONIG_MISMATCH (< 0)

  arguments
  1 reg:    regex object
  2 str:    target string
  3 end:    end address of target string
  4 at:     match address of target string
  5 region: address where group match range info is returned (NULL is allowed)
  6 option: search time option

    ONIG_OPTION_NOTBOL        string head (str) is not treated as beginning of line
    ONIG_OPTION_NOTEOL        string end (end) is not treated as end of line
    ONIG_OPTION_POSIX_REGION  region argument is a regmatch_t[] of the POSIX API.


# OnigRegion* onig_region_new(void)

  Create a region.


# void onig_region_free(OnigRegion* region, int free_self)

  Free memory used by region.

  arguments
  1 region:    target region
  2 free_self: [1: free all, 0: free memory used in region but not self]


# void onig_region_copy(OnigRegion* to, OnigRegion* from)

  Copy contents of region.

  arguments
  1 to:   target region
  2 from: source region


# void onig_region_clear(OnigRegion* region)

  Clear contents of region.

  arguments
  1 region: target region


# int onig_region_resize(OnigRegion* region, int n)

  Resize group range area of region.

  normal return: ONIG_NORMAL

  arguments
  1 region: target region
  2 n:      new size


# int onig_name_to_group_numbers(regex_t* reg, const UChar* name, const UChar* name_end,
                                  int** num_list)

  Return the group number list of the name.
  A named subexp is defined by (?<name>....).

  normal return:  number of groups for the name.
                  (ex. /(?<x>..)(?<x>..)/  ==>  2)
  name not found: -1

  arguments
  1 reg:       regex object.
  2 name:      group name.
  3 name_end:  end address of group name.
  4 num_list:  address where the list of group numbers is returned.


# int onig_name_to_backref_number(regex_t* reg, const UChar* name, const UChar* name_end,
                                  OnigRegion *region)

  Return the group number corresponding to the named backref (\k<name>).
  If two or more groups of the name have effective regions,
  the greatest group number among them is returned.

  normal return: group number.

  arguments
  1 reg:      regex object.
  2 name:     group name.
  3 name_end: end address of group name.
  4 region:   search/match result region.


# int onig_foreach_name(regex_t* reg,
                        int (*func)(const UChar*, const UChar*, int,int*,regex_t*,void*),
                        void* arg)

  Call the callback function for every name.

  normal return: 0
  error:         func's return value.

  arguments
  1 reg:     regex object.
  2 func:    callback function.
             func(name, name_end, <number of groups>, <group number's list>,
                  reg, arg);
             if func does not return 0, the iteration is stopped.
  3 arg:     argument for func.


# int onig_number_of_names(regex_t* reg)

  Return the number of names defined in the pattern.
  Multiple definitions of one name are counted as one.

  arguments
  1 reg:    regex object.


# OnigEncoding     onig_get_encoding(regex_t* reg)
# OnigOptionType   onig_get_options(regex_t* reg)
# OnigAmbigType    onig_get_ambig_flag(regex_t* reg)
# OnigSyntaxType*  onig_get_syntax(regex_t* reg)

  Return a value of the regex object.

  arguments
  1 reg:    regex object.


# int onig_number_of_captures(regex_t* reg)

  Return the number of capture groups in the pattern.

  arguments
  1 reg:    regex object.


# int onig_number_of_capture_histories(regex_t* reg)

  Return the number of capture histories defined in the pattern.

  You can't use capture history if ONIG_SYN_OP2_ATMARK_CAPTURE_HISTORY
  is disabled in the pattern syntax. (It is disabled in the default syntax.)

  arguments
  1 reg:    regex object.



# OnigCaptureTreeNode* onig_get_capture_tree(OnigRegion* region)

  Return the root node of the capture history data tree.

  This value is undefined if matching has failed.

  arguments
  1 region: matching result.


# int onig_capture_tree_traverse(OnigRegion* region, int at,
                  int(*func)(int,int,int,int,int,void*), void* arg)

  Traverse the capture history data tree and call the callback function.

  normal return: 0
  error:         callback func's return value.

  arguments
  1 region: match region data.
  2 at:     callback position.

    ONIG_TRAVERSE_CALLBACK_AT_FIRST: callback first, then traverse children.
    ONIG_TRAVERSE_CALLBACK_AT_LAST:  traverse children first, then callback.
    ONIG_TRAVERSE_CALLBACK_AT_BOTH:  callback first, then traverse children,
                                     and at last callback again.

  3 func:   callback function.
            if func does not return 0, the traversal is stopped.

            int func(int group, int beg, int end, int level, int at,
                     void* arg)

              group: group number
              beg:   capture start position
              end:   capture end position
              level: nest level (from 0)
              at:    callback position
                     ONIG_TRAVERSE_CALLBACK_AT_FIRST
                     ONIG_TRAVERSE_CALLBACK_AT_LAST
              arg:   optional callback argument

  4 arg:    optional callback argument.


# int onig_noname_group_capture_is_active(regex_t* reg)

  Return whether unnamed group capture is active.

  active:   1
  inactive: 0

  arguments
  1 reg:    regex object.

  if option ONIG_OPTION_DONT_CAPTURE_GROUP == ON
    --> inactive

  if the regex pattern has a named group
     and syntax ONIG_SYN_CAPTURE_ONLY_NAMED_GROUP == ON
     and option ONIG_OPTION_CAPTURE_GROUP == OFF
    --> inactive

  else --> active


# UChar* onigenc_get_prev_char_head(OnigEncoding enc, const UChar* start, const UChar* s)

  Return previous character head address.

  arguments
  1 enc:   character encoding
  2 start: string address
  3 s:     target address of string


# UChar* onigenc_get_left_adjust_char_head(OnigEncoding enc,
                                           const UChar* start, const UChar* s)

  Return left-adjusted head address of a character.

  arguments
  1 enc:   character encoding
  2 start: string address
  3 s:     target address of string


# UChar* onigenc_get_right_adjust_char_head(OnigEncoding enc,
                                            const UChar* start, const UChar* s)

  Return right-adjusted head address of a character.

  arguments
  1 enc:   character encoding
  2 start: string address
  3 s:     target address of string


# int onigenc_strlen(OnigEncoding enc, const UChar* s, const UChar* end)
# int onigenc_strlen_null(OnigEncoding enc, const UChar* s)

  Return number of characters in the string.


# int onigenc_str_bytelen_null(OnigEncoding enc, const UChar* s)

  Return number of bytes in the string.


# int onig_set_default_syntax(OnigSyntaxType* syntax)

  Set default syntax.

  arguments
  1 syntax: address of pattern syntax definition.


# void onig_copy_syntax(OnigSyntaxType* to, OnigSyntaxType* from)

  Copy syntax.

  arguments
  1 to:   destination address.
  2 from: source address.


# unsigned int   onig_get_syntax_op(OnigSyntaxType* syntax)
# unsigned int   onig_get_syntax_op2(OnigSyntaxType* syntax)
# unsigned int   onig_get_syntax_behavior(OnigSyntaxType* syntax)
# OnigOptionType onig_get_syntax_options(OnigSyntaxType* syntax)

# void onig_set_syntax_op(OnigSyntaxType* syntax, unsigned int op)
# void onig_set_syntax_op2(OnigSyntaxType* syntax, unsigned int op2)
# void onig_set_syntax_behavior(OnigSyntaxType* syntax, unsigned int behavior)
# void onig_set_syntax_options(OnigSyntaxType* syntax, OnigOptionType options)

  Get/Set elements of the syntax.

  arguments
  1 syntax:                     syntax
  2 op, op2, behavior, options: value of element.


# void onig_copy_encoding(OnigEncoding to, OnigEncoding from)

  Copy encoding.

  arguments
  1 to:   destination address.
  2 from: source address.


# int onig_set_meta_char(OnigEncoding enc, unsigned int what,
                         OnigCodePoint code)

  Set a variable meta character to the code point value.
  Except for the escape character, this meta character setting has no
  effect unless ONIG_SYN_OP_VARIABLE_META_CHARACTERS is enabled in the
  syntax. (It is not enabled in the built-in syntaxes.)

  normal return: ONIG_NORMAL

  arguments
  1 enc:  target encoding
  2 what: specifies which meta character it is.

          ONIG_META_CHAR_ESCAPE
          ONIG_META_CHAR_ANYCHAR
          ONIG_META_CHAR_ANYTIME
          ONIG_META_CHAR_ZERO_OR_ONE_TIME
          ONIG_META_CHAR_ONE_OR_MORE_TIME
          ONIG_META_CHAR_ANYCHAR_ANYTIME

  3 code: meta character or ONIG_INEFFECTIVE_META_CHAR.


# OnigAmbigType onig_get_default_ambig_flag()

  Get default ambig flag.


# int onig_set_default_ambig_flag(OnigAmbigType ambig_flag)

  Set default ambig flag.

  arguments
  1 ambig_flag: ambiguity flag


# unsigned int onig_get_match_stack_limit_size(void)

  Return the maximum match stack size.
  (default: 0 == unlimited)


# int onig_set_match_stack_limit_size(unsigned int size)

  Set the maximum match stack size.
  (size = 0: unlimited)

  normal return: ONIG_NORMAL


# int onig_end(void)

  Finish using the library.

  normal return: ONIG_NORMAL

  Regex objects created before the onig_end() call
  must not be used after it.


# const char* onig_version(void)

  Return version string.  (ex. "2.2.8")

// END