Annotation of embedaddon/pcre/testdata/testoutput15, revision 1.1.1.5

1.1.1.5 ! misho       1: /-- This set of tests is for UTF-8 support but not Unicode property support,
        !             2:     and is relevant only to the 8-bit library. --/
        !             3: 
        !             4: < forbid W
1.1.1.2   misho       5: 
                      6: /X(\C{3})/8
                      7:     X\x{1234}
                      8:  0: X\x{1234}
                      9:  1: \x{1234}
                     10: 
                     11: /X(\C{4})/8
                     12:     X\x{1234}YZ
                     13:  0: X\x{1234}Y
                     14:  1: \x{1234}Y
                     15:     
                     16: /X\C*/8
                     17:     XYZabcdce
                     18:  0: XYZabcdce
                     19:     
                     20: /X\C*?/8
                     21:     XYZabcde
                     22:  0: X
                     23:     
                     24: /X\C{3,5}/8
                     25:     Xabcdefg   
                     26:  0: Xabcde
                     27:     X\x{1234} 
                     28:  0: X\x{1234}
                     29:     X\x{1234}YZ
                     30:  0: X\x{1234}YZ
                     31:     X\x{1234}\x{512}  
                     32:  0: X\x{1234}\x{512}
                     33:     X\x{1234}\x{512}YZ
                     34:  0: X\x{1234}\x{512}
                     35: 
                     36: /X\C{3,5}?/8
                     37:     Xabcdefg   
                     38:  0: Xabc
                     39:     X\x{1234} 
                     40:  0: X\x{1234}
                     41:     X\x{1234}YZ
                     42:  0: X\x{1234}
                     43:     X\x{1234}\x{512}  
                     44:  0: X\x{1234}
                     45: 
                     46: /a\Cb/8
                     47:     aXb
                     48:  0: aXb
                     49:     a\nb
                     50:  0: a\x{0a}b
                     51:     
                     52: /a\C\Cb/8 
                     53:     a\x{100}b 
                     54:  0: a\x{100}b
                     55: 
                     56: /ab\Cde/8
                     57:     abXde
                     58:  0: abXde
                     59: 
                     60: /a\C\Cb/8 
                     61:     a\x{100}b
                     62:  0: a\x{100}b
                     63:     ** Failers 
                     64: No match
                     65:     a\x{12257}b
                     66: No match
                     67: 
                     68: /[]/8
                     69: Failed: invalid UTF-8 string at offset 1
                     70: 
                     71: //8
                     72: Failed: invalid UTF-8 string at offset 0
                     73: 
                     74: /xxx/8
                     75: Failed: invalid UTF-8 string at offset 0
                     76: 
1.1.1.5 ! misho      77: /xxx/8?DZSSO
1.1.1.2   misho      78: ------------------------------------------------------------------
                     79:         Bra
                     80:         \X{c0}\X{c0}\X{c0}xxx
                     81:         Ket
                     82:         End
                     83: ------------------------------------------------------------------
1.1       misho      84: Capturing subpattern count = 0
1.1.1.5 ! misho      85: Options: no_auto_possessify utf no_utf_check
1.1.1.2   misho      86: First char = \x{c3}
                     87: Need char = 'x'
                     88: 
1.1.1.4   misho      89: /badutf/8
                     90:     \xdf
1.1.1.2   misho      91: Error -10 (bad UTF-8 string) offset=0 reason=1
1.1.1.4   misho      92:     \xef
                     93: Error -10 (bad UTF-8 string) offset=0 reason=2
                     94:     \xef\x80
                     95: Error -10 (bad UTF-8 string) offset=0 reason=1
                     96:     \xf7
                     97: Error -10 (bad UTF-8 string) offset=0 reason=3
                     98:     \xf7\x80
                     99: Error -10 (bad UTF-8 string) offset=0 reason=2
                    100:     \xf7\x80\x80
                    101: Error -10 (bad UTF-8 string) offset=0 reason=1
                    102:     \xfb
                    103: Error -10 (bad UTF-8 string) offset=0 reason=4
                    104:     \xfb\x80
                    105: Error -10 (bad UTF-8 string) offset=0 reason=3
                    106:     \xfb\x80\x80
                    107: Error -10 (bad UTF-8 string) offset=0 reason=2
                    108:     \xfb\x80\x80\x80
1.1.1.2   misho     109: Error -10 (bad UTF-8 string) offset=0 reason=1
1.1.1.4   misho     110:     \xfd
                    111: Error -10 (bad UTF-8 string) offset=0 reason=5
                    112:     \xfd\x80
                    113: Error -10 (bad UTF-8 string) offset=0 reason=4
                    114:     \xfd\x80\x80
                    115: Error -10 (bad UTF-8 string) offset=0 reason=3
                    116:     \xfd\x80\x80\x80
                    117: Error -10 (bad UTF-8 string) offset=0 reason=2
                    118:     \xfd\x80\x80\x80\x80
1.1.1.2   misho     119: Error -10 (bad UTF-8 string) offset=0 reason=1
1.1.1.4   misho     120:     \xdf\x7f
                    121: Error -10 (bad UTF-8 string) offset=0 reason=6
                    122:     \xef\x7f\x80
                    123: Error -10 (bad UTF-8 string) offset=0 reason=6
                    124:     \xef\x80\x7f
                    125: Error -10 (bad UTF-8 string) offset=0 reason=7
                    126:     \xf7\x7f\x80\x80
                    127: Error -10 (bad UTF-8 string) offset=0 reason=6
                    128:     \xf7\x80\x7f\x80
                    129: Error -10 (bad UTF-8 string) offset=0 reason=7
                    130:     \xf7\x80\x80\x7f
                    131: Error -10 (bad UTF-8 string) offset=0 reason=8
                    132:     \xfb\x7f\x80\x80\x80
                    133: Error -10 (bad UTF-8 string) offset=0 reason=6
                    134:     \xfb\x80\x7f\x80\x80
                    135: Error -10 (bad UTF-8 string) offset=0 reason=7
                    136:     \xfb\x80\x80\x7f\x80
                    137: Error -10 (bad UTF-8 string) offset=0 reason=8
                    138:     \xfb\x80\x80\x80\x7f
                    139: Error -10 (bad UTF-8 string) offset=0 reason=9
                    140:     \xfd\x7f\x80\x80\x80\x80
                    141: Error -10 (bad UTF-8 string) offset=0 reason=6
                    142:     \xfd\x80\x7f\x80\x80\x80
                    143: Error -10 (bad UTF-8 string) offset=0 reason=7
                    144:     \xfd\x80\x80\x7f\x80\x80
                    145: Error -10 (bad UTF-8 string) offset=0 reason=8
                    146:     \xfd\x80\x80\x80\x7f\x80
                    147: Error -10 (bad UTF-8 string) offset=0 reason=9
                    148:     \xfd\x80\x80\x80\x80\x7f
                    149: Error -10 (bad UTF-8 string) offset=0 reason=10
                    150:     \xed\xa0\x80
                    151: Error -10 (bad UTF-8 string) offset=0 reason=14
                    152:     \xc0\x8f
                    153: Error -10 (bad UTF-8 string) offset=0 reason=15
                    154:     \xe0\x80\x8f
                    155: Error -10 (bad UTF-8 string) offset=0 reason=16
                    156:     \xf0\x80\x80\x8f
                    157: Error -10 (bad UTF-8 string) offset=0 reason=17
                    158:     \xf8\x80\x80\x80\x8f
                    159: Error -10 (bad UTF-8 string) offset=0 reason=18
                    160:     \xfc\x80\x80\x80\x80\x8f
                    161: Error -10 (bad UTF-8 string) offset=0 reason=19
                    162:     \x80
                    163: Error -10 (bad UTF-8 string) offset=0 reason=20
                    164:     \xfe
                    165: Error -10 (bad UTF-8 string) offset=0 reason=21
                    166:     \xff
                    167: Error -10 (bad UTF-8 string) offset=0 reason=21
                    168: 
                    169: /badutf/8
                    170:     \xfb\x80\x80\x80\x80
                    171: Error -10 (bad UTF-8 string) offset=0 reason=11
                    172:     \xfd\x80\x80\x80\x80\x80
                    173: Error -10 (bad UTF-8 string) offset=0 reason=12
                    174:     \xf7\xbf\xbf\xbf
                    175: Error -10 (bad UTF-8 string) offset=0 reason=13
                    176: 
                    177: /shortutf/8
                    178:     \P\P\xdf
                    179: Error -25 (short UTF-8 string) offset=0 reason=1
                    180:     \P\P\xef
                    181: Error -25 (short UTF-8 string) offset=0 reason=2
                    182:     \P\P\xef\x80
                    183: Error -25 (short UTF-8 string) offset=0 reason=1
                    184:     \P\P\xf7
                    185: Error -25 (short UTF-8 string) offset=0 reason=3
                    186:     \P\P\xf7\x80
                    187: Error -25 (short UTF-8 string) offset=0 reason=2
                    188:     \P\P\xf7\x80\x80
                    189: Error -25 (short UTF-8 string) offset=0 reason=1
                    190:     \P\P\xfb
                    191: Error -25 (short UTF-8 string) offset=0 reason=4
                    192:     \P\P\xfb\x80
                    193: Error -25 (short UTF-8 string) offset=0 reason=3
                    194:     \P\P\xfb\x80\x80
                    195: Error -25 (short UTF-8 string) offset=0 reason=2
                    196:     \P\P\xfb\x80\x80\x80
                    197: Error -25 (short UTF-8 string) offset=0 reason=1
                    198:     \P\P\xfd
                    199: Error -25 (short UTF-8 string) offset=0 reason=5
                    200:     \P\P\xfd\x80
                    201: Error -25 (short UTF-8 string) offset=0 reason=4
                    202:     \P\P\xfd\x80\x80
                    203: Error -25 (short UTF-8 string) offset=0 reason=3
                    204:     \P\P\xfd\x80\x80\x80
                    205: Error -25 (short UTF-8 string) offset=0 reason=2
                    206:     \P\P\xfd\x80\x80\x80\x80
1.1.1.2   misho     207: Error -25 (short UTF-8 string) offset=0 reason=1
                    208: 
                    209: /anything/8
                    210:     \xc0\x80
                    211: Error -10 (bad UTF-8 string) offset=0 reason=15
                    212:     \xc1\x8f 
                    213: Error -10 (bad UTF-8 string) offset=0 reason=15
                    214:     \xe0\x9f\x80
                    215: Error -10 (bad UTF-8 string) offset=0 reason=16
                    216:     \xf0\x8f\x80\x80 
                    217: Error -10 (bad UTF-8 string) offset=0 reason=17
                    218:     \xf8\x87\x80\x80\x80  
                    219: Error -10 (bad UTF-8 string) offset=0 reason=18
                    220:     \xfc\x83\x80\x80\x80\x80
                    221: Error -10 (bad UTF-8 string) offset=0 reason=19
                    222:     \xfe\x80\x80\x80\x80\x80  
                    223: Error -10 (bad UTF-8 string) offset=0 reason=21
                    224:     \xff\x80\x80\x80\x80\x80  
                    225: Error -10 (bad UTF-8 string) offset=0 reason=21
                    226:     \xc3\x8f
                    227: No match
                    228:     \xe0\xaf\x80
                    229: No match
                    230:     \xe1\x80\x80
                    231: No match
                    232:     \xf0\x9f\x80\x80 
                    233: No match
                    234:     \xf1\x8f\x80\x80 
                    235: No match
                    236:     \xf8\x88\x80\x80\x80  
                    237: Error -10 (bad UTF-8 string) offset=0 reason=11
                    238:     \xf9\x87\x80\x80\x80  
                    239: Error -10 (bad UTF-8 string) offset=0 reason=11
                    240:     \xfc\x84\x80\x80\x80\x80
                    241: Error -10 (bad UTF-8 string) offset=0 reason=12
                    242:     \xfd\x83\x80\x80\x80\x80
                    243: Error -10 (bad UTF-8 string) offset=0 reason=12
                    244:     \?\xf8\x88\x80\x80\x80  
                    245: No match
                    246:     \?\xf9\x87\x80\x80\x80  
                    247: No match
                    248:     \?\xfc\x84\x80\x80\x80\x80
                    249: No match
                    250:     \?\xfd\x83\x80\x80\x80\x80
                    251: No match
                    252: 
                    253: /\x{100}/8DZ
                    254: ------------------------------------------------------------------
                    255:         Bra
                    256:         \x{100}
                    257:         Ket
                    258:         End
                    259: ------------------------------------------------------------------
                    260: Capturing subpattern count = 0
                    261: Options: utf
                    262: First char = \x{c4}
                    263: Need char = \x{80}
                    264: 
                    265: /\x{1000}/8DZ
                    266: ------------------------------------------------------------------
                    267:         Bra
                    268:         \x{1000}
                    269:         Ket
                    270:         End
                    271: ------------------------------------------------------------------
                    272: Capturing subpattern count = 0
                    273: Options: utf
                    274: First char = \x{e1}
                    275: Need char = \x{80}
                    276: 
                    277: /\x{10000}/8DZ
                    278: ------------------------------------------------------------------
                    279:         Bra
                    280:         \x{10000}
                    281:         Ket
                    282:         End
                    283: ------------------------------------------------------------------
                    284: Capturing subpattern count = 0
                    285: Options: utf
                    286: First char = \x{f0}
                    287: Need char = \x{80}
                    288: 
                    289: /\x{100000}/8DZ
                    290: ------------------------------------------------------------------
                    291:         Bra
                    292:         \x{100000}
                    293:         Ket
                    294:         End
                    295: ------------------------------------------------------------------
                    296: Capturing subpattern count = 0
                    297: Options: utf
                    298: First char = \x{f4}
                    299: Need char = \x{80}
                    300: 
                    301: /\x{10ffff}/8DZ
                    302: ------------------------------------------------------------------
                    303:         Bra
                    304:         \x{10ffff}
                    305:         Ket
                    306:         End
                    307: ------------------------------------------------------------------
                    308: Capturing subpattern count = 0
                    309: Options: utf
                    310: First char = \x{f4}
                    311: Need char = \x{bf}
                    312: 
                    313: /[\x{ff}]/8DZ
                    314: ------------------------------------------------------------------
                    315:         Bra
                    316:         \x{ff}
                    317:         Ket
                    318:         End
                    319: ------------------------------------------------------------------
                    320: Capturing subpattern count = 0
                    321: Options: utf
                    322: First char = \x{c3}
                    323: Need char = \x{bf}
                    324: 
                    325: /[\x{100}]/8DZ
                    326: ------------------------------------------------------------------
                    327:         Bra
                    328:         \x{100}
                    329:         Ket
                    330:         End
                    331: ------------------------------------------------------------------
                    332: Capturing subpattern count = 0
                    333: Options: utf
                    334: First char = \x{c4}
                    335: Need char = \x{80}
                    336: 
                    337: /\x80/8DZ
                    338: ------------------------------------------------------------------
                    339:         Bra
                    340:         \x{80}
                    341:         Ket
                    342:         End
                    343: ------------------------------------------------------------------
                    344: Capturing subpattern count = 0
                    345: Options: utf
                    346: First char = \x{c2}
                    347: Need char = \x{80}
                    348: 
                    349: /\xff/8DZ
                    350: ------------------------------------------------------------------
                    351:         Bra
                    352:         \x{ff}
                    353:         Ket
                    354:         End
                    355: ------------------------------------------------------------------
                    356: Capturing subpattern count = 0
                    357: Options: utf
                    358: First char = \x{c3}
                    359: Need char = \x{bf}
                    360: 
                    361: /\x{D55c}\x{ad6d}\x{C5B4}/DZ8 
                    362: ------------------------------------------------------------------
                    363:         Bra
                    364:         \x{d55c}\x{ad6d}\x{c5b4}
                    365:         Ket
                    366:         End
                    367: ------------------------------------------------------------------
                    368: Capturing subpattern count = 0
                    369: Options: utf
                    370: First char = \x{ed}
                    371: Need char = \x{b4}
                    372:     \x{D55c}\x{ad6d}\x{C5B4} 
                    373:  0: \x{d55c}\x{ad6d}\x{c5b4}
                    374: 
                    375: /\x{65e5}\x{672c}\x{8a9e}/DZ8
                    376: ------------------------------------------------------------------
                    377:         Bra
                    378:         \x{65e5}\x{672c}\x{8a9e}
                    379:         Ket
                    380:         End
                    381: ------------------------------------------------------------------
                    382: Capturing subpattern count = 0
                    383: Options: utf
                    384: First char = \x{e6}
                    385: Need char = \x{9e}
                    386:     \x{65e5}\x{672c}\x{8a9e}
                    387:  0: \x{65e5}\x{672c}\x{8a9e}
                    388: 
                    389: /\x{80}/DZ8
                    390: ------------------------------------------------------------------
                    391:         Bra
                    392:         \x{80}
                    393:         Ket
                    394:         End
                    395: ------------------------------------------------------------------
                    396: Capturing subpattern count = 0
                    397: Options: utf
                    398: First char = \x{c2}
                    399: Need char = \x{80}
                    400: 
                    401: /\x{084}/DZ8
                    402: ------------------------------------------------------------------
                    403:         Bra
                    404:         \x{84}
                    405:         Ket
                    406:         End
                    407: ------------------------------------------------------------------
                    408: Capturing subpattern count = 0
                    409: Options: utf
                    410: First char = \x{c2}
                    411: Need char = \x{84}
                    412: 
                    413: /\x{104}/DZ8
                    414: ------------------------------------------------------------------
                    415:         Bra
                    416:         \x{104}
                    417:         Ket
                    418:         End
                    419: ------------------------------------------------------------------
                    420: Capturing subpattern count = 0
                    421: Options: utf
                    422: First char = \x{c4}
                    423: Need char = \x{84}
                    424: 
                    425: /\x{861}/DZ8
                    426: ------------------------------------------------------------------
                    427:         Bra
                    428:         \x{861}
                    429:         Ket
                    430:         End
                    431: ------------------------------------------------------------------
                    432: Capturing subpattern count = 0
                    433: Options: utf
                    434: First char = \x{e0}
                    435: Need char = \x{a1}
                    436: 
                    437: /\x{212ab}/DZ8
                    438: ------------------------------------------------------------------
                    439:         Bra
                    440:         \x{212ab}
                    441:         Ket
                    442:         End
                    443: ------------------------------------------------------------------
                    444: Capturing subpattern count = 0
                    445: Options: utf
                    446: First char = \x{f0}
                    447: Need char = \x{ab}
                    448: 
                    449: /-- This one is here not because it's different to Perl, but because the way
                    450: the captured single-byte is displayed. (In Perl it becomes a character, and you
                    451: can't tell the difference.) --/
                    452:     
                    453: /X(\C)(.*)/8
                    454:     X\x{1234}
                    455:  0: X\x{1234}
                    456:  1: \x{e1}
                    457:  2: \x{88}\x{b4}
                    458:     X\nabc 
                    459:  0: X\x{0a}abc
                    460:  1: \x{0a}
                    461:  2: abc
                    462: 
                    463: /-- This one is here because Perl gives out a grumbly error message (quite 
                    464: correctly, but that messes up comparisons). --/
                    465:     
                    466: /a\Cb/8
                    467:     *** Failers 
                    468: No match
                    469:     a\x{100}b 
                    470: No match
                    471:     
                    472: /[^ab\xC0-\xF0]/8SDZ
                    473: ------------------------------------------------------------------
                    474:         Bra
                    475:         [\x00-`c-\xbf\xf1-\xff] (neg)
                    476:         Ket
                    477:         End
                    478: ------------------------------------------------------------------
                    479: Capturing subpattern count = 0
                    480: Options: utf
                    481: No first char
                    482: No need char
                    483: Subject length lower bound = 1
                    484: Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
                    485:   \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
                    486:   \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
                    487:   5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
                    488:   Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f 
                    489:   \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
                    490:   \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
                    491:   \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
                    492:   \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
                    493:   \xfe \xff 
                    494:     \x{f1}
                    495:  0: \x{f1}
                    496:     \x{bf}
                    497:  0: \x{bf}
                    498:     \x{100}
                    499:  0: \x{100}
                    500:     \x{1000}   
                    501:  0: \x{1000}
                    502:     *** Failers
                    503:  0: *
                    504:     \x{c0} 
                    505: No match
                    506:     \x{f0} 
                    507: No match
                    508: 
                    509: /Ā{3,4}/8SDZ
                    510: ------------------------------------------------------------------
                    511:         Bra
                    512:         \x{100}{3}
1.1.1.5 ! misho     513:         \x{100}?+
1.1.1.2   misho     514:         Ket
                    515:         End
                    516: ------------------------------------------------------------------
                    517: Capturing subpattern count = 0
                    518: Options: utf
                    519: First char = \x{c4}
                    520: Need char = \x{80}
1.1       misho     521: Subject length lower bound = 3
                    522: No set of starting bytes
1.1.1.2   misho     523:   \x{100}\x{100}\x{100}\x{100\x{100}
                    524:  0: \x{100}\x{100}\x{100}
                    525: 
                    526: /(\x{100}+|x)/8SDZ
                    527: ------------------------------------------------------------------
                    528:         Bra
                    529:         CBra 1
1.1.1.5 ! misho     530:         \x{100}++
1.1.1.2   misho     531:         Alt
                    532:         x
                    533:         Ket
                    534:         Ket
                    535:         End
                    536: ------------------------------------------------------------------
                    537: Capturing subpattern count = 1
                    538: Options: utf
                    539: No first char
                    540: No need char
                    541: Subject length lower bound = 1
                    542: Starting byte set: x \xc4 
                    543: 
                    544: /(\x{100}*a|x)/8SDZ
                    545: ------------------------------------------------------------------
                    546:         Bra
                    547:         CBra 1
                    548:         \x{100}*+
                    549:         a
                    550:         Alt
                    551:         x
                    552:         Ket
                    553:         Ket
                    554:         End
                    555: ------------------------------------------------------------------
                    556: Capturing subpattern count = 1
                    557: Options: utf
                    558: No first char
                    559: No need char
                    560: Subject length lower bound = 1
                    561: Starting byte set: a x \xc4 
                    562: 
                    563: /(\x{100}{0,2}a|x)/8SDZ
                    564: ------------------------------------------------------------------
                    565:         Bra
                    566:         CBra 1
1.1.1.5 ! misho     567:         \x{100}{0,2}+
1.1.1.2   misho     568:         a
                    569:         Alt
                    570:         x
                    571:         Ket
                    572:         Ket
                    573:         End
                    574: ------------------------------------------------------------------
                    575: Capturing subpattern count = 1
                    576: Options: utf
                    577: No first char
                    578: No need char
                    579: Subject length lower bound = 1
                    580: Starting byte set: a x \xc4 
                    581: 
                    582: /(\x{100}{1,2}a|x)/8SDZ
                    583: ------------------------------------------------------------------
                    584:         Bra
                    585:         CBra 1
                    586:         \x{100}
1.1.1.5 ! misho     587:         \x{100}{0,1}+
1.1.1.2   misho     588:         a
                    589:         Alt
                    590:         x
                    591:         Ket
                    592:         Ket
                    593:         End
                    594: ------------------------------------------------------------------
                    595: Capturing subpattern count = 1
                    596: Options: utf
                    597: No first char
                    598: No need char
                    599: Subject length lower bound = 1
                    600: Starting byte set: x \xc4 
                    601: 
                    602: /\x{100}/8DZ
                    603: ------------------------------------------------------------------
                    604:         Bra
                    605:         \x{100}
                    606:         Ket
                    607:         End
                    608: ------------------------------------------------------------------
                    609: Capturing subpattern count = 0
                    610: Options: utf
                    611: First char = \x{c4}
                    612: Need char = \x{80}
                    613: 
                    614: /a\x{100}\x{101}*/8DZ
                    615: ------------------------------------------------------------------
                    616:         Bra
                    617:         a\x{100}
1.1.1.5 ! misho     618:         \x{101}*+
1.1.1.2   misho     619:         Ket
                    620:         End
                    621: ------------------------------------------------------------------
                    622: Capturing subpattern count = 0
                    623: Options: utf
                    624: First char = 'a'
                    625: Need char = \x{80}
                    626: 
                    627: /a\x{100}\x{101}+/8DZ
                    628: ------------------------------------------------------------------
                    629:         Bra
                    630:         a\x{100}
1.1.1.5 ! misho     631:         \x{101}++
1.1.1.2   misho     632:         Ket
                    633:         End
                    634: ------------------------------------------------------------------
                    635: Capturing subpattern count = 0
                    636: Options: utf
                    637: First char = 'a'
                    638: Need char = \x{81}
1.1       misho     639: 
1.1.1.2   misho     640: /[^\x{c4}]/DZ
                    641: ------------------------------------------------------------------
                    642:         Bra
1.1.1.4   misho     643:         [^\x{c4}]
1.1.1.2   misho     644:         Ket
                    645:         End
                    646: ------------------------------------------------------------------
1.1       misho     647: Capturing subpattern count = 0
                    648: No options
                    649: No first char
                    650: No need char
1.1.1.2   misho     651: 
                    652: /[\x{100}]/8DZ
                    653: ------------------------------------------------------------------
                    654:         Bra
                    655:         \x{100}
                    656:         Ket
                    657:         End
                    658: ------------------------------------------------------------------
                    659: Capturing subpattern count = 0
                    660: Options: utf
                    661: First char = \x{c4}
                    662: Need char = \x{80}
                    663:     \x{100}
                    664:  0: \x{100}
                    665:     Z\x{100}
                    666:  0: \x{100}
                    667:     \x{100}Z
                    668:  0: \x{100}
                    669:     *** Failers 
                    670: No match
                    671: 
                    672: /[\xff]/DZ8
                    673: ------------------------------------------------------------------
                    674:         Bra
                    675:         \x{ff}
                    676:         Ket
                    677:         End
                    678: ------------------------------------------------------------------
                    679: Capturing subpattern count = 0
                    680: Options: utf
                    681: First char = \x{c3}
                    682: Need char = \x{bf}
                    683:     >\x{ff}<
                    684:  0: \x{ff}
                    685: 
                    686: /[^\xff]/8DZ
                    687: ------------------------------------------------------------------
                    688:         Bra
1.1.1.3   misho     689:         [^\x{ff}]
1.1.1.2   misho     690:         Ket
                    691:         End
                    692: ------------------------------------------------------------------
                    693: Capturing subpattern count = 0
                    694: Options: utf
                    695: No first char
                    696: No need char
                    697: 
                    698: /\x{100}abc(xyz(?1))/8DZ
                    699: ------------------------------------------------------------------
                    700:         Bra
                    701:         \x{100}abc
                    702:         CBra 1
                    703:         xyz
                    704:         Recurse
                    705:         Ket
                    706:         Ket
                    707:         End
                    708: ------------------------------------------------------------------
                    709: Capturing subpattern count = 1
                    710: Options: utf
                    711: First char = \x{c4}
                    712: Need char = 'z'
                    713: 
                    714: /a\x{1234}b/P8
                    715:     a\x{1234}b
                    716:  0: a\x{1234}b
                    717: 
                    718: /\777/8I
                    719: Capturing subpattern count = 0
                    720: Options: utf
                    721: First char = \x{c7}
                    722: Need char = \x{bf}
                    723:   \x{1ff}
                    724:  0: \x{1ff}
                    725:   \777 
                    726:  0: \x{1ff}
                    727:   
                    728: /\x{100}+\x{200}/8DZ
                    729: ------------------------------------------------------------------
                    730:         Bra
                    731:         \x{100}++
                    732:         \x{200}
                    733:         Ket
                    734:         End
                    735: ------------------------------------------------------------------
                    736: Capturing subpattern count = 0
                    737: Options: utf
                    738: First char = \x{c4}
                    739: Need char = \x{80}
                    740: 
                    741: /\x{100}+X/8DZ
                    742: ------------------------------------------------------------------
                    743:         Bra
                    744:         \x{100}++
                    745:         X
                    746:         Ket
                    747:         End
                    748: ------------------------------------------------------------------
                    749: Capturing subpattern count = 0
                    750: Options: utf
                    751: First char = \x{c4}
                    752: Need char = 'X'
                    753: 
                    754: /^[\QĀ\E-\QŐ\E/BZ8
                    755: Failed: missing terminating ] for character class at offset 15
                    756: 
                    757: /-- This tests the stricter UTF-8 check according to RFC 3629. --/ 
                    758:     
                    759: /X/8
                    760:     \x{d800}
                    761: Error -10 (bad UTF-8 string) offset=0 reason=14
                    762:     \x{d800}\?
                    763: No match
                    764:     \x{da00}
                    765: Error -10 (bad UTF-8 string) offset=0 reason=14
                    766:     \x{da00}\?
                    767: No match
                    768:     \x{dfff}
                    769: Error -10 (bad UTF-8 string) offset=0 reason=14
                    770:     \x{dfff}\?
                    771: No match
                    772:     \x{110000}    
                    773: Error -10 (bad UTF-8 string) offset=0 reason=13
                    774:     \x{110000}\?    
                    775: No match
                    776:     \x{2000000} 
                    777: Error -10 (bad UTF-8 string) offset=0 reason=11
                    778:     \x{2000000}\? 
                    779: No match
                    780:     \x{7fffffff} 
                    781: Error -10 (bad UTF-8 string) offset=0 reason=12
                    782:     \x{7fffffff}\? 
                    783: No match
                    784: 
                    785: /(*UTF8)\x{1234}/
                    786:   abcd\x{1234}pqr
                    787:  0: \x{1234}
                    788: 
1.1.1.4   misho     789: /(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
1.1.1.2   misho     790: Capturing subpattern count = 0
                    791: Options: bsr_unicode utf
                    792: Forced newline sequence: CRLF
                    793: First char = 'a'
                    794: Need char = 'b'
                    795: 
                    796: /\h/SI8
                    797: Capturing subpattern count = 0
                    798: Options: utf
                    799: No first char
                    800: No need char
                    801: Subject length lower bound = 1
                    802: Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3 
                    803:     ABC\x{09}
                    804:  0: \x{09}
                    805:     ABC\x{20}
                    806:  0:  
                    807:     ABC\x{a0}
                    808:  0: \x{a0}
                    809:     ABC\x{1680}
                    810:  0: \x{1680}
                    811:     ABC\x{180e}
                    812:  0: \x{180e}
                    813:     ABC\x{2000}
                    814:  0: \x{2000}
                    815:     ABC\x{202f} 
                    816:  0: \x{202f}
                    817:     ABC\x{205f} 
                    818:  0: \x{205f}
                    819:     ABC\x{3000} 
                    820:  0: \x{3000}
                    821: 
                    822: /\v/SI8
                    823: Capturing subpattern count = 0
                    824: Options: utf
                    825: No first char
                    826: No need char
                    827: Subject length lower bound = 1
                    828: Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
                    829:     ABC\x{0a}
                    830:  0: \x{0a}
                    831:     ABC\x{0b}
                    832:  0: \x{0b}
                    833:     ABC\x{0c}
                    834:  0: \x{0c}
                    835:     ABC\x{0d}
                    836:  0: \x{0d}
                    837:     ABC\x{85}
                    838:  0: \x{85}
                    839:     ABC\x{2028}
                    840:  0: \x{2028}
                    841: 
                    842: /\h*A/SI8
                    843: Capturing subpattern count = 0
                    844: Options: utf
                    845: No first char
                    846: Need char = 'A'
                    847: Subject length lower bound = 1
                    848: Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 
                    849:     CDBABC
                    850:  0: A
                    851:     
                    852: /\v+A/SI8
                    853: Capturing subpattern count = 0
                    854: Options: utf
                    855: No first char
                    856: Need char = 'A'
                    857: Subject length lower bound = 2
                    858: Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
                    859: 
                    860: /\s?xxx\s/8SI
                    861: Capturing subpattern count = 0
                    862: Options: utf
                    863: No first char
                    864: Need char = 'x'
                    865: Subject length lower bound = 4
1.1.1.5 ! misho     866: Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x 
1.1.1.2   misho     867: 
                    868: /\sxxx\s/I8ST1
                    869: Capturing subpattern count = 0
                    870: Options: utf
                    871: No first char
                    872: Need char = 'x'
                    873: Subject length lower bound = 5
                    874: Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2 
                    875:     AB\x{85}xxx\x{a0}XYZ
                    876:  0: \x{85}xxx\x{a0}
                    877:     AB\x{a0}xxx\x{85}XYZ
                    878:  0: \x{a0}xxx\x{85}
                    879: 
                    880: /\S \S/I8ST1
                    881: Capturing subpattern count = 0
                    882: Options: utf
                    883: No first char
                    884: Need char = ' '
                    885: Subject length lower bound = 3
                    886: Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
                    887:   \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
                    888:   \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
                    889:   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
                    890:   f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 
                    891:   \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 
                    892:   \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 
                    893:   \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 
                    894:   \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
                    895:     \x{a2} \x{84} 
                    896:  0: \x{a2} \x{84}
                    897:     A Z 
                    898:  0: A Z
                    899: 
                    900: /a+/8
                    901:     a\x{123}aa\>1
                    902:  0: aa
                    903:     a\x{123}aa\>2
                    904: Error -11 (bad UTF-8 offset)
                    905:     a\x{123}aa\>3
                    906:  0: aa
                    907:     a\x{123}aa\>4
                    908:  0: a
                    909:     a\x{123}aa\>5
                    910: No match
                    911:     a\x{123}aa\>6
                    912: Error -24 (bad offset value)
                    913: 
                    914: /\x{1234}+/iS8I
                    915: Capturing subpattern count = 0
                    916: Options: caseless utf
                    917: No first char
                    918: No need char
                    919: Subject length lower bound = 1
                    920: Starting byte set: \xe1 
                    921: 
                    922: /\x{1234}+?/iS8I
                    923: Capturing subpattern count = 0
                    924: Options: caseless utf
                    925: No first char
                    926: No need char
                    927: Subject length lower bound = 1
                    928: Starting byte set: \xe1 
                    929: 
                    930: /\x{1234}++/iS8I
                    931: Capturing subpattern count = 0
                    932: Options: caseless utf
                    933: No first char
                    934: No need char
                    935: Subject length lower bound = 1
                    936: Starting byte set: \xe1 
                    937: 
                    938: /\x{1234}{2}/iS8I
                    939: Capturing subpattern count = 0
                    940: Options: caseless utf
                    941: No first char
                    942: No need char
                    943: Subject length lower bound = 2
                    944: Starting byte set: \xe1 
                    945: 
                    946: /[^\x{c4}]/8DZ
                    947: ------------------------------------------------------------------
                    948:         Bra
1.1.1.3   misho     949:         [^\x{c4}]
1.1.1.2   misho     950:         Ket
                    951:         End
                    952: ------------------------------------------------------------------
                    953: Capturing subpattern count = 0
                    954: Options: utf
                    955: No first char
                    956: No need char
                    957: 
                    958: /X+\x{200}/8DZ
                    959: ------------------------------------------------------------------
                    960:         Bra
                    961:         X++
                    962:         \x{200}
                    963:         Ket
                    964:         End
                    965: ------------------------------------------------------------------
                    966: Capturing subpattern count = 0
                    967: Options: utf
                    968: First char = 'X'
                    969: Need char = \x{80}
                    970: 
                    971: /\R/SI8
                    972: Capturing subpattern count = 0
                    973: Options: utf
                    974: No first char
                    975: No need char
                    976: Subject length lower bound = 1
                    977: Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
                    978: 
                    979: /\777/8DZ
                    980: ------------------------------------------------------------------
                    981:         Bra
                    982:         \x{1ff}
                    983:         Ket
                    984:         End
                    985: ------------------------------------------------------------------
                    986: Capturing subpattern count = 0
                    987: Options: utf
                    988: First char = \x{c7}
                    989: Need char = \x{bf}
1.1       misho     990: 
1.1.1.3   misho     991: /\w+\x{C4}/8BZ
                    992: ------------------------------------------------------------------
                    993:         Bra
                    994:         \w++
                    995:         \x{c4}
                    996:         Ket
                    997:         End
                    998: ------------------------------------------------------------------
                    999:     a\x{C4}\x{C4}
                   1000:  0: a\x{c4}
                   1001: 
                   1002: /\w+\x{C4}/8BZT1
                   1003: ------------------------------------------------------------------
                   1004:         Bra
                   1005:         \w+
                   1006:         \x{c4}
                   1007:         Ket
                   1008:         End
                   1009: ------------------------------------------------------------------
                   1010:     a\x{C4}\x{C4}
                   1011:  0: a\x{c4}\x{c4}
                   1012:     
                   1013: /\W+\x{C4}/8BZ
                   1014: ------------------------------------------------------------------
                   1015:         Bra
                   1016:         \W+
                   1017:         \x{c4}
                   1018:         Ket
                   1019:         End
                   1020: ------------------------------------------------------------------
                   1021:     !\x{C4}
                   1022:  0: !\x{c4}
                   1023:  
                   1024: /\W+\x{C4}/8BZT1
                   1025: ------------------------------------------------------------------
                   1026:         Bra
                   1027:         \W++
                   1028:         \x{c4}
                   1029:         Ket
                   1030:         End
                   1031: ------------------------------------------------------------------
                   1032:     !\x{C4}
                   1033:  0: !\x{c4}
                   1034: 
                   1035: /\W+\x{A1}/8BZ
                   1036: ------------------------------------------------------------------
                   1037:         Bra
                   1038:         \W+
                   1039:         \x{a1}
                   1040:         Ket
                   1041:         End
                   1042: ------------------------------------------------------------------
                   1043:     !\x{A1}
                   1044:  0: !\x{a1}
                   1045:  
                   1046: /\W+\x{A1}/8BZT1
                   1047: ------------------------------------------------------------------
                   1048:         Bra
                   1049:         \W+
                   1050:         \x{a1}
                   1051:         Ket
                   1052:         End
                   1053: ------------------------------------------------------------------
                   1054:     !\x{A1}
                   1055:  0: !\x{a1}
                   1056: 
                   1057: /X\s+\x{A0}/8BZ
                   1058: ------------------------------------------------------------------
                   1059:         Bra
                   1060:         X
                   1061:         \s++
                   1062:         \x{a0}
                   1063:         Ket
                   1064:         End
                   1065: ------------------------------------------------------------------
                   1066:     X\x20\x{A0}\x{A0}
                   1067:  0: X \x{a0}
                   1068: 
                   1069: /X\s+\x{A0}/8BZT1
                   1070: ------------------------------------------------------------------
                   1071:         Bra
                   1072:         X
                   1073:         \s+
                   1074:         \x{a0}
                   1075:         Ket
                   1076:         End
                   1077: ------------------------------------------------------------------
                   1078:     X\x20\x{A0}\x{A0}
                   1079:  0: X \x{a0}\x{a0}
                   1080: 
                   1081: /\S+\x{A0}/8BZ
                   1082: ------------------------------------------------------------------
                   1083:         Bra
                   1084:         \S+
                   1085:         \x{a0}
                   1086:         Ket
                   1087:         End
                   1088: ------------------------------------------------------------------
                   1089:     X\x{A0}\x{A0}
                   1090:  0: X\x{a0}\x{a0}
                   1091: 
                   1092: /\S+\x{A0}/8BZT1
                   1093: ------------------------------------------------------------------
                   1094:         Bra
                   1095:         \S++
                   1096:         \x{a0}
                   1097:         Ket
                   1098:         End
                   1099: ------------------------------------------------------------------
                   1100:     X\x{A0}\x{A0}
                   1101:  0: X\x{a0}
                   1102: 
                   1103: /\x{a0}+\s!/8BZ
                   1104: ------------------------------------------------------------------
                   1105:         Bra
                   1106:         \x{a0}++
                   1107:         \s
                   1108:         !
                   1109:         Ket
                   1110:         End
                   1111: ------------------------------------------------------------------
                   1112:     \x{a0}\x20!
                   1113:  0: \x{a0} !
                   1114: 
                   1115: /\x{a0}+\s!/8BZT1
                   1116: ------------------------------------------------------------------
                   1117:         Bra
                   1118:         \x{a0}+
                   1119:         \s
                   1120:         !
                   1121:         Ket
                   1122:         End
                   1123: ------------------------------------------------------------------
                   1124:     \x{a0}\x20!
                   1125:  0: \x{a0} !
                   1126: 
1.1.1.4   misho    1127: /A/8
                   1128:   \x{ff000041}
                   1129: ** Character \x{ff000041} is greater than 0x7fffffff and so cannot be converted to UTF-8
                   1130:   \x{7f000041} 
                   1131: Error -10 (bad UTF-8 string) offset=0 reason=12
                   1132: 
                   1133: /(*UTF8)abc/9
                   1134: Failed: setting UTF is disabled by the application at offset 0
                   1135: 
                   1136: /abc/89
                   1137: Failed: setting UTF is disabled by the application at offset 0
                   1138: 
1.1       misho    1139: /-- End of testinput15 --/
