Annotation of embedaddon/pcre/testdata/testoutput15, revision 1.1.1.3
1.1.1.2 misho 1: /-- This set of tests is for UTF-8 support, and is relevant only to the 8-bit
2: library. --/
3:
4: /X(\C{3})/8
5: X\x{1234}
6: 0: X\x{1234}
7: 1: \x{1234}
8:
9: /X(\C{4})/8
10: X\x{1234}YZ
11: 0: X\x{1234}Y
12: 1: \x{1234}Y
13:
14: /X\C*/8
15: XYZabcdce
16: 0: XYZabcdce
17:
18: /X\C*?/8
19: XYZabcde
20: 0: X
21:
22: /X\C{3,5}/8
23: Xabcdefg
24: 0: Xabcde
25: X\x{1234}
26: 0: X\x{1234}
27: X\x{1234}YZ
28: 0: X\x{1234}YZ
29: X\x{1234}\x{512}
30: 0: X\x{1234}\x{512}
31: X\x{1234}\x{512}YZ
32: 0: X\x{1234}\x{512}
33:
34: /X\C{3,5}?/8
35: Xabcdefg
36: 0: Xabc
37: X\x{1234}
38: 0: X\x{1234}
39: X\x{1234}YZ
40: 0: X\x{1234}
41: X\x{1234}\x{512}
42: 0: X\x{1234}
43:
44: /a\Cb/8
45: aXb
46: 0: aXb
47: a\nb
48: 0: a\x{0a}b
49:
50: /a\C\Cb/8
51: a\x{100}b
52: 0: a\x{100}b
53:
54: /ab\Cde/8
55: abXde
56: 0: abXde
57:
58: /a\C\Cb/8
59: a\x{100}b
60: 0: a\x{100}b
61: ** Failers
62: No match
63: a\x{12257}b
64: No match
65:
66: /[]/8
67: Failed: invalid UTF-8 string at offset 1
68:
69: //8
70: Failed: invalid UTF-8 string at offset 0
71:
72: /xxx/8
73: Failed: invalid UTF-8 string at offset 0
74:
75: /xxx/8?DZSS
76: ------------------------------------------------------------------
77: Bra
78: \X{c0}\X{c0}\X{c0}xxx
79: Ket
80: End
81: ------------------------------------------------------------------
1.1 misho 82: Capturing subpattern count = 0
1.1.1.2 misho 83: Options: utf no_utf_check
84: First char = \x{c3}
85: Need char = 'x'
86:
87: /abc/8
88: ]
89: Error -10 (bad UTF-8 string) offset=0 reason=6
90:
91: Error -10 (bad UTF-8 string) offset=0 reason=1
92:
93: Error -10 (bad UTF-8 string) offset=0 reason=6
94: \?
95: No match
96: \xe1\x88
97: Error -10 (bad UTF-8 string) offset=0 reason=1
98: \P\xe1\x88
99: Error -10 (bad UTF-8 string) offset=0 reason=1
100: \P\P\xe1\x88
101: Error -25 (short UTF-8 string) offset=0 reason=1
102: XX\xea
103: Error -10 (bad UTF-8 string) offset=2 reason=2
104: \O0XX\xea
105: Error -10 (bad UTF-8 string)
106: \O1XX\xea
107: Error -10 (bad UTF-8 string)
108: \O2XX\xea
109: Error -10 (bad UTF-8 string) offset=2 reason=2
110: XX\xf1
111: Error -10 (bad UTF-8 string) offset=2 reason=3
112: XX\xf8
113: Error -10 (bad UTF-8 string) offset=2 reason=4
114: XX\xfc
115: Error -10 (bad UTF-8 string) offset=2 reason=5
116: ZZ\xea\xaf\x20YY
117: Error -10 (bad UTF-8 string) offset=2 reason=7
118: ZZ\xfd\xbf\xbf\x2f\xbf\xbfYY
119: Error -10 (bad UTF-8 string) offset=2 reason=8
120: ZZ\xfd\xbf\xbf\xbf\x2f\xbfYY
121: Error -10 (bad UTF-8 string) offset=2 reason=9
122: ZZ\xfd\xbf\xbf\xbf\xbf\x2fYY
123: Error -10 (bad UTF-8 string) offset=2 reason=10
124: ZZ\xffYY
125: Error -10 (bad UTF-8 string) offset=2 reason=21
126: ZZ\xfeYY
127: Error -10 (bad UTF-8 string) offset=2 reason=21
128:
129: /anything/8
130: \xc0\x80
131: Error -10 (bad UTF-8 string) offset=0 reason=15
132: \xc1\x8f
133: Error -10 (bad UTF-8 string) offset=0 reason=15
134: \xe0\x9f\x80
135: Error -10 (bad UTF-8 string) offset=0 reason=16
136: \xf0\x8f\x80\x80
137: Error -10 (bad UTF-8 string) offset=0 reason=17
138: \xf8\x87\x80\x80\x80
139: Error -10 (bad UTF-8 string) offset=0 reason=18
140: \xfc\x83\x80\x80\x80\x80
141: Error -10 (bad UTF-8 string) offset=0 reason=19
142: \xfe\x80\x80\x80\x80\x80
143: Error -10 (bad UTF-8 string) offset=0 reason=21
144: \xff\x80\x80\x80\x80\x80
145: Error -10 (bad UTF-8 string) offset=0 reason=21
146: \xc3\x8f
147: No match
148: \xe0\xaf\x80
149: No match
150: \xe1\x80\x80
151: No match
152: \xf0\x9f\x80\x80
153: No match
154: \xf1\x8f\x80\x80
155: No match
156: \xf8\x88\x80\x80\x80
157: Error -10 (bad UTF-8 string) offset=0 reason=11
158: \xf9\x87\x80\x80\x80
159: Error -10 (bad UTF-8 string) offset=0 reason=11
160: \xfc\x84\x80\x80\x80\x80
161: Error -10 (bad UTF-8 string) offset=0 reason=12
162: \xfd\x83\x80\x80\x80\x80
163: Error -10 (bad UTF-8 string) offset=0 reason=12
164: \?\xf8\x88\x80\x80\x80
165: No match
166: \?\xf9\x87\x80\x80\x80
167: No match
168: \?\xfc\x84\x80\x80\x80\x80
169: No match
170: \?\xfd\x83\x80\x80\x80\x80
171: No match
172:
173: /\x{100}/8DZ
174: ------------------------------------------------------------------
175: Bra
176: \x{100}
177: Ket
178: End
179: ------------------------------------------------------------------
180: Capturing subpattern count = 0
181: Options: utf
182: First char = \x{c4}
183: Need char = \x{80}
184:
185: /\x{1000}/8DZ
186: ------------------------------------------------------------------
187: Bra
188: \x{1000}
189: Ket
190: End
191: ------------------------------------------------------------------
192: Capturing subpattern count = 0
193: Options: utf
194: First char = \x{e1}
195: Need char = \x{80}
196:
197: /\x{10000}/8DZ
198: ------------------------------------------------------------------
199: Bra
200: \x{10000}
201: Ket
202: End
203: ------------------------------------------------------------------
204: Capturing subpattern count = 0
205: Options: utf
206: First char = \x{f0}
207: Need char = \x{80}
208:
209: /\x{100000}/8DZ
210: ------------------------------------------------------------------
211: Bra
212: \x{100000}
213: Ket
214: End
215: ------------------------------------------------------------------
216: Capturing subpattern count = 0
217: Options: utf
218: First char = \x{f4}
219: Need char = \x{80}
220:
221: /\x{10ffff}/8DZ
222: ------------------------------------------------------------------
223: Bra
224: \x{10ffff}
225: Ket
226: End
227: ------------------------------------------------------------------
228: Capturing subpattern count = 0
229: Options: utf
230: First char = \x{f4}
231: Need char = \x{bf}
232:
233: /[\x{ff}]/8DZ
234: ------------------------------------------------------------------
235: Bra
236: \x{ff}
237: Ket
238: End
239: ------------------------------------------------------------------
240: Capturing subpattern count = 0
241: Options: utf
242: First char = \x{c3}
243: Need char = \x{bf}
244:
245: /[\x{100}]/8DZ
246: ------------------------------------------------------------------
247: Bra
248: \x{100}
249: Ket
250: End
251: ------------------------------------------------------------------
252: Capturing subpattern count = 0
253: Options: utf
254: First char = \x{c4}
255: Need char = \x{80}
256:
257: /\x80/8DZ
258: ------------------------------------------------------------------
259: Bra
260: \x{80}
261: Ket
262: End
263: ------------------------------------------------------------------
264: Capturing subpattern count = 0
265: Options: utf
266: First char = \x{c2}
267: Need char = \x{80}
268:
269: /\xff/8DZ
270: ------------------------------------------------------------------
271: Bra
272: \x{ff}
273: Ket
274: End
275: ------------------------------------------------------------------
276: Capturing subpattern count = 0
277: Options: utf
278: First char = \x{c3}
279: Need char = \x{bf}
280:
281: /\x{D55c}\x{ad6d}\x{C5B4}/DZ8
282: ------------------------------------------------------------------
283: Bra
284: \x{d55c}\x{ad6d}\x{c5b4}
285: Ket
286: End
287: ------------------------------------------------------------------
288: Capturing subpattern count = 0
289: Options: utf
290: First char = \x{ed}
291: Need char = \x{b4}
292: \x{D55c}\x{ad6d}\x{C5B4}
293: 0: \x{d55c}\x{ad6d}\x{c5b4}
294:
295: /\x{65e5}\x{672c}\x{8a9e}/DZ8
296: ------------------------------------------------------------------
297: Bra
298: \x{65e5}\x{672c}\x{8a9e}
299: Ket
300: End
301: ------------------------------------------------------------------
302: Capturing subpattern count = 0
303: Options: utf
304: First char = \x{e6}
305: Need char = \x{9e}
306: \x{65e5}\x{672c}\x{8a9e}
307: 0: \x{65e5}\x{672c}\x{8a9e}
308:
309: /\x{80}/DZ8
310: ------------------------------------------------------------------
311: Bra
312: \x{80}
313: Ket
314: End
315: ------------------------------------------------------------------
316: Capturing subpattern count = 0
317: Options: utf
318: First char = \x{c2}
319: Need char = \x{80}
320:
321: /\x{084}/DZ8
322: ------------------------------------------------------------------
323: Bra
324: \x{84}
325: Ket
326: End
327: ------------------------------------------------------------------
328: Capturing subpattern count = 0
329: Options: utf
330: First char = \x{c2}
331: Need char = \x{84}
332:
333: /\x{104}/DZ8
334: ------------------------------------------------------------------
335: Bra
336: \x{104}
337: Ket
338: End
339: ------------------------------------------------------------------
340: Capturing subpattern count = 0
341: Options: utf
342: First char = \x{c4}
343: Need char = \x{84}
344:
345: /\x{861}/DZ8
346: ------------------------------------------------------------------
347: Bra
348: \x{861}
349: Ket
350: End
351: ------------------------------------------------------------------
352: Capturing subpattern count = 0
353: Options: utf
354: First char = \x{e0}
355: Need char = \x{a1}
356:
357: /\x{212ab}/DZ8
358: ------------------------------------------------------------------
359: Bra
360: \x{212ab}
361: Ket
362: End
363: ------------------------------------------------------------------
364: Capturing subpattern count = 0
365: Options: utf
366: First char = \x{f0}
367: Need char = \x{ab}
368:
369: /-- This one is here not because it's different to Perl, but because the way
370: the captured single-byte is displayed. (In Perl it becomes a character, and you
371: can't tell the difference.) --/
372:
373: /X(\C)(.*)/8
374: X\x{1234}
375: 0: X\x{1234}
376: 1: \x{e1}
377: 2: \x{88}\x{b4}
378: X\nabc
379: 0: X\x{0a}abc
380: 1: \x{0a}
381: 2: abc
382:
383: /-- This one is here because Perl gives out a grumbly error message (quite
384: correctly, but that messes up comparisons). --/
385:
386: /a\Cb/8
387: *** Failers
388: No match
389: a\x{100}b
390: No match
391:
392: /[^ab\xC0-\xF0]/8SDZ
393: ------------------------------------------------------------------
394: Bra
395: [\x00-`c-\xbf\xf1-\xff] (neg)
396: Ket
397: End
398: ------------------------------------------------------------------
399: Capturing subpattern count = 0
400: Options: utf
401: No first char
402: No need char
403: Subject length lower bound = 1
404: Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
405: \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
406: \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
407: 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
408: Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
409: \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0
410: \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf
411: \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee
412: \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd
413: \xfe \xff
414: \x{f1}
415: 0: \x{f1}
416: \x{bf}
417: 0: \x{bf}
418: \x{100}
419: 0: \x{100}
420: \x{1000}
421: 0: \x{1000}
422: *** Failers
423: 0: *
424: \x{c0}
425: No match
426: \x{f0}
427: No match
428:
429: /Ā{3,4}/8SDZ
430: ------------------------------------------------------------------
431: Bra
432: \x{100}{3}
433: \x{100}?
434: Ket
435: End
436: ------------------------------------------------------------------
437: Capturing subpattern count = 0
438: Options: utf
439: First char = \x{c4}
440: Need char = \x{80}
1.1 misho 441: Subject length lower bound = 3
442: No set of starting bytes
1.1.1.2 misho 443: \x{100}\x{100}\x{100}\x{100\x{100}
444: 0: \x{100}\x{100}\x{100}
445:
446: /(\x{100}+|x)/8SDZ
447: ------------------------------------------------------------------
448: Bra
449: CBra 1
450: \x{100}+
451: Alt
452: x
453: Ket
454: Ket
455: End
456: ------------------------------------------------------------------
457: Capturing subpattern count = 1
458: Options: utf
459: No first char
460: No need char
461: Subject length lower bound = 1
462: Starting byte set: x \xc4
463:
464: /(\x{100}*a|x)/8SDZ
465: ------------------------------------------------------------------
466: Bra
467: CBra 1
468: \x{100}*+
469: a
470: Alt
471: x
472: Ket
473: Ket
474: End
475: ------------------------------------------------------------------
476: Capturing subpattern count = 1
477: Options: utf
478: No first char
479: No need char
480: Subject length lower bound = 1
481: Starting byte set: a x \xc4
482:
483: /(\x{100}{0,2}a|x)/8SDZ
484: ------------------------------------------------------------------
485: Bra
486: CBra 1
487: \x{100}{0,2}
488: a
489: Alt
490: x
491: Ket
492: Ket
493: End
494: ------------------------------------------------------------------
495: Capturing subpattern count = 1
496: Options: utf
497: No first char
498: No need char
499: Subject length lower bound = 1
500: Starting byte set: a x \xc4
501:
502: /(\x{100}{1,2}a|x)/8SDZ
503: ------------------------------------------------------------------
504: Bra
505: CBra 1
506: \x{100}
507: \x{100}{0,1}
508: a
509: Alt
510: x
511: Ket
512: Ket
513: End
514: ------------------------------------------------------------------
515: Capturing subpattern count = 1
516: Options: utf
517: No first char
518: No need char
519: Subject length lower bound = 1
520: Starting byte set: x \xc4
521:
522: /\x{100}/8DZ
523: ------------------------------------------------------------------
524: Bra
525: \x{100}
526: Ket
527: End
528: ------------------------------------------------------------------
529: Capturing subpattern count = 0
530: Options: utf
531: First char = \x{c4}
532: Need char = \x{80}
533:
534: /a\x{100}\x{101}*/8DZ
535: ------------------------------------------------------------------
536: Bra
537: a\x{100}
538: \x{101}*
539: Ket
540: End
541: ------------------------------------------------------------------
542: Capturing subpattern count = 0
543: Options: utf
544: First char = 'a'
545: Need char = \x{80}
546:
547: /a\x{100}\x{101}+/8DZ
548: ------------------------------------------------------------------
549: Bra
550: a\x{100}
551: \x{101}+
552: Ket
553: End
554: ------------------------------------------------------------------
555: Capturing subpattern count = 0
556: Options: utf
557: First char = 'a'
558: Need char = \x{81}
1.1 misho 559:
1.1.1.2 misho 560: /[^\x{c4}]/DZ
561: ------------------------------------------------------------------
562: Bra
563: [^\xc4]
564: Ket
565: End
566: ------------------------------------------------------------------
1.1 misho 567: Capturing subpattern count = 0
568: No options
569: No first char
570: No need char
1.1.1.2 misho 571:
572: /[\x{100}]/8DZ
573: ------------------------------------------------------------------
574: Bra
575: \x{100}
576: Ket
577: End
578: ------------------------------------------------------------------
579: Capturing subpattern count = 0
580: Options: utf
581: First char = \x{c4}
582: Need char = \x{80}
583: \x{100}
584: 0: \x{100}
585: Z\x{100}
586: 0: \x{100}
587: \x{100}Z
588: 0: \x{100}
589: *** Failers
590: No match
591:
592: /[\xff]/DZ8
593: ------------------------------------------------------------------
594: Bra
595: \x{ff}
596: Ket
597: End
598: ------------------------------------------------------------------
599: Capturing subpattern count = 0
600: Options: utf
601: First char = \x{c3}
602: Need char = \x{bf}
603: >\x{ff}<
604: 0: \x{ff}
605:
606: /[^\xff]/8DZ
607: ------------------------------------------------------------------
608: Bra
1.1.1.3 ! misho 609: [^\x{ff}]
1.1.1.2 misho 610: Ket
611: End
612: ------------------------------------------------------------------
613: Capturing subpattern count = 0
614: Options: utf
615: No first char
616: No need char
617:
618: /\x{100}abc(xyz(?1))/8DZ
619: ------------------------------------------------------------------
620: Bra
621: \x{100}abc
622: CBra 1
623: xyz
624: Recurse
625: Ket
626: Ket
627: End
628: ------------------------------------------------------------------
629: Capturing subpattern count = 1
630: Options: utf
631: First char = \x{c4}
632: Need char = 'z'
633:
634: /a\x{1234}b/P8
635: a\x{1234}b
636: 0: a\x{1234}b
637:
638: /\777/8I
639: Capturing subpattern count = 0
640: Options: utf
641: First char = \x{c7}
642: Need char = \x{bf}
643: \x{1ff}
644: 0: \x{1ff}
645: \777
646: 0: \x{1ff}
647:
648: /\x{100}+\x{200}/8DZ
649: ------------------------------------------------------------------
650: Bra
651: \x{100}++
652: \x{200}
653: Ket
654: End
655: ------------------------------------------------------------------
656: Capturing subpattern count = 0
657: Options: utf
658: First char = \x{c4}
659: Need char = \x{80}
660:
661: /\x{100}+X/8DZ
662: ------------------------------------------------------------------
663: Bra
664: \x{100}++
665: X
666: Ket
667: End
668: ------------------------------------------------------------------
669: Capturing subpattern count = 0
670: Options: utf
671: First char = \x{c4}
672: Need char = 'X'
673:
674: /^[\QĀ\E-\QŐ\E/BZ8
675: Failed: missing terminating ] for character class at offset 15
676:
677: /-- This tests the stricter UTF-8 check according to RFC 3629. --/
678:
679: /X/8
680: \x{0}\x{d7ff}\x{e000}\x{10ffff}
681: No match
682: \x{d800}
683: Error -10 (bad UTF-8 string) offset=0 reason=14
684: \x{d800}\?
685: No match
686: \x{da00}
687: Error -10 (bad UTF-8 string) offset=0 reason=14
688: \x{da00}\?
689: No match
690: \x{dfff}
691: Error -10 (bad UTF-8 string) offset=0 reason=14
692: \x{dfff}\?
693: No match
694: \x{110000}
695: Error -10 (bad UTF-8 string) offset=0 reason=13
696: \x{110000}\?
697: No match
698: \x{2000000}
699: Error -10 (bad UTF-8 string) offset=0 reason=11
700: \x{2000000}\?
701: No match
702: \x{7fffffff}
703: Error -10 (bad UTF-8 string) offset=0 reason=12
704: \x{7fffffff}\?
705: No match
706:
707: /(*UTF8)\x{1234}/
708: abcd\x{1234}pqr
709: 0: \x{1234}
710:
711: /(*CRLF)(*UTF8)(*BSR_UNICODE)a\Rb/I
712: Capturing subpattern count = 0
713: Options: bsr_unicode utf
714: Forced newline sequence: CRLF
715: First char = 'a'
716: Need char = 'b'
717:
718: /\h/SI8
719: Capturing subpattern count = 0
720: Options: utf
721: No first char
722: No need char
723: Subject length lower bound = 1
724: Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3
725: ABC\x{09}
726: 0: \x{09}
727: ABC\x{20}
728: 0:
729: ABC\x{a0}
730: 0: \x{a0}
731: ABC\x{1680}
732: 0: \x{1680}
733: ABC\x{180e}
734: 0: \x{180e}
735: ABC\x{2000}
736: 0: \x{2000}
737: ABC\x{202f}
738: 0: \x{202f}
739: ABC\x{205f}
740: 0: \x{205f}
741: ABC\x{3000}
742: 0: \x{3000}
743:
744: /\v/SI8
745: Capturing subpattern count = 0
746: Options: utf
747: No first char
748: No need char
749: Subject length lower bound = 1
750: Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
751: ABC\x{0a}
752: 0: \x{0a}
753: ABC\x{0b}
754: 0: \x{0b}
755: ABC\x{0c}
756: 0: \x{0c}
757: ABC\x{0d}
758: 0: \x{0d}
759: ABC\x{85}
760: 0: \x{85}
761: ABC\x{2028}
762: 0: \x{2028}
763:
764: /\h*A/SI8
765: Capturing subpattern count = 0
766: Options: utf
767: No first char
768: Need char = 'A'
769: Subject length lower bound = 1
770: Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
771: CDBABC
772: 0: A
773:
774: /\v+A/SI8
775: Capturing subpattern count = 0
776: Options: utf
777: No first char
778: Need char = 'A'
779: Subject length lower bound = 2
780: Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
781:
782: /\s?xxx\s/8SI
783: Capturing subpattern count = 0
784: Options: utf
785: No first char
786: Need char = 'x'
787: Subject length lower bound = 4
788: Starting byte set: \x09 \x0a \x0c \x0d \x20 x
789:
790: /\sxxx\s/I8ST1
791: Capturing subpattern count = 0
792: Options: utf
793: No first char
794: Need char = 'x'
795: Subject length lower bound = 5
796: Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2
797: AB\x{85}xxx\x{a0}XYZ
798: 0: \x{85}xxx\x{a0}
799: AB\x{a0}xxx\x{85}XYZ
800: 0: \x{a0}xxx\x{85}
801:
802: /\S \S/I8ST1
803: Capturing subpattern count = 0
804: Options: utf
805: No first char
806: Need char = ' '
807: Subject length lower bound = 3
808: Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
809: \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
810: \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
811: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e
812: f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3
813: \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2
814: \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1
815: \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0
816: \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
817: \x{a2} \x{84}
818: 0: \x{a2} \x{84}
819: A Z
820: 0: A Z
821:
822: /a+/8
823: a\x{123}aa\>1
824: 0: aa
825: a\x{123}aa\>2
826: Error -11 (bad UTF-8 offset)
827: a\x{123}aa\>3
828: 0: aa
829: a\x{123}aa\>4
830: 0: a
831: a\x{123}aa\>5
832: No match
833: a\x{123}aa\>6
834: Error -24 (bad offset value)
835:
836: /\x{1234}+/iS8I
837: Capturing subpattern count = 0
838: Options: caseless utf
839: No first char
840: No need char
841: Subject length lower bound = 1
842: Starting byte set: \xe1
843:
844: /\x{1234}+?/iS8I
845: Capturing subpattern count = 0
846: Options: caseless utf
847: No first char
848: No need char
849: Subject length lower bound = 1
850: Starting byte set: \xe1
851:
852: /\x{1234}++/iS8I
853: Capturing subpattern count = 0
854: Options: caseless utf
855: No first char
856: No need char
857: Subject length lower bound = 1
858: Starting byte set: \xe1
859:
860: /\x{1234}{2}/iS8I
861: Capturing subpattern count = 0
862: Options: caseless utf
863: No first char
864: No need char
865: Subject length lower bound = 2
866: Starting byte set: \xe1
867:
868: /[^\x{c4}]/8DZ
869: ------------------------------------------------------------------
870: Bra
1.1.1.3 ! misho 871: [^\x{c4}]
1.1.1.2 misho 872: Ket
873: End
874: ------------------------------------------------------------------
875: Capturing subpattern count = 0
876: Options: utf
877: No first char
878: No need char
879:
880: /X+\x{200}/8DZ
881: ------------------------------------------------------------------
882: Bra
883: X++
884: \x{200}
885: Ket
886: End
887: ------------------------------------------------------------------
888: Capturing subpattern count = 0
889: Options: utf
890: First char = 'X'
891: Need char = \x{80}
892:
893: /\R/SI8
894: Capturing subpattern count = 0
895: Options: utf
896: No first char
897: No need char
898: Subject length lower bound = 1
899: Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
900:
901: /\777/8DZ
902: ------------------------------------------------------------------
903: Bra
904: \x{1ff}
905: Ket
906: End
907: ------------------------------------------------------------------
908: Capturing subpattern count = 0
909: Options: utf
910: First char = \x{c7}
911: Need char = \x{bf}
1.1 misho 912:
1.1.1.3 ! misho 913: /\w+\x{C4}/8BZ
! 914: ------------------------------------------------------------------
! 915: Bra
! 916: \w++
! 917: \x{c4}
! 918: Ket
! 919: End
! 920: ------------------------------------------------------------------
! 921: a\x{C4}\x{C4}
! 922: 0: a\x{c4}
! 923:
! 924: /\w+\x{C4}/8BZT1
! 925: ------------------------------------------------------------------
! 926: Bra
! 927: \w+
! 928: \x{c4}
! 929: Ket
! 930: End
! 931: ------------------------------------------------------------------
! 932: a\x{C4}\x{C4}
! 933: 0: a\x{c4}\x{c4}
! 934:
! 935: /\W+\x{C4}/8BZ
! 936: ------------------------------------------------------------------
! 937: Bra
! 938: \W+
! 939: \x{c4}
! 940: Ket
! 941: End
! 942: ------------------------------------------------------------------
! 943: !\x{C4}
! 944: 0: !\x{c4}
! 945:
! 946: /\W+\x{C4}/8BZT1
! 947: ------------------------------------------------------------------
! 948: Bra
! 949: \W++
! 950: \x{c4}
! 951: Ket
! 952: End
! 953: ------------------------------------------------------------------
! 954: !\x{C4}
! 955: 0: !\x{c4}
! 956:
! 957: /\W+\x{A1}/8BZ
! 958: ------------------------------------------------------------------
! 959: Bra
! 960: \W+
! 961: \x{a1}
! 962: Ket
! 963: End
! 964: ------------------------------------------------------------------
! 965: !\x{A1}
! 966: 0: !\x{a1}
! 967:
! 968: /\W+\x{A1}/8BZT1
! 969: ------------------------------------------------------------------
! 970: Bra
! 971: \W+
! 972: \x{a1}
! 973: Ket
! 974: End
! 975: ------------------------------------------------------------------
! 976: !\x{A1}
! 977: 0: !\x{a1}
! 978:
! 979: /X\s+\x{A0}/8BZ
! 980: ------------------------------------------------------------------
! 981: Bra
! 982: X
! 983: \s++
! 984: \x{a0}
! 985: Ket
! 986: End
! 987: ------------------------------------------------------------------
! 988: X\x20\x{A0}\x{A0}
! 989: 0: X \x{a0}
! 990:
! 991: /X\s+\x{A0}/8BZT1
! 992: ------------------------------------------------------------------
! 993: Bra
! 994: X
! 995: \s+
! 996: \x{a0}
! 997: Ket
! 998: End
! 999: ------------------------------------------------------------------
! 1000: X\x20\x{A0}\x{A0}
! 1001: 0: X \x{a0}\x{a0}
! 1002:
! 1003: /\S+\x{A0}/8BZ
! 1004: ------------------------------------------------------------------
! 1005: Bra
! 1006: \S+
! 1007: \x{a0}
! 1008: Ket
! 1009: End
! 1010: ------------------------------------------------------------------
! 1011: X\x{A0}\x{A0}
! 1012: 0: X\x{a0}\x{a0}
! 1013:
! 1014: /\S+\x{A0}/8BZT1
! 1015: ------------------------------------------------------------------
! 1016: Bra
! 1017: \S++
! 1018: \x{a0}
! 1019: Ket
! 1020: End
! 1021: ------------------------------------------------------------------
! 1022: X\x{A0}\x{A0}
! 1023: 0: X\x{a0}
! 1024:
! 1025: /\x{a0}+\s!/8BZ
! 1026: ------------------------------------------------------------------
! 1027: Bra
! 1028: \x{a0}++
! 1029: \s
! 1030: !
! 1031: Ket
! 1032: End
! 1033: ------------------------------------------------------------------
! 1034: \x{a0}\x20!
! 1035: 0: \x{a0} !
! 1036:
! 1037: /\x{a0}+\s!/8BZT1
! 1038: ------------------------------------------------------------------
! 1039: Bra
! 1040: \x{a0}+
! 1041: \s
! 1042: !
! 1043: Ket
! 1044: End
! 1045: ------------------------------------------------------------------
! 1046: \x{a0}\x20!
! 1047: 0: \x{a0} !
! 1048:
1.1 misho 1049: /-- End of testinput15 --/