Current File : //proc/thread-self/root/opt/cpanel/ea-ruby24/root/usr/share/ri/system/page-regexp_rdoc.ri
U:RDoc::TopLevel[	iI"regexp.rdoc:EFcRDoc::Parser::Simpleo:RDoc::Markup::Document:@parts[�o:RDoc::Markup::Paragraph;[I"JRegular expressions (<i>regexp</i>s) are patterns which describe the ;TI"Pcontents of a string. They're used for testing whether a string contains a ;TI"Lgiven pattern, or extracting the portions that match. They are created ;TI"1with the <tt>/</tt><i>pat</i><tt>/</tt> and ;TI"J<tt>%r{</tt><i>pat</i><tt>}</tt> literals or the <tt>Regexp.new</tt> ;TI"constructor.;To:RDoc::Markup::BlankLineo;	;[I"JA regexp is usually delimited with forward slashes (<tt>/</tt>). For ;TI"
example:;T@o:RDoc::Markup::Verbatim;[I"!/hay/ =~ 'haystack'   #=> 0
;TI"0/y/.match('haystack') #=> #<MatchData "y">
;T:@format0o;	;[I"LIf a string contains the pattern it is said to <i>match</i>. A literal ;TI"string matches itself.;T@o;	;[I"PHere 'haystack' does not contain the pattern 'needle', so it doesn't match:;T@o;;[I"(/needle/.match('haystack') #=> nil
;T;0o;	;[I"?Here 'haystack' contains the pattern 'hay', so it matches:;T@o;;[I"7/hay/.match('haystack')    #=> #<MatchData "hay">
;T;0o;	;[I"NSpecifically, <tt>/st/</tt> requires that the string contains the letter ;TI"D_s_ followed by the letter _t_, so it matches _haystack_, also.;T@S:RDoc::Markup::Heading:
leveli:	textI"!<tt>=~</tt> and Regexp#match;T@o;	;[I"TPattern matching may be achieved by using <tt>=~</tt> operator or Regexp#match ;TI"method.;T@S;
;i;I"<tt>=~</tt> operator;T@o;	;[I"S<tt>=~</tt> is Ruby's basic pattern-matching operator.  When one operand is a ;TI"Qregular expression and the other is a string then the regular expression is ;TI"Tused as a pattern to match against the string.  (This operator is equivalently ;TI"Sdefined by Regexp and String so the order of String and Regexp do not matter. ;TI"SOther classes may have different implementations of <tt>=~</tt>.)  If a match ;TI"Qis found, the operator returns index of first match in string, otherwise it ;TI"returns +nil+.;T@o;;[	I"!/hay/ =~ 'haystack'   #=> 0
;TI"!'haystack' =~ /hay/   #=> 0
;TI"!/a/   =~ 'haystack'   #=> 1
;TI"#/u/   =~ 'haystack'   #=> nil
;T;0o;	;[I"PUsing <tt>=~</tt> operator with a String and Regexp the <tt>$~</tt> global ;TI"Nvariable is set after a successful match.  <tt>$~</tt> holds a MatchData ;TI"<object. Regexp.last_match is equivalent to <tt>$~</tt>.;T@S;
;i;I"Regexp#match method;T@o;	;[I"2The #match method returns a MatchData object:;T@o;;[I"4/st/.match('haystack')   #=> #<MatchData "st">
;T;0S;
;i;I"Metacharacters and Escapes;T@o;	;[
I"EThe following are <i>metacharacters</i> <tt>(</tt>, <tt>)</tt>, ;TI"M<tt>[</tt>, <tt>]</tt>, <tt>{</tt>, <tt>}</tt>, <tt>.</tt>, <tt>?</tt>, ;TI"N<tt>+</tt>, <tt>*</tt>. They have a specific meaning when appearing in a ;TI"Opattern. To match them literally they must be backslash-escaped. To match ;TI"Ba backslash literally backslash-escape that: <tt>\\\\\\</tt>.;T@o;;[I"K/1 \+ 2 = 3\?/.match('Does 1 + 2 = 3?') #=> #<MatchData "1 + 2 = 3?">
;T;0o;	;[I"HPatterns behave like double-quoted strings so can contain the same ;TI"backslash escapes.;T@o;;[I"5/\s\u{6771 4eac 90fd}/.match("Go to 東京都")
;TI"'    #=> #<MatchData " 東京都">
;T;0o;	;[I"GArbitrary Ruby expressions can be embedded into patterns with the ;TI"<tt>#{...}</tt> construct.;T@o;;[I"place = "東京都"
;TI")/#{place}/.match("Go to 東京都")
;TI"&    #=> #<MatchData "東京都">
;T;0S;
;i;I"Character Classes;T@o;	;[	I"MA <i>character class</i> is delimited with square brackets (<tt>[</tt>, ;TI"K<tt>]</tt>) and lists characters that may appear at that point in the ;TI"Pmatch. <tt>/[ab]/</tt> means _a_ or _b_, as opposed to <tt>/ab/</tt> which ;TI"means _a_ followed by _b_.;T@o;;[I"8/W[aeiou]rd/.match("Word") #=> #<MatchData "Word">
;T;0o;	;[I"IWithin a character class the hyphen (<tt>-</tt>) is a metacharacter ;TI"Ndenoting an inclusive range of characters. <tt>[abcd]</tt> is equivalent ;TI"Eto <tt>[a-d]</tt>. A range can be followed by another range, so ;TI"P<tt>[abcdwxyz]</tt> is equivalent to <tt>[a-dw-z]</tt>. The order in which ;TI"Hranges or individual characters appear inside a character class is ;TI"irrelevant.;T@o;;[I"1/[0-9a-f]/.match('9f') #=> #<MatchData "9">
;TI"1/[9f]/.match('9f')     #=> #<MatchData "9">
;T;0o;	;[I"MIf the first character of a character class is a caret (<tt>^</tt>) the ;TI"Fclass is inverted: it matches any character _except_ those named.;T@o;;[I"1/[^a-eg-z]/.match('f') #=> #<MatchData "f">
;T;0o;	;[
I"KA character class may contain another character class. By itself this ;TI"Hisn't useful because <tt>[a-z[0-9]]</tt> describes the same set as ;TI"P<tt>[a-z0-9]</tt>. However, character classes also support the <tt>&&</tt> ;TI"Ooperator which performs set intersection on its arguments. The two can be ;TI"combined as follows:;T@o;;[I"2/[a-w&&[^c-g]z]/ # ([a-w] AND ([^c-g] OR z))
;T;0o;	;[I"This is equivalent to:;T@o;;[I"/[abh-w]/
;T;0o;	;[I"EThe following metacharacters also behave like character classes:;T@o:RDoc::Markup::List:
@type:BULLET:@items[o:RDoc::Markup::ListItem:@label0;[o;	;[I"3<tt>/./</tt> - Any character except a newline.;To;;0;[o;	;[I"L<tt>/./m</tt> - Any character (the +m+ modifier enables multiline mode);To;;0;[o;	;[I"=<tt>/\w/</tt> - A word character (<tt>[a-zA-Z0-9_]</tt>);To;;0;[o;	;[I"D<tt>/\W/</tt> - A non-word character (<tt>[^a-zA-Z0-9_]</tt>). ;TI"RPlease take a look at {Bug #4044}[https://bugs.ruby-lang.org/issues/4044] if ;TI"7using <tt>/\W/</tt> with the <tt>/i</tt> modifier.;To;;0;[o;	;[I"7<tt>/\d/</tt> - A digit character (<tt>[0-9]</tt>);To;;0;[o;	;[I"<<tt>/\D/</tt> - A non-digit character (<tt>[^0-9]</tt>);To;;0;[o;	;[I"@<tt>/\h/</tt> - A hexdigit character (<tt>[0-9a-fA-F]</tt>);To;;0;[o;	;[I"E<tt>/\H/</tt> - A non-hexdigit character (<tt>[^0-9a-fA-F]</tt>);To;;0;[o;	;[I"E<tt>/\s/</tt> - A whitespace character: <tt>/[ \t\r\n\f\v]/</tt>;To;;0;[o;	;[I"J<tt>/\S/</tt> - A non-whitespace character: <tt>/[^ \t\r\n\f\v]/</tt>;T@o;	;[
I"MPOSIX <i>bracket expressions</i> are also similar to character classes. ;TI"NThey provide a portable alternative to the above, with the added benefit ;TI"Kthat they encompass non-ASCII characters. For instance, <tt>/\d/</tt> ;TI"Qmatches only the ASCII decimal digits (0-9); whereas <tt>/[[:digit:]]/</tt> ;TI"8matches any character in the Unicode _Nd_ category.;T@o;;;;[o;;0;[o;	;[I"><tt>/[[:alnum:]]/</tt> - Alphabetic and numeric character;To;;0;[o;	;[I"2<tt>/[[:alpha:]]/</tt> - Alphabetic character;To;;0;[o;	;[I"*<tt>/[[:blank:]]/</tt> - Space or tab;To;;0;[o;	;[I"/<tt>/[[:cntrl:]]/</tt> - Control character;To;;0;[o;	;[I"#<tt>/[[:digit:]]/</tt> - Digit;To;;0;[o;	;[I"L<tt>/[[:graph:]]/</tt> - Non-blank character (excludes spaces, control ;TI"characters, and similar);To;;0;[o;	;[I"><tt>/[[:lower:]]/</tt> - Lowercase alphabetical character;To;;0;[o;	;[I"N<tt>/[[:print:]]/</tt> - Like [:graph:], but includes the space character;To;;0;[o;	;[I"3<tt>/[[:punct:]]/</tt> - Punctuation character;To;;0;[o;	;[I"Q<tt>/[[:space:]]/</tt> - Whitespace character (<tt>[:blank:]</tt>, newline, ;TI"carriage return, etc.);To;;0;[o;	;[I"4<tt>/[[:upper:]]/</tt> - Uppercase alphabetical;To;;0;[o;	;[I"L<tt>/[[:xdigit:]]/</tt> - Digit allowed in a hexadecimal number (i.e., ;TI"0-9a-fA-F);T@o;	;[I"BRuby also supports the following non-POSIX character classes:;T@o;;;;[o;;0;[o;	;[I"I<tt>/[[:word:]]/</tt> - A character in one of the following Unicode ;TI"4general categories _Letter_, _Mark_, _Number_, ;TI"!<i>Connector_Punctuation</i>;To;;0;[o;	;[I"D<tt>/[[:ascii:]]/</tt> - A character in the ASCII character set;T@o;;[	I"3# U+06F2 is "EXTENDED ARABIC-INDIC DIGIT TWO"
;TI"B/[[:digit:]]/.match("\u06F2")    #=> #<MatchData "\u{06F2}">
;TI"C/[[:upper:]][[:lower:]]/.match("Hello") #=> #<MatchData "He">
;TI"C/[[:xdigit:]][[:xdigit:]]/.match("A6")  #=> #<MatchData "A6">
;T;0S;
;i;I"Repetition;T@o;	;[I"KThe constructs described so far match a single character. They can be ;TI"Pfollowed by a repetition metacharacter to specify how many times they need ;TI"Ato occur. Such metacharacters are called <i>quantifiers</i>.;T@o;;;;[o;;0;[o;	;[I"$<tt>*</tt> - Zero or more times;To;;0;[o;	;[I"#<tt>+</tt> - One or more times;To;;0;[o;	;[I".<tt>?</tt> - Zero or one times (optional);To;;0;[o;	;[I":<tt>{</tt><i>n</i><tt>}</tt> - Exactly <i>n</i> times;To;;0;[o;	;[I";<tt>{</tt><i>n</i><tt>,}</tt> - <i>n</i> or more times;To;;0;[o;	;[I";<tt>{,</tt><i>m</i><tt>}</tt> - <i>m</i> or less times;To;;0;[o;	;[I"L<tt>{</tt><i>n</i><tt>,</tt><i>m</i><tt>}</tt> - At least <i>n</i> and ;TI"at most <i>m</i> times;T@o;	;[I"NAt least one uppercase character ('H'), at least one lowercase character ;TI"-('e'), two 'l' characters, then one 'o':;T@o;;[I"M"Hello".match(/[[:upper:]]+[[:lower:]]+l{2}o/) #=> #<MatchData "Hello">
;T;0o;	;[
I"MRepetition is <i>greedy</i> by default: as many occurrences as possible ;TI"Gare matched while still allowing the overall match to succeed. By ;TI"Hcontrast, <i>lazy</i> matching makes the minimal amount of matches ;TI"Onecessary for overall success. A greedy metacharacter can be made lazy by ;TI""following it with <tt>?</tt>.;T@o;	;[I"QBoth patterns below match the string. The first uses a greedy quantifier so ;TI"O'.+' matches '<a><b>'; the second uses a lazy quantifier so '.+?' matches ;TI"'<a>':;T@o;;[I"7/<.+>/.match("<a><b>")  #=> #<MatchData "<a><b>">
;TI"4/<.+?>/.match("<a><b>") #=> #<MatchData "<a>">
;T;0o;	;[	I"NA quantifier followed by <tt>+</tt> matches <i>possessively</i>: once it ;TI"Mhas matched it does not backtrack. They behave like greedy quantifiers, ;TI"Jbut having matched they refuse to "give up" their match even if this ;TI"#jeopardises the overall match.;T@S;
;i;I"Capturing;T@o;	;[
I"LParentheses can be used for <i>capturing</i>. The text enclosed by the ;TI"P<i>n</i><sup>th</sup> group of parentheses can be subsequently referred to ;TI"Bwith <i>n</i>. Within a pattern use the <i>backreference</i> ;TI"-<tt>\n</tt>; outside of the pattern use ;TI"+<tt>MatchData[</tt><i>n</i><tt>]</tt>.;T@o;	;[I"P'at' is captured by the first group of parentheses, then referred to later ;TI"with <tt>\1</tt>:;T@o;;[I"</[csh](..) [csh]\1 in/.match("The cat sat in the hat")
;TI".    #=> #<MatchData "cat sat in" 1:"at">
;T;0o;	;[I"KRegexp#match returns a MatchData object which makes the captured text ;TI"#available with its #[] method:;T@o;;[I"H/[csh](..) [csh]\1 in/.match("The cat sat in the hat")[1] #=> 'at'
;T;0o;	;[I"ECapture groups can be referred to by name when defined with the ;TI"N<tt>(?<</tt><i>name</i><tt>>)</tt> or <tt>(?'</tt><i>name</i><tt>')</tt> ;TI"constructs.;T@o;;[I"7/\$(?<dollars>\d+)\.(?<cents>\d+)/.match("$3.67")
;TI"8    => #<MatchData "$3.67" dollars:"3" cents:"67">
;TI"I/\$(?<dollars>\d+)\.(?<cents>\d+)/.match("$3.67")[:dollars] #=> "3"
;T;0o;	;[I"PNamed groups can be backreferenced with <tt>\k<</tt><i>name</i><tt>></tt>, ;TI"$where _name_ is the group name.;T@o;;[I">/(?<vowel>[aeiou]).\k<vowel>.\k<vowel>/.match('ototomy')
;TI",    #=> #<MatchData "ototo" vowel:"o">
;T;0o;	;[I"B*Note*: A regexp can't use named backreferences and numbered ;TI"#backreferences simultaneously.;T@o;	;[I"OWhen named capture groups are used with a literal regexp on the left-hand ;TI"Nside of an expression and the <tt>=~</tt> operator, the captured text is ;TI"?also assigned to local variables with corresponding names.;T@o;;[I"9/\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67" #=> 0
;TI"dollars #=> "3"
;T;0S;
;i;I"
Grouping;T@o;	;[I"OParentheses also <i>group</i> the terms they enclose, allowing them to be ;TI"+quantified as one <i>atomic</i> whole.;T@o;	;[I"EThe pattern below matches a vowel followed by 2 word characters:;T@o;;[I"K/[aeiou]\w{2}/.match("Caenorhabditis elegans") #=> #<MatchData "aen">
;T;0o;	;[I"QWhereas the following pattern matches a vowel followed by a word character, ;TI"5twice, i.e. <tt>[aeiou]\w[aeiou]\w</tt>: 'enor'.;T@o;;[I"6/([aeiou]\w){2}/.match("Caenorhabditis elegans")
;TI"(    #=> #<MatchData "enor" 1:"or">
;T;0o;	;[	I"GThe <tt>(?:</tt>...<tt>)</tt> construct provides grouping without ;TI"Pcapturing. That is, it combines the terms it contains into an atomic whole ;TI"Owithout creating a backreference. This benefits performance at the slight ;TI"expense of readability.;T@o;	;[I"QThe first group of parentheses captures 'n' and the second 'ti'. The second ;TI"Cgroup is referred to later with the backreference <tt>\2</tt>:;T@o;;[I"2/I(n)ves(ti)ga\2ons/.match("Investigations")
;TI"8    #=> #<MatchData "Investigations" 1:"n" 2:"ti">
;T;0o;	;[I"OThe first group of parentheses is now made non-capturing with '?:', so it ;TI"Hstill matches 'n', but doesn't create the backreference. Thus, the ;TI"2backreference <tt>\1</tt> now refers to 'ti'.;T@o;;[I"4/I(?:n)ves(ti)ga\1ons/.match("Investigations")
;TI"2    #=> #<MatchData "Investigations" 1:"ti">
;T;0S;
;i;I"Atomic Grouping;T@o;	;[
I"-Grouping can be made <i>atomic</i> with ;TI"P<tt>(?></tt><i>pat</i><tt>)</tt>. This causes the subexpression <i>pat</i> ;TI"Nto be matched independently of the rest of the expression such that what ;TI"Pit matches becomes fixed for the remainder of the match, unless the entire ;TI"Isubexpression must be abandoned and subsequently revisited. In this ;TI"Lway <i>pat</i> is treated as a non-divisible whole. Atomic grouping is ;TI"Ftypically used to optimise patterns so as to prevent the regular ;TI"4expression engine from backtracking needlessly.;T@o;	;[	I"TThe <tt>"</tt> in the pattern below matches the first character of the string, ;TI"Tthen <tt>.*</tt> matches <i>Quote"</i>. This causes the overall match to fail, ;TI"Nso the text matched by <tt>.*</tt> is backtracked by one position, which ;TI"Kleaves the final character of the string available to match <tt>"</tt>;T@o;;[I">/".*"/.match('"Quote"')     #=> #<MatchData "\"Quote\"">
;T;0o;	;[I"RIf <tt>.*</tt> is grouped atomically, it refuses to backtrack <i>Quote"</i>, ;TI"8even though this means that the overall match fails;T@o;;[I")/"(?>.*)"/.match('"Quote"') #=> nil
;T;0S;
;i;I"Subexpression Calls;T@o;	;[	I"GThe <tt>\g<</tt><i>name</i><tt>></tt> syntax matches the previous ;TI"Msubexpression named _name_, which can be a group name or number, again. ;TI"NThis differs from backreferences in that it re-executes the group rather ;TI"2than simply trying to re-match the same text.;T@o;	;[I"TThis pattern matches a <i>(</i> character and assigns it to the <tt>paren</tt> ;TI"Rgroup, tries to call that the <tt>paren</tt> sub-expression again but fails, ;TI"%then matches a literal <i>)</i>:;T@o;;[I"-/\A(?<paren>\(\g<paren>*\))*\z/ =~ '()'
;TI"
;TI"5/\A(?<paren>\(\g<paren>*\))*\z/ =~ '(())' #=> 0
;TI"
# ^1
;TI"#      ^2
;TI"#           ^3
;TI"#                 ^4
;TI"#      ^5
;TI"#           ^6
;TI"#                      ^7
;TI" #                       ^8
;TI" #                       ^9
;TI"%#                           ^10
;T;0o;;:NUMBER;[o;;0;[o;	;[I"CMatches at the beginning of the string, i.e. before the first ;TI"character.;To;;0;[o;	;[I"7Enters a named capture group called <tt>paren</tt>;To;;0;[o;	;[I"BMatches a literal <i>(</i>, the first character in the string;To;;0;[o;	;[I"ECalls the <tt>paren</tt> group again, i.e. recurses back to the ;TI"second step;To;;0;[o;	;[I"'Re-enters the <tt>paren</tt> group;To;;0;[o;	;[I"=Matches a literal <i>(</i>, the second character in the ;TI"string;To;;0;[o;	;[I"?Try to call <tt>paren</tt> a third time, but fail because ;TI"7doing so would prevent an overall successful match;To;;0;[o;	;[I"BMatch a literal <i>)</i>, the third character in the string. ;TI"/Marks the end of the second recursive call;To;;0;[o;	;[I"AMatch a literal <i>)</i>, the fourth character in the string;To;;0;[o;	;[I" Match the end of the string;T@S;
;i;I"Alternation;T@o;	;[I"OThe vertical bar metacharacter (<tt>|</tt>) combines two expressions into ;TI"Pa single one that matches either of the expressions. Each expression is an ;TI"<i>alternative</i>.;T@o;;[I"G/\w(and|or)\w/.match("Feliformia") #=> #<MatchData "form" 1:"or">
;TI"I/\w(and|or)\w/.match("furandi")    #=> #<MatchData "randi" 1:"and">
;TI"2/\w(and|or)\w/.match("dissemblance") #=> nil
;T;0S;
;i;I"Character Properties;T@o;	;[I"MThe <tt>\p{}</tt> construct matches characters with the named property, ;TI"%much like POSIX bracket classes.;T@o;;;;[o;;0;[o;	;[I"<<tt>/\p{Alnum}/</tt> - Alphabetic and numeric character;To;;0;[o;	;[I"0<tt>/\p{Alpha}/</tt> - Alphabetic character;To;;0;[o;	;[I"(<tt>/\p{Blank}/</tt> - Space or tab;To;;0;[o;	;[I"-<tt>/\p{Cntrl}/</tt> - Control character;To;;0;[o;	;[I"!<tt>/\p{Digit}/</tt> - Digit;To;;0;[o;	;[I"J<tt>/\p{Graph}/</tt> - Non-blank character (excludes spaces, control ;TI"characters, and similar);To;;0;[o;	;[I"<<tt>/\p{Lower}/</tt> - Lowercase alphabetical character;To;;0;[o;	;[I"U<tt>/\p{Print}/</tt> - Like <tt>\p{Graph}</tt>, but includes the space character;To;;0;[o;	;[I"1<tt>/\p{Punct}/</tt> - Punctuation character;To;;0;[o;	;[I"O<tt>/\p{Space}/</tt> - Whitespace character (<tt>[:blank:]</tt>, newline, ;TI"carriage return, etc.);To;;0;[o;	;[I"2<tt>/\p{Upper}/</tt> - Uppercase alphabetical;To;;0;[o;	;[I"T<tt>/\p{XDigit}/</tt> - Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F);To;;0;[o;	;[I"L<tt>/\p{Word}/</tt> - A member of one of the following Unicode general ;TI"9category <i>Letter</i>, <i>Mark</i>, <i>Number</i>, ;TI""<i>Connector\_Punctuation</i>;To;;0;[o;	;[I"B<tt>/\p{ASCII}/</tt> - A character in the ASCII character set;To;;0;[o;	;[I"F<tt>/\p{Any}/</tt> - Any Unicode character (including unassigned ;TI"characters);To;;0;[o;	;[I"4<tt>/\p{Assigned}/</tt> - An assigned character;T@o;	;[I"MA Unicode character's <i>General Category</i> value can also be matched ;TI"Lwith <tt>\p{</tt><i>Ab</i><tt>}</tt> where <i>Ab</i> is the category's ;TI"%abbreviation as described below:;T@o;;;;[,o;;0;[o;	;[I" <tt>/\p{L}/</tt> - 'Letter';To;;0;[o;	;[I",<tt>/\p{Ll}/</tt> - 'Letter: Lowercase';To;;0;[o;	;[I"'<tt>/\p{Lm}/</tt> - 'Letter: Mark';To;;0;[o;	;[I"(<tt>/\p{Lo}/</tt> - 'Letter: Other';To;;0;[o;	;[I",<tt>/\p{Lt}/</tt> - 'Letter: Titlecase';To;;0;[o;	;[I"+<tt>/\p{Lu}/</tt> - 'Letter: Uppercase;To;;0;[o;	
;[I"(<tt>/\p{Lo}/</tt> - 'Letter: Other';To;;0;[o;	;[I"<tt>/\p{M}/</tt> - 'Mark';To;;0;[o;	;[I"+<tt>/\p{Mn}/</tt> - 'Mark: Nonspacing';To;;0;[o;	;[I"2<tt>/\p{Mc}/</tt> - 'Mark: Spacing Combining';To;;0;[o;	;[I"*<tt>/\p{Me}/</tt> - 'Mark: Enclosing';To;;0;[o;	;[I" <tt>/\p{N}/</tt> - 'Number';To;;0;[o;	;[I"0<tt>/\p{Nd}/</tt> - 'Number: Decimal Digit';To;;0;[o;	;[I")<tt>/\p{Nl}/</tt> - 'Number: Letter';To;;0;[o;	;[I"(<tt>/\p{No}/</tt> - 'Number: Other';To;;0;[o;	;[I"%<tt>/\p{P}/</tt> - 'Punctuation';To;;0;[o;	;[I"1<tt>/\p{Pc}/</tt> - 'Punctuation: Connector';To;;0;[o;	;[I",<tt>/\p{Pd}/</tt> - 'Punctuation: Dash';To;;0;[o;	;[I",<tt>/\p{Ps}/</tt> - 'Punctuation: Open';To;;0;[o;	;[I"-<tt>/\p{Pe}/</tt> - 'Punctuation: Close';To;;0;[o;	;[I"5<tt>/\p{Pi}/</tt> - 'Punctuation: Initial Quote';To;;0;[o;	;[I"3<tt>/\p{Pf}/</tt> - 'Punctuation: Final Quote';To;;0;[o;	;[I"-<tt>/\p{Po}/</tt> - 'Punctuation: Other';To;;0;[o;	;[I" <tt>/\p{S}/</tt> - 'Symbol';To;;0;[o;	;[I"'<tt>/\p{Sm}/</tt> - 'Symbol: Math';To;;0;[o;	;[I"+<tt>/\p{Sc}/</tt> - 'Symbol: Currency';To;;0;[o;	;[I"+<tt>/\p{Sc}/</tt> - 'Symbol: Currency';To;;0;[o;	;[I"+<tt>/\p{Sk}/</tt> - 'Symbol: Modifier';To;;0;[o;	;[I"(<tt>/\p{So}/</tt> - 'Symbol: Other';To;;0;[o;	;[I"#<tt>/\p{Z}/</tt> - 'Separator';To;;0;[o;	;[I"+<tt>/\p{Zs}/</tt> - 'Separator: Space';To;;0;[o;	;[I"*<tt>/\p{Zl}/</tt> - 'Separator: Line';To;;0;[o;	;[I"/<tt>/\p{Zp}/</tt> - 'Separator: Paragraph';To;;0;[o;	;[I"<tt>/\p{C}/</tt> - 'Other';To;;0;[o;	;[I")<tt>/\p{Cc}/</tt> - 'Other: Control';To;;0;[o;	;[I"(<tt>/\p{Cf}/</tt> - 'Other: Format';To;;0;[o;	;[I".<tt>/\p{Cn}/</tt> - 'Other: Not Assigned';To;;0;[o;	;[I"-<tt>/\p{Co}/</tt> - 'Other: Private Use';To;;0;[o;	;[I"+<tt>/\p{Cs}/</tt> - 'Other: Surrogate';T@o;	;[I"LLastly, <tt>\p{}</tt> matches a character's Unicode <i>script</i>. 
The ;TI"Ffollowing scripts are supported: <i>Arabic</i>, <i>Armenian</i>, ;TI"G<i>Balinese</i>, <i>Bengali</i>, <i>Bopomofo</i>, <i>Braille</i>, ;TI"O<i>Buginese</i>, <i>Buhid</i>, <i>Canadian_Aboriginal</i>, <i>Carian</i>, ;TI"A<i>Cham</i>, <i>Cherokee</i>, <i>Common</i>, <i>Coptic</i>, ;TI"H<i>Cuneiform</i>, <i>Cypriot</i>, <i>Cyrillic</i>, <i>Deseret</i>, ;TI"M<i>Devanagari</i>, <i>Ethiopic</i>, <i>Georgian</i>, <i>Glagolitic</i>, ;TI"P<i>Gothic</i>, <i>Greek</i>, <i>Gujarati</i>, <i>Gurmukhi</i>, <i>Han</i>, ;TI"D<i>Hangul</i>, <i>Hanunoo</i>, <i>Hebrew</i>, <i>Hiragana</i>, ;TI"I<i>Inherited</i>, <i>Kannada</i>, <i>Katakana</i>, <i>Kayah_Li</i>, ;TI"O<i>Kharoshthi</i>, <i>Khmer</i>, <i>Lao</i>, <i>Latin</i>, <i>Lepcha</i>, ;TI"B<i>Limbu</i>, <i>Linear_B</i>, <i>Lycian</i>, <i>Lydian</i>, ;TI"M<i>Malayalam</i>, <i>Mongolian</i>, <i>Myanmar</i>, <i>New_Tai_Lue</i>, ;TI"C<i>Nko</i>, <i>Ogham</i>, <i>Ol_Chiki</i>, <i>Old_Italic</i>, ;TI"H<i>Old_Persian</i>, <i>Oriya</i>, <i>Osmanya</i>, <i>Phags_Pa</i>, ;TI"H<i>Phoenician</i>, <i>Rejang</i>, <i>Runic</i>, <i>Saurashtra</i>, ;TI"L<i>Shavian</i>, <i>Sinhala</i>, <i>Sundanese</i>, <i>Syloti_Nagri</i>, ;TI"D<i>Syriac</i>, <i>Tagalog</i>, <i>Tagbanwa</i>, <i>Tai_Le</i>, ;TI"N<i>Tamil</i>, <i>Telugu</i>, <i>Thaana</i>, <i>Thai</i>, <i>Tibetan</i>, ;TI"A<i>Tifinagh</i>, <i>Ugaritic</i>, <i>Vai</i>, and <i>Yi</i>.;T@o;	;[I"SUnicode codepoint U+06E9 is named "ARABIC PLACE OF SAJDAH" and belongs to the ;TI"Arabic script:;T@o;;[I"</\p{Arabic}/.match("\u06E9") #=> #<MatchData "\u06E9">
;T;0o;	;[I"MAll character properties can be inverted by prefixing their name with a ;TI"caret (<tt>^</tt>).;T@o;	;[I"OLetter 'A' is not in the Unicode Ll (Letter; Lowercase) category, so this ;TI"match succeeds:;T@o;;[I"//\p{^Ll}/.match("A") #=> #<MatchData "A">
;T;0S;
;i;I"Anchors;T@o;	;[I"KAnchors are metacharacter that match the zero-width positions between ;TI"Ccharacters, <i>anchoring</i> the match to a specific position.;T@o;;;;[o;;0;[o;	;[I"+<tt>^</tt> - Matches beginning of line;To;;0;[o;	;[I"%<tt>$</tt> - Matches end of line;To;;0;[o;	;[I"/<tt>\A</tt> - Matches beginning of string.;To;;0;[o;	;[I"I<tt>\Z</tt> - Matches end of string. If string ends with a newline, ;TI"#it matches just before newline;To;;0;[o;	;[I"(<tt>\z</tt> - Matches end of string;To;;0;[
o;	;[I"3<tt>\G</tt> - Matches first matching position:;T@o;	;[I"bIn methods like <tt>String#gsub</tt> and <tt>String#scan</tt>, it changes on each iteration. ;TI"}It initially matches the beginning of subject, and in each following iteration it matches where the last match finished.;T@o;;[I"3"    a b c".gsub(/ /, '_')    #=> "____a_b_c"
;TI"3"    a b c".gsub(/\G /, '_')  #=> "____a b c"
;T;0o;	;[I"�In methods like <tt>Regexp#match</tt> and <tt>String#match</tt> that take an (optional) offset, it matches where the search begins.;T@o;;[I":"hello, world".match(/,/, 3)    #=> #<MatchData ",">
;TI"-"hello, world".match(/\G,/, 3)  #=> nil
;T;0o;;0;[o;	;[I"B<tt>\b</tt> - Matches word boundaries when outside brackets; ;TI"*backspace (0x08) when inside brackets;To;;0;[o;	;[I".<tt>\B</tt> - Matches non-word boundaries;To;;0;[o;	;[I"M<tt>(?=</tt><i>pat</i><tt>)</tt> - <i>Positive lookahead</i> assertion: ;TI"Iensures that the following characters match <i>pat</i>, but doesn't ;TI"1include those characters in the matched text;To;;0;[o;	;[I"M<tt>(?!</tt><i>pat</i><tt>)</tt> - <i>Negative lookahead</i> assertion: ;TI"Hensures that the following characters do not match <i>pat</i>, but ;TI"9doesn't include those characters in the matched text;To;;0;[o;	;[I"D<tt>(?<=</tt><i>pat</i><tt>)</tt> - <i>Positive lookbehind</i> ;TI"Lassertion: ensures that the preceding characters match <i>pat</i>, but ;TI"9doesn't include those characters in the matched text;To;;0;[o;	;[I"D<tt>(?<!</tt><i>pat</i><tt>)</tt> - <i>Negative lookbehind</i> ;TI"Cassertion: ensures that the preceding characters do not match ;TI"I<i>pat</i>, but doesn't include those characters in the matched text;T@o;	;[I"IIf a pattern isn't anchored it can begin at any point in the string:;T@o;;[I"8/real/.match("surrealist") #=> #<MatchData "real">
;T;0o;	;[I"TAnchoring the pattern to the beginning of the string forces the match to start ;TI"Rthere. 'real' doesn't occur at the beginning of the string, so now the match ;TI"fails:;T@o;;[I"*/\Areal/.match("surrealist") #=> nil
;T;0o;	;[I"QThe match below fails because although 'Demand' contains 'and', the pattern ;TI"'does not occur at a word boundary.;T@o;;[I"/\band/.match("Demand")
;T;0o;	;[I"LWhereas in the following example 'and' has been anchored to a non-word ;TI"Pboundary so instead of matching the first 'and' it matches from the fourth ;TI" letter of 'demand' instead:;T@o;;[I"M/\Band.+/.match("Supply and demand curve") #=> #<MatchData "and curve">
;T;0o;	;[I"PThe pattern below uses positive lookahead and positive lookbehind to match ;TI"Ltext appearing in <b></b> tags without including the tags in the match:;T@o;;[I"E/(?<=<b>)\w+(?=<\/b>)/.match("Fortune favours the <b>bold</b>")
;TI"!    #=> #<MatchData "bold">
;T;0S;
;i;I"Options;T@o;	;[I"QThe end delimiter for a regexp can be followed by one or more single-letter ;TI"5options which control how the pattern can match.;T@o;;;;[	o;;0;[o;	;[I""<tt>/pat/i</tt> - Ignore case;To;;0;[o;	;[I"K<tt>/pat/m</tt> - Treat a newline as a character matched by <tt>.</tt>;To;;0;[o;	;[I"D<tt>/pat/x</tt> - Ignore whitespace and comments in the pattern;To;;0;[o;	;[I"C<tt>/pat/o</tt> - Perform <tt>#{}</tt> interpolation only once;T@o;	;[
I"G<tt>i</tt>, <tt>m</tt>, and <tt>x</tt> can also be applied on the ;TI""subexpression level with the ;TI"I<tt>(?</tt><i>on</i><tt>-</tt><i>off</i><tt>)</tt> construct, which ;TI"Henables options <i>on</i>, and disables options <i>off</i> for the ;TI",expression enclosed by the parentheses.;T@o;;[I"4/a(?i:b)c/.match('aBc') #=> #<MatchData "aBc">
;TI"4/a(?i:b)c/.match('abc') #=> #<MatchData "abc">
;T;0o;	;[I"7Options may also be used with <tt>Regexp.new</tt>:;T@o;;[	I"JRegexp.new("abc", Regexp::IGNORECASE)                     #=> /abc/i
;TI"JRegexp.new("abc", Regexp::MULTILINE)                      #=> /abc/m
;TI"TRegexp.new("abc # Comment", Regexp::EXTENDED)             #=> /abc # Comment/x
;TI"KRegexp.new("abc", Regexp::IGNORECASE | Regexp::MULTILINE) #=> /abc/mi
;T;0S;
;i;I"#Free-Spacing Mode and Comments;T@o;	;[
I"KAs mentioned above, the <tt>x</tt> option enables <i>free-spacing</i> ;TI"Fmode. Literal white space inside the pattern is ignored, and the ;TI"Moctothorpe (<tt>#</tt>) character introduces a comment until the end of ;TI"Nthe line. This allows the components of the pattern to be organized in a ;TI"'potentially more readable fashion.;T@o;	;[I"HA contrived pattern to match a number with optional decimal places:;T@o;;[I"float_pat = /\A
;TI"B    [[:digit:]]+ # 1 or more digits before the decimal point
;TI"&    (\.          # Decimal point
;TI"E        [[:digit:]]+ # 1 or more digits after the decimal point
;TI"B    )? # The decimal point and following digits are optional
;TI"
\Z/x
;TI"=float_pat.match('3.14') #=> #<MatchData "3.14" 1:".14">
;T;0o;	;[I">There are a number of strategies for matching whitespace:;T@o;;;;[o;;0;[o;	;[I"=Use a pattern such as <tt>\s</tt> or <tt>\p{Space}</tt>.;To;;0;[o;	;[I"VUse escaped whitespace such as <tt>\ </tt>, i.e. a space preceded by a backslash.;To;;0;[o;	;[I"0Use a character class such as <tt>[ ]</tt>.;T@o;	;[I"CComments can be included in a non-<tt>x</tt> pattern with the ;TI"M<tt>(?#</tt><i>comment</i><tt>)</tt> construct, where <i>comment</i> is ;TI"1arbitrary text ignored by the regexp engine.;T@o;	;[I"EComments in regexp literals cannot include unescaped terminator ;TI"characters.;T@S;
;i;I"
Encoding;T@o;	;[I"MRegular expressions are assumed to use the source encoding. This can be ;TI"4overridden with one of the following modifiers.;T@o;;;;[	o;;0;[o;	;[I",<tt>/</tt><i>pat</i><tt>/u</tt> - UTF-8;To;;0;[o;	;[I"-<tt>/</tt><i>pat</i><tt>/e</tt> - EUC-JP;To;;0;[o;	;[I"2<tt>/</tt><i>pat</i><tt>/s</tt> - Windows-31J;To;;0;[o;	;[I"1<tt>/</tt><i>pat</i><tt>/n</tt> - ASCII-8BIT;T@o;	;[I"HA regexp can be matched against a string when they either share an ;TI"Pencoding, or the regexp's encoding is _US-ASCII_ and the string's encoding ;TI"is ASCII-compatible.;T@o;	;[I"?If a match between incompatible encodings is attempted an ;TI"?<tt>Encoding::CompatibilityError</tt> exception is raised.;T@o;	;[
I"PThe <tt>Regexp#fixed_encoding?</tt> predicate indicates whether the regexp ;TI"Ihas a <i>fixed</i> encoding, that is one incompatible with ASCII. A ;TI"<regexp's encoding can be explicitly fixed by supplying ;TI"><tt>Regexp::FIXEDENCODING</tt> as the second argument of ;TI"<tt>Regexp.new</tt>:;T@o;;[	I"Lr = Regexp.new("a".force_encoding("iso-8859-1"),Regexp::FIXEDENCODING)
;TI"r =~"a\u3042"
;TI"M   #=> Encoding::CompatibilityError: incompatible encoding regexp match
;TI"3        (ISO-8859-1 regexp with UTF-8 string)
;T;0S;
;i;I"Special global variables;T@o;	;[I"2Pattern matching sets some global variables :;To;;;;[o;;0;[o;	;[I"4<tt>$~</tt> is equivalent to Regexp.last_match;;To;;0;[o;	;[I"4<tt>$&</tt> contains the complete matched text;;To;;0;[o;	;[I".<tt>$`</tt> contains string before match;;To;;0;[o;	;[I"-<tt>$'</tt> contains string after match;;To;;0;[o;	;[I"Q<tt>$1</tt>, <tt>$2</tt> and so on contain text matching first, second, etc ;TI"capture group;;To;;0;[o;	;[I"-<tt>$+</tt> contains last capture group.;T@o;	;[I"
Example:;T@o;;[I"Pm = /s(\w{2}).*(c)/.match('haystack') #=> #<MatchData "stac" 1:"ta" 2:"c">
;TI"P$~                                    #=> #<MatchData "stac" 1:"ta" 2:"c">
;TI"PRegexp.last_match                     #=> #<MatchData "stac" 1:"ta" 2:"c">
;TI"
;TI"$&      #=> "stac"
;TI"        # same as m[0]
;TI"$`      #=> "hay"
;TI"#        # same as m.pre_match
;TI"$'      #=> "k"
;TI"$        # same as m.post_match
;TI"$1      #=> "ta"
;TI"        # same as m[1]
;TI"$2      #=> "c"
;TI"        # same as m[2]
;TI"$3      #=> nil
;TI")        # no third group in pattern
;TI"$+      #=> "c"
;TI"        # same as m[-1]
;T;0o;	;[I"HThese global variables are thread-local and method-local variables.;T@S;
;i;I"Performance;T@o;	;[I"OCertain pathological combinations of constructs can lead to abysmally bad ;TI"performance.;T@o;	;[I"GConsider a string of 25 <i>a</i>s, a <i>d</i>, 4 <i>a</i>s, and a ;TI"<i>c</i>.;T@o;;[I"(s = 'a' * 25 + 'd' + 'a' * 4 + 'c'
;TI"+#=> "aaaaaaaaaaaaaaaaaaaaaaaaadaaaac"
;T;0o;	;[I"@The following patterns match instantly as you would expect:;T@o;;[I"/(b|a)/ =~ s #=> 0
;TI"/(b|a+)/ =~ s #=> 0
;TI"/(b|a+)*/ =~ s #=> 0
;T;0o;	;[I"=However, the following pattern takes appreciably longer:;T@o;;[I"/(b|a+)*c/ =~ s #=> 26
;T;0o;	;[
I"IThis happens because an atom in the regexp is quantified by both an ;TI"Fimmediate <tt>+</tt> and an enclosing <tt>*</tt> with nothing to ;TI"Hdifferentiate which is in control of any particular character. The ;TI"Mnondeterminism that results produces super-linear performance. (Consult ;TI"@<i>Mastering Regular Expressions</i> (3rd ed.), pp 222, by ;TI"L<i>Jeffery Friedl</i>, for an in-depth analysis). This particular case ;TI"Lcan be fixed by use of atomic grouping, which prevents the unnecessary ;TI"backtracking:;T@o;;[	I"A(start = Time.now) && /(b|a+)*c/ =~ s && (Time.now - start)
;TI"   #=> 24.702736882
;TI"C(start = Time.now) && /(?>b|a+)*c/ =~ s && (Time.now - start)
;TI"   #=> 0.000166571
;T;0o;	;[I"FA similar case is typified by the following example, which takes ;TI"0approximately 60 seconds to execute for me:;T@o;	;[I"OMatch a string of 29 <i>a</i>s against a pattern of 29 optional <i>a</i>s ;TI"(followed by 29 mandatory <i>a</i>s:;T@o;;[I"2Regexp.new('a?' * 29 + 'a' * 29) =~ 'a' * 29
;T;0o;	;[
I"JThe 29 optional <i>a</i>s match the string, but this prevents the 29 ;TI"Mmandatory <i>a</i>s that follow from matching. Ruby must then backtrack ;TI"Krepeatedly so as to satisfy as many of the optional matches as it can ;TI"Owhile still matching the mandatory 29. It is plain to us that none of the ;TI"Koptional matches can succeed, but this fact unfortunately eludes Ruby.;T@o;	;[	I"RThe best way to improve performance is to significantly reduce the amount of ;TI"Nbacktracking needed.  For this case, instead of individually matching 29 ;TI"Roptional <i>a</i>s, a range of optional <i>a</i>s can be matched all at once ;TI"with <i>a{0,29}</i>:;T@o;;[I"1Regexp.new('a{0,29}' + 'a' * 29) =~ 'a' * 29;T;0:
@file@:0@omit_headings_from_table_of_contents_below0
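The Performance advice above (atomic grouping, and one bounded quantifier in place of many separate optional atoms) can be checked with a short Ruby sketch. The size `n` and all variable names are assumptions chosen for illustration; `n` is kept small so that the backtracking-heavy pattern still finishes quickly, and no timings are claimed because they vary by Ruby version and machine.

```ruby
# Sketch only: n is an assumed, deliberately small size so that even the
# backtracking-heavy pattern finishes quickly.
n = 12
subject = 'a' * n

# n separate optional a's followed by n mandatory a's (the slow shape).
many_optionals = Regexp.new('a?' * n + 'a' * n)
# The same language expressed with a single bounded quantifier.
one_range = Regexp.new("a{0,#{n}}" + 'a' * n)

optionals_index = many_optionals =~ subject
range_index     = one_range =~ subject

# The string from the Performance section: 25 a's, a d, 4 a's, and a c.
s = 'a' * 25 + 'd' + 'a' * 4 + 'c'
# The atomic group (?>...) keeps each iteration from being re-split on failure.
atomic_index = /(?>b|a+)*c/ =~ s

puts optionals_index  # 0
puts range_index      # 0
puts atomic_index     # 26
```

Both rewrites accept the same strings as the originals here; they differ only in how many alternative splits the engine may have to try before it reports a match or a failure.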