
Commit dc538cc

Non-capture and named capture groups implemented
Warning: Named capture group backreference code not yet added
1 parent deba950 commit dc538cc

3 files changed: +40 -8 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -30,14 +30,14 @@ or a huge number of possible matches, such as `/.\w/`, then only a subset of the
 * Character sets (inluding ranges and negation!), e.g. `/[abc]/`, `/[A-Z0-9]/`, `/[^a-z]/`
 * Escaped characters, e.g. `/\n/`, `/\w/`, `/\D/` (and so on...)
 * Capture groups, and backreferences(!!), e.g. `/(this|that) \1/`
+* Named capture groups, e.g. `(?<name>bar)/`(Warning: Named capture group backreferences not yet implemented!)
+* Non-capture groups, e.g. `/(?:foo)/`
 * Arbitrarily complex combinations of all the above!
 
 ## Not-Yet-Supported syntax
 
 I plan to add the following features to the gem (in order of most -> least likely), but have not yet got round to it:
 
-* Non-capture groups, e.g. `/(?:foo)/`
-* Named capture groups, e.g. `(?<name>bar)/`
 * Throw exceptions if illegal syntax (see below) is used
 * POSIX bracket expressions, e.g. `/[[:alnum:]]/`, `/[[:space:]]/`
 * Options, e.g. `/pattern/i`, `/foo.*bar/m`
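
For readers unfamiliar with the two group types this README change moves into the supported list, here is a minimal sketch of how Ruby's built-in `Regexp` treats them. It calls nothing from the gem; the variable names and sample strings are illustrative only.

```ruby
# Plain Ruby Regexp behaviour only; nothing here calls the gem.

# A named capture group is retrieved by name from the MatchData.
named_match = /(?<name>bar)/.match("foobar")
named_value = named_match[:name]

# A non-capture group takes part in matching but is not stored,
# so only the second (ordinary) group shows up in the captures.
noncapture_match = /(?:foo)(bar)/.match("foobar")
captured = noncapture_match.captures

puts named_value.inspect
puts captured.inspect
```

A generator of example strings therefore has to remember named and numbered groups (for possible backreferences) but can treat a non-capture group as plain grouping.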

lib/regexp-examples/parser.rb

Lines changed: 24 additions & 6 deletions
@@ -45,7 +45,7 @@ def parse_group(repeaters)
     def parse_after_backslash_group
       @current_position += 1
       case
-      when regexp_string[@current_position..-1] =~ /^(\d+)/
+      when rest_of_string =~ /\A(\d+)/
         group = parse_backreference_group($&)
       when BackslashCharMap.keys.include?(regexp_string[@current_position])
         group = CharGroup.new(
@@ -79,11 +79,25 @@ def parse_repeater(group)
     def parse_multi_group
       @current_position += 1
       @num_groups += 1
-      this_group_num = @num_groups
+      group_id = nil # init
+      rest_of_string.match(/\A(\?)?(:|!|=|<(!|=|[^!=][^>]*))?/) do |match|
+        case
+        when match[1].nil? # e.g. /(normal)/
+          group_id = @num_groups
+        when match[2] == ':' # e.g. /(?:nocapture)/
+          @current_position += 2
+          group_id = nil
+        when %w(! =).include?(match[2]) # e.g. /(?=lookahead)/, /(?!neglookahead)/
+          # TODO: Raise exception
+        when %w(! =).include?(match[3]) # e.g. /(?<=lookbehind)/, /(?<!neglookbehind)/
+          # TODO: Raise exception
+        else # e.g. /(?<name>namedgroup)/
+          @current_position += (match[3].length + 3)
+          group_id = match[3]
+        end
+      end
       groups = parse
-      # TODO: Non-capture groups, i.e. /...(?:foo).../
-      # TODO: Named capture groups, i.e. /...(?<name>foo).../
-      MultiGroup.new(groups, this_group_num)
+      MultiGroup.new(groups, group_id)
     end
 
     def parse_multi_end_group
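
The prefix-matching logic added to `parse_multi_group` can be tried in isolation. The following is a rough standalone sketch, not the gem's code: it reuses the regular expression from the diff, but `GROUP_PREFIX`, `classify_group_prefix` and the returned symbols are hypothetical names introduced here for illustration. It also adds an `:unknown` fallback for input such as `(?x...)`, a case the diff does not handle explicitly.

```ruby
# Hypothetical helper for illustration; only the regexp comes from the commit.
GROUP_PREFIX = /\A(\?)?(:|!|=|<(!|=|[^!=][^>]*))?/

# Takes the text immediately after an opening "(" and returns
# [kind, name], where name is non-nil only for named groups.
def classify_group_prefix(rest)
  match = rest.match(GROUP_PREFIX)
  if match[1].nil?                     # no "?" after "(": ordinary capture
    [:capture, nil]
  elsif match[2] == ':'                # "(?:" non-capture group
    [:non_capture, nil]
  elsif %w(! =).include?(match[2])     # "(?=" or "(?!"
    [:lookahead, nil]
  elsif %w(! =).include?(match[3])     # "(?<=" or "(?<!"
    [:lookbehind, nil]
  elsif match[3]                       # "(?<name>"
    [:named, match[3]]
  else
    [:unknown, nil]                    # assumption: the diff has no such branch
  end
end

p classify_group_prefix("normal)")
p classify_group_prefix("?:nocapture)")
p classify_group_prefix("?<name>namedgroup)")
p classify_group_prefix("?<=lookbehind)")
```

The branch order matters: the lookbehind check has to come before the named-group fallback, because both begin with `?<`.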
@@ -146,7 +160,7 @@ def parse_question_mark_repeater(group)
     end
 
     def parse_range_repeater(group)
-      match = regexp_string[@current_position..-1].match(/^\{(\d+)(,)?(\d+)?\}/)
+      match = rest_of_string.match(/\A\{(\d+)(,)?(\d+)?\}/)
       @current_position += match[0].size
       min = match[1].to_i if match[1]
       has_comma = !match[2].nil?
@@ -157,6 +171,10 @@ def parse_range_repeater(group)
     def parse_one_time_repeater(group)
       OneTimeRepeater.new(group)
     end
+
+    def rest_of_string
+      regexp_string[@current_position..-1]
+    end
   end
 end
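
The commit does not say why `^` was replaced with `\A` in the patterns above. One plausible reason is that in Ruby `^` matches at the start of every line, not just at the start of the string, so a pattern meant to inspect only the current parse position could match further along. A small sketch with a made-up input string:

```ruby
# Illustrative input: digits appear only after a newline, not at position 0.
rest = "x\n12"

# ^ anchors at the start of any line, so this finds the digits on line two.
caret_index = rest =~ /^(\d+)/

# \A anchors only at the very start of the string, so nothing matches.
string_start_index = rest =~ /\A(\d+)/

p caret_index
p string_start_index
```

With `\A`, a match can only begin at the first character of `rest_of_string`, which is the current parse position.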

spec/regexp-examples_spec.rb

Lines changed: 14 additions & 0 deletions
@@ -52,6 +52,20 @@ def self.examples_exist_and_match(*regexps)
       )
     end
 
+    context "for complex multi groups" do
+      examples_exist_and_match(
+        /(normal)/,
+        /(?:nocapture)/,
+        /(?<name>namedgroup)/
+      )
+      # TODO: These are not yet implemented
+      # (expect to raise exception)
+      # /(?=lookahead)/,
+      # /(?!neglookahead)/,
+      # /(?<=lookbehind)/,
+      # /(?<!neglookbehind)/,
+    end
+
     context "for escaped characters" do
       examples_exist_and_match(
         /\w/,
