Commit 119ff5c

Add regex pattern comments
1 parent 039ab76 commit 119ff5c

1 file changed

Lines changed: 22 additions & 7 deletions

test/regex_test.py

@@ -126,17 +126,32 @@ def test_phoneNumberRegex(data, expected):
126126
("C##", False),
127127
("C", True),
128128
]
129+
# The sequence (? opens a special group construct: a non-capturing group, a lookaround assertion, or an inline flag, depending on the characters that follow and the regex flavor used. The constructs listed below change how the regex engine matches text without creating a numbered capture group.
130+
# The main constructs that begin with (? are listed below.
131+
132+
# (?: creates a non-capturing group. It groups elements so that a quantifier (like + or *) or the | (OR) operator can apply to them, but the matched content is not stored in a numbered group (e.g., \1, \2 in Python's re module, or $1, $2 in some other flavors).
133+
# Example: The pattern (abc)(?:123)(def), when matched against abc123def, would capture abc in group 1 and def in group 2, but 123 would not be captured.
134+
135+
# (?= defines a positive lookahead assertion. It matches a position in the string only if the pattern inside the lookahead follows that position, without actually including the following pattern in the match.
136+
# Example: The pattern Windows(?=XP) matches "Windows" only if it is immediately followed by "XP".
137+
138+
# (?<= defines a positive lookbehind assertion (not supported in all regex flavors). It matches a position only if the pattern inside the lookbehind immediately precedes it.
139+
# Example: The pattern (?<=\$)\d+ matches a number only if it is immediately preceded by a dollar sign.
140+
141+
# (?! defines a negative lookahead assertion. It matches a position only if the pattern inside the lookahead does not follow it.
142+
# Example: (?!\S) asserts that the current position in the string is not followed by a non-whitespace character (\S).
143+
# This effectively matches positions that are followed by a whitespace character (\s) or the end of the string (EOS), without including the following character in the final match.
144+
145+
# (?<! defines a negative lookbehind assertion (not supported in all regex flavors). It matches a position only if the pattern inside the lookbehind does not precede it.
146+
# Example: (?<!\S) asserts that the current position in the string is not preceded by a non-whitespace character (\S).
147+
# This effectively matches positions that are preceded by a whitespace character (\s) or the beginning of the string (BOS), without including the whitespace character itself in the final match.
148+
149+
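# (?i), (?m), (?s), (?x) are inline flags that change the matching behavior (case-insensitivity, multiline mode, dot-matches-newline, verbose mode). Depending on the flavor, they apply to the rest of the pattern or only within a group; current versions of Python's re expect global flags at the start of the pattern and also accept group-scoped flags such as (?i:...).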
129150
@pytest.mark.parametrize("data, expected", CPP_CSHARP_REGEX_TEST_CASES)
130151
def test_cpp_csharpRegex(data, expected):
131152
# https://stackoverflow.com/questions/79435236/how-to-match-c-c-or-c
132153
# The ?: inside the group (?:\+\+|#) just makes the group non-capturing. The (?<!\S) and (?!\S) are called lookarounds, and assert that either whitespace or the start/end of the string precedes/follows the match.
133-
# (?<!\S) is negative lookbehind
134-
# Meaning: Asserts that the current position in the string is not preceded by a non-whitespace character (\S).
135-
# Effect: This effectively matches locations that are preceded by a whitespace character (\s) or the beginning of the string (BOS), without including the whitespace character itself in the final match.
136-
# (?!\S) is negative lookahead
137-
# Meaning: Asserts that the current position in the string is not followed by a non-whitespace character (\S).
138-
# Effect: This effectively matches locations that are followed by a whitespace character (\s) or the end of the string (EOS), without including the following character in the final match.
139-
# matches to be the entire input string
154+
# Requires the match to span the entire input string. The '?' quantifier means zero or one occurrence of the preceding group.
140155
cpp_csharp_regex = r"^C(?:\+\+|#)?$"
141156
# Matches as part of a larger string: \b requires a word boundary before C, and (?!\S) requires whitespace or the end of the string after the match
142157
#cpp_csharp_regex = r"\bC(?:\+\+|#)?(?!\S)"
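
The constructs described in the comments of this diff can be checked with a short Python sketch. It uses only the standard re module. The constant names, function names, and sample strings are illustrative assumptions and are not part of test/regex_test.py. The TOKEN pattern uses the (?<!\S) lookbehind mentioned in the comment, rather than the \b used in the commented-out variant.

```python
import re

# Anchored variant from the diff: the whole input must be C, C++ or C#.
ANCHORED = re.compile(r"^C(?:\+\+|#)?$")

# Whitespace-delimited variant (assumption: lookbehind form from the comment,
# not the \b form in the commented-out line).
TOKEN = re.compile(r"(?<!\S)C(?:\+\+|#)?(?!\S)")


def captured_groups(text):
    """Non-capturing group: only the two plain (...) groups are numbered."""
    match = re.fullmatch(r"(abc)(?:123)(def)", text)
    return match.groups() if match else None


def windows_before_xp(text):
    """Positive lookahead: match 'Windows' only when 'XP' follows it."""
    match = re.search(r"Windows(?=XP)", text)
    return match.group() if match else None


def dollar_amounts(text):
    """Positive lookbehind: digits directly preceded by a dollar sign."""
    return re.findall(r"(?<=\$)\d+", text)


def is_c_family(text):
    """True when the entire string is C, C++ or C#."""
    return ANCHORED.match(text) is not None


def find_c_family_tokens(text):
    """Negative lookbehind and lookahead: tokens bounded by whitespace or string edges."""
    return TOKEN.findall(text)
```

For example, find_c_family_tokens("C C++ C# Cx xC") returns only the three standalone tokens, because "Cx" fails the lookahead and "xC" fails the lookbehind. Neither lookaround consumes the surrounding whitespace, so the returned strings contain no spaces.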
