A single-header C library that implements Lua-style pattern matching, including captures, character classes, repetition operators, balanced matches, frontier patterns, and detailed error reporting.
This is not regex — it follows Lua's lightweight pattern syntax, making it small, fast, and easy to embed. An overview of pattern matching is provided below. For more information on Lua's patterns refer to the official documentation.
- Lua-compatible pattern syntax
- Header-only (drop-in)
- Works on binary data or C strings
- Captures and back-references
- Greedy and lazy repetition
- Anchors (`^`, `$`)
- Balanced matches (`%b`)
- Frontier patterns (`%f`)
- Detailed error diagnostics with source location
- No dynamic allocations
- Recursive backtracking matcher
- Designed for simplicity and embeddability
- Behavior closely follows Lua's `string.match`
- Not a full regex engine
- No alternation (`|`)
- No lookahead/lookbehind
Just copy pattern.h into your project.
In one C file, define PATTERN_IMPLEMENTATION before including it:
```c
#define PATTERN_IMPLEMENTATION
#include "pattern.h"
```

In other files, include it normally:

```c
#include "pattern.h"
```

Basic usage:

```c
Pattern_State ps;
pattern_match_cstr(&ps, "hello world", "h(ello)");

// A malformed pattern is reported through the status field
if(ps.status == PATTERN_ERROR) {
    pattern_print_error(stderr, &ps);
    return 1;
}

if(ps.status == PATTERN_NO_MATCH) {
    printf("No match\n");
    return 0;
}

printf("Matched!\n");
// Captures are not NUL-terminated, so print them with an explicit length
printf("Full match: %.*s\n", (int)ps.captures[0].size, ps.captures[0].data);
printf("Capture 1: %.*s\n", (int)ps.captures[1].size, ps.captures[1].data);
```

NOTE: the full match is always stored at capture index `0`.
Conceptually, it is as if the pattern were always surrounded by a top-level capture.
| Pattern | Meaning |
|---|---|
| `.` | Any character |
| `%a` | Letters |
| `%c` | Control characters |
| `%d` | Digits |
| `%l` | Lowercase character |
| `%p` | Punctuation |
| `%s` | Whitespace |
| `%u` | Uppercase character |
| `%w` | Alphanumeric |
| `%x` | Hexadecimal digits |
| `%g` | Printable character and not whitespace |
| `%z` | `\0` |
| Uppercase classes | Negated (`%A`, `%D`, etc.) |
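
One way to picture the class table: each letter lines up with a `<ctype.h>` predicate, and an uppercase letter inverts the result. The sketch below is illustrative only. `class_matches` is a hypothetical helper, not a function from `pattern.h`, and the library's actual implementation may differ.

```c
#include <ctype.h>
#include <stdbool.h>

// Hypothetical helper (NOT part of pattern.h): returns true if byte `c`
// belongs to the class named by `cls`, the character that follows '%'.
bool class_matches(char cls, unsigned char c) {
    bool res;
    switch (tolower((unsigned char)cls)) {
        case 'a': res = isalpha(c) != 0;  break;
        case 'c': res = iscntrl(c) != 0;  break;
        case 'd': res = isdigit(c) != 0;  break;
        case 'g': res = isgraph(c) != 0;  break;
        case 'l': res = islower(c) != 0;  break;
        case 'p': res = ispunct(c) != 0;  break;
        case 's': res = isspace(c) != 0;  break;
        case 'u': res = isupper(c) != 0;  break;
        case 'w': res = isalnum(c) != 0;  break;
        case 'x': res = isxdigit(c) != 0; break;
        case 'z': res = (c == 0);         break;
        // Anything else is an escaped literal, e.g. "%%" matches '%'
        default:  return (unsigned char)cls == c;
    }
    // An uppercase class letter negates the class
    return isupper((unsigned char)cls) ? !res : res;
}
```

With this sketch, `class_matches('d', '7')` is true and `class_matches('D', '7')` is false. The `<ctype.h>` predicates depend on the current C locale, so results for bytes above 127 can vary.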
```
[abc]      -- a, b, or c
[a-z]      -- range
[^0-9]     -- negated class
[%a%d_]    -- escaped classes allowed
```

| Operator | Meaning |
|---|---|
| `?` | 0 or 1 (optional) |
| `*` | 0 or more (greedy) |
| `+` | 1 or more (greedy) |
| `-` | 0 or more (lazy) |
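
The difference between greedy `*` and lazy `-` is easiest to see on a concrete pattern. The sketch below hard-codes the two patterns `<.*>` and `<.->`, anchored at the start of the input, to show which closing `>` each one picks. `match_angle` is a hypothetical, standalone function written for illustration; it does not use `pattern.h` and is not how the library is implemented.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical illustration (NOT part of pattern.h).
// Returns the length of the match of "<.*>" (lazy == 0) or "<.->" (lazy != 0)
// anchored at the start of `s`, or 0 if there is no match.
size_t match_angle(const char *s, int lazy) {
    size_t len = strlen(s);
    if (len == 0 || s[0] != '<') return 0;

    if (lazy) {
        // Lazy: try the rest of the pattern ('>') as early as possible,
        // consuming one more character only when that fails.
        for (size_t i = 1; i < len; i++)
            if (s[i] == '>') return i + 1;
    } else {
        // Greedy: consume as much as possible, then back off one character
        // at a time until the rest of the pattern ('>') matches.
        for (size_t i = len; i > 1; i--)
            if (s[i - 1] == '>') return i;
    }
    return 0;
}
```

On the input `<a><b>`, the greedy form matches all 6 characters while the lazy form stops after `<a>` (3 characters).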
| Pattern | Meaning |
|---|---|
| `^` | Start of string |
| `$` | End of string |
```
(abc)   -- capture substring
()      -- position capture
```

- Capture `0` is always the entire match
- Maximum captures: 31 by default (30 plus 1 for the top-level match). This can be overridden by defining `PATTERN_MAX_CAPTURES` before including the library.
```
(%a+)%1   -- matches a run of letters immediately followed by itself, e.g. "hellohello"
```

```
%bxy   -- matches substring starting with x, ending with y, with balanced occurrences
```

Examples:

- `%b()` matches balanced parentheses: `"(a(b)c)"` → `"(a(b)c)"`
- `%b[]` matches balanced brackets: `"[a[b]c]"` → `"[a[b]c]"`
- `%b<>` matches balanced angle brackets: `"<a<b>c>"` → `"<a<b>c>"`

The balanced match starts at the first occurrence of x and continues until it finds a matching y where all nested x/y pairs are balanced.
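
The counting behind a balanced match can be sketched with a depth counter. The function below is a hypothetical, standalone illustration (`balanced_len` is not part of `pattern.h`), and it assumes the caller has already positioned `data` at a candidate opening character.

```c
#include <stddef.h>

// Hypothetical illustration (NOT part of pattern.h).
// If data[0] == open, returns the length of the balanced span that starts
// there; returns 0 if data does not start with `open` or never balances.
size_t balanced_len(const char *data, size_t len, char open, char close) {
    if (len == 0 || data[0] != open) return 0;

    size_t depth = 1;  // one unclosed `open` so far
    for (size_t i = 1; i < len; i++) {
        if (data[i] == close) {
            // Checking `close` first keeps pairs with open == close working
            if (--depth == 0) return i + 1;
        } else if (data[i] == open) {
            depth++;
        }
    }
    return 0;  // input ended before the nesting was closed
}
```

For `"(a(b)c)"` this returns 7 (the whole string), and for `"(a)(b)"` it returns 3, because the first pair closes before the second one opens.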
```
%f[set]   -- matches empty string at boundary where next char is in set and previous is not
```

Examples:

- `%f[%w]` matches the start of a word (word boundary)
- `%f[%W]` matches the end of a word
- `%f[%a]hello%f[%A]` matches "hello" as a complete word
- `%f[%d]` matches the position where a run of digits begins

The frontier pattern matches a zero-width position (like () position captures) where:
- The previous character is NOT in the specified set
- The current character IS in the specified set
This is useful for matching word boundaries and other transitions between character classes.
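
The two conditions above can be written as a small check. The sketch below hard-codes the set `%w` and is a hypothetical, standalone illustration (`is_word_frontier` is not part of `pattern.h`). It assumes Lua's convention that positions before the start and at the end of the buffer behave like `\0`; whether this library does the same is not stated here.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical illustration (NOT part of pattern.h).
// Returns true if `pos` in data[0..len) is a frontier for the set %w,
// i.e. the previous byte is not alphanumeric and the current byte is.
bool is_word_frontier(const char *data, size_t len, size_t pos) {
    // Assumption: bytes outside the buffer are treated as '\0' (Lua's rule)
    unsigned char prev = (pos > 0 && pos <= len) ? (unsigned char)data[pos - 1] : 0;
    unsigned char cur  = (pos < len) ? (unsigned char)data[pos] : 0;
    return !isalnum(prev) && isalnum(cur);
}
```

In `"THE (quick) fox"` the frontier positions are offsets 0, 5 and 12, the first letter of each word.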
```c
pattern_match(&ps, "cantami\0o\0diva", 14, pattern);
```

```c
pattern_match_ex(&ps, data, len, pattern, start_pos);
```

- Negative `start_pos` values count from the end of the data (Lua-style)

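
As a rough illustration of Lua-style negative positions, the sketch below converts a possibly negative start position into a 0-based offset. Everything here is an assumption made for illustration: `normalize_start` is a hypothetical helper, not part of `pattern.h`, and the exact convention the library uses (0-based offsets, `-1` meaning the last byte, clamping to the start) should be checked against `pattern.h` itself.

```c
#include <stddef.h>

// Hypothetical illustration (NOT part of pattern.h).
// Assumed convention: offsets are 0-based, -1 refers to the last byte,
// and a negative position that reaches past the start clamps to 0.
size_t normalize_start(long start_pos, size_t len) {
    if (start_pos >= 0) return (size_t)start_pos;

    // Distance from the end, computed without overflowing at LONG_MIN
    size_t back = (size_t)(-(start_pos + 1)) + 1;
    return (back > len) ? 0 : len - back;
}
```

Under these assumptions, a 10-byte buffer maps `-1` to offset 9 and `-3` to offset 7, mirroring how Lua's `string.find` treats a negative `init`.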
```c
if(ps.status == PATTERN_ERROR) {
    pattern_print_error(stderr, &ps);
}
```

Example output:

```
column:5: unclosed character class
[a-z
    ^
```

- `PATTERN_ERR_MAX_CAPTURES`
- `PATTERN_ERR_UNEXPECTED_CAPTURE_CLOSE`
- `PATTERN_ERR_UNCLOSED_CAPTURE`
- `PATTERN_ERR_INVALID_CAPTURE_IDX`
- `PATTERN_ERR_INCOMPLETE_ESCAPE`
- `PATTERN_ERR_UNCLOSED_CLASS`
- `PATTERN_ERR_INVALID_BALANCED_PATTERN`
- `PATTERN_ERR_UNCLOSED_FRONTIER_PATTERN`

```c
// Returns true if `idx` is a position-only capture (i.e. `()`)
bool pattern_is_position_capture(const Pattern_State* ps, int idx);

// Gets the offset from the start of the string where the capture `idx` starts
size_t pattern_get_capture_pos(const Pattern_State* ps, int idx);

// Returns a human readable string describing the error
const char* pattern_strerror(Pattern_Error err);

// Prints an error in human readable form along with the error location in the pattern
void pattern_print_error(FILE* stream, const Pattern_State* ps);
```

A test suite is provided in the `test/` folder. To run it:

```
make test
```