Consider manual parsing and handling of regex #4

@PgBiel

The regex module currently forwards all regex strings unchanged to Nix's built-in regex matching and splitting functions. However, Nix uses POSIX ERE syntax for regex, which is more limited than the syntax available on other Gleam targets. For example, \s for whitespace doesn't work and must be replaced by [[:space:]]. Similarly, \d must be written as [0-9] (or [[:digit:]]).

We could solve this by manually parsing the regex on compile and transforming it into a format that Nix accepts. In particular, we'd look into tackling the following incompatibilities:

  1. Properly handle \s and \d, which are widely used across Gleam packages;
  2. Handle other regex escapes, such as \n;
  3. Handle the case-insensitive flag;
  4. Handle the multi-line flag;
  5. Handle (?!x) (negative lookahead);
  6. Handle (?:x) (non-capturing group).

1 and 2 would require context-aware parsing and replacement (inside versus outside character classes). A simple global replacement isn't enough, since the surrounding [ ] in [0-9] have to be dropped when the escape is already inside a character class.
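To make the context dependence concrete, here is a minimal sketch in Python (used only for illustration; the real implementation would live in the Nix target). The name `translate_escapes` and the `CLASS_BODY` table are hypothetical, and the sketch assumes the input contains no pre-existing POSIX classes such as [[:space:]] and no literal ] placed first in a class.

```python
# Hypothetical table: body of the POSIX bracket expression for each escape.
CLASS_BODY = {"d": "0-9", "s": "[:space:]"}


def translate_escapes(pattern: str) -> str:
    """Rewrite \\d and \\s into POSIX ERE form, tracking character classes."""
    out = []
    in_class = False  # True while between an unescaped '[' and its ']'
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in CLASS_BODY:
                body = CLASS_BODY[nxt]
                # Inside a class the outer brackets must be dropped.
                out.append(body if in_class else "[" + body + "]")
            else:
                # Any other escape is copied through untouched in this sketch.
                out.append(ch + nxt)
            i += 2
            continue
        if ch == "[" and not in_class:
            in_class = True
        elif ch == "]" and in_class:
            in_class = False
        out.append(ch)
        i += 1
    return "".join(out)
```

With this, `\d+\s` would become `[0-9]+[[:space:]]`, while `[a-f\d]` would become `[a-f0-9]` rather than the invalid nested form a global replacement would produce.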

3 would basically consist of converting letters such as a or A into [aA], and would require parsing in the same manner.
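A rough Python sketch of that letter expansion follows. The function name `expand_case` is hypothetical, and the sketch deliberately leaves the contents of character classes untouched, since ranges such as [a-z] would need separate handling; it also only considers ASCII letters.

```python
def expand_case(pattern: str) -> str:
    """Replace each ASCII letter outside a character class with [xX]."""
    out = []
    in_class = False  # True while between an unescaped '[' and its ']'
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escapes such as \n are copied as a pair so the letter
            # after the backslash is never expanded.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if ch == "[" and not in_class:
            in_class = True
        elif ch == "]" and in_class:
            in_class = False
        elif not in_class and ch.isascii() and ch.isalpha():
            out.append("[" + ch.lower() + ch.upper() + "]")
            i += 1
            continue
        out.append(ch)
        i += 1
    return "".join(out)
```

For instance, `ab1` would be rewritten to `[aA][bB]1`. A complete version would also have to extend classes and ranges, which is why the same parser state as for the escapes is needed.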

4 could perhaps be done by splitting the string into lines first and joining matches on each line.
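The line-splitting idea could look roughly like the Python sketch below, where Python's `re` module (without its own multi-line flag) stands in for the Nix built-ins. The function name `multiline_matches` is hypothetical, and the approach assumes the pattern never needs to match across a newline.

```python
import re


def multiline_matches(pattern: str, text: str) -> list:
    """Emulate multi-line anchors by matching each line independently."""
    results = []
    for line in text.split("\n"):
        # ^ and $ now refer to the start and end of this single line.
        for m in re.finditer(pattern, line):
            results.append(m.group(0))
    return results
```

A real implementation would also need to add each line's offset back to the match positions if the API reports them.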

5 could perhaps be done by replacing (?!x) with (x)? and storing the capture group number; later on, when using match functions, matches where the group is present would be ignored (additionally, the group would be removed from submatches).
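The filtering step described here might be sketched as follows in Python, again with `re` standing in for the Nix built-ins. The function `match_without_lookahead` and its parameters are hypothetical; it assumes the pattern was already rewritten from (?!x) to (x)? and that the number of that guard group was recorded during parsing.

```python
import re


def match_without_lookahead(rewritten: str, guard_group: int, text: str) -> list:
    """Drop matches where the guard group matched; hide it from submatches."""
    kept = []
    for m in re.finditer(rewritten, text):
        if m.group(guard_group) is not None:
            # The text the lookahead forbids was present: reject this match.
            continue
        groups = list(m.groups())
        del groups[guard_group - 1]  # groups are 1-indexed
        kept.append((m.group(0), groups))
    return kept
```

Note that this is only an approximation of a true lookahead: a rejected match is discarded outright instead of letting the engine backtrack. For example, `a+(?!b)` on `aab` matches `a` in an engine with lookahead, whereas the rewritten `a+(b)?` consumes `aab` with the guard group present and yields no match at all.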

6 could be done by replacing (?:x) with (x) and then removing the group from submatches later, by storing the group's number after parsing.
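As an illustration, the Python sketch below rewrites (?:x) to (x) while recording which group numbers must later be hidden. The names `strip_non_capturing` and `visible_submatches` are hypothetical, and the sketch assumes the pattern contains no other (?...) constructs.

```python
def strip_non_capturing(pattern: str):
    """Turn (?:x) into (x); return the new pattern and hidden group numbers."""
    out = []
    hidden = []
    group_no = 0
    in_class = False  # parentheses inside [...] are literals, not groups
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])  # escaped char, e.g. \( is a literal
            i += 2
            continue
        if ch == "[" and not in_class:
            in_class = True
        elif ch == "]" and in_class:
            in_class = False
        elif ch == "(" and not in_class:
            group_no += 1
            if pattern.startswith("(?:", i):
                hidden.append(group_no)
                out.append("(")
                i += 3
                continue
        out.append(ch)
        i += 1
    return "".join(out), hidden


def visible_submatches(groups: list, hidden: list) -> list:
    """Remove the hidden (originally non-capturing) groups from submatches."""
    return [g for n, g in enumerate(groups, start=1) if n not in hidden]
```

Here `(?:ab)+(c)(?:d)` would become `(ab)+(c)(d)` with groups 1 and 3 marked hidden, so only the submatch for `(c)` is reported back.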

Now, parsing with Nix could be inefficient, but we expect compile to be a "slower" operation anyway. This would allow proper compatibility with the ecosystem.

We could also first test whether parsing is needed at all, and otherwise just pass the regex through unchanged.
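Such a pre-check could be as cheap as a single scan for the constructs listed above. The Python sketch below is one possible shape; the name `needs_rewrite`, the `flags` string, and the exact set of escapes checked are assumptions for illustration.

```python
import re

# Escapes and group prefixes that POSIX ERE does not understand.
# The escape list here is an assumed, non-exhaustive selection.
NEEDS_REWRITE = re.compile(r"\\[dDsSwWbBntrfv]|\(\?")


def needs_rewrite(pattern: str, flags: str = "") -> bool:
    """Return True if the pattern (or any flag) requires the slow path."""
    return bool(flags) or NEEDS_REWRITE.search(pattern) is not None
```

The check may report false positives (for example an escaped backslash followed by d), which is harmless: the pattern simply goes through the full parser, which then leaves it unchanged.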

Labels: enhancement (New feature or request), nix incompatibility (Some function works differently or doesn't work in the Nix target)
