Alternative word-boundary rules #171

@dhardy

Is there any interest in supporting alternative word-boundary rules as desired for (e.g.) text editing?

Currently, split_word_bounds follows the UAX #29 word boundary rules, which explicitly treat things like 3,456.789 and example.com as a single word (although, interestingly, "e.g." has a boundary before its final period).

Text editing usually wants slightly different rules; e.g. text.len should be considered two words + intervening punctuation.
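To make the desired behaviour concrete, here is a minimal sketch of an editor-style splitter that breaks wherever the coarse character class changes. It uses only the standard library; `CharClass`, `classify` and `split_editor_words` are hypothetical names for illustration and are not part of this crate's API. It also ignores grapheme clusters and treats combining marks as punctuation, so it is an assumption-laden illustration rather than a proposed implementation.

```rust
/// Coarse character classes for a hypothetical editor-style segmenter.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum CharClass {
    Word,
    Space,
    Punct,
}

/// Assumption: letters, digits and '_' are word characters;
/// everything that is not whitespace counts as punctuation.
fn classify(c: char) -> CharClass {
    if c.is_alphanumeric() || c == '_' {
        CharClass::Word
    } else if c.is_whitespace() {
        CharClass::Space
    } else {
        CharClass::Punct
    }
}

/// Split `text` wherever the character class changes.
fn split_editor_words(text: &str) -> Vec<&str> {
    let mut pieces = Vec::new();
    let mut start = 0;
    let mut prev: Option<CharClass> = None;
    for (idx, c) in text.char_indices() {
        let class = classify(c);
        if let Some(p) = prev {
            if p != class {
                // Class changed: close the current run at this byte offset.
                pieces.push(&text[start..idx]);
                start = idx;
            }
        }
        prev = Some(class);
    }
    if start < text.len() {
        pieces.push(&text[start..]);
    }
    pieces
}
```

With this sketch, `split_editor_words("text.len")` yields the three pieces `text`, `.` and `len`, whereas the UAX #29 rules keep `text.len` together.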

In some cases CamelCase and snake_case may also be treated as multiple words. For example, KDE's Kate editor splits these as Camel Case and snake_ case for word-mode keyboard navigation (Ctrl + Right etc.), but does not sub-divide them for double-click selection.
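The sub-word behaviour described above could be sketched roughly as follows. This is a std-only illustration; `split_subwords` is a hypothetical name, and the two rules (break between a lowercase and an uppercase letter, and break after a run of underscores) are my reading of the Kate behaviour rather than a specification of it.

```rust
/// Split a single word into sub-words for keyboard navigation.
/// Assumed rules: break at a lowercase -> uppercase transition,
/// and break after an underscore that is followed by a non-underscore.
fn split_subwords(word: &str) -> Vec<&str> {
    let mut pieces = Vec::new();
    let mut start = 0;
    let mut prev: Option<char> = None;
    for (idx, c) in word.char_indices() {
        if let Some(p) = prev {
            let camel_hump = p.is_lowercase() && c.is_uppercase();
            let after_underscore = p == '_' && c != '_';
            if camel_hump || after_underscore {
                // Close the current sub-word at this byte offset.
                pieces.push(&word[start..idx]);
                start = idx;
            }
        }
        prev = Some(c);
    }
    if start < word.len() {
        pieces.push(&word[start..]);
    }
    pieces
}
```

Under these assumptions `CamelCase` splits into `Camel` and `Case`, and `snake_case` into `snake_` and `case`. Runs of capitals such as `HTTPServer` are left whole; handling acronyms would need an extra rule. A caller could apply this only for keyboard navigation and skip it for double-click selection.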
