Wow, this is amazing. Thank you very much. Let me know if you need any help or want to discuss.
No problem :) I got irritated that there is no good polyfill for word segmenting with up-to-date Unicode rules, and I stumbled upon your project via e18e and decided to contribute :)
Yeah, so far I think I'm good. It's a bit confusing to implement because I'm trying to base it on both grapheme.js and word.rs, and they are implemented completely differently 😂 It's all very new to me, but I think I'm getting the hang of it now. Will try to finish it this week.
#112 needs to be addressed before implementing other segmenters.
Hey Hyeseong, sorry for the silence from my side. I'll be honest with you: I don't know if I'll be working on this PR any time soon. Since the last time I touched the code, my own needs and circumstances have changed, and I no longer need a word segmenter for the project I initially needed it for. Because of this, I now have less motivation to finish this :) I might come back to it later when I get the time and inspiration to work with ICU again, but don't expect this to happen any time soon :/ If you, or anyone else, want to pick this up, be my guest 😅
Understandable. I was wondering if people still need the word segmenter, but it seems to receive less attention than grapheme. I tried a different implementation on my end, but setting up automated tests was a bit difficult because Intl.Segmenter behavior is fragmented across Node.js versions. I think I can postpone the development of additional segmenters a little longer.
This PR tackles #25 and implements a word segmenter. This is still WIP.