Open
adithyaov
reviewed
Jun 21, 2023
core/src/Streamly/Unicode/Stream.hs
Outdated
-- |
module Streamly.Unicode.Stream
    (
      DecodeState
Member
Before exposing:
- Check the naming
- Check that the documentation is good
- Check naming consistency
- Optional: Benchmark and Test
6a20721 to
5fa6d57
Compare
adithyaov
reviewed
Jul 11, 2023
adithyaov
reviewed
Jul 24, 2023
528ba6e to
4e308f1
Compare
4e308f1 to
6c754f4
Compare
harendra-kumar
requested changes
Jul 24, 2023
-- For multi-byte characters, the decoding state indicates the number of bytes
-- remaining to complete the character. It is usually initialized to a non-zero
-- value corresponding to the number of bytes in the multi-byte character, e.g
-- DecodeState will be 1 for 2-bytes char.
Member
This documentation does not make much sense to a user of the library. We need to state which APIs use this type, where its values come from, and what values need to be supplied. The goal is to give users the information they need to use the APIs correctly.
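To make the "bytes remaining" wording in the doc comment concrete, here is a minimal sketch of how a leading byte maps to the number of continuation bytes still expected. `remainingBytes` is a hypothetical helper written for illustration; it is not part of Streamly and may not match how `DecodeState` is actually represented.

```haskell
import Data.Word (Word8)

-- Hypothetical helper (an assumption, not a Streamly API): the number of
-- continuation bytes still expected after the given leading byte.
-- Nothing means the byte cannot start a UTF-8 sequence.
remainingBytes :: Word8 -> Maybe Int
remainingBytes b
    | b < 0x80 = Just 0   -- 0xxxxxxx: single-byte (ASCII) character
    | b < 0xC0 = Nothing  -- 10xxxxxx: continuation byte, not a starter
    | b < 0xE0 = Just 1   -- 110xxxxx: 2-byte character, 1 byte remaining
    | b < 0xF0 = Just 2   -- 1110xxxx: 3-byte character, 2 bytes remaining
    | b < 0xF8 = Just 3   -- 11110xxx: 4-byte character, 3 bytes remaining
    | otherwise = Nothing -- 11111xxx: never valid in UTF-8
```

For example, `remainingBytes 0xC3` is `Just 1`, which matches the "1 for 2-bytes char" case in the comment under review.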
-- Calculate the code point value: Depending on the type of the leading byte,
-- extract the significant bits from each byte of the sequence and combine them
-- to form the complete code point value. The specific bit manipulations will
-- differ based on the number of bytes used.
Member
We should not define what a code point is here. Instead, say what it means in the context of these APIs and what the user needs to know to use them.
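For reference, the bit manipulation the doc comment describes can be sketched for the 3-byte case as below. `codePoint3` is a hypothetical helper, assumes the three bytes already form a well-formed sequence, and does no validation; it only illustrates the comment and is not code from this PR.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word8)

-- Hypothetical helper (an assumption, not a Streamly API): combine a
-- well-formed 3-byte UTF-8 sequence into its code point value.
-- Layout: 1110xxxx 10yyyyyy 10zzzzzz -> xxxxyyyyyyzzzzzz
codePoint3 :: Word8 -> Word8 -> Word8 -> Int
codePoint3 b0 b1 b2 =
        (fromIntegral (b0 .&. 0x0F) `shiftL` 12) -- 4 payload bits of the leader
    .|. (fromIntegral (b1 .&. 0x3F) `shiftL` 6)  -- 6 payload bits
    .|.  fromIntegral (b2 .&. 0x3F)              -- 6 payload bits
```

For example, the bytes `0xE2 0x82 0xAC` combine to `0x20AC`, the euro sign.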
d8470ed to
3692740
Compare
-- ** Resumable UTF-8 Decoding
, DecodeError(..)
, DecodeState
, CodePoint
Member
We can return a more informative decode error:
data DecodeUTF8Error = DecodeUTF8Incomplete Word32 | DecodeUTF8NonStarter Word8 | DecodeUTF8Invalid
This should be enough to build a resumable decoder. When resuming, the caller supplies the Word32 from DecodeUTF8Incomplete.
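A rough sketch of how a resumable decoder could be built on the proposed error type follows. Only the `DecodeUTF8Error` declaration comes from the comment above. The function `decodeChunk`, the helpers `pack` and `unpack`, and the packing of the `Word32` (pending byte count in the top 8 bits, partial code point in the low 24 bits) are assumptions made for illustration. The sketch works on lists rather than streams and does not reject overlong encodings.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8, Word32)

-- Error type as proposed in the review comment.
data DecodeUTF8Error
    = DecodeUTF8Incomplete Word32 -- input ended mid-character; resume state
    | DecodeUTF8NonStarter Word8  -- continuation byte where a starter was expected
    | DecodeUTF8Invalid
    deriving (Eq, Show)

-- Assumed packing: bits 24-31 = continuation bytes still expected,
-- bits 0-23 = code point bits accumulated so far.
pack :: Int -> Word32 -> Word32
pack n acc = (fromIntegral n `shiftL` 24) .|. acc

unpack :: Word32 -> (Int, Word32)
unpack w = (fromIntegral (w `shiftR` 24), w .&. 0xFFFFFF)

-- Decode one chunk. Pass 0 as the state for a fresh start, or the Word32
-- from a previous DecodeUTF8Incomplete to resume. Returns the characters
-- decoded so far and the error, if any, that stopped decoding.
decodeChunk :: Word32 -> [Word8] -> (String, Maybe DecodeUTF8Error)
decodeChunk st0 = go (unpack st0) []
  where
    go (0, _) out [] = (reverse out, Nothing)
    go (n, acc) out [] = (reverse out, Just (DecodeUTF8Incomplete (pack n acc)))
    go (0, _) out (b:bs)
        | b < 0x80 = go (0, 0) (chr (fromIntegral b) : out) bs
        | b < 0xC0 = (reverse out, Just (DecodeUTF8NonStarter b))
        | b < 0xE0 = go (1, fromIntegral (b .&. 0x1F)) out bs
        | b < 0xF0 = go (2, fromIntegral (b .&. 0x0F)) out bs
        | b < 0xF8 = go (3, fromIntegral (b .&. 0x07)) out bs
        | otherwise = (reverse out, Just DecodeUTF8Invalid)
    go (n, acc) out (b:bs)
        | b .&. 0xC0 /= 0x80 = (reverse out, Just DecodeUTF8Invalid)
        | n > 1 = go (n - 1, acc') out bs
        | valid = go (0, 0) (chr (fromIntegral acc') : out) bs
        | otherwise = (reverse out, Just DecodeUTF8Invalid)
      where
        acc' = (acc `shiftL` 6) .|. fromIntegral (b .&. 0x3F)
        -- Reject values beyond U+10FFFF and UTF-16 surrogates.
        valid = acc' <= 0x10FFFF && (acc' < 0xD800 || acc' > 0xDFFF)
```

With this sketch, decoding `[0x61, 0xC3]` yields `"a"` together with `DecodeUTF8Incomplete 0x01000003`, and passing that `Word32` back with the next chunk `[0xA9, 0x62]` yields `"\233b"` with no error.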
No description provided.