Conversation

@LauraLangdon
Contributor

Add a spell check to docs workflow

@LauraLangdon LauraLangdon added the documentation Anything related to documentation (e.g. doc bugs or similar), *not* documenting new features label Apr 5, 2022
@LauraLangdon LauraLangdon self-assigned this Apr 5, 2022
@LauraLangdon LauraLangdon linked an issue Apr 5, 2022 that may be closed by this pull request
@LauraLangdon
Contributor Author

LauraLangdon commented Apr 5, 2022

I need to add more words to spelling.dic; it's flagging all kinds of things we need.

@hola-soy-milk
Contributor

Looking amazing!

@flaki
Contributor

flaki commented Apr 12, 2022

A couple thoughts on this based on the flagged (false) positives:

The docs are not plain Markdown, but MDX, that is markdown+JSX (so practically as much HTML as they are MD):

Misspelled words:
<markdown> website/docs/atmo/runnable-api/graphql-client.md
--------------------------------------------------------------------------------
…
MultiLanguageCodeBlock
…
TabItem
…
groupId
href
…

We would definitely need to filter these or build a custom dictionary that takes care of this.
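One way to seed such a custom dictionary is to collect the JSX component and attribute names directly from the MDX source. The following is only a minimal sketch under stated assumptions: the helper name `jsx_words` and both regular expressions are hypothetical (they are not part of pyspelling or of this PR), and the tag pattern is deliberately naive, so attribute values containing `>` will not be handled.

```python
import re

# Naive match for an opening, closing, or self-closing JSX/HTML tag,
# e.g. <TabItem value="go"> or </Tabs>. Hypothetical pattern for illustration.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)((?:\s+[^<>]*?)?)/?>")
# Quoted attribute values, blanked out before attribute names are collected.
QUOTED_RE = re.compile(r'"[^"]*"|\'[^\']*\'')
# Attribute names inside a tag, e.g. groupId= or href=.
ATTR_RE = re.compile(r"([A-Za-z][A-Za-z0-9]*)\s*=")


def jsx_words(mdx_text):
    """Return the set of tag and attribute names found in MDX text."""
    words = set()
    for tag in TAG_RE.finditer(mdx_text):
        words.add(tag.group(1))
        # Drop quoted values so that '=' inside a URL is not mistaken
        # for an attribute assignment.
        attrs = QUOTED_RE.sub('""', tag.group(2))
        words.update(ATTR_RE.findall(attrs))
    return words


if __name__ == "__main__":
    sample = '<Tabs groupId="lang">\n<TabItem value="go">code</TabItem>\n</Tabs>'
    # One word per line, ready to be appended to a wordlist file.
    print("\n".join(sorted(jsx_words(sample))))
```

The resulting names could be appended to spelling.dic, or reviewed by hand first so that real typos in component names do not get whitelisted.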

On the other hand, the spellcheck seems to indiscriminately crawl and flag all code blocks:

Misspelled words:
<markdown> website/docs/grav/usage/getting-started/request-reply.md
--------------------------------------------------------------------------------
Grav
MsgReceipt
MsgTypeDefault
NewMsg
OnReply
Println
RPC
…
pre
reciepts
requestReply

This will improve in the future when we move a good chunk of (but not necessarily all) code snippets to files outside of the Markdown files. In certain cases, though, it would make a lot of sense to keep checking these snippets, such as for the runnable APIs, but against a generated wordlist:

Misspelled words:
<markdown> website/docs/atmo/runnable-api/logging.md
--------------------------------------------------------------------------------
…
LogDebug
LogErr
LogInfo
LogWarn
…
logDebug
logErr
logInfo
logWarn
…

Similarly for other things like client libraries. This would of course be a "blunt weapon", and a smarter way to do this would be through proper tests (à la #81).
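As a rough illustration of the generated-wordlist idea, the names could be pulled from the library source rather than maintained by hand. This is a sketch under assumptions, not how the project does it: it assumes a Go source file as input, the helper name `go_func_names` is hypothetical, and the regular expression only recognises plain `func` declarations (with or without a receiver), not generics or other declaration forms.

```python
import re

# Matches Go function declarations such as
#   func LogInfo(msg string) {}
#   func (l *Logger) LogDebug(msg string) {}
# Hypothetical pattern for illustration; it does not cover generic functions.
GO_FUNC_RE = re.compile(
    r"^func\s+(?:\([^)]*\)\s*)?([A-Za-z_]\w*)\s*\(",
    re.MULTILINE,
)


def go_func_names(source):
    """Return the set of function names declared in a Go source string."""
    return set(GO_FUNC_RE.findall(source))


if __name__ == "__main__":
    example = (
        "package log\n"
        "\n"
        "func LogInfo(msg string) {}\n"
        "func LogErr(msg string) {}\n"
    )
    # One name per line, suitable for a generated wordlist file.
    print("\n".join(sorted(go_func_names(example))))
```

Because the list comes from the source rather than from the docs, a misspelled API name in a code snippet would still be flagged.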

@LauraLangdon
Contributor Author

A couple thoughts on this based on the flagged (false) positives:

The docs are not plain Markdown, but MDX, that is markdown+JSX (so practically as much HTML as they are MD):

Misspelled words:
<markdown> website/docs/atmo/runnable-api/graphql-client.md
--------------------------------------------------------------------------------
…
MultiLanguageCodeBlock
…
TabItem
…
groupId
href
…

We would definitely need to filter these or build a custom dictionary that takes care of this.

Yep! I'm slowly working my way through all the false positives in spelling.dic.

On the other hand, the spellcheck seems to indiscriminately crawl and flag all code blocks:

Misspelled words:
<markdown> website/docs/grav/usage/getting-started/request-reply.md
--------------------------------------------------------------------------------
Grav
MsgReceipt
MsgTypeDefault
NewMsg
OnReply
Println
RPC
…
pre
reciepts
requestReply

This will improve in the future when we move a good chunk of (but not necessarily all) code snippets to files outside of the Markdown files. In certain cases, though, it would make a lot of sense to keep checking these snippets, such as for the runnable APIs, but against a generated wordlist:

Misspelled words:
<markdown> website/docs/atmo/runnable-api/logging.md
--------------------------------------------------------------------------------
…
LogDebug
LogErr
LogInfo
LogWarn
…
logDebug
logErr
logInfo
logWarn
…

Similarly for other things like client libraries. This would of course be a "blunt weapon", and a smarter way to do this would be through proper tests (à la #81).

From today's docs meeting notes:

Flaki: it looks like the spellchecker uses https://facelessuser.github.io/pyspelling/filters/markdown/ as its Markdown filter. I don't see an option for this out of the box (nor for JSX/MDX), so this might not be possible without modifying that filter manually. cc @arbourd

LauraLangdon and others added 4 commits April 26, 2022 18:11
Previous build failed `spellcheck.yml` test, citing `expected <block end>, but found '?'
  in "<unicode string>", line 7, column 3:
      dictionary:
      ^`

Just trying to match our `spellcheck.yml` to the example at https://github.com/igsekor/pyspelling-any/blob/main/spellcheck.yaml to see if that fixes it.
@LauraLangdon LauraLangdon requested review from arbourd and flaki April 27, 2022 16:59
@LauraLangdon LauraLangdon marked this pull request as ready for review April 27, 2022 17:00
Moved code from `docs/.spellcheck.yaml` (don't even know where that came from?) to `docs/.github/workflows/spellcheck.yml`
This shouldn't be here, I think; it should be `.github/workflows/spellcheck.yml`.
@LauraLangdon LauraLangdon requested review from arbourd and hola-soy-milk and removed request for arbourd May 3, 2022 22:20
@LauraLangdon LauraLangdon merged commit 6a6683d into main May 4, 2022
@LauraLangdon LauraLangdon deleted the arbourd-patch-1 branch May 4, 2022 19:07

Development

Successfully merging this pull request may close these issues.

CI tests for docs

5 participants