the old one didn't strip the trailing newline
lib/rouge/lexers/abap.rb
rule %r/".*/, Comment::Single
rule %r(^\*.*), Comment::Multiline
rule %r/"[^\r\n]*/, Comment::Single
rule %r/^\*[^\r\n]*/, Comment::Single
Without the /m flag, . in regexes doesn't typically match \n anyways.
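A quick way to check this in plain Ruby, outside of Rouge. This is a minimal sketch, and the sample string is made up for illustration:

```ruby
# Hypothetical ABAP-like sample: a "-comment followed by code on the next line.
source = "\" a comment\nWRITE 'x'.\n"

# Without /m, `.` stops at the line feed, so the match excludes "\n".
plain = source[/".*/]

# With /m, `.` also matches "\n" and the match runs to the end of the input.
multiline = source[/".*/m]

# For LF-only input, the explicit character class gives the same result
# as the plain version.
explicit = source[/"[^\r\n]*/]

puts plain.inspect
puts multiline.inspect
puts explicit.inspect
```

For LF-terminated input, the first and third patterns produce the same token text, which is the point of the comment above.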
Okay, I was just misled by the visual demo (http://localhost:9292/abap), which shows a newline after every comment, whereas the overview page (http://localhost:9292/) does not show them.
After comparing with some other languages, I see that this is an issue for almost every other language too, so I think this will work without the newline regexp. Thanks for the hint.
Can you show me a screenshot of "shows a newline after every comment"?
1. I added a devcontainer (because I am not allowed to install Ruby on my work device): mcr.microsoft.com/devcontainers/ruby:3
2. I ran puma and opened http://localhost:9292/
3. On the overview page, the snippets look good (for ABAP and other languages; just look at the single-line comments).
4. Looking at the detail pages, I see line feeds after every single-line comment (which are not in the source code!).
ABAP ( https://github.com/rouge-ruby/rouge/blob/main/spec/visual/samples/abap ):
C-Sharp ( https://github.com/rouge-ruby/rouge/blob/main/spec/visual/samples/csharp ):

What browser are you viewing these in? I definitely don't see those in Firefox on Mac. I'm kind of shocked that a browser would render a lone \r as a full newline in 2026.
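One place where the two comment patterns can differ is CRLF input: in Ruby, `.` excludes only `\n`, so `.*` keeps a trailing `\r` inside the token. Whether that is what produces the extra line feeds seen here is an assumption and has not been confirmed in this thread. A small sketch with a made-up sample:

```ruby
# Hypothetical sample with Windows (CRLF) line endings.
crlf = "* full-line comment\r\nDATA x TYPE i.\r\n"

with_dot   = crlf[/^\*.*/]        # `.` matches "\r" but not "\n"
with_class = crlf[/^\*[^\r\n]*/]  # the class excludes both "\r" and "\n"

puts with_dot.inspect
puts with_class.inspect
puts with_dot.end_with?("\r")
```

If the sample files use LF-only line endings, both patterns yield the same token.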
# https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US/index.htm?file=abenabap_shortref.htm
# https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-us/abenbuilt_in_functions_overview.htm
# https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US/abencds_language_elements.htm
# https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-us/abenabap_words.htm
Are these keyword/builtin lists the same as in the sources? If so, I might take a moment to write a docs parser for it.
The second and the last link probably contain everything, but I cannot guarantee it 100%: SAP mixes some built-in data types into the list of keywords. I would not consider those keywords, though they may be in a different context. This is the pain in the ass with ABAP/SAP: some words are keywords in a special context but just names or variables in another. It is not as simple as C or Java.
Since those lists barely change (a few new keywords may appear with major SAP releases), their content is rather stable, whereas the layout of the page may not be. Since I am not an expert in writing docs parsers and the list is complete for the latest SAP release, I do not see the need to invest more work here than necessary.
What I'm asking is - did you manually edit the contents of these pages to source the lists? Or are these lists just the lists of keywords from the docs?
It's not 1:1 the same, since SAP mixes some built-in data types into the last list (https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-us/abenabap_words.htm) which are not keywords per se - but, depending on the usage, some of them may be keywords in a special context 🤦
E.g. CHAR is a built-in data type (though not a string or similar as you know it from other languages), and CL_DBI_UTILITIES is even an SAP-delivered class. I try to mimic the syntax highlighting of the ABAP editor here...
But your approach is still good (though out of my scope here): I saw that I still missed some SQL functions in the long list of KEYWORDS, which shows me that maintaining it manually is already hard. On the other hand, writing a script/parser pulls in false positives. I don't know which is the lesser evil.
Okay - I found one more piece of ABAP documentation, for S/4 HANA Cloud, which contains more keywords that are not available in ABAP on-premise (https://help.sap.com/doc/abapdocu_cp_index_htm/CLOUD/en-US/ABENABAP_REFERENCE.html)...
I think I'll use my vacation to think about a parser. Where do you see a parser being located? Are there examples in this project?
I see, I can take a look. The other doc parsers are in tasks/builtins/*.rake. Some of them use simple regexes but the newer apache.rake (for httpd) one brings in Nokogiri to parse their XML docs. I would be more than happy to write one if there's a good source, and if the docs parser needs to hard-code a list of exceptions to filter out, that's fine by me too, presuming it's a smaller list that's easier to maintain.
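To make the "parser plus a small exception list" idea concrete, here is a rough regex-based sketch. It is not taken from the existing rake tasks: the HTML structure, the link targets, and the names `SAMPLE_HTML`, `EXCEPTIONS`, and `extract_keywords` are all hypothetical, and the real SAP pages will almost certainly need a different pattern (or Nokogiri).

```ruby
require 'set'

# Hypothetical stand-in for a downloaded docs page; the real SAP markup is
# not known here and will almost certainly differ.
SAMPLE_HTML = <<~HTML
  <ul>
    <li><a href="abapselect.htm">SELECT</a></li>
    <li><a href="abapwrite.htm">WRITE</a></li>
    <li><a href="abenchar.htm">CHAR</a></li>
    <li><a href="abencl.htm">CL_DBI_UTILITIES</a></li>
  </ul>
HTML

# Hand-maintained list of words the docs list but that should not be
# treated as keywords (examples taken from this thread).
EXCEPTIONS = Set.new(%w[CHAR CL_DBI_UTILITIES])

# Collect the text of every link that looks like an upper-case ABAP word,
# then drop the known exceptions.
def extract_keywords(html, exceptions)
  words = html.scan(%r{<a [^>]*>([A-Z][A-Z0-9_-]*)</a>}).flatten
  words.uniq.reject { |w| exceptions.include?(w) }.sort
end

keywords = extract_keywords(SAMPLE_HTML, EXCEPTIONS)
puts keywords.inspect
```

Keeping the exceptions in one small set means a false positive from the docs is fixed by adding a single word, without touching the generated list.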
Updated existing PR #2257 by