use tee(2) to peek at pipes in order to avoid reading one byte at a time. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
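The idea of peeking at a pipe with tee(2) can be sketched as follows. This is only an illustration, not the code from the commit: `peek_pipe` and the scratch pipe are hypothetical, and the sketch assumes Linux, where tee(2) duplicates pipe data without consuming it.

```c
#define _GNU_SOURCE            /* tee(2) is Linux-specific */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Copy up to len bytes from the pipe in_fd into buf WITHOUT consuming them.
 * Returns the number of bytes peeked, 0 at end of input, or -1 on error.
 * peek_pipe is a hypothetical helper, not a function from dash.
 */
ssize_t peek_pipe(int in_fd, char *buf, size_t len)
{
	int scratch[2];
	ssize_t copied, got = -1;

	assert(buf != NULL);
	if (pipe(scratch) < 0)
		return -1;

	/* Duplicate the data into the scratch pipe; in_fd keeps its copy. */
	copied = tee(in_fd, scratch[1], len, 0);
	if (copied >= 0)
		got = copied ? read(scratch[0], buf, (size_t)copied) : 0;

	close(scratch[0]);
	close(scratch[1]);
	return got;
}
```

Because the data stays in the original pipe, a caller can inspect a whole chunk (for example, to find a newline) and then read(2) exactly the bytes it wants, rather than reading one byte at a time.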
Remove the meta detection in expandmeta and rely on the detection in expmeta instead. Replace the open-coded meta detection with one based on strpbrk. This is slightly inaccurate with bracket expressions but the difference is minor (only affecting patterns with an unquoted ']'). Move int_pending to the end of the loop so that it is only executed after some work has been done. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the directory pointer is not a directory, a symlink or an unknown entity, do not recurse into expmeta. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Calling pungetc upon PEOF must cause the next pgetc call to return PEOF. This was broken by the multi-byte pungetc patch. Fix it by adding the EOF logic to pgetc. Note that pungetn will always disregard the PEOF. Fixes: 2c92409 ("input: Allow MB_LEN_MAX calls to pungetc") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
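The contract described above can be modelled with a miniature input stream. This is a sketch under stated assumptions, not dash's input code: `struct input`, `in_getc` and `in_ungetc` are hypothetical names, and only a single level of pushback is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define PEOF (-1)   /* stand-in for the shell's end-of-input marker */

/* Hypothetical miniature input stream; not dash's real structure. */
struct input {
	const char *next;   /* next unread byte */
	size_t left;        /* bytes remaining */
	int lastc;          /* most recent character returned */
	int eof_pushed;     /* set when PEOF itself was pushed back */
};

int in_getc(struct input *in)
{
	if (in->eof_pushed) {       /* EOF logic: replay a pushed-back PEOF */
		in->eof_pushed = 0;
		return in->lastc = PEOF;
	}
	if (in->left == 0)
		return in->lastc = PEOF;
	in->left--;
	return in->lastc = (unsigned char)*in->next++;
}

void in_ungetc(struct input *in)
{
	assert(in != NULL);
	if (in->lastc == PEOF) {    /* there is no byte to back up over */
		in->eof_pushed = 1;
		return;
	}
	in->next--;                 /* ordinary character: step back one byte */
	in->left++;
}
```

The point of the flag is that PEOF has no byte in the buffer to step back over, so the getter has to remember it explicitly.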
Bail out of getmbc if the first character is PEOF. Fixes: 6c44f4e ("parser: Add support for multi-byte characters") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the rare case of a literal dollar sign to the end of the parsesub block. This eliminates a duplicate USTPUTC call. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eliminate the first chkeofmark branch by moving the CTLVAR to the end of the parsesub block and always doing STADJUST. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for $' quoting, including \u and \U. The code is shared with printf, so printf (both format and %b) will recognise the new escape codes (except \c) too. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When leading white spaces are detected in ifsbreakup ifsspc needs to be cleared. Reported-by: Martijn Dekker <martijn@inlv.org> Fixes: c0674f4 ("expand: Support multi-byte characters during field splitting") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On 22-06-2024 at 15:25, Martijn Dekker wrote:
> memrchr(3) is non-standard, and has been ported from glibc to FreeBSD, NetBSD
> and OpenBSD, but not to macOS, at least as of 12.7.5. So we need a test for
> it. As far as I can tell, *name is a zero-terminated C string, so it should
> work to use strrchr(3) as a fallback.

Reading the code more closely, that's nonsense, because 'p' does not point to the end of the string if metacharacters are found. Guess the best we can do is provide a simple local fallback implementation of memrchr(3). Patch v2 attached.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
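A simple local fallback of the kind described could look like the sketch below. It is not the code from the attached patch; the name `fallback_memrchr` is made up here, and a real build would presumably select it with a configure check.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for memrchr(3): find the last byte equal to c in s[0..n). */
void *fallback_memrchr(const void *s, int c, size_t n)
{
	const unsigned char *p;

	assert(s != NULL);
	p = (const unsigned char *)s + n;   /* one past the searched range */
	while (n--) {
		if (*--p == (unsigned char)c)
			return (void *)p;
	}
	return NULL;
}
```

Unlike strrchr(3), this takes an explicit length, so it still works when the end of the searched range is not the end of the string.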
MBCHAR should be preserved in argstr if the EXP_MBCHAR bit is set. This broke case statements. Reported-by: Martijn Dekker <martijn@inlv.org> Fixes: 6c44f4e ("parser: Add support for multi-byte characters") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function dollarsq_escape may read past the current escape code in order to provide enough data to the underlying escape code processing function. This is OK because we will call unget to return any unused characters. However, if this occurs at the end of a quoted string, this may prompt the user for more input which is wrong. Fix this by terminating the loop whenever we see a single quote. Even if this is an escaped single quote and thus does not indicate the end of the whole quoted string, it's still OK because no single escape code can continue after a single quote. Reported-by: наб <nabijaczleweli@nabijaczleweli.xyz> Fixes: 776424a ("parser: Add dollar single quote") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
117067 s1 = s2 True if the strings s1 and s2 are identical; otherwise, false.
117068 s1 != s2 True if the strings s1 and s2 are not identical; otherwise, false.
117069 s1 > s2 True if s1 collates after s2 in the current locale; otherwise, false.
117070 s1 < s2 True if s1 collates before s2 in the current locale; otherwise, false.

"identical" does not mean "collate equally"; this is the difference between sort | uniq and sort -u

Fixes: 597850a ("shell: Use strcoll instead of strcmp where applicable")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
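The distinction drawn in the quoted text can be sketched in C as below. The helper names are hypothetical and this is not the code in dash's test builtin; it only illustrates that identity is a byte comparison while ordering follows the locale.

```c
#include <assert.h>
#include <string.h>

/* s1 = s2: byte-for-byte identity, independent of the locale. */
int str_identical(const char *s1, const char *s2)
{
	assert(s1 != NULL && s2 != NULL);
	return strcmp(s1, s2) == 0;
}

/* s1 < s2: ordering according to the current locale's collation. */
int str_collates_before(const char *s1, const char *s2)
{
	assert(s1 != NULL && s2 != NULL);
	return strcoll(s1, s2) < 0;
}
```

In some locales two different strings can collate equally, so `strcoll(s1, s2) == 0` would be the wrong test for `=`.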
For everything but the first component of a pipeline, the input needs to be reset because it is no longer equal to that of the parent shell. Reported-by: arĉi <arcxi@dismail.de> Fixes: b1864ee ("input: Use lseek on stdin when possible") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For background jobs where the stdin is redirected to /dev/null, a reset_input may be needed in future. For the time being there is no reason to do this as all possible states for stdin will work correctly with /dev/null. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
117027 pathname1 −nt pathname2
117028 True if pathname1 resolves to an existing file and pathname2 cannot be resolved, or if
117029 both resolve to existing files and pathname1 is newer than pathname2 according to
117030 their last data modification timestamps; otherwise, false.
117031 pathname1 −ot pathname2
117032 True if pathname2 resolves to an existing file and pathname1 cannot be resolved, or if
117033 both resolve to existing files and pathname1 is older than pathname2 according to
117034 their last data modification timestamps; otherwise, false.

The correct output is

    $ [ 2024 -nt 2023 ] && echo yes
    yes
    $ [ 2023 -nt 2024 ] && echo yes
    $ [ 2023 -nt ENOENT ] && echo yes
    yes
    $ [ ENOENT -nt 2024 ] && echo yes

and

    $ [ 2024 -ot 2023 ] && echo yes
    $ [ 2023 -ot 2024 ] && echo yes
    yes
    $ [ 2023 -ot ENOENT ] && echo yes
    $ [ ENOENT -ot 2024 ] && echo yes
    yes

but dash currently returned only the first yes out of both blocks.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As can be seen in the `man` page for `el_set`, using `EL_PROMPT_ESC` for the op is the same as `EL_PROMPT`, but it allows escape characters to be expanded in the prompt the same way they are when used with `echo` or `printf(1)`. As far as I know, this is not specified by POSIX, but neither is the emacs editing mode (please correct me if I am wrong), so I think this is a justified change to make it align with the behaviour of `echo` and `printf(1)`. Given that this is not specified by POSIX, there isn't much of a precedent for what the value of the start/stop character should be. From what I have seen, 0o001 is common, so that is what I have included in the patch, but it may not be the most fitting. Taking a look at how ASCII defines its control characters, I believe any characters between 0o034 and 0o037 may be a more suitable choice, but this could be up for debate. Signed-off-by: Sebastien Peterson-Boudreau <sebastien.peterson.boudreau@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A lot of scripts (in particular, autoconf) rely on echo keeping undefined backslash sequences intact. Preserve this behaviour by only interpreting the few sequences required for dollar single quote. Reported-by: Дилян Палаузов <dilyan.palauzov@aegee.org> Fixes: 776424a ("parser: Add dollar single quote") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The jump table is unnecessarily large for a function that is not performance-critical. Move some of the cases out of the switch statement to reduce its size. Move the value = ch assignment to the common path. Merge the code for '\a', '\b' and '\f'. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When our own pmatch is used, loc2 is unused in scanleft/right when quotes is true. However, it is still needed when quotes is false. Fix the scanleft/right code so that loc2 is always updated (so it will be garbage when quotes is true) but only returned depending on the value of quotes. Fixes: c5bf970 ("expand: Add multi-byte support to pmatch") Reported-by: Johannes Altmanninger <aclopte@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With C23 and LTO, we get the following warning (or error if promoted to such):
```
src/builtins.c:28:5: error: type of ‘timescmd’ does not match original declaration [-Werror=lto-type-mismatch]
28 | int timescmd(int, char **);
| ^
src/bltin/times.c:15:5: note: type mismatch in parameter 1
src/bltin/times.c:15:5: note: type ‘void’ should match type ‘int’
```
Make the two consistent. This didn't show up before because pre-C23
had unprototyped functions.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Johannes Altmanninger <aclopte@gmail.com> wrote:
> I noticed another regression in c5bf970 (expand: Add multi-byte
> support to pmatch, 2024-06-02).
>
> This command now prints "abc-def" but used to print "ef".
>
> x=abc-def
> y="${x##*d}"
> echo "$y"

Fix this by setting s to the correct value in scanright based on FNMATCH_IS_ENABLED.

Fixes: c5bf970 ("expand: Add multi-byte support to pmatch")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
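For reference, the expected result of `${x##*d}` (remove the longest matching prefix) can be reproduced with fnmatch(3). This sketch is not dash's scanright; `strip_longest_prefix` is a hypothetical helper that simply tries candidate prefixes from longest to shortest.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the part of str that remains after removing the longest prefix
 * matching pat (the ${x##pat} operation), or str itself if nothing matches.
 */
const char *strip_longest_prefix(const char *str, const char *pat)
{
	size_t len, n;
	char *prefix;
	const char *rest = str;

	assert(str != NULL && pat != NULL);
	len = strlen(str);
	prefix = malloc(len + 1);
	if (prefix == NULL)
		return str;

	/* Try the longest candidate prefix first, then shrink from the right. */
	for (n = len + 1; n-- > 0; ) {
		memcpy(prefix, str, n);
		prefix[n] = '\0';
		if (fnmatch(pat, prefix, 0) == 0) {
			rest = str + n;
			break;
		}
	}
	free(prefix);
	return rest;
}
```

With `str` set to `abc-def` and `pat` set to `*d`, the first candidate that matches is `abc-d`, so the remainder is `ef`, which is the output the report expects.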
Jan Pechanec <Jan.Pechanec@oracle.com> wrote:
>
> thank you for working on dash. I was testing it recently and it worked
> really well.
>
> However, I noticed the dash code from github does filename pattern
> matching even for code like "[ x = x ] && echo ok". I believe the
> unquoted space after '[' should not trigger pattern matching but rather
> only to invoke the test/[ utility, as before. It seems it works fine
> though and only doing some extra unneeded work which may not be
> immediately noticeable.
>
> dash installed on my Oracle Linux 9:
>
> janp:len49:~/_INST/dash$ strings /usr/bin/dash | grep dash
> dash-0.5.11.5-4.el9.x86_64.debug
> janp:len49:~/_INST/dash$ time dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
>
> real 0m0.752s
> user 0m0.748s
> sys 0m0.002s
>
> dash from github (commit b3e38ad) take
> way more time to do the same thing:
>
> janp:len49:~/_INST/dash$ time ./src/dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
>
> real 0m4.202s
> user 0m1.361s
> sys 0m2.804s
>
> For the latter, strace shows open, fstat, getdents*, and close system
> calls for each iteration and it depends on number of files in the
> current directory. With more files, it takes more time:
>
> janp:len49:/etc$ time ~/_INST/dash/src/dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
> real 0m15.591s
> user 0m5.704s
> sys 0m9.828s
>
> If I change [ to test, the dash github version behaves as before, and
> possibly even faster:
>
> janp:len49:~/_INST/dash$ time ~/_INST/dash/src/dash -c 'i=0; while :; do : $((i=i+1)); test $i -eq 500000 && break; done'
>
> real 0m0.662s
> user 0m0.659s
> sys 0m0.002s
>
> Even bash would be faster than the current github version of dash:
>
> janp:len49:~/_INST/dash$ time bash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
> real 0m1.943s
> user 0m1.939s
> sys 0m0.002s

Fix performance regression for idiomatic "[ ... ]" expression by adding a bypass for a literal "]" in pathname expansion.

Reported-by: Jan Pechanec <Jan.Pechanec@oracle.com>
Fixes: 8d0eca2 ("expand: Rewrite expmeta meta detection")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
…g in pmatch strpbrk() accepts two null-terminated string arguments. stop[] is a char array that is not null-terminated but is still passed as the second argument to strpbrk. This causes a buffer overread, which is detected by AddressSanitizer. This commit adds an explicit null terminator to the end of the array. Signed-off-by: Zurab Kvachadze <zurabid2016@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the stop array closer to the strpbrk(3) call in pmatch. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
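The overread and its fix come down to how the accept set is declared. A minimal sketch, assuming an illustrative set of characters (the set pmatch really uses may differ) and a hypothetical wrapper `first_special`:

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative stop set; the characters dash actually uses may differ.
 * A brace-initialised char array gets no implicit terminator, so the
 * trailing '\0' is what makes it a valid second argument to strpbrk(3).
 */
static const char stop[] = { '*', '?', '[', '\\', '\0' };

/* Return the first pattern-special character in s, or NULL if none. */
const char *first_special(const char *s)
{
	assert(s != NULL);
	return strpbrk(s, stop);
}
```

Without the final `'\0'`, strpbrk would keep reading past the end of `stop` until it happened to find a zero byte, which is the access AddressSanitizer flags.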
Ensure that the EOF state is reset in reset_input as otherwise the new stdin may be treated as empty. Reported-by: Nathan Royce <nroycea+kernel@gmail.com> Fixes: 69786bc ("input: Fix pungetc on PEOF") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As pointed out by Denys Vlasenko, we can avoid blocking signals on vfork() by making the signal handler of a vfork child immediately return. This saves a syscall. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
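The approach can be sketched as a handler that checks a flag before touching any shell state. This is only an illustration: the variable names, the counter and `install_handler` are assumptions, and the code that sets the flag in the child and clears it again in the parent is not shown.

```c
#define _POSIX_C_SOURCE 200809L   /* for sigaction */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t vforked;       /* non-zero while in a vfork child */
static volatile sig_atomic_t pending_sigs;  /* signals recorded for the shell */

/* The vfork child shares memory with its parent, so it must not record. */
static void on_signal(int signo)
{
	(void)signo;
	if (vforked)
		return;         /* child: leave the parent's state alone */
	pending_sigs++;
}

/* Install on_signal for signo; returns 0 on success, -1 on error. */
int install_handler(int signo)
{
	struct sigaction sa;

	assert(signo > 0);
	sa.sa_handler = on_signal;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	return sigaction(signo, &sa, NULL);
}
```

Since the handler becomes a no-op in the child, there is no need to block signals around vfork() to protect the shared state, which is where the saved syscall comes from.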
Recent versions of VxWorks support fork() and as a result can support dash.
For example, to cross compile for IA with this patch applied, and your VSB environment sourced (aka sysroot):
./configure --build=x86_64-pc-linux-gnu --host=x86_64-wrs-vxworks --prefix=/usr \
CC=wr-cc CXX=wr-c++ LD=wr-ld AR=wr-ar NM=wr-nm OBJCOPY=wr-objcopy OBJDUMP=wr-objdump RANLIB=wr-ranlib READELF=wr-readelf SIZE=wr-size STRIP=wr-strip \
ac_cv_func_faccessat=no \
CFLAGS="-DJOBS=0 "
make install DESTDIR=${VSB}/usr/3pp/develop
For other architectures update your <host> appropriately.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
procargs(int argc, char **argv)
argc is used in just one place:
if (argc > 0)
xargv++;
Trivially replaceable by if(xargv[0] != NULL), so can avoid passing
this argument.
char **xargv;
xargv = argv;
xargv is always equal to argv, so why have a separate variable?
const char *xminusc;
xminusc = minusc;
Similar situation with xminusc being equal to minusc
during the range where it is live, they diverge here:
if (xminusc) {
minusc = *xargv++;
but after this, xminusc is not used.
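The first observation relies on argv being NULL-terminated. A small sketch of the replacement test, using a hypothetical helper `skip_argv0` rather than the actual patched procargs:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Skip argv[0] without consulting argc: argv is NULL-terminated, so
 * argv[0] != NULL carries the same information as argc > 0.
 */
char **skip_argv0(char **argv)
{
	assert(argv != NULL);
	if (argv[0] != NULL)
		argv++;
	return argv;
}
```

The same reasoning removes the need for the separate xargv copy: the function can advance its own argv parameter directly.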
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Michael Greenberg <michael.greenberg@stevens.edu>
There are now three quoting modes in pretty printing: unquoted, quoted (only escape special characters, including `"`), and heredoc (only escape special characters, excluding `"`). We no longer treat `!` as a character that needs escaping, which is correct on non-interactive shells but will break interactive scripts. Longer term, the pretty printer needs to know who is consuming its output. But right now the only real client is non-interactive, so here we are.
Directly invoking `setup.py` was causing a build failure on macOS; using `pip3` solves the problem. Signed-off-by: Michael Greenberg <michael.greenberg@stevens.edu>
Fix up escaping of `$`; revise tests to support. --------- Signed-off-by: Bolun Thompson <bolunthompson@ucla.edu>
Fix: Nested shell in subshell Signed-off-by: Bolun Thompson <bolunthompson@ucla.edu>
Trying to fix the dune build... and recover from some old squash merges that make rebasing history hard.