path.c: translate Windows paths recorded by Windows git on POSIX hosts #2107
johnnyshields wants to merge 2 commits into gitgitgadget:master
Invalid author email in 0324283: "27655+johnnyshields@users.noreply.github.com"
Force-pushed from 0324283 to bc012d1
Invalid author email in bc012d1: "27655+johnnyshields@users.noreply.github.com"
When `git worktree add` is run from native Windows, git writes
absolute paths into the worktree's `.git` file, into
`<commondir>/worktrees/<id>/gitdir`, and (when present) into
`<commondir>/commondir`, in `<x>:/...` or `<x>:\...` form. Reading
those files back from a non-Windows-native build of git fails because
neither form is meaningful on POSIX, so the worktree appears broken
even though every byte of it is reachable - the most common scenario
being a worktree on a Windows drive opened from inside WSL2 (where
the Windows filesystem is mounted at `/mnt/<x>/`) or from Cygwin/MSYS
(where it is `/cygdrive/<x>/`).
Add a small helper `translate_windows_path()` that recognises this
shape at the start of a path and rewrites it to the POSIX mount form
appropriate for the current build (`/cygdrive/<x>/` on Cygwin,
`/mnt/<x>/` everywhere else), converting any backslashes in the
remainder to forward slashes. Call it at the three places where
non-Windows-native git reads a recorded worktree-related path back
from disk:
* `read_gitfile_gently()` - the `gitdir:` line in a worktree's
`.git` file.
* `get_common_dir_noenv()` - the `commondir` file inside a
worktree's git directory, which points at the main repo.
* `get_linked_worktree()` - the `gitdir` file inside
`<commondir>/worktrees/<id>/`, which points at the worktree's
`.git` link.
Translation only happens for `<x>:/` or `<x>:\` where `<x>` is a
single ASCII letter; anything else is left alone. The helper is a
no-op on `GIT_WINDOWS_NATIVE` builds, where the input is already in
native form. On non-WSL Linux hosts the translation still produces
a syntactically valid POSIX path; if the corresponding `/mnt/<x>/`
mount does not exist, the next stat()/open() fails as it would have
without translation - i.e. the change cannot make a working
configuration stop working.
Add a `translate_windows_path` subcommand to the path-utils test tool
and cover it in `t/t0060-path-utils.sh`. The test fixtures pick the
expected prefix from the CYGWIN prereq so the same suite passes on
Linux and Cygwin builds.
Signed-off-by: johnnyshields <27655+johnnyshields@users.noreply.github.com>
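
The Windows-to-POSIX rule described in this commit message can be sketched roughly as below. This is an illustration and not the patch's code: the name `drive_to_mnt()`, the allocate-and-return interface, the hardcoded `/mnt/` prefix, and the lower-casing of the drive letter are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int is_ascii_letter(char c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/*
 * Hypothetical sketch, not the patch's code: return a newly allocated
 * copy of `path`, rewritten to "/mnt/<x>/..." when it starts with
 * "<x>:/" or "<x>:\". Any other input is copied unchanged.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *drive_to_mnt(const char *path)
{
	static const char prefix[] = "/mnt/";
	const size_t prefix_len = sizeof(prefix) - 1;
	size_t len, i;
	char *out, *p;

	assert(path != NULL);
	len = strlen(path);

	/* Only "<letter>:" followed by '/' or '\' is translated. */
	if (len < 3 || !is_ascii_letter(path[0]) || path[1] != ':' ||
	    (path[2] != '/' && path[2] != '\\')) {
		out = malloc(len + 1);
		if (out)
			memcpy(out, path, len + 1);
		return out;
	}

	/* prefix + drive letter + everything from the separator on + NUL */
	out = malloc(prefix_len + 1 + (len - 2) + 1);
	if (!out)
		return NULL;
	p = out;
	memcpy(p, prefix, prefix_len);
	p += prefix_len;
	/* Assumption: lower-case the letter, since WSL mounts drives as /mnt/c. */
	*p++ = (path[0] >= 'A' && path[0] <= 'Z') ?
		(char)(path[0] - 'A' + 'a') : path[0];
	/* Copy the remainder, converting backslashes to forward slashes. */
	for (i = 2; i < len; i++)
		*p++ = (path[i] == '\\') ? '/' : path[i];
	*p = '\0';
	return out;
}
```

Under these assumptions, `C:\repo\wt` becomes `/mnt/c/repo/wt`, while inputs such as `http://example.com` or `C:foo` come back unchanged because they do not match the `<x>:/` or `<x>:\` shape.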
Force-pushed from bc012d1 to 766de21
Invalid author email in 947de87: "27655+johnnyshields@users.noreply.github.com"
Invalid author email in 766de21: "27655+johnnyshields@users.noreply.github.com"
Force-pushed from 766de21 to 9ea2fc1
Invalid author email in 9ea2fc1: "27655+johnnyshields@users.noreply.github.com"
A `git worktree` created by git running under one runtime is in
general not openable by git running under another, because the paths
recorded in the worktree's metadata files (`<worktree>/.git`,
`<commondir>/worktrees/<id>/gitdir`, `<commondir>/commondir`) are
written in the originating runtime's form. The two common cases:
1. Worktree created from WSL2 (or Cygwin/MSYS), opened by native
Windows git. Recorded paths look like `/mnt/c/...` or
`/cygdrive/c/...` — not parseable by Win32 APIs.
2. Worktree created from native Windows, opened by WSL2 / Cygwin /
MSYS2 git. Recorded paths look like `C:/...` or `C:\...` — not
valid POSIX paths.
In either case the worktree appears broken even though every byte of
it is reachable from the reader.
Add a single helper `translate_windows_path()` that rewrites recorded
paths to the form expected by the current build. Direction is
selected at compile time, not at runtime:
* On `GIT_WINDOWS_NATIVE` builds, `/mnt/<x>/...` or
`/cygdrive/<x>/...` (where `<x>` is a single ASCII letter
followed by `/`, `\`, or end-of-string) are rewritten in place
to `<x>:/...`.
* On other builds, `<x>:/...` and `<x>:\...` are rewritten to the
mount form for this runtime: `/<x>/...` on MSYS2,
`/cygdrive/<x>/...` on real Cygwin, `/mnt/<x>/...` everywhere
else (the WSL2 default; harmless on hosts where `/mnt/<x>/` is
not a Windows-drive mount, because the translated path simply
fails to resolve, no worse than the unparseable input).
Backslashes in the remainder are normalised to forward slashes.
Multi-character segments (`/mnt/storage`, `/cygdrive/usr`) and
digit-prefixed mounts pass through untouched, so legitimate POSIX
paths under these prefixes are never disturbed.
Wire the helper into the four sites that read recorded worktree
path metadata:
* `read_gitfile_gently()` — the `gitdir:` line in a worktree's
`.git` file.
* `get_common_dir_noenv()` — the `commondir` file inside a
worktree's git directory.
* `get_linked_worktree()` — the `gitdir` file inside
`<commondir>/worktrees/<id>/`.
* `should_prune_worktree()` — re-reads the same file when deciding
prunability; without translation, a cross-runtime worktree
would be marked `prunable gitdir file points to non-existent
location` even when the listing succeeded.
Add tests:
* `t/t0060-path-utils.sh` exercises the helper directly via a new
`translate_windows_path` subcommand of `test-tool path-utils`,
covering both translatable shapes and shapes that must remain
untouched. The expected mount root is selected from `uname -s`,
so the suite passes on Cygwin, MSYS2, and Linux/WSL builds.
* `t/t0042-wsl-mnt-path.sh` (MINGW-gated) exercises all four
read sites end-to-end using a real worktree whose recorded paths
have been rewritten in `/mnt/<x>/` form, mimicking git running
inside WSL2.
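
The POSIX-to-Windows direction described for `GIT_WINDOWS_NATIVE` builds could look something like the following sketch. The function name `mnt_to_drive()`, the allocating interface (the commit message says the real helper rewrites in place), and the choice to emit `<x>:/` when the input ends right after the drive letter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int is_ascii_letter(char c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Length of a leading "/mnt/" or "/cygdrive/", or 0 if neither matches. */
static size_t mount_prefix_len(const char *path)
{
	if (!strncmp(path, "/mnt/", 5))
		return 5;
	if (!strncmp(path, "/cygdrive/", 10))
		return 10;
	return 0;
}

/*
 * Hypothetical sketch, not the patch's code: return a newly allocated
 * "<x>:/..." for "/mnt/<x>/..." or "/cygdrive/<x>/...", where <x> is a
 * single ASCII letter followed by '/', '\' or end-of-string. Anything
 * else (e.g. "/mnt/storage") is copied unchanged. Caller frees.
 */
static char *mnt_to_drive(const char *path)
{
	size_t n, tail_len;
	const char *tail;
	char *out;

	assert(path != NULL);
	n = mount_prefix_len(path);

	if (n && is_ascii_letter(path[n]) &&
	    (path[n + 1] == '/' || path[n + 1] == '\\' || path[n + 1] == '\0')) {
		/* Skip the separator after the drive letter, if there is one. */
		tail = path + n + 1 + (path[n + 1] ? 1 : 0);
		tail_len = strlen(tail);
		out = malloc(3 + tail_len + 1);
		if (!out)
			return NULL;
		out[0] = path[n];
		out[1] = ':';
		out[2] = '/';	/* assumption: always emit a forward slash */
		memcpy(out + 3, tail, tail_len + 1);
		return out;
	}

	/* Not a single-letter drive mount: pass through untouched. */
	tail_len = strlen(path);
	out = malloc(tail_len + 1);
	if (out)
		memcpy(out, path, tail_len + 1);
	return out;
}
```

With this sketch, `/mnt/c/repo/wt` maps to `c:/repo/wt` and `/cygdrive/d/x` to `d:/x`, while `/mnt/storage/x` and `/mnt/1/x` are returned as-is, matching the pass-through cases listed in the commit message.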
Force-pushed from 9ea2fc1 to 84281bf
Invalid author email in 84281bf: "27655+johnnyshields@users.noreply.github.com"
/allow
User johnnyshields already allowed to use GitGitGadget.
The Problem
Worktree `gitdir` paths do not work when a filesystem is shared between WSL Linux and Windows.
If I create a git worktree using Windows Git, then git inside WSL Linux cannot resolve its path, because the `gitdir` specifier will be a Windows-style path `C:\repo\...` rather than a POSIX-style `/mnt/c/repo...`.
The reverse also applies: Windows Git can't read `gitdir` from a worktree created in WSL.
As background, I use a hybrid Windows and WSL Ubuntu setup on the same machine; it works nearly perfectly except for the above issue, specifically with worktrees.
Root Cause
Git worktrees use a bi-directional link that includes absolute paths in the `gitdir` value to point from the main repo to the worktree repo, and vice versa. It is specifically this `gitdir` value that needs Windows-to-POSIX path translation handling. (Nothing else besides this path requires additional logic for worktrees to work across WSL and Windows native.)
When `git worktree add` is run from native Windows, git writes absolute paths into the worktree's `.git` file, into `<commondir>/worktrees/<id>/gitdir`, and (when present) into `<commondir>/commondir`, in `<x>:/...` or `<x>:\...` form. Reading those files back from a non-Windows-native build of git fails because neither form is meaningful on POSIX, so the worktree appears broken even though every byte of it is reachable - the most common scenario being a worktree on a Windows drive opened from inside WSL2 (where the Windows filesystem is mounted at `/mnt/<x>/`) or from Cygwin/MSYS (where it is `/cygdrive/<x>/`).

The Solution (What this Patch Does)
This patch adds a small helper `translate_windows_path()` that recognizes this shape at the start of a path.

* When `GIT_WINDOWS_NATIVE` is defined, this helper rewrites it to the Windows path `<x>:/`.
* When `GIT_WINDOWS_NATIVE` is not defined, this helper rewrites it to the POSIX mount form appropriate for the current build (`/cygdrive/<x>/` on Cygwin, `/<x>/` on MSYS2, `/mnt/<x>/` everywhere else), converting backslashes to forward slashes.

This helper is called everywhere git reads a recorded worktree-related path back from disk:

* `read_gitfile_gently()` - the `gitdir:` line in a worktree's `.git` file.
* `get_common_dir_noenv()` - the `commondir` file inside a worktree's git directory, which points at the main repo.
* `get_linked_worktree()` - the `gitdir` file inside `<commondir>/worktrees/<id>/`, which points at the worktree's `.git` link.

In the Windows-to-POSIX direction, translation only happens for `<x>:/` or `<x>:\` where `<x>` is a single ASCII letter; anything else is left alone. On `GIT_WINDOWS_NATIVE` builds, input that is already in native `<x>:/` form is left untouched. On non-WSL Linux hosts the translation still produces a syntactically valid POSIX path; if the corresponding `/mnt/<x>/` mount does not exist, the next stat()/open() fails as it would have without translation - i.e. the change cannot make a working configuration stop working.

Why this is Safe
This PR is safe because:

* It only translates paths that begin with `<x>:\`, where `<x>` is a single ASCII letter (e.g. `C:\`). Anything else (e.g. `http://`) is left alone.
* The helper also doesn't evaluate/collapse relative paths, so it doesn't introduce any path-traversal vectors that don't already exist today.

Known Limitations / Caveats
TL;DR: this PR is targeted at the pain point as it affects the 99% majority of "happy-path" Windows-WSL shared filesystem users. It is intended to degrade gracefully in the 1% of non-happy-path edge cases.

* This patch assumes users have `/mnt/<x>` as their Linux mountpoint for Windows drives, with `/mnt/` being hardcoded. This is a reasonable assumption because (1) every currently available WSL Linux distro uses `/mnt/` by default (even non-strict-FHS distros like NixOS), and (2) supposing a user renames their `/mnt/` mountpoint to something else, Git will simply degrade gracefully and not detect the cross-OS filesystem path. I don't think there's a real need to support this edge case, but if we are seriously concerned about it we could expose a git config such as `wslMountPath = /my/path/`.
* [...] `/c/` as `C:\`. Technically this could clash with `/c/` being a path on Linux, but the odds of this happening in practice are extremely remote: the user would have to (A) be using git on Windows, (B) have a worktree in a Windows-visible path, (C) have its parent repo in a completely separate Windows-invisible path, and (D) have an identical clashing path in their Windows filesystem.
* This patch uses `#ifdef` to change the guts of `translate_windows_path()` for POSIX vs. non-POSIX. Conceptually I think this works in this case, as it minimizes the number of call sites (essentially the method is "do the right thing depending on OS"), but if this way of coding does not fit the project's standards I am happy to split it into `translate_windows_to_posix_path()` and `translate_posix_to_windows_path()` functions. That will mean more `#ifdef`s are added downstream.

Should this be a PR for "Git on Windows" instead?
I think not, because supporting reads of Windows paths from WSL (Windows-to-POSIX) requires a patch to Linux Git. Since the equivalent POSIX-to-Windows patch lives in the exact same place in the code, it's best to do both together to avoid conflicts.
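
To make concrete why both directions sit in the same place, here is a rough sketch of compile-time selection behind one entry point. It is not taken from the patch: the `enum` and function names are hypothetical, and keying the mount root on the compiler-predefined `__MSYS__` and `__CYGWIN__` macros is an assumption about how the build might be detected. `GIT_WINDOWS_NATIVE` is the macro named in the description.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names, used only for this illustration. */
enum translate_direction {
	TRANSLATE_TO_WINDOWS,	/* "/mnt/<x>/..." -> "<x>:/..." */
	TRANSLATE_TO_POSIX	/* "<x>:/..."     -> "<root><x>/..." */
};

/* Direction is fixed when git is compiled, not decided at runtime. */
static enum translate_direction translate_direction(void)
{
#ifdef GIT_WINDOWS_NATIVE
	return TRANSLATE_TO_WINDOWS;
#else
	return TRANSLATE_TO_POSIX;
#endif
}

/*
 * Mount root used for the Windows-to-POSIX direction.
 * Assumption: __MSYS__ is checked before __CYGWIN__ because an MSYS2
 * toolchain may define both.
 */
static const char *drive_mount_root(void)
{
#if defined(__MSYS__)
	return "/";
#elif defined(__CYGWIN__)
	return "/cygdrive/";
#else
	return "/mnt/";
#endif
}
```

Callers would see a single function either way; splitting the helper into two direction-specific functions would move a check like the one in `translate_direction()` out to each call site.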