dffs

Pipe and diff files: execute shell pipelines, diff/compare/join results.

Apply the same transform to both sides of a comparison, then compare the results, like Haskell's on (from Data.Function):

a ─→ [f] ─→ f(a) ─┐
                    ├─→ [cmp] ─→ result
b ─→ [f] ─→ f(b) ─┘

Three CLIs apply this pattern with different comparators:

| CLI | cmp | Description |
|---|---|---|
| git-diff-x | diff | Diff a Git-tracked file at two commits, through a pipeline |
| diff-x | diff | Diff two files through a pipeline |
| comm-x | comm | Set operations on two files through a pipeline |
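
The pattern can be approximated in plain shell. This is only a minimal sketch: diff_on is a hypothetical helper (not part of dffs), the sample files are made up, and the real CLIs may work quite differently internally.

```shell
# Hypothetical helper, not part of dffs: run the same command over two files,
# then diff the two outputs.
diff_on() {
  cmd=$1 left=$2 right=$3
  out_left=$(mktemp) || return 2
  out_right=$(mktemp) || return 2
  sh -c "$cmd" < "$left" > "$out_left"
  sh -c "$cmd" < "$right" > "$out_right"
  status=0
  diff "$out_left" "$out_right" || status=$?
  rm -f "$out_left" "$out_right"
  return "$status"
}

# Made-up sample inputs: the same lines in a different order.
work=$(mktemp -d)
printf 'b\na\n' > "$work/left.txt"
printf 'a\nb\n' > "$work/right.txt"

# Identical once both sides are sorted, so diff prints nothing and exits 0.
diff_on sort "$work/left.txt" "$work/right.txt" && echo "same after sort"
```

Swapping sort for another command changes f in the diagram; swapping diff for comm changes the comparator.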

Install

pip install dffs

CLIs

git-diff-x

Diff a Git-tracked file at two commits (or one commit vs. the worktree), after piping both versions through a command pipeline.

Examples

JSON: readable diffs of compact files

A config file is stored as compact JSON (one line, no whitespace). In commit 38856a0, a few fields were updated, but git diff is unreadable:

-{"name":"myapp","version":"1.2.3","settings":{"debug":false,"logLevel":"info","maxRetries":3,...}}
+{"name":"myapp","version":"1.3.0","settings":{"debug":false,"logLevel":"warn","maxRetries":3,...}}

git diff-x with jq pretty-prints both sides before diffing, so you see exactly what changed:

git diff-x -R 38856a0 'jq .' example/config.json
# 3c3
# <   "version": "1.2.3",
# ---
# >   "version": "1.3.0",
# 6c6
# <     "logLevel": "info",
# ---
# >     "logLevel": "warn",
# 11c11
# <       "analytics": false
# ---
# >       "analytics": true

Pretty-printing also normalizes whitespace differences. Note that jq . preserves key order; use jq -S . (--sort-keys) to normalize key ordering as well, so that only actual value changes appear.

CSV: ignoring row order

A CSV's rows were reordered in commit 7f0c468:

git diff --stat 7f0c468^..7f0c468 -- example/data.csv
# example/data.csv | 4 ++--
# 1 file changed, 2 insertions(+), 2 deletions(-)

git diff shows a noisy patch, but sorting first reveals the data is unchanged:

git diff-x -R 7f0c468 sort example/data.csv
# (no output — the sorted contents are identical)

CSV: comparing column names

In commit 0bdec95, the city column was replaced with role. Extract and compare just the header:

git diff-x -R 0bdec95 "head -1 | tr , '\n'" example/data.csv
# 3c3
# < city
# ---
# > role
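
To see what each side looks like after the pipeline, here is a small standalone sketch using only head, tr, and diff. The CSV contents (including the name and age columns) are made up for illustration; only the city to role change comes from the example above.

```shell
# Made-up stand-ins for the old and new versions of the CSV.
work=$(mktemp -d)
printf 'name,age,city\nAlice,30,NYC\n' > "$work/old.csv"
printf 'name,age,role\nAlice,30,admin\n' > "$work/new.csv"

# Keep only the header row and put each column name on its own line.
head -1 "$work/old.csv" | tr , '\n' > "$work/old.cols"
head -1 "$work/new.csv" | tr , '\n' > "$work/new.cols"

# diff exits 1 when the inputs differ, hence the `|| true`.
diff "$work/old.cols" "$work/new.cols" || true
# 3c3
# < city
# ---
# > role
```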

Comparing line counts

Compare the line count (wc -l) of this README before and after commit 8b7a761:

git-diff-x -R 8b7a761 'wc -l' README.md
# 1c1
# < 16
# ---
# > 206

More examples
# Compare the number of lines (`wc -l`) in file `foo` at the previous vs. current commit
# (`-R HEAD` is equivalent to `-r HEAD^..HEAD`).
git diff-x -R HEAD wc -l foo

# Colorized (`-c`) diff of `md5sum`s of `foo`, at HEAD (last committed value) vs. the current
# worktree content.
git diff-x -c md5sum foo

# Use `-` to separate pipeline commands from paths (when more than one path is to be diffed),
# e.g. this compares the largest 10 numbers in `file{1,2}` (HEAD vs. worktree):
git diff-x 'sort -rn' head - file1 file2
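
For intuition, the -R comparison can be approximated with plain git show. This sketch assumes git-diff-x reads the file at the commit's parent and at the commit itself; the README does not show its implementation, so treat this as an illustration only. The throwaway repo, the g helper, and the file contents are all made up.

```shell
# Throwaway repo with two commits to the same file (contents are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q
# Hypothetical helper: supply an identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }
printf 'a\nb\nc\n' > foo
g add foo
g commit -q -m 'three lines'
printf 'a\nb\nc\nd\ne\n' > foo
g commit -q -am 'five lines'

# Rough equivalent of `git diff-x -R HEAD wc -l foo`:
# run the pipeline on the file at <ref>^ and at <ref>, then diff the outputs.
ref=HEAD
git show "$ref^:foo" | wc -l > before.txt
git show "$ref:foo" | wc -l > after.txt
diff before.txt after.txt || true   # diff exits 1 when the counts differ
```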

Usage

git-diff-x
# Usage: git-diff-x [OPTIONS] [exec_cmd...] [<path> | - [paths...]]
#
#   Diff files at two commits, or one commit and the current worktree, after
#   applying an optional command pipeline.
#
#   Examples:
#
#   # Compare the number of lines (`wc -l`) in file `foo` at the previous vs.
#   current commit (`-r HEAD^..HEAD`):
#
#   git diff-x -r HEAD^..HEAD wc -l foo
#
#   # Colorized (`-c`) diff of `md5sum`s of `foo`, at HEAD (last committed
#   value) vs. the current worktree content:
#
#   git diff-x -c md5sum foo
#
#   # Use `-` to separate pipeline commands from paths (when more than one path
#   is to be diffed), e.g. this compares the largest 10 numbers in `file{1,2}`
#   (HEAD vs. worktree):
#
#   git diff-x 'sort -rn' head - file1 file2
#
# Options:
#   -c, --color / --no-color     Colorize the output (default: auto, based on
#                                TTY)
#   -r, --refspec TEXT           <commit 1>..<commit 2> (compare two commits) or
#                                <commit> (compare <commit> to the worktree)
#   -R, --ref TEXT               Diff a specific commit; alias for `-r
#                                <ref>^..<ref>`
#   -t, --staged                 Compare HEAD vs. staged changes (index)
#   -P, --pipefail               Check all pipeline commands for errors (like
#                                bash's `set -o pipefail`); default only checks
#                                last command
#   -s, --shell-executable TEXT  Shell to use for executing commands; defaults
#                                to $SHELL
#   -S, --no-shell               Don't pass `shell=True` to Python
#                                `subprocess`es
#   -U, --unified INTEGER        Number of lines of context to show (passes
#                                through to `diff`)
#   -V, --version                Show version and exit
#   -v, --verbose                Log intermediate commands to stderr
#   -w, --ignore-whitespace      Ignore whitespace differences (pass `-w` to
#                                `diff`)
#   -x, --exec-cmd TEXT          Command(s) to execute before invoking `diff`;
#                                alternate syntax to passing commands as
#                                positional arguments
#   --help                       Show this message and exit.
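
The -P/--pipefail option is described in terms of bash's set -o pipefail. The sketch below shows that bash behavior in isolation (it does not invoke dffs): without pipefail, a failure early in a pipeline is masked by a later command that succeeds.

```shell
# Default: the pipeline's status is that of the last command (`cat`),
# so the failing `false` goes unnoticed.
default_status=0
bash -c 'false | cat' || default_status=$?

# With pipefail, a failing command anywhere in the pipeline makes it fail.
pipefail_status=0
bash -c 'set -o pipefail; false | cat' || pipefail_status=$?

echo "default: $default_status, pipefail: $pipefail_status"
# default: 0, pipefail: 1
```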

diff-x

diff-x is the underlying building block: the same concept, applied to two arbitrary files rather than Git commits.

Examples

Given two similar JSON objects, where one is compact and the other is pretty-printed:

echo '{"a":1,"b":2}' > 1.json
echo '{"a":1,"b":3}' | jq . > 2.json

diff {1,2}.json outputs the entirety of both objects:

1c1,4
< {"a":1,"b":2}
---
> {
>   "a": 1,
>   "b": 3
> }

diff-x 'jq .' {1,2}.json pretty-prints each side before diffing:

3c3
<   "b": 2
---
>   "b": 3

Usage

diff-x
# Usage: diff-x [OPTIONS] [exec_cmd...] <path1> <path2>
#
#   Diff two files after running them through a pipeline of other commands.
#
# Options:
#   -c, --color / --no-color     Colorize the output (default: auto, based on
#                                TTY)
#   -P, --pipefail               Check all pipeline commands for errors (like
#                                bash's `set -o pipefail`); default only checks
#                                last command
#   -s, --shell-executable TEXT  Shell to use for executing commands; defaults
#                                to $SHELL
#   -S, --no-shell               Don't pass `shell=True` to Python
#                                `subprocess`es
#   -U, --unified INTEGER        Number of lines of context to show (passes
#                                through to `diff`)
#   -V, --version                Show version and exit
#   -v, --verbose                Log intermediate commands to stderr
#   -w, --ignore-whitespace      Ignore whitespace differences (pass `-w` to
#                                `diff`)
#   -x, --exec-cmd TEXT          Command(s) to execute before invoking `diff`;
#                                alternate syntax to passing commands as
#                                positional arguments
#   --help                       Show this message and exit.

comm-x

comm performs set intersection/difference; comm-x lets you run a pipeline on each input first.

Examples

Given two similar lists of numbers, but in different orders:

seq 10 > 1.txt
seq 10 -2 0 > 2.txt

comm produces misleading output here, because it expects both inputs to be sorted:

comm 1.txt 2.txt
# 1
# 	10
# 2
# 3
# 4
# 5
# 6
# 7
# 		8
# comm: file 2 is not in sorted order
# 	6
# 	4
# 	2
# 	0
# 9
# comm: file 1 is not in sorted order
# 10
# comm: input is not in sorted order

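comm-x sort sorts each file first. Column 1 holds lines unique to 1.txt, column 2 lines unique to 2.txt, and column 3 lines common to both (sort is lexicographic, so 10 precedes 2):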

comm-x sort 1.txt 2.txt
# 	0
# 1
# 		10
# 		2
# 3
# 		4
# 5
# 		6
# 7
# 		8
# 9
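
For comparison, here is roughly the same operation done by hand with plain sort and comm, narrowed to the intersection with comm's standard -12 flags. This is a sketch with temporary files; whether comm-x forwards such flags is not covered here.

```shell
# Use one collation for both sort and comm so they agree on ordering.
export LC_ALL=C

work=$(mktemp -d)
seq 10 > "$work/1.txt"
seq 10 -2 0 > "$work/2.txt"

# Sort each input, as `comm-x sort` would before comparing.
sort "$work/1.txt" > "$work/1.sorted"
sort "$work/2.txt" > "$work/2.sorted"

# -12 suppresses columns 1 and 2, leaving only the lines common to both files.
comm -12 "$work/1.sorted" "$work/2.sorted"
# 10
# 2
# 4
# 6
# 8
```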

Shell Integration

To get convenient aliases, add this to your ~/.bashrc or ~/.zshrc:

eval "$(dffs-shell-integration bash)"

This provides aliases with the following suffix conventions:

  • c = color, n = no-color, w = ignore-whitespace
  • r = ref (-R, compare commit to parent), s = refspec (-r), t = staged (--staged)

Examples

| Alias | Expands to |
|---|---|
| dx | diff-x |
| dxc | diff-x --color |
| cx | comm-x |
| gdx | git diff-x |
| gdxc | git diff-x --color |
| gdxr | git diff-x -R (compare commit to parent) |
| gdxs | git diff-x -r (explicit refspec) |
| gdxt | git diff-x --staged |
| gdxcr | git diff-x --color -R |
| gdxtw | git diff-x --staged -w |

To see all available aliases:

dffs-shell-integration bash

To load only aliases for a specific command:

eval "$(dffs-shell-integration bash git-diff-x)"
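
The eval "$(...)" idiom works because the command prints shell code, which eval then runs in the current shell. A minimal sketch of the mechanism follows; emit_aliases is a hypothetical stand-in, and the real output of dffs-shell-integration may look different.

```shell
# Hypothetical stand-in for `dffs-shell-integration bash`:
# prints alias definitions as shell code.
emit_aliases() {
  printf "alias dx='diff-x'\n"
  printf "alias dxc='diff-x --color'\n"
}

# Evaluate the printed definitions in the current shell.
eval "$(emit_aliases)"

# Show one of the resulting definitions.
alias dxc
```

Running the generator without eval, as in the "see all available aliases" command above, simply prints the definitions for inspection.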
