BIP138: Compact Encryption Scheme for Non-seed Wallet Data #1951
pythcoiner wants to merge 2 commits
thanks for the review! will address comments tmr! |
In general nonce reuse is unsafe because if you make multiple backups over time, e.g. as you add more transaction labels, you would be reusing the nonce with a different message. By including the However it still seems unwise to mess with cryptographic standards. It doesn't seem worth the risk for saving 32 bytes on something that's going to be at least a few hundred bytes for a typical multisig. |
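A minimal sketch of the nonce-reuse concern described above. It uses a toy keystream built from SHAKE-256 as a stand-in for a real stream cipher (it is not ChaCha20, and the names `keystream` and `toy_encrypt` are hypothetical). The point is only that two backups encrypted under the same key and nonce share a keystream, so XORing the ciphertexts cancels it out and exposes the XOR of the plaintexts.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHAKE-256 over key || nonce. A stand-in, NOT ChaCha20."""
    return hashlib.shake_256(key + nonce).digest(length)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, message: bytes) -> bytes:
    """Stream-cipher style encryption: message XOR keystream."""
    return xor_bytes(message, keystream(key, nonce, len(message)))


key = bytes(32)    # illustrative all-zero key
nonce = bytes(12)  # the SAME nonce is used for both backups

backup_v1 = b"label: savings; tx1=rent"
backup_v2 = b"label: savings; tx1=gift"  # same length, later edit

c1 = toy_encrypt(key, nonce, backup_v1)
c2 = toy_encrypt(key, nonce, backup_v2)

# The keystream cancels: an observer holding only c1 and c2 learns m1 XOR m2.
leak = xor_bytes(c1, c2)
```

Here `leak` is zero wherever the two backups agree and non-zero where they differ, which is exactly the kind of information an observer of successive backups should not learn.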
|
Concept ACK, seems adjacent to how some lightning tools enable users to recover SCBs with just their seed to identify and decrypt the backup. Makes sense for descriptors to have something similar. |
|
Concept ACK |
|
(not yet finished addressing comments) |
|
Hi @pythcoiner,

By coincidence, two weeks ago I started working on a proposal for a "Standard Encrypted Wallet Payload" to be placed inside an "Encrypted Envelope". The "Wallet Payload" contains descriptors and metadata but can also act as a full wallet backup including transactions, UTXOs and addresses. The proposal is very much a work in progress.

I only just found this discussion so am reading through it to compare it to my proposal. The descriptor backup in the "Wallet Payload" of my proposal seems to have some overlap with the BIP proposed here. If there is too much overlap I may reconsider progressing with my proposal.

As mentioned, my proposal is very much a work in progress but the wallet payload proposal can be found here: https://gist.github.com/KeysSoze/7109a7f0455897b1930f851bde6337e3

Maybe jump to the test vector section to see what a basic backup of a descriptor and some metadata would look like prior to encryption. https://gist.github.com/KeysSoze/7109a7f0455897b1930f851bde6337e3#test-vectors

As my proposal is designed to be modular and extensible the encryption envelopes may be extended to offer Multiparty Encryption and Authentication. See:

I have already started documenting an encryption envelope that uses AES-256-GCM and password protection: https://gist.github.com/KeysSoze/866d009ccd082edf6802df240154b20d

I have not written a reference implementation yet but there are well established Python and Rust libraries for CBOR and COSE that should make implementing the BIPs relatively simple. |
Hi @KeysSoze, this work seems more related/parallel to the But I've adopted a slightly different approach by simply using JSON. FYI we already implemented this wallet backup format in Liana wallet and I plan to work on a BIP proposal relatively soon. |
|
Thanks @murchandamus for the review, I addressed your comments and reworked the spec:
|
|
Assigned BIP138 |
I just skimmed the proposal another time, this time without diving into the details.
Generally, this proposal is starting to get fairly mature.
I’d like to invite some people that commented in the Delving discussion or are working on similar topics to give it another review. E.g., @bigspider, @Sjors, @reardencode, @craigraw, @seedhammer, @achow101, and @darosior, but please feel free to ignore if you don’t have time.
I notice that there isn’t a dedicated backwards compatibility section and would like to ask for one to be added. Please feel free to squash my number assignment commit into your commit (no need to give me credit).
What is your perspective on this submission? Do you still have planned work? Do you have ideas for reviewers who should take another look before it is published?
| ``` | ||
| BIP: 138 | ||
| Layer: Applications | ||
| Title: Compact encryption scheme for non-seed wallet data |
Nit: Personally, I prefer the titles of BIPs to be title case as in the PR name, please feel free to disregard if you feel otherwise.
|
|
||
| ### Motivation | ||
|
|
||
| Losing the **wallet descriptor** (or **wallet policy**) is just as catastrophic as |
Losing the funds and privacy vs losing privacy does not seem equivalent. ;)
| Losing the **wallet descriptor** (or **wallet policy**) is just as catastrophic as | |
| Losing the **wallet descriptor** (or **wallet policy**) is almost as catastrophic as |
|
|
||
| This proposal targets output script descriptors (BIP-0380) and policies (BIP-0388), but the | ||
| scheme also works for labels (BIP-0329) and other wallet metadata like | ||
| [wallet backup metadata](https://github.com/bitcoin/bips/pull/2130). |
This should now perhaps be replaced by a link to BIP139
|
|
||
| ## Specification | ||
|
|
||
| Note: in the followings sections, the operator ⊕ refers to the bitwise XOR operation. |
Extra space
| Note: in the followings sections, the operator ⊕ refers to the bitwise XOR operation. | |
| Note: in the followings sections, the operator ⊕ refers to the bitwise XOR operation. |
|
BIP number hooray! I still plan to update my own implementation and then go through the spec here once more. |
|
Extending the encryption payload to things that are updated over time (as compared to static data like the descriptor/wallet policy) increases the risk of side channels. For example, a software wallet that updates the cloud-based backup every time a transaction is received would indeed leak a non-trivial amount of information to the hosting provider. This seems especially a concern since things like BIP-329 wallet labels are included in the backup in this proposal.

One thing that might help reduce the amount of information leaked is padding, and possibly a minimum size. If the typical size of the backup is expected to be - say - below 10kb, padding anything smaller to 10kb would make all encryptions indistinguishable. An approach to generalize to arbitrary sizes could be to round up to the smallest size

Some side channels depend rather on wallet behavior. Therefore, it is probably unavoidable to leave some of the responsibility for limiting the impact to wallet implementations. A paragraph explaining the risks and the recommendations might be useful. |
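A minimal sketch of the minimum-size idea from the comment above, applied to the plaintext before encryption. Everything concrete here is an assumption made for illustration and is not part of BIP138: the 4-byte big-endian length prefix, zero padding, the 10 KiB threshold (taken from the "say, below 10kb" example), and the names `pad_plaintext` / `unpad_plaintext`. The rounding rule for larger payloads is not sketched because the comment does not spell it out.

```python
import struct

MIN_SIZE = 10 * 1024  # illustrative threshold, from the "below 10kb" example


def pad_plaintext(data: bytes, min_size: int = MIN_SIZE) -> bytes:
    """Prefix the true length, then zero-pad up to min_size bytes.

    Hypothetical framing: 4-byte big-endian length || data || zero padding.
    Payloads already at or above min_size are only framed, not padded.
    """
    framed = struct.pack(">I", len(data)) + data
    if len(framed) < min_size:
        framed += bytes(min_size - len(framed))
    return framed


def unpad_plaintext(padded: bytes) -> bytes:
    """Recover the original data from the framing used by pad_plaintext."""
    if len(padded) < 4:
        raise ValueError("padded payload too short")
    (length,) = struct.unpack(">I", padded[:4])
    if length > len(padded) - 4:
        raise ValueError("declared length exceeds payload")
    return padded[4:4 + length]


small = pad_plaintext(b"wpkh(...)")
medium = pad_plaintext(b"x" * 5000)
```

With this framing, `small` and `medium` have the same length, so their ciphertexts would too (an AEAD adds a fixed overhead). The length prefix makes unpadding unambiguous even when the data itself ends in zero bytes.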
|
If we add padding or minimum size support, then I suggest we make it optional, and only recommend / require it for things that are updated over time. That way users who only want to back up their descriptors don't have to copy-paste giant blobs. @pythcoiner are you sure all test vectors have been updated? My agent thinks |
not yet updated, I'll finalize next week |
I agree on this, I'll have a look next week also for the padding |
Add the complete encryption and encoding layer for BIP138 encrypted backups:

Encryption:
- EncryptChaCha20Poly1305: encrypt with ChaCha20-Poly1305 AEAD
- DecryptChaCha20Poly1305: decrypt and verify authentication tag

Backup creation and encoding:
- CreateEncryptedBackup: create backup from descriptor and plaintext
- EncodeEncryptedBackup: encode to binary format
- EncodeEncryptedBackupBase64: encode to base64 string
- DecodeEncryptedBackup: decode from binary format
- DecodeEncryptedBackupBase64: decode from base64 string

Decryption:
- DecryptBackupWithKey: attempt decryption using a single public key
- DecryptBackupWithDescriptor: try all keys from a descriptor

The format uses 6-byte magic "BIP138", version byte, derivation paths, individual secrets (ci values for key recovery), and ChaCha20-Poly1305 encrypted payload containing content type metadata and user data.

The encryption-secret and full-backup test vectors intentionally use the assigned BIP138 domain tags and magic. At the time this was written, bitcoin/bips#1951 still contained stale BIPXXX-tagged expected values for these vectors, so the local vectors deviate from the current draft outputs until the BIP vectors are regenerated upstream.
Add the complete backup structure, binary/base64 encoding, backup creation, and decryption helpers for BIP138 encrypted backups.

The format uses the BIP138 magic, version byte, derivation paths, individual secrets, encryption algorithm, nonce, and ChaCha20-Poly1305 encrypted payload. The encrypted payload contains content type metadata followed by user data.

Add full-backup test vectors and roundtrip tests covering encode, decode, descriptor-derived encryption, descriptor-derived decryption, base64 encoding, and wrong-key failure.

The encryption-secret and full-backup test vectors intentionally use the assigned BIP138 domain tags and magic. At the time this was written, bitcoin/bips#1951 still contained stale BIPXXX-tagged expected values for these vectors, so the local vectors deviate from the current draft outputs until the BIP vectors are regenerated upstream.
|
I think we should have this BIP specify how descriptors are stored. They are the main use case for the encryption scheme and they're essential to it. There's also no other BIP that does this, BIP329 merely adds them as annotations.

Conversely I think BIP139 (#2130) is trying to do too much.

Here's a draft paragraph that I plan to implement in Sjors/bitcoin#109 for Bitcoin Core import and export:

#### BIP380 Descriptor Backup Content
When `CONTENT` is `TYPE = 0x01` with `DATA = 0x017c` (BIP380), `PLAINTEXT`
is UTF-8 encoded BIP380 descriptor backup content. It is either:
- A text descriptor backup document.
- A JSON descriptor backup document, if the first character is `{`.
Test vectors are in
[`bip380_descriptor_backup.txt`](./bip-0138/test_vectors/bip380_descriptor_backup.txt)
and
[`bip380_descriptor_backup.json`](./bip-0138/test_vectors/bip380_descriptor_backup.json).
Descriptor strings MUST NOT contain private key material and SHOULD include a
checksum.
##### Text Descriptor Backup Documents
A text descriptor backup document contains one BIP380 output script descriptor
per line. Empty lines and lines starting with `#` MUST be ignored. Each
descriptor MUST use BIP389 multipath key expressions, where `/<0;1>` means
receive and change, respectively.
##### JSON Descriptor Backup Documents
The descriptor backup document is a JSON object with the following fields:
- `version`: integer. This specification defines version `1`.
- `descriptor_sets`: array of descriptor set objects.
Each descriptor set describes BIP380 output script descriptors belonging to one
logical account or script family.
##### Descriptor Set Fields
`descriptor` is a required string containing a BIP380 output script descriptor.
For BIP389 multipath descriptors, `/<0;1>` means receive and change,
respectively. The `change_descriptor` field MUST NOT be used.
For descriptors without BIP389 multipath key expressions, `descriptor` is the
receive descriptor and `change_descriptor` is the change descriptor.
If optional boolean `archived` is `true`, importing wallets SHOULD NOT use the
descriptor set for new address generation.
`range` is an optional two-element array `[start, end]`, inclusive, describing
the ranged derivation indexes covered by the descriptor set.
`birth_time` is an optional integer Unix timestamp in seconds, indicating a
lower bound for when the descriptor set may have received funds. Importing
wallets MAY use this value as a scanning hint.
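A minimal sketch of how an importer might dispatch between the two document kinds in the draft paragraph above: JSON if the first character is `{`, otherwise one descriptor per line with empty lines and `#` lines ignored. The function name `parse_descriptor_backup` is hypothetical. Stripping surrounding whitespace before the checks is an assumption that the draft does not state, and the sketch does not verify descriptor checksums, BIP389 multipath use, or the absence of private keys.

```python
import json


def parse_descriptor_backup(plaintext: bytes):
    """Return ("json", document) or ("text", [descriptor, ...]).

    Follows the draft rule: UTF-8 content is a JSON document if its first
    character is "{", otherwise a text document with one descriptor per line.
    """
    text = plaintext.decode("utf-8")
    if text[:1] == "{":
        return ("json", json.loads(text))

    descriptors = []
    for raw_line in text.splitlines():
        line = raw_line.strip()  # assumption: ignore surrounding whitespace
        if line == "" or line.startswith("#"):
            continue  # empty lines and comment lines are ignored
        descriptors.append(line)
    return ("text", descriptors)


# "KEY" is a placeholder, not a real key expression.
example = b"# my wallet\n\nwpkh(KEY/<0;1>/*)\n"
kind, value = parse_descriptor_backup(example)
```

For the example input, `kind` is `"text"` and `value` holds the single descriptor line.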
Test vectors
# BIP380 descriptor backup text vector
wpkh([d34db33f/84h/1h/0h]tpubDC5FSnBiZDMmhiuCmWAYsLwgLYrrT9rAqvTySfuCCrgsWz8wxMXUS9Tb9iVMvcRbvFcAHGkMD5Kx8koh4GquNGNTfohfk7pgjhaPCdXpoba/<0;1>/*)
[
{
"description": "Descriptor with change descriptor",
"valid": true,
"document": {
"version": 1,
"descriptor_sets": [
{
"archived": false,
"birth_time": 1710000000,
"range": [0, 999],
"descriptor": "wpkh([d34db33f/84h/1h/0h]tpubDC5FSnBiZDMmhiuCmWAYsLwgLYrrT9rAqvTySfuCCrgsWz8wxMXUS9Tb9iVMvcRbvFcAHGkMD5Kx8koh4GquNGNTfohfk7pgjhaPCdXpoba/0/*)",
"change_descriptor": "wpkh([d34db33f/84h/1h/0h]tpubDC5FSnBiZDMmhiuCmWAYsLwgLYrrT9rAqvTySfuCCrgsWz8wxMXUS9Tb9iVMvcRbvFcAHGkMD5Kx8koh4GquNGNTfohfk7pgjhaPCdXpoba/1/*)"
}
]
}
},
{
"description": "BIP389 multipath descriptor",
"valid": true,
"document": {
"version": 1,
"descriptor_sets": [
{
"archived": false,
"birth_time": 1710000000,
"range": [0, 999],
"descriptor": "wpkh([d34db33f/84h/1h/0h]tpubDC5FSnBiZDMmhiuCmWAYsLwgLYrrT9rAqvTySfuCCrgsWz8wxMXUS9Tb9iVMvcRbvFcAHGkMD5Kx8koh4GquNGNTfohfk7pgjhaPCdXpoba/<0;1>/*)"
}
]
}
}
]
(I'd be fine with doing that in a separate BIP too, but it should be focussed on just encoding descriptors) |
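A minimal sketch of structural checks for the JSON descriptor backup document drafted above. The function names and the regex used to spot a BIP389 multipath expression are assumptions for illustration, not a real descriptor parser. It only checks the field rules stated in the draft: `version` is `1`, `descriptor` is a required string, `change_descriptor` is not combined with a multipath descriptor, and the optional fields have the stated types.

```python
import re

# Heuristic only: matches expressions such as <0;1>; not a BIP389 parser.
MULTIPATH = re.compile(r"<[^<>;]+(;[^<>;]+)+>")


def _is_int(value) -> bool:
    """True for integers, excluding booleans (bool is a subclass of int)."""
    return isinstance(value, int) and not isinstance(value, bool)


def check_descriptor_set(ds) -> list:
    """Return a list of problems found in one descriptor set object."""
    if not isinstance(ds, dict):
        return ["descriptor set must be an object"]
    desc = ds.get("descriptor")
    if not isinstance(desc, str):
        return ["descriptor is required and must be a string"]
    problems = []
    if MULTIPATH.search(desc) and "change_descriptor" in ds:
        problems.append("change_descriptor MUST NOT be used with multipath")
    if "archived" in ds and not isinstance(ds["archived"], bool):
        problems.append("archived must be a boolean")
    if "range" in ds:
        rng = ds["range"]
        if not (isinstance(rng, list) and len(rng) == 2
                and all(_is_int(v) for v in rng)):
            problems.append("range must be a two-element array [start, end]")
    if "birth_time" in ds and not _is_int(ds["birth_time"]):
        problems.append("birth_time must be an integer Unix timestamp")
    return problems


def check_document(doc) -> list:
    """Return a list of problems found in a descriptor backup document."""
    if not isinstance(doc, dict):
        return ["document must be a JSON object"]
    problems = []
    if doc.get("version") != 1:
        problems.append("unsupported version")
    sets = doc.get("descriptor_sets")
    if not isinstance(sets, list):
        return problems + ["descriptor_sets must be an array"]
    for i, ds in enumerate(sets):
        problems += [f"descriptor_sets[{i}]: {p}"
                     for p in check_descriptor_set(ds)]
    return problems


# "KEY" is a placeholder standing in for a real key expression.
with_change = {"version": 1, "descriptor_sets": [{
    "archived": False, "birth_time": 1710000000, "range": [0, 999],
    "descriptor": "wpkh(KEY/0/*)", "change_descriptor": "wpkh(KEY/1/*)"}]}
multipath = {"version": 1, "descriptor_sets": [{
    "descriptor": "wpkh(KEY/<0;1>/*)"}]}
```

Both example documents mirror the shape of the two test vectors above and produce an empty problem list. A document that pairs a multipath `descriptor` with a `change_descriptor` is reported.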
|
|
||
| #### Content | ||
|
|
||
| `CONTENT` is a variable length field defining the type of `PLAINTEXT` being encrypted, |
I find the current BIP text unclear about how multiple data blobs are supposed to be encoded. The term CONTENT is also too ambiguous.
Something like this would be better I think:
diff --git a/bip-0138.md b/bip-0138.md
index 2fbb2b2..1ed60bd 100644
--- a/bip-0138.md
+++ b/bip-0138.md
@@ -264,5 +264,5 @@ Implementations MUST reject empty payloads.
defined in `ENCRYPTION` where `PAYLOAD` is encoded following this format:
-`CONTENT` `PLAINTEXT`
+`CONTENT` `CONTENT` `...`
#### Integer Encodings
@@ -273,8 +273,8 @@ All variable-length integers are encoded as
#### Content
-`CONTENT` is a variable length field defining the type of `PLAINTEXT` being encrypted,
+`CONTENT` is a variable length field containing one piece of encrypted content,
it follows this format:
-`TYPE` (`LENGTH`) `DATA`
+`TYPE` `LENGTH` `DATA`
`TYPE`: 1-byte unsigned integer identifying how to interpret `DATA`.
@@ -288,16 +288,13 @@ it follows this format:
`LENGTH`: variable-length integer representing the length of `DATA` in bytes.
-For all `TYPE` values except `0x01`, `LENGTH` MUST be present.
-
`DATA`: variable-length field whose encoding depends on `TYPE`.
For `TYPE` values defined above:
- 0x00: parsers MUST reject the payload.
-- 0x01: `LENGTH` MUST be omitted and `DATA` is a 2-byte big-endian unsigned integer
- representing the BIP number that defines it.
+- 0x01: `DATA` MUST be at least 2 bytes. Its first two bytes are a big-endian unsigned
+ integer representing the BIP number that defines the remaining `DATA` bytes.
- 0x02: `DATA` MUST be `LENGTH` bytes of opaque, vendor-specific data.
-For all `TYPE` values except `0x01`, parsers MUST reject `CONTENT` if `LENGTH` exceeds
-the remaining payload bytes.
+Parsers MUST reject `CONTENT` if `LENGTH` exceeds the remaining payload bytes.
Parsers MUST skip unknown `TYPE` values less than `0x80`, by consuming `LENGTH` bytes
@@ -305,5 +302,5 @@ of `DATA`.
For unknown `TYPE` values greater than or equal to `0x80`, parsers MUST stop parsing
-`CONTENT`.[^type-upgrade]
+content items.[^type-upgrade]
[^type-upgrade]: **Why the 0x80 threshold?**
@@ -314,6 +311,7 @@ For unknown `TYPE` values greater than or equal to `0x80`, parsers MUST stop par
#### BIP380 Descriptor Backup Content
-When `CONTENT` is `TYPE = 0x01` with `DATA = 0x017c` (BIP380), `PLAINTEXT`
-is UTF-8 encoded BIP380 descriptor backup content. It is either:
+When `CONTENT` is `TYPE = 0x01` and `DATA` begins with `0x017c` (BIP380),
+the remaining `DATA` bytes are UTF-8 encoded BIP380 descriptor backup content.
+It is either:
- A text descriptor backup document.
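A minimal sketch of a parser for the `CONTENT` `CONTENT` `...` layout proposed in the diff above, where every item is `TYPE` `LENGTH` `DATA`. It reflects the reviewer's suggested wording, not necessarily the final BIP text. The variable-length integer encoding is not shown in this excerpt, so the length reader is passed in as a parameter, and `read_len_1byte` is a hypothetical single-byte placeholder used only to make the example runnable. The names `parse_contents` and `KNOWN_TYPES` are also illustrative.

```python
KNOWN_TYPES = {0x01, 0x02}  # 0x01: BIP-defined content, 0x02: vendor-specific


def read_len_1byte(buf: bytes, pos: int):
    """Hypothetical placeholder: reads LENGTH as one byte.

    The real varint encoding is defined in the BIP and is NOT reproduced here.
    """
    if pos >= len(buf):
        raise ValueError("truncated LENGTH")
    return buf[pos], pos + 1


def parse_contents(payload: bytes, read_length=read_len_1byte) -> list:
    """Parse a decrypted payload into (kind, bip_number, data) tuples."""
    if not payload:
        raise ValueError("empty payload")
    items = []
    pos = 0
    while pos < len(payload):
        type_ = payload[pos]
        pos += 1
        if type_ == 0x00:
            raise ValueError("TYPE 0x00: payload must be rejected")
        if type_ not in KNOWN_TYPES and type_ >= 0x80:
            break  # unknown TYPE >= 0x80: stop parsing content items
        length, pos = read_length(payload, pos)
        if length > len(payload) - pos:
            raise ValueError("LENGTH exceeds remaining payload bytes")
        data = payload[pos:pos + length]
        pos += length
        if type_ not in KNOWN_TYPES:
            continue  # unknown TYPE < 0x80: skip LENGTH bytes of DATA
        if type_ == 0x01:
            if length < 2:
                raise ValueError("TYPE 0x01 DATA must be at least 2 bytes")
            bip_number = int.from_bytes(data[:2], "big")
            items.append(("bip", bip_number, data[2:]))
        else:
            items.append(("vendor", None, data))
    return items


example = (
    bytes([0x01, 5, 0x01, 0x7C]) + b"abc"  # BIP380 content, 3 bytes of body
    + bytes([0x05, 2, 0xAA, 0xBB])         # unknown TYPE < 0x80: skipped
    + bytes([0x02, 1, 0xFF])               # vendor-specific data
    + bytes([0x90, 0x00])                  # unknown TYPE >= 0x80: stop
)
parsed = parse_contents(example)
```

In the example, `0x017c` decodes to 380, so the first item is reported as BIP380 content with body `b"abc"`, the unknown low-numbered item is skipped, and parsing stops at `0x90`.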
This is a BIP for encrypted backups: an encryption scheme for Bitcoin wallet-related metadata.
Mailing list post: https://groups.google.com/g/bitcoindev/c/5NgJbpVDgEc