Serialization to seed & subsequent re-serialization to shares breaks shamir recover result #2
|
Yeah, it's to do with padding, but the reason seems to be that this library is badly broken/confused. In codex32 the threshold/id/index [...]

At least, that's my best interpretation of what's going on. The data structures in this library are a real mess and combine strings, [...]

I apologize for the state of this library -- for a long time I have intended to replace all the error-correction logic with rust-bech32 0.12, which will have codex32 support. (It will not have interpolation logic, but that's easy/small and I will continue to implement it here.) However, I let scope creep push rust-bech32 0.12 into doing error correction, which is still not done. There is a tracking issue at rust-bitcoin/rust-bech32#189.

Maybe I should just cut a release so that I can fix this library.
|
Oh, I'm not actually blocked on rust-bech32 0.12. Indeed, the docs for 0.11 have codex32 support as an example. |
|
The [...] Without these it won't round-trip and, worse, it recovers a different seed. Using default 0 padding on shares has some other problems: [...]
For the BIP85 codex32 application (and that compact QR idea) I set the padding bits from a CRC of the data to avoid this. That way they depend on 128 bits of unknown data, so they can't be assumed, and they are deterministic. A fast fix would be [...]
|
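The CRC-derived padding described in the comment above can be sketched roughly as follows. This is only an illustration under assumptions: `zlib.crc32` is an arbitrary stand-in (the CRC actually used is not stated in this thread), and `bytes_to_u5`, `u5_to_bytes` and `crc_pad` are hypothetical helper names, not functions from any library discussed here.

```python
import zlib


def bytes_to_u5(data: bytes, pad_val: int) -> list[int]:
    """Split bytes into 5-bit symbols; the spare low bits of the last symbol hold pad_val."""
    pad_len = -len(data) * 8 % 5  # 2 bits for 16 bytes, 4 bits for 32 bytes
    if not 0 <= pad_val < (1 << pad_len):
        raise ValueError("pad_val does not fit in the padding bits")
    acc = int.from_bytes(data, "big") << pad_len | pad_val
    n = (len(data) * 8 + pad_len) // 5
    return [(acc >> 5 * (n - 1 - i)) & 31 for i in range(n)]


def u5_to_bytes(symbols: list[int]) -> bytes:
    """Join 5-bit symbols back into bytes, discarding the trailing padding bits."""
    n_bits = len(symbols) * 5
    acc = 0
    for s in symbols:
        acc = acc << 5 | s
    return (acc >> n_bits % 8).to_bytes(n_bits // 8, "big")


def crc_pad(data: bytes) -> int:
    """Assumed scheme: take the low pad_len bits of CRC-32 over the data."""
    pad_len = -len(data) * 8 % 5
    return zlib.crc32(data) & ((1 << pad_len) - 1)


seed = bytes(range(16))
symbols = bytes_to_u5(seed, crc_pad(seed))
```

Every padding value decodes to the same bytes, so the choice of CRC only selects which of the alternate encodings is produced. It does not change the seed that is recovered from an initial string.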
If it's only possible to construct |
smart, and does the trick! (I tried with your
I agree this is much better than passing padding around |
Thanks, I just rewrote it to save 100 lines and put
To standardize CRC padding we should:
My library passes padding around, it tests the alternate encodings by encoding every [...]

def test_from_seed_and_alternates():
    """Test Vector 4: encode secret share from seed"""
    seed = bytes.fromhex(VECTOR_4["secret_hex"])
    for pad_val in range(0b1111 + 1):
        s = Codex32String.from_seed(seed, header="ms10leet", pad_val=pad_val)
        assert str(s) == VECTOR_4["secret_s_alternates"][pad_val]
        assert s.data == seed
    # confirm all 16 encodings decode to same master data

Given we leak a secret character if initial [...]
|
@BenWestgate if you have a rewrite feel free to open a PR -- if it's not too hard to review I'm happy to take it in. (Though "I saved 100 lines" makes me worry that it's a big diff.) But I think the correct direction to rewrite in is one where we add a rust-bech32 dependency, use that for all the checksumming and encoding stuff, and then here we add (a) utility methods, and (b) constructors and accessors for the id/threshold/share index. |
It's in python, about 400 lines excluding tests. Most of it from BIP-93, BIP-0173 or ported from this implementation. It also passes Bech32 tests.
This is exactly what I did in python: wrote a general Encoding class and then a Codex32String class with utility methods, constructors and properties.

from enum import Enum

# CHARSET, the *_GEN generator lists and the exception classes
# (InvalidChar, InvalidLength, ...) are defined elsewhere in the module.


class Encoding(Enum):
    """Enumeration type to list the various supported encodings."""

    CODEX32 = (CODEX32_GEN, 13, 0x10CE0795C2FD1E62A)
    CODEX32_LONG = (CODEX32_LONG_GEN, 15, 0x43381E570BF4798AB26)
    BECH32 = (BECH32_GEN, 6, 1)
    BECH32M = (BECH32_GEN, 6, 0x2BC830A3)

    def __init__(self, gen, cs_len, const):
        self.gen = gen
        self.cs_len = cs_len
        self.const = const

    def polymod(self, values: list[int], residue=1):
        """Internal function that computes the Bech32/Codex32 checksums."""
        shift = 5 * (self.cs_len - 1)
        mask = (1 << shift) - 1
        for value in values:
            top = residue >> shift  # was self.shift, which does not exist
            residue = ((residue & mask) << 5) ^ value
            for i, g in enumerate(self.gen):
                residue ^= g if ((top >> i) & 1) else 0
        return residue


def _verify_checksum(data):
    """Verify a checksum given HRP and converted data characters."""
    for spec in Encoding:
        if spec.polymod(data) == spec.const:
            return spec
    return None


def _create_checksum(values, spec: Encoding):
    """Compute the checksum values given HRP and data."""
    polymod = spec.polymod(values + [0] * spec.cs_len) ^ spec.const
    return [(polymod >> 5 * (spec.cs_len - 1 - i)) & 31 for i in range(spec.cs_len)]


def u5_to_bech32(data: list[int]):
    """Map list of 5-bit integers (0-31) -> bech32 data-part string."""
    if not all(x in range(32) for x in data):
        raise InvalidDataValue
    return "".join(CHARSET[d] for d in data)


def bech32_to_u5(bech: str):
    """Map bech32 data-part string -> list of 5-bit integers (0-31)."""
    bech = bech.lower()
    # The caller passes only the data part, so check every character
    # (the earlier bech[pos + 1 :] referenced an undefined name).
    if not all(x in CHARSET for x in bech):
        raise InvalidChar
    return [CHARSET.find(x) for x in bech]


def bech32_hrp_expand(hrp):
    """Expand the HRP into values for checksum computation."""
    return [ord(x) >> 5 for x in hrp] + [0] + [ord(x) & 31 for x in hrp]


def bech32_encode(hrp, data, spec):
    """Compute a Bech32 string given HRP and data values."""
    combined = data + _create_checksum(bech32_hrp_expand(hrp) + data, spec)
    return hrp + "1" + u5_to_bech32(combined)


def bech32_decode(bech: str):
    """Validate a Bech32 string, and determine HRP and data."""
    if any(ord(x) < 33 or ord(x) > 126 for x in bech):
        raise InvalidChar
    if bech.lower() != bech and bech.upper() != bech:
        raise InvalidCase
    bech = bech.lower()
    pos = bech.rfind("1")
    if pos < 1 or pos > 83 or pos + 7 > len(bech):  # or len(bech) > 90:
        raise InvalidLength
    hrp = bech[:pos]
    data = bech32_to_u5(bech[pos + 1 :])
    spec = _verify_checksum(bech32_hrp_expand(hrp) + data)
    if spec is None:
        raise InvalidChecksum
    return (hrp, data[: -spec.cs_len], spec)


def codex32_decode(codex: str):
    """Validate a Codex32/Long Codex32 string, and determine HRP and data."""
    hrp, data, spec = bech32_decode(codex)
    if spec not in (Encoding.CODEX32, Encoding.CODEX32_LONG):
        raise NotCodex32Checksum
    if 19 > len(data) or len(data) > 1023:
        raise InvalidLength
    # bech32_decode accepts all-uppercase input, so normalise the case
    # before inspecting the threshold and share-index characters.
    codex = codex.lower()
    if not codex[len(hrp) + 1].isdigit():
        raise InvalidThreshold
    if codex[len(hrp) + 1] == "0" and codex[len(hrp) + 6] != "s":
        raise InvalidShareIndex
    return hrp, data, spec

Then a [...] Should I PR this python implementation? By replacing the [...]
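As a quick sanity check of the parameterised polymod idea in the code above, here is a standalone sketch that exercises it with the Bech32 generator only. The codex32 generators are left out because they are not quoted in this thread. The function names here (`polymod`, `hrp_expand`, `create_checksum`, `residue_of`) are illustrative and are not the library's API; the two test strings are the shortest valid vectors from BIP-173 and BIP-350.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_GEN = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values, gen, cs_len, residue=1):
    """BCH checksum residue for a checksum of cs_len 5-bit symbols."""
    shift = 5 * (cs_len - 1)
    mask = (1 << shift) - 1
    for value in values:
        top = residue >> shift
        residue = ((residue & mask) << 5) ^ value
        for i, g in enumerate(gen):
            if (top >> i) & 1:
                residue ^= g
    return residue


def hrp_expand(hrp):
    """Expand the HRP into 5-bit values, as Bech32 does."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def create_checksum(hrp, data, gen, cs_len, const):
    """Checksum symbols that make the full string's residue equal const."""
    pm = polymod(hrp_expand(hrp) + data + [0] * cs_len, gen, cs_len) ^ const
    return [(pm >> 5 * (cs_len - 1 - i)) & 31 for i in range(cs_len)]


def residue_of(string, gen, cs_len):
    """Residue of a complete hrp + '1' + data string (case-insensitive)."""
    string = string.lower()
    pos = string.rfind("1")
    data = [CHARSET.index(c) for c in string[pos + 1:]]
    return polymod(hrp_expand(string[:pos]) + data, gen, cs_len)


bech32_residue = residue_of("A12UEL5L", BECH32_GEN, 6)    # Bech32 target constant
bech32m_residue = residue_of("A1LQFN3A", BECH32_GEN, 6)   # Bech32m target constant
checksum = "".join(CHARSET[v] for v in create_checksum("a", [], BECH32_GEN, 6, 1))
```

The only things that vary between encodings are the generator list, the checksum length and the target constant, which is why a single enum of `(gen, cs_len, const)` tuples can cover Bech32, Bech32m and both codex32 checksums.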
What about extracting bytes data from non-"s" strings? Raise an InvalidShareIndex error, or output useless bytes without their padding? The PR author gets an unexpected (but inevitable) result: he can't recover the original secret when derived shares are reconstructed from bytes extracted from the original derived shares. Should our libraries: [...]
|
|
maybe I misunderstood the scope of this application - @apoelstra can you check this last comment of mine BenWestgate/python-codex32#2 (comment) |
@BenWestgate I'm definitely in this category. If round-trips (even for derived shares) can be achieved, that should be a desired property for Codex32.
The problem you encounter is you can't construct 130-bits of data from 16 bytes for all 32 share indices because any math you do to pad, even constants like all 0s or all 1s, only works for k initial (aka encoded) strings. The derived (aka interpolated) strings break that padding because it was not a 5-bit value and so is not preserved by GF(32) interpolation the way the u5 checksum or header is. This means you need to know the last payload character for all the initial strings in order to add the correct padding to bytes extracted from derived strings. So a solution may need to define a
This allows a That's the theory anyway, the math is trickier as the design must not leave less than If you have any idea for |
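The claim that padding is not preserved by GF(32) interpolation can be checked numerically on the last payload symbol alone. The sketch below is a toy model, not codex32 itself: it assumes the Bech32 field (GF(32) modulo x^5 + x^3 + 1), a threshold of 2, and raw field elements 1, 2 and 3 as share indices rather than real codex32 index characters. The helper names are hypothetical.

```python
def gf32_mul(a: int, b: int) -> int:
    """Multiply two GF(32) elements modulo x^5 + x^3 + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b100000:
            a ^= 0b101001
    return r


def gf32_inv(a: int) -> int:
    """Multiplicative inverse by brute force (the field has only 31 units)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(32)")
    return next(x for x in range(1, 32) if gf32_mul(a, x) == 1)


def interpolate(shares: dict[int, int], x: int) -> int:
    """Lagrange-interpolate one symbol position at index x."""
    result = 0
    for xi, yi in shares.items():
        num = den = 1
        for xj in shares:
            if xj != xi:
                num = gf32_mul(num, x ^ xj)   # subtraction is XOR in GF(2^n)
                den = gf32_mul(den, xi ^ xj)
        result ^= gf32_mul(yi, gf32_mul(num, gf32_inv(den)))
    return result


# Last payload symbol of a 128-bit seed: 3 data bits followed by 2 padding bits.
zero_padded = [y for y in range(32) if y & 0b11 == 0]

# Pairs of zero-padded initial symbols whose derived symbol has non-zero padding.
violations = [
    (ya, yb)
    for ya in zero_padded
    for yb in zero_padded
    if interpolate({1: ya, 2: yb}, 3) & 0b11
]
```

`violations` is non-empty: symbols with zero padding bits form a GF(2)-linear subspace of the field but not a GF(32)-linear one, so multiplying by a Lagrange coefficient other than 0 or 1 can move a symbol out of that subspace. Re-encoding a derived share from its bytes with zero padding therefore produces a different last character for at least some inputs.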
I just noticed that if I import string share(s) into my app and serialize them for storage, where I store just the seed (aka data) and the metadata (hrp, idx, threshold, id) separately, and then load the stored data back into a Share object (via from_seed) and try to recover the secret, the shares from before serialization give a different result than the shares loaded from seed.
My guess is that this has something to do with padding. Any idea how to fix this issue?
I added test case proving the point: