Board: Electrum
Topic: Re: How much entropy is lost by searching for a '01' prefix SHA512 output
by bitsec731 on 10/02/2017, 12:31:52 UTC
Quote: "Again, you are wrong."


I am sorry to have to say this bluntly, but it's you who is wrong.

No disrespect, I like your work and support it 100%; I just think you misunderstood my argument.

I can prove this, and have done so experimentally. There are two ways to prove it:

1) TEST HEX CHAR BITS

Code:

#!/usr/bin/env python
import hashlib
import binascii
import math

filter_bit = 0
total_bit = 0

def byte_to_binary(n):
    return ''.join(str((n & (1 << i)) and 1) for i in reversed(range(8)))

def hex_to_binary(h):
    return ''.join(byte_to_binary(ord(b)) for b in binascii.unhexlify(h))


#### total permutations = 2 ^ nesting depth (8 loops -> 2^8 = 256 inputs)

for a in range(2):                  # 1st bit
 for b in range(2):                 # 2nd bit
  for c in range(2):                # 3rd bit
   for d in range(2):               # 4th bit
    for e in range(2):              # 5th bit
     for f in range(2):             # 6th bit
      for g in range(2):            # 7th bit
       for h in range(2):           # 8th bit
        tupstr = str(a)+str(b)+str(c)+str(d)+str(e)+str(f)+str(g)+str(h)
        hashed = hex_to_binary(hashlib.sha512(tupstr).hexdigest())
        total_bit += 1
        # keep only hashes whose first two binary bits are "01"
        if hashed[0] == "0" and hashed[1] == "1":
            filter_bit += 1

#####################################

print "Filtered: " + str(filter_bit)
print "Filtered Entropy: " + str(math.log(filter_bit, 2))
print "Total: " + str(total_bit)
print "Total Entropy: " + str(math.log(total_bit, 2))


It is easy to show that 1 hex character = 4 binary bits. The experiment above fixes the first 2 *binary* bits of the hash output, and the result is always a 2-bit entropy loss (256 inputs shrink to roughly 64 survivors).

Electrum fixes the first 2 *hex* characters of the output, which is 2 * 4 = 8 binary bits, so by the same reasoning that is an 8-bit entropy loss.
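The same check can be run on the hex characters directly. Below is a minimal Python 3 sketch (not from Electrum; the 16-bit input space and the bit-string encoding of inputs are my own choices for illustration) that keeps only inputs whose SHA-512 hex digest starts with "01" and measures the resulting loss:

```python
import hashlib
import math

total = 2 ** 16                       # 16-bit input space (illustrative choice)
survivors = 0
for i in range(total):
    msg = format(i, '016b').encode()  # encode each input as a 16-char bit string
    # require the first 2 hex characters of the digest to be "01"
    if hashlib.sha512(msg).hexdigest().startswith('01'):
        survivors += 1

loss = math.log(total, 2) - math.log(survivors, 2)
print("survivors out of %d: %d (entropy loss ~ %.2f bits)" % (total, survivors, loss))
```

The survivor count lands close to 2^16 / 2^8 = 256, i.e. roughly 8 bits of loss, matching the 2 * 4 = 8 arithmetic above.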

2) SEED TEST - MONTE CARLO SIMULATION

Code:
# ADD these to mnemonic.py

# module-level container for the distinct seeds found
container = []

# the two methods below go inside the Mnemonic class:

    def test_make_seed(self, num_bits=128, prefix=version.SEED_PREFIX):
        # the real make_seed() computes n from num_bits:
        #   bpw = math.log(len(self.wordlist), 2)   # 11.0
        #   n = int(math.ceil(num_bits/bpw)) * bpw  # 132.0
        # for this test we fix n to 11 bits so the seed space stays small
        n = 11
        my_entropy = ecdsa.util.randrange(pow(2, n))
        nonce = 0
        while True:
            nonce += 1
            i = my_entropy + nonce
            seed = self.mnemonic_encode(i)
            assert i == self.mnemonic_decode(seed)
            if is_new_seed(seed, prefix):
                break
        print_error('%d words' % len(seed.split()))
        return seed

    def searchf(self):
        seed = self.test_make_seed()
        container.append(seed)
        for nonce in range(0, 500):
            seed = self.test_make_seed()
            # record each seed the first time we see it
            if seed not in container:
                container.append(seed)


Mnemonic().searchf()
print container

We fix the entropy to n = 11 bits (so, for simplicity, the seeds differ only in the last word).

searchf() then generates 500 seeds and collects the distinct ones, which with overwhelming probability covers every surviving seed variation of the 11-bit space, like a Monte Carlo simulation (it does).

We only ever get 8 distinct seeds, which means 2^11 inputs yield only 2^3 outputs, and 11 - 3 = 8.

We are missing 8 bits of entropy.
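For readers without an Electrum checkout, the same experiment can be sketched standalone in Python 3. Assume a toy filter that, like Electrum's prefix check, accepts a candidate when its SHA-512 hex digest starts with "01"; the decimal encoding of candidates here is my own stand-in, not Electrum's mnemonic encoding:

```python
import hashlib

def matches(i):
    # stand-in filter (my assumption, not Electrum's real is_new_seed()):
    # accept i when SHA-512 of its decimal encoding starts with hex "01"
    return hashlib.sha512(str(i).encode()).hexdigest().startswith('01')

def make_seed(entropy):
    # mimic the nonce loop of test_make_seed(): bump until the filter passes
    i = entropy
    while True:
        i += 1
        if matches(i):
            return i

SPACE = 2 ** 11                       # 11 bits of starting entropy
seeds = set(make_seed(e) for e in range(SPACE))
print("distinct seeds from %d starting values: %d" % (SPACE, len(seeds)))
```

The distinct-seed count comes out near 2^11 / 2^8 = 8, the same collapse the container shows.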



So either way we are losing 8 bits of entropy, and that DOES matter, because the space of possible combinations shrinks.

In the second test we ought to get 2^11 combinations, but we only get 2^3.

That is an obvious security loss; the 8-bit entropy loss is real.
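To put a number on the shrinkage: an 8-bit entropy loss divides the attacker's search space by 2^8, as this trivial arithmetic check shows.

```python
claimed_loss_bits = 8                     # the loss argued in this post
shrink_factor = 2 ** claimed_loss_bits
print("combinations shrink by a factor of %d" % shrink_factor)  # 256

# applied to the 11-bit toy space of the second test:
surviving_bits = 11 - claimed_loss_bits
print("2^11 inputs -> about 2^%d = %d outputs" % (surviving_bits, 2 ** surviving_bits))
```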