I focused on a small table because I wanted to leave as much free space as possible for storing the biggest bloom filter possible.
A bigger bloom filter is not necessarily better. A better approach is to have multiple smaller bloom filters, each based on a different part of the hash lookup. If a 512MB bloom filter has a hit rate of 0.01, a single 1024MB bloom filter will have a hit rate of 0.005, but two independent 512MB bloom filters will have a combined hit rate of 0.01^2 = 0.0001.
I have to experiment with that. It's interesting.
But look here: the hit rate doesn't seem to follow a linear rule with respect to the number of hash functions...
With this bloom filter calculator, if you increase the number of hash functions, the hit rate can increase too.
https://hur.st/bloomfilter/?n=80M&p=1.0E-7&m=&k=40
So what's the difference between one bloom filter of 1G with k=40 (for example) and two bloom filters of 512M with k=20 each?
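One way to check this numerically is a rough sketch using the textbook approximation p = (1 - e^(-k*n/m))^k for a filter of m bits holding n items with k hash functions. The assumptions here are mine, not from the posts above: "1G" and "512M" are read as GiB and MiB of memory, n=80M is taken from the calculator link, and each of the two small filters stores all n items with its own independent hash functions. The function and variable names are only illustrative.

```python
import math

def false_positive_rate(m_bits, n_items, k_hashes):
    """Textbook approximation p = (1 - e^(-k*n/m))^k for a single bloom filter."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes

N = 80_000_000          # number of stored items (n=80M from the calculator link)
GIB_BITS = 8 * 2**30    # assumption: "1G" means 1 GiB of filter memory, in bits

# One 1 GiB filter with k=40.
one_big = false_positive_rate(GIB_BITS, N, 40)

# Two 512 MiB filters, each holding all N items with its own 20 hashes.
# A lookup is a false positive only if BOTH filters report a hit.
two_small = false_positive_rate(GIB_BITS // 2, N, 20) ** 2

print(f"one 1 GiB filter, k=40:          {one_big:.3e}")
print(f"two 512 MiB filters, k=20 each:  {two_small:.3e}")

# Same 1 GiB filter with different k, to show the non-linear dependence on k.
for k in (5, 20, 40, 74, 120):
    print(f"k={k:3d}: {false_positive_rate(GIB_BITS, N, k):.3e}")
```

Under this approximation the two setups give the same rate: halving m and k together leaves k*n/m unchanged, and squaring a 20th power gives back the 40th power. The loop also shows why the calculator is not linear in k: the rate falls until roughly k = (m/n)*ln 2 (about 74 for these numbers) and then rises again as the filter fills up with set bits.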