How much RAM do we need to store 2^30 points, and what GPU (or GPUs) do we need to keep the operations and lookups effectively instant? For example, if our GPU can do 10^9 scalar ops per second, how can we sustain that rate together with the lookups?
There is no known algorithm that guarantees worst-case instant (O(1)) lookup of an arbitrary key in a dictionary structure (hash tables come close with *expected* O(1), at the cost of extra memory and occasional collisions). A tree data structure retrieves keys and data in O(log numKeys) steps — for example, about 30 steps to find a value among 2**30 keys in a balanced binary tree, depending on the exact data structure. B-Trees (used by SQLite) need far fewer steps, since each node can have hundreds of direct children. This is why you're most likely better off storing such a DB on disk and letting the DBMS take care of what's cached in RAM. Of course, a Bloom filter in front of the DB can cheaply reject keys that are definitely absent before the real lookup is even attempted.
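To make the Bloom-filter-in-front-of-SQLite idea concrete, here is a minimal sketch. The `BloomFilter` class, the table name `points`, and the parameter choices are all hypothetical, just for illustration; a production setup would size `m_bits` and `k` from the expected key count and target false-positive rate.

```python
import hashlib
import sqlite3

class BloomFilter:
    """Toy Bloom filter: k hash probes into an m-bit array.
    False positives are possible; false negatives are not."""
    def __init__(self, m_bits: int, k: int):
        self.m = m_bits
        self.k = k
        self.bits = bytearray(m_bits // 8 + 1)

    def _probes(self, key: bytes):
        # Derive k independent probe positions by salting one hash.
        for i in range(self.k):
            h = hashlib.blake2b(key, digest_size=8,
                                salt=i.to_bytes(8, "little"))
            yield int.from_bytes(h.digest(), "little") % self.m

    def add(self, key: bytes) -> None:
        for p in self._probes(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._probes(key))

# Hypothetical demo: SQLite's B-tree does the real lookup,
# the Bloom filter screens out definite misses first.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE points (key BLOB PRIMARY KEY, value INTEGER)")
bf = BloomFilter(m_bits=1 << 20, k=3)
for i in range(1000):
    key = hashlib.sha256(i.to_bytes(4, "big")).digest()
    db.execute("INSERT INTO points VALUES (?, ?)", (key, i))
    bf.add(key)

def lookup(key: bytes):
    if not bf.might_contain(key):
        return None  # definitely absent: B-tree probe skipped entirely
    row = db.execute("SELECT value FROM points WHERE key = ?",
                     (key,)).fetchone()
    return row[0] if row else None
```

The point of the filter is that most negative lookups never touch the B-tree (or the disk) at all; only keys the filter passes pay the full tree-descent cost.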
The only way to know exactly how much RAM is needed is to actually build and store the table, because you can't know in advance the total entropy of all those keys: the space required varies with the clustering density of the key index (shared prefix bytes = less memory in prefix-compressing structures like tries or B-tree pages) and with the data structure used for the lookup. As a rough floor, for 2**30 256-bit keys each mapping to a 64-bit integer, expect around 40–50 GB at the minimum.
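The 40–50 GB floor follows from simple arithmetic; here is the back-of-the-envelope version. The 12-byte overhead figure is an assumption standing in for per-entry costs (pointers, node headers, allocator slack), not a measured number:

```python
N = 2 ** 30      # number of entries
KEY_BYTES = 32   # one 256-bit key
VAL_BYTES = 8    # one 64-bit integer value

# Payload only, no data-structure overhead:
raw = N * (KEY_BYTES + VAL_BYTES)
print(raw / 2**30, "GiB raw")          # 40.0 GiB

# Assumed ~12 extra bytes/entry for pointers, node headers, slack:
OVERHEAD_BYTES = 12
with_overhead = N * (KEY_BYTES + VAL_BYTES + OVERHEAD_BYTES)
print(with_overhead / 2**30, "GiB with typical overhead")  # 52.0 GiB
```

Prefix compression can pull the real number below the raw figure, while pointer-heavy in-memory trees can push it well above it, which is why only building the table settles the question.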
Running a lookup tree on a GPU spells trouble. Even if it's doable, memory latency will kill any sort of performance: GPUs are compute devices, while tree lookups are a memory-latency-bound operation with unpredictable access patterns.