Re: delete
by TheFascistMind on 04/10/2014, 03:01:03 UTC
...
Strictly by the definitions, the "Bitcoin Mining Problem" (BMP), the partial inversion of SHA256, is O(1), because it does not matter what the algorithm outputs when n is not 256.  Hence, strictly by the definitions, BMP is in P, and therefore is in NP; but is not NP-complete.  And, indeed, it could be solved by a giant table.  Of course, that table is too big to be built in practice; but that obstacle is not relevant for the theory.
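For concreteness, here is a minimal sketch (my own, in Python) of the brute-force search that the partial inversion of SHA256 amounts to, assuming a simplified leading-zero-bits target and a single SHA-256 rather than Bitcoin's double hash and compact target encoding:

[code]
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()   # leading zeros inside this byte
        break
    return bits

def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Brute-force search for a nonce whose hash meets the target.
    This is the 'partial inversion': the desired output prefix is fixed
    and the input space is searched for any preimage that fits."""
    for nonce in range(max_nonce):
        digest = hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce, digest.hex()
    return None

# A toy difficulty of 20 bits is found after roughly 2**20 tries on average.
print(mine(b"example block header", 20))
[/code]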

I am not aware of any theory that would give a useful estimate of the practical difficulty of solving some problem with fixed-size inputs, like the BMP.  At best, one can prove theoretically that a certain specific shortcut will not work; but there is no way to prove that there are no practical shortcuts.  The only "proof" that BMP is hard is that many people have tried to find such a practical shortcut, for many years, with no success.

It is a terrible misconception to think that "exponential cost" is bigger (or grows faster) than "polynomial cost". That is true only for "sufficiently large n", but the theory does not say when n is "sufficiently large"; it could be true only for n greater than 10^500...
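To make that concrete, here is a tiny illustration with made-up cost functions (entirely hypothetical constants, chosen only to show the effect): a polynomial cost with a large constant factor stays below an exponential cost with a base close to 1 until n passes a few thousand, and by tuning the constants the crossover can be pushed out astronomically further:

[code]
import math

# Two illustrative cost functions (purely hypothetical constants):
#   poly(n) = 1e9 * n**2      -- "polynomial", but with a big constant factor
#   expo(n) = 1.01**n         -- "exponential", but with a base close to 1
# Compare their logarithms so the huge values never have to be materialized.
def log_poly(n):
    return math.log(1e9) + 2 * math.log(n)

def log_expo(n):
    return n * math.log(1.01)

# Find the first n where the exponential cost finally overtakes the polynomial one.
n = 1
while log_expo(n) <= log_poly(n):
    n += 1
print("exponential overtakes polynomial only at n =", n)   # about 3737 for these constants
[/code]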

Indeed, you are correct that the theory only proves statements about asymptotic complexity. It is analogous to Random Oracle proofs of cryptographic properties, since it has in fact been proven that no perfect Random Oracle can be instantiated.

We can prove nothing in this universe with absolute certainty, because we are not an external observer (consider the assumptions of the Shannon-Nyquist sampling theorem, for example). I go into the philosophical side of this in my article about the Universe.

However, I can analyze over which ranges of n an algorithm's resource costs scale more like n^k versus n log n. Maybe I can even write a proof for a specific algorithm that it scales approximately like some function over a certain range of n. Are you saying that can't be done? I've never tried; I just intuitively looked at the code and could see it was scaling exponentially, and one line was the culprit.
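For example, one crude way to do that over a range of n is to time the code at several sizes and estimate the exponent from the slope on a log-log scale. A quick sketch (my own, with work(n) as a hypothetical stand-in for whatever algorithm is being measured):

[code]
import math
import time

def work(n):
    """Stand-in for the algorithm under test; here an O(n^2) pair loop."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += i ^ j
    return total

def measure(sizes):
    """Time work(n) at each size and estimate k in t(n) ~ C * n**k
    from the slope between consecutive points on a log-log scale."""
    times = []
    for n in sizes:
        start = time.perf_counter()
        work(n)
        times.append(time.perf_counter() - start)
    for (n1, t1), (n2, t2) in zip(zip(sizes, times), zip(sizes[1:], times[1:])):
        k = math.log(t2 / t1) / math.log(n2 / n1)
        print(f"n: {n1} -> {n2}   estimated exponent k = {k:.2f}")

measure([200, 400, 800, 1600])   # estimates should hover around k = 2
[/code]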

You are correct that if our best known algorithms are impractical to implement with current resources, it doesn't mean there isn't some possible algorithm that would be practical. But here I want to take you back to my discovery about the edge of the universe. I was toying around with the duality of the Bottom and Top type in the two different classes of programming languages, and it made me realize that time and the universe are coinductive, and thus the finality or edge is indeterminate, which is analogous to undecidability in the Halting problem.

Thus in the real world all we have are the observations we've made, i.e. the calls we have made to our coinductive universe.
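As a loose illustration of that observational view (my analogy, nothing more): a coinductive stream is something you can only probe one call at a time, and whether it ever ends may not be decidable from the outside. The Collatz sequence is a convenient stand-in, since nobody has proven it terminates for every seed:

[code]
def collatz(n):
    """A (potentially) unbounded stream: we can only observe it call by call.
    Whether it reaches 1 for every starting n is an open problem, so from
    the outside its 'edge' is indeterminate."""
    while n != 1:
        yield n
        n = 3 * n + 1 if n % 2 else n // 2
    yield 1

# All we ever have are the finitely many observations we chose to make:
stream = collatz(27)
first_ten = [next(stream) for _ in range(10)]
print(first_ten)   # [27, 82, 41, 124, 62, 31, 94, 47, 142, 71]
[/code]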

In summary, you were talking about what is provable and I was talking about what is observable. That is why we were talking past each other. And I used the wrong terminology, because O() and NP/P have very specific meanings that are essentially meaningless in practice, because they can never be observed. I get your point now, yes indeed.



Edit:

I have seen several computer scientists and students who, trusting that "O(log n) is much better than O(n)", tried to use this method for things like pattern matching, where each site is an observed pattern, represented by a point in some high-dimensional space, only to find that, for that application, the quad-tree method is often much slower than plain sequential search.

In the real world there is no absolute referential transparency; thus we can never be sure there aren't input variables other than n that impact the performance.

It turns out that the quad-tree begins to work only if the number of sites n is at least 2^d. It works quite well for n = 1000 and d = 2 or 3; but, for d = 30, it is a total waste of time unless n is a lot greater than a billion. Not "smallish" at all...
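That effect is easy to reproduce. Here is a rough sketch of my own (assuming numpy and scipy are installed; exact timings will vary by machine) comparing a k-d tree against plain vectorized sequential search at d = 3 and d = 30 with the same n:

[code]
import time

import numpy as np
from scipy.spatial import cKDTree

def compare(n, d, queries=200, seed=0):
    rng = np.random.default_rng(seed)
    sites = rng.random((n, d))
    probes = rng.random((queries, d))

    start = time.perf_counter()
    tree = cKDTree(sites)
    tree.query(probes)                     # nearest neighbour via the tree (build + query)
    tree_time = time.perf_counter() - start

    start = time.perf_counter()
    for q in probes:                       # plain sequential (brute-force) search
        np.argmin(np.linalg.norm(sites - q, axis=1))
    brute_time = time.perf_counter() - start

    print(f"d={d:>2}  tree: {tree_time:.4f}s   sequential: {brute_time:.4f}s")

compare(n=20000, d=3)    # the tree usually wins comfortably here
compare(n=20000, d=30)   # with n << 2**30 the tree's advantage largely evaporates
[/code]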

The big-O notation was invented by pure mathematicians, to describe the limiting behavior of functions from reals to reals, such as the solutions of differential equations. It was adapted to algorithm analysis by Don Knuth, IIRC. However, for Knuth it was only a working tool that would give an idea of the type of formula that had to be determined. For example, if the number t(n) of inner block executions was found to be O(n^2), one could write t(n) ≤ A n^2 + B n + C, and then look for suitable values of A, B, and C. If t(n) was found to be exponential, one would look for a formula like t(n) ≤ A × B^n. And so on.
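Incidentally, that constant-hunting step is mechanical enough to automate today. A small sketch of my own (assuming numpy is available, with made-up measurements) that fits A, B, and C in t(n) ≈ A n^2 + B n + C:

[code]
import numpy as np

# Suppose t(n), the number of inner-block executions, was measured at these sizes.
# (Made-up data generated from 3*n**2 + 7*n + 11.)
ns = np.array([10, 20, 40, 80, 160, 320])
ts = np.array([381, 1351, 5091, 19771, 77931, 309451])

# Once big-O analysis says t(n) is O(n^2), a least-squares fit recovers the constants.
A, B, C = np.polyfit(ns, ts, deg=2)
print(f"t(n) = {A:.2f} n^2 + {B:.2f} n + {C:.2f}")   # recovers 3, 7, 11
[/code]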

However, finding those constants requires lots of boring algebra, so Knuth's lazy followers tacitly agreed that finding the big-O of the running time was good enough.  And then came the really lazy ones, who decided that "polynomial time" was all one needed to know.

Ah, so you do agree with me that we can apply the concept of approximate scaling to the analysis of algorithms, with caveats. Yeah, we can't prove anything absolutely.