On Tue, Feb 28, 2012 at 20:04, Brandon Wiley <brandon@blanu.net> wrote:
> However, you have a false assumption, that dictionary systems can simultaneously have all three properties of Zooko's Triangle.
This isn't an "assumption", but rather an actual claim about the properties of the specific system I proposed (within the caveats I gave).
If you want to refute it, go ahead, but do so by challenging a given property I'm claiming as applied to my proposal, rather than merely dismissing it out of hand on the basis of an axiomatic belief that it's impossible to have the conjunction.
> Hashes are effectively random and so have maximum information density. Words do not have maximum information density, they have redundancy, which is why they are easier to remember and tell apart from each other than random strings. However, this comes at the cost of making the words longer. The more redundant information that you add in terms of constraints such as part of speech, the longer you will need to make the words (on average) so that they can contain this additional information. If you look at the 4 little words post you will notice that the phrases are about 5 characters longer than the IPv4 addresses.
The entire point of the encoding I propose is to do exactly this — convert something that's dense but impossible to remember into something that's less dense in expression but easier in cognitive/linguistic encoding.
However, you're making a false assumption: that string length is an appropriate measure for effective storage requirements.
It's not, when it comes to human cognition.
"Maximum information density" is a useful concept when talking about computer storage, but it is grossly misleading when talking about human memory, and the latter is what this proposal is intending to optimize for.
> Of course you could make the claim that sometimes longer strings are easier to remember than shorter ones.
Indeed I do, and plenty of cogsci research supports this claim. How easy something is to remember has much more to do with what the constraints on recall are (e.g. whether synonyms are treated as distinct), how networked it is to previously remembered information, state cues, etc. than with its raw length.
> There is an intuitive appeal to the idea that words are more memorable than hexadecimal strings (or base64 or whatever). That might be true sometimes for special cases, but there is no evidence that it is true generally or in this particular case.
Have you studied cognitive science or cognitive linguistics? This is extremely well established research.
As one example you may find more tractable, take a look at the "memory palace" technique, which is extremely effective, BTW, and used by people who do memory competitions. It uses exactly this kind of "expansion" to transform something that's very hard to remember (e.g. the complete order of a deck of cards, or a long series of phone numbers) into something that's easier (e.g. the various participants in a silly scenario).
My proposal does essentially the same thing, except in a way that also fulfills the secure (i.e. canonically interconvertible with a given hash) and distributed (i.e. not reliant on any centralized or even coordinated authority) properties. It simply assigns a memorable scenario / phrase to each hash.
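
To make "canonically interconvertible" concrete, here is a minimal sketch in Python. It is only an illustration, not the actual encoding: the word lists, the sentence template and the function names are all made up for this example, and the toy lists carry just 10 bits per phrase. A real scheme would need far larger lists and more slots to cover a full hash. The point is only that a fixed, published set of part-of-speech word lists gives a two-way mapping that anyone can compute locally, with no authority involved.

```python
# Hypothetical toy word lists, invented for this sketch. Each list fills one
# part-of-speech slot; a real scheme would need thousands of words per slot.
ADJECTIVES = ["angry", "blue", "quiet", "tiny"]
NOUNS = ["badger", "kettle", "pirate", "violin"]
VERBS = ["eats", "hides", "paints", "throws"]

# Assumed sentence template: adjective noun verb adjective noun.
TEMPLATE = [ADJECTIVES, NOUNS, VERBS, ADJECTIVES, NOUNS]


def capacity(template):
    """Number of distinct values the template can represent."""
    total = 1
    for slot in template:
        total *= len(slot)
    return total


def encode(value, template=TEMPLATE):
    """Map an integer to a phrase using mixed-radix digits, one per slot."""
    if not 0 <= value < capacity(template):
        raise ValueError("value does not fit in this template")
    words = []
    # Peel off the least significant digit first, i.e. the last slot.
    for slot in reversed(template):
        value, index = divmod(value, len(slot))
        words.append(slot[index])
    return " ".join(reversed(words))


def decode(phrase, template=TEMPLATE):
    """Recover the integer from a phrase produced by encode()."""
    words = phrase.split()
    if len(words) != len(template):
        raise ValueError("phrase does not match the template")
    value = 0
    for word, slot in zip(words, template):
        # list.index raises ValueError for a word outside the slot's list.
        value = value * len(slot) + slot.index(word)
    return value


if __name__ == "__main__":
    phrase = encode(1023)
    print(phrase)
    print(decode(phrase))
```

Because every slot draws from a fixed list, each value in range has exactly one phrase and each valid phrase has exactly one value. The phrase is longer than the number it encodes, which is the trade being argued for above.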
- Sai