On Feb 29, 2012 1:58 PM, "Sai" <tor@saizai.com> wrote:

>  For a 6 word sentence, with 8 (3b) templates, we need ~12b (4k word)
>  dictionaries for each word category.

1. You need 2^8 = 256 templates, not just 8: six words at 12 bits each give only 72 bits, so the templates must supply the remaining 8 bits to reach 6*12 + 8 = 80 bits.
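The arithmetic can be checked with a short sketch (dictionary and template sizes taken from the quoted proposal):

```python
import math

words_per_sentence = 6
dictionary_size = 4096          # ~12 bits per word, as proposed
target_bits = 80

bits_per_word = math.log2(dictionary_size)       # 12.0
word_bits = words_per_sentence * bits_per_word   # 72.0
template_bits = target_bits - word_bits          # 8.0
templates_needed = int(2 ** template_bits)       # 256

print(bits_per_word, word_bits, template_bits, templates_needed)
```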

2. Having toyed with this idea in the past, let me warn that forming a 4096-word dictionary of memorable, non-colliding words for each word category is going to be very difficult.  Too many words are semantically similar, phonetically similar, or just unfamiliar.  You might find Google Ngrams a good resource for common words; I provide a complete sorted list here:

http://kenta.blogspot.com/2012/02/lefoezyy-some-notes-on-google-books.html

Another way to go about it might be to first catalogue semantic categories (colors, animals, etc.), then list the most common (yet mutually dissimilar) members of each category.  An attempt at 64 words is here:

http://kenta.blogspot.com/2011/10/xpmqawkv-common-words.html
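The orthographic half of the collision problem can be attacked mechanically.  A sketch (the distance threshold and the candidate list are assumptions; phonetic similarity would need something like Soundex on top):

```python
def edit_distance(a, b):
    # classic dynamic-programming Levenshtein distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def pick_dissimilar(candidates, min_distance=3):
    # greedily keep each word only if it differs from every
    # already-accepted word by at least min_distance edits
    chosen = []
    for w in candidates:
        if all(edit_distance(w, c) >= min_distance for c in chosen):
            chosen.append(w)
    return chosen

# frequency-ordered candidates (hypothetical input, e.g. from an n-gram list)
print(pick_dissimilar(["cat", "bat", "dog", "horse", "house", "mouse"]))
```

Because the candidates are fed in frequency order, the greedy pass keeps the most common member of each cluster of look-alikes and drops the rest.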

I'd propose that the "right" way to do this is not just sentences, but entire semantically consistent stories, written in rhyming verse, with entropy of perhaps only a few bits per sentence.  (Prehistoric oral tradition does prove we can memorize such poems.)  However, synthesizing these seems extremely difficult, essentially an AI problem.

3. I presume people are familiar with Bubblebabble?  It doesn't solve all the problems, but does make bit strings seem less "dense".
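For those who haven't seen it: Bubblebabble maps a byte string onto pronounceable consonant-vowel syllables (with a running checksum in the real spec).  The toy below only shows the pronounceability idea, using Bubblebabble's alphabets but a simplified, checksum-free scheme of my own; it is not the actual encoding:

```python
VOWELS = "aeiouy"
CONSONANTS = "bcdfghklmnprstvzx"  # Bubblebabble's 17-consonant set

def toy_babble(data: bytes) -> str:
    # map each byte invertibly onto a consonant-vowel-consonant
    # syllable: b == (i2*6 + i1)*17 + i0 recovers the byte
    syllables = []
    for b in data:
        i0 = b % 17
        t = b // 17          # 0..15
        i1 = t % 6
        i2 = t // 6          # 0..2
        syllables.append(CONSONANTS[i0] + VOWELS[i1] + CONSONANTS[i2])
    return "-".join(syllables)

print(toy_babble(bytes([0, 255])))  # "bab-bod"
```

Even this crude version makes the point: "bab-bod" is far easier to read aloud over the phone than "00ff".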

Ken