Re: [tor-dev] Mnemonic 80-bit phrases (proposal)

21 Mar 2012

On Tue, Mar 20, 2012 at 20:11, Ken Takusagawa II
<ken.takusagawa.2@gmail.com> wrote:
...

You need 2^8=256 templates, not just 8, to reach 6*12+8=80 bits.

We won't know for sure how it hashes out until we make both the
dictionaries and the syntax generator. The ambiguity was intentional.
But yes, it may well use a number of generated templates. We're
thinking of making it symbolic expansion based, which is more
efficient on bits but also more complicated to describe before it's
fixed (and it'll require a parser library).
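
For illustration only, the bit budget from the quoted arithmetic can be
sketched in Python: an 80-bit value splits into one 8-bit template index
(256 templates) plus six 12-bit word indices (4096-word dictionaries).
The names split_bits and join_bits and the field order (template in the
top bits) are assumptions made up for this sketch. The real scheme is
not fixed yet and may use symbolic expansion instead of flat templates.

```python
WORD_BITS = 12       # 2**12 = 4096 words per dictionary
WORD_SLOTS = 6       # word slots per phrase
TEMPLATE_BITS = 8    # 2**8 = 256 templates
TOTAL_BITS = WORD_SLOTS * WORD_BITS + TEMPLATE_BITS  # 6*12 + 8 = 80


def split_bits(value):
    """Split an 80-bit integer into (template_index, [six word indices]).

    Assumed layout: template index in the top 8 bits, then the six
    12-bit word indices from most to least significant.
    """
    if not 0 <= value < (1 << TOTAL_BITS):
        raise ValueError("value must fit in 80 bits")
    words = []
    for _ in range(WORD_SLOTS):
        words.append(value & ((1 << WORD_BITS) - 1))  # low 12 bits
        value >>= WORD_BITS
    words.reverse()  # most significant word first
    return value, words  # what remains is the 8-bit template index


def join_bits(template, words):
    """Inverse of split_bits: rebuild the 80-bit integer."""
    value = template
    for w in words:
        value = (value << WORD_BITS) | w
    return value
```

With only 8 templates the same layout would carry 6*12+3=75 bits, which
is the gap the quoted remark points out.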
...

Having toyed with this idea in the past, let me warn that forming a 4096
word dictionary of memorable, non-colliding words for each word category is
going to be very difficult.  Too many words are semantically similar,
phonetically similar, or just unfamiliar.
Our intention currently is to first take candidate dictionaries from
WordNet, and use a combination of WordNet and Google 1-gram frequency
data as part of the cutoff for whether words are adequately familiar.
(N-grams with n >= 2 are rather irrelevant to our needs, AFAICT.)
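
A rough sketch of that cutoff, in Python. It assumes the candidate words
and the 1-gram counts have already been loaded into plain Python
structures; pick_dictionary, its parameters, and the tie-breaking rule
are hypothetical, and no actual WordNet or Google data format is implied.

```python
def pick_dictionary(candidates, counts, size=4096, min_count=1):
    """Return the `size` most frequent candidate words, most frequent first.

    candidates: iterable of words (e.g. one candidate dictionary)
    counts:     mapping word -> 1-gram frequency (missing means 0)
    min_count:  words seen fewer times than this count as unfamiliar
    """
    familiar = [w for w in set(candidates) if counts.get(w, 0) >= min_count]
    # Sort by descending frequency; break ties alphabetically so the
    # result is deterministic.
    familiar.sort(key=lambda w: (-counts[w], w))
    if len(familiar) < size:
        raise ValueError("only %d familiar words, need %d" % (len(familiar), size))
    return familiar[:size]
```

This only covers familiarity. Weeding out words that are semantically or
phonetically too close to each other would need a separate pass.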
...
http://kenta.blogspot.com/2012/02/lefoezyy-some-notes-on-google-books.html
Thanks; that could be useful.
...
Another way to go about it might be to first catalogue semantic categories
(colors, animals, etc.) then list the most common (yet dissimilar) members
of each category.  An attempt at 64 words is here:
This is something that WordNet has already done.
...
http://kenta.blogspot.com/2011/10/xpmqawkv-common-words.html
I think you omit far more common words, which you shouldn't — eg air
water coal man house etc.
But quibbling at this level is pointless; we'll need to be dealing
with dictionaries mostly on the order of a few thousand words, sorted
by *constituent types*, not by semantic categories. (E.g. one
dictionary would be "nouns that can be the target of a transitive
verb".)
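
To make "sorted by constituent type" concrete, here is a toy Python
sketch. The three tiny dictionaries, the single template, and the
render/parse names are all invented for illustration. Real dictionaries
would hold a few thousand words each, and the real syntax is not decided.

```python
# Hypothetical toy dictionaries, keyed by constituent type rather than
# by semantic category.
DICTIONARIES = {
    "subject_noun": ["dog", "queen", "robot", "river"],
    "transitive_verb": ["eats", "paints", "carries", "finds"],
    "object_noun": ["bread", "stone", "letter", "lamp"],
}

# One hypothetical template: a sequence of constituent-type slots.
TEMPLATE = ["subject_noun", "transitive_verb", "object_noun"]


def render(indices, template=TEMPLATE, dictionaries=DICTIONARIES):
    """Turn one index per slot into a phrase."""
    if len(indices) != len(template):
        raise ValueError("need exactly one index per template slot")
    return " ".join(dictionaries[slot][i] for slot, i in zip(template, indices))


def parse(phrase, template=TEMPLATE, dictionaries=DICTIONARIES):
    """Recover the indices from a phrase; raises ValueError on unknown words."""
    words = phrase.split()
    if len(words) != len(template):
        raise ValueError("phrase does not match the template")
    return [dictionaries[slot].index(w) for slot, w in zip(template, words)]
```

For example, render([1, 2, 0]) gives "queen carries bread", and parse
maps that phrase back to [1, 2, 0].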
...
I'd propose that the "right" way to do this is not just sentences, but
entire semantically consistent stories, written in rhyming verse, with
entropy of perhaps only a few bits per sentence.  (Prehistoric oral
tradition does prove we can memorize such poems.)  However, synthesizing
these seems extremely difficult, an AI problem.
I think it's currently impossible to do that, and furthermore, that
it's *not* Right even if you could — because it would violate a key
constraint: that it can be reasonably typed as a domain. It shouldn't
take longer than a few seconds to remember and type. It won't be as
fast as typing "google.com", and that's OK, but I think that level of
redundant expansion is way too much.
Creating unambiguously parseable syntaxes and dictionaries that meet
our stated constraints is already hard enough. ;-)
...

I presume people are familiar with Bubblebabble?  It doesn't solve all
the problems, but does make bit strings seem less "dense".
BubbleBabble produces nonwords; as such it fails a basic requirement.
Making something merely look phonotactically valid isn't enough; it
has to be grammatically valid and composed entirely of known terms.
- Sai
