Hi folks,
I was trying to help a user in #tor get their ftp server running behind
an onion, and while rediscovering the ftp protocol's weird separation
between the control channel and the file transfer channels, I realized
it could be interesting in the obfs4 / "unclassifiable protocols" context.
To recap, when you ftp a file, you connect to port 21 (the control
channel) and tell the server you want to download a file. In the modern
era, clients and servers use "passive mode" by default, where the server
opens a high-numbered port, and you connect to that port and it dumps
the file on you.
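To make the passive-mode step concrete: the server answers PASV on the
control channel with a 227 reply that encodes the address and port to
connect to as six decimal numbers. Here is a rough Python sketch of
decoding that reply. The helper name parse_pasv_reply and the sample
reply text are made up for illustration, not taken from any particular
ftp client.

```python
import re

def parse_pasv_reply(reply):
    """Decode a '227 ... (h1,h2,h3,h4,p1,p2)' reply into (host, port).

    Hypothetical helper: the name and error handling are illustrative only.
    """
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("no address tuple found in reply")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(octet) for octet in numbers[:4])
    # The port is sent as two bytes, high byte first.
    port = numbers[4] * 256 + numbers[5]
    return host, port

if __name__ == "__main__":
    # Invented sample reply; a real server would send its own address.
    print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,2,195,80)"))
```

The client then opens a second TCP connection to that host and port,
and that second connection is the one that carries the file.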
I might be mistaken (please tell me if I am), but I believe there are no
protocol headers or preamble or handshake or anything to the download. You
just connect and the bytes start flowing.
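Here is a small loopback sketch of what such a transfer looks like on
the wire, assuming ftp's default stream mode, where the end of the file
is signalled by closing the connection. All the function names are
hypothetical, and this is a stand-in for an ftp data channel, not an
ftp implementation: there is no control channel here at all.

```python
import socket
import threading

def serve_once(listener, payload):
    """Accept one connection, dump the payload, and close (no framing)."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(payload)

def fetch_all(host, port):
    """Connect and read until the peer closes; the close marks end-of-file."""
    chunks = []
    with socket.create_connection((host, port)) as sock:
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

def transfer_over_loopback(payload):
    """Run the sender in a thread and return what the receiver saw."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a high port
    listener.listen(1)
    port = listener.getsockname()[1]
    sender = threading.Thread(target=serve_once, args=(listener, payload))
    sender.start()
    try:
        return fetch_all("127.0.0.1", port)
    finally:
        sender.join()
        listener.close()

if __name__ == "__main__":
    data = transfer_over_loopback(b"just the file bytes, nothing else")
    print(len(data), "bytes received")
```

An observer on that connection sees exactly the payload and a close,
with nothing in front of it that names a protocol.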
If this is so, and if ftp were still popular, then there would be a
bunch of high-numbered-port connections that are hard to classify by
protocol, because each one is simply a file, on the network.
Of course, many files have structure of their own, including some
"header"-like preface that e.g. says it's a zip file.
By that reasoning, a variant of obfsproxy that wrapped Tor traffic to
look like a password-protected zip file could give it many other
things to blend with on a large network like China's backbone.
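For reference, here is roughly what a classifier could read off the
front of a zip: the local file header starts with the signature
"PK\x03\x04", and bit 0 of the general purpose flag field (at byte
offset 6) says whether the entry is encrypted. This is a sketch with
hypothetical function names, and it only looks at the first entry.
Python's zipfile module cannot write encrypted archives, so the
encrypted example is a hand-built header rather than a real archive.

```python
import io
import struct
import zipfile

ZIP_LOCAL_HEADER_SIGNATURE = b"PK\x03\x04"

def looks_like_zip(first_bytes):
    """True if the flow starts with a zip local file header signature."""
    return first_bytes.startswith(ZIP_LOCAL_HEADER_SIGNATURE)

def first_entry_is_encrypted(first_bytes):
    """True if bit 0 of the general purpose flags is set on the first entry."""
    if not looks_like_zip(first_bytes) or len(first_bytes) < 8:
        return False
    # Layout: signature (4 bytes), version needed (2 bytes), flags (2 bytes).
    (flags,) = struct.unpack_from("<H", first_bytes, 6)
    return bool(flags & 0x0001)

def make_plain_zip():
    """Build a tiny unencrypted zip in memory for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("hello.txt", b"hello")
    return buffer.getvalue()

if __name__ == "__main__":
    plain = make_plain_zip()
    print(looks_like_zip(plain), first_entry_is_encrypted(plain))
```

So a wrapped flow that wants to pass as a password-protected zip would
need to get at least these leading fields right.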
(Compression is good but not enough, because DPI engines already know
how to uncompress a zipped flow to look inside it. So we need some sort
of encryption or password or the like too.)
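A toy illustration of that point, in Python: a "DPI" function that
tries to inflate the flow and search inside it finds the pattern in a
merely compressed flow, but not once the compressed bytes are also
encrypted. Everything here is hypothetical, and toy_encrypt is a
hash-based XOR stream used only so the example is self-contained. It is
not real cryptography and not how obfs4 works.

```python
import hashlib
import zlib

def dpi_finds(flow, pattern):
    """Pretend DPI: try to decompress the flow, then look for the pattern."""
    try:
        inner = zlib.decompress(flow)
    except zlib.error:
        return False  # could not inflate, so nothing to inspect
    return pattern in inner

def toy_encrypt(data, key):
    """XOR data with a SHA-256 counter keystream (toy stand-in, NOT secure)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

if __name__ == "__main__":
    pattern = b"recognizable handshake"
    compressed = zlib.compress(b"prefix " + pattern + b" suffix")
    print(dpi_finds(compressed, pattern))
    print(dpi_finds(toy_encrypt(compressed, b"demo key"), pattern))
```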
By that reasoning also, it might be interesting to separate the two
directional flows in an obfs connection -- i.e. so there's a "download"
flow, and a separate "upload" flow. This approach will surely look
weird in some contexts (two flows rather than one, and no ftp control
connection, gotcha), but maybe in other contexts it will have many
friends -- I'm thinking network backbones where there are many flows,
many users are natted, and it's expensive to try to tie together state
between permutations of flows.
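A very rough sketch of the split-flow idea, using two local socket
pairs as stand-ins for the two network connections. SplitChannel and
make_split_pair are hypothetical names, and none of this reflects how
obfsproxy is actually structured. It only shows each endpoint sending
on one flow and receiving on the other.

```python
import socket

class SplitChannel:
    """One endpoint that writes only to one flow and reads only from another."""

    def __init__(self, send_sock, recv_sock):
        self._send_sock = send_sock
        self._recv_sock = recv_sock

    def send(self, data):
        self._send_sock.sendall(data)

    def recv_exact(self, count):
        """Read exactly count bytes from the receive-only flow."""
        buffer = b""
        while len(buffer) < count:
            chunk = self._recv_sock.recv(count - len(buffer))
            if not chunk:
                raise ConnectionError("flow closed early")
            buffer += chunk
        return buffer

def make_split_pair():
    """Return (client, server) joined by an 'upload' and a 'download' flow."""
    upload_client_end, upload_server_end = socket.socketpair()
    download_server_end, download_client_end = socket.socketpair()
    client = SplitChannel(send_sock=upload_client_end,
                          recv_sock=download_client_end)
    server = SplitChannel(send_sock=download_server_end,
                          recv_sock=upload_server_end)
    return client, server

if __name__ == "__main__":
    client, server = make_split_pair()
    client.send(b"request")
    print(server.recv_exact(7))
    server.send(b"response")
    print(client.recv_exact(8))
```

On a real network each flow would be its own TCP connection, so an
observer would have to pair them up to reconstruct the conversation.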
If you google for 'why is ftp still used' you find a bunch of articles
lamenting that people won't move away from it, and especially that
large orgs won't move away from it. Maybe some of those large orgs are
reasoning that if they secure the file contents themselves, then the
transfer protocol doesn't matter so much. That scenario would play well
into our goals of having a bunch of high-entropy files being passed
around with no protocol headers.
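For what "high-entropy" means in practice, here is a small Python
sketch of the usual byte-level Shannon entropy measure, in bits per
byte. The function name is my own, and real classifiers may well use
other statistics. Encrypted or well-compressed data scores close to 8,
while plain text scores much lower.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Return the Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # iterating bytes yields ints 0..255
    return -sum((count / total) * math.log2(count / total)
                for count in counts.values())

if __name__ == "__main__":
    print(shannon_entropy(b"aaaaaaaa"))        # a single repeated byte
    print(shannon_entropy(bytes(range(256))))  # every byte value once
```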
Hm,
--Roger