Hello!
My name is Dan, I've been working on a pluggable transport for Tor based on bittorrent as cover traffic and wanted to let you know about it.
https://github.com/danoctavian/bit-smuggler
In a nutshell, I'm tunnelling a data stream through a bittorrent peer connection that is created by real bittorrent clients (uTorrent for this implementation) - to avoid "parroting" traffic pitfalls and active probing. This made the implementation quite tricky to get right, so my reasoning is that it's a worthy trade-off.
I worked with Dr. George Danezis as my supervisor for the project. He came up with the idea to try bittorrent, the crypto strategy and advised me throughout.
The docs in the repo contain more information. I researched this topic for my master's thesis, and over the last 2 months I did a rewrite of the project. At the moment it is not integrated with Tor (I am working on an Extended ORPort implementation), and I need to do more work on the server to make it run properly as a long-running process.
Please ask me anything for clarification and let me know how I can make this useful for the Tor project. Any kind of feedback is very welcome. I'm starting a 9-5 job next week, but I'm going to keep working on it in my spare time.
Thanks!
On Sat, Feb 28, 2015 at 10:46:03AM -0800, Dan Cristian Octavian wrote:
My name is Dan, I've been working on a pluggable transport for Tor based on bittorrent as cover traffic and wanted to let you know about it.
https://github.com/danoctavian/bit-smuggler
In a nutshell, I'm tunnelling a data stream through a bittorrent peer connection that is created by real bittorrent clients (uTorrent for this implementation) - to avoid "parroting" traffic pitfalls and active probing. This made the implementation quite tricky to get right, so my reasoning is that it's a worthy trade-off.
People reading this should look at the documentation, there's thoughtful information there.
https://github.com/danoctavian/bit-smuggler/blob/master/README.md
https://github.com/danoctavian/bit-smuggler/blob/master/DESIGN.md
https://github.com/danoctavian/bit-smuggler/blob/master/docs/system-componen...
I don't know anything about BitTorrent. What parts of the protocol are easily visible to the censor, without expensive reconstruction? I guess it includes at least: file names, file sizes, peer IP addresses.
About active probing: it's true that if the censor probes you, you look like a BitTorrent client. Is there anything weird about how you use the protocol that could make you stand out anyway? At https://github.com/danoctavian/bit-smuggler/blob/master/README.md#security, you say that a network monitor would have to reconstruct a stream in order to detect anomalies. Could a censor acting as an ordinary peer detect them more easily, just by participating in the file transfer? (I'm thinking of how the movie studios would run their own BitTorrent clients in order to find other downloaders.)
David Fifield
Also interesting is that BitTorrent has its own family of obfuscation transports. I think they are designed to evade throttling by ISPs, which is a threat model similar to the censorship one.
https://en.wikipedia.org/wiki/BitTorrent_protocol_encryption
MSE (Message Stream Encryption) is a little weird, but not entirely dissimilar to obfs3.
https://wiki.vuze.com/w/Message_Stream_Encryption
http://www.tcs.hut.fi/Publications/bbrumley/nordsec08_brumley_valkonen.pdf
David Fifield
Hi,
I'm wondering about a particular case--let me explain. From your threat model you assume that the adversary has suspicions about encrypted traffic and may block it without strong justification. You also take as given that the adversary may be state-level. From the adversary objective this is because the adversary wants to know who and what this communication is about. In the limitations you state that the adversary (counter-intuitively) has strong socio-economic reasons not to block bittorrent. It does not follow... In China it's not uncommon to hijack torrent sites or ban them entirely. They perform mitm even for encrypted sites like github. They have a one-strike policy that they don't normally enforce regarding file sharing. The golden shield is sophisticated enough to correlate the use of a bridge across multiple users. Which means you need strength in numbers. Then again, outside China, bittorrent is commonly subjected to traffic shaping. I'm unclear about how this helps the censored user. Under such circumstance wouldn't it be possible to have a common peer show up in multiple unique torrent swarms?
--leeroy
Hi Leeroy,
If I understand correctly, you are arguing that my assumption that bittorrent is unlikely to be blocked is faulty. I don't have a strong argument against this, other than that it would be a very drastic move, since for that part of the world bittorrent is the main way to get access to media files. As we've seen, they've already blocked major things such as Facebook and Google, so it would not be surprising.
Can you please elaborate on this question: "Under such circumstance wouldn't it be possible to have a common peer show up in multiple unique torrent swarms?"
I don't think I understand what you're getting at. In general for bittorrent, yes, a peer can be part of many swarms.
Also, are you suggesting that the existence of traffic shaping is a good or a bad thing? Sorry, I'm not clear on this one either.
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On Mon, Mar 02, 2015 at 07:10:55PM -0800, Dan Cristian Octavian wrote:
If I understand correctly, you are arguing that my assumption that bittorrent is unlikely to be blocked is faulty. I don't have a strong argument against this, other than that it would be a very drastic move since for that part of the world bittorrent is the main way to get access to media files. As we've seen they've blocked already major things such as facebook or google so it would not be surprising.
I think it's reasonable to just state in your threat model that BitTorrent is not blocked. Even though there will be censors for which that is not true, I'm sure there are enough where it is true for it to be interesting. It's a mistake to say that if something doesn't work in China (or any other single concrete threat environment), then it's useless. It's a question of motivation, and technical capability, and resources, all of which vary under different censors. BitTorrent is interesting because I would guess, at least in the U.S., that you're more likely to get blocked by your ISP than by a firewall further out.
David Fifield
On 15-03-03 10:10 AM, David Fifield wrote:
On Mon, Mar 02, 2015 at 07:10:55PM -0800, Dan Cristian Octavian wrote:
If I understand correctly, you are arguing that my assumption that bittorrent is unlikely to be blocked is faulty. I don't have a strong argument against this, other than that it would be a very drastic move since for that part of the world bittorrent is the main way to get access to media files. As we've seen they've blocked already major things such as facebook or google so it would not be surprising.
I think it's reasonable to just state in your threat model that BitTorrent is not blocked. Even though there will be censors for which that is not true, I'm sure there are enough where it is true for it to be interesting. It's a mistake to say that if something doesn't work in China (or any other single concrete threat environment), then it's useless. It's a question of motivation, and technical capability, and resources, all of which vary under different censors. BitTorrent is interesting because I would guess, at least in the U.S., that you're more likely to get blocked by your ISP than by a firewall further out.
These are thoughtful responses and I would like to add some food for further thought. Please do not think that I am attacking any particular system(s) or group of people. Awesome work is being done that is making an impact in the real world. I write this as a means to think about our philosophies and ultimate aims, rather than the day to day efforts. Also, I am sure that what follows will not be surprising to or new to many, but by bringing it up it would be good to have an open discussion.
Is it _alright_ to say "it works sometimes, for some people, somewhere" or, in a slightly different but related sentiment, is it _alright_ to say "the law is against X, but they aren't actually locking up people for violating X so let's just keep on X'ing." To me it ultimately feels unsatisfying and not _alright_ for two reasons.
The first is that it makes me think that perhaps we ought to separate the intrinsic properties of the Censorship Resistance System (CRS) and extrinsic properties due to the operating environment. The issue seems to be that incorporating things like censor motivation, popularity, and current trends in to the design of the CRS, actually embedding these as assumptions, creates fragile CRSs that are averse to change. I say this because we have examples of broken CRSs where the leveraged popular service changed its architecture or the censor decided to do something slightly different and the CRS no longer works.
I want to be clear here that I am not talking about the censor's computational and space complexity and technical ability. These are intrinsic properties akin to those we find when talking about the security of crypto systems. Taking these into account in the design of the CRS is _alright_.
The second, most likely due to the community we are in (and it is a great community), is that there is an emphasis on making/engineering things. This by itself is great since then people can actually use the things. What I find off-putting is when academics also subscribe to the mentality that things are working today so they must be truly good. Now I know this is a blanket statement and I am sure it does not apply to all academics, but the fact that it does happen is what I am pointing out. What I am getting at here is that we ought to figure out properties of CRSs that all CRSs should have based on some fundamentals/theories rather than what happens to be the censorship landscape today. The future holds many challenges and changes and getting ahead of the game will come from CRS designs that are resilient to change and do not make strong assumptions about the operating environment.
The above does not mean that no one should use a CRS until it is perfect. What I wanted to point out is that there is a place for stop-gap measures but the trend should not be to include more and more extrinsic factors in to the CRS designs but to reduce them as much as possible.
Thanks for getting this far, I look forward to your thoughtful responses.
Cheers, Tariq
On 03/03/15 16:54, Tariq Elahi wrote:
What I am getting at here is that we ought to figure out properties of CRSs that all CRSs should have based on some fundamentals/theories rather than what happens to be the censorship landscape today. The future holds many challenges and changes and getting ahead of the game will come from CRS designs that are resilient to change and do not make strong assumptions about the operating environment.
Responding to just one of many good points: I think your insight is the same one that motivated the creation of pluggable transports. That is, we need censorship resistance systems that are resilient to changes in the operating environment, and one way to achieve that is to separate the core of the CRS from the parts that are exposed to the environment. Then we can replace the outer parts quickly in response to new censorship tactics, without replacing the core.
In my view this is a reasonable strategy because there's very little we can say about censorship tactics in general, as those tactics are devised by intelligent people observing and responding to our own tactics. If we draw a line around certain tactics and say, "This is what censors do", the censor is free to move outside that line. We've seen that happen time and time again with filtering, throttling, denial of service attacks, active probing, internet blackouts, and the promotion of domestic alternatives to blocked services. Censors are too clever to be captured by a fixed definition. The best we can do is to make strategic choices, such as protocol agility, that enable us to respond quickly and flexibly to the censor's moves.
Is it alright to use a tactic that may fail, perhaps suddenly, perhaps silently, perhaps for some users but not others? I think it depends on the censor's goals and the nature of the failure. If the censor just wants to deny access to the CRS and the failure results in some users losing access, then yes, it's alright - nobody's worse off than they would've been without the tactic, and some people are better off for a while.
If the censor wants to identify users of the CRS, perhaps to monitor or persecute them, and the failure exposes the identities of some users, it's harder to say whether using the tactic is alright. Who's responsible for weighing the potential benefit of access against the potential cost of exposure? It's tempting to say that developers have a responsibility to protect users from any risk - but I've been told that activists don't want developers to manage risks on their behalf; they want developers to give them enough information to manage their own risks. Is that true of all users? If not, perhaps the only responsible course of action is to disable risky features by default and give any users who want to manage their own risks enough information to decide whether to override the defaults.
Cheers, Michael
I agree with Michael's idea of core parts vs replaceable parts (such as the type of cover traffic), since I feel much of censorship circumvention still relies on what the landscape looks like, and there isn't a clear-cut, theory-based solution to the problem (in the way you can argue, for example, that a certain end-to-end encryption protocol is correct - you can do proper formal reasoning about that).
What I feel is that at this point we lack a more solid way of evaluating how good a pluggable transport is.
I would like to thank everyone for the feedback and give a summary of the ideas I've gathered.
## Goals
My goal at this point with bit-smuggler is to figure out what the next steps are to bring value with it.
* Does it have the potential to be used as a Tor PT, by incorporating ideas to make it better? If that is the case I would gladly continue work on it.
* Or are there intrinsic limitations of bittorrent as cover traffic which make it unsuitable for the security standards of a Tor PT? In this case, maybe it can have a different use case (penetrate a censorship firewall without getting caught in real time, but with an acceptable risk of being detected later by a delayed analysis).
In the latter case, I guess it would be useful to document my work for future reference when working on other PTs, since I use some techniques that may be reused or avoided in the future depending on whether they prove to be good or bad (e.g. attempting to tamper with traffic generated by a real-world implementation of the protocol through proxying).
## Discussion summary
David thinks that it is reasonable to assume bittorrent won't be blocked by the censor and raises some important questions about how bit-smuggler may create network traffic patterns that are unusual and therefore fingerprintable. I made a list of the ones I can think of in a previous message, and it's up for discussion which of them may compromise a bit-smuggler connection in real time and therefore need to be mitigated, and which won't and are acceptable.
Michael supports an approach where we adapt to the censorship landscape: some core concepts/designs stay the same for all PTs, and some changeable parts adapt to circumstances. His argument builds on Tariq's message, which states the need for PTs that don't just work "sometimes"; Michael argues that Tariq's points are the ideas that got the PT project started in the first place.
Leeroy stresses that the following aspects are problematic:
* bittorrent spec breaking due to the fact that in the bittorrent message exchange between the PT server and client using bit-smuggler, the data being exchanged doesn't match the correct checksums stated in the .torrent file
* bittorrent having no extra layer of encryption, bit-smuggler relies on steganography which is harder to get right (as opposed to meek where everything happens under the cover of an https connection)
* plausible deniability is compromised - if a user's bittorrent traffic is captured, reconstructed and found to have many checksum failures it can be argued he was using bit-smuggler
I am not sure I completely understand Leeroy's strategy for breaking undetectability, but here's a non-real-time one that can work.
A simple approach is this: suppose that the adversary just does a packet capture of all bittorrent traffic crossing national borders over an interval of 8 hours. It then performs TCP reconstruction, reconstructs the BitTorrent message exchange for all those captures, fetches the corresponding torrent file, computes hashes and sees a large number of hash failures -> it's bit-smuggler. So all PT servers and clients active during that interval would be caught (with a delay). By looking at the IPs of those broken bittorrent streams, it can then detect the IP of the bridge (since many IPs connect to one particular IP, it acts like a sink). It can then either wait passively and watch the activity of the bridge, now that it has identified it, to see which people connect to it, or just go ahead and block it.
If anything above is inaccurate, please let me know, that is my current understanding of the discussion.
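To make the delayed analysis above concrete, here is a minimal sketch in Python. It assumes the censor has already reconstructed the piece payloads of a connection and fetched the matching .torrent file, whose `pieces` field is a concatenation of 20-byte SHA-1 digests. The function names, the `min_clients` threshold and the shape of the inputs are hypothetical and chosen only for illustration; nothing here is taken from the bit-smuggler code.

```python
import hashlib


def piece_hashes(pieces_field: bytes) -> list:
    """Split the concatenated 20-byte SHA-1 digests of a .torrent 'pieces' field."""
    return [pieces_field[i:i + 20] for i in range(0, len(pieces_field), 20)]


def hash_failure_rate(observed: dict, expected: list) -> float:
    """Fraction of reconstructed pieces whose SHA-1 differs from the torrent file.

    `observed` maps piece index -> reconstructed piece bytes (hypothetical input).
    """
    if not observed:
        return 0.0
    failures = sum(
        1 for index, data in observed.items()
        if hashlib.sha1(data).digest() != expected[index]
    )
    return failures / len(observed)


def likely_bridges(flagged_flows: list, min_clients: int = 2) -> list:
    """Return remote IPs that several distinct clients reached over flagged flows.

    `flagged_flows` is a list of (client_ip, remote_ip) pairs for connections
    that already showed many hash failures; a remote IP with many clients
    behaves like the "sink" described above.
    """
    clients_per_remote = {}
    for client_ip, remote_ip in flagged_flows:
        clients_per_remote.setdefault(remote_ip, set()).add(client_ip)
    return sorted(
        ip for ip, clients in clients_per_remote.items()
        if len(clients) >= min_clients
    )
```

What failure rate should count as suspicious is left open here, since the thread only guesses that failed checksums are rare in genuine transfers.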
## Trade-offs and use cases
At this point I believe that bit-smuggler can be made to work in situations where the user needs to penetrate a censorship firewall without being cut off in real time, get good throughput upstream and downstream, and have data confidentiality. In its favour is the property of high traffic volume (harder to monitor).
However, it's very likely that, given enough investment of resources, a censor can devise a system with delayed, non-real-time analysis that detects which connections were bit-smuggler and which were not. There are also strong reasons to believe that even though the data is encrypted and looks random, a high occurrence of detected hash failures is enough to break plausible deniability (i.e. to argue in court that the user used bit-smuggler). I believe there are situations where this is an acceptable trade-off, e.g. an adversary that stops at just cutting VPN connections but doesn't pursue VPN users any further. If other PTs with better properties are unusable in some situation (e.g. their cover protocol is blocked, or look-like-nothing protocols fail because of protocol whitelisting), this can be a fall-back solution with this trade-off.
I would like to hear your thoughts on the potential use cases and further steps, and please let me know which things are unclear so I can explain.
Thank you! Dan
It's a mistake to say that if something doesn't work in China (or any other single concrete threat environment), then it's useless.
Out of respect for the work you've done I'm not going to assume you're taking typed-word out of context incorrectly.
I'm concerned that this PT exchanges one threat for another and is thought to be a good fit for integration with Tor. It's one thing to use Google/Azure/etc where there are legitimate uses. It's another to trade the threat of secure-encrypted traffic (with crypto-secure PRNG in most PT cases) for another option that utilizes insecure obfuscation of file transfer together with a server that utilizes (presumably) secure communication. In one you increase vulnerability surface of the censored user, in the other the only threat is this unknown communication which can be easily blocked. I digress.
Allow me to attack the problem head on then. What do I know about bittorrent? Not much. I know that, for any user of bittorrent, the infohash is easily derivable and so is the peer list. So, if you don't participate in the swarm, intersection of peer lists means it's less difficult to find the needle-in-the-haystack that is the PT-server. Just look at the unique peers across multiple users of the PT who each create unique torrent swarm fingerprints corresponding to the infohash of the files shared. You, the PT-server, must participate in the swarm.
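The intersection of peer lists described above can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the mapping from infohash to observed peer addresses, the function name `common_peers` and the `min_swarms` threshold are all hypothetical.

```python
from collections import Counter


def common_peers(swarms: dict, min_swarms: int = 2) -> list:
    """Return peers seen in at least `min_swarms` of the observed swarms.

    `swarms` maps an infohash (any hashable label) to the set of peer
    addresses observed for that swarm.
    """
    counts = Counter(peer for peers in swarms.values() for peer in peers)
    return sorted(peer for peer, seen in counts.items() if seen >= min_swarms)
```

A peer that keeps turning up across otherwise unrelated swarms used by different suspected PT users is the candidate for the PT-server.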
Suppose then that you, the PT-server, do participate in the swarm. Long transfers with peers who provide hash-failing pieces break the BT spec. The adversary just needs to force peer list rotation. How can this be done? Well, the adversary knows the infohash and the peer list to expect. So, flip-bit, as you put it. Only do it for all peers who cross the country-firewall. If the client is indeed running a bittorrent client, sit back and watch the churn. Only something stands out. There's a peer, you, the PT-server, who is ignoring the ban fingerprint. This can be done in either direction of piece share. Because you, the PT-user, differ from the spec, you stand out.
Another case. The adversary can monitor the bitfield of the peer connected to the PT-server. When the torrent is complete the client will disconnect from all peers and take the seed role. Only there's a problem. They're still transferring data with the PT-server as if they were a leech. It's not enough to change torrent swarms because it would be immediately apparent that they re-establish communication with the PT-server, crossing swarms.
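The bitfield check described above might look like the following Python sketch. It relies on the bitfield layout of the peer wire protocol, where piece 0 is the high bit of the first byte. The function names and the `bytes_received_since_complete` counter are hypothetical, and the censor is assumed to have some way of measuring payload received after completion.

```python
def is_complete(bitfield: bytes, num_pieces: int) -> bool:
    """True if all `num_pieces` bits are set (piece 0 is the high bit of byte 0)."""
    if len(bitfield) * 8 < num_pieces:
        return False
    return all(bitfield[i // 8] & (0x80 >> (i % 8)) for i in range(num_pieces))


def looks_anomalous(bitfield: bytes, num_pieces: int,
                    bytes_received_since_complete: int) -> bool:
    """Flag a peer that advertises a complete torrent yet keeps receiving piece data."""
    return is_complete(bitfield, num_pieces) and bytes_received_since_complete > 0
```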
A final thought. It's one thing for an adversary to not be able to attack a communication besides blocking it entirely. This would be the case with crypto-secure communications. Bittorrent doesn't fall into this category. Especially when facing the state-level adversary. So your PT communication would need to be crypto secure (not saying it's not). The caveat is that if one were to try to pack encrypted data within BT-spec obfuscation, that obfuscation had better never fail. If it did, the user of the PT can be proven to be hiding data via steganography in hash-failing pieces (as you've mentioned). This can provide justification for an accusation of state-offense. This would be different from packing data where no hash fail is apparent, such as regular steganography, minus bittorrent. Video streaming or audio streaming combined with data hiding, and without any checksum, is a different beast than video transfer over BT.
tl;dr -- It's a novel idea to prevent detection of the PT-server by tunneling in some other traffic. --leeroy
Hey Leeroy,
On your last point: yeah, a traffic capture followed by TCP packet reconstruction, and thus reconstruction of the bittorrent messages and a check against the original checksums of the pieces (as specified in the torrent file), will show that a connection was not genuine (very likely it was bit-smuggler), since failed checksums are probably a rare occurrence in nature.
"Suppose then that you, the PT-server, do participate in the swarm. Long transfers with peers who provide hash-failing pieces breaks BT spec. The adversary just needs to force peer list rotation. How can this be done? Well, the adversary knows the infohash and the peer list to expect. So, flip-bit, as you put it. Only do it for all peers who cross the country-firewall. If the client is indeed running a bitorrent client sit back and watch the churn. Only something stands out. There's a peer, you, the PT-server, who is ignoring the ban fingerprint. This can be done in either direction of piece share. Because you the the PT-user differ from the spec you stand out."
Not 100% sure I understand what you mean here. Are you suggesting an attack that involves tampering with or sending Peer Exchange messages that say a certain peer should be banned, which the bit-smuggler-owned peers then just ignore?
I think you are right if you are saying that messing up the swarm with a strategy like that or something else, thus disrupting the communication between PT server and client, would with the current implementation trigger the client to cross swarms and seek to connect to the server through another swarm, and that this behavior may be a give-away with real-time results.
I think you make valid points. In general I found it hard to make bittorrent do what you want, and I'm not confident about the current swarm handling design; that's why I am asking for opinions on whether it can be improved or whether it's not fit to be used as a PT.
On the issue of broken checksums, there is no solution for real-time communication if you want an adversary to be unable to infer that a bit-smuggler connection is hiding behind a bittorrent one (in the same way that it's unable to detect a steganographically hidden message inside a picture). An option may be to use encrypted bittorrent (yep, it has one). I'm guessing that encrypted bittorrent connections are rarely used though. An adversary may simply choose to ban this type of connection without causing much disruption.
On Wed, Mar 4, 2015 at 3:37 PM, l.m ter.one.leeboi@hush.com wrote:
It's a mistake to say that if something doesn't work in China (or any other single concrete threat environment), then it's useless.
Out of respect for the work you've done I'm not going to assume you're taking typed-word out of context incorrectly.
I'm concerned that this PT exchanges one threat for another and is thought to be a good for integration with Tor. It's one thing to use Google/Azure/etc where there are legitimate uses. It's another to trade the threat of secure-encrypted traffic (with crypto-secure PRNG in most PT cases) for another option that utilizes insecure obfuscation of file transfer together with a server that utilizes (presumably) secure communication. In one you increase vulnerability surface of the censored user, in the other the only threat is this unknown communication which can be easily blocked. I digress.
Allow me to attack the problem head on then. What do I know about bitorrent. Not much. I know, for any user of bitorrent, the infohash is easily derivable and so is the peer list. So, if you don't participate in the swarm intersection of peer lists means it's less difficult to find the needle-in-the-haystack that is the PT-server. Just look at the unique peers across multiple users of the PT who each create unique torrent swarm fingerprints corresponding to the infohash of the files shared. You, the PT-server, must participate in the swarm.
Suppose then that you, the PT-server, do participate in the swarm. Long transfers with peers who provide hash-failing pieces breaks BT spec. The adversary just needs to force peer list rotation. How can this be done? Well, the adversary knows the infohash and the peer list to expect. So, flip-bit, as you put it. Only do it for all peers who cross the country-firewall. If the client is indeed running a bitorrent client sit back and watch the churn. Only something stands out. There's a peer, you, the PT-server, who is ignoring the ban fingerprint. This can be done in either direction of piece share. Because you the the PT-user differ from the spec you stand out.
Another case. The adversary can monitor the bitfield of the peer connected to the PT-server. When the torrent is complete the client will disconnect from all peers and take the seed role. Only there's a problem. They're still transferring data with the PT-server as if they were a leech. It's not enough to change torrent swarms because it would be immediately apparent that they re-establish communication with the PT-server, crossing swarms.
A final thought. It's one thing for an adversary to not be able to attack a communication besides blocking it entirely. This would be the case with crypto-secure communications. BitTorrent doesn't fall into this category, especially when facing a state-level adversary. So your PT communication would need to be crypto secure (not saying it's not). The caveat is that if one were to pack encrypted data within BT-spec obfuscation, that obfuscation had better never fail. If it did, the user of the PT could be proven to be hiding data via steganography in hash-failing pieces (as you've mentioned). This can provide justification for an accusation of a state offense. This would be different from packing data where no hash fail is apparent, such as regular steganography, minus BitTorrent. Video streaming or audio streaming combined with data hiding, and without any checksum, is a different beast than video transfer over BT.
tl;dr -- It's a novel idea to prevent detection of the PT-server by tunneling in some other traffic.
--leeroy
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Good point about joining the swarm. This is a part of the design that I'm not confident about; it's definitely questionable.
Suppose a non-bitsmuggler peer joins the swarm. If he starts torrenting the file, he will get a correct copy (no checksum fails on the pieces) of it because all bitsmuggler parties hold a full correct copy of it to begin with.
ODDITIES
1. *File content*. However, as described in the docs, at the moment those files are just random data, generated with a pseudo-random generator using an integer seed. So an entropy analysis of the file may be a giveaway (the fact that it doesn't look like anything really), and so may the fact that right now the percentage of the file that is available is fixed (1/2).
A solution here is to use real existing files, but this involves pre-downloading them (fetch Pirates of the Caribbean 3 first and then torrent it again with your bit-smuggler server). How much of the file is available can be randomized.
2. *Contact files*. Another aspect is that the server now works by advertising a set of so-called contact files. Those contact files are bittorrent files that a client needs to start downloading in order to tunnel a bit-smuggler connection through them. They are partially completed files (1/2 of the pieces are there). Once they are depleted (all downloadable pieces have been downloaded), the file is removed and a new partial copy is put in its place, so that new peer connections on that contact file have plenty of data flowing back and forth.
The fact that the server keeps refreshing its files is odd.
These files are part of the server descriptor.
Another way of doing it could be deciding the contact files dynamically. You could maybe have a small exchange at the very beginning between the server and the client through some other channel and steg some data in there.
Possible ways:
* the client can make a DHT request for the server, and the server would reply with a set of nodes, but the data in the reply contains information about which contact file the client should use, so it is not a correct DHT query response.
* use the bitfield message of bittorrent to do a request-response sequence between the bitsmuggler server and client about which contact file to use, and then switch to it.
3. *Upload slots per torrent = 1*. The client and server instruct their bittorrent clients to upload to a single peer. Basically I'm restricting swarms to a size of 2 to load balance things. If I disable this it would just mean the file gets depleted faster.
So actually, given this setting, an outsider joining a swarm where a bit-smuggler server and client live would not actually be able to download.
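For oddity 1, the entropy analysis can be sketched as follows. This is a rough illustration, not the project's code: `shannon_entropy` is a hypothetical helper, and `os.urandom` stands in for the seeded generator that produces the contact files.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Stand-in for a chunk of a randomly generated contact file.
random_chunk = os.urandom(1 << 16)
# Stand-in for highly structured content.
structured_chunk = b"RIFF" * (1 << 14)

random_bits = shannon_entropy(random_chunk)
structured_bits = shannon_entropy(structured_chunk)
```

Random data sits very close to 8 bits per byte. Compressed media does too, so entropy alone is a weak signal; the stronger giveaway is likely the absence of any recognizable container structure, which is what switching to real files would address.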
ABOUT BITTORRENT
On bittorrent: its wire protocol is not very complicated. Traditionally it runs over a TCP connection, starts with a handshake (containing, among other fields, the infohash (the ID) of the file transferred) and continues with length-prefixed messages, which are mostly piece requests and piece data messages (the ones where I embed the payload), plus some control messages to control the data flow (choke, unchoke, interested).
You can probably yank the handshake out with a regex at the IP level without any packet reconstruction. For the rest, I guess you need to parse the stream at the application level to make sense of it.
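As a rough sketch of that framing, the following parser splits a reassembled byte stream into the handshake and the messages after it. The field layout and message IDs follow the spec linked in this message; `parse_handshake` and `iter_messages` are hypothetical names, and the sample stream is made up.

```python
import struct

# Message IDs from the BitTorrent peer wire protocol.
MESSAGE_NAMES = {
    0: "choke", 1: "unchoke", 2: "interested", 3: "not interested",
    4: "have", 5: "bitfield", 6: "request", 7: "piece", 8: "cancel",
}


def parse_handshake(stream: bytes):
    """Split off the handshake; return (info_hash, peer_id, remaining bytes)."""
    pstrlen = stream[0]
    if stream[1:1 + pstrlen] != b"BitTorrent protocol":
        raise ValueError("not a BitTorrent handshake")
    start = 1 + pstrlen + 8  # skip the 8 reserved bytes
    info_hash = stream[start:start + 20]
    peer_id = stream[start + 20:start + 40]
    return info_hash, peer_id, stream[start + 40:]


def iter_messages(stream: bytes):
    """Yield (name, payload) for each complete length-prefixed message."""
    offset = 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        if offset + 4 + length > len(stream):
            break  # incomplete message, wait for more data
        if length == 0:
            yield "keep-alive", b""
        else:
            msg_id = stream[offset + 4]
            payload = stream[offset + 5:offset + 4 + length]
            yield MESSAGE_NAMES.get(msg_id, f"unknown({msg_id})"), payload
        offset += 4 + length


# Made-up stream: handshake, unchoke, then a piece message carrying b"data".
handshake = bytes([19]) + b"BitTorrent protocol" + bytes(8) + b"A" * 20 + b"B" * 20
unchoke = struct.pack(">IB", 1, 1)
piece = struct.pack(">IBII", 13, 7, 0, 0) + b"data"
info_hash, peer_id, rest = parse_handshake(handshake + unchoke + piece)
messages = list(iter_messages(rest))
```

The payload of a `piece` message is the piece index, the byte offset within the piece, and then the block itself, which is where a tunnelled payload would sit.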
Having the infohash means you can fetch the torrent file.
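A minimal sketch of that regex approach, assuming the observer sees the raw TCP payload of a single packet: the handshake is 68 bytes with a fixed prefix, so it normally fits in one packet. `HANDSHAKE_RE` and `find_info_hash` are hypothetical names.

```python
import re

# 0x13 (= 19), the protocol string, 8 reserved bytes, info_hash, peer_id.
HANDSHAKE_RE = re.compile(
    rb"\x13BitTorrent protocol(.{8})(.{20})(.{20})", re.DOTALL
)


def find_info_hash(payload: bytes):
    """Return the hex infohash if the payload contains a handshake, else None."""
    match = HANDSHAKE_RE.search(payload)
    return match.group(2).hex() if match else None


# Made-up packet payload containing a handshake.
sample = bytes([19]) + b"BitTorrent protocol" + bytes(8) + b"A" * 20 + b"B" * 20
found = find_info_hash(sample)
```

`re.DOTALL` matters here because the hash and peer ID are arbitrary bytes and may contain a newline byte, which `.` would otherwise not match.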
Spec is here https://wiki.theory.org/BitTorrentSpecification
POSSIBLY BETTER DESIGN
The bitsmuggler server joins arbitrary existing swarms on the internets and somehow informs the client in which swarms to look for it. Once they find each other, they start exchanging data.
Tech limitations
A crappy thing about uTorrent's interface is that it doesn't allow you to tell it to look for a certain peer (which is what would let the bitsmuggler client tell its bittorrent process to just look for the bitsmuggler server's bittorrent process in a big swarm). So who you connect to in a swarm is arbitrary.
A solution could be to intentionally join seeder-only swarms that just sit there.
If you don't want to break the file integrity for the other peers, you had better have a copy of the file whose swarm you are joining ahead of time. But this is necessary with the current design as well.
Any suggestions/comments are very welcome. It seems to me that bittorrent is very hard to tame compared to, let's say, HTTP as a cover between a server and a client, so this might be an impairing limitation for the project.
On Mon, Mar 2, 2015 at 2:01 PM, David Fifield david@bamsoftware.com wrote:
I don't know anything about BitTorrent. What parts of the protocol are easily visible to the censor, without expensive reconstruction? I guess it includes at least: file names, file sizes, peer IP addresses.
About active probing: it's true that if the censor probes you, you look like a BitTorrent client. Is there anything weird about how you use the protocol that could make you stand out anyway? At https://github.com/danoctavian/bit-smuggler/blob/master/README.md#security , you say that a network monitor would have to reconstruct a stream in order to detect anomalies. Could a censor acting as an ordinary peer detect them more easily, just by participating in the file transfer? (I'm thinking of how the movie studios would run their own BitTorrent clients in order to find other downloaders.)
David Fifield
Hi Dan. Very cool. Would you like some analysis of how well your pluggable transport mimics real BitTorrent traffic?
I don't have time to install bitsmuggler myself right now as I am currently at a conference. However, if you send me a .pcap file recorded with tcpdump or Wireshark of bitsmuggler traffic, I will test it against BitTorrent traffic using the Adversary Labs tools I have been developing.
By the way, George is on my committee as well!
Hi Brandon,
Yeah that would be great, thanks! I'll do the packet capture when i get back home from work.
Ah nice! Have fun at the conference!