On Sat, Aug 22, 2015 at 12:43 AM, Yawning Angel <yawning@schwanenlied.me> wrote:
On Fri, 21 Aug 2015 17:51:20 -0700
Kevin P Dyer <kpdyer@gmail.com> wrote:

> On Wed, Aug 19, 2015 at 11:58 AM, Yawning Angel
> <yawning@schwanenlied.me> wrote:
>
> > [snip]
> >
> > The FTE semantic attack they presented isn't the easiest one I know
> > of (the GET request as defined by the regex is pathologically
> > malformed).
> >
>
> Very interesting! This is news to me. I'm assuming I did something
> silly. (Even though I tested it against bro, wireshark, etc.)

Huh. I brought it up in conversation with a few people and was under
the impression it was passed on.  I probably should have e-mailed you
about it or something.

> How is it pathologically malformed?

 "manual-http-request": {
   "regex": "^GET\\ \\/([a-zA-Z0-9\\.\\/]*) HTTP/1\\.1\\r\\n\\r\\n$"
 },

No "Host" header.  All compliant HTTP/1.1 requests MUST include one per
RFC 2616, and all compliant servers MUST respond with a 400 if it is
missing.
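
To make the point concrete, here is a minimal Python sketch, not taken from the FTE code base. It assumes the regex above is used as-is after JSON unescaping; `has_host_header` and the sample requests are hypothetical names made up for illustration. It shows that a string accepted by the regex cannot carry a Host header, because the regex allows nothing between the request line and the terminating blank line.

```python
import re

# Regex from the "manual-http-request" entry above, after JSON unescaping.
FTE_REQUEST = re.compile(r"^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$")


def has_host_header(raw: str) -> bool:
    """Return True if the request head contains a Host header field.

    Hypothetical helper: a real monitor would use a proper HTTP parser.
    """
    head = raw.split("\r\n\r\n", 1)[0]
    header_lines = head.split("\r\n")[1:]  # skip the request line
    return any(
        line.split(":", 1)[0].strip().lower() == "host" for line in header_lines
    )


# Made-up sample requests for illustration only.
fte_style = "GET /index.html HTTP/1.1\r\n\r\n"
compliant = "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"

print(FTE_REQUEST.match(fte_style) is not None)  # regex accepts the bare request
print(has_host_header(fte_style))                # but it has no Host header
print(FTE_REQUEST.match(compliant) is not None)  # a request with Host is rejected
print(has_host_header(compliant))
```

Whether the absence of Host is a usable distinguisher in practice depends on how much legitimate traffic also omits it.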

Ah, gotcha. It's not RFC compliant. RFC 2616 was published in 1999, and there are tons of HTTP-like implementations since then that, ostensibly, don't need to follow it (e.g., an HTTP-like client/server pair that only talk to each other). A network monitor must deal with these cases too, since they still advertise HTTP/1.1 in their headers.

This paper [1] is a bit dated (2007), but my intuition is that real-world implementations have drifted even further from the RFC over the last 8 years. I swear there's a more recent paper on this topic, but I couldn't find it...
 
Since requests of that sort should invoke the error path on RFC
compliant servers, it's a really good distinguisher: legitimate
clients will not do such a thing.  Existing realistic adversaries
already have "identify 'suspicious behavior', call back to confirm"
style filtering in production, so the false positive rate can be
reduced to 0 if needed.
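
The "identify, then call back to confirm" pattern described above could be sketched roughly as follows. This is an assumption-laden illustration, not a description of any deployed system: `two_stage_filter`, the flow dictionaries, and both predicates are hypothetical, and the active probe is stubbed out as a plain callable because the thread does not say how confirmation is done.

```python
from typing import Callable, Iterable, List


def two_stage_filter(
    flows: Iterable[dict],
    is_suspicious: Callable[[dict], bool],
    confirm: Callable[[dict], bool],
) -> List[dict]:
    """Return flows flagged passively AND confirmed by an active probe."""
    blocked = []
    for flow in flows:
        if not is_suspicious(flow):
            continue  # stage 1: cheap passive check, may have false positives
        if confirm(flow):  # stage 2: call back to the destination to confirm
            blocked.append(flow)
    return blocked


# Toy, made-up flows; "is_proxy" stands in for whatever a real probe would learn.
flows = [
    {"dst": "198.51.100.7", "has_host": False, "is_proxy": True},
    {"dst": "198.51.100.8", "has_host": False, "is_proxy": False},  # odd but benign
    {"dst": "198.51.100.9", "has_host": True, "is_proxy": False},
]

blocked = two_stage_filter(
    flows,
    is_suspicious=lambda f: not f["has_host"],  # passive: missing Host header
    confirm=lambda f: f["is_proxy"],            # active: stubbed probe result
)
print([f["dst"] for f in blocked])  # ['198.51.100.7']
```

In this toy setup the second stage drops the benign flow that the passive check flagged, which is the sense in which the confirmation step trades probing cost for a lower false positive rate.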

Based on our exploration of the data, we found there's a wide range of implementations, most of which exhibit non-RFC-compliant behaviors. See Section 4 of our paper for more details. For that reason I'd be very surprised if a host-header-check could result in a 0 FP rate.

With that being said, I'll add the host-header-check to the list of experiments that we want to do for the full version of our paper. Would be interesting to learn what the data tells us.

-Kevin

[1] https://www.ideals.illinois.edu/bitstream/handle/2142/11424/Non-compliant%20and%20Proud%20A%20Case%20Study%20of%20HTTP%20Compliance.pdf