On Mon, Dec 12, 2022 at 10:18:53PM +0100, Anders Trier Olesen wrote:
It is surprising, isn't it? It certainly feels like calling connect without first binding to an address should have the same effect as manually binding to an address and then calling connect, especially if the address you bind to is the same as the kernel would have chosen automatically. It seems like it might be a bug, but I'm not qualified to judge that.
Yes, I'm starting to think so too. And strange that Cloudflare doesn't mention stumbling upon this problem in their blogpost on running out of ephemeral ports. [1]
[1] https://blog.cloudflare.com/how-to-stop-running-out-of-ephemeral-ports-and-s...
If I find the time, I'll make an attempt at understanding exactly what is going on in the kernel.
Cloudflare has another blog post today that gets into this topic.
https://blog.cloudflare.com/the-quantum-state-of-a-tcp-port/
It investigates the difference in behavior between inet_csk_bind_conflict and __inet_hash_connect that I commented on at https://forum.torproject.net/t/tor-relays-inet-csk-bind-conflict/5757/13 and https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowfla.... Setting the IP_BIND_ADDRESS_NO_PORT option leads to __inet_hash_connect; not setting it leads to inet_csk_bind_conflict.
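For anyone who wants to exercise the two code paths from userspace, here is a minimal Python sketch (Linux only). The helper name connect_from is made up for illustration, and the fallback value 24 is the constant from <linux/in.h>, used in case the socket module does not export IP_BIND_ADDRESS_NO_PORT. It only shows where the option goes relative to bind and connect; it does not reproduce the port exhaustion itself.

```python
import socket

# Assumption: on Linux, IP_BIND_ADDRESS_NO_PORT is 24 (<linux/in.h>).
# Fall back to that value if this Python build does not export the name.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(source_ip, dest, no_port=True):
    """Bind to source_ip with port 0, then connect to dest.

    With no_port=True the option is set before bind(), so the kernel
    defers choosing the source port until connect().
    With no_port=False the port is chosen at bind() time instead.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if no_port:
            s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        s.bind((source_ip, 0))
        s.connect(dest)
    except OSError:
        s.close()
        raise
    return s
```

Calling connect_from("192.0.2.1", (host, port)) with your own source address should take the deferred path; passing no_port=False should take the bind-time path.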
The author attributes the difference in behavior to the fastreuse field in the bind hash bucket:
The bucket might already exist or we might have to create it first. But once it exists, its fastreuse field is in one of three possible states: -1, 0, or +1.
…
…inet_csk_get_port() skips conflict check for fastreuse == 1 buckets.
…__inet_hash_connect() skips buckets with fastreuse != -1.
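To make the asymmetry in those two quoted rules easier to see, here is a toy Python model. It encodes only the two sentences quoted above and nothing else; the function names are my own, and the real kernel functions do considerably more checking than this.

```python
# Toy model of the two quoted rules only; not the kernel's actual logic.
# The function names below are invented for illustration.

FASTREUSE_STATES = (-1, 0, 1)


def bind_skips_conflict_check(fastreuse):
    # Quoted rule: inet_csk_get_port() skips the conflict check
    # for buckets with fastreuse == 1.
    return fastreuse == 1


def connect_skips_bucket(fastreuse):
    # Quoted rule: __inet_hash_connect() skips buckets
    # with fastreuse != -1.
    return fastreuse != -1


def summarize():
    # One row per state: (fastreuse, bind skips check?, connect skips bucket?)
    return [
        (f, bind_skips_conflict_check(f), connect_skips_bucket(f))
        for f in FASTREUSE_STATES
    ]


if __name__ == "__main__":
    for row in summarize():
        print("fastreuse=%+d bind_skips_check=%s connect_skips_bucket=%s" % row)
```

In this model the only state in which the connect path will look at an existing bucket at all is -1, while the bind path's shortcut applies only at +1, so the two paths never take their shortcut on the same bucket state.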