Hi o/,
During the Tor Operator Meetup I asked about Intel QuickAssist Technology (QAT) support and was asked to bring it to the tor-relays mailing list so the network team can take a look at the question.
In 2025 we're going to build one or more new servers, and we're looking into optimizing the performance-per-watt ratio since some of our current servers are rather power hungry ;-).
I'm wondering whether QAT works with Tor to offload compression, hashing and encryption. In theory, given the nature of Tor (a lot of encryption), this could result in a huge performance boost of 100-300% (based on other hashing, cryptographic and compression offload benchmarks). Support for QAT has also improved considerably over the years, so many programs/workloads already work nicely with it, but I'm not sure about Tor.
It looks like Tor uses [1] RSA-1024, AES-CBC, AES-CTR, Curve25519, Ed25519, SHA1, AES256, SHA3-256. Most of these (all except Curve25519 and Ed25519) should in theory also work with QAT [2], although I guess only a few would impact performance significantly when offloaded. But the question is: does it really work? If not, what would be needed to make it work? Are there Tor operators who already use QAT? Does the Network Team have some insight into this? :)
Some of the potential advantages when comparing a similar amount of traffic:
- Lower power consumption (much cheaper to run in expensive European countries).
- Fewer CPU cycles required (= cheaper CPUs).
- Less heat/cooling required (easier to put in distribution boxes and other small places).
- Smaller physical footprint (easier to put in distribution boxes and other small places).
- Alleviates some of the issues and challenges caused by Tor's single-threaded architecture by effectively increasing bandwidth per CPU core considerably.
With regards,
tornth
[1] https://spec.torproject.org/tor-spec/preliminaries.html?highlight=cipher#cip...
[2] https://www.intel.com/content/www/us/en/support/articles/000093843/technolog...
Hi
I think Tor uses libraries for these functions. If the libraries in question support QAT, then it should accelerate Tor too. If they don't, then maybe they could be replaced by libraries that do.
In the past, I've seen programs offer configuration options to activate a library's hardware acceleration, or rely on environment variables or other controls provided by the library.
Cheers Andreas
On Saturday, June 22, 2024 23:14 CEST, mail--- via tor-relays tor-relays@lists.torproject.org wrote:
[...]
Excerpts from mail--- via tor-relays's message of June 22, 2024 5:14 pm:
[...]
I previously answered this at https://lists.torproject.org/pipermail/tor-relays/2022-April/020495.html. In principle, it should work if you set HardwareAccel 1. However, based on my profiling, the actual AES encryption doesn't use that much CPU when using regular AES instructions. I couldn't find any independent QAT benchmarks from an internet search, but https://calomel.org/aesni_ssl_performance.html says AES-NI can reach over 1 GB/s per core, which is far more than Tor can use.
Cheers, Alex.
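The argument above can be made concrete with a rough back-of-the-envelope calculation. The sketch below is only an illustration: the function name is hypothetical, the 1000 MB/s default mirrors the "over 1 GB/s per core" AES-NI figure quoted above, and the number of AES passes per relayed byte (TLS decrypt, one relay-crypto layer, TLS encrypt) is an assumption, not a measured property of Tor.

```python
def aes_core_fraction(relay_mbit_s, aes_mbyte_s_per_core=1000.0, aes_passes=3):
    """Estimate the fraction of one CPU core spent on AES at a given relay throughput.

    relay_mbit_s: traffic relayed, in Mbit/s.
    aes_mbyte_s_per_core: AES throughput of one core in MB/s (assumed; 1000
        mirrors the AES-NI figure quoted in the thread).
    aes_passes: assumed number of AES passes per relayed byte (hypothetical:
        TLS decrypt, one relay-crypto layer, TLS encrypt).
    """
    relay_mbyte_s = relay_mbit_s / 8.0  # convert Mbit/s to MB/s
    return relay_mbyte_s * aes_passes / aes_mbyte_s_per_core


if __name__ == "__main__":
    for mbit in (100, 400, 1000):
        share = aes_core_fraction(mbit)
        print(f"{mbit:5d} Mbit/s -> about {share:.1%} of one core in AES")
```

Under these assumptions, AES would account for only a modest share of a core even at several hundred Mbit/s, which is consistent with the profiling remark. On slower cores the per-core AES figure should be replaced with a measured value.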
Hi,
Thanks for your reply Alex! That mailing list thread is great and contains quite some relevant pointers.
Some of my current Tor hardware (Intel C3958) is actually QAT compatible (it's even the one mentioned), so based on the information in the thread I enabled the following as an experiment:
- Kernel TLS
- Kernel TLS for AES-CBC
- QAT kernel module
- QAT itself
- HardwareAccel 1 in torrc
I'll monitor the differences, although I doubt it will be this simple. I also have to look into how I can even verify that QAT and/or kTLS are being used. It's new territory for me :). It looks like stats are available in sysctl, at least for kTLS.
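One possible way to check whether kTLS is actually in use is to snapshot the kTLS sysctl counters twice while the relay is pushing traffic and see which ones moved. This is a minimal sketch, not a tested procedure: the `kern.ipc.tls` subtree name is an assumption based on FreeBSD's ktls(4), the counters underneath it vary by release, and the helper names are made up for illustration.

```python
import subprocess


def parse_sysctl(text):
    """Parse `sysctl` output lines of the form 'name: value' into a dict."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # skip lines that are not 'name: value'
        try:
            counters[name.strip()] = int(value)
        except ValueError:
            counters[name.strip()] = value.strip()  # keep non-numeric values as text
    return counters


def changed_counters(before, after):
    """Return {name: delta} for integer counters that changed between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before
        and isinstance(after[name], int)
        and isinstance(before[name], int)
        and after[name] != before[name]
    }


def read_ktls_sysctls(subtree="kern.ipc.tls"):
    """Run `sysctl <subtree>` (assumed FreeBSD kTLS subtree) and parse its output."""
    out = subprocess.run(["sysctl", subtree], capture_output=True, text=True, check=True)
    return parse_sysctl(out.stdout)
```

Calling `read_ktls_sysctls()` a minute apart and passing both results to `changed_counters` would show which counters are increasing. If none of them change while Tor is relaying traffic, kTLS is probably not being used for those connections.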
By the way, it seems that support for OpenSSL 3.0 has improved greatly between the previous mailing list thread and this one.
Also, about RSA-1024: Intel's documentation says it's "Opt-in" [1]. Any idea how one can opt in? I can't find any sysctl parameters for this in FreeBSD.
Cheers and thanks again,
tornth
[1] https://www.intel.com/content/www/us/en/support/articles/000093843/technolog...
Jun 24, 2024, 21:53 by tor-relays@lists.torproject.org:
[...]
> Some of my current Tor hardware (Intel C3958) is actually QAT compatible (it's the one mentioned even)

Just for the record, Intel makes QAT PCIe adapter cards [1], so one can benefit from it even if the CPU is older and doesn’t directly support QAT. They are not cheap but one can find them for less than half of the price on some well-known second hand online stores.

[1] https://www.intel.com/content/www/us/en/products/docs/network-io/ethernet/10...