What is the benefit of this approach rather than discarding low-priority requests right away in the top-half handler?
Note that a priority queue is typically implemented as a heap, which does not support efficient trimming.
Correct me if I'm wrong.
If a cell with a small effort in the queue has any chance of being selected, the optimal strategy for a legitimate client is to keep computing nonces and send as many as possible, until doing so congests its own network. If instead only the cell with the highest effort is processed, sending more than one nonce per connection gains the client nothing. We want each legitimate client to send only one nonce per connection.
As for trimming the priority queue, we don't have to use a heap. We can compress the effort into about 7 bits and store the requests in 128 arrays, one per compressed value; trimming then amounts to freeing an array. The compression can be something like a floating-point format:
(~clz(POW_NONCE) & 0x3f) << 1 | (POW_NONCE >> (126 - clz(POW_NONCE))) & 1
That is, take the number of leading zeros together with the one bit just to the right of the leftmost 1 bit, and complement the leading-zero count (within its 6 bits) to preserve the order. We can expect the number of leading zeros to be less than 64, so the result fits in 7 bits. Decrementing this value corresponds to roughly 1.3-1.5 times more work, which should be fine-grained enough.
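For concreteness, a minimal C sketch of that compression, assuming a 128-bit PoW value held in GCC/Clang's unsigned __int128, a leading-zero count built from __builtin_clzll, and that a numerically smaller value means more work (as the "decrement means more work" remark suggests); the name compress_effort() is just illustrative:

#include <stdint.h>

/* Illustrative only: map a 128-bit PoW value to a 7-bit code that
 * preserves its numeric order, as described above. */
static uint8_t
compress_effort(unsigned __int128 nonce)
{
  uint64_t hi = (uint64_t)(nonce >> 64);
  uint64_t lo = (uint64_t)nonce;
  unsigned clz;

  if (hi)
    clz = (unsigned)__builtin_clzll(hi);
  else if (lo)
    clz = 64 + (unsigned)__builtin_clzll(lo);
  else
    return 0;                 /* all-zero value: highest possible effort */

  /* The bit immediately to the right of the leftmost 1 bit. */
  unsigned mantissa = (clz >= 127) ? 0 : (unsigned)((nonce >> (126 - clz)) & 1);

  /* Complement the leading-zero count within 6 bits so that a larger
   * value (less work) gets a larger code.  clz is expected to stay
   * below 64, so the whole code fits in 7 bits. */
  return (uint8_t)(((~clz & 0x3f) << 1) | mantissa);
}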
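And a sketch of the 128-array structure on top of it (again illustrative: the request type, the malloc'd storage, and the function names are made up; the point is only that dropping the lowest-effort requests is just freeing one bucket):

#include <stdint.h>
#include <stdlib.h>

#define N_BUCKETS 128   /* one bucket per 7-bit compressed effort */

typedef struct request_t {
  struct request_t *next;
  /* ... the queued request itself ... */
} request_t;

/* Bucket 0 holds the highest-effort requests in this sketch. */
static request_t *buckets[N_BUCKETS];

/* Enqueue a request under its compressed effort code
 * (e.g. the compress_effort() sketch in this thread). */
static void
enqueue_request(request_t *req, uint8_t code)
{
  req->next = buckets[code];
  buckets[code] = req;
}

/* Dequeue the highest-effort pending request, or NULL if empty. */
static request_t *
dequeue_best(void)
{
  for (unsigned i = 0; i < N_BUCKETS; i++) {
    if (buckets[i]) {
      request_t *req = buckets[i];
      buckets[i] = req->next;
      return req;
    }
  }
  return NULL;
}

/* Trim under memory pressure: drop the whole lowest-effort bucket. */
static void
trim_worst(void)
{
  for (int i = N_BUCKETS - 1; i >= 0; i--) {
    if (buckets[i]) {
      request_t *req = buckets[i];
      while (req) {
        request_t *next = req->next;
        free(req);
        req = next;
      }
      buckets[i] = NULL;
      return;
    }
  }
}

Picking the best request is a scan over at most 128 buckets and trimming costs only the size of the discarded bucket, so both stay cheap compared to rebuilding a heap.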