Tor-stem in python script - HTTP requests issue - tor-dev

1 Aug 2013


Hello guys,
I have a problem that I currently can't solve, so here I am ;) I have a
Python script that uses python-stem to launch and manage a Tor instance (on
a defined port). The script retrieves a web page (HTTP GET) and submits
information (HTTP POST). Basically I use Tor because I need to test this
server from different IP addresses, with several requests in parallel. I
also keep track of cookies. Here's a sample of the code I use, based on the
example on the stem website
https://stem.torproject.org/tutorials/to_russia_with_love.html (to get more
parallel requests, I launch the script many times with a different
SOCKS_PORT value):
----------------------------
import socket, socks, stem.process
import mechanize, cookielib
SOCKS_PORT = 9000
DATA_DIRECTORY = "TOR_%s" % SOCKS_PORT
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', SOCKS_PORT)
socket.socket = socks.socksocket
tor_process = stem.process.launch_tor_with_config(
          config = {
            'SocksPort': str(SOCKS_PORT),
            'ControlPort': str(SOCKS_PORT+1),
            'DataDirectory': DATA_DIRECTORY,
            'ExitNodes': '{it}',
          },
        )
# initialize python mechanize, with cookies (it works exactly like urllib2,
# urllib3, etc. already tried...)
br = mechanize.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
...
for number in num_list:
  req = br.open_novisit("http://example.com") #_1_
  res = req.read()
  print res
  req.close()
  req2 = br.open("http://example.com/post_to_me", data_to_post) #_2_
  res2 = req2.read()
  req2.close()
--------------------------------
And that's it. The problem occurs on the lines I marked as _1_ and _2_:
after roughly 200 requests the script seems to block indefinitely, waiting
for a response that never comes. Of course, Wireshark doesn't help because
the traffic is encrypted. The same code, without Tor, works perfectly. So
why does it get stuck at about 200 requests!? I tried:
1. Telnetting to the control port and forcing new circuits with SIGNAL NEWNYM
2. Instantiating mechanize (urllib2, urllib3, whatever) inside the loop
3. ...I don't remember what else
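
The telnet step in item 1 can also be scripted. What follows is only a minimal sketch using the standard library, assuming the ControlPort from the script above (SOCKS_PORT+1, i.e. 9001) and no control password configured. `send_newnym` is a hypothetical helper name, not part of stem. As far as the control protocol describes it, NEWNYM only makes new connections use fresh circuits, so it may not unblock a request that is already waiting.

```python
import socket


def send_newnym(host="127.0.0.1", port=9001, password="", timeout=10.0):
    """Send AUTHENTICATE and SIGNAL NEWNYM to a tor control port.

    Returns True only if tor answers both commands with a 250 status.
    Assumption: the password (if any) contains no double quotes.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    reader = sock.makefile("rb")
    try:
        for command in ('AUTHENTICATE "%s"' % password, "SIGNAL NEWNYM"):
            sock.sendall((command + "\r\n").encode("ascii"))
            reply = reader.readline().decode("ascii", "replace")
            if not reply.startswith("250"):
                # Any other status (or a closed connection) counts as failure.
                return False
        return True
    finally:
        reader.close()
        sock.close()
```

Calling `send_newnym(port=SOCKS_PORT + 1)` between batches of requests would be the scripted equivalent of the manual telnet session.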
I thought it could be a local socket connection limit: without Tor, I see
in Wireshark that the source port changes every time a request is made. But
I don't know whether the problem is reusing the same source port every time
(I don't think so). If it is, should I close the current socket and open a
new one? Should I kill the tor process? I can't explain why this happens...
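
One way to tell a stalled request from a slow one is to put a timeout on it, so the call fails instead of blocking forever. This is a minimal sketch, not a fix: it assumes the HTTP library creates its sockets after the default timeout is set and lets `socket.timeout` propagate. `call_with_timeout` is a hypothetical helper; some libraries wrap the timeout in their own exception (urllib2, for example, can raise URLError), which is what the `timeout_errors` parameter is for.

```python
import socket


def call_with_timeout(fetch, timeout=30.0, timeout_errors=(socket.timeout,)):
    """Run fetch() with a default socket timeout; return None if it times out.

    The default timeout is process-wide, so it also affects sockets created
    by other threads while fetch() is running. It is restored afterwards.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(timeout)  # applies to sockets created from now on
    try:
        return fetch()
    except timeout_errors:
        return None
    finally:
        socket.setdefaulttimeout(previous)
```

With the script above this could be used as `call_with_timeout(lambda: br.open_novisit("http://example.com").read())`; a `None` result would then mark the point where the script currently hangs, which is where closing the connection and retrying could be tried.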
The only thing I know is: *when the script gets stuck, if I kill the Python
process (ctrl+c) and then relaunch it, it starts working again.* I've seen
that it's possible to set the value of TrackHostExitsExpire; is that useful
in my case?
Thanks in advance to whoever can help me!!
Ed