multithreading - Learning Python and threading. I think my code runs infinitely. Help me find bugs?


So I've started learning Python now, and I'm absolutely in love with it.

I'm building a small-scale Facebook data scraper. Basically, it uses the Graph API to scrape the first names of a specified number of users. It works fine in a single thread (or no thread, I guess).

I used online tutorials to come up with the following multithreaded version (updated code):

    import requests
    import json
    import time
    import threading
    import Queue

    graphurl = 'http://graph.facebook.com/'
    first_names = {} # stores first names and their counts
    queue = Queue.Queue()

    def getoneuser(url):
        http_response = requests.get(url) # open the request URL
        if http_response.status_code == 200:
            data = http_response.text.encode('utf-8', 'ignore') # get the text of the response and encode it
            json_obj = json.loads(data) # load it as a JSON object
            # name = json_obj['name']
            return json_obj['first_name']
            # last = json_obj['last_name']
        return None

    class threadget(threading.Thread):
        """ threaded name scraper """
        def __init__(self, queue):
            threading.Thread.__init__(self)
            self.queue = queue

        def run(self):
            while True:
                #print 'thread started\n'
                url = graphurl + str(self.queue.get())
                first = getoneuser(url) # get one user's first name
                if first is not None:
                    if first_names.has_key(first): # if the name has been encountered before
                        first_names[first] = first_names[first] + 1 # increment the count
                    else:
                        first_names[first] = 1 # add the new name
                self.queue.task_done()
                #print 'thread ended\n'

    def main():
        start = time.time()
        for i in range(6):
            t = threadget(queue)
            t.setDaemon(True)
            t.start()

        for i in range(100):
            queue.put(i)

        queue.join()

        for name in first_names.keys():
            print name + ': ' + str(first_names[name])

        print '----------------------------------------------------------------'
        print '================================================================'
        # print top first names
        for key in first_names.keys():
            if first_names[key] > 2:
                print key + ': ' + str(first_names[key])

        print 'it took ' + str(time.time()-start) + 's'

    main()

To be honest, I don't understand some parts of the code, but I get the main idea. The output is nothing. I mean the shell has nothing in it, and I believe it keeps on running.

So what I am doing is filling the queue with integers that are the user IDs on FB. Each ID is then used to build the API call URL. getoneuser returns the name of one user at a time. That task (ID) is marked as 'done' and the thread moves on.

What is wrong with the code above?

Your original run function processed only one item from the queue. In all, you've removed only 5 items from the queue.
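
To make that concrete, here is a minimal sketch of the one-item-per-thread situation. It assumes Python 3 (so the module is `queue`, not `Queue`), and the `OneShot` class is a hypothetical stand-in for the original worker, not the OP's actual code. No network calls are made.

```python
import queue
import threading

work = queue.Queue()
for i in range(100):
    work.put(i)

class OneShot(threading.Thread):
    """Hypothetical worker whose run() handles exactly one item, then returns."""
    def __init__(self, q):
        super().__init__()
        self.q = q

    def run(self):
        self.q.get()        # take a single item ...
        self.q.task_done()  # ... mark it done, then the thread exits

threads = [OneShot(work) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

remaining = work.qsize()
print(remaining)  # 95: these items are never processed, so work.join() would never return
```

With 95 tasks left unfinished, a call to `work.join()` at this point would block indefinitely, which matches the "keeps on running" symptom.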

Usually run functions look like

    def run(self):
        while True:
            dousefulwork()

i.e. you have a loop that causes the recurring work to be done.
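
A fuller sketch of that looping pattern, again assuming Python 3. The names `Worker`, `do_useful_work` and `processed` are made up for illustration, and the "work" is just recording the item so the example runs without any network access.

```python
import queue
import threading

work = queue.Queue()
processed = []
processed_lock = threading.Lock()

def do_useful_work(item):
    """Stand-in for the real per-item work (hypothetical)."""
    with processed_lock:
        processed.append(item)

class Worker(threading.Thread):
    def __init__(self, q):
        super().__init__(daemon=True)  # daemon threads do not keep the program alive
        self.q = q

    def run(self):
        while True:                 # keep pulling items for as long as the program runs
            item = self.q.get()     # blocks while the queue is empty
            try:
                do_useful_work(item)
            finally:
                self.q.task_done()  # always balance get() so join() can return

for _ in range(5):
    Worker(work).start()
for i in range(100):
    work.put(i)

work.join()                         # returns once every item has been marked done
print(len(processed))
```

Because each worker loops, five threads are enough to drain all 100 items, and `work.join()` returns.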

[Edit] The OP has edited the code to include this change.

Some other useful things to try:

  • Add a print statement to the run function: you'll find that it is called only 5 times.
  • Remove the queue.join() call, which is what causes the module to block; you will then be able to probe the state of the queue.
  • Put the entire body of run into a function. Verify that you can use that function in a single-threaded manner to get the desired results, then
  • try it with just a single worker thread, and finally go for
  • multiple worker threads.
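
The staged approach in the last three bullets might look like the sketch below. It assumes Python 3. `fetch_first_name` is a hypothetical stub that replaces the Graph API call with made-up data, and the lock around the shared counter is my own addition rather than something from the original code.

```python
import queue
import threading
from collections import Counter

NAMES = ["ann", "bob", "cat"]  # made-up data standing in for API responses

def fetch_first_name(user_id):
    """Hypothetical stub for the network call; returns None for some IDs."""
    if user_id % 10 == 0:
        return None
    return NAMES[user_id % len(NAMES)]

def process_one(user_id, counts, lock):
    """The former body of run(): fetch one name and update the shared counts."""
    first = fetch_first_name(user_id)
    if first is not None:
        with lock:  # guard the read-modify-write on the shared counter
            counts[first] += 1

def run_single_threaded(ids):
    counts, lock = Counter(), threading.Lock()
    for user_id in ids:
        process_one(user_id, counts, lock)
    return counts

def run_with_workers(ids, num_workers):
    counts, lock, work = Counter(), threading.Lock(), queue.Queue()

    def worker():
        while True:
            user_id = work.get()
            try:
                process_one(user_id, counts, lock)
            finally:
                work.task_done()

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    for user_id in ids:
        work.put(user_id)
    work.join()
    return counts

ids = range(100)
baseline = run_single_threaded(ids)      # step 1: no threads at all
one_worker = run_with_workers(ids, 1)    # step 2: a single worker thread
many_workers = run_with_workers(ids, 6)  # step 3: several worker threads
print(baseline == one_worker == many_workers)
```

If the three results ever disagree, the difference points at the threading layer rather than at the per-item logic, which is the point of testing in stages.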
