Multiprocessing in Python with a large number of jobs but a limited number of CPUs


I have a large number of data files that need to be processed through a function A. Say there are 1000 files, and processing each file takes less than 15 minutes and up to 6 GB of memory. My computer has 32 GB of RAM and 8 CPUs, so for safety I can use at most 4 processes at a time (24 GB of memory and 4 CPUs). My question is: can I use the multiprocessing package in Python to create 4 processes, each of which continuously runs the function to process a data file independently, as in the figure below? Each CPU would process approximately 250 files, although since the sizes of the 1000 files differ this is not exactly true. One note: once a process finishes, it should be assigned a new job regardless of whether the other processes have finished, i.e. there should be no waiting for all 4 processes to finish at the same time. The return value of the function is not important here. Please provide code! Thanks for any suggestions.

[figure: diagram of 4 processes, each continuously processing data files one after another]

I think the best solution is to use multiprocessing.Pool. It makes it easy to set up a pool of worker processes (as many as you specify) and feed them jobs in parallel. Here's some basic example code:

import multiprocessing as mp

def handle_file(filename):
    # do the per-file processing here
    ...

def process_files(list_of_files):
    pool = mp.Pool(4)  # the argument is the number of processes; it defaults to the number of CPUs
    pool.map(handle_file, list_of_files)  # returns a list of results, which you can ignore
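
For completeness, here is a minimal sketch of how you might call this, assuming the data files can be collected with glob (the "data/*.dat" pattern is a placeholder for wherever your files actually live). The if __name__ == "__main__" guard is required on platforms that spawn rather than fork worker processes, such as Windows:

import glob

if __name__ == "__main__":
    files = glob.glob("data/*.dat")  # placeholder pattern; point this at your data files
    process_files(files)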

This code is a little slower than strictly necessary, since it passes the results of the function calls back to the parent process (even when the return values are all None), but I suspect the overhead is relatively small if the processing tasks take a significant amount of time.
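
One caveat, since you said the file sizes differ and a finished worker should immediately pick up the next file: pool.map splits the input into chunks up front by default, so one worker can end up stuck with a chunk of large files while the others sit idle. A sketch of one way around that, using imap_unordered with a chunk size of one so files are handed out individually as workers free up (untested against your workload):

def process_files_one_at_a_time(list_of_files):
    pool = mp.Pool(4)
    # chunksize=1 hands out a single file per task, so a worker that
    # finishes early immediately receives the next file
    for _ in pool.imap_unordered(handle_file, list_of_files, chunksize=1):
        pass  # the return values are ignored
    pool.close()
    pool.join()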

