Multiprocessing in Python with a large number of processes but a limited number of CPUs -
I have a large number of data files that need to be processed through a function A. Say there are 1000 files; processing each file takes less than 15 minutes and uses up to 6 GB of memory. My computer has 32 GB of RAM and 8 CPUs, so for safety I can run at most 4 processes at a time (24 GB of memory, 4 CPUs). My question: can I use the multiprocessing package in Python to create 4 processes, where each process continuously runs the function to process data files independently, as in the figure below? Each CPU would process approximately 250 files, although the 1000 files differ in size, so that is not exactly true. One note: once a process finishes, it should be assigned a new job immediately, regardless of whether the other processes have finished, i.e. there should be no waiting for all 4 processes to finish at the same time. The return value of the function is not important here. Please provide code! Thanks for any suggestions.
I think the best solution is to use multiprocessing.Pool
. It makes it easy to set up a pool of worker processes (as many as you specify) and feed them jobs in parallel. Here's some basic example code:
import multiprocessing as mp

def handle_file(filename):
    # do the processing here
    ...

def process_files(list_of_files):
    pool = mp.Pool(4)  # the argument is the number of processes; it defaults to the number of CPUs
    pool.map(handle_file, list_of_files)  # returns a list of results, which you can ignore
This code is a little slower than strictly necessary, since it passes the results of the function calls back to the parent process (even if the return values are all None), but I suspect the overhead is relatively small if the processing tasks take a significant amount of time.
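If you want to see progress as files complete, and to confirm that each worker picks up a new file as soon as it finishes its previous one, Pool.imap_unordered yields results in completion order rather than submission order. A minimal sketch, where handle_file is a hypothetical stand-in for the real per-file processing:

```python
import multiprocessing as mp

def handle_file(filename):
    # Stand-in for the real per-file work; returns the filename so
    # the parent can report progress.
    return filename

def process_files(list_of_files, n_workers=4):
    # imap_unordered hands each worker a new item as soon as it
    # finishes its previous one, and yields results as they complete,
    # so a few slow files don't hold up the others.
    results = []
    with mp.Pool(n_workers) as pool:
        for done in pool.imap_unordered(handle_file, list_of_files):
            results.append(done)
    return results

if __name__ == "__main__":
    finished = process_files(["file_%d.dat" % i for i in range(10)])
    print("processed %d files" % len(finished))
```

The with-statement closes the pool and waits for the workers when the loop ends, and the `if __name__ == "__main__"` guard is required on platforms that spawn rather than fork worker processes.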