python - Data locality via many queues in Celery?
We're trying to design a distributed pipeline that crunches large numbers of data chunks in parallel. We're moving towards adopting Celery, and one of our requirements is the ability to map jobs to specific nodes in the cluster, e.g. when only one node has access to a given data chunk.
The first answer that comes to mind is multiple queues, potentially one queue per node, for a fairly large (~64) number of nodes. Is that feasible, and efficient? Are Celery queues lightweight enough for this? Is there a better way?
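For concreteness, here is a minimal sketch of the per-node-queue idea, assuming queues are named after hostnames; the broker URL, task body, and the `node_hostname` lookup are placeholders, not a definitive implementation. Each worker consumes only from its own node's queue, and the dispatcher routes every chunk to the queue of the node that already holds it.

```python
from celery import Celery

# Placeholder broker URL; swap in your own RabbitMQ/Redis instance.
app = Celery('pipeline', broker='amqp://guest@broker-host//')

@app.task
def crunch_chunk(chunk_id):
    # Process a chunk that lives on this node's local storage.
    ...

def dispatch(chunk_id, node_hostname):
    # Route the job to the per-node queue (e.g. "node01").
    # Celery creates missing queues on demand by default.
    crunch_chunk.apply_async(args=(chunk_id,), queue=node_hostname)

# On each of the ~64 nodes, start a worker bound only to its own queue:
#   celery -A pipeline worker -Q $(hostname)
```

Queues in AMQP brokers are cheap, so ~64 of them is not a lot; the main cost is operational (keeping the queue-to-node mapping in sync).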
The best answer I've found to date is here:
Is Celery appropriate for use with many small, distributed systems?
which suggests that Celery is indeed a fit for our use case. Perhaps I'll update this again once we've implemented it.