for the distribution of computation tasks on the machines, we need to solve the "bin packing problem" -- we have n tasks each of them needs k_i CPUs, and we want to fit them on computing nodes where each node contains m CPUs (depending on the machine we use m can be 24, 40, 64 or 80). Obviously, we would like to reduce the number of nodes we use (we have a limited amount of node-hours assigned to us on each machine). Currently, the task is done using either manual distribution or https://pypi.org/project/binpacking which provides a "greedy" solution, which is not always optimal.
It would be nice to have an optimal solution for the small cases that interest us (n <= 100, k_i <= m <= 100). You can find some links to exact algorithms in https://en.wikipedia.org/wiki/Bin_packing_problem#Exact_algorithm, but maybe there are also separate algorithms, that are practical for these small cases. it is best for me if you can find or write implementation in python, but any programming language would be ok for that.