numpy - Fast calculation of Pareto front in Python


I have a set of points in 3D space and I need to find the Pareto frontier. Speed of execution is important here, and the time increases very fast as I add points to test.

The set of points looks like this:

[[0.3296170319979843, 0.0, 0.44472108843537406],
 [0.3296170319979843, 0.0, 0.44472108843537406],
 [0.32920760896951373, 0.0, 0.4440408163265306],
 [0.32920760896951373, 0.0, 0.4440408163265306],
 [0.33815192743764166, 0.0, 0.44356462585034007]]

Right now, I'm using this algorithm:

def dominates(row, candidaterow):
    return sum([row[x] >= candidaterow[x] for x in range(len(row))]) == len(row)

def simple_cull(inputpoints, dominates):
    paretopoints = set()
    candidaterownr = 0
    dominatedpoints = set()
    while True:
        candidaterow = inputpoints[candidaterownr]
        inputpoints.remove(candidaterow)
        rownr = 0
        nondominated = True
        while len(inputpoints) != 0 and rownr < len(inputpoints):
            row = inputpoints[rownr]
            if dominates(candidaterow, row):
                # If worse on all features, remove the row from the array
                inputpoints.remove(row)
                dominatedpoints.add(tuple(row))
            elif dominates(row, candidaterow):
                nondominated = False
                dominatedpoints.add(tuple(candidaterow))
                rownr += 1
            else:
                rownr += 1

        if nondominated:
            # Add the non-dominated point to the Pareto frontier
            paretopoints.add(tuple(candidaterow))

        if len(inputpoints) == 0:
            break
    return paretopoints, dominatedpoints

Found here: http://code.activestate.com/recipes/578287-multidimensional-pareto-front/
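As a quick illustration of what "dominates" means in that recipe: it uses `>=` in every coordinate, so it treats larger values as better, and ties count as domination (which is why exact duplicates end up in the dominated set). A minimal standalone check, with made-up points for illustration:

```python
def dominates(row, candidaterow):
    # row dominates candidaterow if it is at least as large in every coordinate
    return all(row[x] >= candidaterow[x] for x in range(len(row)))

a = [0.33, 0.0, 0.45]
b = [0.32, 0.0, 0.44]
print(dominates(a, b))  # True: a is >= b in every coordinate
print(dominates(b, a))  # False: b is smaller in the first coordinate
print(dominates(a, a))  # True: ties count, so duplicates dominate each other
```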

What's the fastest way to find the set of non-dominated solutions in an ensemble of solutions? Or, in short: can Python do better than this algorithm?

If you're worried about actual speed, you definitely want to use numpy (as clever algorithmic tweaks will likely have far less effect than the gains to be had from using array operations). Here are two solutions. The "dumb" solution is slower in most situations but becomes faster as the number of costs increases:

import numpy as np

def is_pareto_efficient_dumb(costs):
    """
    :param costs: An (n_points, n_costs) array
    :return: An (n_points, ) boolean array, indicating whether each point is Pareto efficient
    """
    is_efficient = np.ones(costs.shape[0], dtype=bool)
    for i, c in enumerate(costs):
        is_efficient[i] = np.all(np.any(costs >= c, axis=1))
    return is_efficient

def is_pareto_efficient(costs):
    """
    :param costs: An (n_points, n_costs) array
    :return: An (n_points, ) boolean array, indicating whether each point is Pareto efficient
    """
    is_efficient = np.ones(costs.shape[0], dtype=bool)
    for i, c in enumerate(costs):
        if is_efficient[i]:
            is_efficient[is_efficient] = np.any(costs[is_efficient] <= c, axis=1)  # Remove dominated points
    return is_efficient
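For context: these functions follow the costs convention (lower is better, the opposite of the `>=` in the question's `dominates`), and because the comparisons are non-strict they only remove points that are strictly worse in every cost. A small usage sketch, reproducing the "smart" function so the snippet stands alone, with toy 2-cost data:

```python
import numpy as np

def is_pareto_efficient(costs):
    # Returns a boolean mask of the Pareto-efficient points (lower cost is better)
    is_efficient = np.ones(costs.shape[0], dtype=bool)
    for i, c in enumerate(costs):
        if is_efficient[i]:
            # Keep only points that beat c on at least one cost (c itself survives via <=)
            is_efficient[is_efficient] = np.any(costs[is_efficient] <= c, axis=1)
    return is_efficient

costs = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [3.0, 3.0],    # strictly worse than [1.0, 2.0] in both costs
                  [1.5, 1.5]])
mask = is_pareto_efficient(costs)
print(mask)                      # [ True  True False  True]
print(costs[mask])               # the Pareto-efficient rows
```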

Profiling tests:

With 10000 samples, 2 costs:

dumb: elapsed time 0.9168s
smart: elapsed time 0.004274s

With 5000 samples, 15 costs:

dumb: elapsed time 1.394s
smart: elapsed time 1.982s
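A minimal harness for reproducing this kind of comparison (a sketch using random cost matrices and `time.perf_counter`; the exact numbers will of course vary with hardware and problem size). On continuous random data the two variants should agree on which points are efficient:

```python
import time
import numpy as np

def is_pareto_efficient_dumb(costs):
    is_efficient = np.ones(costs.shape[0], dtype=bool)
    for i, c in enumerate(costs):
        is_efficient[i] = np.all(np.any(costs >= c, axis=1))
    return is_efficient

def is_pareto_efficient(costs):
    is_efficient = np.ones(costs.shape[0], dtype=bool)
    for i, c in enumerate(costs):
        if is_efficient[i]:
            is_efficient[is_efficient] = np.any(costs[is_efficient] <= c, axis=1)
    return is_efficient

rng = np.random.default_rng(0)
costs = rng.random((2000, 2))

t0 = time.perf_counter()
dumb = is_pareto_efficient_dumb(costs)
t1 = time.perf_counter()
smart = is_pareto_efficient(costs)
t2 = time.perf_counter()

print(f"dumb: elapsed time {t1 - t0:.4f}s")
print(f"smart: elapsed time {t2 - t1:.4f}s")
print("masks agree:", np.array_equal(dumb, smart))
```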
