This tutorial
-
Shows how to solve the ESP and PitzDaily real world problems using fcmaes differential evolution.
-
Compares with a surrogate model based solver MVRSM.
"When a black-box optimization objective can only be evaluated with costly or noisy measurements, most standard optimization algorithms are unsuited to find the optimal solution. Specialized algorithms that deal with exactly this situation make use of surrogate models."
ESP2 definitely is such a problem. The question is: Do surrogate model based solvers clearly have an advantage?
Executing the objective function:
-
Loads a docker image containing the open source fluid dynamic software OpenFOAM
-
Simulates an electrostatic precipitator configured using the argument vector of the objective function
-
ESP2 uses both 49 continuous and 49 discrete variables.
An electrostatic precipitator is a large gas filtering installation, whose efficiency is dependent on how well the intake gas is distributed. This installation has slots which can be of various types, each having a different impact on the distribution. The goal is to find a configuration that has the best resulting distribution.
fcmaes-DE has some advantages when applied to extremely expensive optimization problems:
-
It converges fast because it is based on the DE/best/1 strategy
-
It enables parallel function evaluation
First experiments applying fcmaes-DE to ESP2 (and ESP) were disappointing. For scheduling / route finding problems
like job shop, GTOC11 or noisy TSP we successfully used the np.argmin / sorting trick converting continuous input variables
into a list of unique integer values. But what if the discrete values don’t have to be unique like in ESP2? Here we have the additional
problem that the range for the discrete variables is very small: [0,3]
.
Finally I had to accept that fcmaes needed a mixed-integer upgrade which was implemented for fcmaes-DE, fcmaes-MODE (the new multi-objective / constraint algorithm) both for their Python and C variants.
We now can tell the algo via a new parameter ints
which are your discrete integer variables.
ints = [True, True, False]
for instance means that the first two variables are discrete.
from scipy.optimize import Bounds
from fcmaes import decpp, de
problem = ESP2
ret = decpp.minimize(problem.evaluate,
dim = problem.dims(),
bounds = Bounds(problem.lbs(),problem.ubs()),
popsize = 31,
max_evaluations = 10000,
workers = 12,
ints = [v != 'cont' for v in problem.vartype()]
)
print(ret.x, ret.fun)
For logging purposes you could wrap problem.evaluate
. Just be careful to use mp.RawValue
parameters to
monitor progress like:
self.evals = mp.RawValue(ct.c_long, 0) # writable across python processes
self.best_y = mp.RawValue(ct.c_double, math.inf) # writable across python processes
since the objective function is called in parallel. On Windows you should replace the decpp.minimize
by
de.minimize
, the Python variant. Parallel retry works on Windows, parallel function evaluation has issues.
On an AMD 5950x 16-core processor we get about 0.7 ESP2 function evaluations per second. Docker calls scale not optimally, but we get a decent performance gain.
Lowerering popsize
to 16 or 8 will speed up convergence but greatly increases the chance getting stuck in a local
minimum.
The 8 MVRSM runs were performed in parallel on a 16 core machine, the DE runs use parallel function evaluation. MVRSM either needs "luck" or you can view the parallel execution of 8 runs as a kind of "poor mans parallelization" where you are only interested in the best out of these 8 runs. For a fair comparison only the best MVRSM run should be considered.
There is not enough data to identify a "winner" here: Both approaches reach excellent results - which all other optimization methods tested did not - including fcmaes-DE without mixed integer enhancement.
But it is obvious that any claim about the superiority of surrogate based methods is not justified considering these results.
There is nothing magic behind the mixed integer support, it is a minor - but quite effective - modification: The first change is related to the bounds: Since the algorithm internally works with continuous variables (floats) we should assign the same value range to each discrete choice inside the bounds. So we add 0.99999 to the upper integer bound and assume that the fitness function will truncate to an integer value.
We adapt the generation of new argument vectors by mutation / crossover: Generated values are truncated to the next integer value.
This is because candidates are produced based on differences (this is why it is called "Differential" Evolution). We want this differences being discrete values.
We add some random mutations on the integer variables, with random probability dependent on the number of integer variables.
This is because I observed stagnation related to the integer variables in the later stages of the optimization
process. Checking the MVRSM
code I noticed there are plenty of mutations for the discrete variables, so I thought may be doing something similar
could help. Using the Python variants of the algorithms you can define your own sample modifier
overwriting the default behavior.
Please notice me when you found an improvement.
# default modifier for integer variables
def _modifier(self, x):
x_ints = x[self.ints]
n_ints = len(self.ints)
lb = self.lower[self.ints]
ub = self.upper[self.ints]
min_mutate = 0.1
max_mutate = 0.5 # max(1.0, n_ints/20.0)
to_mutate = self.rg.uniform(min_mutate, max_mutate)
# mututate some integer variables
x[self.ints] = np.array([x if self.rg.random() > to_mutate/n_ints else
int(self.rg.uniform(lb[i], ub[i]))
for i, x in enumerate(x_ints)])
return x
ESP3 is another mixed integer modification of the ESP CFD problem with less continuous variables.
from scipy.optimize import Bounds
from fcmaes import decpp, de
problem = ESP3
ret = decpp.minimize(problem.evaluate,
dim = problem.dims(),
bounds = Bounds(problem.lbs(),problem.ubs()),
popsize = 24,
max_evaluations = 5000,
workers = 12,
ints = [v != 'cont' for v in problem.vartype()]
)
print(ret.x, ret.fun)
popsize
can be reduced to 24 for this problem. Parallel execution on an AMD 5950x CPU enabled an execution time
of about 1.5
sec / evaluation.
In Bliek21 you may find results for surrogate based optimizers for this problem.
PitzDaily is another benchmark included in ESP and PitzDaily. The problem assesses the effect of combustion on the mean flowfield properties such as mixing layer growth, entrainment rate, and reattachment length. Here PitzDaily is a nice visualization of the problem. It was chosen as an OpenFOAM tutorial, because it has limited complexity: Only continuous variables and low dimension = 10. fcmaes-DE can solve it easily.
from scipy.optimize import Bounds
from fcmaes import decpp, de
problem = PitzDaily
ret = decpp.minimize(problem.evaluate,
dim = problem.dims(),
bounds = Bounds(problem.lbs(),problem.ubs()),
popsize = 24,
max_evaluations = 5000,
workers = 12,
)
print(ret.x, ret.fun)
popsize
can be reduced to 24 for this problem, no ints
parameter is required, since all variables are continuous.
After about 600 seconds all but one out of 13 runs reach 0.08. And this last one also succeeds after about 1100 seconds.
What is a bit surprising is that solutions < 0.079 are hard to find in the literature.
Parallel execution on an AMD 5950x CPU enabled an execution time
of about 0.7
sec / evaluation.
From Frazier2018: "Bayesian optimization is an approach to optimizing objective functions that take a long time (minutes or hours) to evaluate. It is best-suited for optimization over continuous domains of less than 20 dimensions, and tolerates stochastic noise in function evaluations. It builds a surrogate for the objective and quantifies the uncertainty in that surrogate using a Bayesian machine learning technique, Gaussian process regression, and then uses an acquisition function defined from this surrogate to decide where to sample."
We use ESP4 to check if it is also applicable to larger dimensions - ESP4 has 54. Bayesian Optimization is an interesting method for this problem because:
-
There are implementations supporting mixed integer problems like https://github.com/SheffieldML/GPyOpt and https://github.com/wangronin/Bayesian-Optimization.
-
Some implementations support parallel function evaluations like the two above and https://github.com/wujian16/Cornell-MOE .
Unfortunately for https://github.com/wangronin/Bayesian-Optimization we got an
_pickle.PicklingError: Could not pickle the task to send it to the workers.
error with the docker based ESP4 problem. https://github.com/SheffieldML/GPyOpt works but slows down to one thread after the initial population is evaluated in parallel. And https://github.com/wujian16/Cornell-MOE doesn’t support mixed integer problems.
So we chose to use https://github.com/SheffieldML/GPyOpt single threaded for this comparison. To utilize the CPU we performed 8 optimizations in parallel.
-
GPyOpt needs about 17.6 sec per evaluation (8 parallel optimizations), 1000 evaluations need about 17500 sec.
-
MVRSM needs about 17 sec per evaluation (8 parallel optimizations), 1000 evaluations need about 17000 sec.
-
fcmaes DE needs about 1.5 sec per evaluation (12 parallel function evaluations), 1000 evaluations need about 1500 sec.
Here are the results for fcmaes DE using the mixed integer enhancement, limited to 5000 evaluations:
Even if you compare the DE results after 1000 evals / 1500 sec with the BO results after 1000 evals / 17500 sec this is a clear win for fcmaes DE with mixed integer enhancement. Independent from the fact that the algorithmic overhead for Bayesian Optimization is much higher and we couldn’t get parallel evaluation working, if we just compare the results after 1000 evaluations, we can conclude that for ESP4 (dim = 54) Bayesian Optimization doesn’t work well. MVRSM is more competitive compared to DE if you compare the result after 1000 evals / 17000 sec. Note that after only 200 evaluations BO and MVRSM are very close - and both beat Differential Evolution. For problems which are so expensive to evaluate that you can afford only 200 evaluations, both BO and MVRSM could be the right choice - if parallel evaluation is supported (and really works). After 1000 evaluations only DE and MVRSM can be recommended for problems with high dimensionality.
To summarize:
-
We could confirm the superiority of surrogate based methods like BO and MVRSM for for complex mixed integer CFD simulation based methods if we only can afford about 200 function evaluations. But the results after only 200 evaluations are poor and parallel execution can help to increase the feasible evaluation budged.
-
fcmaes Differential Evolution, thanks to its mixed integer support, is a serious competition if more than 500 evaluations can be performed - already equipped with the ability to perform parallel function evaluations.
-
MVRSM is (as DE) superior to Bayesian Optimization for higher evaluation budgets.
-
The fcmaes multi objective solver (MO-DE) with mixed integer support is ready to be tested in this application area.