Why is bayesopt scaling badly?

2 views (last 30 days)
Piers Lillystone on 1 Feb 2021
Hello,
I've been using bayesopt to find the minimum of a rather complex objective function, and I've run into a performance issue I'm unable to solve. The basic outline of the code is below:
% get_optvars creates the optimizableVariables from an input list
optimizable_variables = get_optvars(variable_list);
% Manipulate_Matrix is a handle class, which manipulates the input data - a numerical matrix
matrix_class = Manipulate_Matrix(data);
obj_fn = @(v) optimize_output(v, matrix_class);
% Optimize.
results = bayesopt(obj_fn, optimizable_variables, ...
'MaxObjectiveEvaluations', 200, 'UseParallel', true, ...
'XConstraintFcn', @DeterministicConstraints, ...
'ConditionalVariableFcn', @ConditionalConstraints);
% Objective function
function objective = optimize_output(v, matrix_class)
% perform_method manipulates the data and saves the objective value to a class property.
matrix_class.perform_method(v);
objective = matrix_class.property_A;
end
My issue is that if I scale up the input data, the total optimization time ('Total elapsed time' in the optimizer output) increases by an order of magnitude, without a corresponding increase in the total objective function evaluation time.
For example, with 1000 x 20 input data the optimization takes about 160 seconds (the problem has about 20 variables, both categorical and numeric), with an average objective function evaluation time of about 0.02 seconds. However, if I scale the input data to 10,000 x 20 and change nothing else in the code, the average objective function evaluation time scales linearly to 0.2 seconds, but the total optimization time is now 2000 seconds. (cf. total objective evaluation times of roughly 4 s vs 40 s for the two tests.)
At first I thought it was due to the data being copied to each of the parallel workers inefficiently, so I tried all the suggestions here: https://www.mathworks.com/help/stats/parallel_bayesian_optimization.html, but none gave any significant improvement. The best (marginal) improvement I found was to make optimize_output a nested function that captures matrix_class in its enclosing workspace, so it is copied directly to the workers. However, the scaling issue still persists.
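For reference, one of the variants I tried, based on the linked doc page, uses parallel.pool.Constant so the data object is copied to each worker once rather than serialized with the function handle on every evaluation (function and variable names here are just a sketch of my setup, not the exact code):

```matlab
function results = run_optimization(data, variable_list)
    % Build the optimizable variables and the data-manipulation object as before.
    optimizable_variables = get_optvars(variable_list);
    matrix_class = Manipulate_Matrix(data);

    % Wrap the handle object in a Constant: each worker receives one copy,
    % and c.Value retrieves the worker-local copy inside the objective.
    c = parallel.pool.Constant(matrix_class);

    obj_fn = @(v) optimize_output(v, c.Value);

    results = bayesopt(obj_fn, optimizable_variables, ...
        'MaxObjectiveEvaluations', 200, 'UseParallel', true, ...
        'XConstraintFcn', @DeterministicConstraints, ...
        'ConditionalVariableFcn', @ConditionalConstraints);
end
```

This made per-evaluation transfer cost negligible, but the overall scaling behaviour was unchanged.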
I'm at a loss to work out why bayesopt is taking disproportionately longer for larger input. As far as I'm aware, bayesopt has no idea it's being passed larger data, only that the objective function evaluation takes longer, and the internal GPR models are still fitting on the same set of variables, so why is the optimizer scaling so badly?
Thanks for any help you can provide!
Piers

Answers (0)

Products


Release

R2020a
