Seeking Alternative Optimization Methods for a 9-Parameter Optimization Task

Hi there,
I am working on an optimization task where I need to determine the 9 parameters that minimize my objective function (Objective.m). I have experimented with various optimization methods, including fmincon, fminsearch, particleswarm, lsqnonlin, etc., and documented a summary of their performance in computeError_1.m.
So far, fmincon and fminsearch have shown the most promising results, but they are not achieving the level of optimization I need (computeError.m). I am looking for alternative approaches that might improve performance, efficiency, and robustness.
If you have any recommendations or insights on more effective optimization techniques—whether it's hybrid methods, global optimization strategies, or derivative-free approaches—I would greatly appreciate your input.
Thank you in advance for your help!

  5 Comments
William Rose on 31 Mar 2025
I like the recommendations of @Torsten very much. You suggest, in your reply to him, that you have done what he recommends. However, it seems to me that Test_8.m does not really do what @Torsten recommends. He recommends that the initial guess for the unknown vector be the correct value, or a slightly perturbed value. Test_8.m calls computeError.m, which estimates the 9 unknown parameters with fminsearch (i.e. with the simplex algorithm, unconstrained) and with fmincon (constrained). fminsearch() starts from the initial guess LvecInit, which is not the actual value of [mx,my,mz,sx,sy,sz,rxy,ryz,rzx] for this data set (Sample_2.xlsx). fmincon() starts from
LvecStart = LvecInit + 0.1 * randn(size(LvecInit));
which is not a slightly perturbed version of the actual vector of parameters for this data set.
Test_8.m does not display the true value of [mx,my,mz,sx,sy,sz,rxy,ryz,rzx] for the data set and does not display the value of [mx,my,mz,sx,sy,sz,rxy,ryz,rzx] found by minimization. I suggest you add that to the code.
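A minimal sketch of what such a check could look like, assuming the 3D positions are available as an N-by-3 array. The data here are synthetic, and the names xyz, Ltrue and Lstart are illustrative, not taken from Test_8.m:

```matlab
% Sketch: compute the "true" L vector directly from known x,y,z data,
% then build a slightly perturbed starting point for the optimizer.
% Assumed ordering: L = [mx my mz sx sy sz rxy ryz rzx].
rng(1);                                   % reproducible example
N   = 500;
xyz = [1.5, -7.3, 2.4] + randn(N,3) .* [2, 5, 3];   % synthetic stand-in data

R     = corrcoef(xyz);                    % 3-by-3 correlation matrix
Ltrue = [mean(xyz), std(xyz), R(1,2), R(2,3), R(3,1)];

% Perturb each element by about 1% instead of starting from an unrelated guess.
Lstart = Ltrue .* (1 + 0.01*randn(size(Ltrue)));
Lstart(7:9) = max(min(Lstart(7:9), 0.999), -0.999);  % keep correlations valid

% Display true and starting values side by side; the fitted vector
% returned by the optimizer could be printed in a third column.
names = {'mx','my','mz','sx','sy','sz','rxy','ryz','rzx'};
for k = 1:9
    fprintf('%-4s true = %8.4f   start = %8.4f\n', names{k}, Ltrue(k), Lstart(k));
end
```

Passing Lstart to fminsearch or fmincon in place of LvecInit would then test whether the objective function has its minimum near the known answer.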
Test_8.m does display the covariance matrix of the actual x,y,z data and the covariance matrix of x,y,z data which is computed using the best-fit values for [mx,my,mz,sx,sy,sz,rxy,ryz,rzx]. The comparison of covariance matrices shows that estimated Var(x) is too large by a factor of about 40. The covariance matrices also show the wrong signs for estimated cov(x,y) and cov(z,x). This indicates that the estimated values for rho(x,y) and rho(z,x) have the wrong signs. These are significant errors in estimation of the unknown parameters.
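The comparison described above can be sketched as follows, assuming the parameter ordering [mx my mz sx sy sz rxy ryz rzx]. The numbers in L and Cref are made up for illustration and are not the values from Test_8.m:

```matlab
% Sketch: rebuild the 3-by-3 covariance matrix implied by an L vector and
% compare it with a reference covariance matrix (e.g. cov of the real data).
L = [0 0 0   2.0 5.0 3.0   -0.5 0.3 0.2];   % illustrative estimate

s = L(4:6);                                  % standard deviations
R = [1     L(7)  L(9);
     L(7)  1     L(8);
     L(9)  L(8)  1   ];                      % correlation matrix
C = (s' * s) .* R;                           % C(i,j) = r_ij * s_i * s_j

% A valid correlation matrix must be positive definite.
isValid = all(eig(R) > 0);

% Hypothetical reference covariance with the opposite sign for cov(x,y).
Cref = [4.0  5.0  1.2;
        5.0 25.0  4.5;
        1.2  4.5  9.0];

varRatio     = diag(C) ./ diag(Cref);        % estimated / reference variance
signMismatch = sign(C) ~= sign(Cref);        % true where the signs disagree

disp(varRatio');
disp(signMismatch);
```

A variance ratio far from 1, or a true entry in signMismatch, flags the kind of estimation error described above.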
If you modify Test_8.m so that it does what @Torsten suggests, then it will be interesting to see how close it gets to the correct values for the unknown vector [mx,my,mz,sx,sy,sz,rxy,ryz,rzx].
payam samadi on 1 Apr 2025
Thank you both for your comments. I believed I had sent the LvecArray.m file; however, here is the code. If I'm not mistaken, this is what you suggested.
For 'Sample 1.xlsx', I went with these two scenarios:
1. LvecInit= [1.50 -7.35 2.38 2.01 5.66 2.82 -0.94 0.95 -0.96]; % Best guess for Lvec, which produces the following covariance matrix for both real and estimation datasets.
Covariance Matrix A_matrix From Real Data:
4.0600 -10.7324 5.4269
-10.7324 32.0362 -15.5045
5.4269 -15.5045 7.9759
Covariance Matrix A_matrix From Estimation:
8.3597 -3.6364 -6.9907
-3.6364 30.0348 4.7806
-6.9907 4.7806 13.8301
RMSE for X, Y, Z (mm): [3.17, 0.56, 5.23] mm
Elapsed time is 18.800424 seconds. Please also see the attached Sample_1_a.fig.
2. LvecInit= [1.40 -7.15 2.18 1.91 5.36 2.72 -0.84 0.75 -0.76]; % A little far from the best value, which produces the following covariance matrix for both real data and estimation.
Covariance Matrix A_matrix From Real Data:
4.0600 -10.7324 5.4269
-10.7324 32.0362 -15.5045
5.4269 -15.5045 7.9759
Covariance Matrix A_matrix From Estimation:
5.4911 -4.7757 -3.7426
-4.7757 29.9628 3.7121
-3.7426 3.7121 13.8008
RMSE for X, Y, Z (mm): [2.54, 0.54, 5.23] mm
Elapsed time is 17.252999 seconds. Please also see the attached Sample_1_b.fig.
Note: for both scenarios, the "fminsearch" used LvecInit while the "fmincon" used LvecStart = LvecInit + 0.1 * randn(size(LvecInit));


Accepted Answer

William Rose on 31 Mar 2025
Edited: William Rose on 31 Mar 2025
[Edit: Some of the equations in my answer do not look right. I pasted them in. The equations looked fine in the Matlab Answers editing window, but they do not look fine once my answer is posted. Therefore I am editing them. I have not changed the content.]
In your 17 March 2025 posting on Matlab Answers, you included data for a target that moves in a deterministic way in 3 dimensions, as the scanner rotates around the target. The five test data files you attached above also show deterministic motion in 3D. The model with 9 adjustable parameters, which you are fitting in this case, is from Poulsen et al., 2009. That model is for a target that moves randomly: the target position is described by a 3D Gaussian distribution, which can be thought of as an ellipsoid in 3D. The 9 parameters are the centroid of the ellipsoid, the SDs along x, y, z, and the correlations rho(x,y), rho(y,z), rho(z,x). (From these parameters, one can infer the lengths along the principal axes of the ellipsoid, and its orientation.) The difference between random and deterministic motion is important. If you are using simulated or real data in which the target moves deterministically as the scanner rotates about the target, then the different 2D views of the target will not be random samples of the distribution, and one is likely to obtain incorrect estimates for the parameters.
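To make the "random target" assumption concrete, here is a minimal sketch that draws independent positions from a 3D Gaussian defined by a 9-element L vector. The values in L are illustrative, and the ordering [mx my mz sx sy sz rxy ryz rzx] is assumed:

```matlab
% Sketch: draw random 3D positions from the Gaussian model defined by L.
rng(2);                                     % reproducible example
L  = [1 -2 0.5   2 5 3   -0.5 0.3 0.2];     % illustrative parameter vector
mu = L(1:3);
s  = L(4:6);
R  = [1 L(7) L(9); L(7) 1 L(8); L(9) L(8) 1];
C  = (s' * s) .* R;                         % covariance matrix

N   = 20000;
U   = chol(C);                              % upper triangular, U'*U = C
xyz = mu + randn(N,3) * U;                  % each row is one independent draw

% The sample statistics should approach mu and C as N grows.
Chat   = cov(xyz);
relErr = norm(Chat - C, 'fro') / norm(C, 'fro');
fprintf('relative error of sample covariance: %.3f\n', relErr);
```

Each row of xyz is independent of the others. A target that follows a fixed trajectory while the gantry rotates does not satisfy this, because its position is tied to the projection angle.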
Perhaps the script fails to find the correct solution because it gets stuck in a local minimum. You could try different starting points. You could also estimate the mean location with a separate method, possibly one based on singular value decomposition, and use it as the initial guess for the first three elements of the 9-dimensional minimization. This is the method used by Poulsen et al., 2008, pp. 1588-1589.
Other ideas for improvement:
  • More restrictive upper and lower bounds for the mean y-coordinate, my. ub(my) = f*max(Yp); lb(my) = f*min(Yp), using all projection angles. f = approximate inverse magnification = SAD/SID.
  • More restrictive upper bound and better initial guess for sy. ub(sy)=(ub(my)- lb(my))/2. sy initial guess = f*std(Yp), using all Yp observations. f=approximate inverse magnification=SAD/SID.
  • More restrictive upper and lower bounds for mx and mz. ub(mx) = ub(mz) = f*max(abs(Xp)); lb(mx) = lb(mz) = -ub(mx).
  • Better initial guess for sx and sz. Let the initial guesses be sx0 and sz0. Three options for the initial guesses are:
      • sx0 = sz0 = s.d.(Xp), using all projections. This may be a considerable overestimate of sx and sz for a target whose centroid is not at the origin, because, even if the target does not move in 3D at all (i.e. true sx = sz = 0), Xp will vary as the source and detector rotate.
      • For each projection angle ai, compute the deviation of the projected point from the projection of the centroid, corrected for approximate magnification: the deviation is f times the difference between the projected point and the projection of the initial 3D estimate of the centroid, where f = inverse magnification = SAD/SDD. Then estimate sx0 and sz0 from the spread of these deviations.
      • Use Xp values from projections with angles within ±30° of 0° or of 180° to estimate sx0, taking magnification into account, with the Xp values from the opposing view negated. Use Xp values from projections with angles within ±30° of 90° or of 270° to estimate sz0 in the same way. This method is unlikely to yield useful results if there are few projections in each set.
  • Instead of using initial guesses for sx, sy, sz computed with a specific strategy, as described immediately above, use values chosen at random within the 3D cube defined by the lower and upper bounds for sx, sy,sz. Choose the best fit values that minimize the objective function among all the different initial guesses. This strategy will reduce the chance of getting stuck in a local minimum that is not the global minimum. To sample the different parts of this cube, we probably should use 2^3=8 to 3^3=27 different initial guesses.
  • Instead of using initial guess = 0 for rxy, ryz, rzx, use values chosen at random in the 3D cube with bounds ±1. Choose the best fit values that minimize the objective function among all the different initial guesses. Use 8 to 27 different random initial guesses to explore the cube.
  • If we combine random initial guesses for sx, sy, sz and for rxy, ryz, rzx, we get a 6D hypercube. To sample the different parts of this hypercube, we probably want 2^6=64 to 3^6=729 different initial guesses. This could take a long time.
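The random-restart idea in the last three bullets can be sketched as below. The objective objFun is a made-up function with several local minima, standing in for Objective.m, and the bounds lb and ub are illustrative only. fminsearch is used so that the sketch needs no toolbox; it ignores the bounds once it starts, so fmincon with lb and ub could be substituted to enforce them:

```matlab
% Sketch: multi-start minimization with random initial guesses in a box.
rng(3);                                      % reproducible example
c      = [2 5 3];                            % location of the global minimum
objFun = @(p) sum((p - c).^2) + 2*sum(1 - cos(3*(p - c)));  % toy objective

lb = [0.1 0.1 0.1];                          % illustrative lower bounds
ub = [6   10  6  ];                          % illustrative upper bounds
nStarts = 27;                                % 3^3 starts for a 3D box
opts    = optimset('Display', 'off');

vals    = zeros(nStarts, 1);
bestVal = Inf;
bestP   = [];
for k = 1:nStarts
    p0 = lb + (ub - lb) .* rand(1, 3);       % uniform random start in the box
    [p, v] = fminsearch(objFun, p0, opts);   % local search from this start
    vals(k) = v;
    if v < bestVal                           % keep the best local solution
        bestVal = v;
        bestP   = p;
    end
end

fprintf('best objective = %.4g at [%s]\n', bestVal, num2str(bestP, '%.3f '));
fprintf('%d of %d starts reached the best value (within 1e-3)\n', ...
        sum(vals < bestVal + 1e-3), nStarts);
```

If only a few of the starts reach the best value, that suggests the objective has many local minima, and more starts or a global method may be worth the run time.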
  22 Comments
payam samadi on 6 Apr 2025
Thank you for the current script. I have summarized all of these scripts, as well as a few I developed earlier, into one comprehensive document, which I have sent to you via email. Additionally, I checked a few other items and explained them in the email as well.
William Rose on 8 Apr 2025
Edited: William Rose on 8 Apr 2025
Here is a revised version of the script to assist in visualizing the 3D Gaussian distribution. The input is the 1x9 L vector = [mean(x),mean(y),mean(z),sd(x),sd(y),sd(z),corr(x,y),corr(y,z),corr(z,x)].
L may be estimated from 2D projections or may be computed directly, if the x,y,z coordinates are known.
The script displays the orientation of the principal axis of the 3D Gaussian distribution, and displays the fraction of total variance accounted for by movement along the principal axis.
The script generates 2 figures showing an ellipsoid that bounds 90% of the probability for the marker location. Change the value 0.9 in the code to generate an ellipsoid that bounds a different probability.
The first figure is designed to be rotated as desired by the user. The second figure shows three orthogonal two-D views of the ellipsoid, for a supine patient. The ellipsoid color has no specific meaning, but it helps the viewer know how the different views are related.
visualize3dGaussian1
Principal direction=0.306 left, 0.846 caudal, 0.437 anterior. Percent of motion variance due to movement along principal axis=98.8%.
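For readers without the script, the quantities it reports can be approximated with a short sketch like the one below. This is not the code of visualize3dGaussian1; the L values are illustrative, the ordering [mx my mz sx sy sz rxy ryz rzx] is assumed, and the mapping of x, y, z to left, caudal and anterior depends on the patient coordinate convention, which is not reproduced here:

```matlab
% Sketch: principal axis, variance fraction, and 90% ellipsoid semi-axes
% of a 3D Gaussian described by L.
L = [0 0 0   2 5 3   -0.5 0.3 0.2];          % illustrative parameter vector
s = L(4:6);
R = [1 L(7) L(9); L(7) 1 L(8); L(9) L(8) 1];
C = (s' * s) .* R;                           % covariance matrix

[V, D]     = eig(C);                         % eigenvectors in columns of V
[lam, idx] = sort(diag(D), 'descend');       % largest variance first
V          = V(:, idx);

principalDir = V(:, 1)';                     % unit vector; its sign is arbitrary
fracVar      = lam(1) / sum(lam);            % share of total variance

% Ellipsoid containing probability p: squared Mahalanobis radius equals the
% chi-square quantile with 3 degrees of freedom, computed without a toolbox.
p        = 0.9;
k2       = 2 * gammaincinv(p, 3/2);          % same value as chi2inv(p, 3)
semiAxes = sqrt(k2 * lam)';                  % semi-axis lengths, longest first

fprintf('principal direction = [%.3f %.3f %.3f]\n', principalDir);
fprintf('variance along principal axis = %.1f%%\n', 100*fracVar);
fprintf('semi-axes of %.0f%% ellipsoid = [%.2f %.2f %.2f]\n', 100*p, semiAxes);
```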
