Steady state error in DDPG control

Question

0 votes

I am trying to modify the example "Control Water Level in a Tank Using a DDPG Agent". I want to reduce the sample time from 1.0 to 0.5, so I set Ts = 0.5. Consequently, I had to adjust StopTrainingValue, changing it from 2000 to 4000. Training completed successfully, as can be seen below.

But something unexpected happened: these modifications introduce a steady-state error (or something similar) that wasn't there in the original example.

How can I overcome this steady-state error? Do I need to make additional adjustments, e.g. to the structure of the observations, the reward function, the actor/critic networks, StopTrainingCriteria, etc.?

Update:

This is the error I get using the pre-trained agent (doTraining = false, no change to the original example)

This is the error I get using the re-trained agent (doTraining = true, no change to the original example)

3 Comments

Ari on 26 Dec 2024

Edited: Ari on 26 Dec 2024

The original reward function is defined as reward = 10(|e| < 0.1) - 1(|e| ≥ 0.1) - 100(h ≤ 0 || h ≥ 20), where e = reference - h is the error and h is the height of the water in the tank. I didn't touch this function. It works well in the original example.
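For readers who want to experiment with this reward outside Simulink, here is a minimal Python sketch of the formula as stated above. It is not the example's own implementation; the function name `reward` and its argument names are illustrative assumptions. It makes one property easy to see: every error inside the 0.1 band earns the same reward as zero error, so the reward by itself does not push the agent to remove a small offset. Whether that explains the offset observed in this thread is a hypothesis, not a diagnosis.

```python
def reward(h: float, reference: float) -> float:
    """Step reward from the post, with e = reference - h (names are illustrative)."""
    e = reference - h
    in_band = abs(e) < 0.1            # small-error band: +10
    out_of_range = h <= 0 or h >= 20  # tank empty or overflowing: -100
    return 10.0 * in_band - 1.0 * (not in_band) - 100.0 * out_of_range


# Any error inside the band is rewarded exactly like zero error.
print(reward(10.0, 10.0))  # 10.0
print(reward(9.95, 10.0))  # 10.0
print(reward(9.5, 10.0))   # -1.0
```

If a tighter steady-state result is needed, one option worth testing is a reward term that keeps decreasing as |e| shrinks inside the band; that would be a change to the example, not something the thread confirms.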

Sam Chak on 26 Dec 2024

I see. This probably implies that changing the sampling time affects the learning efficiency of the RL algorithm in tuning the PI Controller gains.

You may manually adjust the tuning parameters, but you might as well use an optimization algorithm such as GA or PSO to auto-tune all the other RL hyperparameters.


Answer 1

Divyanshu on 26 Dec 2024

0 votes

Hello @Ari Prasetiyo,

To get the same results and to avoid the error for sample time 0.5, you might have to change 'Tf' as well and set its value to '100'. This will ensure that the 'MaxStepsPerEpisode' parameter of 'rlTrainingOptions' still has the correct value which the example expects.

Since you only modified the sample time, an incorrect value of 'MaxStepsPerEpisode' may have been computed, and that could be a reason for the error.

I hope this helps. However, to find the exact root cause of the error, I might need a snapshot of the error message and the reproduction steps.

1 Comment

Ari on 26 Dec 2024

Edited: Ari on 26 Dec 2024

The maximum number of environment steps per episode is set using MaxStepsPerEpisode = ceil(Tf/Ts), so it is adjusted automatically. Also, there is no error message whatsoever. To reproduce my result: download the example -> change Ts to 0.5 -> change StopTrainingValue to 4000 -> change doTraining to true -> run the simulation. You may add an integrator to see the steady-state error.
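The arithmetic behind MaxStepsPerEpisode = ceil(Tf/Ts) can be checked with a short Python sketch. The value Tf = 200 is an assumption (the thread does not state it), the per-step maximum of 10 comes from the reward function quoted earlier in the thread, and the helper name `episode_budget` is hypothetical.

```python
import math


def episode_budget(Tf: float, Ts: float, max_step_reward: float = 10.0):
    """Return (steps per episode, upper bound on the episode reward)."""
    steps = math.ceil(Tf / Ts)  # same formula as MaxStepsPerEpisode
    return steps, steps * max_step_reward


# Tf = 200 is assumed here; it is not given in the thread.
print(episode_budget(200, 1.0))  # (200, 2000.0)
print(episode_budget(200, 0.5))  # (400, 4000.0)
```

Under that assumption, halving Ts doubles both the step count and the reward ceiling, which matches the 2000 and 4000 values mentioned in the question. It also means the same simulated time now takes twice as many decisions per episode.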


Answer 2

Ari on 24 Jan 2025

Edited: Ari on 24 Jan 2025

0 votes

I came up with this solution:

Remove the output limit from the integrator in the observation block.

Add another pair of layers to the actor and critic networks, and use different random seeds. This is the result I obtained
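To illustrate why the integrator's output limit in the first step might matter, here is a small Python sketch that integrates a constant error with and without a saturation limit. The limit of 20, the error value of 0.5, and the forward-Euler update are illustrative assumptions, not values taken from the example model.

```python
def integrate_error(errors, Ts, limit=None):
    """Forward-Euler integral of the error, optionally clamped to +/- limit."""
    total = 0.0
    trace = []
    for e in errors:
        total += e * Ts
        if limit is not None:
            total = max(-limit, min(limit, total))  # saturate the output
        trace.append(total)
    return trace


errors = [0.5] * 100  # a constant, hypothetical steady-state error
Ts = 0.5

limited = integrate_error(errors, Ts, limit=20.0)  # hypothetical limit
unlimited = integrate_error(errors, Ts)

print(limited[-1])    # 20.0
print(unlimited[-1])  # 25.0
```

Once the limited integrator saturates, the observation stops changing even though the error persists, so the agent receives no further signal about how long the offset has lasted. Removing the limit keeps that signal growing. This is one plausible reading of why the change helped, not a confirmed explanation.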

