I participated in the Kumo Torakku race of the DeepRacer Virtual Circuit.
This is the reward function I used at that time. After repeated training, the lap time was about 25 seconds.

- Kumo Torakku Training

| Action space | Value |
|---|---|
| Maximum steering angle | 30 degrees |
| Steering angle granularity | 7 |
| Maximum speed | 8 m/s |
| Speed granularity | 3 |
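
The settings above produce a discrete action space of 7 × 3 = 21 actions. The following is a minimal sketch of how such a grid can be enumerated. It assumes steering angles are spaced evenly from -30 to +30 degrees and speeds are spaced evenly up to the maximum, excluding zero; the actual values generated by the DeepRacer console may differ. `build_action_space` is a hypothetical helper, not part of any DeepRacer API.

```python
def build_action_space(max_steering_deg, steering_granularity,
                       max_speed, speed_granularity):
    """Enumerate (steering_angle, speed) pairs for a discrete action space."""
    # Assumption: steering angles are evenly spaced from -max to +max.
    if steering_granularity == 1:
        angles = [0.0]
    else:
        step = 2 * max_steering_deg / (steering_granularity - 1)
        angles = [-max_steering_deg + i * step
                  for i in range(steering_granularity)]

    # Assumption: speeds are evenly spaced up to max_speed, excluding zero.
    speeds = [max_speed * (i + 1) / speed_granularity
              for i in range(speed_granularity)]

    pairs = [(angle, speed) for angle in angles for speed in speeds]
    return [
        {"index": idx, "steering_angle": angle, "speed": speed}
        for idx, (angle, speed) in enumerate(pairs)
    ]


# Values taken from the action space table.
actions = build_action_space(30, 7, 8, 3)
print(len(actions))
for action in actions[:3]:
    print(action)
```

With a maximum speed of 8 m/s and a speed granularity of 3, this layout gives speeds of roughly 2.67, 5.33, and 8 m/s at each steering angle.
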

| Hyperparameters | Value |
|---|---|
| Gradient Descent Batch Size | 64 |
| Entropy | 0.01 |
| Discount Factor | 0.999 |
| Loss Type | Huber |
| Learning Rate | 0.0003 or 0.001 |
| Number of experience episodes between each policy-updating iteration | 20 |
| Number of epochs | 3 or 10 |

- Training time: 60 or 120 minutes
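
To get a feel for the discount factor of 0.999, the sketch below computes the effective horizon 1 / (1 - γ) and how much a reward at the end of a 25-second lap still counts at the start of the lap. The rate of 15 steps per second is an assumption about the simulator, not a value from the settings above, and the function names are hypothetical.

```python
GAMMA = 0.999           # discount factor from the hyperparameter table
LAP_SECONDS = 25        # approximate lap time reported above
STEPS_PER_SECOND = 15   # assumption: simulator steps per second


def effective_horizon(gamma):
    """Approximate number of future steps that matter: 1 / (1 - gamma)."""
    return 1.0 / (1.0 - gamma)


def discount_weight(gamma, steps):
    """Weight applied to a reward received `steps` steps in the future."""
    return gamma ** steps


lap_steps = LAP_SECONDS * STEPS_PER_SECOND
print(f"steps per lap:        {lap_steps}")
print(f"effective horizon:    {effective_horizon(GAMMA):.0f} steps")
print(f"weight at end of lap: {discount_weight(GAMMA, lap_steps):.3f}")
```

Under these assumptions a lap is 375 steps, well inside the horizon of about 1000 steps, so rewards near the end of a lap still influence decisions made at the start.
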