Skip to content

習題8: 請自己設計一個固定的策略(不需要學習)解決 CartPole 問題,讓你的竿子盡量撐得久不會倒下來 #10

@ccckmit

Description

@ccckmit

建議

  1. 不要用機器學習,神經網路,直接用手寫固定策略來解決
  2. 記得先了解 Observation 與 Action ,再開始寫程式

參考

  1. https://gymnasium.farama.org/environments/classic_control/cart_pole/
  2. cartpole_human_run.py
import gymnasium as gym
env = gym.make("CartPole-v1", render_mode="human") # 若改用這個,會畫圖
# env = gym.make("CartPole-v1", render_mode="rgb_array")
observation, info = env.reset(seed=42)
for _ in range(100):
   env.render()
   action = env.action_space.sample()  # 把這裡改成你的公式,看看能撐多久
   observation, reward, terminated, truncated, info = env.step(action)
   print('observation=', observation)
   if terminated or truncated: # 這裡要加入程式,紀錄你每次撐多久
      observation, info = env.reset()
      print('done')
env.close()

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions