
Guidance on num_simulations, max_depth, and large-branching setups for MAPF in MCTX #108

@DuoZhangRobotics


Hi, and thanks for the fantastic library!

I’m using MCTX (Gumbel MuZero search) for multi-agent path finding (MAPF) on grids. Each agent has 5 actions (UP/DOWN/LEFT/RIGHT/STAY), so with $N$ agents the joint action space grows as $5^N$:

  • 2 agents → 25 actions
  • 3 agents → 125 actions
  • 4 agents → 625 actions
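To make the joint indexing concrete, here is a minimal sketch of how a flat joint-action index could map to per-agent moves. The function names and the choice of agent 0 as the least significant base-5 digit are illustrative assumptions, not part of MCTX.

```python
ACTIONS = ("UP", "DOWN", "LEFT", "RIGHT", "STAY")


def decode_joint_action(index, num_agents):
    """Split a flat joint-action index into one action name per agent.

    Agent 0 is the least significant base-5 digit (an arbitrary convention).
    """
    per_agent = []
    for _ in range(num_agents):
        index, move = divmod(index, len(ACTIONS))
        per_agent.append(ACTIONS[move])
    return tuple(per_agent)


def encode_joint_action(per_agent):
    """Inverse of decode_joint_action: per-agent names -> flat index."""
    index = 0
    for name in reversed(per_agent):
        index = index * len(ACTIONS) + ACTIONS.index(name)
    return index


# Size of the joint action space for 2, 3 and 4 agents.
joint_sizes = [len(ACTIONS) ** n for n in (2, 3, 4)]
```

With this convention the search only ever sees a single integer action, and the environment step decodes it back into one move per agent.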

I don’t have a policy-value network yet; I’m using Gumbel MuZero (GMZ) purely as a planner, with uniform priors and either value=0 or a light heuristic value. Horizons can be long on large maps.

Current settings

  • num_simulations: 10k–20k
  • max_depth: 15–30
  • max_num_considered_actions: 125
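For a sense of how these settings interact at the root: Gumbel MuZero allocates root simulations with Sequential Halving, so the budget is split across phases that each keep roughly half of the considered actions. The sketch below is a simplified model of that split, not MCTX's actual scheduling code; the function name and the decision to drop any leftover budget are assumptions.

```python
import math


def halving_schedule(num_simulations, num_considered):
    """Return (num_actions, visits_per_action) for each halving phase.

    Simplified model: every phase gets about an equal share of the budget,
    the action set is halved after each phase (never below 2), and any
    leftover simulations at the end are ignored.
    """
    num_phases = max(1, math.ceil(math.log2(num_considered)))
    schedule = []
    remaining = num_simulations
    while remaining > 0:
        visits = max(1, num_simulations // (num_phases * num_considered))
        cost = visits * num_considered
        if cost > remaining:
            break  # not enough budget left for another full phase
        schedule.append((num_considered, visits))
        remaining -= cost
        num_considered = max(2, num_considered // 2)
    return schedule


schedule = halving_schedule(num_simulations=10_000, num_considered=125)
```

Under this model, 10k simulations over 125 considered actions give each root action only about 11 visits in the first phase, which is the point where most candidates are discarded.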

Observation
Despite the large simulation budget, plans are often suboptimal compared to a human baseline.
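A back-of-the-envelope check shows how thin this budget is relative to the branching factor. Assuming each simulation adds at most one new node to the tree, the sketch below computes how many plies could be expanded at full width; the function name is hypothetical and the calculation ignores any pruning or prior guidance.

```python
def full_width_depth(num_simulations, branching_factor):
    """Largest depth d with b + b^2 + ... + b^d <= num_simulations."""
    depth, nodes, layer = 0, 0, 1
    while True:
        layer *= branching_factor  # number of nodes in the next ply
        if nodes + layer > num_simulations:
            return depth
        nodes += layer
        depth += 1


depth_3_agents = full_width_depth(20_000, 125)  # 3 agents, 20k simulations
depth_4_agents = full_width_depth(20_000, 625)  # 4 agents, 20k simulations
```

With 125 joint actions, 20k simulations cover only 2 full plies, and with 625 joint actions only 1, so a max_depth of 15–30 is reached only along a handful of narrow lines.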

Questions

  1. Any recommended rules of thumb for choosing num_simulations vs. max_depth as the branching factor explodes?
  2. For joint action spaces, is there guidance on max_num_considered_actions (consider all actions vs. subsample)?
  3. Suggested qtransform settings (e.g., value_scale, maxvisit_init, use_mixed_value, rescale_values) when values are zero/heuristic rather than learned?
  4. With uniform priors, should I keep a nonzero gumbel_scale to break ties, or is a deterministic setting preferable here?
