In MoE.py they do:

    expert_id = ct.load(sorted_expert_ids, index=bid_m, shape=())

which I suppose creates a 0-dimensional tile. But what's peculiar is that they go on to use the value as an index in load:

    b = ct.load(B, (expert_id, k, bid_n), shape=(1, TILE_K, TILE_N),
                order=(0, 2, 1), padding_mode=zero_pad)

I understand it might be tricky to use runtime values like this, but it is akin to how gather uses indices stored in tiles to index into an array. I think it would be possible to do something similar with gather if there were two indices, but there are three!