[Bug] ARK 0.4.1: multi_gpu_tutorial.py run error #207

@shenyanmei2020

Describe the bug
With ARK 0.4.1, running multi_gpu_tutorial.py fails. The failure is reported at sched_default.cc line 393 (in configure_gpu_buf) and tensor.cc line 246 (in update_pads), with the following error:
invalid padding detected. This is likely caused because one GPU buffer is used by multiple operators that require different padding. A possible workaround is to let each operator use a different buffer by creating a new tensor rather than overwriting an existing tensor op name:send.
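
The situation this message describes can be pictured with a toy model. The following is only a minimal sketch and not ARK's code: `ToyBuffer`, `padded_dims`, `PaddingConflict`, and the tile values are hypothetical names and numbers chosen for illustration, and the rule that each operator pads every dimension up to a multiple of its own tile is an assumption. The actual check in update_pads may differ.

```python
class PaddingConflict(Exception):
    """Raised when two operators need different padded layouts on one buffer."""


def padded_dims(dims, tile):
    # Round each dimension up to the next multiple of the operator's tile size.
    return tuple(-(-d // t) * t for d, t in zip(dims, tile))


class ToyBuffer:
    """Hypothetical stand-in for a GPU buffer; this is not an ARK class."""

    def __init__(self, dims):
        self.dims = tuple(dims)
        self.ldims = None  # padded layout, fixed by the first operator that uses it

    def use(self, op_name, tile):
        wanted = padded_dims(self.dims, tile)
        if self.ldims is None:
            self.ldims = wanted
        elif self.ldims != wanted:
            raise PaddingConflict(
                f"{op_name} needs layout {wanted}, "
                f"but the buffer is already laid out as {self.ldims}"
            )
        return self.ldims


# One buffer shared by two operators with different (assumed) tile sizes.
shared = ToyBuffer((1, 100))
shared.use("matmul", (1, 64))  # pads the layout to (1, 128)
conflict = None
try:
    shared.use("send", (1, 1))  # wants (1, 100), which conflicts
except PaddingConflict as exc:
    conflict = exc

# Workaround suggested by the message: give the second operator its own buffer.
fresh = ToyBuffer((1, 100))
fresh.use("send", (1, 1))
```

In this sketch the conflict disappears once the second operator writes to a freshly created buffer, which mirrors the workaround the error message suggests.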

To Reproduce
Run multi_gpu_tutorial.py with ARK 0.4.1.

Expected behavior

  1. An explanation of why this error occurs.
  2. What relationship must "ldims, type_bytes, tile" of ref_tensor and this_tensor satisfy in update_pads?

System (please complete the following information):

  • ark0.4.1
  • OS: [e.g. Ubuntu18.04]
  • GPU [A100]
  • Networking Environment [Single-node, Multi-gpu]

