Skip to content

Got OOM message with GTX3060 #101

@masahiro-999

Description

@masahiro-999

I've been trying to Stable Diffusion with GPU.

But it failed and I got the OOM message

Is this error message due to insufficient GPU memory?
Is it possible to make it work by adjusting some parameters?
Stable Diffusion 1.4 is running on this GPU in the tensorflow environment. It would be nice if it works with bumblebee too.

it's working fine with :host . It's amazing how easy it is to use neural networks with livebooks!!!

OS Ubunt 22.04 on WSL2
GPU GTX3060(12GB)
Livebook v0.8.0
Elixir v1.14.2
XLA_TARGET=cuda111
CUDA Version: 11.7

05:32:56.019 [info] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

05:32:56.023 [info] XLA service 0x7fb39437dac0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

05:32:56.023 [info]   StreamExecutor device (0): NVIDIA GeForce RTX 3060, Compute Capability 8.6

05:32:56.023 [info] Using BFC allocator.

05:32:56.023 [info] XLA backend allocating 10641368678 bytes on device 0 for BFCAllocator.

05:32:58.662 [info] Start cannot spawn child process: No such file or directory
05:34:00.234 [info] total_region_allocated_bytes_: 10641368576 memory_limit_: 10641368678 available bytes: 102 curr_region_allocation_bytes_: 21282737664

05:34:00.234 [info] Stats: 
Limit:                     10641368678
InUse:                      5530766592
MaxInUse:                   7566778624
NumAllocs:                        3199
MaxAllocSize:                399769600
Reserved:                            0
PeakReserved:                        0
LargestFreeBlock:                    0

05:34:00.234 [warn] **********___***********************************************************____________________________

05:34:00.234 [error] Execution of replica 0 failed: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 3546709984 bytes.
BufferAssignment OOM Debugging.
BufferAssignment stats:
             parameter allocation:    3.84GiB
              constant allocation:       144B
        maybe_live_out allocation:   768.0KiB
     preallocated temp allocation:    3.30GiB
  preallocated temp fragmentation:       304B (0.00%)
                 total allocation:    7.15GiB
              total fragmentation:   821.0KiB (0.01%)

whole log is
oommessage.log

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions