RuntimeError: stack expects each tensor to be equal size, but got [14] at entry 0 and [12] at entry 1 #2

@zhanhl316

Training of Epoch 0: GPU 0 will process 591616 data in 2311 iterations.
0%| | 0/2311 [00:31<?, ?it/s]
Traceback (most recent call last):
File "xmatching/main.py", line 313, in <module>
main()
File "xmatching/main.py", line 43, in main
mp.spawn(train, nprocs=args.gpus, args=(args,))
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
while not context.join():
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 119, in join
raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
fn(i, *args)
File "/home/zhanhaolan/codes/vokenization/xmatching/main.py", line 233, in train
for i, (uid, lang_input, visn_input) in enumerate(tqdm.tqdm(train_loader, disable=(gpu!=0))):
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/tqdm/std.py", line 1167, in __iter__
for obj in iterable:
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
data = self._next_data()
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
return self._process_data(data)
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
data.reraise()
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
data = fetcher.fetch(index)
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [14] at entry 0 and [12] at entry 1

Hi, do you have any idea what causes this error?
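
For context, the last frame of the traceback shows `default_collate` calling `torch.stack` on per-sample tensors of different lengths (14 and 12), which is why the batch cannot be built. The snippet below is a minimal sketch that reproduces this kind of failure on toy data and shows one possible workaround: a custom `collate_fn` that pads to the longest sample. The toy `samples`, the `pad_collate` helper, and `pad_id=0` are assumptions made for illustration. They are not taken from `xmatching/main.py`, whose real samples are nested `(uid, lang_input, visn_input)` tuples, so the helper would need adapting.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Hypothetical toy samples: 1-D token-id tensors of lengths 14 and 12,
# mirroring the sizes reported in the error message.
samples = [torch.ones(14, dtype=torch.long), torch.ones(12, dtype=torch.long)]


def pad_collate(batch, pad_id=0):
    """Pad variable-length 1-D tensors to the longest one in the batch.

    Returns the padded ids and a boolean mask marking real (non-pad) tokens.
    `pad_id=0` is an assumed padding value, not the project's actual one.
    """
    ids = pad_sequence(batch, batch_first=True, padding_value=pad_id)
    lengths = torch.tensor([len(x) for x in batch])
    mask = torch.arange(ids.size(1))[None, :] < lengths[:, None]
    return ids, mask


# The default collate function tries to stack the two tensors and fails.
try:
    next(iter(DataLoader(samples, batch_size=2)))
    default_failed = False
except RuntimeError as err:
    default_failed = True
    print("default_collate failed:", err)

# With the padding collate function the same batch can be built.
ids, mask = next(iter(DataLoader(samples, batch_size=2, collate_fn=pad_collate)))
print(ids.shape, mask.sum(dim=1).tolist())
```

If the dataset is supposed to return fixed-length inputs already, the alternative is to check the tokenization or padding step inside the dataset's `__getitem__` rather than padding at collate time.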
