Description
Environment data
- debugpy version: 1.8.12
- OS and version: Red Hat Enterprise Linux, 9.5 (Plow)
- Python version (& distribution if applicable, e.g. Anaconda): 3.10.16, Anaconda
- Using VS Code or Visual Studio: VS Code
Actual behavior
I am on a compute cluster that uses LSF, and I launch an interactive job to get onto a compute node. On that compute node, I start debugpy using:
python -m debugpy --listen 0.0.0.0:1326 --wait-for-client -c "print('hello')"
The server then waits for the client. However, when I try to connect to it from VS Code, I get ECONNREFUSED. When I rerun with logging enabled using python -m debugpy --log-to logs --listen 0.0.0.0:1326 --wait-for-client -c "print('hello')", I see the following:
debugpy.pydevd.2718046.log
0.32s - pydevd: Use libraries filter: False
0.00s - IDE_PROJECT_ROOTS []
0.00s - Collecting default library roots.
0.00s - LIBRARY_ROOTS ['/u/jub/.local/lib/python3.10/site-packages', '/u/jub/miniconda3/envs/torch/lib/python3.10', '/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages']
0.00s - Apply debug mode: debugpy-dap
0.00s - Preimport: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages;debugpy._vendored.force_pydevd
0.00s - Connecting to 127.0.0.1:47939
0.00s - Connected to: <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 43098), raddr=('127.0.0.1', 47939)>.
0.00s - Applying patching to hide pydevd threads (Py3 version).
0.01s - ReaderThread: empty contents received (len(line) == 0).
0.00s - PyDB.dispose_and_kill_all_pydevd_threads (called from: File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 324, in _terminate_on_socket_close)
0.00s - PyDB.dispose_and_kill_all_pydevd_threads (first call)
0.00s - PyDB.dispose_and_kill_all_pydevd_threads no commands being processed.
0.00s - PyDB.dispose_and_kill_all_pydevd_threads killing thread: <ReaderThread(pydevd.Reader, started daemon 22788828034624)>
0.00s - pydevd.Reader received kill signal
0.00s - PyDB.dispose_and_kill_all_pydevd_threads killing thread: <WriterThread(pydevd.Writer, started daemon 22788898952768)>
0.00s - sending cmd (http_json) --> CMD_EXIT {"type": "event", "event": "terminated", "seq": 2, "body": {}, "pydevd_cmd_id": 129}
0.00s - pydevd.Writer received kill signal
0.00s - PyDB.dispose_and_kill_all_pydevd_threads waiting for pydb daemon threads to finish
0.00s - PyDB.dispose_and_kill_all_pydevd_threads (called from: File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 432, in _on_run)
0.00s - PyDB.dispose_and_kill_all_pydevd_threads (already disposed - wait)
0.10s - Successfully Loaded helper lib to set tracing to all threads.
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - wait
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - wait
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - _on_run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_daemon_thread.py - run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap_inner
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - wait
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - wait
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - _on_run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_daemon_thread.py - run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap_inner
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - __wait_for_threads_to_finish
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - dispose_and_kill_all_pydevd_threads
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py - _terminate_on_socket_close
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py - _on_run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_daemon_thread.py - run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap_inner
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - wait
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - wait
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - __wait_for_threads_to_finish
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - dispose_and_kill_all_pydevd_threads
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py - _on_run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_daemon_thread.py - run
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap_inner
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py - _bootstrap
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - set_tracing_for_untraced_contexts
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - _locked_settrace
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - settrace
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/api.py - _settrace
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/api.py - listen
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/api.py - debug
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/public_api.py - wrapper
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/cli.py - start_debugging
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/cli.py - run_code
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/cli.py - main
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/__main__.py - <module>
0.00s - Set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/runpy.py - _run_code
0.00s - Set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/runpy.py - _run_module_as_main
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/pydevd.py - settrace
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/api.py - _settrace
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/api.py - listen
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/api.py - debug
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/public_api.py - wrapper
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/cli.py - start_debugging
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/cli.py - run_code
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/server/cli.py - main
0.00s - SKIP set tracing of frame: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/__main__.py - <module>
0.40s - PyDB.dispose_and_kill_all_pydevd_threads: finished
0.00s - The following pydb threads may not have finished correctly: pydevd.CommandThread, pydevd.Writer
0.00s - PyDB.dispose_and_kill_all_pydevd_threads: finished
0.00s - ReaderThread: exit
Traceback (most recent call last):
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 422, in _on_run
cmd.send(self.sock)
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_net_command.py", line 109, in send
sock.sendall(as_bytes)
BrokenPipeError: [Errno 32] Broken pipe
0.00s - WriterThread: exit
debugpy.server-2718046.log
I+00000.013: Linux-5.14.0-427.42.1.el9_4.x86_64-x86_64-with-glibc2.34 x86_64
CPython 3.10.16 (64-bit)
debugpy 1.8.12
I+00000.113: Initial environment:
System paths:
sys.executable: /u/jub/miniconda3/envs/torch/bin/python(/u/jub/miniconda3/envs/torch/bin/python3.10)
sys.prefix: /u/jub/miniconda3/envs/torch
sys.base_prefix: /u/jub/miniconda3/envs/torch
sys.real_prefix: <missing>
site.getsitepackages(): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
site.getusersitepackages(): /u/jub/.local/lib/python3.10/site-packages
sys.path (site-packages): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
sysconfig.get_path('stdlib'): /u/jub/miniconda3/envs/torch/lib/python3.10
sysconfig.get_path('platstdlib'): /u/jub/miniconda3/envs/torch/lib/python3.10
sysconfig.get_path('purelib'): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
sysconfig.get_path('platlib'): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
sysconfig.get_path('include'): /u/jub/miniconda3/envs/torch/include/python3.10
sysconfig.get_path('scripts'): /u/jub/miniconda3/envs/torch/bin
sysconfig.get_path('data'): /u/jub/miniconda3/envs/torch
os.__file__: /u/jub/miniconda3/envs/torch/lib/python3.10/os.py
threading.__file__: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py
debugpy.__file__: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/__init__.py
Installed packages:
kiwisolver==1.4.7
tzdata==2024.2
Jinja2==3.1.4
torch==2.5.1
py-cpuinfo==9.0.0
filelock==3.16.1
multidict==6.1.0
pip==24.2
pluggy==0.13.1
cmake==3.31.2
tomlkit==0.13.2
pybind11==2.13.6
packaging==24.2
click==8.1.7
huggingface_hub==0.26.5
huggingface-hub==0.27.0
safetensors==0.4.5
peft==0.14.0
Brotli==1.0.9
networkx==3.2
sentry-sdk==2.19.2
PyYAML==6.0
mypy==0.991
mccabe==0.7.0
triton-nightly==3.0.0.post20240716052845
transformers==4.47.1
multiprocess==0.70.16
xformers==0.0.29.post1
MarkupSafe==2.1.1
pylint==2.15.7
torchaudio==2.5.1
async-timeout==5.0.1
annotated-types==0.7.0
Pillow==9.2.0
pyarrow==18.1.0
fast_hadamard_transform==1.0.4.post1
PySocks==1.7.1
mypy-extensions==1.0.0
aiosignal==1.3.2
hjson==3.1.0
fsspec==2024.9.0
fsspec==2024.12.0
setuptools==75.1.0
datasets==3.2.0
six==1.17.0
tqdm==4.67.1
typing_extensions==4.12.2
threadpoolctl==3.5.0
debugpy==1.8.12
smmap==5.0.1
ninja==1.11.1.3
frozenlist==1.5.0
scipy==1.14.1
scipy==1.8.1
gmpy2==2.1.2
pydantic==2.10.3
docker-pycreds==0.4.0
protobuf==5.29.2
pytest==6.2.4
aiohttp==3.11.10
gitdb==4.0.11
yarl==1.18.3
py==1.11.0
mpi4py==4.0.1
urllib3==2.2.3
propcache==0.2.1
wrapt==1.17.0
lazy-object-proxy==1.10.0
scikit-build==0.18.1
pycparser==2.22
cycler==0.12.1
distro==1.9.0
iniconfig==2.0.0
idna==3.10
h2==4.1.0
hyperframe==6.0.1
triton==3.1.0
tomli==2.2.1
cffi==1.15.0
types-dataclasses==0.6.6
wandb==0.19.1
fonttools==4.55.3
pycodestyle==2.10.0
wheel==0.44.0
accelerate==1.2.1
scikit-learn==1.6.0
attrs==24.3.0
psutil==6.1.0
zstandard==0.19.0
dill==0.3.8
setproctitle==1.3.4
black==24.3.0
requests==2.32.3
isort==5.13.2
mpmath==1.3.0
certifi==2024.12.14
pyparsing==3.2.0
hpack==4.0.0
pandas==2.2.3
tokenizers==0.21.0
regex==2024.11.6
pytz==2024.2
contourpy==1.3.1
pydantic_core==2.27.1
aiohappyeyeballs==2.4.4
pathspec==0.12.1
torchvision==0.20.1
astroid==2.13.5
GitPython==3.1.43
types-requests==2.26.3
matplotlib==3.10.0
platformdirs==4.3.6
parameterized==0.8.1
msgpack==1.1.0
python-dateutil==2.9.0.post0
toml==0.10.2
numpy==1.26.4
xxhash==3.5.0
joblib==1.4.2
charset-normalizer==3.4.0
colorama==0.4.6
sympy==1.13.1
aihwkit_lightning==0.0.1
sigmamoe==0.0
deepspeed==0.15.4+unknown
analoglora==0.0
analogmoe==0.0
I+00000.113: sys.argv before parsing: ['/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/__main__.py', '--log-to', 'logs', '--listen', '0.0.0.0:1326', '--wait-for-client', '-c', "print('hello')"]
after parsing: ['/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/__main__.py']
D+00000.114: sys.argv after patching: ['-c']
D+00000.114: configure({'qt': 'none', 'subProcess': True}, {})
D+00000.114: listen(('0.0.0.0', 1326), **{})
I+00000.114: Initial debug configuration: {
"qt": "none",
"subProcess": true,
"python": "/u/jub/miniconda3/envs/torch/bin/python",
"pythonEnv": {}
}
I+00000.114: Waiting for adapter endpoints on 127.0.0.1:37193...
I+00000.114: debugpy.listen() spawning adapter: [
"/u/jub/miniconda3/envs/torch/bin/python",
"/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter",
"--for-server",
"37193",
"--host",
"0.0.0.0",
"--port",
"1326",
"--server-access-token",
"04ac658025d99f968fb21846b183284a1501cb1b3c8e54537b6a4bdd24772ce2",
"--log-dir",
"logs"
]
I+00000.283: Endpoints received from adapter: {
"client": {
"host": "0.0.0.0",
"port": 1326
},
"server": {
"host": "127.0.0.1",
"port": 47939
}
}
I+00000.283: Adapter is accepting incoming client connections on 0.0.0.0:1326
D+00000.283: pydevd.settrace(*(), **{'host': '127.0.0.1', 'port': 47939, 'wait_for_ready_to_run': False, 'block_until_connected': True, 'access_token': '04ac658025d99f968fb21846b183284a1501cb1b3c8e54537b6a4bdd24772ce2', 'suspend': False, 'patch_multiprocessing': True, 'dont_trace_start_patterns': ('/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy',), 'dont_trace_end_patterns': ('debugpy_launcher.py',)})
I+00000.395: pydevd is connected to adapter at 127.0.0.1:47939
D+00000.395: wait_for_client()
debugpy.adapter-2718050.log
I+00000.013: Linux-5.14.0-427.42.1.el9_4.x86_64-x86_64-with-glibc2.34 x86_64
CPython 3.10.16 (64-bit)
debugpy 1.8.12
I+00000.127: debugpy.adapter startup environment:
System paths:
sys.executable: /u/jub/miniconda3/envs/torch/bin/python(/u/jub/miniconda3/envs/torch/bin/python3.10)
sys.prefix: /u/jub/miniconda3/envs/torch
sys.base_prefix: /u/jub/miniconda3/envs/torch
sys.real_prefix: <missing>
site.getsitepackages(): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
site.getusersitepackages(): /u/jub/.local/lib/python3.10/site-packages
sys.path (site-packages): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
sysconfig.get_path('stdlib'): /u/jub/miniconda3/envs/torch/lib/python3.10
sysconfig.get_path('platstdlib'): /u/jub/miniconda3/envs/torch/lib/python3.10
sysconfig.get_path('purelib'): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
sysconfig.get_path('platlib'): /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages
sysconfig.get_path('include'): /u/jub/miniconda3/envs/torch/include/python3.10
sysconfig.get_path('scripts'): /u/jub/miniconda3/envs/torch/bin
sysconfig.get_path('data'): /u/jub/miniconda3/envs/torch
os.__file__: /u/jub/miniconda3/envs/torch/lib/python3.10/os.py
threading.__file__: /u/jub/miniconda3/envs/torch/lib/python3.10/threading.py
debugpy.__file__: /u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/../../debugpy/__init__.py(/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/__init__.py)
Installed packages:
kiwisolver==1.4.7
tzdata==2024.2
Jinja2==3.1.4
torch==2.5.1
py-cpuinfo==9.0.0
filelock==3.16.1
multidict==6.1.0
pip==24.2
pluggy==0.13.1
cmake==3.31.2
tomlkit==0.13.2
pybind11==2.13.6
packaging==24.2
click==8.1.7
huggingface_hub==0.26.5
huggingface-hub==0.27.0
safetensors==0.4.5
peft==0.14.0
Brotli==1.0.9
networkx==3.2
sentry-sdk==2.19.2
PyYAML==6.0
mypy==0.991
mccabe==0.7.0
triton-nightly==3.0.0.post20240716052845
transformers==4.47.1
multiprocess==0.70.16
xformers==0.0.29.post1
MarkupSafe==2.1.1
pylint==2.15.7
torchaudio==2.5.1
async-timeout==5.0.1
annotated-types==0.7.0
Pillow==9.2.0
pyarrow==18.1.0
fast_hadamard_transform==1.0.4.post1
PySocks==1.7.1
mypy-extensions==1.0.0
aiosignal==1.3.2
hjson==3.1.0
fsspec==2024.9.0
fsspec==2024.12.0
setuptools==75.1.0
datasets==3.2.0
six==1.17.0
tqdm==4.67.1
typing_extensions==4.12.2
threadpoolctl==3.5.0
debugpy==1.8.12
smmap==5.0.1
ninja==1.11.1.3
frozenlist==1.5.0
scipy==1.14.1
scipy==1.8.1
gmpy2==2.1.2
pydantic==2.10.3
docker-pycreds==0.4.0
protobuf==5.29.2
pytest==6.2.4
aiohttp==3.11.10
gitdb==4.0.11
yarl==1.18.3
py==1.11.0
mpi4py==4.0.1
urllib3==2.2.3
propcache==0.2.1
wrapt==1.17.0
lazy-object-proxy==1.10.0
scikit-build==0.18.1
pycparser==2.22
cycler==0.12.1
distro==1.9.0
iniconfig==2.0.0
idna==3.10
h2==4.1.0
hyperframe==6.0.1
triton==3.1.0
tomli==2.2.1
cffi==1.15.0
types-dataclasses==0.6.6
wandb==0.19.1
fonttools==4.55.3
pycodestyle==2.10.0
wheel==0.44.0
accelerate==1.2.1
scikit-learn==1.6.0
attrs==24.3.0
psutil==6.1.0
zstandard==0.19.0
dill==0.3.8
setproctitle==1.3.4
black==24.3.0
requests==2.32.3
isort==5.13.2
mpmath==1.3.0
certifi==2024.12.14
pyparsing==3.2.0
hpack==4.0.0
pandas==2.2.3
tokenizers==0.21.0
regex==2024.11.6
pytz==2024.2
contourpy==1.3.1
pydantic_core==2.27.1
aiohappyeyeballs==2.4.4
pathspec==0.12.1
torchvision==0.20.1
astroid==2.13.5
GitPython==3.1.43
types-requests==2.26.3
matplotlib==3.10.0
platformdirs==4.3.6
parameterized==0.8.1
msgpack==1.1.0
python-dateutil==2.9.0.post0
toml==0.10.2
numpy==1.26.4
xxhash==3.5.0
joblib==1.4.2
charset-normalizer==3.4.0
colorama==0.4.6
sympy==1.13.1
aihwkit_lightning==0.0.1
sigmamoe==0.0
deepspeed==0.15.4+unknown
analoglora==0.0
analogmoe==0.0
I+00000.128: Listening for incoming Client connections on 0.0.0.0:1326...
I+00000.128: Listening for incoming Server connections on 127.0.0.1:47939...
I+00000.129: Sending endpoints info to debug server at localhost:37193:
{
"client": {
"host": "0.0.0.0",
"port": 1326
},
"server": {
"host": "127.0.0.1",
"port": 47939
}
}
I+00000.129: Writing endpoints info to '/tmp/noConfigDebugAdapterEndpoints-368cbf5a634a2ec02ed2/debuggerAdapterEndpoint.txt':
{
"client": {
"host": "0.0.0.0",
"port": 1326
},
"server": {
"host": "127.0.0.1",
"port": 47939
}
}
E+00000.129: Error writing endpoints info to file:
Traceback (most recent call last):
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/__main__.py", line 115, in main
with open(listener_file, "w") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/noConfigDebugAdapterEndpoints-368cbf5a634a2ec02ed2/debuggerAdapterEndpoint.txt'
Stack where logged:
File "/u/jub/miniconda3/envs/torch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/u/jub/miniconda3/envs/torch/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/__main__.py", line 233, in <module>
main(_parse_argv(sys.argv))
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/__main__.py", line 119, in main
log.reraise_exception("Error writing endpoints info to file:")
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/../../debugpy/common/log.py", line 222, in reraise_exception
_exception(format_string, *args, **kwargs)
I+00000.129: Not logging to "<stderr>" anymore.
Two errors stand out in these logs (I don't know which one causes the other, or which one occurs first):
E+00000.129: Error writing endpoints info to file:
Traceback (most recent call last):
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/__main__.py", line 115, in main
with open(listener_file, "w") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/noConfigDebugAdapterEndpoints-368cbf5a634a2ec02ed2/debuggerAdapterEndpoint.txt'
Stack where logged:
File "/u/jub/miniconda3/envs/torch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/u/jub/miniconda3/envs/torch/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/__main__.py", line 233, in <module>
main(_parse_argv(sys.argv))
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/__main__.py", line 119, in main
log.reraise_exception("Error writing endpoints info to file:")
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/adapter/../../debugpy/common/log.py", line 222, in reraise_exception
_exception(format_string, *args, **kwargs)
and
Traceback (most recent call last):
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 422, in _on_run
cmd.send(self.sock)
File "/u/jub/miniconda3/envs/torch/lib/python3.10/site-packages/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_net_command.py", line 109, in send
sock.sendall(as_bytes)
BrokenPipeError: [Errno 32] Broken pipe
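One thing worth ruling out for the first error is the environment of the shell that starts debugpy. The sketch below assumes that the adapter takes the endpoints file path from the DEBUGPY_ADAPTER_ENDPOINTS environment variable, and that this variable was inherited from a VS Code terminal on another machine whose /tmp is not visible on the compute node; neither assumption is confirmed by the logs above. The function name check_endpoints_file is hypothetical and only reports whether the variable is set and whether the directory it points into exists.

```python
import os
from pathlib import Path


def check_endpoints_file(env=None):
    """Classify the state of the adapter's endpoints file location.

    Assumption (not confirmed by the logs): the adapter reads the target
    path from the DEBUGPY_ADAPTER_ENDPOINTS environment variable.
    Returns "unset", "missing-dir", or "ok".
    """
    env = os.environ if env is None else env
    target = env.get("DEBUGPY_ADAPTER_ENDPOINTS")
    if target is None:
        # Nothing asks the adapter to write an endpoints file.
        return "unset"
    if not Path(target).parent.is_dir():
        # open(target, "w") would fail with FileNotFoundError here.
        return "missing-dir"
    return "ok"


# Usage on the compute node, in the same shell that starts debugpy:
#     print(check_endpoints_file())
```

If this reports "missing-dir", unsetting the variable in that shell before starting debugpy would be a cheap experiment to see whether the FileNotFoundError goes away.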
Expected behavior
This used to work on the cluster and no longer does. Something in the cluster configuration was probably changed, but I would like some guidance on how to fix it, or at least some understanding of what could be going on.
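To separate a network or firewall change from a problem inside debugpy, a plain TCP probe can be run from the machine where VS Code connects while debugpy is waiting. This is a minimal sketch using only the standard library; the helper name probe and the host name in the usage comment are hypothetical. A "refused" result means nothing is accepting connections on that port at that address, while "timeout" more often points to filtering between the two machines.

```python
import socket


def probe(host, port, timeout=3.0):
    """Attempt a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout, e.g. packets dropped by a firewall.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and similar.
        return f"error: {exc}"


# Usage (hypothetical host name, port taken from the --listen argument):
#     print(probe("compute-node.example", 1326))
```

Running the same probe on the compute node itself against 127.0.0.1 shows whether the adapter is still listening at all.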
Steps to reproduce:
I am trying to reproduce on a different cluster right now, but it might take a while as it is very busy.