Hi,
I had a lot of problems creating a conda environment that was consistent and still allowed the code to run. I finally got something running in the default configuration (at least with the command line pasted below), so I decided to share the new environment setup to save you some time. It works as of 30.01.2026.
- create a file with the environment specs. Insert this in a terminal (bash):
cat > requirements_pasd_complete.txt << 'EOF'
# Core PyTorch (specific versions for stability)
torch==2.0.1
torchvision==0.15.2
# Diffusers ecosystem (from original requirements, compatible versions)
diffusers==0.29.2
transformers==4.44.0
accelerate==0.25.0
tokenizers==0.19.1
huggingface-hub==0.24.0
safetensors==0.4.5
# XFormers (platform-dependent)
xformers==0.0.22; sys_platform != 'darwin'
# Training framework
pytorch_lightning==2.1.0
einops==0.7.0
# Image processing and CV
opencv-python==4.8.1.78
Pillow==10.1.0
scikit-image==0.21.0
imageio==2.31.5
# BasicSR and enhancement models (specific versions to avoid conflicts)
basicsr==1.4.2
facexlib==0.3.0
gfpgan==1.3.8
realesrgan==0.3.0
# CLIP and vision models
open_clip_torch==2.23.0
ftfy==6.1.3
# Object detection
ultralytics==8.0.200
# Dataset handling
webdataset==0.2.86
# Gradio (with constraints from original)
gradio==3.50.2
# Scientific computing (compatible with gradio constraints)
numpy==1.24.4
scipy==1.11.4
matplotlib==3.8.0
# Utilities
omegaconf==2.3.0
tqdm==4.66.1
pyyaml==6.0.1
regex==2023.10.3
EOF
- create the conda environment accordingly. Insert this in a terminal (bash):
# Completely remove and recreate the environment
conda deactivate
conda env remove -n pasd_2023 -y
conda create -n pasd_2023 python=3.10 -y
conda activate pasd_2023
# Install PyTorch first (CUDA 11.8 version, adjust if needed)
pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
# Install everything else
pip install -r requirements_pasd_complete.txt
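If you want to confirm that pip really ended up with the pinned versions, something like the following could help. It is only a rough sketch: `parse_pins` and `check_pins` are helper names I made up for illustration, and it assumes the requirements file from the first step sits in the current directory. Platform markers (such as the one on the xformers line) are ignored, so on macOS xformers would be reported as missing.

```python
import os
import re
from importlib import metadata


def parse_pins(text):
    """Return {package: version} for every 'name==version' line.

    Comments and environment markers (the part after ';') are dropped.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        match = re.fullmatch(r"([A-Za-z0-9_.\-]+)==(\S+)", line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def check_pins(pins):
    """Return (package, pinned, installed) for every pin that is not satisfied.

    'installed' is None when the package is not installed at all.
    """
    mismatches = []
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as '+cu118' are ignored for the comparison.
        if installed is None or installed.split("+", 1)[0] != pinned:
            mismatches.append((name, pinned, installed))
    return mismatches


if __name__ == "__main__":
    path = "requirements_pasd_complete.txt"  # file created in the first step
    if os.path.exists(path):
        with open(path) as f:
            for name, pinned, installed in check_pins(parse_pins(f.read())):
                print(f"{name}: pinned {pinned}, installed {installed}")
```

Run it inside the activated pasd_2023 environment; no output means every pin matched.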
- patch vaehook.py to remove the scale arguments (see the other issues for that specific problem). Insert this in a terminal (bash):
# Create a fix script
cat > fix_vaehook_scale.py << 'EOF'
import re
file_path = "pasd/models/pasd/vaehook.py"
with open(file_path, 'r') as f:
    content = f.read()
# Remove scale parameter from all Linear layer calls
# Pattern: matches calls like .to_q(x, scale=scale) and replaces with .to_q(x)
patterns = [
    (r'self\.to_q\(([^,]+), scale=scale\)', r'self.to_q(\1)'),
    (r'self\.to_k\(([^,]+), scale=scale\)', r'self.to_k(\1)'),
    (r'self\.to_v\(([^,]+), scale=scale\)', r'self.to_v(\1)'),
    (r'self\.to_out\[0\]\(([^,]+), scale=scale\)', r'self.to_out[0](\1)'),
]
for pattern, replacement in patterns:
    content = re.sub(pattern, replacement, content)
with open(file_path, 'w') as f:
    f.write(content)
print("✓ Fixed vaehook.py - removed scale parameters from Linear layer calls")
EOF
python fix_vaehook_scale.py
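To see what the patch does before touching the real file, you can run the same four patterns on a few sample lines. This is just an illustration: the sample lines below are invented by me and are not copied from vaehook.py, and `strip_scale` is a hypothetical helper name.

```python
import re

# Same four patterns as in fix_vaehook_scale.py
PATTERNS = [
    (r'self\.to_q\(([^,]+), scale=scale\)', r'self.to_q(\1)'),
    (r'self\.to_k\(([^,]+), scale=scale\)', r'self.to_k(\1)'),
    (r'self\.to_v\(([^,]+), scale=scale\)', r'self.to_v(\1)'),
    (r'self\.to_out\[0\]\(([^,]+), scale=scale\)', r'self.to_out[0](\1)'),
]


def strip_scale(source):
    """Apply every pattern in turn and return the rewritten source text."""
    for pattern, replacement in PATTERNS:
        source = re.sub(pattern, replacement, source)
    return source


# Invented sample lines, only meant to exercise the patterns
sample = (
    "q = self.to_q(h_, scale=scale)\n"
    "k = self.to_k(h_, scale=scale)\n"
    "out = self.to_out[0](h_, scale=scale)\n"
    "y = self.norm(h_)\n"
)
print(strip_scale(sample))
```

Lines without a `scale=scale` argument pass through unchanged, and running the patch a second time changes nothing, so it should be safe to re-run.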
- in test_pasd.py, add a 'none' choice to the --high_level_info argument in the argparse setup. Replace this line:
parser.add_argument('--high_level_info', choices=['classification', 'detection', 'caption'], nargs='?', default='caption', help="high level information for prompt generation")
with this one:
parser.add_argument('--high_level_info', choices=['classification', 'detection', 'caption', 'none'], nargs='?', default='caption', help="high level information for prompt generation")
The code is already tailored for that, and the none option avoids activating components that threw errors for me. You can also try without that patch.
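As a quick check of the argparse change, here is a standalone sketch that only reproduces the patched argument. `build_parser` is a name I made up; the real test_pasd.py defines many more arguments.

```python
import argparse


def build_parser():
    """Parser containing only the patched --high_level_info argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--high_level_info',
                        choices=['classification', 'detection', 'caption', 'none'],
                        nargs='?', default='caption',
                        help="high level information for prompt generation")
    return parser


args = build_parser().parse_args(['--high_level_info', 'none'])
print(args.high_level_info)  # the string 'none', not Python's None
```

Without the extra choice, argparse rejects `--high_level_info none` and exits with an "invalid choice" error before any model is loaded.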
- run inference. For me, this terminal command (bash) works:
python test_pasd.py \
--pasd_model_path runs/pasd_rrdb/checkpoint-100000 \
--image_path "path/to/input/folder" \
--output_dir "path/to/output/folder" \
--upscale 1 \
--process_size 768 \
--high_level_info none
Add a thumbs up or a message to say whether this worked for you or not, so other users know whether it generalizes.