This is a text encoding node designed for Edit models (Flux Kontext, Qwen Image Edit, Flux 2 Klein...).
Key features:
- Adjustable Vision Language Model (VLM) resolution via vl_megapixels
- Control over the number of images processed via max_images_allowed
- Support for up to 3 input images
The vl_megapixels parameter is only relevant for Qwen Image Edit (for other Edit models you must set this value to 0).
Qwen Image Edit uses a Vision Language Model (VLM) to analyze your input images and automatically enhance your prompt with more detailed descriptions.
The default TextEncodeQwenImageEdit node downscales your images to 0.15 megapixels before feeding them to the VLM.
This node gives you control over that value, letting you find a better sweet spot for your specific use case (see the sketch after the list below).
By adjusting this threshold, you may achieve:
- Better style preservation
- Reduced zoom effect: mitigates Qwen Image Edit's tendency to zoom in on images
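To make the megapixel threshold concrete, here is a minimal sketch of how an image can be downscaled to a megapixel budget before it is handed to the VLM. The helper name and the use of Pillow are illustrative assumptions, not this node's actual implementation:

```python
from PIL import Image

def downscale_to_megapixels(img: Image.Image, vl_megapixels: float) -> Image.Image:
    """Scale img so its area is at most vl_megapixels * 1e6 pixels,
    preserving the aspect ratio. Hypothetical helper, not the node's code."""
    target_pixels = vl_megapixels * 1_000_000
    current_pixels = img.width * img.height
    if current_pixels <= target_pixels:
        return img  # already within budget; never upscale
    scale = (target_pixels / current_pixels) ** 0.5
    new_size = (max(1, round(img.width * scale)), max(1, round(img.height * scale)))
    return img.resize(new_size, Image.LANCZOS)

# Example: a 1024x1024 (~1.05 MP) image at the default 0.15 MP threshold
img = Image.new("RGB", (1024, 1024))
print(downscale_to_megapixels(img, 0.15).size)  # -> (387, 387)
```

Raising vl_megapixels gives the VLM a sharper view of your inputs at the cost of more compute; lowering it does the opposite.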
The max_images_allowed parameter controls the maximum number of images to process from the available inputs (image1, image2, image3).
For example: if you have 3 images connected to the node, setting this to 1 will process only image1.
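A minimal sketch of that gating logic, assuming Python-side input handling; the function and variable names are illustrative, not the node's actual source:

```python
# Hypothetical illustration of max_images_allowed: only the first N
# connected inputs are kept and passed on to the VLM.
def select_inputs(image1=None, image2=None, image3=None, max_images_allowed=3):
    candidates = [image1, image2, image3][:max_images_allowed]
    return [img for img in candidates if img is not None]

# All three inputs connected, but max_images_allowed=1:
# only image1 is processed.
print(select_inputs("img1", "img2", "img3", max_images_allowed=1))  # ['img1']
```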
Navigate to the ComfyUI/custom_nodes folder, open cmd and run:

git clone https://github.com/BigStationW/ComfyUi-TextEncodeEditAdvanced

Restart ComfyUI after installation.
Find the TextEncodeEditAdvanced node.
I also provide workflows for those interested.
A variant of TextEncodeEditAdvanced that lets you use CLIP Text Encode (Prompt) (or another node).
A workflow using this node for Flux 2 Klein is provided.