- Subject-Driven: Place subjects into new scenes while maintaining identity consistency
- Style-Driven: Apply artistic styles to new content based on reference images
- Combined: Use both subject and style references simultaneously
ByteDance USO ComfyUI Native Workflow
1. Workflow and input
Download the image below and drag it into ComfyUI to load the corresponding workflow.
Download JSON Workflow
Use the image below as an input image.
2. Model links
Please download all models and place them in the following directories:
- checkpoints
- loras
- model_patches
- clip_visions
3. Workflow instructions

- Load models:
  - 1.1 Ensure the `Load Checkpoint` node has `flux1-dev-fp8.safetensors` loaded
  - 1.2 Ensure the `LoraLoaderModelOnly` node has `dit_lora.safetensors` loaded
  - 1.3 Ensure the `ModelPatchLoader` node has `projector.safetensors` loaded
  - 1.4 Ensure the `Load CLIP Vision` node has `sigclip_vision_patch14_384.safetensors` loaded
- Content Reference:
  - 2.1 Click `Upload` to upload the input image we provided
  - 2.2 The `ImageScaleToMaxDimension` node scales your input image for content reference. 512px keeps more character features, but if you use only the character’s head as input, the character often takes up too much of the final output image. Setting it to 1024px gives much better results.
- In the example, we only use the `content reference` image input. If you want to use the `style reference` image input, select the marked node group and press `Ctrl+B` to toggle its bypass state.
- Write your prompt or keep the default
- Set the image size if needed
- The EasyCache node accelerates inference at the cost of some quality and detail. You can bypass it (`Ctrl+B`) if you don’t need it.
- Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to run the workflow
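
Steps 1 and 2.2 above can be sanity-checked outside ComfyUI. The sketch below is only an illustration: the helper names `missing_models` and `scale_to_max_dimension` are hypothetical, the pairing of each file with a directory is an assumption based on the loader nodes listed above, and the real `ImageScaleToMaxDimension` node may round differently.

```python
from pathlib import Path

# File names come from the workflow steps above. The directory each file
# belongs in is an ASSUMPTION inferred from the loader node that uses it.
EXPECTED_MODELS = {
    "checkpoints": "flux1-dev-fp8.safetensors",
    "loras": "dit_lora.safetensors",
    "model_patches": "projector.safetensors",
    "clip_visions": "sigclip_vision_patch14_384.safetensors",
}


def missing_models(models_root):
    """Return 'folder/file' entries that are not present under models_root."""
    root = Path(models_root)
    return [
        f"{folder}/{name}"
        for folder, name in EXPECTED_MODELS.items()
        if not (root / folder / name).is_file()
    ]


def scale_to_max_dimension(width, height, largest_size):
    """Scale (width, height) so the longer side equals largest_size.

    Keeps the aspect ratio; rounding to the nearest pixel is an assumption.
    """
    if width <= 0 or height <= 0 or largest_size <= 0:
        raise ValueError("dimensions must be positive")
    scale = largest_size / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # "ComfyUI/models" is a placeholder path; adjust it to your installation.
    for entry in missing_models("ComfyUI/models"):
        print("missing:", entry)
    # Compare the two content-reference sizes discussed in step 2.2.
    for size in (512, 1024):
        print(size, "->", scale_to_max_dimension(1536, 2048, size))
```

Running the script lists any model files that are not where the assumed layout expects them, and shows the dimensions a 1536x2048 input would be scaled to at each setting.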
4. Additional Notes
- Style reference only: leave out the content reference node and use only an Empty Latent Image node.
- You can also bypass the whole `Style Reference` group and use the workflow as a text-to-image workflow, which means this workflow has 4 variations:
  - Only use content (subject) reference
  - Only use style reference
  - Mixed content and style reference
  - As a text-to-image workflow
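
The four variations follow from two independent on/off choices: whether the content reference is active and whether the style reference is active. A minimal sketch of that mapping, where the function `workflow_variation` and its parameter names are hypothetical and not part of ComfyUI:

```python
from itertools import product


def workflow_variation(use_content_reference, use_style_reference):
    """Name the workflow variation for a given pair of reference toggles."""
    if use_content_reference and use_style_reference:
        return "mixed content and style reference"
    if use_content_reference:
        return "content (subject) reference only"
    if use_style_reference:
        return "style reference only"
    return "text to image"


if __name__ == "__main__":
    # Enumerate every combination of the two toggles (2 x 2 = 4 variations).
    for content, style in product((True, False), repeat=2):
        print(f"content={content!s:5} style={style!s:5} -> "
              f"{workflow_variation(content, style)}")
```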