Ruihang Xu,
Dewei Zhou,
Fan Ma,
Yi Yang†
ReLER, CCAI, Zhejiang University
- 2025.12.8: Released the inference code, training code, pretrained model weights, and GUI support for ContextGen.
- 2025.10.19: Released the IMIG-Dataset construction pipeline.
ContextGen is a novel framework that uses user-provided reference images to generate images containing multiple instances, offering layout control over their positions while preserving each instance's identity.
- Arxiv Paper
- Inference Code
- Training Code
- Dataset Generation Code
- Huggingface Model Weights
- GUI Support
conda create -n contextgen python=3.12 -y
conda activate contextgen
pip install -r requirements.txt
Download FLUX.1-Kontext and ContextGen Adapter. Configure the weight paths in a .env file and place it in the root directory. You can refer to the .env_template file. The format is as follows:
KONTEXT_MODEL_PATH="path_to_kontext_model"
ADAPTER_PATH="path_to_contextgen_adapter"
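Before running inference, it can help to confirm that the .env file contains both entries shown above. The following is a minimal sketch using only the Python standard library. The helper names `read_env_file` and `missing_keys` are illustrative assumptions and are not part of the ContextGen code base, which may load its configuration differently.

```python
from pathlib import Path

# Keys taken from the .env format shown above.
REQUIRED_KEYS = ("KONTEXT_MODEL_PATH", "ADAPTER_PATH")


def read_env_file(path=".env"):
    """Parse simple KEY="value" lines into a dict (hypothetical helper)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]
```

Calling `missing_keys(read_env_file(".env"))` from the repository root returns an empty list when both paths are set. It does not verify that the paths point to valid model weights.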
⚠️ GPU Memory Note: The inference process requires ~35-40GB GPU memory. We're working on quantization and optimization to reduce the memory footprint in future releases.
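To check in advance whether a GPU is likely to have enough free memory, a small pre-flight check can be run. This is a rough sketch, assuming PyTorch is installed in the environment. The function names and the 40 GB default threshold (the upper end of the estimate above) are assumptions for illustration, not part of ContextGen.

```python
def free_gpu_memory_gb(device_index=0):
    """Return free memory on a CUDA device in GiB, or None if it cannot be queried."""
    try:
        import torch  # imported lazily so the helper still loads without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, _total_bytes = torch.cuda.mem_get_info(device_index)
    return free_bytes / 1024**3


def enough_memory(free_gb, required_gb=40.0):
    """True when the measured free memory covers the estimated requirement."""
    return free_gb is not None and free_gb >= required_gb
```

For example, `enough_memory(free_gpu_memory_gb())` gives a quick yes/no answer for device 0. Actual peak usage depends on resolution and the number of reference images, so treat the result as a hint only.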
To run inference on the provided demos, simply execute:
python inference.py
The generated results will be saved in the images/output folder.
- For Custom Input: You can add your own images to the images/input folder and modify the inference.py file accordingly.
- More Demos: More interesting demos and results can be found on our Project Page.
- Recommended Interaction: For easier interaction, we highly recommend using our GUI Support.
You can customize your own dataset by referencing the IMIG-Dataset construction code. Remember to add your WANDB API key in the .env file for experiment tracking:
WANDB_API_KEY="your_wandb_api_key"
Then configure the training parameters in train/config/config.yaml and run:
python src/model/train.py
We provide a simple GUI built with Vite and React for easier interaction.
The GUI requires additional models. Please download them and set their full paths in the .env file:
- Image Cutout (Required): Download the BEN2 model.
  BEN_CKPT_PATH="path_to_ben2_model"
- Asset Generation from Text (Optional): Download the FLUX.1-dev model.
  FLUX_MODEL_PATH="path_to_flux_model"
⚠️ GPU Memory Note: Using the optional asset generation feature consumes an additional ~30GB GPU memory. If a single GPU does not have enough memory, consider loading this model on a different GPU. If you do not require this feature, you can comment out the related code in gui/backend/app.py.
If you don't have Node.js and npm installed, you can install them as follows:
# install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
# restart your terminal to load nvm, then install Node.js
nvm install --lts
To build and run the demo, follow these steps:
- Start Frontend: In the first terminal, run the following commands:
  cd gui/frontend
  npm install  # for the first time only
  npm run dev
- Start Backend: Open a second terminal and run the backend server:
  python gui/backend/app.py
Once both the frontend and backend servers are running, you can access the GUI in your local browser at http://localhost:5173. If you are working on a remote server, port forwarding is required first: ensure the frontend port (127.0.0.1:5173) and the backend port (127.0.0.1:5000) are forwarded to the corresponding ports on your local machine.
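For the remote-server case, one common way to forward both ports is SSH local forwarding. The sketch below only builds and prints the command. The host `user@remote-server` is a placeholder, and the helper `ssh_forward_command` is an illustrative assumption rather than something shipped with ContextGen.

```python
import shlex


def ssh_forward_command(host, ports=(5173, 5000)):
    """Build an ssh command that forwards each local port to the same port on the host."""
    args = ["ssh", "-N"]  # -N: forward ports only, do not run a remote command
    for port in ports:
        args += ["-L", f"{port}:127.0.0.1:{port}"]
    args.append(host)
    return args


# Placeholder host; replace with your own login and server name.
print(shlex.join(ssh_forward_command("user@remote-server")))
# prints: ssh -N -L 5173:127.0.0.1:5173 -L 5000:127.0.0.1:5000 user@remote-server
```

Run the printed command on your local machine and keep it open while using the GUI.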
- For better identity rendering and visual quality, we recommend using a moderate resolution (e.g., 768x768 or 512x512). This strikes a balance, as higher resolutions may compromise identity consistency, while lower resolutions can introduce artifacts.
- To enhance visual quality and contextual consistency, we recommend using a richer prompt that includes detailed, interactive relationships between the instances.
- If a generated case fails or exhibits poor quality, please try again with a different random seed.
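The retry-with-a-new-seed tip can be automated with a small loop. This is a generic sketch, not ContextGen's API: `generate` and `accept` are hypothetical callables that you would supply yourself (for example, a wrapper around your inference call and a manual or automatic quality check).

```python
import random


def pick_seeds(count, base_seed=0):
    """Draw `count` distinct seeds reproducibly from a base seed."""
    rng = random.Random(base_seed)
    return rng.sample(range(2**31), count)


def generate_with_retries(generate, accept, seeds):
    """Call generate(seed) for each seed until accept(result) is true.

    Returns (seed, result) for the first accepted result, or None if every
    seed was rejected.
    """
    for seed in seeds:
        result = generate(seed)
        if accept(result):
            return seed, result
    return None
```

Returning the seed alongside the result makes a good generation reproducible later.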
If you find ContextGen helpful to your research, please consider citing our paper:
@article{xu2025contextgencontextuallayoutanchoring,
title={ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation},
author={Ruihang Xu and Dewei Zhou and Fan Ma and Yi Yang},
year={2025},
eprint={2510.11000},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2510.11000},
}
