
ContextGen: Contextual Layout Anchoring
for Identity-Consistent Multi-Instance Generation

Ruihang Xu, Dewei Zhou, Fan Ma, Yi Yang
ReLER, CCAI, Zhejiang University

Project Page | Paper | Dataset | Code

🔥 Updates

  • 2025.12.8: Released the inference code, training code, pretrained model weights, and GUI support for ContextGen.
  • 2025.10.19: Released the IMIG-Dataset construction pipeline.

📝 Introduction

Teaser

ContextGen is a novel framework that uses user-provided reference images to generate images with multiple instances, offering layout control over their positions while guaranteeing identity preservation.

✅ To-Do List

  • Arxiv Paper
  • Inference Code
  • Training Code
  • Dataset Generation Code
  • Huggingface Model Weights
  • GUI Support

🚀 Quick Start

Environment Setup

conda create -n contextgen python=3.12 -y
conda activate contextgen
pip install -r requirements.txt

Download Pretrained Models

Download FLUX.1-Kontext and the ContextGen Adapter. Configure the weight paths in a .env file placed in the repository root (see the .env_template file for reference). The format is as follows:

KONTEXT_MODEL_PATH="path_to_kontext_model"
ADAPTER_PATH="path_to_contextgen_adapter"
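
If you need these paths in a custom script, here is a minimal sketch for reading them with python-dotenv. The variable names match the .env format above; the use of python-dotenv itself is an assumption, as the repository may load the file differently:

import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

kontext_path = os.environ["KONTEXT_MODEL_PATH"]
adapter_path = os.environ["ADAPTER_PATH"]
print("Kontext weights:", kontext_path)
print("Adapter weights:", adapter_path)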

Inference

⚠️ GPU Memory Note: Inference requires ~35-40 GB of GPU memory. We are working on quantization and optimization to reduce the memory footprint in future releases.

To run inference on the provided demos, simply execute:

python inference.py

The generated results will be saved in the images/output folder.

  • For Custom Input: You can add your own images to the images/input folder and modify the inference.py file accordingly; a hypothetical sketch follows this list.

  • More Demos: More interesting demos and results can be found on our Project Page.

  • Recommended Interaction: For easier interaction, we highly recommend using our GUI Support.
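
Below is a hypothetical sketch of how custom inputs might be organized before wiring them into inference.py. The variable names, file names, and box format are illustrative placeholders, not the script's actual interface; mirror the demo entries in inference.py for the real format:

from PIL import Image

# Illustrative only: two reference images placed in images/input.
references = [
    Image.open("images/input/instance_1.png").convert("RGB"),
    Image.open("images/input/instance_2.png").convert("RGB"),
]
# Layout boxes as (x0, y0, x1, y1) pixel coordinates on a 768x768 canvas.
boxes = [(32, 256, 352, 704), (416, 288, 736, 704)]
prompt = "two subjects interacting in a shared scene"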

Training

You can build your own dataset by following the IMIG-Dataset construction code. Remember to add your WANDB API key to the .env file for experiment tracking:

WANDB_API_KEY="your_wandb_api_key"

Then configure the training parameters in train/config/config.yaml and run:

python src/model/train.py
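
Since the exact configuration keys are project-specific, one quick way to sanity-check your edits before launching a run is to parse the file yourself. A generic sketch using PyYAML (the library choice is an assumption; the training script may parse the config differently):

import yaml  # pip install pyyaml

# Print the parsed training configuration to verify your edits.
with open("train/config/config.yaml") as f:
    config = yaml.safe_load(f)
print(config)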

GUI Support

We provide a simple GUI built with Vite and React for easier interaction.

1. Model Dependencies & Setup

The GUI requires additional models. Please download them and set their full paths in the .env file:

  • Image Cutout (Required): Download the BEN2 model.

    BEN_CKPT_PATH="path_to_ben2_model"
  • Asset Generation from Text (Optional): Download the FLUX.1-dev model.

    FLUX_MODEL_PATH="path_to_flux_model"

⚠️ GPU Memory Note: Using the optional asset generation feature consumes an additional ~30 GB of GPU memory. If memory on a single GPU is limited, consider loading this model on a different GPU (a sketch follows below). If you do not require this feature, you can comment out the related code in gui/backend/app.py.
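
A minimal sketch of loading the optional FLUX.1-dev model on a second GPU with diffusers, assuming the backend uses a standard FluxPipeline (the actual loading code in gui/backend/app.py may differ):

import os

import torch
from diffusers import FluxPipeline

# Illustrative: place the optional text-to-asset model on cuda:1 so it does
# not compete with the main ContextGen pipeline on cuda:0.
flux = FluxPipeline.from_pretrained(
    os.environ["FLUX_MODEL_PATH"], torch_dtype=torch.bfloat16
)
flux.to("cuda:1")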

2. NodeJS Dependencies

If you don't have Node.js and npm installed, you can install them as follows:

# install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
# restart your terminal to load nvm, then install Node.js
nvm install --lts

3. Launching the GUI

To build and run the demo, follow these steps:

  • Start Frontend: In the first terminal, run the following commands:

    cd gui/frontend
    npm install # for the first time only
    npm run dev
  • Start Backend: Open a second terminal and run the backend server:

    python gui/backend/app.py

Accessing the GUI

Once both the frontend and backend servers are running, you can access the GUI. If you are working on a remote server, port forwarding is required: forward the frontend port (127.0.0.1:5173) and the backend port (127.0.0.1:5000) to the corresponding ports on your local machine, then open http://localhost:5173 in your local browser.
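
For example, with SSH (user and remote-server are placeholders for your own credentials and host):

ssh -L 5173:127.0.0.1:5173 -L 5000:127.0.0.1:5000 user@remote-server

Here's a quick preview of the interface: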

GUI Demo

💡 Tips

  • For better identity rendering and visual quality, we recommend using a medium resolution (e.g., 768x768 or 512x512). This strikes a balance: higher resolutions may compromise identity consistency, while lower resolutions can introduce artifacts.
  • To enhance visual quality and contextual consistency, use a richer prompt that describes detailed, interactive relationships between the instances.
  • If a generated case fails or exhibits poor quality, try again with a different random seed (see the sketch below).
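
If you want to control the seed explicitly when adapting inference.py, here is a generic PyTorch sketch; the script's actual seeding mechanism may differ:

import torch

seed = 1234  # vary this across attempts if a case fails
generator = torch.Generator(device="cuda").manual_seed(seed)
# Pass `generator` to the diffusion pipeline call,
# e.g. pipe(..., generator=generator).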

🎉 Enjoy Using ContextGen! 🎉

📭 Citation

If you find ContextGen helpful to your research, please consider citing our paper:

@article{xu2025contextgencontextuallayoutanchoring,
      title={ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation},
      author={Ruihang Xu and Dewei Zhou and Fan Ma and Yi Yang},
      year={2025},
      eprint={2510.11000},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.11000},
}
