Better continuous batching tests #42699

Open

remi-or wants to merge 6 commits into main from cb-tests

+204 −299

Collaborator

remi-or commented Dec 8, 2025

Since a lot of features have been added to continuous batching (CB), this PR aims to refactor the associated tests so we can catch new failures. The tests used to be structured as one test per model / attention implementation. Now there is a single test backend, which checks that generation with and without CB is consistent. It is called in two tests:

- we test all possible sets of parameters on one tiny llama model
- we test a restricted set of parameters on different architectures: full attention, sliding window, etc.
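
The shape of such a backend can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in this PR: `toy_next_token`, `generate_sequential`, `generate_continuous`, `check_cb_matches_reference` and `run_grid` are hypothetical names, and the "model" is a deterministic toy function rather than a real llama checkpoint. The point is the structure: one shared check comparing the CB path against a reference path, driven by a parameter grid.

```python
from itertools import product


def toy_next_token(prefix):
    """Deterministic stand-in for one greedy decoding step (hypothetical)."""
    return (sum(prefix) * 31 + len(prefix)) % 97


def generate_sequential(prompts, max_new_tokens):
    """Reference path: each request is decoded alone, start to finish."""
    outputs = []
    for prompt in prompts:
        tokens = list(prompt)
        for _ in range(max_new_tokens):
            tokens.append(toy_next_token(tokens))
        outputs.append(tokens[len(prompt):])
    return outputs


def generate_continuous(prompts, max_new_tokens, max_batch_size):
    """Toy continuous batching: a finished request frees its slot at once.

    Assumes max_new_tokens >= 1.
    """
    waiting = list(enumerate(prompts))
    active = {}  # request index -> tokens so far
    outputs = [None] * len(prompts)
    while waiting or active:
        # Admit waiting requests while there is room in the batch.
        while waiting and len(active) < max_batch_size:
            idx, prompt = waiting.pop(0)
            active[idx] = list(prompt)
        # Run one decoding step for every active request.
        for idx in list(active):
            tokens = active[idx]
            tokens.append(toy_next_token(tokens))
            if len(tokens) - len(prompts[idx]) >= max_new_tokens:
                outputs[idx] = tokens[len(prompts[idx]):]
                del active[idx]
    return outputs


def check_cb_matches_reference(prompts, max_new_tokens, max_batch_size):
    """Shared backend: CB output must equal the non-CB reference output."""
    expected = generate_sequential(prompts, max_new_tokens)
    actual = generate_continuous(prompts, max_new_tokens, max_batch_size)
    assert actual == expected, (max_new_tokens, max_batch_size)


def run_grid():
    """Call the backend on every combination of a small parameter grid."""
    prompts = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]
    grid = list(product((1, 4, 8), (1, 2, 5)))
    for max_new_tokens, max_batch_size in grid:
        check_cb_matches_reference(prompts, max_new_tokens, max_batch_size)
    return len(grid)
```

In a real suite the two callers would differ only in what they pass to the shared check: the full grid for one tiny model, and a reduced grid for each architecture.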

There was also an effort to regroup the streaming tests so they can use the same backend.

Overall, the new tests cover more ground with less code. They have already caught one bug: zero-sized CUDA graphs failed silently, which led to slight generation divergence.
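
The description above does not say how the zero-size case is handled, so the following is only a generic illustration of the idea, not the fix in this PR. `GraphCache` is a hypothetical stand-in for a per-size graph cache, written in plain Python with no CUDA involved: an empty batch is short-circuited before anything is captured or replayed, so a size-0 entry can never exist.

```python
class GraphCache:
    """Hypothetical per-size cache; 'capturing' just stores a callable."""

    def __init__(self):
        self.captured = {}

    def run(self, batch_size, step_fn):
        if batch_size < 0:
            raise ValueError("batch_size must be non-negative")
        if batch_size == 0:
            # Nothing to compute: skip capture and replay entirely
            # instead of recording an empty graph.
            return []
        if batch_size not in self.captured:
            self.captured[batch_size] = step_fn  # stand-in for capture
        return self.captured[batch_size](batch_size)


cache = GraphCache()
empty = cache.run(0, lambda n: list(range(n)))
full = cache.run(3, lambda n: list(range(n)))
```

Making the empty case an explicit branch means it is either skipped on purpose or rejected loudly, rather than going through the capture path unnoticed.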

remi-or added 4 commits

December 5, 2025 09:23

No more size 0 cuda graph (39757e8)
Better tests for CB (5b3ca98)
compile fix for CB test (5bbc1d3)
style (243fc00)

remi-or requested review from ArthurZucker, LysandreJik and McPatate

December 8, 2025 13:25

HuggingFaceDocBuilderDev commented Dec 8, 2025

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

remi-or added 2 commits

December 8, 2025 18:32

More cleanup and cuda exclusive (01ac836)
Returned to slow tests (1cdde01)


Labels

None yet