Issue on page /start/install.html (ROCm AMD Cards) #13

Open

Hi,

https://docs.sglang.ai/start/install.html

The Docker installation instructions do not work. Running the stock Docker image as recommended for ROCm yields:

triton.runtime.errors.OutOfResources: out of resource: shared memory, Required: 196608, Hardware limit: 65536. Reducing block sizes or num_stages may help.
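
To put the numbers in that message in perspective, here is a small sketch that parses the error text and compares the requested shared memory with the hardware limit. The helper name `parse_shared_memory_error` is made up for illustration and is not part of Triton; only the message string comes from the run above.

```python
import re

# Error text copied verbatim from the failing run.
ERROR = (
    "triton.runtime.errors.OutOfResources: out of resource: shared memory, "
    "Required: 196608, Hardware limit: 65536. "
    "Reducing block sizes or num_stages may help."
)


def parse_shared_memory_error(message):
    """Return (required_bytes, limit_bytes) parsed from the error text.

    Hypothetical helper for illustration; not a Triton API.
    """
    match = re.search(r"Required: (\d+), Hardware limit: (\d+)", message)
    if match is None:
        raise ValueError("not a shared-memory OutOfResources message")
    return int(match.group(1)), int(match.group(2))


required, limit = parse_shared_memory_error(ERROR)
ratio = required / limit
print(f"required={required // 1024} KiB, limit={limit // 1024} KiB, ratio={ratio:.1f}x")
# prints: required=192 KiB, limit=64 KiB, ratio=3.0x
```

So the kernel is asking for three times the shared memory the hardware reports as available. I have not identified which kernel or tuning config is responsible, so this only quantifies the gap that smaller block sizes or fewer stages would have to close.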

Other folks have run into this on CUDA, and the recommendation there was to upgrade to vllm>=0.8.5. I built vllm from source and then tried both building with sglang's docker/Dockerfile.rocm and building sglang from source, but ran into multiple version conflicts and other issues.
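
For anyone trying to reproduce the version conflicts, a minimal sketch like the following can be run inside the container to report which versions are actually installed. It relies only on the standard library's `importlib.metadata`. The package list is an assumption on my part, and on ROCm images Triton may be installed under a different distribution name, in which case it will show up as not installed here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Assumed distribution names; adjust to match the image being inspected.
    for name, ver in installed_versions(["sglang", "vllm", "triton", "torch"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Including this output in a report makes it easier to see which pins disagree.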

The system I was attempting to run this on has 16x AMD MI300X cards.

Please advise. This seems doable to patch on my end, but I'm wondering if someone with more intimate knowledge of sglang's builds can provide insight or advice about any 'gotchas' I may run into. Either that, or the image could be upgraded so that it works.

Best,

P.S.

vllm-project/vllm#17578 (comment)
