Issue on page /references/nvidia_jetson.html #14

@hhhhljp

Description

Problems installing sglang with jetson-containers

Reference Links: https://docs.sglang.ai/references/nvidia_jetson.html

The build fails while executing:

CUDA_VERSION=12.6 jetson-containers build sglang

Error message:
sudo docker run -t --rm --runtime=nvidia --network=host \
  --volume /home/lmxox/jetson-containers/packages/llm/transformers:/test \
  --volume /home/lmxox/jetson-containers/data:/data \
  --workdir /test \
  sglang:r36.4.3-cu126-transformers \
  /bin/bash -c 'python3 test_version.py' \
  2>&1 | tee /home/lmxox/jetson-containers/logs/20250510_102644/test/sglang_r36.4.3-cu126-transformers_test_version.py.txt; exit ${PIPESTATUS[0]}

/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:105: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
transformers version: 4.51.3
-- Testing container sglang:r36.4.3-cu126-transformers (transformers:4.51.3/huggingface-benchmark.py)
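The FutureWarning above is harmless but easy to silence: per the message, transformers wants HF_HOME set instead of the deprecated TRANSFORMERS_CACHE. A minimal sketch (the cache path below is an example, not one from this report):

```shell
# Point the Hugging Face cache at a persistent directory via HF_HOME,
# the replacement for the deprecated TRANSFORMERS_CACHE.
# /data/models/huggingface is an example path, not from this report.
export HF_HOME=/data/models/huggingface
echo "$HF_HOME"
```

On Jetson it makes sense to put this on a mounted volume (e.g. the /data mount used by jetson-containers) so downloaded models survive container rebuilds.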

sudo docker run -t --rm --runtime=nvidia --network=host \
  --volume /home/lmxox/jetson-containers/packages/llm/transformers:/test \
  --volume /home/lmxox/jetson-containers/data:/data \
  --workdir /test \
  sglang:r36.4.3-cu126-transformers \
  /bin/bash -c 'python3 huggingface-benchmark.py' \
  2>&1 | tee /home/lmxox/jetson-containers/logs/20250510_102644/test/sglang_r36.4.3-cu126-transformers_huggingface-benchmark.py.txt; exit ${PIPESTATUS[0]}

/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:105: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
Namespace(model='distilgpt2', prompt='Once upon a time,', precision=None, tokens=[128], token='', runs=2, warmup=2, save='')
Running on device cuda
Input tokens: tensor([[7454, 2402, 257, 640, 11]], device='cuda:0') shape: torch.Size([1, 5])
Loading model distilgpt2 (None)
Traceback (most recent call last):
  File "/test/huggingface-benchmark.py", line 71, in <module>
    model = AutoModelForCausalLM.from_pretrained(args.model, device_map=device, **kwargs) #AutoModelForCausalLM.from_pretrained(args.model, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 571, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 279, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4399, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4793, in _load_pretrained_model
    caching_allocator_warmup(model_to_load, expanded_device_map, factor=2 if hf_quantizer is None else 4)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 5791, in caching_allocator_warmup
    param_byte_count //= torch.distributed.get_world_size() if tp_plan_regex.search(generic_name) else 1
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 2020, in get_world_size
    return _get_group_size(group)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 986, in _get_group_size
    default_pg = _get_default_group()
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 1150, in _get_default_group
    raise ValueError(
ValueError: Default process group has not been initialized, please make sure to call init_process_group.
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/lmxox/jetson-containers/jetson_containers/build.py", line 120, in <module>
    build_container(args.name, args.packages, args.base, args.build_flags, args.build_args, args.simulate, args.skip_tests, args.test_only, args.push, args.no_github_api, args.skip_packages)
  File "/home/lmxox/jetson-containers/jetson_containers/container.py", line 158, in build_container
    test_container(container_name, pkg, simulate)
  File "/home/lmxox/jetson-containers/jetson_containers/container.py", line 331, in test_container
    status = subprocess.run(cmd.replace(NEWLINE, ' '), executable='/bin/bash', shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'sudo docker run -t --rm --runtime=nvidia --network=host --volume /home/lmxox/jetson-containers/packages/llm/transformers:/test --volume /home/lmxox/jetson-containers/data:/data --workdir /test sglang:r36.4.3-cu126-transformers /bin/bash -c 'python3 huggingface-benchmark.py' 2>&1 | tee /home/lmxox/jetson-containers/logs/20250510_102644/test/sglang_r36.4.3-cu126-transformers_huggingface-benchmark.py.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.

Jetson information
Model: NVIDIA Jetson AGX Orin Developer Kit
Distribution: Ubuntu 22.04 Jammy Jellyfish
699-level Part Number: 699-63701-0001-BS1 A.0
Release: 5.15.148-tegra
P-Number: p3701-0001
Python: 3.10.12

Libraries:
CUDA Arch BIN: 8.7
CUDA: 12.6.68
L4T: 36.4.3
cuDNN: 9.3.0.75
JetPack: 6.2
TensorRT: 10.3.0.30
