Can this be deployed on an A100? Does the GPU have to be Hopper architecture?

#15
by hxxxxxx13 - opened

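Can this be deployed on an A100 or a 4090? A quantized version would be fine too.
The official vLLM requirements list **compute capability >= 9.0 (Hopper)**, but my card is an NVIDIA A100-SXM4-80GB with compute capability 8.0. When I previously tried to deploy deepseek4-flsh on the A100, it failed with an error.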


vLLM requirements:

Prerequisites
● OS: Linux
● Python: 3.10 - 3.13
● NVIDIA: compute capability >= 9.0 (Hopper) recommended; 8x H200 / H20 for a tight single-node BF16 fit, or multi-node TP for long-context headroom
● AMD: MI350X/MI355X (gfx950), MI300X/MI325X (gfx942), ROCm 7.2+. BF16 needs TP=8; the MXFP8 variant runs from TP=4.
● --block-size 128 is mandatory on every platform (MSA sparse/index cache).
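
A quick way to compare a card against the NVIDIA line above is to check its compute capability. The following is a minimal sketch, not part of vLLM: the helper names are hypothetical, the 9.0 threshold is taken from the quoted prerequisites (which describe it as "recommended"), and the `nvidia-smi` query assumes a driver recent enough to expose the `compute_cap` field.

```python
import subprocess

# Threshold taken from the quoted prerequisites: compute capability >= 9.0 (Hopper).
RECOMMENDED = (9, 0)


def parse_capability(text):
    """Turn a string such as '8.0' into a comparable tuple such as (8, 0)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_recommendation(capability, recommended=RECOMMENDED):
    """Return True if the (major, minor) capability is at least the recommended one."""
    return capability >= recommended


def query_local_gpus():
    """List (name, capability) for local GPUs via nvidia-smi.

    Assumes nvidia-smi is installed and supports the compute_cap query field.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    gpus = []
    for line in out.splitlines():
        if line.strip():
            name, cap = line.rsplit(",", 1)
            gpus.append((name.strip(), parse_capability(cap)))
    return gpus


if __name__ == "__main__":
    try:
        gpus = query_local_gpus()
    except (FileNotFoundError, subprocess.CalledProcessError):
        # No usable nvidia-smi here; fall back to the card from this thread.
        gpus = [("NVIDIA A100-SXM4-80GB", parse_capability("8.0"))]
    for name, cap in gpus:
        status = "meets" if meets_recommendation(cap) else "is below"
        print(f"{name}: compute capability {cap[0]}.{cap[1]} {status} 9.0")
```

This only compares version numbers. It does not tell you whether a particular kernel or quantized variant will actually run on an SM80 card.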


Since you are using an Ampere-based GPU, you could try the vLLM fork that supports Minimax M3.

I couldn't find a vLLM branch that supports SM80.
