News

This repo is used for vLLM's interactive performance spot checks. For automated benchmarks, please refer to vLLM's nightly set. The goal for this repo is to establish a set of commonly used benchmarks and ...
Detailed commands are in trt-guide.md. SGLang: @zhyncs has this repro command on 0.4.5.post1 (04/15/2025) turning on data parallelism in vLLM. Some notable changes: Bugfix in DP. The new version also ...
🙏 @deepseek_ai's highly performant inference engine is built on top of vLLM. Now they are open-sourcing the engine the right way: instead of a separate repo, they are bringing changes to the open ...