Hardware Provisioning

LOCAL DEPLOYMENT

Calculate the exact hardware specs required to run Sovereign models on bare metal.

Quick-Select Real Model

9 models

Model Parameters (Billions)8B

Quantization Level

Higher quantization reduces VRAM needs but may degrade reasoning quality in complex multi-step tasks.

Concurrent Instances1

Recommended Backend

vLLM / llama.cpp (GGUF)

Minimum VRAM

6.8GB

Bandwidth Tier

PCIe 4.0 x16 Sufficient

Preferred Hardware Architecture

Single RTX 4060 / 3060 Ti

Sovereign OS Verified (Ubuntu 24.04+)

CUDA 12.4 Compatible

Local Compute = Total Sovereignty

Ensure at least 1 vCPU core per 10 billion parameters to prevent bottlenecking the pre-fill stage.

NVMe Gen4 storage is mandatory for rapid model loading into VRAM. HDDs are entirely deprecated.

Unified Memory on M-series chips allows for larger model loading (up to 192GB) at the cost of lower throughput.