Can I run EXAONE 4.5 33B?

EXAONE 4.5 33B by LG AI Research needs around 32 GB of RAM at the recommended 4-bit quantization (20.0 GB download). Your hardware is checked below instantly, and nothing leaves your browser. Expect roughly 17 tok/s on an Apple M-series Max.

Specifications

Parameters: 33B
Context window: 256K tokens
Provider: LG AI Research
License: EXAONE License (NC)
Released: 2026-04
Best for: Vision, Reasoning, Chat

Size by quantization

Quantization | Bits/weight | Download | Min RAM | Quality
Q2_K | 3.35 | 13.8 GB | 24 GB | Noticeable loss
Q4_K_M (recommended) | 4.85 | 20.0 GB | 32 GB | Recommended
Q5_K_M | 5.65 | 23.3 GB | 32 GB | High
Q8_0 | 8.5 | 35.1 GB | 48 GB | Near-original
F16 | 16 | 66.0 GB | 96 GB | Original

Sizes are estimates from parameter count × bits per weight; real GGUF builds vary slightly. Data updated: 2026-06-11.
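As a rough illustration of that estimate, the sketch below multiplies the parameter count by the bits per weight and divides by 8 bits per byte. The function name is made up for this example, and the bits-per-weight values are copied from the table above; it is not the page's actual calculation code.

```python
def estimate_download_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate file size in GB: parameters x bits per weight / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


# Bits-per-weight values copied from the quantization table.
QUANT_BITS = {"Q2_K": 3.35, "Q4_K_M": 4.85, "Q5_K_M": 5.65, "Q8_0": 8.5, "F16": 16.0}

for name, bits in QUANT_BITS.items():
    print(f"{name}: ~{estimate_download_gb(33, bits):.1f} GB")
```

For the 33B model this lands on the download sizes listed in the table; the Min RAM column adds headroom on top of the file size and is not derived here.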

Memory needed by context length

Context | KV cache (est.) | Total memory (Q4)
4K tokens | ~1.0 GB | ~21.0 GB
8K tokens | ~2.0 GB | ~22.0 GB
32K tokens | ~7.9 GB | ~27.9 GB
128K tokens | ~31.7 GB | ~51.7 GB

The KV cache grows with context length: a model that fits at 4K can run out of memory at 32K. Estimates assume an FP16 cache with grouped-query attention; actual usage varies by runtime.
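The linear growth of an FP16, grouped-query KV cache can be sketched with the usual per-token accounting: two tensors (keys and values) per layer, each holding one vector per KV head. The layer count, KV-head count, and head dimension below are hypothetical placeholders, not EXAONE 4.5's published configuration, so the output will not reproduce the table exactly.

```python
def kv_cache_bytes(context_tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """KV cache size in bytes; bytes_per_value=2 corresponds to FP16."""
    # 2 tensors (K and V) per layer; with grouped-query attention only
    # n_kv_heads heads are cached, not the full set of query heads.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * context_tokens


# Hypothetical architecture values, chosen only for illustration.
HYPOTHETICAL_ARCH = dict(n_layers=64, n_kv_heads=8, head_dim=128)

for ctx in (4096, 8192, 32768, 131072):
    gb = kv_cache_bytes(ctx, **HYPOTHETICAL_ARCH) / 1e9
    print(f"{ctx:>7} tokens: ~{gb:.1f} GB KV cache")
```

Whatever the real architecture values are, the cache scales in direct proportion to context length, which is why the totals in the table climb so quickly between 32K and 128K tokens.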

Estimated speed by hardware

Hardware | Bandwidth | Approx. speed
NVIDIA RTX 3060 12GB | 360 GB/s | Won't fit in VRAM
NVIDIA RTX 4090 24GB | 1008 GB/s | ~43 tok/s
Apple M-series (base) | 100 GB/s | ~4 tok/s
Apple M-series Pro | 270 GB/s | ~11 tok/s
Apple M-series Max | 410 GB/s | ~17 tok/s
CPU only (dual-channel DDR5) | 60 GB/s | ~3 tok/s

Token generation is memory-bandwidth bound: tok/s ≈ bandwidth (GB/s) × 0.85 ÷ model size at Q4 (GB). Real-world numbers vary by runtime and context length.
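The rule of thumb above is easy to try for other hardware. The sketch below applies it with the 0.85 efficiency factor and the 20.0 GB Q4 size from this page; the function name and the simple "model must be smaller than VRAM" fit check are assumptions made for illustration, not the page's actual logic.

```python
from typing import Optional


def estimate_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float,
                               efficiency: float = 0.85) -> float:
    """Bandwidth-bound estimate: each generated token reads the whole model once."""
    return bandwidth_gb_s * efficiency / model_size_gb


def describe(name: str, bandwidth_gb_s: float, model_size_gb: float,
             vram_gb: Optional[float] = None) -> str:
    # Assumed fit rule: a discrete GPU needs the whole model in VRAM.
    if vram_gb is not None and model_size_gb > vram_gb:
        return f"{name}: won't fit in VRAM"
    speed = estimate_tokens_per_second(bandwidth_gb_s, model_size_gb)
    return f"{name}: ~{speed:.0f} tok/s"


Q4_SIZE_GB = 20.0  # Q4_K_M download size from the quantization table

print(describe("NVIDIA RTX 3060 12GB", 360, Q4_SIZE_GB, vram_gb=12))
print(describe("NVIDIA RTX 4090 24GB", 1008, Q4_SIZE_GB, vram_gb=24))
print(describe("Apple M-series Max", 410, Q4_SIZE_GB))
```

Because the estimate ignores the KV cache and runtime overhead, treat it as an upper-end guide rather than a benchmark.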

Frequently asked questions

EXAONE 4.5 33B System Requirements: Can I Run It Locally?