{"product_id":"k-ai-128-rome-5090-6704tops-4-rtx-5090-blackwell-ai-server","title":"K-AI 128 Rome 5090 6704TOPS — 4× RTX 5090 Blackwell AI Server","description":"\u003cdiv style=\"font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;line-height:1.7;color:#1a1a1a\"\u003e\n\n\u003cdiv style=\"background:linear-gradient(135deg,#0d0d0d 0%,#1a1a2e 100%);color:#fff;padding:32px;border-radius:12px;margin-bottom:32px\"\u003e\n\u003cp style=\"font-size:18px;margin:0 0 20px 0;color:#ccc\"\u003eK-AI 128 Rome 5090 6704TOPS\u003c\/p\u003e\n\u003cp style=\"font-size:28px;font-weight:700;margin:0 0 16px 0;line-height:1.3\"\u003e128 GB VRAM Blackwell Inference Server\u003cbr\u003e4x RTX 5090 | EPYC Milan | 6 704 TOPS INT8\u003c\/p\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-top:24px\"\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e6 704\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eINT8 TOPS\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e128 GB\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eVRAM pool\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003eBlackwell\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003efp8 
native\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e2.5x\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003evs 4090 TOPS\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"margin-top:20px;font-size:15px;color:#aaa\"\u003eFour Blackwell RTX 5090 with native fp8\/fp4 tensor paths. Highest-throughput 4-GPU build on the Rome platform.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003cp style=\"font-size:17px;color:#333;margin-bottom:24px\"\u003eA 4U rack-mount inference server with four GeForce RTX 5090 pooled to 128 GB VRAM, one AMD EPYC 7643 Milan CPU (48C\/96T), 512 GB DDR4 ECC (all 8 DIMM slots populated for max bandwidth), 2 TB NVMe boot, and dual synchronized 2 kW ATX PSU. Runs vLLM, SGLang, llama.cpp, ComfyUI with Blackwell-native fp8 and MXFP4 inference kernels.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eHardware\u003c\/h2\u003e\n\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-bottom:24px;font-size:15px\"\u003e\n\u003cthead\u003e\u003ctr style=\"background:#0d0d0d;color:#fff\"\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eComponent\u003c\/th\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eDetail\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eGPUs\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4x NVIDIA GeForce RTX 5090 32 GB GDDR7 (Blackwell, 575 W, PCIe 5.0 x16)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eVRAM 
pool\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e128 GB total across 4 cards (no NVLink on consumer 5090)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eAMD EPYC 7643 Milan (48C\/96T, 225 W, 128x PCIe 4.0 lanes)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eMotherboard\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eASRock Rack ROMED8-2T (SP3, 7x PCIe 4.0 x16, 8x DDR4 ECC, 2x 10 GbE, IPMI)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eSystem RAM\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e512 GB DDR4-2666 ECC RDIMM (8x 64 GB — all DIMM slots populated)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eBoot \/ storage\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e2 TB NVMe M.2 (PCIe 4.0 x4)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003ePower supply\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eDual 2 kW ATX PSU with sync cable + 12VHPWR adapter set\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eChassis\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4U rack-mount, 4x GPU, passive PCIe 4.0 x16 risers\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCooling\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eArctic Freezer 4U-M SP3 tower + 3x 120 mm front intake + 1x 120 mm rear exhaust\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 
16px;font-weight:600\"\u003eNetwork\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eOnboard dual 10 GbE (Intel X550)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:32px\"\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003ePower envelope\u003c\/h3\u003e\n\u003cul style=\"margin:0;padding-left:18px;font-size:14px;color:#444\"\u003e\n\u003cli\u003eGPU draw: 4 x 575 W = 2 300 W\u003c\/li\u003e\n\u003cli\u003eSystem total at full load: ~2 650 W\u003c\/li\u003e\n\u003cli\u003ePSU total: 4 000 W (dual 2 kW synced) — 33.8 % headroom\u003c\/li\u003e\n\u003cli\u003eDual PSU for split power delivery — each PSU feeds a portion of the system\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003eLane topology\u003c\/h3\u003e\n\u003cp style=\"margin:0;font-size:14px;color:#444\"\u003eROMED8-2T fans out 128 PCIe Gen4 lanes from the EPYC directly to seven x16 slots; four are populated by GPUs at Gen4 x16. No PCIe switch. No NVLink on consumer 5090, so inter-GPU peer-to-peer traffic runs over PCIe. 
Cards are Gen5 native; Rome caps at Gen4.\u003c\/p\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWhat you can run\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#fefaf0;border-left:4px solid #fab400;padding:16px 20px;margin-bottom:24px;border-radius:0 8px 8px 0\"\u003e\n\u003cp style=\"margin:0;font-size:15px;color:#333\"\u003eWith 128 GB pooled VRAM and Blackwell-native fp8 tensor paths, this server steps up to Qwen3-235B-A22B Q4 and gpt-oss-120b MXFP4 with real KV headroom — beyond what 4x RTX 4090 can reach.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eLLMs — text \/ reasoning \/ coding\u003c\/h3\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin-bottom:8px\"\u003eChinese frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3 \/ Qwen3.5 (Alibaba):\u003c\/strong\u003e Qwen3-235B-A22B Q3-Q4 (~112-132 GB) fits the 128 GB pool with 8-16k ctx — the hero config; Qwen3-32B dense bf16 (~65 GB) with massive KV; Qwen3-Coder-30B-A3B agentic at 1M ctx; Qwen3.5-122B-A10B Q6\/fp8 (~75-80 GB); QwQ-32B bf16 reasoning\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eDeepSeek:\u003c\/strong\u003e DeepSeek-V3\/R1\/V3.1\/V3.2 fp8-native Q2 (~215 GB) with RAM spill across 512 GB host — feasible for batch; DeepSeek-R2 32B bf16 multi-stream (4 concurrent, one per card)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGLM \/ Z.ai:\u003c\/strong\u003e GLM-4.5-Air 106B\/12B fp8 (~106 GB) or Q6 comfortably; GLM-4.5\/4.6\/4.7 Q2_K_XL (~135 GB) tight with MoE offload\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTencent Hunyuan:\u003c\/strong\u003e Hunyuan-A13B fp8 native (~80 GB) — Blackwell runs fp8 without upcast penalty; 
Hunyuan-Large Q2 with RAM spill\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eByteDance Seed-OSS-36B\u003c\/strong\u003e bf16 with 512k native; ERNIE-4.5-424B Q2 (~150 GB spill)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin:20px 0 8px 0\"\u003eWestern frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eMeta Llama:\u003c\/strong\u003e Llama 3.3 70B Q4 across 4x 5090 (~30-40 tok\/s single-stream, ~270+ tok\/s batch-32 vLLM); Llama 4 Scout 109B\/17B MoE fp8\/Q6 (~90 GB); Llama 4 Maverick 400B\/17B Q3 (~188 GB spill)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMistral:\u003c\/strong\u003e Mistral Small 3 \/ Magistral \/ Devstral Small 2 (24B) bf16 multi-stream; Pixtral Large \/ Mistral Large 2 (123B) Q6 (~88 GB)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eOpenAI (open weights):\u003c\/strong\u003e gpt-oss-120b MXFP4 native (80 GB) with real KV and long context — Blackwell hero workload; gpt-oss-20b MXFP4\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGoogle Gemma 3:\u003c\/strong\u003e 27B multimodal bf16 (~54 GB) two concurrent streams; 12B \/ 4B\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMicrosoft Phi-4\u003c\/strong\u003e 14B dense bf16; Phi-4-reasoning distilled\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eNVIDIA Nemotron:\u003c\/strong\u003e Llama-3.1-Nemotron Ultra 253B Q3 (~119 GB) tight; Super 49B bf16 (~98 GB)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eOthers:\u003c\/strong\u003e Cohere Command R+ 104B Q6 (~85 GB); Molmo 72B Q6-bf16 VLM; OLMo 2 32B; IBM Granite 4.0 H-Small\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVision-Language Models\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eQwen3-VL-235B-A22B Q3-Q4; Qwen3-VL-32B bf16; 
InternVL3.5-241B-A28B Q4 (~135 GB tight); InternVL3 78B bf16; Llama 3.2 90B Vision Q6 (~74 GB); Pixtral Large 124B Q6 (~88 GB); Molmo 72B Q6\/bf16; Gemma 3 27B multimodal bf16; GLM-4.6V 106B fp8.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eImage generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eFLUX.1 [dev] bf16 and fp8 (~10-18 s\/image at fp8); FLUX.1 Kontext [dev]; SD 3.5 Large bf16; HunyuanImage-2.1 bf16 and Q4; HunyuanImage-3.0 base (80B MoE, 13B active) bf16 (~80 GB, hero footprint); HunyuanDiT; Kolors \/ Kolors 2.0; AuraFlow v0.3; OmniGen v1; PixArt-Sigma.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVideo generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eWan 2.2 MoE two-expert bf16 (~54 GB, full ctx); Wan 2.2 TI2V-5B; HunyuanVideo 13B bf16 both experts (~60-80 GB); HunyuanVideo 1.5; CogVideoX-5B bf16; Open-Sora 2.0 11B bf16 (~24 GB); Genmo Mochi-1 bf16 (~42 GB); LTX-Video; Pyramid Flow; SVD \/ SV3D \/ SV4D; NVIDIA Cosmos.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eAudio \/ Speech \/ TTS\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eASR:\u003c\/strong\u003e Whisper v3 large \/ turbo (~50x realtime); Parakeet-TDT; Canary 1B; Qwen3-ASR; SenseVoice\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTTS:\u003c\/strong\u003e CosyVoice 2 \/ 3; Kokoro 82M; Stable Audio Open; XTTS v2; StyleTTS 2; Step-Audio-EditX\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRealtime \/ S2S:\u003c\/strong\u003e Kyutai Moshi 7B; Step-Audio 2 mini\/R1; Qwen2.5-Omni-7B\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMusic \/ SFX:\u003c\/strong\u003e MusicGen \/ AudioGen \/ Bark; SeamlessM4T v2\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 
style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eMulti-model \/ multi-tenant serving\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e200B MoE at Q4 with batch inference (Qwen3-235B, GLM-4.5\/4.6\/4.7-Air) for 8-16 concurrent users\u003c\/li\u003e\n\u003cli\u003efp8-native frontier — DeepSeek V3 family, Hunyuan-Large fp8 with Blackwell native paths\u003c\/li\u003e\n\u003cli\u003eMixed resident stack: gpt-oss-120b MXFP4 + FLUX.1 + Whisper + Moshi on partitioned VRAM\u003c\/li\u003e\n\u003cli\u003eHigh-throughput 70B — tensor-parallel vLLM \/ SGLang with 200+ tok\/s batch aggregate\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eTarget workloads\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e200B+ MoE production serving at Q3-Q4 with real KV (Qwen3-235B, GLM-4.5-Air 106B)\u003c\/li\u003e\n\u003cli\u003efp8-native frontier inference (DeepSeek V3\/R1 fp8, Hunyuan fp8) — Blackwell runs without upcast\u003c\/li\u003e\n\u003cli\u003eHigh-throughput 70B serving — tensor-parallel batch via vLLM or SGLang\u003c\/li\u003e\n\u003cli\u003eVideo generation studio at bf16 (Wan 2.2 dual-expert, HunyuanVideo 13B, Mochi-1)\u003c\/li\u003e\n\u003cli\u003eMulti-tenant mixed workload — 120B MoE + image gen + realtime voice all resident\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eReference performance\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#0d0d0d;color:#fff;border-radius:12px;padding:24px;margin-bottom:24px\"\u003e\n\u003cp style=\"margin:0 0 4px 0;font-size:13px;color:#888;text-transform:uppercase;letter-spacing:1px\"\u003ePublished references | NVIDIA RTX 5090 datasheet + community 
benchmarks\u003c\/p\u003e\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-top:16px;font-size:14px\"\u003e\n\u003cthead\u003e\u003ctr style=\"border-bottom:1px solid #333\"\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eBenchmark\u003c\/th\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eResult\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003ePer-card INT8 TOPS (NVIDIA datasheet)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e1 676 TOPS\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eAggregate INT8 TOPS (4 cards)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e6 704 TOPS\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eMemory bandwidth per card\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~1 792 GB\/s\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eLlama 3.3 70B Q6 via vLLM (community)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e60-90 tok\/s single-stream, 300+ tok\/s batch\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eQwen3-235B-A22B Q3-Q4\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003eFits 128 GB pool with 8-16k ctx\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003egpt-oss-120b MXFP4 native\u003c\/td\u003e\n\u003ctd style=\"padding:10px 
12px;color:#fff\"\u003e80 GB — comfortable with KV headroom\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\u003cp style=\"margin:12px 0 0 0;font-size:13px;color:#666\"\u003ePublished external references, not measured on Kentino hardware. Kentino will publish first-party numbers after the first customer build.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eNot ideal for\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eFrontier 400B+ at Q4 (Kimi-K2, Mistral Large 3, Intern-S1-Pro — require 8-GPU or 6x RTX Pro 6000)\u003c\/li\u003e\n\u003cli\u003ePCIe Gen5-link-sensitive workloads — pick the Genoa SKU for native Gen5 x16\u003c\/li\u003e\n\u003cli\u003eTraining from scratch (no NVLink on consumer 5090)\u003c\/li\u003e\n\u003cli\u003eECC-sensitive 24\/7 production — consumer 5090 has no ECC; prefer L40 or RTX Pro 6000 Server Edition\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWarranty and lead time\u003c\/h2\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:24px\"\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e2 years\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003eparts warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e1 year\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elabor 
warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e10-28 days\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elead time\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"font-size:14px;color:#666\"\u003eBuild includes assembly, BIOS configuration, driver install, burn-in testing, and functional verification. Lead time depends on component availability, confirmed at order.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eRecommended add-ons\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eUpgrade PSU to dual 2.5 kW (FSP) for sustained worst-case bf16 + video — recommended for 24\/7\u003c\/li\u003e\n\u003cli\u003e4 TB NVMe for model library + MoE weight staging\u003c\/li\u003e\n\u003cli\u003e24U open cabinet for multi-server deployment\u003c\/li\u003e\n\u003cli\u003eConsider the Genoa-platform variant on request for Gen5 x16 link\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003c\/div\u003e","brand":"Kentino s.r.o.","offers":[{"title":"Default Title","offer_id":52940164497736,"sku":null,"price":25372.0,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/kentino-ai-server-4-gpu-topdown_6b2c51b2-25c1-479d-929a-29eebe60e5ef.jpg?v=1776940959","url":"https:\/\/kentino.com\/he\/products\/k-ai-128-rome-5090-6704tops-4-rtx-5090-blackwell-ai-server","provider":"Kentino","version":"1.0","type":"link"}