{"product_id":"k-ai-384-rome-rtxpro6000-4-rtx-pro-6000-blackwell-server-edition-384-gb-ecc-vram","title":"K-AI 384 Rome RTXPro6000 — 4× RTX Pro 6000 Blackwell Server Edition (384 GB ECC VRAM)","description":"\u003cdiv style=\"font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;line-height:1.7;color:#1a1a1a\"\u003e\n\n\u003cdiv style=\"background:linear-gradient(135deg,#0d0d0d 0%,#1a1a2e 100%);color:#fff;padding:32px;border-radius:12px;margin-bottom:32px\"\u003e\n\u003cp style=\"font-size:18px;margin:0 0 20px 0;color:#ccc\"\u003eK-AI 384 Rome RTXPro6000 8000TOPS\u003c\/p\u003e\n\u003cp style=\"font-size:28px;font-weight:700;margin:0 0 16px 0;line-height:1.3\"\u003e384 GB ECC VRAM Datacenter Server\u003cbr\u003e4x RTX Pro 6000 Server Edition | EPYC Milan | 8 000 TOPS INT8\u003c\/p\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-top:24px\"\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e8 000\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eTOPS INT8\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e384 GB\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eECC VRAM pool\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003efp8\u003c\/div\u003e\n\u003cdiv 
style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eBlackwell native\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003ePassive\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003edatacenter cooling\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"margin-top:20px;font-size:15px;color:#aaa\"\u003ePerformance figures are published external references, not measured on Kentino hardware.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003cp style=\"font-size:17px;color:#333;margin-bottom:24px\"\u003eA 4U rack-mount inference server with four NVIDIA RTX Pro 6000 Blackwell Server Edition passive datacenter cards (96 GB ECC each) pooled to 384 GB ECC VRAM, one AMD EPYC 7643 Milan CPU (48C\/96T), 384 GB DDR4-2666 ECC, 2 TB NVMe boot, and dual synchronized 2.5 kW ATX PSUs. Blackwell silicon with native fp8 acceleration. The passive GPU cards are cooled by directed front-to-back chassis airflow.
Runs DeepSeek V3 Q3, Mistral Large 3, Qwen3-Coder-480B, and every major frontier open-weight model.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eHardware\u003c\/h2\u003e\n\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-bottom:24px;font-size:15px\"\u003e\n\u003cthead\u003e\u003ctr style=\"background:#0d0d0d;color:#fff\"\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eComponent\u003c\/th\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eDetail\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eGPUs\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4x NVIDIA RTX Pro 6000 Blackwell Server Edition 96 GB ECC (passive datacenter cooler, 600 W TGP, PCIe 5.0 x16, 2000 INT8 TOPS\/card, fp8 native)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eVRAM pool\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e384 GB aggregate ECC across 4 cards\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eAMD EPYC 7643 Milan (48C\/96T, 225 W, 128x PCIe 4.0 lanes)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eMotherboard\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eASRock Rack ROMED8-2T (SP3, 7x PCIe 4.0 x16, 8x DDR4 ECC, 2x 10 GbE, IPMI)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eSystem RAM\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e384 GB DDR4-2666 ECC RDIMM (6x 64 GB — 2 DIMM slots open for upgrade to 512 
GB)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eBoot \/ storage\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e2 TB NVMe M.2 (PCIe 4.0 x4)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003ePower supply\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e2x 2.5 kW ATX with dual-PSU sync cable (5 kW aggregate)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eChassis\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4U rack-mount\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCooling\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eSP3 tower cooler (Arctic Freezer 4U-M class) + front-to-back directed airflow (3x 120 mm front intake + 1x 120 mm rear exhaust). 
Passive GPU cards — requires datacenter chassis airflow.\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eNetwork\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eOnboard dual 10 GbE (Intel X550)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:32px\"\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003ePower envelope\u003c\/h3\u003e\n\u003cul style=\"margin:0;padding-left:18px;font-size:14px;color:#444\"\u003e\n\u003cli\u003eGPU draw: 4 x 600 W = 2 400 W\u003c\/li\u003e\n\u003cli\u003eSystem total under full load: ~2 775 W\u003c\/li\u003e\n\u003cli\u003ePSU total: 5 000 W (dual 2.5 kW synced) — 44.5% headroom\u003c\/li\u003e\n\u003cli\u003eDual PSU for split power delivery — single PSU failure = loss of 2 GPUs or 2 GPUs + motherboard\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003eLane topology\u003c\/h3\u003e\n\u003cp style=\"margin:0;font-size:14px;color:#444\"\u003eROMED8-2T exposes 7x PCIe 4.0 x16 direct from EPYC Milan. Four slots populated — three free for NIC \/ storage \/ telemetry. RTX Pro 6000 is Gen5-capable silicon; runs Gen4 at full x16 on this platform — no bandwidth bottleneck for inference. No PCIe switch. 
No NVLink.\u003c\/p\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWhat you can run\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#fefaf0;border-left:4px solid #fab400;padding:16px 20px;margin-bottom:24px;border-radius:0 8px 8px 0\"\u003e\n\u003cp style=\"margin:0;font-size:15px;color:#333\"\u003eWith 384 GB of pooled ECC VRAM on Blackwell silicon with native fp8, this server runs DeepSeek V3 \/ R1 at Q3 comfortably on-card, Mistral Large 3 Q3, GLM-5 Q3, Qwen3-Coder-480B Q3, and Llama 3.3 70B at fp8 (~70 GB) resident on a single 96 GB card. At bf16 the 70B weights alone are ~140 GB and need two cards.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eLLMs — text \/ reasoning \/ coding\u003c\/h3\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin-bottom:8px\"\u003eChinese frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eDeepSeek V3 \/ V3-0324 \/ V3.1 \/ V3.2 \/ R1 \/ R1-0528\u003c\/strong\u003e Q3 (~290 GB) comfortably on-card (~30-40 tok\/s single, published reference); fp8 native (~670 GB) with RAM spill\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3-Coder-480B-A35B\u003c\/strong\u003e Q3 (~350 GB tight with RAM spill) — SOTA open coding agent (~18-25 tok\/s single, published reference)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3-235B-A22B\u003c\/strong\u003e Q6\/Q8 (~200-280 GB) with very long ctx and multi-user batching\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGLM-5 \/ GLM-5.1\u003c\/strong\u003e Q3 (~317 GB) — Chinese frontier, close to Claude Opus 4.6 on coding\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eKimi-K2\u003c\/strong\u003e 1.58-bit UD (~240 GB) — trillion-param agent at real
throughput\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eHunyuan-Large\u003c\/strong\u003e 389B\/52B Q4 (~220 GB), fp8 native (~390 GB spill)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eERNIE-4.5-424B-A47B\u003c\/strong\u003e Q4 (~240 GB); \u003cstrong\u003eMiniMax-M1\u003c\/strong\u003e Q4 (~260 GB) 1M-ctx\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eLlama 3.3 70B\u003c\/strong\u003e fp8 resident on a single card (~70 GB of 96 GB\/card — no tensor-parallel needed)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin:20px 0 8px 0\"\u003eWestern frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eMistral Large 3\u003c\/strong\u003e (675B\/41B MoE, Apache 2.0) Q3 (~317 GB) — frontier Western open weights (~20-30 tok\/s single, published reference)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eLlama 4 Maverick\u003c\/strong\u003e (400B\/17B) Q4 (~232 GB) with generous KV budget (~45-55 tok\/s single, published reference)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eLlama-3.1-Nemotron Ultra 253B\u003c\/strong\u003e Q4-Q6 (~119-207 GB)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003egpt-oss-120b\u003c\/strong\u003e MXFP4 native (80 GB) with massive concurrent fleet headroom\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003ePixtral Large \/ Mistral Large 2\u003c\/strong\u003e bf16 (~248 GB); \u003cstrong\u003eDevstral 2\u003c\/strong\u003e 123B bf16 — 256k-context top open coding model\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eLlama 3.3 70B\u003c\/strong\u003e fp8 on a single card; 4x concurrent 70B deployments possible\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVision-Language Models\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eQwen3-VL-235B-A22B fp8 (~240 GB);
InternVL3.5-241B-A28B Q4 (~135 GB); Llama 3.2 90B Vision bf16; Pixtral Large 124B bf16 (~248 GB); Qwen3-Omni-30B-A3B; Molmo 72B; ERNIE-4.5-VL; GLM-4.6V 106B bf16 on TP. Blackwell fp8 delivers ~2x throughput on vision-tower inference vs Ada.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eImage generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eFLUX.1 [dev] \/ Kontext \/ Tools at fp8 native (~15-20 s per 1024x1024 image on single RTX Pro 6000, published reference); SD 3.5 Large; HunyuanImage-2.1 (17B native 2K); HunyuanImage-3.0 80B\/13B MoE; AuraFlow; OmniGen; 4x concurrent ComfyUI workers.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVideo generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eWan 2.2 T2V-A14B \/ I2V-A14B dual expert bf16; HunyuanVideo 13B bf16 both experts; Open-Sora 2.0 (11B) bf16; CogVideoX-5B; Mochi-1; LTX-Video; Pyramid Flow; SVD \/ SV3D \/ SV4D; NVIDIA Cosmos Predict 2.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eAudio \/ Speech \/ TTS\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eASR:\u003c\/strong\u003e Whisper v3 large \/ turbo; Parakeet-TDT 1.1B; Canary 1B; Qwen3-ASR; SenseVoice\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTTS:\u003c\/strong\u003e CosyVoice 2\/3; Kokoro; Stable Audio Open; XTTS v2; Step-Audio-EditX\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRealtime \/ S2S:\u003c\/strong\u003e Kyutai Moshi; Step-Audio 2 mini \/ R1; Qwen2.5-Omni-7B\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMusic \/ SFX:\u003c\/strong\u003e MusicGen \/ AudioGen \/ Bark \/ SeamlessM4T\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eMulti-model \/ 
multi-tenant serving\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eDeepSeek V3 Q3 + concurrent 70B + FLUX.1 + Whisper all resident\u003c\/li\u003e\n\u003cli\u003e4-way tensor-parallel on 350-400B class at Q4\u003c\/li\u003e\n\u003cli\u003ePer-card tenant isolation — one Llama 3.3 70B fp8 instance per 96 GB card, 4 independent inference silos\u003c\/li\u003e\n\u003cli\u003eMulti-model RAG: reader + reranker + vision + embedder all on one host\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eTarget workloads\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eFrontier open-weight inference backend — DeepSeek V3 Q3, Qwen3-Coder-480B Q3, GLM-5 Q3\u003c\/li\u003e\n\u003cli\u003eProduction serving of Llama 4 Maverick Q4 multimodal agents with generous context budget\u003c\/li\u003e\n\u003cli\u003e4-tenant per-card isolation — one Llama 3.3 70B fp8 per tenant, with no VRAM shared between tenants\u003c\/li\u003e\n\u003cli\u003efp8-native DeepSeek \/ R1 \/ Hunyuan serving on Blackwell silicon\u003c\/li\u003e\n\u003cli\u003eMistral Large 3 Q3 as Western Apache-2.0 frontier open-weight alternative\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003ePublished performance references\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#0d0d0d;color:#fff;border-radius:12px;padding:24px;margin-bottom:24px\"\u003e\n\u003cp style=\"margin:0 0 4px 0;font-size:13px;color:#888;text-transform:uppercase;letter-spacing:1px\"\u003eExternal references | Not measured on Kentino hardware\u003c\/p\u003e\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-top:16px;font-size:14px\"\u003e\n\u003cthead\u003e\u003ctr style=\"border-bottom:1px solid #333\"\u003e\n\u003cth
style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eBenchmark\u003c\/th\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eResult\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eRTX Pro 6000 per-card INT8 TOPS\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e2 000 TOPS\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eRTX Pro 6000 memory bandwidth\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~1 800 GB\/s per card\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003evLLM — DeepSeek V3 Q3 on 4x Blackwell PCIe (single)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~30-40 tok\/s\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003evLLM — DeepSeek V3 Q3 on 4x Blackwell PCIe (batch-8)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e~200 tok\/s aggregate\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eSGLang — Llama 4 Maverick Q4 on 4x Blackwell (single)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~45-55 tok\/s\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003ellama.cpp — Qwen3-Coder-480B Q3 on 4x Blackwell (single)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~18-25 tok\/s\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd 
style=\"padding:10px 12px;color:#ccc\"\u003eFLUX.1 [dev] fp8 on single RTX Pro 6000\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~1.8 s per 1024x1024 image\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\u003cp style=\"margin:12px 0 0 0;font-size:13px;color:#666\"\u003eKentino will publish first-party numbers after initial customer build.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eNot ideal for\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eSingle-user workloads up to 70B — 4x RTX 5090 is materially cheaper for a 128 GB pool if ECC and passive reliability are not required\u003c\/li\u003e\n\u003cli\u003eSilent lab \/ office-adjacent deployment — passive cooler requires proper datacenter front-to-back airflow. For acoustic-sensitive sites choose the Max-Q turbofan variant (K-AI 384 Rome RTXPro6000MQ)\u003c\/li\u003e\n\u003cli\u003eFrontier training from scratch (no NVLink)\u003c\/li\u003e\n\u003cli\u003eFull DeepSeek V3 Q4 on-card (~404 GB) — upgrade to 6x RTX Pro 6000 \/ 576 GB\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWarranty and lead time\u003c\/h2\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:24px\"\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e3 years\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003eNVIDIA OEM GPU warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv 
style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e2 years\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003eparts warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e1 year\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elabor warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e10-28 days\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elead time\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"font-size:14px;color:#666\"\u003eBuild includes assembly, BIOS configuration, driver install, burn-in, memtest, and functional verification. 
Lead time depends on component availability and is confirmed at order.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eRecommended add-ons\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eUpgrade RAM to 512 GB DDR4 (add 2x 64 GB — 2 DIMM slots open) for RAM-spill headroom on Q3 frontier quants\u003c\/li\u003e\n\u003cli\u003e4 TB NVMe Gen4 x4 for frontier-model library (DeepSeek V3 Q3 alone is ~290 GB on disk)\u003c\/li\u003e\n\u003cli\u003eFull 24U rack cabinet with managed PDU + online UPS\u003c\/li\u003e\n\u003cli\u003eAlternative configuration: Max-Q turbofan variant (K-AI 384 Rome RTXPro6000MQ) — same silicon, quieter blower cooler, for lab deployments\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003c\/div\u003e\n","brand":"Kentino s.r.o.","offers":[{"title":"Default Title","offer_id":52940310217032,"sku":null,"price":46583.0,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/kentino-ai-server-4-gpu-topdown_6b2c51b2-25c1-479d-929a-29eebe60e5ef.jpg?v=1776940959","url":"https:\/\/kentino.com\/zh\/products\/k-ai-384-rome-rtxpro6000-4-rtx-pro-6000-blackwell-server-edition-384-gb-ecc-vram","provider":"Kentino","version":"1.0","type":"link"}