{"product_id":"k-ai-64-rome-5080-3600tops-4x-rtx-5080-budget-ai-server","title":"K-AI 64 Rome 5080 3600TOPS — 4x RTX 5080 Budget AI Server","description":"\u003cdiv style=\"font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;line-height:1.7;color:#1a1a1a\"\u003e\n\n\u003cdiv style=\"background:linear-gradient(135deg,#0d0d0d 0%,#1a1a2e 100%);color:#fff;padding:32px;border-radius:12px;margin-bottom:32px\"\u003e\n\u003cp style=\"font-size:18px;margin:0 0 20px 0;color:#ccc\"\u003eK-AI 64 Rome 5080 3600TOPS\u003c\/p\u003e\n\u003cp style=\"font-size:28px;font-weight:700;margin:0 0 16px 0;line-height:1.3\"\u003eBudget 4-GPU Blackwell Server\u003cbr\u003e4x RTX 5080 | EPYC Milan | 3 600 TOPS INT8\u003c\/p\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-top:24px\"\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e3 600\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eTOPS INT8\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e64 GB\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eVRAM pool\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e4 GPU\u003c\/div\u003e\n\u003cdiv 
style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eBlackwell\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003erack\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eready\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"margin-top:20px;font-size:15px;color:#aaa\"\u003eKentino's budget 4-GPU Blackwell server — 64 GB VRAM pool, 3 600 aggregate TOPS INT8, lowest CZK-per-TOPS in the lineup.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003cp style=\"font-size:17px;color:#333;margin-bottom:24px\"\u003eA 4-GPU Blackwell inference server built around the RTX 5080 — 360 W per card, PCIe 5 silicon, 16 GB GDDR7 each. Four cards deliver a 64 GB pooled VRAM envelope and 3 600 INT8 TOPS aggregate at the best CZK-per-TOPS point Kentino offers. 
The entry into multi-GPU Blackwell inference: ideal for embedding clusters, 7-13B model serving at scale, image \/ video batch generation, and 70B Q4 tensor-parallel.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eHardware\u003c\/h2\u003e\n\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-bottom:24px;font-size:15px\"\u003e\n\u003cthead\u003e\u003ctr style=\"background:#0d0d0d;color:#fff\"\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eComponent\u003c\/th\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eDetail\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eGPUs\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4x NVIDIA GeForce RTX 5080 16 GB GDDR7 (360 W, PCIe 5.0 x16)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eVRAM pool\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e64 GB\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eAMD EPYC 7643 Milan (48C\/96T, 225 W, 128x PCIe 4.0 lanes)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eMotherboard\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eASRock Rack ROMED8-2T (SP3, 7x PCIe 4.0 x16, 8x DDR4 ECC, 2x 10 GbE, IPMI)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eSystem RAM\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e256 GB DDR4-2666 ECC RDIMM (4x 64 GB)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 
16px;font-weight:600\"\u003eBoot \/ storage\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e2 TB NVMe M.2 (PCIe 4.0 x4)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003ePower supply\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eSingle 2 kW ATX PSU\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eChassis\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4U rack-mount, 4x GPU, passive Gen4 x16 risers, front-to-back directed airflow\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCooling\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eSP3 tower cooler, 3x 120 mm front intake + 1x 120 mm rear exhaust (industrial fans)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eNetwork\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eOnboard dual 10 GbE (Intel X550) + IPMI\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:32px\"\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003ePower envelope\u003c\/h3\u003e\n\u003cul style=\"margin:0;padding-left:18px;font-size:14px;color:#444\"\u003e\n\u003cli\u003eGPU draw: 4 x 360 W = 1 440 W\u003c\/li\u003e\n\u003cli\u003eSystem total at full load: ~1 765 W\u003c\/li\u003e\n\u003cli\u003ePSU total: 2 000 W (single 2 kW ATX) — 11.75 % headroom\u003c\/li\u003e\n\u003cli\u003eAbove the 10 % floor but tighter than other 4-GPU builds; dual-PSU upgrade recommended for high-duty workloads\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003cdiv 
style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003eLane topology\u003c\/h3\u003e\n\u003cp style=\"margin:0;font-size:14px;color:#444\"\u003eROMED8-2T fans out 4x16 Gen4 from CPU root complex. 5080 is PCIe Gen5 silicon running Gen4 x16 without bandwidth bottleneck for inference. No PCIe switch. No NVLink — tensor parallel over PCIe.\u003c\/p\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWhat you can run\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#fefaf0;border-left:4px solid #fab400;padding:16px 20px;margin-bottom:24px;border-radius:0 8px 8px 0\"\u003e\n\u003cp style=\"margin:0;font-size:15px;color:#333\"\u003eWith 64 GB of pooled VRAM across 4 Blackwell cards, this server handles 70B Q4 tensor-parallel, embedding clusters at scale, image and video batch pipelines, and 7-13B multi-tenant serving for 64-128 concurrent users.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eLLMs — text \/ reasoning \/ coding\u003c\/h3\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin-bottom:8px\"\u003eChinese frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3-32B\u003c\/strong\u003e Q8 (dense at near-fp16 quality); \u003cstrong\u003eQwen3.5-27B\u003c\/strong\u003e bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3-30B-A3B\u003c\/strong\u003e \/ \u003cstrong\u003eQwen3-Coder-30B-A3B\u003c\/strong\u003e bf16 (~60 GB fits tight)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3.5-122B-A10B\u003c\/strong\u003e Q4 (~70-75 GB — tight, spill to DDR4 
RAM)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eHunyuan-A13B\u003c\/strong\u003e fp8 (~80 GB native, exceeds the 64 GB pool; prefer Q6)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eSeed-OSS-36B\u003c\/strong\u003e bf16 (~72 GB, exceeds the 64 GB pool; spill to DDR4 RAM)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eDeepSeek-R2\u003c\/strong\u003e 32B sparse MoE bf16 (~64 GB; ~45-60 tok\/s single-stream at Q4 on Blackwell, published reference)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGLM-4.5-Air\u003c\/strong\u003e 106B\/12B Q3_K (~55 GB) — tight KV headroom\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eERNIE-4.5-47B-A3B\u003c\/strong\u003e Q4 (~28 GB with headroom for second model)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin:20px 0 8px 0\"\u003eWestern frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eLlama 3.3 70B\u003c\/strong\u003e Q4_K_M (~43 GB) — the sweet spot for this pool (~15-20 tok\/s single-stream on 4x 5080, estimated)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eHermes 3 70B \/ Tulu 3 70B\u003c\/strong\u003e Q4 — open Llama derivatives with full post-training transparency\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMistral Small 3 \/ Magistral \/ Devstral Small 2\u003c\/strong\u003e 24B bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGemma 3 27B\u003c\/strong\u003e bf16 multimodal\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003ePhi-4 14B\u003c\/strong\u003e \/ \u003cstrong\u003eNemotron-Super 49B\u003c\/strong\u003e Q6-Q8\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003egpt-oss-20b\u003c\/strong\u003e MXFP4 (16 GB — 4 instances on 4 cards for parallel tenants); \u003cstrong\u003egpt-oss-120b\u003c\/strong\u003e MXFP4 (80 GB, exceeds the 64 GB pool; spill to DDR4 RAM manageable)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 
0;color:#0d0d0d\"\u003eVision-Language\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eQwen3-VL-32B \/ Qwen3-VL-30B-A3B \/ Qwen3-Omni-30B-A3B; InternVL3.5-38B Q6-Q8; Llama 3.2 90B Vision Q4 (~52 GB tight); Pixtral 12B \/ Pixtral Large 124B Q2-Q3; Gemma 3 27B multimodal bf16; PaliGemma 2 28B bf16; Molmo 72B Q4 (~45 GB); Aya Vision 32B bf16.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eImage generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eFLUX.1 [dev] \/ [schnell] fp16 — batch-4 parallel (~10-15 seconds per 1024x1024 image at fp8 on Blackwell, published reference); FLUX.1 Kontext [dev] — in-context editing across 4 tenants; SD 3.5 Large (18 GB fp16) — 4 parallel generators; SDXL 1.0 + ControlNet + AnimateDiff stacks x 4; HunyuanImage-2.1 bf16 per-card; AuraFlow v0.3 \/ OmniGen v1 \/ Kolors 2.0 \/ PixArt-Sigma.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVideo generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eWan 2.2 TI2V-5B bf16 on a single card — 4 parallel tenants; Wan 2.1 14B T2V\/I2V Q4-Q6 per card; HunyuanVideo 13B Q4 (~30 GB) tensor-parallel 2-way; HunyuanVideo 1.5 (8.3B) bf16 per card; Open-Sora 2.0 (11B) Q8 per card — 4 parallel generations; CogVideoX-5B int8; Mochi-1 Q4 per card.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eAudio \/ Speech \/ TTS\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eFull Western and Chinese audio stack fits per card: Whisper v3 + Parakeet + Canary + Moshi + Step-Audio 2 \/ R1 + CosyVoice 3.0 + Kokoro + Stable Audio Open + MusicGen + AudioGen + SeamlessM4T v2. With 4 cards, each card can host a dedicated speech tenant. 
Whisper v3 turbo runs at ~50x realtime per card (published reference).\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eMulti-model \/ multi-tenant\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333;font-weight:700;margin-bottom:8px\"\u003eThe target use case. 16 GB per card rewards partitioned workloads:\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eEmbedding cluster:\u003c\/strong\u003e BGE-M3 \/ Nomic \/ Jina-embed \/ E5 \/ Cohere Embed v3 — 4 tenants at high RPS\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003e7-13B serving at scale:\u003c\/strong\u003e 16-32 concurrent users per card via vLLM \/ SGLang; 64-128 concurrent total\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMixed pipeline:\u003c\/strong\u003e Card 1 = Qwen3-14B + reranker; Card 2 = Whisper + Moshi; Card 3 = FLUX.1; Card 4 = Wan 2.2 TI2V\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003e4-way tensor-parallel for 70B Q4\u003c\/strong\u003e — Llama 3.3 70B AWQ INT4 across 4 cards, ~90-130 tok\/s batch aggregate (extrapolated from gf-logic 4x4090 bench)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eTarget workloads\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eBudget multi-GPU AI serving platform for a startup or lab on a capex floor\u003c\/li\u003e\n\u003cli\u003eEmbedding + RAG infrastructure at 4-way horizontal scale\u003c\/li\u003e\n\u003cli\u003eImage \/ video generation batch farm (Stable Diffusion \/ FLUX \/ Wan 2.2)\u003c\/li\u003e\n\u003cli\u003e7-13B small-model serving at scale — 4 independent tenants or 64-128 concurrent pooled\u003c\/li\u003e\n\u003cli\u003eDevelopment staging box for 70B Q4 tensor-parallel workflows\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 
style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003ePublished performance references\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#0d0d0d;color:#fff;border-radius:12px;padding:24px;margin-bottom:24px\"\u003e\n\u003cp style=\"margin:0 0 4px 0;font-size:13px;color:#888;text-transform:uppercase;letter-spacing:1px\"\u003eKentino measured (4x4090 reference) + published 5080 estimates\u003c\/p\u003e\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-top:16px;font-size:14px\"\u003e\n\u003cthead\u003e\u003ctr style=\"border-bottom:1px solid #333\"\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eBenchmark\u003c\/th\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eResult\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003e4x4090 reference: sustained fp16\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e647 TFLOPS\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003e4x4090 reference: vLLM Llama 3.3 70B AWQ (batch-32)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e179.3 tok\/s aggregate\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003e4x4090 reference: llama.cpp 70B Q4_K_M (single)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e20.3 tok\/s decode\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003e5080 estimated: Llama 3.3 70B Q4 TP-4 single\u003c\/td\u003e\n\u003ctd style=\"padding:10px 
12px;color:#fab400;font-weight:700\"\u003e~15-20 tok\/s\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003e5080 estimated: FLUX.1 fp8 per card\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~2.2-2.8 s per 1024x1024 at 20 steps\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\u003cp style=\"margin:12px 0 0 0;font-size:13px;color:#666\"\u003e5080 tensor throughput is ~1.35x the 4090's in INT8 TOPS; single-stream decode is memory-bandwidth-bound (5080 GDDR7 ~960 GB\/s vs 4090 ~1 008 GB\/s, roughly parity).\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eNot ideal for\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e70B dense at Q6+ (16 GB per card limits the per-card footprint; the 64 GB pool is tight for Q6)\u003c\/li\u003e\n\u003cli\u003eLong-context MoE flagships (Qwen3-235B, GLM-4.5): insufficient VRAM even at Q2\u003c\/li\u003e\n\u003cli\u003eSingle-stream latency-sensitive work on very large models (TP overhead eats into 16 GB cards)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWarranty and lead time\u003c\/h2\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:24px\"\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e2 years\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003eparts warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv 
style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e1 year\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elabor warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e10-28 days\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elead time\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"font-size:14px;color:#666\"\u003eBuild includes assembly, BIOS configuration, driver install, burn-in testing, and functional verification. Lead time depends on component availability and is confirmed at order.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eRecommended add-ons\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eUpgrade PSU to dual 2 kW ATX (synced), raising headroom to ~56 % (4 000 W against ~1 765 W full load)\u003c\/li\u003e\n\u003cli\u003eNVIDIA ConnectX-5 100 GbE MCX555A-ECAT\u003c\/li\u003e\n\u003cli\u003eUpgrade boot drive to 4 TB NVMe\u003c\/li\u003e\n\u003cli\u003eUpgrade RAM to 384 GB (6x 64 GB) for more headroom when hosting several models concurrently\u003c\/li\u003e\n\u003cli\u003eRack PDU (C13\/C19 metered) and 3 kVA online UPS\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003c\/div\u003e","brand":"Kentino s.r.o.","offers":[{"title":"Default Title","offer_id":52927613534536,"sku":null,"price":11940.0,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/PXL_20260413_071103100.jpg?v=1776441356","url":"https:\/\/kentino.com\/es\/products\/k-ai-64-rome-5080-3600tops-4x-rtx-5080-budget-ai-server","provider":"Kentino","version":"1.0","type":"link"}