{"title":"GPU Workstations","description":"","products":[{"product_id":"hp-vga-nvidia-rtx-a4000-16gb-gddr6-pcie-4-0x16-card","title":"HP VGA NVIDIA RTX A4000 16GB GDDR6, PCIe 4.0×16 Card","description":"\u003ch2\u003eTechnical specifications:\u003c\/h2\u003e \u003cstrong\u003eChipset\u003c\/strong\u003e: NVIDIA RTX A4000 \u003cstrong\u003eMemory\u003c\/strong\u003e: 16 GB GDDR6 \u003cstrong\u003eMemory bus width\u003c\/strong\u003e: 256-bit \u003cstrong\u003eInterface\u003c\/strong\u003e: PCI Express Gen 4 x16 \u003cstrong\u003eDirectX\u003c\/strong\u003e: DirectX 12.0 \u003cstrong\u003eOpenGL\u003c\/strong\u003e: 4.6 \u003cstrong\u003eVirtual reality\u003c\/strong\u003e: Yes \u003cstrong\u003eCooling\u003c\/strong\u003e: Active \u003cstrong\u003eDisplayPort\u003c\/strong\u003e: 4x DisplayPort 1.4a \u003cstrong\u003eExternal power connector\u003c\/strong\u003e: 1x 6-pin \u003cstrong\u003eMaximum power consumption\u003c\/strong\u003e: 140 W \u003cstrong\u003eHP VGA NVIDIA RTX A4000 16GB GDDR6, PCIe 4.0×16 Card \u003c\/strong\u003e NVIDIA's Quadro products are a range of \u003cstrong\u003eprofessional\u003c\/strong\u003e graphics cards designed for both rackmount servers and, above all, high-performance workstations. These are high-quality GPUs with large memory, ECC, and a dedicated set of drivers tuned for CAD applications. \u003cstrong\u003eCUDA technology\u003c\/strong\u003e Thanks to the CUDA architecture, professional applications can run computations on the card's CUDA stream processors. The raw performance of the graphics card can then be applied to specific calculations, which can make the work many times faster than on a conventional processor, which is limited by its much lower number of cores. 
\u003cstrong\u003eSpecifications:\u003c\/strong\u003e \u003cdiv\u003e \u003cdiv\u003e \u003cdiv\u003e \u003ctable\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eCUDA Cores\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e6144\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eTensor Cores\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e192\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eRT Cores\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e48\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eSingle Precision Performance\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e19.2 TFLOPS\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eRT Core Performance\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e37.4 TFLOPS\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eTensor Performance\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e153.4 TFLOPS\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eGPU Memory\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e16 GB GDDR6 with ECC\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eMemory Interface\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e256-bit\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eMemory Bandwidth\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e448 GB\/sec\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eSystem Interface\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003ePCI Express 4.0 x16\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eDisplay Connectors\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e4x DisplayPort 1.4a\u003c\/td\u003e \u003c\/tr\u003e \u003ctr\u003e \u003ctd\u003e\u003cstrong\u003eMaximum Power 
Consumption\u003c\/strong\u003e\u003c\/td\u003e \u003ctd\u003e140 W\u003c\/td\u003e \u003c\/tr\u003e \u003c\/tbody\u003e \u003c\/table\u003e \u003c\/div\u003e \u003cstrong\u003eSupported technologies:\u003c\/strong\u003e NVIDIA NVLink, NVIDIA Ampere, NVIDIA Mosaic, NVIDIA RTX Experience, OpenGL 4.6, Shader Model 5.1, DirectX 12.0, Vulkan 1.2, HDCP 2.2, Max. resolution – 7680 x 4320 px, DisplayPort 1.4a \u003c\/div\u003e \u003c\/div\u003e","brand":"HP","offers":[{"title":"Default Title","offer_id":49002885284168,"sku":null,"price":1436.49,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/pi8-1511988-1559809-0a_-1_-1_78956.jpg?v=1725528288"},{"product_id":"pcpraha-epic-elite","title":"PCPRAHA Epic Elite","description":"\u003cp\u003e \u003c\/p\u003e\n\u003cdiv class=\"product-description\"\u003e\n\u003ch3\u003ePCPRAHA Epic Elite: Powerful AI Server for Your Most Demanding Computational Tasks\u003c\/h3\u003e\n\u003cp\u003e\u003cstrong\u003ePCPRAHA Epic Elite\u003c\/strong\u003e is a high-performance server designed to meet the highest computational power requirements. It's ideal for professionals in artificial intelligence (AI), machine learning (ML), data analysis, graphic rendering, and scientific research. This server will provide your company or project with all the necessary resources to achieve impressive results.\u003c\/p\u003e\n\u003ch4\u003ePCPRAHA Epic Elite Applications\u003c\/h4\u003e\n\u003cp\u003e\u003cstrong\u003e1. Artificial Intelligence and Machine Learning:\u003c\/strong\u003e\u003cbr\u003eThe server is equipped with an AMD EPYC™ 9754 processor and two MSI GeForce RTX 4090 SUPRIM X graphics cards, making it ideal for deep learning, neural networks, and other AI processing. The massive memory (384 GB DDR5-4800 ECC) enables simultaneous processing and analysis of large data volumes, accelerating model training and reducing system response time.\u003c\/p\u003e\n\u003cp\u003e\u003cstrong\u003e2. 
Rendering and Graphics Processing:\u003c\/strong\u003e\u003cbr\u003eTwo RTX 4090 graphics cards provide the highest level of performance when working with 3D graphics, animation, and video. The server can quickly process complex graphic projects, making it an ideal choice for studios working with virtual reality (VR), architectural renders, and other graphic projects.\u003c\/p\u003e\n\u003cp\u003e\u003cstrong\u003e3. Big Data Analysis:\u003c\/strong\u003e\u003cbr\u003eThis server is capable of working with enormous volumes of data, ideal for companies involved in analytics, predictive modeling, and real-time data processing. It significantly accelerates work and enables faster data-driven decision making.\u003c\/p\u003e\n\u003cp\u003e\u003cstrong\u003e4. Cloud Computing and Virtualization:\u003c\/strong\u003e\u003cbr\u003eWith this server, you can develop virtual machines, run cloud applications, and support infrastructure for scalable online services. It's an excellent choice for companies looking to create flexible and reliable cloud solutions.\u003c\/p\u003e\n\u003ch4\u003eProfit Potential\u003c\/h4\u003e\n\u003cp\u003e\u003cstrong\u003e1. Computational Power Rental:\u003c\/strong\u003e\u003cbr\u003eYou can rent out PCPRAHA Epic Elite's computational capacity for projects related to AI model training or graphic rendering. In the market, server rental with similar specifications can cost from €5,000 to €10,000 monthly, depending on tasks and rental duration. This server can cover its costs within 6-12 months of active use.\u003c\/p\u003e\n\u003cp\u003e\u003cstrong\u003e2. Custom AI Solution Development:\u003c\/strong\u003e\u003cbr\u003eBy using the server for creating and testing AI solutions, you can provide custom AI model and algorithm development services, opening opportunities for profit from client contracts. These services typically range from €20,000 to €100,000 per project, depending on complexity.\u003c\/p\u003e\n\u003cp\u003e\u003cstrong\u003e3. 
Custom Graphics Rendering:\u003c\/strong\u003e\u003cbr\u003eIf you're involved in rendering and visualizations, this server will significantly accelerate project processing, allowing you to handle more orders in less time. The average price for a single render ranges from €500 to €5,000, opening substantial business development opportunities.\u003c\/p\u003e\n\u003ch4\u003eKey Technical Specifications:\u003c\/h4\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cstrong\u003eMotherboard:\u003c\/strong\u003e ASRock – GENOAD8X-2T\/BCM\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eProcessor:\u003c\/strong\u003e AMD EPYC™ 9754\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eCooling:\u003c\/strong\u003e Supermicro SNK-P0084AP4\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRAM:\u003c\/strong\u003e 384 GB DDR5-4800 ECC\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGraphics Cards:\u003c\/strong\u003e 2 x MSI GeForce RTX 4090 SUPRIM X 24G\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eStorage:\u003c\/strong\u003e U.2\/U.3 NVMe Mobile Rack Cage\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003ePower Supply:\u003c\/strong\u003e LX2600W\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eCase:\u003c\/strong\u003e 24″ 4U Rack Mount Case\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003cdiv class=\"additional-features\"\u003e\n\u003ch4\u003eAdditional Features\u003c\/h4\u003e\n\u003cul class=\"features-list\"\u003e\n\u003cli\u003eProfessional-grade ECC memory for enhanced reliability\u003c\/li\u003e\n\u003cli\u003eDual RTX 4090 configuration for maximum GPU performance\u003c\/li\u003e\n\u003cli\u003eEnterprise-class cooling solution\u003c\/li\u003e\n\u003cli\u003eRack-mountable design for data center deployment\u003c\/li\u003e\n\u003cli\u003eHigh-efficiency power supply for demanding workloads\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003cdiv class=\"warranty-support\"\u003e\n\u003ch4\u003eWarranty \u0026amp; 
Support\u003c\/h4\u003e\n\u003cul\u003e\n\u003cli\u003eProfessional technical support\u003c\/li\u003e\n\u003cli\u003eExtended warranty options available\u003c\/li\u003e\n\u003cli\u003eInstallation and setup assistance\u003c\/li\u003e\n\u003cli\u003eRemote configuration support\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e","brand":"Pcpraha","offers":[{"title":"Default Title","offer_id":49282672525640,"sku":"","price":12444.0,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/bluerack.png?v=1737994278"},{"product_id":"k-ai-32-rome-5090-1676tops-1x-rtx-5090-ai-workstation","title":"K-AI 32 Rome 5090 1676TOPS — 1x RTX 5090 AI Workstation","description":"\u003cdiv style=\"font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;line-height:1.7;color:#1a1a1a\"\u003e\n\n\u003cdiv style=\"background:linear-gradient(135deg,#0d0d0d 0%,#1a1a2e 100%);color:#fff;padding:32px;border-radius:12px;margin-bottom:32px\"\u003e\n\u003cp style=\"font-size:18px;margin:0 0 20px 0;color:#ccc\"\u003eK-AI 32 Rome 5090 1676TOPS\u003c\/p\u003e\n\u003cp style=\"font-size:28px;font-weight:700;margin:0 0 16px 0;line-height:1.3\"\u003eSingle-GPU Blackwell Workstation\u003cbr\u003e1x RTX 5090 | EPYC Milan | 1 676 TOPS INT8\u003c\/p\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-top:24px\"\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e1 676\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eTOPS INT8\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv 
style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e32 GB\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eVRAM GDDR7\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003efp8\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003enative tensor\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003erack\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eready\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"margin-top:20px;font-size:15px;color:#aaa\"\u003eSingle Blackwell GPU, 32 GB GDDR7, fp8 native — the sharpest single-card AI workstation Kentino builds.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003cp style=\"font-size:17px;color:#333;margin-bottom:24px\"\u003eA single-GPU, workstation-class AI server on the ROMED8-2T \/ EPYC Milan platform. One RTX 5090 delivers 32 GB of GDDR7 VRAM with native fp8 tensor math — the sweet spot for a developer box, a small-team inference endpoint, or an image\/video generation workstation where one strong GPU beats two weaker ones. 
4U rack form factor, but drop-in for a quiet office under-desk deployment.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eHardware\u003c\/h2\u003e\n\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-bottom:24px;font-size:15px\"\u003e\n\u003cthead\u003e\u003ctr style=\"background:#0d0d0d;color:#fff\"\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eComponent\u003c\/th\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eDetail\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eGPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e1x NVIDIA GeForce RTX 5090 32 GB GDDR7 (575 W, PCIe 5.0 x16, Blackwell)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eVRAM pool\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e32 GB\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eAMD EPYC 7643 Milan (48C\/96T, 225 W, 128x PCIe 4.0 lanes)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eMotherboard\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eASRock Rack ROMED8-2T (SP3, 7x PCIe 4.0 x16, 8x DDR4 ECC, 2x 10 GbE, IPMI)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eSystem RAM\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e128 GB DDR4-2666 ECC RDIMM (2x 64 GB)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eBoot \/ storage\u003c\/td\u003e\n\u003ctd style=\"padding:10px 
16px\"\u003e1 TB NVMe M.2 (PCIe 4.0 x4)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003ePower supply\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eSingle 2 kW ATX PSU\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eChassis\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4U rack-mount, passive Gen4 x16 riser\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCooling\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eSP3 tower cooler (Arctic Freezer 4U-M class), 3x 120 mm front intake + 1x 120 mm rear exhaust\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eNetwork\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eOnboard dual 10 GbE (Intel X550) + IPMI\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:32px\"\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003ePower envelope\u003c\/h3\u003e\n\u003cul style=\"margin:0;padding-left:18px;font-size:14px;color:#444\"\u003e\n\u003cli\u003eGPU draw: 1 x 575 W = 575 W\u003c\/li\u003e\n\u003cli\u003eSystem total at full load: ~900 W\u003c\/li\u003e\n\u003cli\u003ePSU total: 2 000 W (single 2 kW ATX) — 55 % headroom\u003c\/li\u003e\n\u003cli\u003eGenerous transient margin, silent operation at light load\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003eLane topology\u003c\/h3\u003e\n\u003cp 
style=\"margin:0;font-size:14px;color:#444\"\u003ePCIe Gen4 x16 at the GPU (ROMED8-2T is Gen4; 5090 is Gen5 silicon running Gen4 without bandwidth penalty for inference). 16 lanes direct from CPU root complex. No PCIe switch. No NVLink on GeForce 5090.\u003c\/p\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWhat you can run\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#fefaf0;border-left:4px solid #fab400;padding:16px 20px;margin-bottom:24px;border-radius:0 8px 8px 0\"\u003e\n\u003cp style=\"margin:0;font-size:15px;color:#333\"\u003eWith 32 GB of GDDR7 VRAM and native fp8 tensor math, this workstation handles open-weight LLMs up to 32B dense, image generation with FLUX.1, video generation, speech AI, and single-developer multi-model stacks.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eLLMs — text \/ reasoning \/ coding\u003c\/h3\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin-bottom:8px\"\u003eChinese frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3-32B\u003c\/strong\u003e dense Q6_K — 32k context, flagship general reasoning (~40-55 tok\/s single-stream on Blackwell fp8, published reference)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3-30B-A3B\u003c\/strong\u003e MoE at Q4_K_M with long KV headroom (Qwen3-Coder-30B-A3B agentic, 256k ctx)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwQ-32B\u003c\/strong\u003e Q6 — reasoning preview\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eDeepSeek-R2\u003c\/strong\u003e 32B sparse MoE at Q4-Q6 — single-GPU reasoning that scores 92.7 % AIME-2025 (~45-60 tok\/s single-stream on Blackwell fp8, published 
reference)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3.5-27B\u003c\/strong\u003e dense Q6 (Feb 2026 release)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eHunyuan-A13B\u003c\/strong\u003e at Q4_K_M (~28-30 GB) — 80B\/13B MoE, 256k ctx, dual-mode reasoning\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eSeed-OSS-36B\u003c\/strong\u003e Q4_K_M — 512k native context for long-doc analysis\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin:20px 0 8px 0\"\u003eWestern frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eLlama 3.3 70B\u003c\/strong\u003e at Q2_K (~27 GB tight) or Q3_K (~34 GB with RAM spill) — usable for general chat\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMistral Small 3 \/ Magistral Small \/ Devstral Small 2\u003c\/strong\u003e (24B dense) at Q6-Q8 or bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGemma 3 27B\u003c\/strong\u003e multimodal at Q6 with 128k context\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003ePhi-4 14B\u003c\/strong\u003e \/ \u003cstrong\u003ePhi-4-reasoning\u003c\/strong\u003e bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eReka Flash 3 (21B Apache 2.0)\u003c\/strong\u003e at bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003egpt-oss-20b\u003c\/strong\u003e native MXFP4 (~16 GB — fits with generous KV)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVision-Language\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eQwen3-VL-8B \/ -32B at Q4-Q6; Qwen3-VL-30B-A3B MoE; InternVL3.5-8B \/ -38B Q4; MiniCPM-V 2.6 \/ MiniCPM-o 2.6 (8B); Llama 3.2 11B Vision bf16; Pixtral 12B bf16 (24 GB — tight, use Q8); Gemma 3 12B \/ 27B multimodal; PaliGemma 2 (3\/10B); Phi-4-multimodal 5.6B; Aya Vision 8B.\u003c\/p\u003e\n\n\u003ch3 
style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eImage generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eFLUX.1 [dev] \/ [schnell] fp8 (~12 GB) native Blackwell speedup (~8-12 seconds per 1024x1024 image at 20 steps on Blackwell, published reference); FLUX.1 Kontext [dev] — in-context editing, character consistency; SD 3.5 Large (18 GB fp16 \/ 11 GB fp8); SDXL 1.0 10-12 GB fp16; HunyuanImage-2.1 NF4 (~14 GB); Kolors 2.0 fp8; AuraFlow v0.3 \/ OmniGen v1 \/ PixArt-Sigma.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVideo generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eWan 2.2 TI2V-5B at ~16 GB — 720p@24fps on a single 5090; Wan 2.1 T2V\/I2V 14B at Q4-Q6 (~16 GB); HunyuanVideo 1.5 (8.3B) — 14 GB minimum; CogVideoX-5B \/ 5B-I2V int8 (~12 GB); LTX-Video 2B realtime-class 30 fps; Mochi-1 Q4 (~17-18 GB).\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eAudio \/ Speech \/ TTS\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eASR:\u003c\/strong\u003e Whisper v3 large \/ turbo (~50x realtime on single GPU, published reference); NVIDIA Parakeet-TDT 1.1B; Canary 1B\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTTS:\u003c\/strong\u003e CosyVoice 2.0 \/ Fun-CosyVoice 3.0; Kokoro 82M; Stable Audio Open\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRealtime \/ S2S:\u003c\/strong\u003e Kyutai Moshi (7B) — only open realtime full-duplex voice; Step-Audio 2 mini \/ R1\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eMulti-model \/ multi-tenant\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eResident stack for a single developer: Qwen3-32B Q6 (~20 GB) + FLUX.1 fp8 (~12 GB 
fits tight) on swap, or Qwen3-14B Q6 (~9 GB) + FLUX.1 + Whisper-turbo + Kokoro simultaneously (~20-24 GB pinned)\u003c\/li\u003e\n\u003cli\u003e2-4 concurrent users on 14-32B class LLMs via vLLM \/ SGLang\u003c\/li\u003e\n\u003cli\u003eLoRA \/ QLoRA fine-tuning of 7-14B dense models\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eTarget workloads\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eDeveloper workstation for a single AI engineer running mixed inference + image gen\u003c\/li\u003e\n\u003cli\u003eSmall-team coding-agent endpoint (Qwen3-Coder-30B-A3B) with 1-4 concurrent users\u003c\/li\u003e\n\u003cli\u003eContent pipeline: FLUX.1 or SD 3.5 Large batch image gen + Wan 2.2 short-form video\u003c\/li\u003e\n\u003cli\u003eOn-premises ASR + TTS voice stack (Whisper + Kokoro + Moshi) for a branch office\u003c\/li\u003e\n\u003cli\u003eProsumer LLM + VLM research box — test Qwen3, Llama 3.3, Gemma 3, Phi-4 on real hardware\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003ePublished performance references\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#0d0d0d;color:#fff;border-radius:12px;padding:24px;margin-bottom:24px\"\u003e\n\u003cp style=\"margin:0 0 4px 0;font-size:13px;color:#888;text-transform:uppercase;letter-spacing:1px\"\u003ePublished reference | single RTX 5090 comparable hardware\u003c\/p\u003e\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-top:16px;font-size:14px\"\u003e\n\u003cthead\u003e\u003ctr style=\"border-bottom:1px solid #333\"\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eBenchmark\u003c\/th\u003e\n\u003cth style=\"padding:8px 
12px;text-align:left;color:#888;font-weight:600\"\u003eResult\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eLlama 3.3 70B Q4_K_M llama.cpp decode\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~18-22 tok\/s with CPU KV offload\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eQwen3-32B Q6 vLLM single-stream\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e~45-55 tok\/s decode at fp8\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eFLUX.1 [dev] fp8 on Blackwell\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e~1.7-2.0 s per 1024x1024 image at 20 steps\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eWan 2.2 TI2V-5B 720p clip\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~3-4 minutes at fp16\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\u003cp style=\"margin:12px 0 0 0;font-size:13px;color:#666\"\u003ePublished reference points from comparable single-5090 hardware. 
Kentino's own measured numbers will be posted once gf-logic extends the benchmark run to the single-5090 configuration.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eNot ideal for\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e70B dense models at Q6+ (32 GB is insufficient — use 2x 5090 for proper 64 GB pool)\u003c\/li\u003e\n\u003cli\u003eMulti-user concurrent serving at scale (single tensor-parallel partition)\u003c\/li\u003e\n\u003cli\u003eFrontier 100B+ MoE (GLM-4.5, Kimi K2, Mistral Large 3 — out of reach on a single consumer card)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWarranty and lead time\u003c\/h2\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:24px\"\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e2 years\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003eparts warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e1 year\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elabor warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e10-28 days\u003c\/div\u003e\n\u003cdiv 
style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elead time\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"font-size:14px;color:#666\"\u003eBuild includes assembly, BIOS configuration, driver install, burn-in testing, and functional verification. Lead time depends on component availability, confirmed at order.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eRecommended add-ons\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eNVIDIA ConnectX-5 100 GbE MCX555A-ECAT\u003c\/li\u003e\n\u003cli\u003eUpgrade boot drive to 2 TB NVMe — or 4 TB\u003c\/li\u003e\n\u003cli\u003eUpgrade RAM to 256 GB (4x 64 GB DDR4) for bigger KV cache \/ multi-model concurrent stacks\u003c\/li\u003e\n\u003cli\u003eRack PDU (C13\/C19 metered) and 2 kVA online UPS\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003c\/div\u003e","brand":"Kentino s.r.o.","offers":[{"title":"Default Title","offer_id":52927463620936,"sku":null,"price":8092.0,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/PXL_20260413_071103100.jpg?v=1776441356"},{"product_id":"k-ai-96-rome-rtxpro6000-2000tops-single-card-96-gb-blackwell-workstation-server","title":"K-AI 96 Rome RTXPro6000 2000TOPS — Single-Card 96 GB Blackwell Workstation Server","description":"\u003cdiv style=\"font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;line-height:1.7;color:#1a1a1a\"\u003e\n\n\u003cdiv style=\"background:linear-gradient(135deg,#0d0d0d 0%,#1a1a2e 100%);color:#fff;padding:32px;border-radius:12px;margin-bottom:32px\"\u003e\n\u003cp style=\"font-size:18px;margin:0 0 20px 0;color:#ccc\"\u003eK-AI 96 Rome RTXPro6000 2000TOPS\u003c\/p\u003e\n\u003cp style=\"font-size:28px;font-weight:700;margin:0 0 16px 0;line-height:1.3\"\u003e96 GB ECC Single-Card Workstation 
Server\u003cbr\u003e1x RTX Pro 6000 Blackwell | EPYC Milan | 2 000 TOPS INT8\u003c\/p\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-top:24px\"\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e2 000\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eINT8 TOPS\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003e96 GB\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003eECC VRAM\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003esingle\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003ecard design\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"background:rgba(250,180,0,0.15);border:1px solid #fab400;border-radius:8px;padding:14px 16px;text-align:center;flex:1;min-width:100px\"\u003e\n\u003cdiv style=\"font-size:26px;font-weight:800;color:#fab400\"\u003efp8\u003c\/div\u003e\n\u003cdiv style=\"font-size:11px;color:#ccc;text-transform:uppercase;letter-spacing:1px\"\u003enative Blackwell\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"margin-top:20px;font-size:15px;color:#aaa\"\u003eOne card, 96 GB ECC VRAM, the entire Blackwell tensor pipeline. 
70B dense bf16 on a single GPU — no tensor-parallel overhead.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003cp style=\"font-size:17px;color:#333;margin-bottom:24px\"\u003eA 4U rack-mount workstation server with a single NVIDIA RTX Pro 6000 Blackwell Workstation card (96 GB ECC GDDR7), one AMD EPYC 7643 Milan CPU (48C\/96T), 256 GB DDR4 ECC, 2 TB NVMe boot, and one 2 kW ATX PSU with about 54 % headroom at full load. The simplest software path Kentino ships — no tensor-parallel config, no multi-GPU debugging. vLLM, SGLang, llama.cpp, and ComfyUI all run in single-device mode.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eHardware\u003c\/h2\u003e\n\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-bottom:24px;font-size:15px\"\u003e\n\u003cthead\u003e\u003ctr style=\"background:#0d0d0d;color:#fff\"\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eComponent\u003c\/th\u003e\n\u003cth style=\"padding:12px 16px;text-align:left\"\u003eDetail\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eGPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e1x NVIDIA RTX Pro 6000 Blackwell Workstation 96 GB ECC GDDR7 (600 W, PCIe 5.0 x16)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eVRAM\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e96 GB ECC on a single card — no pooling, no tensor-parallel overhead\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCPU\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eAMD EPYC 7643 Milan (48C\/96T, 225 W, 128x PCIe 4.0 lanes)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 
16px;font-weight:600\"\u003eMotherboard\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eASRock Rack ROMED8-2T (SP3, 7x PCIe 4.0 x16, 8x DDR4 ECC, 2x 10 GbE, IPMI)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eSystem RAM\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e256 GB DDR4-2666 ECC RDIMM (4x 64 GB)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eBoot \/ storage\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e2 TB NVMe M.2 (PCIe 4.0 x4)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003ePower supply\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e1x 2 kW ATX PSU\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eChassis\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003e4U rack-mount (4-slot capacity, 1 populated — room to expand)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"background:#f8f8f8\"\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eCooling\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eArctic Freezer 4U-M SP3 tower + 3x 120 mm front intake + 1x 120 mm rear exhaust\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 16px;font-weight:600\"\u003eNetwork\u003c\/td\u003e\n\u003ctd style=\"padding:10px 16px\"\u003eOnboard dual 10 GbE (Intel X550)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:32px\"\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003ePower envelope\u003c\/h3\u003e\n\u003cul 
style=\"margin:0;padding-left:18px;font-size:14px;color:#444\"\u003e\n\u003cli\u003eGPU draw: 1 x 600 W = 600 W\u003c\/li\u003e\n\u003cli\u003eSystem total at full load: ~925 W\u003c\/li\u003e\n\u003cli\u003ePSU total: 2 000 W — 53.8 % headroom\u003c\/li\u003e\n\u003cli\u003eSingle PSU, simple cabling — generous margin for a single-card build\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:250px;background:#f4f4f4;border-radius:8px;padding:20px\"\u003e\n\u003ch3 style=\"font-size:16px;font-weight:700;margin:0 0 12px 0\"\u003eLane topology\u003c\/h3\u003e\n\u003cp style=\"margin:0;font-size:14px;color:#444\"\u003ePCIe Gen4 x16 at the GPU (card is Gen5 native; the ROMED8-2T board and EPYC 7643 cap the link at Gen4). Direct root-complex connection — no PCIe switch. No NVLink required — single card, no inter-GPU link at all. Six x16 slots remain open for NIC \/ storage \/ expansion.\u003c\/p\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWhat you can run\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#fefaf0;border-left:4px solid #fab400;padding:16px 20px;margin-bottom:24px;border-radius:0 8px 8px 0\"\u003e\n\u003cp style=\"margin:0;font-size:15px;color:#333\"\u003eWith 96 GB of ECC VRAM on a single Blackwell card, this server handles 70B dense bf16 on one GPU, open-weight LLMs, vision models, image and video generation, speech AI, and production inference — no tensor-parallel coordination needed.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eLLMs — text \/ reasoning \/ coding\u003c\/h3\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin-bottom:8px\"\u003eChinese frontier\u003c\/p\u003e\n\u003cul 
style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eQwen3 \/ Qwen3.5 (Alibaba):\u003c\/strong\u003e Qwen3-32B dense bf16 (~65 GB) with generous KV; Qwen3-72B Q6 (~58 GB, ~25-35 tok\/s single-stream); Qwen3-30B-A3B MoE bf16; Qwen3-Coder-30B-A3B agentic at 256k ctx; Qwen3.5-122B-A10B Q4 (~70 GB) with tight KV; QwQ-32B bf16 reasoning\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eDeepSeek:\u003c\/strong\u003e DeepSeek-R2 32B sparse MoE bf16 (~64 GB, 92.7 % AIME 2025 single-card); DeepSeek-R1-Distill-Qwen-32B bf16; DeepSeek-V2-Lite 16B full precision\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGLM \/ Z.ai:\u003c\/strong\u003e GLM-4.5-Air 106B\/12B Q4-Q5 (60-70 GB); GLM-4.6V 106B Q4\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTencent Hunyuan:\u003c\/strong\u003e Hunyuan-A13B 80B\/13B MoE Q4-fp8 (~48-80 GB) with 256k ctx and dual-mode reasoning\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eByteDance Seed-OSS-36B\u003c\/strong\u003e bf16 (~72 GB tight) or fp8 (~36 GB) with full 512k native context\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eBaidu ERNIE-4.5-47B-A3B\u003c\/strong\u003e Q4-fp8 with long context\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cp style=\"font-size:14px;font-weight:700;color:#fab400;text-transform:uppercase;letter-spacing:1px;margin:20px 0 8px 0\"\u003eWestern frontier\u003c\/p\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eMeta Llama:\u003c\/strong\u003e Llama 3.3 70B at bf16 (~70 GB) on a single card with 8-16k ctx — the hero config; Llama 3.3 70B Q6 (~58 GB, ~35-50 tok\/s single-stream); Llama 3.1 8B bf16 (~80-120 tok\/s); Llama 3.2 90B Vision Q4 (~52 GB); Llama 4 Scout 109B\/17B MoE Q4 (~63 GB)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMistral:\u003c\/strong\u003e Mistral Small 3 \/ Magistral Small 1.2 \/ Devstral Small 2 (24B) all at bf16 with 256k ctx; Mixtral 8x7B Q6; Codestral Mamba 7B; Pixtral 12B 
bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eOpenAI (open weights):\u003c\/strong\u003e gpt-oss-20b MXFP4 native (16 GB); gpt-oss-120b MXFP4 native (80 GB) — single-card single-stream\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGoogle Gemma 3:\u003c\/strong\u003e 27B multimodal bf16 (~54 GB) with 128k ctx; 12B \/ 4B bf16\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMicrosoft Phi-4\u003c\/strong\u003e 14B dense bf16; Phi-4-reasoning; Phi-4-multimodal\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eNVIDIA Nemotron:\u003c\/strong\u003e Llama-3.1-Nemotron-Super 49B Q6 (~40 GB); Nemotron-Nano 8B\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eOthers:\u003c\/strong\u003e IBM Granite 4.0 H-Small 32B\/9B; OLMo 2 32B; Reka Flash 3 21B; Falcon H1R 7B; Command R 35B\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eVision-Language Models\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eQwen3-VL-8B \/ 32B bf16, Qwen3-VL-30B-A3B MoE bf16, Qwen3-Omni-30B-A3B; InternVL3 up to 78B Q4 (~48 GB); InternVL3.5-38B bf16; DeepSeek-VL2 full range; Llama 3.2 11B Vision bf16; Llama 3.2 90B Vision Q4 (~52 GB); Pixtral 12B bf16; Molmo 72B Q4; Molmo 7B bf16; Gemma 3 12B \/ 27B multimodal; PaliGemma 2 28B; Phi-3.5-Vision; Aya Vision 8B \/ 32B; MiniCPM-V 2.6 \/ MiniCPM-o 2.6; GLM-4.6V.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eImage generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eFLUX.1 [dev] \/ [schnell] bf16 (~24 GB) and quantized (~15-25 s\/image at fp8); FLUX.1 Kontext [dev] in-context editing; FLUX Tools (Fill \/ Depth \/ Canny \/ Redux); SD 3.5 Large bf16 (~18 GB); SDXL 1.0; HunyuanImage-2.1 bf16 (~34 GB) at 2K native; HunyuanDiT 1.5B; Kolors \/ Kolors 2.0; AuraFlow v0.3; OmniGen v1; PixArt-Sigma.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 
0 12px 0;color:#0d0d0d\"\u003eVideo generation\u003c\/h3\u003e\n\u003cp style=\"font-size:15px;color:#333\"\u003eWan 2.2 T2V-A14B \/ I2V-A14B MoE bf16 (~54 GB, both experts resident); Wan 2.2 TI2V-5B fast path; HunyuanVideo 13B bf16 (~60-80 GB, tight at 720p); HunyuanVideo 1.5 (8.3B); CogVideoX-5B; Open-Sora 2.0 (11B) bf16; Genmo Mochi-1 bf16 (~42 GB); LTX-Video; Pyramid Flow; SVD \/ SV3D \/ SV4D; NVIDIA Cosmos Predict 2.\u003c\/p\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eAudio \/ Speech \/ TTS\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003e\n\u003cstrong\u003eASR:\u003c\/strong\u003e Whisper v3 large \/ turbo (~50x realtime); NVIDIA Parakeet-TDT 1.1B; Canary 1B; Qwen3-ASR; SenseVoice\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTTS:\u003c\/strong\u003e CosyVoice 2 \/ Fun-CosyVoice 3.0; Kokoro 82M; Stable Audio Open; Coqui XTTS v2; StyleTTS 2; Step-Audio-EditX\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRealtime \/ S2S:\u003c\/strong\u003e Kyutai Moshi (200 ms full-duplex); Step-Audio 2 mini; Step-Audio-R1 \/ R1.1; Qwen2.5-Omni-7B\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMusic \/ SFX:\u003c\/strong\u003e Meta MusicGen; AudioGen; Suno Bark; SeamlessM4T v2\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3 style=\"font-size:18px;font-weight:700;margin:28px 0 12px 0;color:#0d0d0d\"\u003eMulti-model \/ multi-tenant serving\u003c\/h3\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eSingle-tenant streaming coding assistant — 70B dense bf16, low latency, no TP penalty\u003c\/li\u003e\n\u003cli\u003eMixed resident stack: Qwen3-32B bf16 + FLUX.1 fp8 + Whisper-turbo + Moshi on one card with partitioned VRAM\u003c\/li\u003e\n\u003cli\u003eFine-tuning: LoRA \/ QLoRA on 13-34B models; full-param on 7B\u003c\/li\u003e\n\u003cli\u003eEmbedding service: BGE-M3 \/ E5 \/ Jina resident alongside a generator 
LLM\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eTarget workloads\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eSingle-tenant streaming coding assistant running Llama 3.3 70B bf16 or Qwen3-Coder-30B-A3B — no TP coordination overhead\u003c\/li\u003e\n\u003cli\u003eDeveloper workstation for a single engineer or tight team needing a 70B-class model with 32-128k context\u003c\/li\u003e\n\u003cli\u003eVideo or image generation lab — HunyuanVideo 13B, Wan 2.2 dual-expert, HunyuanImage-2.1 all at bf16 resident\u003c\/li\u003e\n\u003cli\u003eVLM \/ OCR bench — Qwen3-VL-32B bf16 or InternVL3.5-38B with long-document pipelines\u003c\/li\u003e\n\u003cli\u003eClean appliance for a small LLM API gateway — one model, one card, easy ops\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eReference performance\u003c\/h2\u003e\n\n\u003cdiv style=\"background:#0d0d0d;color:#fff;border-radius:12px;padding:24px;margin-bottom:24px\"\u003e\n\u003cp style=\"margin:0 0 4px 0;font-size:13px;color:#888;text-transform:uppercase;letter-spacing:1px\"\u003ePublished references | NVIDIA RTX Pro 6000 Blackwell datasheet + community benchmarks\u003c\/p\u003e\n\u003ctable style=\"width:100%;border-collapse:collapse;margin-top:16px;font-size:14px\"\u003e\n\u003cthead\u003e\u003ctr style=\"border-bottom:1px solid #333\"\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eBenchmark\u003c\/th\u003e\n\u003cth style=\"padding:8px 12px;text-align:left;color:#888;font-weight:600\"\u003eResult\u003c\/th\u003e\n\u003c\/tr\u003e\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003ePer-card INT8 TOPS 
(NVIDIA datasheet)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e2 000 TOPS\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eVRAM per card\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fab400;font-weight:700\"\u003e96 GB ECC GDDR7\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eMemory bandwidth\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e~1 800 GB\/s\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eLlama 3.3 70B Q6 single-GPU (community)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e40-55 tok\/s single-stream\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr style=\"border-bottom:1px solid #222\"\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eLlama 3.3 70B bf16 single-GPU (community)\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003e15-25 tok\/s single-stream\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:10px 12px;color:#ccc\"\u003eBlackwell fp8 native\u003c\/td\u003e\n\u003ctd style=\"padding:10px 12px;color:#fff\"\u003eDeepSeek-V3 fp8, Hunyuan-A13B fp8 run without bf16 upcast\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\u003cp style=\"margin:12px 0 0 0;font-size:13px;color:#666\"\u003ePublished external references, not measured on Kentino hardware. 
Kentino will publish first-party numbers after the first customer build.\u003c\/p\u003e\n\u003c\/div\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eNot ideal for\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eTraining large models from scratch (single GPU — no tensor\/pipeline parallelism)\u003c\/li\u003e\n\u003cli\u003eFrontier 200B+ MoE at real quantizations (Qwen3-235B Q4, GLM-4.5\/4.6 — use K-AI 192 RTXPro6000 or larger)\u003c\/li\u003e\n\u003cli\u003eHigh-concurrency multi-tenant inference (single card caps aggregate throughput; 4x RTX 4090 or 4x L40 scale better)\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eWarranty and lead time\u003c\/h2\u003e\n\u003cdiv style=\"display:flex;gap:16px;flex-wrap:wrap;margin-bottom:24px\"\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e2 years\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003eparts warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e1 year\u003c\/div\u003e\n\u003cdiv style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elabor warranty\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cdiv style=\"flex:1;min-width:150px;background:#f4f4f4;border-radius:8px;padding:20px;text-align:center\"\u003e\n\u003cdiv style=\"font-size:24px;font-weight:800;color:#0d0d0d\"\u003e10-28 days\u003c\/div\u003e\n\u003cdiv 
style=\"font-size:13px;color:#666;text-transform:uppercase;letter-spacing:1px\"\u003elead time\u003c\/div\u003e\n\u003c\/div\u003e\n\u003c\/div\u003e\n\u003cp style=\"font-size:14px;color:#666\"\u003eNVIDIA OEM 3-year warranty on RTX Pro 6000 + Kentino integration warranty. Build includes assembly, BIOS configuration, driver install, burn-in testing, and functional verification. Lead time depends on component availability, confirmed at order.\u003c\/p\u003e\n\n\u003ch2 style=\"font-size:22px;font-weight:700;margin:40px 0 16px 0;padding-bottom:8px;border-bottom:3px solid #2563eb\"\u003eRecommended add-ons\u003c\/h2\u003e\n\u003cul style=\"font-size:15px;color:#333;line-height:1.8\"\u003e\n\u003cli\u003eUpgrade RAM to 512 GB (add 4x 64 GB DDR4 — four DIMM slots still open)\u003c\/li\u003e\n\u003cli\u003e4 TB NVMe secondary drive for model library \/ dataset staging\u003c\/li\u003e\n\u003cli\u003e24U open cabinet for production rack-mount\u003c\/li\u003e\n\u003cli\u003eFor Gen5 x16 link speed consider the Genoa-platform variant on request\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003c\/div\u003e","brand":"Kentino s.r.o.","offers":[{"title":"Default Title","offer_id":52940156698952,"sku":null,"price":15847.0,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0843\/5479\/3800\/files\/kentino-ai-server-4-gpu-topdown_6b2c51b2-25c1-479d-929a-29eebe60e5ef.jpg?v=1776940959"}],"url":"https:\/\/kentino.com\/zh\/collections\/gpu-workstations.oembed","provider":"Kentino","version":"1.0","type":"link"}