PSU Sizing and Dual-PSU Configurations for Multi-GPU AI Servers

Power is the single thing most multi-GPU builds get wrong on the first pass. It is also the failure mode that is most expensive to fix later: undersize the PSU and the system reboots randomly under load; overspec it and you have wasted €400 on a unit that spends its life at 30% load. The dual-PSU question is worse, because most of what is written online about "redundancy" is wrong in the context of a 4U workstation or server chassis with consumer GPUs.

This article is the math, the form-factor reality, and the honest framing for a 4-GPU and 8-GPU build on the hardware we actually ship: RTX 5090, 4090, RTX Pro 6000 Blackwell (Workstation and Max-Q), L40, and L4, on EPYC host platforms.

The total power calculation

The number you care about is sustained wall draw under realistic load, plus enough headroom that transient spikes do not trip the PSU's over-current protection. The formula is straightforward:

P_total  =  (GPU_TDP × N_gpu)  +  CPU_TDP  +  drives  +  fans  +  motherboard
P_psu    =  P_total / efficiency_at_load  ×  1.30  (30% headroom)

The 30% headroom is not arbitrary. It covers three things at once: transient GPU spikes, the efficiency droop as you push the PSU past ~70% of its rated output, and the fact that GPU TDP is a marketing number that real workloads occasionally exceed.
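As a rough illustration, the formula above can be written as a small Python function. The function name, the parameter names, and the example inputs (the sustained figures from the 4× RTX 5090 build later in this article, with Gold efficiency at 50% load) are assumptions chosen for illustration, not a Kentino tool.

```python
def psu_budget_w(gpu_w, n_gpu, cpu_w, drives_w, fans_w, motherboard_w,
                 efficiency_at_load=0.92, headroom=1.30):
    """Return (P_total, P_psu) in watts, following the article's formula."""
    if not 0 < efficiency_at_load <= 1:
        raise ValueError("efficiency_at_load must be in (0, 1]")
    p_total = gpu_w * n_gpu + cpu_w + drives_w + fans_w + motherboard_w
    p_psu = p_total / efficiency_at_load * headroom
    return p_total, p_psu


# Illustrative inputs: 4 GPUs at 500 W sustained, 250 W CPU, 20 W drives,
# 30 W fans, 100 W motherboard + DIMMs, 92% efficiency, 30% headroom.
p_total, p_psu = psu_budget_w(500, 4, 250, 20, 30, 100)
print(f"P_total = {p_total} W, P_psu = {p_psu:.0f} W")
```

With these inputs the budget comes out at about 3.4 kW.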

Reference TDPs we use for sizing:

Component	Nominal TDP	Realistic peak
RTX 5090 (FE / partner board)	575 W	600–650 W transient
RTX 4090	450 W	500–550 W transient
RTX Pro 6000 Blackwell Workstation	600 W	600 W (hard cap)
RTX Pro 6000 Blackwell Max-Q	300 W	300 W (hard cap)
L40	300 W	300 W (hard cap)
L4	72 W	72 W (hard cap)
EPYC 9354 / 9374F (host CPU)	280–320 W	350 W boost
EPYC 9554 / 9654 (high-core)	360–400 W	~400 W
NVMe SSD (per drive, sustained)	8–12 W	15 W burst
120 mm industrial fan (per fan)	5–10 W	10 W
Motherboard + DIMMs (8× DDR5)	80–120 W	150 W

The pattern: workstation and datacenter cards (Pro 6000, L40, L4) hold their rated TDP rigidly because they have firmware power caps designed for sustained load. Consumer cards (5090, 4090) spike. A 5090 will pull 600 W or more for tens of milliseconds during a workload transition. Put four such cards in one chassis and their transients will occasionally line up, so the PSU sees brief spikes well above the steady-state average.

This is why "the math says 1500 W, I'll buy a 1500 W PSU" is the most common way a 4× 5090 build ends up rebooting under stress.

Transient spikes — why the headroom is real

The transient behavior on Blackwell-class consumer GPUs is well-documented. A 5090 idling at ~30 W can jump to 600 W within a single millisecond when a CUDA kernel launches against an empty queue. The card's own VRM smooths some of it, but a non-trivial fraction makes it back to the PSU rails. A 4090 does the same thing at ~500 W peaks.

Two consequences:

The PSU's over-current protection (OCP) is the failure point, not the average rail capacity. A 1500 W PSU with aggressive OCP set at ~130% of rated will trip when four 5090s coincidentally spike. The reboot is silent — no event log, no warning, the system just comes back up. Diagnosing this without instrumentation takes days.
PSU response time matters more than peak rating. Server-grade and high-end ATX PSUs have hold-up capacitance that can absorb sub-millisecond transients without ringing the rail. Cheap or older units cannot. This is why the unit price difference between a 2 kW industrial-grade PSU and a 2 kW consumer "gaming" PSU is real — it is not just badge engineering.

The practical rule we use: target 70% of PSU rated output as the steady-state load and leave the other 30% for transients and the efficiency curve. A 4× 5090 build at ~2.4 kW sustained wants roughly 3.4 kW of combined PSU capacity (2.4 kW / 0.70), which in practice means two ATX PSUs with the load split across them, for example a 2000 W unit paired with a 1500 W unit.
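A minimal sketch of the two checks described in this section: the 70% steady-state target and an OCP trip point at ~130% of rated output. The function names are hypothetical, the 130% figure is the example used above rather than a property of any particular PSU, and the 2400 W inputs are the sustained total and the four-GPU transient from the 4× RTX 5090 build table later in the article.

```python
def min_rated_w(sustained_w, target_fraction=0.70):
    """Smallest combined PSU rating that keeps sustained load at the target fraction."""
    return sustained_w / target_fraction


def trips_ocp(peak_w, psu_rated_w, ocp_fraction=1.30):
    """True if a transient peak exceeds the assumed OCP threshold."""
    return peak_w > psu_rated_w * ocp_fraction


# 2400 W sustained system load (assumed from the build table)
print(f"Combined rating for 70% load: {min_rated_w(2400):.0f} W")

# Four 5090s spiking to 600 W each at the same instant (GPU load only)
gpu_peak_w = 4 * 600
for rated_w in (1500, 2000):
    print(f"{rated_w} W PSU trips OCP: {trips_ocp(gpu_peak_w, rated_w)}")
```

The GPU-only peak is deliberately optimistic: adding CPU, board, and fan load to the same PSU only moves it closer to the trip point.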

80+ ratings — what they actually mean

The 80+ certification tiers describe efficiency at 20%, 50%, and 100% load, at either 115 V or 230 V input. The relevant numbers for a multi-GPU AI server (which lives near 50% load most of the time) on a 230 V European supply:

Tier	20% load	50% load	100% load
80+ Bronze	81%	85%	81%
80+ Gold	88%	92%	88%
80+ Platinum	90%	94%	91%
80+ Titanium	94%	96%	94%

At 50% load, the delta between Gold and Titanium is four percentage points. On a 2 kW system running 24/7, four points is roughly 80 W continuous, or ~700 kWh per year. At €0.20/kWh that is €140/year per PSU. Titanium pays for itself inside two years on a server that actually runs the duty cycle it is built for; Gold is the right answer if the system idles half the time.
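The running-cost arithmetic above can be reproduced in a few lines. This is a sketch under the same assumptions as the text (2 kW load, 24/7 operation, €0.20/kWh); the function names are hypothetical. The second function shows the slightly larger figure you get if the 2 kW is treated as DC load and the wall draw is computed at each efficiency.

```python
HOURS_PER_YEAR = 24 * 365  # continuous operation


def annual_saving(load_w, delta_points, price_eur_per_kwh):
    """Article's approximation: efficiency gap in percentage points times load."""
    saved_w = load_w * delta_points / 100
    kwh = saved_w * HOURS_PER_YEAR / 1000
    return saved_w, kwh, kwh * price_eur_per_kwh


def wall_draw_w(dc_load_w, efficiency):
    """Wall power needed to deliver a given DC load at a given efficiency."""
    return dc_load_w / efficiency


saved_w, kwh, eur = annual_saving(2000, 4, 0.20)
print(f"Approximation: {saved_w:.0f} W, {kwh:.0f} kWh/year, EUR {eur:.0f}/year")

# Alternative view: Gold (92%) vs Titanium (96%) wall draw for a 2 kW DC load
exact_w = wall_draw_w(2000, 0.92) - wall_draw_w(2000, 0.96)
print(f"Wall-draw difference: {exact_w:.1f} W")
```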

We do not claim 80+ Platinum or Titanium on Kentino product pages unless we have the certification on file. The vast majority of 2 kW ATX PSUs we ship are Gold-rated. Customers who specifically need Platinum or Titanium for a 24/7 colo deployment can request it as a build option — we will source and quote.

ATX vs server-grade hot-swappable PSUs

The form-factor question splits cleanly:

ATX (single PSU, up to ~2 kW)

Standard 4U workstation chassis accept one or two ATX PSUs.
Max practical rating per ATX unit is ~2 kW (a 230 V single-phase circuit at 16 A tops out at about 3.7 kW in total).
Cables are user-replaceable, modular, and the connector pinouts are standard.
No hot-swap. PSU failure means powered-down rebuild.
Cost: €200–€500 for a serious 2 kW ATX unit (Corsair AX, Seasonic PRIME, EVGA SuperNOVA G+, Super Flower Leadex).

CRPS (Common Redundant Power Supply, server form factor)

Industry-standard server PSU module, ~73.5 mm × 185 mm × 40 mm.
Used in Supermicro, Tyan, Gigabyte, and Bone64c server chassis.
True hot-swap when paired with a redundant backplane (1+1 or 2+2).
Typical ratings: 1200 W, 1600 W, 2000 W, 2400 W, 3000 W per module.
Cost: €350–€700 per module, plus the backplane.

The honest framing for Kentino builds:

4-GPU K-AI servers ship in 4U workstation/server chassis with dual ATX PSUs — specifically dual 1500 W or dual 2000 W depending on the GPU mix.
8-GPU K-AI servers ship in server chassis with dual or quad CRPS modules at 2000–2400 W each. These are the configurations where true 1+1 redundancy becomes a meaningful option, because the chassis backplane supports it.

The dual-PSU honesty — split delivery, not N+1

This is the single most-misrepresented spec in the multi-GPU build market, and we will not repeat the mistake.

In a 4U workstation chassis with two ATX PSUs:

The two PSUs are not redundant. They feed different loads. A typical wiring is:

PSU A: motherboard (24-pin ATX), CPU (EPS 8-pin), drives + fans, GPU 1 (12V-2x6), GPU 2 (12V-2x6)
PSU B: GPU 3 (12V-2x6), GPU 4 (12V-2x6), sometimes the drive cage

PSU B fails → GPU 3 & 4 offline
PSU A fails → system dead

Figure: Dual ATX PSU split delivery. No rail sharing, no failover. Two separate load groups.

There is no "auto-failover" between two ATX PSUs in this topology. ATX PSUs do not share rails. The 12 V output of PSU A is not electrically tied to the 12 V output of PSU B. If you wired them together you would create a current loop and damage one or both units.

The reason we use dual PSU in 4-GPU and larger builds is split power delivery: a single 2 kW ATX unit at 70% load is fine on paper, but the cable bundle alone (four GPU PCIe runs plus motherboard plus EPS) is physically miserable to route from one PSU. Splitting into two 1500 W or 2000 W units halves the cable mass per side, halves the per-unit thermal load, and gives you a graceful 2-GPU fallback, rather than a hard system death, if the GPU-only PSU dies mid-job. If the PSU feeding the motherboard fails, the system still goes down.

CRPS in a server chassis is different. A 2+2 CRPS backplane with four 2 kW modules and 1+1 redundant pairs is genuinely hot-swappable, and one module can fail without taking the system down. This is the 8-GPU server configuration, and we are explicit on the product page when a build ships with that backplane. It is also the configuration that justifies a "redundant PSU" claim. We do not make that claim on 4-GPU ATX builds, because it would be wrong.

Rail balance and per-rail current limits

Modern high-end ATX PSUs are single-rail 12 V designs by default, which simplifies things — the entire 12 V output is one big pool, and the only limit is the PSU's total wattage. A 2000 W single-rail unit running on 230 V can deliver ~166 A on 12 V, which is more than enough for any single GPU.

Some older or industrial PSUs are multi-rail (12V1, 12V2, 12V3, 12V4), each with a per-rail OCP cap of typically 20–40 A. This matters in two cases:

You connect a 5090 with its 12V-2x6 (12VHPWR successor) plug to a single 12 V rail. A 5090 at 600 W transient peak draws 50 A on 12 V. A 40 A multi-rail OCP will trip.
You connect two GPUs to the same multi-rail port group. Same problem, doubled.

The practical answer: for multi-GPU builds, use single-rail 12 V PSUs. Multi-rail is a relic from the era when 12V OCP was a safety feature on single-GPU gaming systems. It is actively unhelpful in a 4× 5090 chassis.
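The rail-current figures in this section follow from I = P / V on the 12 V rail. A small sketch, with hypothetical function names and the 40 A per-rail limit taken as the example value from the text:

```python
RAIL_VOLTAGE_V = 12.0


def rail_current_a(power_w, voltage_v=RAIL_VOLTAGE_V):
    """Current drawn on the rail for a given power."""
    return power_w / voltage_v


def trips_rail_ocp(power_w, rail_ocp_a, voltage_v=RAIL_VOLTAGE_V):
    """True if the load exceeds the per-rail over-current limit."""
    return rail_current_a(power_w, voltage_v) > rail_ocp_a


print(f"5090 at 600 W peak: {rail_current_a(600):.0f} A")
print("Trips a 40 A rail:", trips_rail_ocp(600, 40))
print(f"2000 W single rail: {rail_current_a(2000):.1f} A available")
```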

A concrete 4-GPU 5090 build

Numbers from a representative K-AI 96 Turin build with 4× RTX 5090:

Component                        Sustained             Peak
---------                        ---------             ----
4× RTX 5090                      4 × 500 W = 2000 W    4 × 600 W = 2400 W (transient)
EPYC 9354 (32-core, 280 W)       ~ 250 W               350 W
Motherboard + 8× 64 GB DDR5      ~ 100 W               150 W
2× NVMe SSD                      ~ 20 W                30 W
4× 120 mm industrial fans        ~ 30 W                40 W
                                 ---------             ----
Total system                     ~ 2.4 kW              ~ 3.0 kW transient

PSU sizing: 3.0 kW transient / 0.92 (Gold @ 50% load) = 3.26 kW PSU budget. A 2× 1500 W pair (3.0 kW combined) falls just short of that, so round up to 2000 W + 1500 W or 2× 2000 W ATX, single-rail, Gold-or-better, split as:

PSU A (2000 W): motherboard, CPU, drives, fans, GPU 1, GPU 2
PSU B (1500 W): GPU 3, GPU 4

The 2× 2000 W variant is what we quote for customers who want a runway to upgrade to RTX Pro 6000 Workstation cards later (600 W each, harder cap on transient, but a 2.4 kW sustained ceiling either way).
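The split above can be checked per PSU rather than only in total. The sketch below sums the sustained and peak figures from the build table for each load group; the dictionary layout and function name are illustrative assumptions.

```python
# (sustained W, peak W) per load, taken from the 4x RTX 5090 build table
LOADS = {
    "motherboard": (100, 150),
    "cpu": (250, 350),
    "drives": (20, 30),
    "fans": (30, 40),
    "gpu1": (500, 600),
    "gpu2": (500, 600),
    "gpu3": (500, 600),
    "gpu4": (500, 600),
}

# PSU name -> (rated W, loads it feeds), following the split listed above
SPLIT = {
    "PSU A": (2000, ["motherboard", "cpu", "drives", "fans", "gpu1", "gpu2"]),
    "PSU B": (1500, ["gpu3", "gpu4"]),
}


def psu_report(split, loads):
    """Sum sustained and peak load per PSU and express both against its rating."""
    report = {}
    for psu, (rated_w, names) in split.items():
        sustained = sum(loads[n][0] for n in names)
        peak = sum(loads[n][1] for n in names)
        report[psu] = {
            "sustained_w": sustained,
            "peak_w": peak,
            "sustained_fraction": sustained / rated_w,
            "peak_fraction": peak / rated_w,
        }
    return report


report = psu_report(SPLIT, LOADS)
for psu, r in report.items():
    print(f"{psu}: {r['sustained_w']} W sustained ({r['sustained_fraction']:.0%}), "
          f"{r['peak_w']} W peak ({r['peak_fraction']:.0%})")
```

With this split, PSU A sits at 70% sustained and PSU B at about 67%, which is why the heavier load group gets the 2000 W unit.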

A concrete 8-GPU 5090 build

Numbers for a K-AI 256 Turin Dual with 8× RTX 5090:

Component                        Sustained             Peak
---------                        ---------             ----
8× RTX 5090                      8 × 500 W = 4000 W    8 × 600 W = 4800 W (transient)
2× EPYC 9554 (64-core, 360 W)    ~ 650 W               800 W
Motherboard + 16× 64 GB DDR5     ~ 180 W               250 W
4× NVMe SSD                      ~ 40 W                60 W
8× industrial server fans        ~ 80 W                120 W
                                 ---------             ----
Total system                     ~ 5.0 kW              ~ 6.0 kW transient

PSU sizing: 6.0 kW transient / 0.94 (Platinum CRPS @ 50% load) = 6.4 kW PSU budget. Two 2000 W CRPS modules (4.0 kW combined) fall short of both that budget and the 5.0 kW sustained load, so the configuration for this build is four CRPS modules at 2000–2400 W each (8.0–9.6 kW combined) on a redundant 2+2 backplane.

This is the configuration where a genuine 1+1 redundancy claim is warranted. The capex delta over a non-redundant dual-PSU server is ~€800–€1200 per build.
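When comparing module ratings, it helps to look at the capacity left after one module fails. The sketch below assumes all installed modules share the load and that a failed module simply drops out; real backplanes differ in how they balance and budget power, so treat this as arithmetic only. Function names are hypothetical, and the 5.0 / 6.0 / 6.4 kW figures are the ones from this section.

```python
import math


def modules_for(budget_w, module_w):
    """Minimum number of modules whose combined rating meets the budget."""
    return math.ceil(budget_w / module_w)


def capacity_after_failure(n_modules, module_w, failed=1):
    """Combined rating left when `failed` modules drop out (load-sharing assumed)."""
    return max(n_modules - failed, 0) * module_w


SUSTAINED_W, PEAK_W, BUDGET_W = 5000, 6000, 6400

for module_w in (2000, 2400):
    remaining = capacity_after_failure(4, module_w)
    print(f"{module_w} W modules: need {modules_for(BUDGET_W, module_w)} for the budget; "
          f"4 installed leave {remaining} W after one failure "
          f"(covers sustained: {remaining >= SUSTAINED_W}, "
          f"covers budget: {remaining >= BUDGET_W})")
```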

230 V input matters here. An 8-GPU 5090 system at ~5 kW sustained draws roughly 22 A at 230 V, which is more than a single-phase 16 A circuit (about 3.7 kW) can supply at all. On a 32 A circuit (about 7.4 kW) the same load sits at roughly 70% of the breaker rating, which is near the upper bound of what is typically allowed as continuous draw. We recommend a 32 A circuit, or a three-phase rack PDU with 230 V per leg, for any 8-GPU deployment.
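The circuit arithmetic is easy to check with a short sketch. It uses nominal capacity (volts × amps) and the sustained DC load from the build table, so real wall draw is somewhat higher once PSU efficiency is included. Function names are hypothetical.

```python
def circuit_capacity_w(voltage_v, breaker_a):
    """Nominal power available from a single-phase circuit."""
    return voltage_v * breaker_a


def load_fraction(load_w, voltage_v, breaker_a):
    """Load as a fraction of the circuit's nominal capacity."""
    return load_w / circuit_capacity_w(voltage_v, breaker_a)


SUSTAINED_8GPU_W = 5000  # sustained total from the 8x 5090 build table

for breaker_a in (16, 32):
    frac = load_fraction(SUSTAINED_8GPU_W, 230, breaker_a)
    print(f"230 V / {breaker_a} A: {circuit_capacity_w(230, breaker_a)} W capacity, "
          f"load at {frac:.0%}")
```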

UPS sizing

If you are putting a 4-GPU or 8-GPU AI server on UPS — which you should, at minimum for graceful shutdown — the math is:

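4-GPU build: 2.4 kW sustained, ~3.0 kW transient. A 3 kVA / 2.4 kW online UPS covers the sustained load with no margin, so check that its overload rating rides through the ~3.0 kW peaks, or step up a size. Runtime at full load is minimal (~5 minutes), which is enough for graceful shutdown.
8-GPU build: 5 kW sustained, ~6.0 kW transient. A 6 kVA online UPS is the minimum, and its real-power (kW) rating needs to cover the transient peak. For a real 10-minute runtime under load you are looking at 10 kVA or a parallel pair.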

A UPS that is undersized for the transient peak will go to bypass or shut down the moment the GPUs spike. The UPS rating must cover the transient peak, not the sustained average. Online double-conversion is the right topology for AI compute. Line-interactive units have a 4–10 ms transfer time that occasionally crashes inference jobs on the transition. Pure-sine output, not modified-sine — modern ATX and CRPS PSUs do not tolerate modified-sine well at high load.
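A UPS is rated in both kVA and kW, and it is the kW figure that has to cover the load. The sketch below converts kVA to real power with an output power factor and compares it against sustained and peak load. The function names are hypothetical, and the 0.8 power factor is an assumption implied by the "3 kVA / 2.4 kW" unit mentioned above; check the datasheet of the actual unit, including its short-term overload rating, which this sketch does not model.

```python
def ups_real_power_w(kva, power_factor):
    """Real power (W) a UPS can deliver, from its kVA rating and output power factor."""
    return kva * 1000 * power_factor


def ups_covers(kva, power_factor, load_w):
    """True if the UPS real-power rating is at least the given load."""
    return ups_real_power_w(kva, power_factor) >= load_w


# 4-GPU build figures from this article: 2.4 kW sustained, ~3.0 kW transient
for label, load_w in (("sustained", 2400), ("transient peak", 3000)):
    print(f"3 kVA @ pf 0.8 covers {label} ({load_w} W): {ups_covers(3, 0.8, load_w)}")
```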

Summary table — PSU recommendations per build class

Build	Sustained	Transient	PSU config	Redundancy claim
1× 4090 / 5090 workstation	~700 W	900 W	1× 1200 W ATX Gold, single-rail	None
2× 4090	~1.2 kW	1.5 kW	1× 1600 W ATX Gold, single-rail	None
4× 4090	~2.0 kW	2.6 kW	2× 1500 W ATX Gold, split delivery	None (split)
4× 5090	~2.4 kW	3.0 kW	2× 1500–2000 W ATX Gold, split delivery	None (split)
4× RTX Pro 6000 (Worksta.)	~2.6 kW	2.8 kW	2× 2000 W ATX Gold/Platinum	None (split)
8× 5090	~5.0 kW	6.0 kW	2+2 CRPS @ 2000–2400 W	1+1 (CRPS only)
8× RTX Pro 6000 (Worksta.)	~5.5 kW	5.7 kW	2+2 CRPS @ 2400 W	1+1 (CRPS only)
8× L40 / 8× L4 (inference)	2.6 / 0.7 kW	same	2× 1500 W ATX or 1+1 CRPS @ 1600 W	Optional

The L40 and L4 numbers are why these cards remain interesting: an 8× L4 inference server runs on a single 1200 W ATX PSU with room to spare, and fits in any office circuit. Not every workload needs Blackwell.

What to do next

If you are sizing a build, the questions worth answering before specifying PSUs:

What is the exact GPU model and how many? Transient peak per card × N, not nominal TDP × N.
Is this a 4U workstation chassis or a server chassis with CRPS backplane? This determines whether dual PSU is split-delivery or genuine 1+1 redundancy.
What is your circuit? 230 V 16 A is fine for 4-GPU. 8-GPU wants 32 A or three-phase. 110/120 V US households cannot deliver 8-GPU 5090 on a single circuit, period.
What is the duty cycle? 24/7 sustained inference justifies Platinum or Titanium PSUs. Intermittent training or development can run on Gold and save €400 per build.
Do you actually need redundancy, or do you need a graceful 2-GPU fallback? They are different things. Dual ATX gives you the second. Only a CRPS backplane gives you the first.

If you can answer those five, the PSU choice falls out of the math. The next article in the W-series (W05) covers thermals and airflow — the other half of why dual-PSU 4U builds need careful cable routing, and why "industrial fan" is not marketing.

This is part of the Kentino Wiki, a reference series on AI compute, robotics, and the systems that connect them. Comments and corrections welcome at info@kentino.com.
