Start small.
Leave room to scale.
The cheapest mistake at Stage 1 is buying hardware that cannot host the next GPU generation. The most expensive mistake at Stage 3 is realising your facility was the constraint all along. Below is a five-stage buying guide for sovereign AI infrastructure - what each stage looks like, what it costs, and what you learn before the next one.
From on-paper to dedicated AI infrastructure.
Stage
00
Scoping the first workload
Pick ONE use case · estimate token volume, latency, concurrency · identify which data classes are involved
→ Answers: "What's the smallest real workload we can run?"
Stage
01
Starter node
Single workstation or 1U/2U server · 1–2 GPUs · run quantised open-weight model (8B–30B class) · inference only · single use case · limited users
→ Answers: "Does this work end-to-end inside our perimeter?"
Stage
02
Production single-node
Properly racked server · 2–8 GPUs · redundant power · supported OS · monitoring · backups · larger model or higher concurrency · 1–3 use cases sharing the box
→ Answers: "Can we run this as a real internal service?"
Stage
03
Multi-node cluster
2–8 nodes · high-speed interconnect · shared storage · workload scheduler · larger models · fine-tuning · multi-team access
→ Answers: "Can we serve the whole company from this?"
Stage
04
Dedicated AI infrastructure
Purpose-built environment · dense GPU racks · liquid cooling as default · capacity planning tied to roadmap · DR / failover · governance and audit at platform level
→ Answers: "Is AI now a core piece of company infra?"
Four principles that protect your early spend.
The decisions you take at Stage 1 either compound favourably or compound against you. These four principles - applied early - are the difference between a Stage 4 estate that grew gracefully from a Stage 1 box, and a Stage 4 estate that required throwing away a generation of hardware along the way.
Principle 01
Buy chassis and cooling that can host the next GPU generation
Today's 700 W GPUs are tomorrow's 1 000+ W GPUs. Air cooling at high density runs out of headroom faster than the GPU you bought. Specifying liquid cooling at Stage 1 is cheaper than rebuilding at Stage 3.
Principle 02
Standardise on an inference stack portable across stages
Whatever runtime, container layout, and observability stack you pick at Stage 1 should still work at Stage 4. The cost of porting a poorly-chosen runtime across three rack moves is high.
Principle 03
Treat Stage 1 hardware as a lab asset after Stage 2 - not as throwaway
The starter node becomes the development and evaluation environment once the production single-node is live. Plan for that from the start; the box keeps earning its keep.
Principle 04
Decide the Stage 4 power and cooling envelope at Stage 2
The hardest constraint in scaling AI infrastructure is usually the facility, not the silicon. If you wait until Stage 3 to think about Stage 4 power and cooling, you have already missed the design window.
RM-4U8G is engineered for Stages 2 and 3.
A production single-node deployment (Stage 2) and a multi-node cluster (Stage 3) are the same chassis, the same cooling loop, the same power topology. You upgrade by replacing GPUs and motherboard - the platform stays.
That continuity is the buying-guide thesis in physical form. The Stage 2 starter - two RTX 6000 Ada GPUs in the same chassis - is the same Stage 3 maximum - eight H200 NVL GPUs. Same chassis, same cooling, no rebuild.
See the platform
What you would actually buy at each stage.
Reference figures · examples drawn from real customer deployments. Final sizing is specified per workload - these are not bundles LM TEK sells, they are sketch BOMs to anchor the buying conversation.
| Stage | Configuration sketch | Indicative spend |
|---|---|---|
| Stage 01 | 2× RTX 6000 Ada, 1× CPU, 256 GB RAM, 2× 3.84 TB NVMe | ~€80k–€120k |
| Stage 02 | 4× RTX A6000, 2× CPU, 512 GB RAM, 4× 3.84 TB NVMe | ~€180k–€280k |
| Stage 03 | 8× H200 NVL, 2× CPU, 1 TB RAM, 8× 7.68 TB NVMe (per node × 2–4 nodes) | ~€800k–€2M |
Sizing your first deployment?
Tell us your stage, your workload, and the constraints in your facility. We will recommend a configuration that fits - and partners who can take it from spec to deployed system.