What “self-owned, brand-new” means
We run a 100% self-owned fleet. There are no reseller nodes, no sublet inventory, and no marketplace risk. Hardware is brand-new, sourced directly through verified channels, with unified firmware and power profiles for consistent performance. This results in predictable behavior, clean provenance, and tight control over failure domains.
Key points
- No outsourcing or reseller nodes. Capacity and SLAs are our responsibility end-to-end.
- Not a marketplace. Unified firmware & power profiles keep performance consistent across the fleet.
- Burn-in tests. New hosts pass thermal, memory, and sustained-load checks for predictable behavior under real workloads.
Why it matters
Consistency & repeatability
Identical firmware baselines and power/thermal policies reduce run-to-run variance. Training and rendering jobs behave the same across equivalent SKUs, simplifying capacity planning and reproducibility.
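One way to put a number on run-to-run variance is the coefficient of variation (standard deviation divided by mean) of a job's throughput across repeated runs. The sketch below is illustrative only: the function name and the sample figures are hypothetical and do not come from fleet telemetry.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Return sample stdev / mean for a list of per-run throughput values."""
    if len(samples) < 2:
        raise ValueError("need at least two runs")
    m = mean(samples)
    if m == 0:
        raise ValueError("mean throughput is zero")
    return stdev(samples) / m


# Hypothetical throughput (samples/sec) for the same job repeated on two hosts.
runs_host_a = [1002.0, 998.0, 1001.0, 999.0]   # tight spread
runs_host_b = [1000.0, 940.0, 1055.0, 1005.0]  # wide spread

cv_a = coefficient_of_variation(runs_host_a)
cv_b = coefficient_of_variation(runs_host_b)
print(f"host A CV: {cv_a:.4f}, host B CV: {cv_b:.4f}")
```

A lower value means the job behaves more alike from run to run, which is the property that unified firmware and power profiles are meant to protect.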
Predictable capacity
Because inventory is self-owned, we can plan upgrades, retirements, and spares deliberately instead of chasing ephemeral marketplace supply. That translates to steadier queues and faster recovery when hardware needs replacing.
Controlled blast radius
Hardware cohorts are segmented by SKU and rack domains. When something fails, isolation and documented swap procedures keep impact bounded and recovery time short.
Procurement & lifecycle
New batches arrive with vendor diagnostics, then enter our burn-in lab for sustained-load and thermal characterization. We document power curves per chassis, validate firmware, and enroll the hosts into inventory with labeled cohorts for scheduling and maintenance windows.
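The enrollment step can be pictured as a small grouping routine: check each host's firmware against the baseline for its SKU, then file it under a cohort label. This is a minimal sketch under assumed names. The `Host` fields, the `sku/rack` label format, and the `enroll` function are hypothetical and do not describe the actual inventory system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    serial: str
    sku: str
    rack: str
    firmware: str


def cohort_label(host):
    """Build a cohort label from SKU and rack domain (assumed format)."""
    return f"{host.sku}/{host.rack}"


def enroll(hosts, firmware_baseline):
    """Group hosts into cohorts; reject any host whose firmware is off-baseline."""
    cohorts = defaultdict(list)
    rejected = []
    for h in hosts:
        if firmware_baseline.get(h.sku) != h.firmware:
            rejected.append(h.serial)
            continue
        cohorts[cohort_label(h)].append(h.serial)
    return dict(cohorts), rejected


# Hypothetical batch: serials, SKU name, racks, and versions are made up.
hosts = [
    Host("SN001", "gpu-a", "rack-01", "1.2.0"),
    Host("SN002", "gpu-a", "rack-01", "1.2.0"),
    Host("SN003", "gpu-a", "rack-02", "1.1.9"),
]
baseline = {"gpu-a": "1.2.0"}

enrolled, rejected = enroll(hosts, baseline)
print(enrolled, rejected)
```

Keying cohorts on SKU and rack domain means a scheduler or maintenance window can target one label at a time rather than individual machines.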
Burn-in & validation
- Thermals: sustained load at target ambient; throttle/derate checks.
- Memory & storage: multi-pass tests and SMART baseline capture.
- GPU stress: mixed compute & memory pressure with perf counters.
- Noise & variance: sample percentiles to flag outliers before fleet entry.
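The percentile check in the last item could look something like the sketch below, which flags hosts whose burn-in score falls outside quartile-based fences. The 1.5 x IQR rule is a common convention swapped in for illustration, not necessarily the rule used in the burn-in lab, and the host names and scores are hypothetical.

```python
from statistics import quantiles


def flag_outliers(scores, k=1.5):
    """Return hosts whose score lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(scores.values(), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(h for h, s in scores.items() if s < low or s > high)


# Hypothetical normalized benchmark scores from one burn-in batch.
batch_scores = {
    "h1": 100.0, "h2": 101.0, "h3": 99.0, "h4": 100.0,
    "h5": 102.0, "h6": 98.0, "h7": 100.0, "h8": 80.0,
}

flagged = flag_outliers(batch_scores)
print(flagged)
```

Fences built from quartiles are less sensitive to a single bad host than a mean-and-standard-deviation cutoff, which is why a percentile-based rule suits small batches.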
Power & thermals
Hosts ship with consistent power limits and fan curves aligned to the chassis. We pin firmware/driver versions per cohort, and we log temperature and throttle signals for trend detection, not just point-in-time alarms.
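To illustrate the difference between trend detection and a point-in-time alarm, the sketch below fits a least-squares slope to daily temperature readings. The alarm limit, slope threshold, and readings are all hypothetical values chosen for the example.

```python
def slope_per_day(samples):
    """Least-squares slope of (day, temp_c) pairs, in degrees C per day."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den


ALARM_LIMIT_C = 85.0     # assumed point-in-time alarm threshold
TREND_LIMIT_C_DAY = 0.3  # assumed allowable drift per day

# Hypothetical daily peak GPU temperatures for one host over a week.
readings = [(0, 70.0), (1, 70.5), (2, 71.0), (3, 71.5),
            (4, 72.0), (5, 72.5), (6, 73.0)]

slope = slope_per_day(readings)
point_alarm = any(temp >= ALARM_LIMIT_C for _, temp in readings)
trend_alarm = slope > TREND_LIMIT_C_DAY
print(f"slope={slope:.2f} C/day, point_alarm={point_alarm}, trend_alarm={trend_alarm}")
```

In this example no single reading trips the alarm limit, yet the steady upward drift is caught by the slope check, which is the case that point-in-time alarms miss.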
Security posture
Owning the hardware gives us a clear chain of custody and full control over management interfaces: firmware provenance is tracked, image signing is enforced, and management access stays on an isolated network with MFA and short-lived credentials.
FAQ
Are the GPUs second-hand or marketplace rentals?
No. Hardware is brand-new and self-owned. We do not rent anonymous marketplace cards.
How do you keep performance consistent?
Unified firmware baselines, validated power/thermal profiles, and per-SKU cohorts reduce variance. Burn-in filters out outliers.
What happens when a host fails?
We isolate the host, evacuate its tenants, and swap in an on-site spare. Recovery follows a runbook, not improvisation.