Priya is the head of platform engineering at a healthtech startup in Sea Point, Cape Town. The product is a radiology screening assistant. The data is patient imaging. POPIA says it does not leave the country.
Her closest hyperscaler region is AWS af-south-1, in the same city. The vision model she has been evaluating wants either an H100 or, at minimum, a serious A10 partition. So she called her AWS rep, then Azure, then Google.
The answers were so different that her deployment plan changed twice in a week. AWS Cape Town does not have the GPU class she needs. Azure has one, but it lives in Johannesburg, 1,400 km away. Google has nothing locally at all. Below is what is actually in those three South African regions as of June 24, 2026, with the SKUs, the VRAM, and the sources.
- 8 × H100: available in Azure Johannesburg
- 16 GB: max VRAM in AWS Cape Town (T4)
- 0: GPU SKUs in Google Johannesburg
- 1 of 3: hyperscalers with serious AI compute
Azure Johannesburg
Azure runs an 8-GPU H100 node in South Africa North. The SKU is Standard_ND96isr_H100_v5. Each VM has eight NVIDIA H100 80GB cards (640 GB of GPU memory in total), 96 vCPUs, 1.9 TB of system RAM, and a 3.2 Tbps interconnect. It is documented and listed in the South Africa North regional catalogue.
Microsoft also lists the NVadsA10_v5 family in the same region. NVads is a vGPU partition off an NVIDIA A10, ranging from a 4 GB slice to a full 24 GB A10. The largest size (NV72ads_A10_v5) gives you two full A10s for 48 GB in total. The partitions are CUDA-capable through Azure's unified vGPU driver. They were originally designed for graphics and virtual desktops, but they can serve smaller models. Good for inference, not a clean training rig.

AWS Cape Town
The region in her own city, but not the GPUs.
Priya tried AWS first, because the region was geographically closest. The catalogue she got back: G4dn and Inf1. G4dn is NVIDIA T4. T4 is a 16 GB inference card that shipped in 2018. Useful for small models, document classification, light vision, edge inference. Not a training instance, and not a serious serving option for anything past about a 13B model in 4-bit.
Inf1 is AWS Inferentia, AWS's own inference accelerator. Up to 16 chips per instance. Works if your stack supports AWS Neuron. Most open-weight serving stacks (vLLM, llama.cpp, TGI) target CUDA first. Neuron support exists but lags.
What is not in af-south-1: G5, G6, L4, L40S, A100, H100, H200, Blackwell. None of it. AWS has not published GPU additions for Cape Town for several years.
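The "13B model in 4-bit" ceiling for a 16 GB T4 comes down to simple arithmetic on weight sizes. The sketch below is a rough estimate only: it counts model weights and nothing else, and the 20% headroom reserved for KV cache and activations is an assumed figure, not a measured one. The function names are illustrative.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, vram_gb: float,
         headroom_fraction: float = 0.2) -> bool:
    """True if the weights fit after reserving headroom for cache and activations.

    headroom_fraction is an assumption for illustration; real serving stacks vary.
    """
    usable_gb = vram_gb * (1 - headroom_fraction)
    return weight_footprint_gb(params_billion, bits_per_weight) <= usable_gb


T4_VRAM_GB = 16  # per-GPU VRAM of the T4 in G4dn

print(weight_footprint_gb(13, 4))   # 6.5 GB of weights for a 13B model in 4-bit
print(fits(13, 4, T4_VRAM_GB))      # True
print(fits(13, 16, T4_VRAM_GB))     # False: 26 GB of weights in 16-bit
```

The same 13B model in 16-bit needs about 26 GB for weights alone, which is why the T4 stops being an option as soon as the model grows or the quantisation is relaxed.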
Google Cloud Johannesburg
africa-south1 launched in January 2024. The region has E2, N2, N2D, C4, C4A, T2D, and M3 machine families. It does not have GPUs. It does not have TPUs. Google's own Compute Engine accelerator-location matrix, updated June 18, 2026, shows all three Johannesburg zones marked 'GPU unavailable' and 'TPU unavailable'. Cloud Run's GPU region list also excludes africa-south1.
The Google pricing catalogue lists A100, L4, and H100 SKUs under Johannesburg, which is misleading. The accelerator-location matrix is authoritative. The catalogue entries are placeholder rows for SKUs that are not currently deployable.
“Cape Town is being treated as an enterprise / data-residency region, not yet as an AI-compute supply region. That is a market-prioritisation issue, not a sanctions issue.”
The hyperscaler comparison
South Africa cloud GPUs · June 24 2026
Azure Johannesburg
The only serious option
- 8 × H100 80GB on Standard_ND96isr_H100_v5
- A10 partitions, 4 GB to 48 GB, on NVadsA10_v5
- South Africa North only, not South Africa West
- ~$98/hr on-demand for the 8 × H100 node
AWS Cape Town · Google Joburg
The other two
- AWS: G4dn (T4, 16 GB) and Inf1, nothing newer
- AWS: no G5, G6, L4, A100, H100, H200 in af-south-1
- Google: zero GPU SKUs deployable in africa-south1
- Google: zero TPU SKUs deployable in africa-south1
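The comparison above can also be written down as data and filtered by a VRAM requirement. This is a minimal sketch using only the figures quoted in this article. The `GpuOffer` type and the helper functions are illustrative names, Inf1 is left out because it is not a GPU, and Google appears only in the provider list because it has no deployable GPU SKU in africa-south1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuOffer:
    provider: str
    region: str
    sku: str
    gpu: str
    vram_gb_per_gpu: int


# Figures as quoted in this article (June 24, 2026 snapshot).
OFFERS = [
    GpuOffer("Azure", "South Africa North", "Standard_ND96isr_H100_v5", "H100", 80),
    GpuOffer("Azure", "South Africa North", "NV72ads_A10_v5", "A10", 24),
    GpuOffer("AWS", "af-south-1", "G4dn", "T4", 16),
]

PROVIDERS = ("Azure", "AWS", "Google")


def providers_meeting(min_vram_gb: int) -> list[str]:
    """Providers with at least one local GPU at or above the per-GPU VRAM floor."""
    return sorted({o.provider for o in OFFERS if o.vram_gb_per_gpu >= min_vram_gb})


def providers_without(min_vram_gb: int) -> list[str]:
    """Providers with no local GPU that meets the floor."""
    meeting = providers_meeting(min_vram_gb)
    return [p for p in PROVIDERS if p not in meeting]


print(providers_meeting(24))   # ['Azure']
print(providers_without(24))   # ['AWS', 'Google']
```

Raising the floor to 24 GB per GPU leaves Azure as the only provider, which is the "1 of 3" figure in practice.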
It is not sanctions
Before this gets blamed on sanctions: South Africa sits in Country Group A on the US Bureau of Industry and Security export-control list. There is no country-specific OFAC program against South Africa. The export controls on high-end AI chips target specific countries, entities, end users, and prohibited uses. They do not blanket-ban modern NVIDIA hardware from being deployed in SA.
The proof is Azure. Microsoft is operating H100 80GB hardware in Johannesburg right now. There is no legal block. AWS and Google have simply chosen not to ship the capacity yet. That is a commercial allocation decision, not a regulatory one.
What this means for South African enterprises
If you are POPIA-bound, the practical hyperscaler choice in South Africa is Azure or nothing. That is a fragile dependency. It means one provider's price changes, one region's capacity ceiling, and one product roadmap controls your AI workload.
If your workload does not need a full 8 × H100 node (and most teams' workloads do not), the local alternative is Azure's A10 partitions. They are CUDA-capable but were designed for graphics workloads, and some inference setups have well-known driver headaches on them.
The third option is one that the hyperscaler catalogues will not tell you about: deploy on hardware you own, in your own rack or co-located in a local data centre. South Africa has multiple competent colocation providers. The hardware itself, even at the high end, is buyable.
Three honest paths for SA enterprise AI in 2026
- Azure Johannesburg: Works if your workload fits 8-H100 or A10 partitions, and your budget fits the price.
- AWS or Google in SA: Possible for very light inference. Not viable for modern training or large-model serving.
- Hardware you own: DGX Spark, AMD Ryzen AI Max+ 395, or rack-mounted servers in a local colo. Sized to the workload, owned outright, jurisdiction settled.
What NPU Labs does about it
We design private AI clusters, source the hardware, install it on premises or in colocation here in South Africa, and stand up the engineering layer that runs on top. The conversation starts with the workload, not the catalogue.

On Priya's desk
Four GEEKOM A9 nodes, ~R260k all-in.
AMD Ryzen AI Max+ 395 in each node, 128 GB LPDDR5x memory per node, mini-PC form factor, ~120W per unit. Same memory class as a flagship cluster at roughly a fifth of the price.
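The totals behind that comparison are quick to work out. The sketch below uses only the approximate figures quoted here (four nodes, 128 GB and ~120 W each, ~R260k all-in) and the 8 × 80 GB of GPU memory on the Azure H100 node. It compares capacity only: the four nodes are separate machines, so their memory is not a single pool, and LPDDR5x is not equivalent to H100 GPU memory in bandwidth.

```python
NODES = 4
MEMORY_GB_PER_NODE = 128      # LPDDR5x per node
WATTS_PER_NODE = 120          # approximate figure from the article
TOTAL_COST_ZAR = 260_000      # approximate all-in figure from the article

H100_NODE_GPU_MEMORY_GB = 8 * 80  # Standard_ND96isr_H100_v5

total_memory_gb = NODES * MEMORY_GB_PER_NODE
total_watts = NODES * WATTS_PER_NODE
cost_per_node_zar = TOTAL_COST_ZAR / NODES
memory_ratio = total_memory_gb / H100_NODE_GPU_MEMORY_GB

print(total_memory_gb)        # 512
print(total_watts)            # 480
print(cost_per_node_zar)      # 65000.0
print(round(memory_ratio, 2)) # 0.8
```

On capacity alone, the four nodes hold about 80% of the memory of the 8 × H100 node while drawing roughly 480 W in total.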
The model weights are hers. The audit trail is hers. The Commerce Department does not get a say.
If the cloud catalogue is the bottleneck on your South African AI roadmap, it does not have to be.
The mirror universe
Priya, after the catalogue stopped being the constraint.
Same office, same desk, same engineer, same POPIA-bound radiology workload. Different inference layer: four AMD Ryzen AI Max+ 395 nodes on an open rack of her own in Sea Point.
When AWS adds a new SKU to af-south-1, she will notice. She does not need it. When Azure raises ND H100 v5 prices in Johannesburg, she will read about it in a newsletter. Her clusters do not move. Her data does not move. Her roadmap does not move.

