AWS best practice: should I have a NAT gateway in each AZ?
From the official AWS NAT Gateway documentation:
If you have resources in multiple Availability Zones and they share one NAT gateway, in the event that the NAT gateway's Availability Zone is down, resources in the other Availability Zones lose internet access. To create an Availability Zone-independent architecture, create a NAT gateway in each Availability Zone and configure your routing to ensure that resources use the NAT gateway in the same Availability Zone.
The NAT Gateway enables outgoing Internet connectivity for a private subnet. It is important to note that, to achieve high availability, you need to create a NAT Gateway in every Availability Zone in which you have created private subnets.
The described network architecture consisting of public subnets, private subnets, and HA NAT gateways
Considerations
- If keeping costs to a minimum is essential, the baseline costs of $32.00 per month per NAT Gateway could be a show stopper. When using three AZs, you will pay $96.00 per month for three NAT Gateways.
- The NAT Gateway also increases costs for outbound traffic. You pay a premium of $0.045 per GB flowing from a private subnet to the Internet. Measured against a standard data transfer out rate of about $0.09 per GB, that raises the cost of outgoing traffic by 50%.
Extra Points!
- Terraform Infra as Code module simple VPC example
...
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
enable_nat_gateway = true
single_nat_gateway = false # do not share one NAT gateway; with one private subnet per AZ, as here, this gives 1 NGW per AZ
...
Ref Link: https://cloudonaut.io/advanved-aws-networking-pitfalls-that-you-should-avoid/
Yes, ideally you would have one NAT gateway per Availability Zone (AZ).
AWS documents this advice in Comparison of NAT Instances and NAT Gateways:
Highly available: NAT gateways in each Availability Zone are implemented with redundancy. Create a NAT gateway in each Availability Zone to ensure zone-independent architecture.
A single NAT gateway in a single AZ has redundancy within that AZ only, so if there were zonal issues then instances in other AZs would have no route to the internet.
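To make the "one NAT gateway per AZ, routed within the same AZ" idea concrete, here is a minimal Terraform sketch using plain AWS provider resources. It is only an illustration under assumptions: the variable names (vpc_id, public_subnet_ids, private_subnet_ids) are hypothetical, the subnets are assumed to already exist with the public ones routed to an internet gateway, and both maps are assumed to be keyed by the same AZ names.

```hcl
# Hypothetical inputs: existing VPC and subnet IDs, both maps keyed by AZ name.
variable "vpc_id" {
  type = string
}

variable "public_subnet_ids" {
  type = map(string) # AZ name => public subnet ID
}

variable "private_subnet_ids" {
  type = map(string) # AZ name => private subnet ID
}

# One Elastic IP per AZ for that AZ's NAT gateway.
# Note: `domain = "vpc"` assumes AWS provider v5 or later; older versions use `vpc = true`.
resource "aws_eip" "nat" {
  for_each = var.public_subnet_ids
  domain   = "vpc"
}

# One NAT gateway per AZ, placed in that AZ's public subnet.
resource "aws_nat_gateway" "per_az" {
  for_each      = var.public_subnet_ids
  allocation_id = aws_eip.nat[each.key].id
  subnet_id     = each.value
}

# One private route table per AZ.
resource "aws_route_table" "private" {
  for_each = var.private_subnet_ids
  vpc_id   = var.vpc_id
}

# Default route of each private route table points at the NAT gateway in the same AZ.
resource "aws_route" "private_default" {
  for_each               = var.private_subnet_ids
  route_table_id         = aws_route_table.private[each.key].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.per_az[each.key].id
}

# Attach each private subnet to its own AZ's route table.
resource "aws_route_table_association" "private" {
  for_each       = var.private_subnet_ids
  subnet_id      = each.value
  route_table_id = aws_route_table.private[each.key].id
}
```

Because every private subnet's default route stays inside its own AZ, a zonal outage only affects instances in that zone. A single shared NAT gateway would instead be one aws_nat_gateway referenced by all private route tables.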
Note: there are per hour charges for each NAT gateway as well as per GB data processed (see VPC Pricing). See How can I reduce data transfer charges for my NAT gateway?