Google Kubernetes Engine & VPN
After some investigation, I found the root cause of the problem. Basically, the communication wasn't working properly because of something called IP masquerading (https://cloud.google.com/kubernetes-engine/docs/how-to/ip-masquerade-agent), which GKE uses for NAT translation.
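Before changing anything, you can check whether the ip-masq-agent and its configuration are already present in your cluster; the ConfigMap may not exist yet if you have never created one (these are plain kubectl queries, nothing cluster-specific assumed):
# Check if the agent DaemonSet is running in kube-system
kubectl get daemonset ip-masq-agent --namespace kube-system
# Show the current masquerade configuration, if a ConfigMap exists
kubectl get configmap ip-masq-agent --namespace kube-system -o yaml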
GKE has some default address ranges that are configured to not be masqueraded (on the version I was using, the defaults were 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16). Since the destination IP 10.197.100.201 is part of 10.0.0.0/8 but was outside the cluster, the solution was to modify the nonMasqueradeCIDRs: remove 10.0.0.0/8 and use 10.44.0.0/14 (the GKE cluster CIDR) instead.
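If you are not sure what your own cluster CIDR is (the 10.44.0.0/14 value in my case), you should be able to read it from the cluster description; CLUSTER_NAME and ZONE below are placeholders:
# Print the pod address range assigned to the cluster
gcloud container clusters describe CLUSTER_NAME --zone ZONE --format="value(clusterIpv4Cidr)"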
In order to do that, I used the following configmap:
apiVersion: v1
data:
  config: |-
    nonMasqueradeCIDRs:
      - 10.44.0.0/14
      - 172.16.0.0/12
      - 192.168.0.0/16
    resyncInterval: 60s
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
After that, to apply the config, you can upload the configmap using the following command:
kubectl create configmap ip-masq-agent --from-file <configmap file> --namespace kube-system
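Since the snippet above is a complete ConfigMap manifest (not just a data file), applying it directly should also work; the file name here is only an example:
# Apply the full manifest instead of creating the ConfigMap from a data file
kubectl apply -f ip-masq-agent-configmap.yaml
# Verify the ConfigMap was picked up
kubectl describe configmap ip-masq-agent --namespace kube-system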
I found a solution in this blog.
The problem is that the default iptables config looks like this:
iptables -A POSTROUTING ! -d 10.0.0.0/8 \
    -m comment --comment "kubenet: outbound traffic" -m addrtype \
    ! --dst-type LOCAL -j MASQUERADE -t nat
It means that traffic from the pods will be NATted to the host IP only if the destination is not in 10.0.0.0/8.
This 10.0.0.0/8 is the problem: it’s too large.
It also includes your 10.197.100.201 IP.
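To see this rule for yourself, you can SSH into one of the cluster nodes and dump the NAT table; this is generic iptables usage, not something specific to the fix below:
# On a cluster node: list the POSTROUTING rules in the NAT table
sudo iptables -t nat -S POSTROUTING | grep MASQUERADE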
To fix this you can add the following DaemonSet to your Kubernetes Cluster:
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: fix-nat
  labels:
    app: fix-nat
spec:
  template:
    metadata:
      labels:
        app: fix-nat
    spec:
      hostPID: true
      containers:
        - name: fix-nat
          image: gcr.io/google-containers/startup-script:v1
          imagePullPolicy: Always
          securityContext:
            privileged: true
          env:
            - name: STARTUP_SCRIPT
              value: |
                #! /bin/bash
                while true; do
                  iptables-save | grep MASQUERADE | grep -q "NAT-VPN"
                  if [ $? -ne 0 ]; then
                    echo "Missing NAT rule for VPN, adding it"
                    iptables -A POSTROUTING -d 10.197.100.0/24 -m comment --comment "NAT-VPN: SNAT for outbound traffic through VPN" -m addrtype ! --dst-type LOCAL -j MASQUERADE -t nat
                  fi
                  sleep 60
                done
This small script will check every minute, forever, whether we have the right iptables rule and, if not, add it.
Note that privileged: true is necessary for the pod to be able to change iptables rules on the host.
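To roll this out, save the manifest to a file and apply it, then check that a fix-nat pod is running on every node; the file name is only an example:
# Deploy the DaemonSet and verify a pod is scheduled on each node
kubectl apply -f fix-nat-daemonset.yaml
kubectl get pods -l app=fix-nat -o wide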
I had the same problem and this solved the issue.