What is TCP Proxy Protocol and why do you need to know about it?
When working with LoadBalancers in Cloud or Physical Environments, we often need the source IPs of our connecting clients.
You might want to know whether a user is connecting from an External or Internal Network, or you might want to trigger different application logic based on a particular Client IP Address.
Unlike DNAT or Redirect rules in Firewalls, proxying LoadBalancers terminate the incoming TCP connection and establish a new TCP session to one of the backends. Traffic then reaches the backend with the LoadBalancer's IP as its source address, so the original Client IP is lost.
If the application is HTTP-based and you have a Layer 7 LoadBalancer (Reverse Proxy), you can preserve the source IP by capturing it on the Reverse Proxy and passing it on in an X-Forwarded-For header. But what do you do when you run a non-HTTP Service, or sit behind a Layer 4 Network LoadBalancer?
In this case, TCP Proxy Protocol comes to the rescue (not to be confused with an HTTP Proxy).
When enabled on the LoadBalancer, it prepends a small header to the data sent on each new connection to the backend. The header carries the source IP address and port of the original Client and is sent once, before any application data.
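To make the header concrete, here is a minimal sketch of how a proxy could build a version 1 header line, following the text format in the Proxy Protocol specification. The function name `build_proxy_v1_header` and the sample addresses are made up for illustration; a real LoadBalancer does this for you.

```python
import ipaddress

def build_proxy_v1_header(src_ip, src_port, dst_ip, dst_port):
    """Return the PROXY protocol v1 line a proxy sends before any payload."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must use the same IP version")
    family = "TCP4" if src.version == 4 else "TCP6"
    # Format: PROXY <family> <src addr> <dst addr> <src port> <dst port> CRLF
    line = f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n"
    return line.encode("ascii")

# Hypothetical client 203.0.113.7 connecting to a frontend on 10.0.0.5:443.
header = build_proxy_v1_header("203.0.113.7", 56324, "10.0.0.5", 443)

# The header goes out once, ahead of the client's own bytes.
stream = header + b"PING\r\n"
```

The backend sees `stream` as the first bytes of the connection, which is why it has to know that a header is coming.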
To extract the source IP address, the backend service must have Proxy Protocol enabled as well. The service then expects the Proxy Protocol header at the start of every connection and strips it off before processing the request as usual.
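The backend side can be sketched in a few lines. This is an illustrative parser for the version 1 text header only, assuming the whole header has already been read into `data`. The names `ProxyInfo` and `parse_proxy_v1` are hypothetical, and the `UNKNOWN` family from the specification is deliberately not handled.

```python
from typing import NamedTuple, Tuple

class ProxyInfo(NamedTuple):
    family: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

def parse_proxy_v1(data: bytes) -> Tuple[ProxyInfo, bytes]:
    """Split a v1 header off the front of `data`; return (info, remaining payload)."""
    if not data.startswith(b"PROXY "):
        raise ValueError("missing PROXY protocol v1 header")
    # A v1 line is at most 107 bytes including the trailing CRLF.
    end = data.find(b"\r\n", 0, 107)
    if end == -1:
        raise ValueError("unterminated PROXY protocol v1 header")
    parts = data[:end].decode("ascii").split(" ")
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported or malformed PROXY v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    info = ProxyInfo(family, src_ip, int(src_port), dst_ip, int(dst_port))
    # Everything after the CRLF is the application's own data.
    return info, data[end + 2:]

info, payload = parse_proxy_v1(b"PROXY TCP4 203.0.113.7 10.0.0.5 56324 443\r\nPING\r\n")
```

After the call, `info.src_ip` holds the original Client address and `payload` holds the untouched application bytes, which the service processes as usual.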
There are 2 versions of the Proxy Protocol:
V1 is a single human-readable text line carrying the source and destination IP addresses and ports of the original connection.
V2 is a binary format that carries the same addressing information and can also carry additional Metadata about the original connection in optional TLV (type-length-value) fields. For example, it can preserve the ID of a Private Endpoint* in the Cloud Environment.
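As a rough illustration of the binary format, the sketch below parses a version 2 header for the TCP over IPv4 case and collects any TLV fields that follow the addresses. The layout (12-byte signature, version/command byte, family/protocol byte, 2-byte length) follows the specification. The function name, the sample addresses, and the TLV value `demo` are made up. Type `0xE0` is simply the first value in the range the specification reserves for custom use, not a real cloud provider's field.

```python
import socket
import struct

V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"  # fixed 12-byte v2 signature

def parse_proxy_v2_ipv4(data: bytes):
    """Parse a v2 header for TCP over IPv4; return (src, dst, tlvs, payload)."""
    if len(data) < 16 or data[:12] != V2_SIGNATURE:
        raise ValueError("missing PROXY protocol v2 signature")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    # 0x21 = version 2 + PROXY command, 0x11 = IPv4 + stream (TCP).
    if ver_cmd != 0x21 or fam_proto != 0x11:
        raise ValueError("this sketch only handles v2 PROXY over TCP/IPv4")
    body = data[16:16 + length]
    if length < 12 or len(body) < length:
        raise ValueError("truncated PROXY protocol v2 header")
    src_raw, dst_raw, src_port, dst_port = struct.unpack("!4s4sHH", body[:12])
    # Remaining bytes are TLVs: 1-byte type, 2-byte length, then the value.
    tlvs = {}
    pos = 12
    while pos + 3 <= len(body):
        tlv_type, tlv_len = struct.unpack("!BH", body[pos:pos + 3])
        tlvs[tlv_type] = body[pos + 3:pos + 3 + tlv_len]
        pos += 3 + tlv_len
    src = (socket.inet_ntoa(src_raw), src_port)
    dst = (socket.inet_ntoa(dst_raw), dst_port)
    return src, dst, tlvs, data[16 + length:]

# Hand-built sample: addresses (12 bytes) + one placeholder TLV (3 + 4 bytes).
sample = (
    V2_SIGNATURE
    + struct.pack("!BBH", 0x21, 0x11, 12 + 3 + 4)
    + socket.inet_aton("203.0.113.7") + socket.inet_aton("10.0.0.5")
    + struct.pack("!HH", 56324, 443)
    + struct.pack("!BH", 0xE0, 4) + b"demo"
    + b"PING"
)
src, dst, tlvs, payload = parse_proxy_v2_ipv4(sample)
```

Metadata such as a Private Endpoint ID would show up as one of the entries in `tlvs`; the exact type values are defined by each provider.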
More information on Proxy Protocol is available here: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
Some typical use cases and supported LoadBalancers in Public Clouds:
- Network and Classic LoadBalancers in AWS
- Standard LoadBalancer in Azure
- Private Endpoints (Azure, AWS)
- GCP LoadBalancers
And on the receiving end, most “reverse proxies” and service meshes support it as well.
The important bit to remember is that enabling Proxy Protocol on the Backend Service will break all requests that arrive without a Proxy Protocol header, and vice versa: a backend without Proxy Protocol enabled will treat the header as application data. There is no auto-detection mechanism, so if you switch on Proxy Protocol on a service, make sure that no direct requests reach it without the header.
This is especially relevant when dealing with Private Link Services. Enabling Proxy Protocol on the Private Endpoint will prevent the use of the LoadBalancer directly unless you configure multiple frontend/backend pairs, with and without Proxy Protocol.
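The mismatch can be modelled with a toy check on the first bytes of a connection. This is only a sketch of the outcomes, not how any particular server behaves. The function name `check_first_bytes` and its return strings are invented for illustration.

```python
V1_PREFIX = b"PROXY "
V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"

def check_first_bytes(first_bytes: bytes, proxy_protocol_enabled: bool) -> str:
    """Model what happens for each sender/receiver combination."""
    has_header = (first_bytes.startswith(V1_PREFIX)
                  or first_bytes.startswith(V2_SIGNATURE))
    if proxy_protocol_enabled and not has_header:
        # The backend insists on a header and gets plain application data.
        return "broken: expected a PROXY header"
    if not proxy_protocol_enabled and has_header:
        # The backend reads the header as if it were the request itself.
        return "broken: PROXY header treated as application data"
    return "ok"

via_lb = b"PROXY TCP4 203.0.113.7 10.0.0.5 56324 443\r\nGET / HTTP/1.1\r\n"
direct = b"GET / HTTP/1.1\r\n"

results = {
    ("via_lb", True): check_first_bytes(via_lb, True),
    ("direct", True): check_first_bytes(direct, True),
    ("via_lb", False): check_first_bytes(via_lb, False),
    ("direct", False): check_first_bytes(direct, False),
}
```

Only the two matching combinations come back as `ok`, which is why a service that must accept both kinds of traffic usually needs separate listeners.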
Private Endpoint*: by Private Endpoint we mean the functionality available in AWS and Azure that allows Service Owners to publish their services behind LoadBalancers, with Consumers connecting to them by creating Private Endpoints in their own Network.