Simple question, difficult solution. I can’t work it out. I have a server at home with a site-to-site VPN to a server in the cloud. The server in the cloud has a public IP.
I want people to access the server in the cloud, and it should forward traffic through the VPN. I have tried this and it works. I’ve tried with nginx streams, frp and also HAProxy. They all work, but in the home server’s logs I can only see that people are connecting from the site-to-site VPN, not their actual source IP.
Is there any solution (program/Docker image) that will take a port and forward it to another host (or maybe another program listening on the host), which then modifies the traffic to contain the real source IP? The whole idea is that in the server logs I want to see people’s real IP addresses, not the cloud server’s private VPN IP.
Short answer: Don’t bother, it’s too complex to set up (unless your app is HTTP or supports the PROXY protocol). You’d be better off reading your proxy logs instead.
Long answer: What you want is called “IP transparency” and requires your proxy to “spoof” the IP address of the client when forwarding packets to the remote server. Some proxies do it (Nginx Plus, Avi Vantage, Fortinet) but are paid products. I don’t know of any free solutions, as I have only ever implemented it with those listed above.
This requires a fairly complex setup though:
1. IP address spoofing
The proxy must rewrite all downstream requests to spoof the client IP address, making it look like the traffic originates from the client at the TCP layer.
2. Backend server routing
As the packets will most likely appear to originate from random IPs on the internet, your backend server must have a way to route the return traffic back to the proxy instead of its default gateway. Otherwise you’d be implementing what is called “Direct Server Return”, which won’t work in your case (the client will drop the packets because they come from your backend server directly, and not from the proxy).
You have two solutions here:
- set your default gateway to the proxy over its VPN interface (don’t do that unless you truly understand all the implications of such a setup)
- use packet tagging and VRF on the backend server to route back all traffic coming from the VPN, back to the VPN interface (I’m not even sure this would work with an IPsec VPN though because of ACL…)
3. Intercept and route back return traffic
The proxy must be aware that it must intercept this traffic, which is addressed to the IP of the client, as part of a proxied request. This requires a proxy that can bind to an IP that is not configured on the system.
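To make the “bind to an IP that is not configured on the system” part concrete: on Linux this is what the IP_TRANSPARENT socket option allows. Below is a minimal Python sketch, assuming Linux and a process with CAP_NET_ADMIN (typically root). The function name and the example address are made up for illustration, and it only covers the socket side, not the routing needed in steps 2 and 3.

```python
import socket

# Linux value of IP_TRANSPARENT from <linux/in.h>. Some Python builds do not
# export the constant, so fall back to the raw number (an assumption that
# only holds on Linux).
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def open_spoofed_socket(client_ip: str, client_port: int = 0) -> socket.socket:
    """Return a TCP socket bound to client_ip, an address this host does not own.

    Raises OSError (usually PermissionError) when the process lacks
    CAP_NET_ADMIN or the platform does not support the option.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Tell the kernel this socket may use a non-local source address.
        sock.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
        sock.bind((client_ip, client_port))
    except OSError:
        sock.close()
        raise
    return sock
```

A proxy would then call connect() on this socket towards the backend, so the backend sees client_ip as the source. The replies only come back if the routing described in step 2 is in place.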
So yeah, don’t do that unless you NEED to do that (trust me as I had to do it, and hated setting it up).
Edit: apparently HAProxy supports this feature, which they call transparent mode.
I think this is right, but to pull it off you’d need to do one of two things. First off, if you’re doing it just for web traffic, having the nginx proxy put the original IP in a header and unpacking it on the other side is the smart move. Otherwise:
1: route all your traffic on your side via the VPN, and have the routing on the VPN side forward the packets to the intranet IP on your side rather than doing DNAT on them.
2: if you want to route normal traffic over your normal link, then you could do it with source routing on the router. You would need two subnets, one for your normal internet and one for the VPN traffic. Set up source routing so that packets with the VPN IP addresses go via the VPN and the rest are NATed the normal way. Then, same as before, the VPN on the cloud side forwards (not NATs) to your side of the VPN.
In both cases SNAT should be done on the cloud side.
It’s a fiddly setup just to get the IP addresses though.
I think you meant to reply to another comment. I never talked about setting up NAT rules, neither source, nor destination.
The proxy is responsible for responding with the correct IP address as it terminates the connection. Setting up NAT rules is not needed.
Well, I was replying to OP through your reply since it was pretty much spot on. Except I was giving some idea of other ways to bring the original IP through a VPN using the linux ip stack features. Whatever way they go about it, it’s a lot of effort for not that much upside though.
Short answer no, but you can add the source IP as part of the http header https://www.nginx.com/resources/wiki/start/topics/examples/forwarded/ then you have to log that bit of the header at the app level.
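For the header approach, the app (or whatever does the logging) has to pick the right address out of X-Forwarded-For. Here is a rough Python sketch of one common way to do it: walk the list from right to left and take the first hop that is not one of your own proxies. The function name and the 10.8.0.0/24 tunnel subnet are assumptions for illustration, not something from this thread.

```python
import ipaddress


def client_ip_from_xff(xff_header, peer_ip, trusted_networks):
    """Return the most likely real client IP for one HTTP request.

    xff_header: raw X-Forwarded-For value, e.g. "198.51.100.23, 10.8.0.1"
    peer_ip: address the TCP connection actually came from
    trusted_networks: CIDR strings for proxies you control
    """
    trusted = [ipaddress.ip_network(n) for n in trusted_networks]

    def is_trusted(ip):
        addr = ipaddress.ip_address(ip)  # raises ValueError on garbage
        return any(addr in net for net in trusted)

    # If the sender is not one of our proxies, the header could be forged,
    # so ignore it and use the connection's own address.
    if not is_trusted(peer_ip):
        return peer_ip

    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    # The rightmost entries were appended by our own proxies; the first
    # untrusted one from the right is the client as seen by the edge proxy.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return peer_ip


# Example call, with the cloud proxy reaching the app over a 10.8.0.0/24 tunnel:
# client_ip_from_xff("198.51.100.23, 10.8.0.1", "10.8.0.1", ["10.8.0.0/24"])
```

Only trusting the header when the connection comes from the tunnel matters, because anyone can send their own X-Forwarded-For value.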
There may be ways if you are using IPv6, basically turning your cloud host into a router, but with IPv4 you would have to have a 1:1 mapping and set up the routing carefully to make it work.
Isn’t that what the logs on the proxy are for?
This is only true if the proxy can understand the application layer of the backend (eg. HTTP). For TCP/UDP based proxy, you only get “X connected to Y” type of logs, which isn’t very useful to debug an application.
You’re best off using the PROXY protocol assuming your application(s) support it.
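In case it helps to see what the PROXY protocol actually is: in version 1, the proxy sends one human-readable line in front of the real traffic, and the receiver reads and strips it. Below is a small Python sketch of both sides, assuming TCP only. The function names are made up, and a real implementation would also handle the “PROXY UNKNOWN” form and the header length limit.

```python
def build_proxy_v1(src_ip, src_port, dst_ip, dst_port):
    """Build the line a proxy sends before the client's own bytes."""
    family = "TCP6" if ":" in src_ip else "TCP4"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def parse_proxy_v1(data):
    """Split received bytes into ((src_ip, src_port, dst_ip, dst_port), payload).

    Raises ValueError if the data does not start with a TCP4/TCP6 v1 header.
    """
    line, sep, rest = data.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        raise ValueError("no PROXY v1 header")
    parts = line.decode("ascii").split(" ")
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported PROXY v1 header")
    _, _, src_ip, dst_ip, src_port, dst_port = parts
    return (src_ip, int(src_port), dst_ip, int(dst_port)), rest


# build_proxy_v1("198.51.100.23", 51234, "10.8.0.2", 22)
# gives b"PROXY TCP4 198.51.100.23 10.8.0.2 51234 22\r\n"
```

This is why the application has to support it: a backend that does not expect the extra line will treat it as garbage at the start of the connection.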
This is the solution. I reverse proxy from a digitalocean droplet running haproxy which sends traffic via send-proxy-v2, then I set the tunnel subnet as a trusted proxy ip range on traefik which is what haproxy hits through the tunnel, which causes traefik to substitute in the reverse proxied original ip so all my apps behind traefik see the correct public IP (very important for things like nextcloud brute force protection to work)
Would this work for my use case? I just want a service to be able to see the real source IPs but still going through a proxy
Depends on the service. What application are you running on the backend server ?
But I imagine this only works if TLS is terminated at HAProxy rather than Traefik, right? Otherwise how can HAProxy mess with the HTTP headers?
The way I would solve this is by putting nginx or another reverse proxy directly on your instance in the cloud. You can use this to set one of the well-known proxy headers, as others have mentioned, and have it then proxy to your backend instances over the VPN (even if it’s pointing to an internal nginx instance). Then the access logs on your cloud instance will also contain the actual IP address of the client. Setting headers will obviously only work for HTTP traffic; there really isn’t a similar mechanism for TCP/UDP traffic, as those are layer 4 and HTTP is layer 7. If you are concerned about it you can always ship the logs to somewhere on prem as well.
For TCP/UDP traffic, you’d just move the problem to another box. The application logs would report connections from 127.0.0.1 (the local proxy), and not the client IP.
Yep, you are correct, that’s what I was trying to say when I was talking about the logs on the public instance and forwarding them to a central place if that is important information. Sorry if it didn’t make sense, I must have been tired haha.
I forgot before: it is also possible to use the PROXY protocol for TCP applications, but the application will need to understand it for the client IP to show in the application logs. It would also be possible to use this to allow the on-prem instance (nginx->nginx, let’s say) to see the true client IP from the public instance; the exact configuration is implementation dependent though.
I realised I forgot to update this. Thank you to everyone that contributed, I appreciate it. This was a weird use case and barely anyone online has documented it, only a handful of places. Nevertheless, I figured it out.
So basically, you run HAProxy with the send-proxy-v2 option. Let’s say I’m forwarding SSH from the VPS to home: I’d have the VPS running HAProxy listening on port 22, and have it forward to home on port 220. Then, on the home server, you run this amazing piece of software called go-mmproxy. Configure that to listen on port 220 and forward to localhost port 22. And there you have it.
HAProxy passes the real source IP to go-mmproxy using the PROXY protocol. go-mmproxy takes the proxy header, strips it from the request, spoofs the source IP address from localhost to the real source IP contained in the header, and then makes the request to localhost. You also have to configure routing so that return traffic goes back through the loopback interface, where go-mmproxy can pick it up and relay it over the original connection to HAProxy, which sends it back to the source.
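For anyone curious what the header looks like on the wire, here is a rough Python sketch of the PROXY protocol v2 framing for TCP over IPv4: one function builds the kind of header send-proxy-v2 puts in front of the stream, the other strips it and recovers the real source address. This is not go-mmproxy’s code (that is written in Go); the function names are made up, and the sketch ignores IPv6, UDP and the LOCAL command.

```python
import socket
import struct

# Fixed 12-byte signature that starts every PROXY protocol v2 header.
SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def build_proxy_v2_tcp4(src_ip, src_port, dst_ip, dst_port):
    """Build a v2 header describing a TCP-over-IPv4 connection."""
    addresses = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    # 0x21 = version 2 + PROXY command, 0x11 = IPv4 + stream (TCP).
    return SIGNATURE + struct.pack("!BBH", 0x21, 0x11, len(addresses)) + addresses


def strip_proxy_v2(data):
    """Return ((src_ip, src_port), payload) with the v2 header removed."""
    if len(data) < 16 or data[:12] != SIGNATURE:
        raise ValueError("no PROXY v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != 0x21 or fam_proto != 0x11:
        raise ValueError("only PROXY over TCP/IPv4 is handled in this sketch")
    if length < 12 or len(data) < 16 + length:
        raise ValueError("truncated PROXY v2 header")
    body = data[16:16 + length]
    src_ip = socket.inet_ntoa(body[0:4])
    src_port, _dst_port = struct.unpack("!HH", body[8:12])
    # Anything after the declared header length is the client's own traffic.
    return (src_ip, src_port), data[16 + length:]


header = build_proxy_v2_tcp4("198.51.100.23", 51234, "10.8.0.2", 220)
source, payload = strip_proxy_v2(header + b"SSH-2.0-OpenSSH\r\n")
```

After stripping the header, the receiving side knows which source address to spoof when it opens the connection to localhost, and the payload is passed on untouched.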