I have done something magically wrong and I do not know what.
It was working up until today, when we tried to push a lot of HTTP traffic across it and it had a meltdown.
Last month we did the same thing without a problem.
There are 2 pairs of VRRP routers sharing a backend network, v1/2 and v3/4. They have different front ends, though (staging and production).
These are hosted on ECL2 (NTT COM’s OpenStack solution).
There are 5 VRRP addresses on the front and 1 on each of the back interfaces, all in sync-group VY.
front (separate networks): 10, 20, 30, 40, 50. These have global IP addresses.
back (shared network 1): 11 for staging, 101 for production. These are my DMZ; they have 10.41.0.0/x addresses.
back (shared network 2): 12 for staging, 102 for production. These are internal; they can get out and have 10.41.10.0/x addresses.
(I checked, 1/2 use one multicast address, 3/4 use another.)
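For anyone who wants to sanity-check the shared backend the same way, here is a minimal sketch of the check in Python. It assumes the numbers above (11/101 and 12/102) are VRRP group IDs (VRIDs), which is my reading of my own config and not something the tooling confirms. As far as I understand the VRRP spec, IPv4 advertisements all go to 224.0.0.18 and each group gets the virtual MAC 00:00:5e:00:01:{VRID}, so the VRID is what has to differ between two pairs on one segment. The names `vrrp_virtual_mac`, `vrid_clashes`, and `BACKEND_GROUPS` are hypothetical, made up for this illustration.

```python
from collections import defaultdict


def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC for a VRID (00:00:5e:00:01:XX)."""
    if not 1 <= vrid <= 255:
        raise ValueError(f"VRID must be in 1..255, got {vrid}")
    return f"00:00:5e:00:01:{vrid:02x}"


def vrid_clashes(groups):
    """Find VRIDs claimed by more than one router pair on the same network.

    groups is an iterable of (network, router_pair, vrid) tuples.
    Returns {(network, vrid): [router_pair, ...]} for clashing entries only.
    """
    owners = defaultdict(set)
    for network, pair, vrid in groups:
        owners[(network, vrid)].add(pair)
    return {
        key: sorted(pairs)
        for key, pairs in owners.items()
        if len(pairs) > 1
    }


# Assumption: the back-side numbers from the post, read as VRIDs.
BACKEND_GROUPS = [
    ("dmz", "v1/2", 11),
    ("dmz", "v3/4", 101),
    ("internal", "v1/2", 12),
    ("internal", "v3/4", 102),
]

if __name__ == "__main__":
    for network, pair, vrid in BACKEND_GROUPS:
        print(network, pair, vrid, vrrp_virtual_mac(vrid))
    print("clashes:", vrid_clashes(BACKEND_GROUPS))
```

If the clash dictionary comes back empty, the two pairs are at least not fighting over the same virtual MAC on the shared networks, which would point the search elsewhere (NAT or firewall state, for instance).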
[There is a chance this is idiotic, but that does not explain why it worked last month…]
There are NAT rules, nothing special, cribbed from the user guide.
There are firewall rules for “to the firewall” and “in”.
I can ping all the routers just fine.
But as of a few hours ago, I suddenly became unable to ping the servers behind them, and rebooting (the old Windows trick) doesn’t fix it.
From the router itself, I can ping both the inside and the outside just fine with no loss.
From the inside server, I have the same problem.
This mysteriously went away after ~4 hours.
All of this implies that crossing through VyOS was the problem.
What might I have done wrong? Besides crossing the streams, of course. Where might I look?
You are, of course, free to laugh at this. It’s been a decade since I last touched a router.