I run VyOS on an ESXi host and I notice that traffic going to and through VyOS experiences spikes of bad latency, while traffic to other guest VMs on the same host is fine. I see latency up to 1000 ms running mtr to and through VyOS. I have tested traffic all through my network and the bottleneck is VyOS.
Any suggestions on how to troubleshoot this? I am not using BGP/OSPF/EIGRP/etc., so maybe I can disable those services completely?
Might be my fault: the VM had 2 cores and 2 GB of RAM, so I kicked it up to 16 cores and 6 GB of RAM and it seems to be better. Seeing the minimum specs so low maybe gave me a false sense of how to size the VM.
You will get some overhead since you are running in a VM, and you are also at the mercy of the VM host.
How loaded is your VM host?
Do you have ballooning enabled on the VM host?
What's the physical CPU and what are the other specs?
How is this VM guest configured on the VM host (virtio or e1000 drivers, etc.)?
What is the interface speed, and how many interfaces are configured for your VM guest?
Does your VyOS handle other traffic, either routed/firewalled or generated by services run locally on it, such as encrypted/unencrypted tunnels, BGP/IS-IS/OSPF sessions, DHCP, etc.? Perhaps VRRP failover?
Also try logging in and running atop or htop to spot what happens on your system when these high-latency spikes occur.
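If watching atop or htop by eye is not practical, a small script can log the same counters. The following is only a rough sketch, assuming a Linux guest with Python 3 available and a /proc/stat that exposes the usual first eight CPU fields; the function names, the 1-second interval, and the 80% threshold are made up for illustration. It breaks out softirq time, since packet forwarding in the kernel is largely accounted there rather than to a named process, and steal time, which may simply read zero on some hypervisor and kernel combinations.

```python
import time

# Order of the first eight counters on each "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: ticks}} for every line starting with 'cpu'."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu"):
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            stats[parts[0]] = dict(zip(FIELDS, values))
    return stats


def usage_percent(prev, cur):
    """Percent of elapsed ticks spent in each field between two samples."""
    deltas = {f: cur.get(f, 0) - prev.get(f, 0) for f in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {f: 0.0 for f in FIELDS}
    return {f: 100.0 * d / total for f, d in deltas.items()}


def read_stat():
    with open("/proc/stat") as fh:
        return parse_proc_stat(fh.read())


def watch(interval=1.0, threshold=80.0):
    """Print a line whenever any CPU is busier than `threshold` percent.

    `interval` and `threshold` are illustrative defaults, not tuned values.
    """
    prev = read_stat()
    while True:
        time.sleep(interval)
        cur = read_stat()
        for cpu, now in cur.items():
            if cpu not in prev:
                continue
            pct = usage_percent(prev[cpu], now)
            busy = 100.0 - pct["idle"] - pct["iowait"]
            if busy >= threshold:
                stamp = time.strftime("%H:%M:%S")
                print(f"{stamp} {cpu} busy={busy:.0f}% "
                      f"softirq={pct['softirq']:.0f}% "
                      f"steal={pct['steal']:.0f}%")
        prev = cur
```

Calling `watch()` while mtr is running in another session should show whether the spikes line up with one saturated core and whether the time is going to softirq.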
They can still be caused by other things out in your network.
Did this and couldn't really identify a particular service, but could identify CPU spikes during the high latency.
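For anyone wanting to confirm that kind of correlation with numbers rather than by eye, here is a minimal sketch. It assumes the latency samples come from iputils `ping -D` (which prefixes each reply with a Unix timestamp) and that CPU busy percentages were logged separately as (timestamp, percent) pairs; the function names and the 100 ms threshold are hypothetical, not anything from VyOS.

```python
import re

# Matches iputils `ping -D` reply lines such as:
# [1700000000.123456] 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.431 ms
PING_LINE = re.compile(r"^\[(\d+\.\d+)\].*\btime=([\d.]+) ms")


def parse_ping(text):
    """Return a list of (unix_timestamp, rtt_ms) tuples from `ping -D` output."""
    samples = []
    for line in text.splitlines():
        match = PING_LINE.search(line)
        if match:
            samples.append((float(match.group(1)), float(match.group(2))))
    return samples


def spikes_with_cpu(ping_samples, cpu_samples, rtt_threshold_ms=100.0):
    """Pair each slow ping with the CPU sample closest to it in time.

    cpu_samples is a list of (unix_timestamp, busy_percent) tuples that
    the reader is assumed to have collected separately.
    Returns a list of (timestamp, rtt_ms, busy_percent) tuples.
    """
    paired = []
    for ts, rtt in ping_samples:
        if rtt < rtt_threshold_ms or not cpu_samples:
            continue
        nearest = min(cpu_samples, key=lambda s: abs(s[0] - ts))
        paired.append((ts, rtt, nearest[1]))
    return paired
```

If most slow replies pair with a high busy percentage, that points at the router VM's CPU rather than something further out in the network.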
I realized this was likely a resource issue after I posted. Since cranking up the CPU and memory it's been smooth sailing. I think sometimes typing it out helps.