Drive filling up?

Here’s a new one: I just got an alert about drive utilization on one of my systems running 1.3.0:

Filesystem      Size  Used Avail Use% Mounted on
udev            7.9G     0  7.9G   0% /dev
tmpfs           1.6G  170M  1.4G  11% /run
/dev/sda1        32G   26G  4.7G  85% /usr/lib/live/mount/persistence
/dev/loop0      281M  281M     0 100% /usr/lib/live/mount/rootfs/1.3.0.squashfs
tmpfs           7.9G     0  7.9G   0% /usr/lib/live/mount/overlay
overlay          32G   26G  4.7G  85% /
tmpfs           7.9G     0  7.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
tmpfs           7.9G   52K  7.9G   1% /tmp
tmpfs           7.9G  184K  7.9G   1% /var/tmp
none            7.9G  1.2M  7.9G   1% /opt/vyatta/config
tmpfs           1.6G     0  1.6G   0% /run/user/1003

Can anyone point me in the right direction as to why /usr/lib/live/mount/persistence is at 26G used? This system has been running since March, so not very long in the grand scheme of things. The 3 other systems I set up at the same time aren’t seeing the same drive utilization, so I’m unclear why this one system is so high.
Thank you so much for your time!

Check the output of these two commands:

find /usr/lib/live/mount/persistence -type f -size +50M
show system image

Thanks for your reply @Viacheslav,

The find command shows 2 files:

vyos@Vy1:~$ ls -laht /usr/lib/live/mount/persistence/boot/1.3.0/rw/var/log/auth.log.1
-rw-r----- 1 root adm 111M Oct 28 06:05 /usr/lib/live/mount/persistence/boot/1.3.0/rw/var/log/auth.log.1
vyos@Vy1:~$ ls -laht /usr/lib/live/mount/persistence/boot/1.3.0/1.3.0.squashfs
-r--r--r-- 1 root root 281M Dec 28  2021 /usr/lib/live/mount/persistence/boot/1.3.0/1.3.0.squashfs

Results of show system image

vyos@Vy1:~$ show system image
The system currently has the following image(s) installed:

   1: 1.3.0 (default boot)

Try checking the logs, or search from /:

find /var/log -type f -size +50M
or
find / -type f -size +50M
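
A variation on the same search, in case it helps: a minimal sketch that stays on one filesystem (so tmpfs, /proc and other mounts are skipped) and prints the largest matches first. The function name largest_files and its arguments are made up for illustration, and -printf assumes GNU find, which a Debian-based system should have.

```shell
#!/bin/sh
# largest_files DIR MIN_MIB [COUNT]
# Print "size_in_bytes<TAB>path" for files larger than MIN_MIB MiB under DIR,
# largest first, without crossing into other mounted filesystems (-xdev).
largest_files() {
    dir=$1
    min_mib=$2
    count=${3:-20}
    find "$dir" -xdev -type f -size +"${min_mib}"M -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        head -n "$count"
}
```

For example, largest_files /usr/lib/live/mount/persistence 50, run as root to avoid permission errors.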

No luck there either. I do have a large(ish) auth log at 111M, /usr/lib/live/mount/persistence/boot/1.3.0/1.3.0.squashfs is 281M, and a couple of other files come up as over 50M, but there aren’t many of them and nothing larger.

Check whether there are many files in boot or its subfolders:

$ sudo du -hs /usr/lib/live/mount/persistence/boot/*
$ sudo du -hs /usr/lib/live/mount/persistence/boot/1.3.0/*

@a.srividya thanks for your reply
I checked, and that still only accounts for 515M:

vyos@Vy1:~$ sudo du -hs /usr/lib/live/mount/persistence/boot/
515M    /usr/lib/live/mount/persistence/boot/
vyos@Vy1:~$ sudo du -hs /usr/lib/live/mount/persistence/boot/1.3.0/
510M    /usr/lib/live/mount/persistence/boot/1.3.0/

That just leaves us with the persistence folder itself, so you need to check that as well. Please make sure to put * at the end of the command:

$ sudo du -hs /usr/lib/live/mount/persistence/*

Also run this command to see if any files are hidden:

sudo ls -lSha /usr/lib/live/mount/persistence

You got it, here’s the output

vyos@Vy1:~$ sudo du -hs /usr/lib/live/mount/persistence/*
510M    /usr/lib/live/mount/persistence/boot
16K     /usr/lib/live/mount/persistence/lost+found
4.0K    /usr/lib/live/mount/persistence/persistence.conf
vyos@Vy1:~$ sudo ls -lSha /usr/lib/live/mount/persistence
total 32K
drwx------ 2 root root  16K Mar 16  2022 lost+found
drwxr-xr-x 4 root root 4.0K Mar 16  2022 .
drwxr-xr-x 5 root root 4.0K Mar 16  2022 ..
drwxr-xr-x 4 root root 4.0K Mar 16  2022 boot
-rw-r--r-- 1 root root    8 Mar 16  2022 persistence.conf
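
The mismatch here (df reports 26G used, du finds about 510M) can be put into a number. A rough sketch, assuming GNU df and du; the function names are made up for illustration. A small gap is normal, but a large one generally means space is held by something du cannot see, such as files hidden underneath a mount point or files that were deleted while a process still has them open.

```shell
#!/bin/sh
# Bytes the filesystem containing PATH reports as used (GNU df).
fs_used_bytes() {
    df -B1 --output=used "$1" | tail -n 1 | tr -d ' '
}

# Bytes du can account for under PATH, staying on one filesystem (-x).
du_bytes() {
    du -sxB1 "$1" 2>/dev/null | cut -f1
}

# Difference between the two: space in use that is not reachable by walking
# the directory tree from PATH. Only meaningful when PATH is the mount point.
unaccounted_bytes() {
    echo $(( $(fs_used_bytes "$1") - $(du_bytes "$1") ))
}
```

On this system that would be unaccounted_bytes /usr/lib/live/mount/persistence, run as root so du can read everything.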

Check for any deleted files:

sudo lsof | egrep 'deleted|COMMAND'

If you find large files, you can try a graceful shutdown first and, if that doesn’t help, you will have to kill the specific PID.

DING DING DING, I think that’s the culprit, thank you so much @a.srividya!

vyos@Vy1:~$ sudo lsof | egrep 'deleted|COMMAND'
COMMAND     PID  TID TASKCMD          USER   FD      TYPE             DEVICE    SIZE/OFF       NODE NAME
conntrack 20289                       root    3u      REG                8,1 26606285948    1049933 /var/log/conntrackd-stats.log.3 (deleted)

Looks like there’s a deleted, VERY large conntrackd-stats.log file listed at 26G. I’ll try a reboot and see if that clears it.
I checked the other conntrackd-stats.log files and they’re all empty, so why this rotated-out one is so large is still a mystery. Rebooting is an easy fix right now, but disruptive if this recurs.
Any ideas why this large file exists in the first place?
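
On avoiding a disruptive reboot: a deleted file keeps its disk blocks until the last process holding it open closes it, and on Linux it stays reachable through /proc/PID/fd in the meantime. The sketch below lists such descriptors for one process and truncates one of them in place. The function names are made up for illustration, and this is untested against conntrackd on 1.3.0, so treat it as an idea rather than a verified procedure.

```shell
#!/bin/sh
# deleted_fds PID
# Print "FD<TAB>TARGET" for every descriptor of PID whose file has been unlinked.
deleted_fds() {
    for fd in /proc/"$1"/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s\t%s\n' "${fd##*/}" "$target" ;;
        esac
    done
}

# truncate_fd PID FD
# Cut the file behind /proc/PID/fd/FD to zero length so its blocks are freed
# while the owning process keeps running with the descriptor still open.
truncate_fd() {
    : > "/proc/$1/fd/$2"
}
```

With the values from the lsof output above (PID 20289, FD 3), that would be deleted_fds 20289 followed by truncate_fd 20289 3, both as root. If the process is not writing in append mode, the file's apparent size may jump back on the next write, but the blocks released by the truncation stay free.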

There is a bug report for this issue, and it has been fixed in a later release:

https://phabricator.vyos.net/T4259

Workaround:

sudo sed -i 's/$CONNTRACKD_BIN -C $CONNTRACKD_CONFIG -d/systemctl restart conntrackd.service/' /usr/libexec/vyos/vyos-vrrp-conntracksync.sh
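
Before editing a system script in place, it may be worth previewing what the substitution would change. A small sketch: the helper name preview_workaround is made up for illustration, and nothing here assumes what the script actually contains beyond the pattern in the command above.

```shell
#!/bin/sh
# preview_workaround FILE
# Print only the lines the workaround would rewrite, in their rewritten form,
# without modifying FILE (-n suppresses normal output, the p flag prints matches).
# The single quotes keep the shell from expanding $CONNTRACKD_BIN and
# $CONNTRACKD_CONFIG, so sed matches them as literal text.
preview_workaround() {
    sed -n 's/$CONNTRACKD_BIN -C $CONNTRACKD_CONFIG -d/systemctl restart conntrackd.service/p' "$1"
}
```

Running preview_workaround /usr/libexec/vyos/vyos-vrrp-conntracksync.sh should print the rewritten line(s) if the pattern matches and nothing if it does not. Using sed -i.bak instead of sed -i in the real command keeps a copy of the original file next to it.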

@a.srividya Nice, that makes sense. Thank you so much for your help!
