Where’d my space go?

It’s never good when “FATAL:Running out of space (100%)” pops up on your production alerts Slack channel, but there it was. A quick check revealed that the root file system was, in fact, full:

[root@gluster-01-ao ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv
                      27G   27G     0 100% /

No room at the inn.

Usually in this case, a quick (or not-so-quick) ‘du -shx’ reveals some giant log file that logrotate missed, or a pile of old Docker images. Today, however, I got nothing:

[root@gluster-01-ao ~]# du -shx /
2.0G /
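
For the record, the usual way I chase this down is to walk the tree one level at a time, something like the sketch below (the flags are GNU coreutils, so adjust for your distro):

# Stay on the root filesystem (-x) and list the biggest top-level directories.
du -xh --max-depth=1 / | sort -rh | head -20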

Huh? That seems odd. Where’s my missing 25G of data? The next usual suspect is a process hanging on to a file that’s been deleted; ‘lsof | grep deleted’ will show any process still holding a zombie file open. Nada. We even rebooted the server, but it came back up still full.
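
For completeness, that check looks something like this; ‘+L1’ asks lsof for open files whose link count has dropped to zero, i.e. files that have been deleted but are still held open:

# Open-but-deleted files; either of these should surface them.
lsof +L1
lsof | grep deleted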

After some more in-depth trawling of Stack Overflow, we finally came to the answer: files hidden under mount points. We had several GlusterFS shares mounted on this server that we used for rsyncs between environments. We unmounted them one by one, and as soon as one of them came off, voilà:

[root@gluster-01-ao ~]# du -shx /
27G /
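
If you can’t afford to unmount things on a live box, a plain bind mount gives you a second view of the root filesystem that doesn’t carry the overlay mounts with it, so anything buried under a mount point shows up; /mnt/root-peek below is just an example path:

mkdir -p /mnt/root-peek
mount --bind / /mnt/root-peek   # second view of /, without the mounts stacked on top
du -shx /mnt/root-peek          # now includes files hidden under mount points
umount /mnt/root-peek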

As far as we can tell, one of the Gluster mounts failed at some point, but the rsync script kept dumping files into the mount directory on the local root disk. When the Gluster mount came back, it sat on top of those files and hid them from du, while df still counted the space they were using.
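
The obvious hardening for the rsync script is to refuse to run unless the target really is a mount point; the paths below are made up, but the guard is the important part:

# Bail out if the Gluster share isn't actually mounted, so we never
# dump files into the empty directory sitting underneath it.
if ! mountpoint -q /mnt/gluster-share; then
    echo "gluster share not mounted, refusing to rsync" >&2
    exit 1
fi
rsync -a /srv/source/ /mnt/gluster-share/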

Now I know where I can hide data on my Linux servers if I need to…
