rentzsch.com: tales from the red shed

Union Filesystems

Mac OS X

Mac OS X inherits some interesting features from FreeBSD. One of them is support for "union filesystems".

A union filesystem is kind of like onion-skin paper for your filesystem. You take one filesystem and "overlay" it on top of another. Your system then presents one logical filesystem that combines both in a very specific way.

All files that are in the base filesystem show up in the unified presentation. In addition, any files in the overlaying filesystem also appear. When a file in the overlay has the same path as a file in the base, the overlay's copy takes precedence.
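To make the precedence rule concrete, here's a minimal Python sketch that models each filesystem as a plain dictionary from path to contents. The `union_view` name and the dictionary model are my own illustration, not anything Mac OS X exposes.

```python
def union_view(base, overlay):
    """Return the merged path -> contents mapping a union mount would present."""
    merged = dict(base)     # every base file shows through...
    merged.update(overlay)  # ...unless the overlay has a file at the same path
    return merged


# Hypothetical contents, chosen to mirror the demo files used later on.
base = {"file1": "original\n"}
overlay = {"file1": "replacement\n", "file2": "original\n"}

view = union_view(base, overlay)
print(sorted(view))           # ['file1', 'file2']
print(view["file1"], end="")  # replacement
```

Neither input dictionary is modified. The view is just a lookup rule, which is why popping the overlay off later reveals the base exactly as it was.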

More intriguing is what happens when you create or edit a file. The base remains untouched -- only the overlaying filesystem gets modified. As far as I can tell, if you open a megabyte file that exists only in the base and flip a single byte in the middle of it, the union filesystem copies the entire file from the base to the overlay and then applies your modification to that copy.
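Here's a rough Python sketch of that copy-on-write behavior, assuming the whole-file copy really happens as guessed above. The `write_byte` helper and the dictionary-of-bytes model are hypothetical stand-ins for the real filesystems.

```python
def write_byte(base, overlay, path, offset, value):
    """Change one byte through the union; only the overlay is ever modified."""
    if path not in overlay:
        # Copy-up: duplicate the entire base file into the overlay first.
        overlay[path] = bytearray(base[path])
    overlay[path][offset] = value


ONE_MB = 1024 * 1024
base = {"big": bytes(ONE_MB)}  # a 1 MB file of zeros that exists only in the base
overlay = {}

write_byte(base, overlay, "big", ONE_MB // 2, 0xFF)

print(len(overlay["big"]))            # 1048576 (the overlay holds a full copy)
print(overlay["big"][ONE_MB // 2])    # 255
print(base["big"][ONE_MB // 2])       # 0 (the base is unchanged)
```

One flipped byte costs a full megabyte in the overlay, which is worth keeping in mind when sizing the overlay image.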

However, the base filesystem isn't wholly protected -- deleting a file from the unified system will "pierce" the overlay and delete the file in the base (with an exception we'll get into).

But discussing theoretical union semantics is bore-ring. Instead, let's mess around and ogle pretty figures. Feel free to play along at home.

First up, we'll be doing wacky filesystem-related stuff, so we need to jump to superuser:

wolf:~ wolf$ sudo -s
Password:
wolf:~ root#

OK, now we're dangerous. Create a new test folder and file:

wolf:~ root# mkdir demo
wolf:~ root# echo 'original'>demo/file1

Check our new reality:

wolf:~ root# ls -l demo
total 8
-rw-r--r-- 1 root staff 9 22 Apr 01:13 file1

Thar she blows. Visually, we have something like this:

Now to create a new big sparse disk image:

wolf:~ root# hdiutil create demo_overlay -volname demo_overlay -size 1g -type SPARSE -fs HFS+J
Initializing...
Creating...
Formatting...
Finishing...
created: /Volumes/Island/wolf/demo_overlay.sparseimage

Here's the meat: mounting the new file system "over" the existing folder:

wolf:~ root# hdiutil attach demo_overlay.sparseimage -mountpoint demo -union
Initializing...
Attaching...
Finishing...
Finishing...
/dev/disk2       Apple_partition_scheme     
/dev/disk2s1      Apple_partition_map      
/dev/disk2s2      Apple_HFS            /Volumes/Island/wolf/demo

Continuing our onion-skin analogy, we have something like this:

With our overlay in place, let's create a new file and see what happens:

wolf:~ root# echo 'original'>demo/file2
wolf:~ root# ls -l demo
total 16
d-wx-wx-wt 3 root unknown 102 22 Apr 01:14 .Trashes
-rw-r--r-- 1 root staff   9 22 Apr 01:13 file1
-rw-r--r-- 1 root unknown  9 22 Apr 01:14 file2

Visually, we now have this:

We can still access both layers as if they were one:

wolf:~ root# cat demo/file*
original
original

If we attempt to edit a file that doesn't exist in the overlay, it will be copied there on our behalf:

wolf:~ root# echo 'replacement'>demo/file1
wolf:~ root# cat demo/file*
replacement
original

Holy implicitly duplicated files, Batman!

So far we've added a file and edited an existing file. However, all those changes took place in the overlay -- the original folder remains untouched. Here, let me pop off the overlay and prove it to you:

wolf:~ root# hdiutil detach /dev/disk2
"disk2" unmounted.
"disk2" ejected.
wolf:~ root# ls -l demo
total 8
-rw-r--r-- 1 root staff 9 22 Apr 01:13 file1
wolf:~ root# cat demo/file*
original

It gets kind of weird when deleting files: you have a scenario where you delete a file, you don't get an error back, but the file is still there!

wolf:~ root# hdiutil attach demo_overlay.sparseimage -mountpoint demo -union
Initializing...
Attaching...
Finishing...
Finishing...
/dev/disk2       Apple_partition_scheme     
/dev/disk2s1      Apple_partition_map      
/dev/disk2s2      Apple_HFS            /Volumes/Island/wolf/demo
wolf:~ root# cat demo/file*
replacement
original
wolf:~ root# rm demo/file1
wolf:~ root# ls -l demo
total 16
d-wx-wx-wt 3 root unknown 102 22 Apr 01:14 .Trashes
-rw-r--r-- 1 root staff   9 22 Apr 01:13 file1
-rw-r--r-- 1 root unknown  9 22 Apr 01:14 file2

Well, it's not really the same file -- we only deleted the overlay's file1, not the base:

wolf:~ root# cat demo/file1
original

If we delete again, this time we'll wind up slashing the base file:

wolf:~ root# rm demo/file1
wolf:~ root# ls -l demo
total 8
d-wx-wx-wt 3 root unknown 102 22 Apr 01:14 .Trashes
-rw-r--r-- 1 root unknown  9 22 Apr 01:14 file2
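
The two-step delete is easier to follow as a tiny Python model. This is only a sketch of the behavior observed in the transcript above. The `union_rm` and `union_cat` helpers are made-up names, and the dictionaries stand in for the two filesystems.

```python
def union_cat(base, overlay, path):
    """Read a file through the union: the overlay's copy wins if it exists."""
    return overlay[path] if path in overlay else base[path]


def union_rm(base, overlay, path):
    """Remove only the topmost copy of a file, as the transcript suggests."""
    if path in overlay:
        del overlay[path]  # the base copy, if any, shows through again
    elif path in base:
        del base[path]     # nothing left on top: the delete pierces to the base
    else:
        raise FileNotFoundError(path)


base = {"file1": "original\n"}
overlay = {"file1": "replacement\n", "file2": "original\n"}

union_rm(base, overlay, "file1")
print(union_cat(base, overlay, "file1"), end="")  # original

union_rm(base, overlay, "file1")
print("file1" in base)                            # False
```

The first delete reports success yet the path still resolves, because it now falls through to the base copy. Only the second delete removes the file for good.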

Welcome to the wacky world of union filesystems. They're rather esoteric, but I'm sure someone, somewhere has a pressing problem that they perfectly solve.

Unless it involves attempting to mount an overlay over root. That just kernel panics the machine...

Update: Piers Uso Walter wrote me about how Mac OS X's union filesystem implementation doesn't support "file whiteout":

What you report about deleting files demonstrates a major difference between Mac OS X and FreeBSD/NetBSD.

In FreeBSD/NetBSD, deleting a file in the base system should not really delete this file, but rather add a 'whiteout' file to the overlay. If the file system of the overlay does not support whiteout files, deleting files in the base system should not be possible at all. See 1 2 3, or check 'BUGS' in 4.
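
For contrast, here's how a whiteout-style delete might look in the same kind of toy Python model. It's based only on the description quoted above, not on the actual FreeBSD/NetBSD code. The `whiteouts` set and both function names are assumptions made for illustration.

```python
def union_rm_whiteout(base, overlay, whiteouts, path):
    """Delete through the union without ever touching the base."""
    if path not in overlay and (path not in base or path in whiteouts):
        raise FileNotFoundError(path)
    overlay.pop(path, None)  # drop the overlay's copy, if it has one
    if path in base:
        whiteouts.add(path)  # record a whiteout that hides the base file


def union_ls(base, overlay, whiteouts):
    """List the paths visible through the union."""
    return sorted(set(overlay) | (set(base) - whiteouts))


base = {"file1": "original\n"}
overlay = {"file1": "replacement\n", "file2": "original\n"}
whiteouts = set()

union_rm_whiteout(base, overlay, whiteouts, "file1")
print(union_ls(base, overlay, whiteouts))  # ['file2']
print(base)                                # {'file1': 'original\n'}
```

In this model a single delete hides the file everywhere, and the base survives intact underneath the whiteout.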

Friday, April 22, 2005
12:00 AM