Abusing Solaris attempt #2: stressing out ZFS, PART2

In my last post, the files were being written to an IDE hard disk. Now let’s see what happens if we write to /tmp instead. Will Solaris cope with ten million files in /tmp? First, if we want to make use of compression, we need to make a ZFS file system:

We create the backing files (ZFS can use plain files instead of real disks…):

anton@solaris-devx ~ $ mkfile 100M /tmp/file1
anton@solaris-devx ~ $ mkfile 100M /tmp/file2

and then su to root to make the ZFS file system (mirrored):

# zpool create crazedPool mirror /tmp/file1 /tmp/file2
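The session above doesn’t show compression actually being switched on. It is a per-dataset property, and as far as I know it was off by default on Solaris releases of this era. A minimal sketch of that step, assuming the pool name crazedPool from the command above and a root shell; the existence check and the status variable are illustrative additions so the snippet does nothing on a machine without that pool:

```shell
pool=crazedPool   # pool name taken from the zpool create command above

# Only touch the pool if zfs is installed and the pool actually exists here.
if command -v zfs >/dev/null 2>&1 && zfs list "$pool" >/dev/null 2>&1; then
    zfs set compression=on "$pool"              # enable compression on the pool's root dataset
    zfs get compression,compressratio "$pool"   # show the setting and the current ratio
    status="compression enabled on $pool"
else
    status="pool $pool not found; nothing changed"
fi
echo "$status"
```

Compression only applies to blocks written after the property is set, so it has to happen before the test files are created.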

I should note that for some reason ZFS didn’t make use of the entire file size:

# zfs list crazedPool
NAME         USED  AVAIL  REFER  MOUNTPOINT
crazedPool   110K  63.4M    20K  /crazedPool

And now the real test. How about a big file? Let’s say 100G:

anton@solaris-devx dir1 $ time mkfile 100G woot
real 1m21.995s
user 0m0.191s
sys 0m30.308s

And what about 10000 files, each 10M in size?

anton@solaris-devx dir1 $ i="0"
anton@solaris-devx dir1 $ time while [ $i -lt 10000 ]
> do
> mkfile 10M la0$i
> i=$[$i+1]
> done
real 1m46.789s
user 0m4.665s
sys 0m43.492s

So far, so good. Now let’s push the envelope off the desk. Or maybe off a cliff. Let’s see what happens when we make a 100TB file with ZFS!

anton@solaris-devx dir1 $ ls -l megaFile
-rw------- 1 anton staff 107374182400000 Mar 15 18:05 megaFile
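A quick sanity check on that byte count: it works out to exactly 100000 GiB, which is a little under 100 TiB in binary units. The sketch below only redoes the arithmetic on the number from the ls output; the variable names are just for illustration.

```shell
size_bytes=107374182400000            # size reported by ls -l for megaFile
gib=$((1024 * 1024 * 1024))           # bytes in one GiB
size_gib=$((size_bytes / gib))        # whole GiB in the file
remainder=$((size_bytes % gib))       # bytes left over after whole GiB
size_tib=$((size_gib / 1024))         # whole TiB, rounded down
echo "$size_gib GiB (remainder $remainder bytes), about $size_tib TiB"
# prints: 100000 GiB (remainder 0 bytes), about 97 TiB
```

Either way, that is vastly more than the 63.4M the pool reports as available.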

And the compression ratio?

anton@solaris-devx tmp $ zfs get compressratio crazedPool
NAME         PROPERTY       VALUE  SOURCE
crazedPool   compressratio  1.00x  -

Hmm, not quite what I was expecting!
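One way to see whether a file’s data occupies any space at all is to compare its apparent size with what du reports as allocated, which is the check the first reply below suggests running on the pool. The sketch here uses an ordinary sparse scratch file as a stand-in, not the ZFS pool itself; the file and the 64 MiB size are assumptions for illustration, and the allocated figure depends on the file system the scratch file lands on.

```shell
f=$(mktemp)   # scratch file, a hypothetical stand-in for megaFile

# Seek 64 MiB into the output and copy zero blocks:
# the file gets a length, but no data is written.
dd if=/dev/zero of="$f" bs=1024 count=0 seek=65536 2>/dev/null

apparent=$(wc -c < "$f" | tr -d ' ')            # logical size in bytes
allocated_kb=$(du -k "$f" | awk '{print $1}')   # space actually allocated, in KiB

echo "apparent: $apparent bytes, allocated: $allocated_kb KiB"
rm -f "$f"
```

On a file system that supports holes, the allocated figure should come out far smaller than the apparent size.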


7 Responses to “Abusing Solaris attempt #2: stressing out ZFS, PART2”

  1. Jeff Bonwick Says:

    Yep, this is an artifact of the implementation. When you enable compression, the first thing we do is scan the block to see if it’s all zeroes. If so, we simply discard the block and mark it as a hole in the block tree. So a compressed block of zeroes consumes no space. However, because it has no associated block pointer, it doesn’t enter either the numerator or the denominator of the compression ratio. You will see, however, if you run du -sh on the file, that it consumes zero bytes.

    We considered addressing this by distinguishing between a true hole and a block of zeroes by using a special block pointer value, but doing so would just complicate the code without solving any real problem… so we erred on the side of simplicity.

  2. Rishi Says:

    Keep on going, and the chances are that you will stumble on something, perhaps when you are least expecting it. I never heard of anyone ever stumbling on something sitting down ;-)

  3. Anton Parol Says:

    Ah ha! So the next test should fill the files with all 1’s instead, and the compression ratio will go up!! Makes a lot of sense. But Jeff, when going from a ZFS filesystem to UFS (i.e. not compressed), how does it know to make a block pointer to re-create all those zeros?

    /me opens up opensolaris.org and tries to make sense of C code!
    Ant

  4. Jeff Bonwick Says:

    Simple: when you ask to read some block of a file, if the pointer to that block is a null pointer, that means it’s a hole, so we just return a buffer full of zeroes. Any filesystem that supports holey files does more or less the same thing; the only thing that’s different about ZFS is that when compression is enabled, we deliberately convert runs of zeroes into holes.

  5. Mark J Musante Says:

    Of course you’ll get a better ratio compressing 1’s than compressing 0’s, because obviously 1’s are bigger than 0’s!

    Seriously, though, the reason you only got around 64MB in your mirror pool of 100MB files is that ZFS uses around 32MB of overhead. In proper pools, you won’t miss it. But in such a tiny pool as crazedPool, it’s a bit more obvious.

  6. Anton Parol Says:

    Mark, is that a minimum overhead? i.e. is that the least that ZFS has to take in order to work?
    Something else that interests me is how much space null block pointers take up, i.e. the pointer itself!

  7. parolski.com » Blog Archive » Abusing Solaris with style Says:

    [...] Solaris to goo with destructive commands is something I’ve been enjoying for a while now, so its great to see someone add cash to the equation and take the next step in [...]
