Wednesday, March 9, 2011

Is it turned On?

Question #0 on any tech troubleshooting checklist is, as you well know, 'Is it plugged in?'. This is followed closely by question #1, 'Is it turned On?'. Sometimes I have to relearn this the hard way.

Those joining the party late will have missed yesterday's exciting episode, at the end of which the intrepid hero is left scratching his head at the failure of his Shiny New SSD to outperform his clunky HDD. Let's move the incredibly riveting plot forward a bit with some hot command line action scenes:

$ dd if=/dev/zero of=/tmp/ssd/foo bs=4k count=1000000 oflag=direct

32.9 MB/s

This sucks. Let's swap in the noop scheduler, downgrade the journalled ext3 to ext2, and change the mount to noatime.
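
Roughly what that looked like, for anyone following along at home (run these as root; sdb, sdb1 and the mount point are guesses at this box's layout, and tune2fs wants the filesystem unmounted first):

$ echo noop > /sys/block/sdb/queue/scheduler
$ umount /tmp/ssd
$ tune2fs -O ^has_journal /dev/sdb1
$ mount -o noatime /dev/sdb1 /tmp/ssd

Then re-run the benchmark: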

$ dd if=/dev/zero of=/tmp/ssd/foo bs=4k count=1000000 oflag=direct

35.0 MB/s

Better, but still sucking.

$ dd if=/dev/zero of=/tmp/ssd/foo bs=8k count=500000 oflag=direct

58.2 MB/s

$ dd if=/dev/zero of=/tmp/ssd/foo bs=16k count=250000 oflag=direct

87.6 MB/s
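
The sweep is easy enough to script if you want more data points; something along these lines (assuming bash and the same mount point) keeps the total bytes written constant while the block size ramps up:

$ for bs in 4k 8k 16k 32k 64k; do dd if=/dev/zero of=/tmp/ssd/foo bs=$bs count=$((4000000 / ${bs%k})) oflag=direct 2>&1 | tail -1; done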

OK, so throughput scales directly with the block size. Ramp the block size up high enough and we saturate the drive at over 200MB/s. But what is limiting the number of blocks we can throw at the device? The hardware spec rates it at 50000 x 4k IOPS, which would be 195MB/s. Let's throw a few more processor cores at the problem just for the hell of it:

$ dd if=/dev/zero of=/tmp/ssd/foo1 bs=4k count=1000000 oflag=direct &

$ dd if=/dev/zero of=/tmp/ssd/foo2 bs=4k count=1000000 oflag=direct &

$ dd if=/dev/zero of=/tmp/ssd/foo3 bs=4k count=1000000 oflag=direct &

$ dd if=/dev/zero of=/tmp/ssd/foo4 bs=4k count=1000000 oflag=direct &

10.5 MB/s
10.5 MB/s
10.5 MB/s
10.5 MB/s

Well, 4 x 10.5 = 42 > 35, but nowhere near a linear speedup. Something is fishy here. SATA should do NCQ, which would allow all four of those processes (indeed, up to 32 requests) to have writes outstanding at once, so we should be soaking up a lot more of that lovely bandwidth.
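
A quick sanity check, assuming the SSD shows up as sda, is to ask the kernel how deep a queue it thinks the device has:

$ cat /sys/block/sda/device/queue_depth
$ dmesg | grep -i ncq

An NCQ-capable setup should report a depth of 31 or 32; a depth of 1 means no queueing is happening at all.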

Unless...

$ lsmod | grep libata
libata 209361 1 piix

umm, oops.

The Intel ICH10R on our P6X58D-E is running in legacy IDE mode, because someone didn't check the BIOS settings carefully enough when building the machine. Not that I have any clue who that may have been. No, Sir, not at all.
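
For the record, lspci can also show which driver has claimed the controller (the -k flag assumes a reasonably recent pciutils, and the exact description strings vary by chipset and BIOS):

$ lspci -k | grep -i -A 2 sata

In AHCI mode the "Kernel driver in use" line reads ahci; in legacy mode it's ata_piix.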

Ahem. Let's reboot, shall we...

$ lsmod | grep libata
libata 209361 1 ahci

Right, that's better. Off we go again:

$ dd if=/dev/zero of=/tmp/ssd/foo bs=4k count=1000000 oflag=direct
76.8 MB/s

Double the speed. Not too bad for five minutes' work, even if it did require walking all the way down the hall to the machine room.

$ dd if=/dev/zero of=/tmp/ssd/foo1 bs=4k count=1000000 oflag=direct &

$ dd if=/dev/zero of=/tmp/ssd/foo2 bs=4k count=1000000 oflag=direct &

34.5 MB/s

34.5 MB/s

Huh?

With libata now correctly driving the SSD with all its features supported, those concurrent processes should be getting 70+MB/s each, not sharing it. Grrr.
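
One caveat worth remembering: dd with oflag=direct only ever has a single write outstanding per process, so four of them present at most four queued requests. A tool like fio can actually fill the queue; a rough sketch, assuming fio is installed and /tmp/ssd is still the mount point:

$ fio --name=qwrite --directory=/tmp/ssd --rw=write --bs=4k --size=1G --direct=1 --ioengine=libaio --iodepth=32 --numjobs=4 --group_reporting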

Oh well, let's see how the transaction system is doing, shall we? It's writing a single file at a time anyhow. Since we were already running at over 36k tx/s against a theoretical max of 55k, we can't expect the 2x speedup the raw dd numbers would suggest, but we should see some improvement...

30116 tx/second.

Some days it's just not worth getting out of bed.

Change request to Clebert: make the block size in the Journal a config option.
