iPad: Yet Another Opinion

Here are my initial, general thoughts about the much-hyped iPad. Clearly the world doesn’t need another blog post about this, but it sets the stage for something coming next.

  • As many have observed, iPad is most easily summarized as a larger iPod Touch, plus some of the mobile data capability of an iPhone. Although this has been expressed widely as a criticism, I note that a very large number of people have bought an iPod Touch or iPhone.
  • By making the iPad fit the above description so well, I fear that there is a tinge of Apple playing it safe for Wall Street. Playing it safe has not been the strategy that invigorated Apple (and its financial performance) over the last decade.
  • This iPad “1.0” is somewhat short on hardware features. I suspect a second generation device will arrive in 2011 with a few more ports, more storage, more wireless, etc. 1.0 only has to be good enough to prime the market for 2.0.
  • The screen needs more pixels; the resolution / DPI is unimpressive. Also, OLED would have been nice; but Apple had to trade off some things to get to a price point, and the screen technology was obviously one of them.
  • The battery life Apple claims, even if it is vaguely close to reality, is fantastic.
  • I am surprised at the lack of a video camera.
  • I expect to see some kind of trivial tethering interoperation between iPad and iPhone over Bluetooth, sometime in the next couple of revisions of both products. I suspect that loyal Apple fans carrying an iPhone 3GS will end up able to use their iPhone mobile voice/data service for both devices… possibly with some extra monthly service charge.
  • iPad 1.0 will not replace Kindle or other eBook readers, though it might slow their sales growth a bit. But what about iPad 2.0, 3.0, with a better screen and even longer battery life? Once a beautiful color LCD device is good enough, monochrome eInk will be a very tough sell.
  • I will quite likely buy an iPad shortly after it ships; but I’ll be buying perhaps 25% to enjoy it as a consumer, and 75% as a means of more fully understanding the industry importance of the tablet form factor.
  • As a user of a “real” Apple computer (a MacBook Pro running OSX 10.6), I find the closed App Store software distribution model something of a disappointment, compared to a tablet form factor Mac OSX PC I could easily imagine; but I have another blog post coming about that in a few days, after I get some real (non-punditry) work out the door.

I Went In a Boy, I Came Out a Man

[Photo: large Apple Store logo sign]

Not really, it just seemed like the sort of over-the-top thing a rabid Mac fan might say.

But I did replace my main Windows PC with a MacBook Pro. I’ve used Apple products occasionally over the decades, going all the way back to the Apple II, IIe, IIgs, and original 1984 Macintosh. I’m not “switching”, but rather adding; our client projects at Oasis Digital continue to run primarily on Windows or Linux. Our Java work runs with little extra effort on all three platforms.

Here are some thoughts from my first days on this machine and OSX:

  • The MacBook Pro case is very nice. I didn’t see any Windows-equipped hardware with anything similar. The high-tech metal construction is an expensive (and thus meaningful) signal that Apple sends: Apple equipment is high end. The case also has the great practical benefit of acting as a very large heat sink.
  • The MBP keyboard is a bit disappointing; I miss a real Delete key (in addition to Backspace), as well as Home, End, PageUp, and PageDown. At my desk I continue to use a Microsoft Natural Keyboard, so this is only a nuisance on the road.
  • I bought a Magic Mouse for the full Apple experience; but I’ll stick with a more normal mouse (and its clickable middle wheel-button) for most use. I find wireless mice too heavy, because of their batteries.
  • Apple’s offerings comprise a fairly complete solution for common end user computing needs; for example, Apple computers, running Time Machine for backup, storing on a Time Capsule. I didn’t go this route, but it is great to see it offered.
  • Printing is very easy to set up, particularly compared to other Unix variants.
  • VMWare Fusion is fantastic, and amply sufficient to use this machine for my Windows work. Oddly, my old Windows software running inside seems slightly more responsive than the native Mac GUI outside (!).
  • I need something like UltraMon; the built in multi-monitor support is trivial to get working, but the user experience is not as seamless as Windows+UltraMon. For example, where is my hotkey to move windows between screens, resizing automatically to account for their different sizes?
  • Windows has a notion of Cut and Paste of files in Explorer. It is conceptually a bit ugly (the files stay there when you Cut them, until Pasted), but extremely convenient. OSX Finder doesn’t do this, as discussed at length on many web pages.
  • I would like to configure the Apple Remote to launch iTunes instead of Front Row, but haven’t found a way to do so yet. No, Mr. Jobs, I do not wish to use my multi-thousand-dollar computer in a dedicated mode as an overgrown iPod. Ever.
  • The 85W MagSafe power adapter, while stylish and effective, is heavy. I’d much prefer a lighter aftermarket one, even if it were inferior in a dozen ways, but apparently Apple’s patent on the connector prevents this. I’d actually be happy to pay Apple an extra $50 for a lightweight power adapter, if they made such a thing.
  • This MBP is much larger, heavier, and more expensive than the tiny Toshiba notebook PC it replaces; yet it is not necessarily any better for web browsing, by far the most common end user computer activity in 2009. This is not a commentary on Apple, it merely points out why low-spec, small, cheap netbooks are so enormously popular.

Finally, massive storage done right

Last year, I wrote about my efforts to find a storage server with lots of storage at a low cost-per-byte. What was obvious to me at the time, but apparently not obvious to many vendors, is that the key to cost effective storage is to buy mostly hard drives and as little else as possible. I built on Linux and commodity hardware, but the principle applies regardless of OS or hardware vendor.

The team at BackBlaze went much farther down the same path. They ended up with a custom made 4U case (a bit expensive) while the rest of the parts are few in number, inexpensive, and off the shelf. Their cost overhead is stunningly low, as seen in this chart (which I copied from their article):

[Chart: cost of a petabyte, from the BackBlaze article]

Is this right for everyone? Of course not. Enterprise buyers, for example, may need the extra functionality offered by the enterprise class solutions (at many times the cost). Cloud providers and web-scale data storage users, though, simply cannot beat BackBlaze’s approach. What about performance? Clearly this low-overhead approach is optimized for size and cost, not performance. Yet the effective performance can be very high, because this approach makes it possible to use a very large number of disk spindles, and thus has a very high aggregate IO capacity.

Predictably, the response to BackBlaze’s design has been notably mixed, with numerous complaints about performance and reliability. For a very thoughtful (though unavoidably biased) response, read this Sun engineer’s thoughts.

The key thing to keep in mind is the problem being solved. BackBlaze’s design is ideal for use as backup, bulk storage. That is a very common need; the solution I set up (described at the link above) had a typical use case of a given file being written once, then never read again, i.e. kept “just in case”. Reliability, likewise, is obtained at the system level, by having multiple independent servers, preferably spread across multiple physical sites. Once you’re paying the complexity cost to achieve this, there isn’t much additional benefit to paying that cost a second time in the form of more expensive storage.

Large, economical RAID: 10 1 TB drives

I recently needed a file server with ample capacity, on which to store data backups. The typical data access pattern for backups is that data is written once and read rarely, so bulk backup storage has much lower performance requirements than, for example, disks used for database files.

I need to store a great number of files, and I had an old server to recycle, so I ended up with:

  • 4U ASUS case with room for many internal drives
  • Qty 1, leftover 320 GB drive (to boot from)
  • Qty 10, 1 TB drives for data storage: WD Caviar Green
  • An extra SATA controller (plus the 8 SATA ports on the motherboard)
  • Ubuntu Linux 8.04.1 LTS
  • Software RAID6

The separate boot drive is for simplicity; it contains a trivial, vanilla Ubuntu install. If availability mattered more, I could replace it with a RAID1 pair or with flash storage; even a cheap USB “key drive” would be sufficient, if I went to the trouble of setting up /var and /tmp to not write to it (thus avoiding premature wear-out).
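
If I did go the flash route, keeping /tmp off the flash device is a one-line /etc/fstab change; a sketch (the size is arbitrary, and /var takes more effort):

tmpfs  /tmp  tmpfs  defaults,noatime,size=512m  0  0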

The terabyte drives have one large RAID container partition each (quick work with sfdisk). The 10 of them in a RAID6 yield 8 drives worth of capacity. Adjusting also for the difference between marketing TB/GB and the real thing, plus a bit of filesystem overhead, I ended up with 7.3 TB of available storage. Here it is, with some data already loaded:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sde1             285G  1.3G  269G   1% /
varrun                2.0G  164K  2.0G   1% /var/run
varlock               2.0G     0  2.0G   0% /var/lock
udev                  2.0G  112K  2.0G   1% /dev
devshm                2.0G     0  2.0G   0% /dev/shm
/dev/md0              7.3T  3.1T  4.3T  42% /raid
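
The “quick work with sfdisk” mentioned above amounts to one command per data drive, creating a single full-size partition of type fd (Linux raid autodetect); roughly, with an illustrative device name:

# repeated for each of the ten data drives:
echo ',,fd' | sfdisk /dev/sdb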

I went with software RAID for simplicity, low cost, and easy management:

# cat /proc/mdstat
[....]
md0 : active raid6 sda1[0] sdk1[9] sdj1[8] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sdd1[3] sdc1[2] sdb1[1]
      7814079488 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
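
For reference, building such an array looks roughly like this (reconstructed after the fact, not the exact commands I ran):

mdadm --create /dev/md0 --level=6 --raid-devices=10 \
  /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 \
  /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1
# (sde is the boot drive, so it is not part of the array)
# record the array so it assembles at boot:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf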

I chose RAID6 over RAID5 because:

  • This array is so large, and a rebuild takes so long, that the risk of a second drive failing before a first failure is replaced and rebuilt seems high.
  • 8 drives of capacity for 10 drives is a decent value. 5/10 (with RAID10) is not.

It turns out that certain default settings in Linux software RAID can yield instability under RAM pressure in very large arrays; after some online research I made the adjustments below, and the system now appears solid. The sync_speed_max setting throttles back the RAID rebuild, which was helpful because it let me start populating the storage during the very long initial rebuild.

vi /etc/rc.local
# keep more memory free, to reduce allocation stalls under memory pressure
echo 30000 >/proc/sys/vm/min_free_kbytes
# a larger md stripe cache helps RAID6 write/rebuild performance (costs some RAM)
echo 8192 >/sys/block/md0/md/stripe_cache_size
# cap the rebuild rate (KB/s) so the array remains usable while rebuilding
echo 10000 >/sys/block/md0/md/sync_speed_max

vi /etc/sysctl.conf
# reclaim dentry/inode caches more aggressively
vm.vfs_cache_pressure=200
# write out dirty data sooner, and wake the writeback threads more often
vm.dirty_expire_centisecs = 1000
vm.dirty_writeback_centisecs = 100

7.3T is beyond the range where ext2/3 are practical, so I went with XFS. XFS appears to deal well with the large size without any particular tuning, but increasing the read-ahead helps with my particular access pattern (mostly sequential), also in rc.local:

blockdev --setra 8192 /dev/sda
blockdev --setra 8192 /dev/sdb
blockdev --setra 8192 /dev/sdc
blockdev --setra 8192 /dev/sdd
blockdev --setra 8192 /dev/sde
blockdev --setra 8192 /dev/sdf
blockdev --setra 8192 /dev/sdg
blockdev --setra 8192 /dev/sdh
blockdev --setra 8192 /dev/sdi
blockdev --setra 8192 /dev/sdj
blockdev --setra 8192 /dev/sdk
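
For completeness, creating and mounting the filesystem was along these lines (reconstructed; the mount options are a matter of taste):

mkfs.xfs /dev/md0
mkdir /raid
mount /dev/md0 /raid
# /etc/fstab entry so it mounts at boot:
/dev/md0  /raid  xfs  defaults,noatime  0  0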

I was happy to find that XFS has a good “defrag” capability; simply install the xfsdump toolset (apt-get install xfsdump) then schedule xfs_fsr to run daily in cron.
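
A minimal version of that cron job, as a script dropped into /etc/cron.daily (the name and time limit are my choice):

#!/bin/sh
# /etc/cron.daily/xfs_fsr -- limit each defrag run to an hour
# (remember to chmod +x the script)
/usr/sbin/xfs_fsr -t 3600 /raid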

Power consumption seems reasonable at 3.0 amps under load.

Marvell: Not Marvellous

In this machine I happen to have an Intel D975XBX2 motherboard (in retrospect an awful choice, but already installed) which includes a Marvell 88SE61xx SATA controller. This controller does not get along well with Linux. Again, after some online research, the fix is just a few commands:

vi /etc/modprobe.d/blacklist
# add at the end:
blacklist pata_marvell

vi /etc/initramfs-tools/modules
# add at the end:
sata_mv

# then regen the initrd:
update-initramfs -u
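
After a reboot, a quick check that the right driver claimed the controller (and that the blacklisted one stayed out of the way):

lsmod | grep -e sata_mv -e pata_marvell
dmesg | grep -i sata_mv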

This works. But if I had it to do over, I’d rip out and throw away that motherboard, and replace it with any of the 99% of other motherboards that work well with Linux out of the box, or disable the Marvell controller and add another SATA controller on a card.

Is this as big as it gets?

Not by a long shot; this is a secondary, backup storage machine, worth a blog post because of the technical details. It has barely over $1000 worth of hard drives, and a total cost of under $3000 (even less for me, since I reused some old hardware). You can readily order off-the-shelf machines with much more storage (12, 16, 24, even 48 drives). The per-byte pricing is appealing up to the 16- or 24-drive level, then escalates to the stratosphere.

Network / System Monitoring Smorgasbord

At one of my firms (a Software as a Service provider), we have a Zabbix installation in place to monitor our piles of mostly Linux servers. Recently we took a closer look at it and found ample opportunities to monitor more aspects of more machines and devices, more thoroughly. The prospect of increased investment in monitoring led me to look around at the various tools available.

The striking thing about network monitoring tools is that there are so many from which to choose. Wikipedia offers a good list, and the comments on a Rich Lafferty blog post include a short introduction from several of the players. (Update – Jane Curry offers a long and detailed analysis of network / system monitoring and some of these tools (PDF).)

For OS level monitoring (CPU load, disk wait time, # of processes waiting for disk, etc.), Linux exposes extensive information via “top”, “vmstat”, “iostat”, etc. I was disappointed not to find any of these monitoring tools conveniently presenting / aggregating / graphing that data. From my short look, some of the tools offer small subsets of it; for the details, they offer the ability for me to go in and figure out for myself what data I want and how to get it. Thanks.

Network monitoring is a strange marketplace; many of the players have a very similar open source business model, something close to this:

  • core app is open source
  • low tier commercial offering with just a few closed source addons, and support
  • high tier commercial offering with more closed source addons, and more support

I wonder if any of them are making any money.

Some of these tools are agent-based, others are agent-less. I have not worked with network monitoring in enough depth to offer an informed opinion on which design is better; however, I have worked with network equipment enough to know that it’s silly not to leverage SNMP.

I spent yesterday looking around at some of the products on the Wikipedia list, in varying levels of depth. Here I offer first impressions and comments; please don’t expect this to be comprehensive, nor in any particular order.

Zabbix

Our old installation is Zabbix 1.4; I test-drove Zabbix 1.6 (advertised on the Zabbix site as “New look, New touch, New features”). The look seemed very similar to 1.4, but the new feature list is nice.

We mostly run Ubuntu 8.04, which offers a package for Zabbix 1.4. Happily, 8.04 packages for Zabbix 1.6 are available at http://oss.travelping.com/trac.

The Zabbix agent is delightfully small and lightweight, easily installed with an Ubuntu package. In its one configuration file, you can tell it how to retrieve additional kinds of data. It also offers a “sender”, a very small executable that transmits a piece of application-provided data to your Zabbix server.
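
For example, exposing a custom item from zabbix_agentd.conf, and pushing an application value with the sender, looks roughly like this (the key names and query are illustrative, not from our configuration):

# in zabbix_agentd.conf -- a custom item the server can poll:
UserParameter=pgsql.connections,psql -U postgres -Atc "select count(*) from pg_stat_activity"

# from an application or script, push a value to the Zabbix server:
zabbix_sender -z zabbixserver -s myhost -k app.queue_depth -o 42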

I am reasonably happy with Zabbix’s capabilities, but I find the GUI design to be pretty weak, with lots of clicking to get through each bit of configuration. I built far better GUIs in the mid-90s with far inferior tools to what we have today. Don’t take this as an attack on Zabbix in particular, though; I have the same complaint about most of the other tools here.

We run PostgreSQL; Zabbix doesn’t offer any PG monitoring in the box, but I was able to follow the tips at http://www.zabbix.com/wiki/doku.php?id=howto:postgresql and get it running. The monitoring described there is quite high-level and unimpressive, though.

Hyperic

I was favorably impressed by the Hyperic server installation, which got two very important things right:

  1. It included its own PostgreSQL 8.2, in its own directory, which it used in a way that did not interfere with my existing PG on the machine.
  2. It needed a setting changed (shmmax), which can only be adjusted by root. Most companies faced with this need would simply insist the installer run as root. Hyperic instead emitted a short script file to make the change, and asked me to run that script as root. This greatly increased my inclination to trust Hyperic.
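
The script itself amounts to a couple of sysctl commands, roughly like this (the value is illustrative; Hyperic’s generated script may differ):

sysctl -w kernel.shmmax=268435456
# make the change persistent across reboots:
echo "kernel.shmmax = 268435456" >> /etc/sysctl.conf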

Compared to Zabbix, the Hyperic agent is very large: a 50 MB tar file, which expands out to 100 MB and includes a JRE. Hyperic’s web site says “The agent’s implementation is designed to have a compact memory and CPU utilization footprint”, a description so silly that it undoes the trust built up above. It would be more honest and useful of them to describe their agent as very featureful and therefore relatively large, while providing some statistics to (hopefully) show that even its largish footprint is not significant on most modern servers.

Setting all that aside, I found Hyperic effective out of the box, with useful auto-discovery of services (such as specific disk volumes and software packages) worth monitoring; it is far ahead of Zabbix in this regard.

For PostgreSQL, Hyperic shows limited data. It offers table and index level data for PG up through 8.3, though I was unable to get this to work, and had to rely on the documentation instead for evaluation. This is more impressive at first glance than what Zabbix offers, but is still nowhere near sufficiently good for a substantial production database system.

Ganglia

Unlike the other tools here, Ganglia comes from the world of high-performance cluster computing. It is nonetheless apparently quite suitable nowadays for a typical pile of servers. Ganglia aims to efficiently gather extensive, high-rate data from many PCs, using efficient on-the-wire data representation (XDR) and networking (UDP, including multicast). While the other tools typically gather data at increments of once per minute, per 5 minutes, or per 10 minutes, Ganglia is comfortable gathering many data points, for many servers, every second.

The Ganglia packages available in Ubuntu 8.04 are quite obsolete, but there are useful instructions here to help with a manual install.

Nagios

I used Nagios briefly a long time ago, but I wasn’t involved in the configuration. As I read about all these tools, I see many comments about the complexity of configuring Nagios, and I get the general impression that it is drifting into history. However, I also get the impression that its community is vast, with Nagios-compatible data gathering tools for any imaginable purpose.

Others

  • Zenoss
  • Groundwork
  • Munin
  • Cacti

How Many Monitoring Systems Does One Company Need?

It is tempting to use more than one monitoring system, to quickly get the logical union of their good features. I don’t recommend this, though; it takes a lot of work and discipline to set up and operate a monitoring system well, and dividing your energy across more than one system will likely lead to poor use of all of them.

On the contrary, there is enormous benefit to integrated, comprehensive monitoring, so much so that it makes sense to me to replace application-specific monitors with data feeds into an integrated system. For example, in our project we might discard some code that populates RRD files with history information and publishes graphs, and instead feed this data into a central monitoring system, using its off-the-shelf features for storage and graphing.

A flip side of the above is that as far as I can tell, none of these systems offers detailed DBA-grade database performance monitoring. For our PostgreSQL systems, something like pgFouine is worth a look.

Conclusion

I plan to keep looking and learning, especially about Zenoss and Ganglia. For the moment though, our existing Zabbix, upgraded to the current version, seems like a reasonable choice.

Comments are welcome, in particular from anyone who can offer comparative information based on substantial experience with more than one of these tools.

RocketModem Driver Source Package for Debian / Ubuntu

A couple of months ago I posted about using the current model Comtrol RocketModem IV with Debian / Ubuntu Linux. Ubuntu/Debian includes an older “rocket” driver module in the box, which works well for older RocketModem IV cards; but the current cards are not recognized by it at all, and require an updated driver from Comtrol.

With some work (mostly outsourced to a Linux guru) I now present a source driver package for the 3.08 “beta” driver version (from the Comtrol FTP site):

comtrol-source_3.08_all.deb

Comtrol ships the driver source code under a GPL license, so unless I badly misunderstand, it’s totally OK for me to redistribute it here.

To install this, you follow the usual Debian driver-build-install process. The most obvious place to do so is on the hardware where you want to install it, but you can also use another machine with the same distro version and Linux kernel version as your production hardware. Some of these commands must be run as root.
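
If module-assistant is not already on the machine, install it and let it pull in the matching kernel headers and build tools first; a sketch:

apt-get install module-assistant
module-assistant prepare

Then the build itself: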

dpkg -i comtrol-source_3.08_all.deb
module-assistant build comtrol

This builds a .deb specific to the running kernel. When I ran it, the binary .deb landed here:

/usr/src/comtrol-module-2.6.22-14-server_3.08+2.6.22-14.52_amd64.deb

Copy to production hardware (if you are not already on it), then install:

dpkg -i /usr/src/comtrol-module-2.6.22-14-server_3.08+2.6.22-14.52_amd64.deb

and verify the module loads:

modprobe rocket

and finds the hardware:

ls /dev/ttyR*

To verify those devices really work (that they talk to the modems on your RocketModem card), Minicom is a convenient tool:

apt-get install minicom
minicom -s
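
For example, once configured, you might open one of the ports directly and try an AT command to see the modem respond (port name illustrative):

minicom -D /dev/ttyR0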

Kernel Versions

Linux kernel module binaries are specific to the kernel version they are built for; this is an annoyance, but is not considered a bug (by Linus). Because of this, when you upgrade your kernel, you need to:

  • Uninstall the binary kernel module .deb you created above
  • Put in the new kernel
  • Build and install a new binary module package as above
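
Concretely, that is roughly the following (package names from the example above; the new .deb name depends on the new kernel):

dpkg -r comtrol-module-2.6.22-14-server
# install and boot the new kernel, then:
module-assistant build comtrol
dpkg -i /usr/src/comtrol-module-*_amd64.deb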

Rebuilding the source .deb

Lastly, if you care to recreate the source .deb, you can do so by downloading the “source” for the source deb: comtrol-source_3.08.tar.gz then following a process roughly like so:

apt-get install module-assistant debhelper fakeroot
m-a prepare
tar xvf comtrol-source_3.08.tar.gz
cd comtrol-source
dpkg-buildpackage -rfakeroot

The comtrol subdirectory therein contains simply the content of Comtrol’s source driver download, and this is likely to work trivially with newer driver versions (3.09, etc.) when they appear.