Saturday, February 27, 2010

instaDMG and the installer

I have been working with the command-line installer for a while, usually to further InstaDMG's compatibility with badly written installers (many of them from Apple). Since I am taking another round at trying to solve some of the problems that I had solved in 10.5 where my solution broke (the 10.6 installer does not like chroot jails), I thought that it would be a good idea for me to review the problem, and doing so in type might slow me down enough to get it right... or not.
The problems that I am running up against with installers come in two basic flavors. One of the classes is solidly the fault of the installer writers, the other is a problem with how the installer evaluates things (but is a bit complicated). To be more specific:
1) Installer scripts and the "target" volume
When the installer runs the various scripts that can be present in a package it gives passes the script the target volumes as one of the arguments (to be specific the third argument, or $3 in bash parlance). However this bit of information is not well communicated in Apple's (spotty) documentation for package developers, and many package developers are writing for the common case (installing from the running OS to the running OS). They might also think that the "install only on root volume" flag that they set on the package ensures this for them, and so write scripts that assume that the target volume and the boot volume are the same thing.
But in the case of a Direct-To-DMG install like InstaDMG the target volume is not the booted volume, and the command-line installer skips over things like the preflight and volume-check requirements. So all of the regular files wind up going the correct volume, but the scripts wind up doing whatever they were doing on the boot volume. This can have some really nasty implications, such as one VPN installer who would try any launch kext, but since parts of the install were split between two volumes it would usually wind up crashing the system. Or the iTunes installer that opens a daemon to talk to iPhones/iPods from the unbooted system (that one has a bug logged). But the real killer was when the 10.5.7 and 10.5.8 updates started doing things like this.
I managed to work around that in 10.5 by wrapping a chroot jail around the installer for everything by the initial OS install. This worked beautifully for this class of problem since now scripts that are run by the installer process see the root of my target volume as the root. And since I have a fully installed OS at that location they are free to use whatever tools are available, including the ones that they just installed. This works out remarkably well, even in cases that I thought would still go south on me (like the VPN installer loading a kext).
I realize that this is a bit of a hack, and from Apple's installer team's perspective they are probably unable to deal with the root problem here. There are legitimate cases when you might be installing a program onto a non-boot volume that does not have a serviceable OS on it (and thus can't use the chroot trick). The user could be short on space on the root device and wanting to install to a second volume for instance. Or this could be for a "network" install. So they can't just start wrapping every installer script in a chroot jail rooted on the target volume. But since I am in the driver's seat for InstaDMG installs, I can make sure that this works, and there was much joy when I got it working.
However the joy came to a screeching halt when I started working with the Developer Seeds of MacOS X 10.6. Snow Leopard's installer has some really neat tricks up its sleeve that allow it to install a new OS over the top of an old one (while the old one is still running) and recover if something happens in the middle of the install. However, things blew up badly with 10.6 when I tried to use the same chroot jail trick. And during the Seeding process how things went wrong keep changing. After trying to run on that treadmill for a while I gave up and turned off the chroot jails for 10.6 installs.
Now that things have settled down, I have been tinkering again to see if I can get them back to working. I had an idea during Macworld about how I could fix this (more about that later), but so far it has just taken me to a different dead end. I have finally sorted it out to an easily repeatable single show-stopper, and you demonstrate it for yourself with these steps:
  1. On a computer that has two volumes with 10.6.x installed on it boot to one of these volumes. For this walk-thourgh I will be using the name "UNBOOTED" for the other volume, and it will be auto-mounted at /Volumes/UNBOOTED.
  2. Find any package you would like to use and copy it to the non-booted volume, into the "tmp" folder on that volume. The location and name are not actually important, but for this walk-through it will be at /Volumes/UNBOOTED/tmp/simple.pkg.
  3. Run the following command:
    /usr/sbin/chroot /Volumes/UNBOOTED/ /usr/sbin/installer -pkg /tmp/simple.pkg -target /
You should get a single error line back:
installer: Error trying to locate volume at /
. The best I have figured out is that the installer is choking on the fact that there is nothing in /dev/ on the target volume (since it is not booted). I have gone through a lot of permutations on trying to both troubleshoot and fix this, and so far I have gotten nothing but rock walls:
  • None of the ways of referencing a target work. I tried all of them: "/", ".", "/dev/disk1s1", "disk1s1".
  • Instruments does not seem to work on chroot, and tries to load a bundle from the target volume and fails since I don't have XCode or other tools installed on that volume. But since it is trying to load them there I don't think I am going to get real values anyways.
  • I can't hard-link onto the volume from the "real" /dev/ entries on my booted volume since that would cross a device boundary (I did try, even knowing it would not work).
  • I can't soft-link to /dev/ since from within the chroot jail I can't see it.
  • The "-volinfo" flag in the installer just pauses for a while before returning nothing (since it can't find anything).
  • This problem seems to be directly in the installer, since I can disable the LaunchDaemon and it does not change the behavior.
At this point I have come to the conclusion that this is not something I can solve with the installer. I will be re-posting this simpler example as another bug with Apple to go next to the less-well-described one I posted way-back-when. I would appreciate anyone else filing duplicates, especially ones with impact statements (how many computers does this affect, and how is it going to impact you buying future hardware from Apple). My radar number for this is: 7699285.
2) Package Requirements always look at the booted volume
There is a remarkably similar problem the in the "Package Requirements" section of the installer, but this one has a twist. This one is very similar in that the problems are not necessarily created by Apple's installer, they are more likely the fault of the package creators (often other people at Apple), but the mistakes are common enough that they need systematic work-arounds at the installer level.
The problem is in the system that installers use to determine what sub-packages should be installed by default, offered to the user, or required. This system comprises of a configuration file that is read in including some JavaScript that gets combined together and run by a special system in the installer. At the end of that process the installer has a nice tree of what is allowed to install, what is installed by default, and what is required to install. Things work out very well with this in most cases. But things can go south the same way as before when the installer creators make the (bad) assumption that their installers are always going to be run on the boot volume.
An example can be found in the iLife Support 9.0.3 Update. This update installs two components to help the iLife programs view some types of files, namely the "iLife Media Browser" and the "iLife Slideshow" bundle. It is a well-written installer that checks to see what versions of those bundles are already installed, so it can abort the install if you already have newer versions of those bundles installed. It will even selectively install only one of those bundles if you already have a newer version of the other already installed. However, when checking for these version numbers it checks the ones on the root volume.
So in the case of InstaDMG if your "host" OS (the one you are booted from) already has the update installed (generally I advise people to keep up to date with their host OS), then it will report that you already have this installed, and bail out installing anything, even though the target volume has older versions of these bundles and needs the update.
You can comically demonstrate this problem for yourself by grabbing the iLife Support 9.0.3 Update and running this command (run this from 10.6.2 or a 10.5.8 version that already has the update installed):
sudo installer -pkg "/Volumes/iLife Support 9.0.3/iLifeSupport903.pkg" -target /dev/null
You should get a message:
installer: Error - A newer version of this software is already installed.
Note that last bit the "-target /dev/null" part. This means that we want to install this into a black hole. There is no version of the iLife Support bundles there at all, and there will never be. You can also do this to a real, empty volume, or one with an older version of the OS (10.6.2 comes with a newer version of this bundle). The installer is only ever looking a the booted volume when evaluating this.
I had hoped that when I put in the chroot jails that this would solve this problem as well, and was surprised when it did not. Since then I have figured out the how-and-why of this: the JavaScript interpreter is being run from a process that is launched via a MachServices port connection in conjunction with either a LaunchDaemon or a LaunchAgent (depending on whether the package needs root privileges presumably). Those two are, respectively:
So even if the installer process is running inside the chroot jail, the call to a Mach Port punches a nice little hole through the wall of my nice little jail and the Package Requirements are still run looking at the "host" OS's root.
This brings me to my little epiphany at Macworld: launchd plists can include a "RootDirectory" entry to cause them to be chrooted when they launch. So I can play a game where I unload the system's LaunchDaemon, and load a modified version of my own with the RootDirectory key pointed where I want it. This is a bit of a dangerous game since I can't do this just for my own process, but have to hijack the Daemon that services all installs on the system. I have code that reliably restores the proper LaunchDaemon when I am done with it, and have experimented a little with this: GUI installers get hung, so it looks pretty harmless even if someone is running InstaDMG in the background and forgets and tries to run an installer on top.
I have not been able to get through a full test of this, as at the moment I am concentrated completely on 10.6, and the first bug prevents the installer from even getting this far in the process when used in a chroot environment. But I am going to try this out a little more for 10.5 and other solutions tomorrow to see if I can use this segment of the fix even if I can't get installer scripts fixed.
I really do wish that Apple would solve these problems for me. My thinking at this point is that they are the only ones that can solve the first problem at all (short of me writing my own version of the installer). And I really wish that they would stop making this sort of mess with the installers that they are putting out. And the thing I really hate about this is that otherwise I really like the architecture and implementations of both the installer and most .pkg's that Apple puts out. I just sit at the middle of a big edge case and that has been feeling rather sharp recently.

1 comment:

  1. Great explanation on the problems at hand. I'm afraid that the target audience for 'well behaving' installers is too small for a lot of package creators. I hope you can sort this out. For monitoring what the installer checks about the target volume inside the chroot you might use /usr/bin/fs_usage