Thursday, August 19, 2010

Notes from the field

Working in computer labs you build up a repertoire of little tips and tricks, but I have never really taken the time to write them down and share them. So I am going to make an attempt here. I am going to try and come back and edit this page over time, and things will be a little random (as they pop out of my head), but here goes:

If you work with diverse Windows hardware keep local, uncompressed copies of the DriverPacks

The Windows hardware I have to maintain is a lab full of computers that are all unique hardware configurations. Every one has a different combination of hardware from the others (by design). For those of you who have to deal with Windows imaging you are probably already cringing, but this is what I have to deal with. I eventually stumbled on the DriverPacks web site, and while they are designed to be used in conjunction with SysPrep (which I can't use for other reasons), they are just folders of nicely collected Windows drivers compressed with 7zip. If you download all of them and collect them into a single tree of folders then you have one-stop shopping for the random driver that you can't find. On Vista and 7 you can even just point the driver installer at the root folder and it will do the job of digging through for you. For XP you are going to have to do it more manually. (tip: DriverPacks usually name the driver folders by the initials of the manufacturer, and Googling the first part of the hardware types in the properties for the unknown device usually gets you the vendor)

I have gone one step further in my lab. I have a script on my local server that scrapes the DriverPacks web site and looks to see when they post new versions. It will then delete my old copy, download the new one, decompress it, and move the folder into place in the server share point. Occasionally I copy the network drivers onto a USB stick, and use that to get new installs onto the network. For the rest of the drivers I point the hardware search wizard at the appropriate point on my server share point, and it does the rest for me. Thus far this has worked wonders and I have to search for relatively few things on my own.

Tie down your cables with zip-ties and nail them into place, with power and networking similarly nailed into place

While it may at first seem a little obsessive-compulsive (CDO) to wrap up all of your cables in zip-ties and then to nail them into place, once things get moving in a lab the clear organization keeps things from becoming total chaos. I inherited a lab in such a state, and whenever there was a problem with the wiring (and there was a new one every other day) I wasted enough time trying to solve things one-wire-at-a-time that I finally decided to declare the lab closed for a day, ripped everything out, and redid the whole thing. Since then I have not wasted any more time with bad connections, and the computers all have nice slots with all of the cables ready-to-go when they arrive. And since there are clear spots for the computers, when other people move things around it still stays pretty neat.

I went out and got a bunch of carpet tacks (HomeDepot) and inch-wide elastic bands (hobby store). I pre-drilled two groups of holes a couple of inches apart (mine are on the wider side as I need KVM cables) on the underside of the tables where I wanted a computer to go, and then cut a length of elastic band a little shorter than that distance so it had to stretch a little to fit over, and used the tacks to hammer them in, creating a harness for my cables. Then I put in the screws to hold up the power (one surge protector per station) and networking equipment. Each of those went up so that the screws fit into the holes on the back of the equipment, with a single nail put at the appropriate end so that it could not slide back off.

Then I used small zip-ties to bind the network, power, and KVM cables for most of their length (zip-ties every 8-14 inches, leaving 8-10 inches on the computer end, and more at the other end), slipped the bundle through the elastic loop, and plugged things in. Then I bound up the extra cable into bundles with bigger zip-ties. Do this the right way once and you can stop dealing with cables for a long time.

Buy three or four toenail clippers and use them to clip zip-ties
For about a dollar apiece you can get the large toenail clippers from places like Target. They have both the standard clipper style and the scissor style and I recommend buying both. The clipper style (especially the big toenail ones) is great for taking off the used end of the zip-tie without leaving sharp edges, and the scissor type is good at removing zip-ties from cables in tight spaces. I keep a few in each lab I run, ready whenever I need them. Plus you get some funny moments when people try to figure out why there are toenail clippers sitting in the lab, or even funnier moments when you catch people using them for their original purpose.
CDO is like OCD (Obsessive Compulsive Disorder), but with the letters in their proper alphabetic order. *grin*

Wednesday, August 18, 2010

Writing screensavers for 10.5 and 10.6

I needed to write a screensaver recently, one that would work on both 10.5 and 10.6, but ran into a problem when I tried to build such a project on XCode 3.2. The problem is that with 10.6 the program that does the job of running the plugins that are ScreenSavers (.saver modules) runs in 64bit mode when possible. And since .saver modules are plugins, they also need to be able to run in 64bit mode. However, the 10.5 version of the ScreenSaver.framework only had 32bit modes (PPC and i386), and so you can't build x86_64 (64bit Intel) modules against it. Conversely, things changed enough between 10.5 and 10.6 that modules built against the 10.6 SDK will not run on 10.5. Rock meet hard place.

The solution to this problem seems obvious in retrospect: build the 64bit module against the 10.6 SDK, and build the 32bit modules (PPC and i386) against the 10.5 SDK. I am proud to say that I came up with the idea of building two different .saver modules, and then choosing which one to run... but I did not want the hassle of having to maintain two codebases, or even two compiles. But luckily for me there are people who are much smarter than me who already solved the problem. So I tried closing my XCode project and translating what Warren Dodge wrote about his older project file into my newer one. Eventually I got something that worked, and then discovered how to do it in the XCode GUI:

Step 1: Open the project inspector
To do this click on the blue icon at the top of the "Groups & Files" column in the main window of your project. Then click on the "Info" button on the toolbar, or select "Get Info" from the File menu. It should look something like figure 1:
Step 2: Change the "Base SDK for all configurations" to 10.5
Circled in red in figure 1, the "Base SDK" serves as a master switch for all of the configurations (usually "Debug" and "Release"). While the label might imply that it only sets the targeted SDK, this also changes the build target.
Step 3: Setup the Configurations
Click over to the "Build" tab, then make sure that the "Configuration" selector (circled in red) is set to "All Configurations". This will make sure that the rest applies to both the "Debug" and "Release" configurations. Then type "10.5" in the search box (to the right of the "Configuration" setting). This will narrow down the really big list to just a few.
Step 4: Add the BaseSDK setting

Click on the "Base SDK" entry under the "Architectures" section to highlight it (circled in green). Then click on the gear box at the bottom of the page (circled in purple), opening a drop-menu. From this menu select "Add Build Setting Condition". That will add a row underneath the "Base SDK" row. Clicking on that will pop up something like this:

Select "Intel 64-bit" from the menu, then click on the "Mac OS X 10.5" from just to the right of it (the lower one of the two together, not the default one), and from its pop-up menu select "Mac OS X 10.6".

At this point when the compiler goes to work it will use the 10.5 SDK for PPC and i386, but use the 10.6 SDK for x86_64. We are almost there!

Step 5: Add the Deployment Target setting
This final step is very similar to that for the Base SDK. The difference is that you highlight the "Mac OS X Deployment Target" line under the "Deployment" section, then once you have added the "Build Setting Condition" with the gear menu you change the "Any Architecture" selection to "Intel 64-bit" and the "Compiler Default" selection to "Mac OS X 10.6". You can leave the "Any SDK" alone, or set it to "Mac OS X 10.6"; either way it will have the same effect.

Now you can close the "Project Info" box, and when you hit compile the settings will be right to have the output work on 10.5 or 10.6 without going through any other tricks (other than getting your code right). The same trick would probably work with setting things to 10.4 for 10.4 compatibility, but my project did not require that (and then I could not have used ObjC 2.0 tricks that I like to use).

Saturday, August 7, 2010

Apple refuses to solve my installer problem

A while ago I wrote about two problems I was having with the 10.6 installer and then a little while later how I had solved one of them with a bit of a hack. I had filed the other one as an issue with Apple, and I got a response back on the bug recently.

As a quick refresher: The problem is that a number of installers, both from Apple and from third parties, contain scripts that make the assumption that you are always installing on the root volume. Obviously this is a problem with things like InstaDMG, DeployStudio, or even System Image Utility. I managed to solve this class of problem for 10.5 by wrapping the installer in a chroot jail, a solution that worked better than I had hoped. Unfortunately the 10.6 installer breaks when I try to wrap it the same way. My best guess is that it is dying while trying to enumerate the volumes so that VolumeCheck scripts can run... but when using the command line version they are never run.

The answer I got back from Apple on this was a single line of text telling me that installers need to be written correctly to target non-boot volumes. I am more than a bit disappointed and angry at this response from Apple, as it means that this problem will not be fixed, and those in charge of fixing it do not see it as a problem and see the answers to this situation as lying with others. There are a few problems with this attitude:

  1. Apple has proven on more than a few occasions that it is not capable of consistently authoring packages that do the right thing in these cases. iTunes, the iLife Updaters, and iWork installers are just a few cases. If Apple cannot get this right then what hope is there that third parties will get it right (even accepting that Adobe will never get within visual distance of getting it right)?
  2. Apple does have a product that needs exactly this setup: System Image Utility. Both in the NetRestore-from-installer and the NetInstall paths the installer needs to work on non-booted volumes with a variety of packages. That the SIU team has been very slow to acknowledge problems with their approach in this area is frustrating. They got the iTunes installer finally working (Ya!), but it took me yelling at them personally at a conference for it to happen (Booo!). I expect more from Apple.

So what do I do now? How can I overcome this class of issue? I have a few possibilities:

  1. Come up with some brilliant solution that tricks the 10.6 installer into working inside a chroot jail
  2. Create a system that unwraps every form of .pkg (and there are a number of formats) and replaces all of the scripts with versions that are wrapped in a chroot jail
  3. Write my own version of the installer that does things right
  4. Yell at Apple for a while, and get other people to do so as well in the hope that this decision will be reversed
  5. Convince every .pkg author out there to write their installers to work on non-booting images
  6. Just accept that some installers will never work in InstaDMG, and hope that one of them is never in a software update, or something else absolutely required (ie: give up)

The first three items are ones that I could in theory do (although there is probably no hope for the first, and the latter two are going to be difficult). Of the last three the fourth item is the one that I wish would work (it would be the best solution), and the last one is the one I am most afraid of.

So, if anyone is reading, and would like to do something, please tell Apple how much value there is to you in installers working on non-boot volumes. If you would like to mention the Radar number 7699285 that would be great.

Sunday, June 20, 2010

Bug with hdiutil and symlinks

I got an error report from an InstaDMG user who was using symlinks to point at their installer DVD. I had never tried out using symlinks for that, and so tried it out, successfully, and wrote back saying that it was working for me (with a much newer version of InstaDMG), and that they should probably be using the -I flag to specify the disk rather than use a symlink. But I did do some more testing while I was setup this way, and ran into a problem just once after a couple of runs of InstaDMG.

It turns out that there is a bug in hdiutil, at least on 10.6.4, when it comes to resolving symlinks. But this bug only seems to come out on some percentage of runs, and even then the percentage seems to vary with the hardware (or some other variable). On my iMac8,1 I see it 15-25% of the time, while with my iMac5,1 I only see it 0.5% of the time. Granted the older iMac is running a brand-new install, where the newer iMac is running an OS that I constantly beat on.

I have reported this back to Apple as Radar number 8111753, as well as on OpenRadar. But I am curious if other people are getting error numbers like I am, so if you would like to run the following script a few times on your system and post the results in the comments that would be great.

#!/bin/bash

# print the system information
/usr/sbin/system_profiler SPHardwareDataType SPSoftwareDataType | /usr/bin/awk '/Model Identifier:|System Version:/ { $1 = ""; $2 = ""; gsub(/^[ \t]+|[ \t]+$/,""); print }'

# create a temporary folder with three items in it
TEMP_FOLDER=`/usr/bin/mktemp -d /tmp/hdiutilBugTest.XXXX`
/usr/bin/touch "$TEMP_FOLDER/a"
/usr/bin/touch "$TEMP_FOLDER/b"
/usr/bin/touch "$TEMP_FOLDER/c"

# create a compressed image from the temp folder
/usr/bin/hdiutil create -srcfolder "$TEMP_FOLDER" "$TEMP_FOLDER/testImage.dmg" 1>/dev/null

# create the symlink to the image
/bin/ln -s "testImage.dmg" "$TEMP_FOLDER/symlink"

SYMLINK_PATH="$TEMP_FOLDER/symlink"
ABSOLUTE_PATH="$TEMP_FOLDER/testImage.dmg"

PATHS[0]="$SYMLINK_PATH"
PATHS[1]="$ABSOLUTE_PATH"

REPEAT_COUNT=1000
IFS=$'\n'
for THIS_PATH in "${PATHS[@]}"; do
 echo "Working on: $THIS_PATH"
 FAILED_COUNT=0
 i=0
 while [ $i -lt $REPEAT_COUNT ]; do
  /usr/bin/hdiutil imageinfo "$THIS_PATH" 1>/dev/null 2>/dev/null
  if [ $? -ne 0 ]; then
   let FAILED_COUNT=FAILED_COUNT+1
  fi
  let i=i+1
 done
 echo "  Failed $FAILED_COUNT out of $REPEAT_COUNT times"
done

# delete the temp folder
if [ ! -z "$TEMP_FOLDER" ] && [ -d "$TEMP_FOLDER" ]; then
 /bin/rm -rf "$TEMP_FOLDER"
fi

Tuesday, June 8, 2010

html timer

For my presentation at Macworld in January I created a semi-time-lapse screen capture of a complete InstaDMG run to run as a demo. Since different parts of it were going to fly by at different rates I wanted to have some sort of timer to show the real clock time. Looking around for some little application or widget I did not find anything I liked, and I finally gave in and made one myself.
Since I wanted this done fast, and with something I could easily control with AppleScript (since the rest of the demo was being driven by it anyways), I decided to create a little JavaScript timer, and run it inside Safari.
<html>
<head>
    <title>Timer</title>
    <script>
        var hours = null, minutes = null, seconds = null
        var startTime = null
        var currentTimer = null
        
        function startTimer() {
            // setup things
            hours = document.getElementById("hours")
            minutes = document.getElementById("minutes")
            seconds = document.getElementById("seconds")
            
            startTime = new Date()
            displayTimer();
        }
        
        function displayTimer() {
            
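            // elapsed time as a Date; the UTC getters avoid the local timezone offset
            var currentTime = new Date(new Date() - startTime)
            seconds.innerHTML = currentTime.getUTCSeconds()
            minutes.innerHTML = currentTime.getUTCMinutes()
            hours.innerHTML = currentTime.getUTCHours()
            // pass the function itself rather than a string to be evaluated
            currentTimer = setTimeout(displayTimer, 500);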
        }
        
        function stopTimer() {
            clearTimeout(currentTimer)
        }
        
    </script>
    <style>
        body
{ font-size: large }
        div
{ width: .65in; display: inline-table; font-size: .5in; text-align: right }
    </style>

</head>
<body>
    <div id="hours">0</div> hrs <div id="minutes">0</div> min <div id="seconds">0</div> sec
</body>
</html>
Then I just had to trigger it with some code like:
tell application "Safari" to do JavaScript "startTimer()" in timerDocument
Edit: figured out the problem with the hours, and the correction was to use UTC time.
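The hours problem comes from treating an elapsed interval as a calendar date: the local-time getters then add the machine's timezone offset to it. As a rough illustration in Python (this is not part of the timer above, and the function name is made up for the sketch), plain integer arithmetic on the elapsed seconds sidesteps timezones entirely:

```python
import time


def split_elapsed(elapsed_seconds):
    """Split a count of elapsed seconds into (hours, minutes, seconds)."""
    minutes, seconds = divmod(int(elapsed_seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds


start = time.time()
# ... the thing being timed would run here ...
hours, minutes, seconds = split_elapsed(time.time() - start)
print("%d hrs %d min %d sec" % (hours, minutes, seconds))
```

Unlike the UTC getters, this version does not wrap around after 24 hours, which only matters for very long runs.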

Saturday, May 1, 2010

Using plists from Python

Python is my current scripting-language-of-choice for a number of reasons, but one of them is that I can handle plists easily, including complex ones, without having to worry about the format that they are in (xml, binary, or even old-style NeXT). I should put the caveat up-front here that this will only work in 10.5 and later, but at this point I don't touch 10.4 machines, and don't anticipate ever working with 10.3 again. So if you can work with that then this method might be for you.
I use the Cocoa bridge to get access to MacOS X's native Foundation layer and the native plist processing available to Obj-C programmers. I know a few other scripter/programmers who use similar techniques in their work, but so far everyone else has been using NSDictionary's dictionaryWithContentsOfFile method. This is great and works well for most plists that you will use, but there are two things you lose by using it:
  1. It will read in all of the native plist formats, but then you don't know what format you started with. I like writing things back down in the format I found them in. It probably does not ever matter, but what can I say? In my job I am a little anal about things like this.
  2. Not all plists have a dict as their root, some have NSArrays. You probably know going in what the format of the plist you are working with should be, so this is not such a big deal, but I like to be able to be a little more specific about what went wrong when my programs bail.
The solution for these two issues is to use the NSPropertyListSerialization class to read from, and write out your plists. This is easy to do, and the best explanation of it is to give an example, first a minimal one:
#!/usr/bin/python

import Foundation

pathToPlist = "[insert path here]"

# read the raw bytes of the plist file
plistNSData, errorMessage = Foundation.NSData.dataWithContentsOfFile_options_error_(pathToPlist, Foundation.NSUncachedRead, None)

# parse the data, remembering the format it was stored in
plistContents, plistFormat, errorMessage = Foundation.NSPropertyListSerialization.propertyListFromData_mutabilityOption_format_errorDescription_(plistNSData, Foundation.NSPropertyListMutableContainers, None, None)

# plistContents is now a tree with the data

# convert the tree back into data using the format it was read in with
plistNSData, errorMessage = Foundation.NSPropertyListSerialization.dataFromPropertyList_format_errorDescription_(plistContents, plistFormat, None)

# write the data back to disk (NSAtomicWrite is a write option; NSUncachedRead only applies to reads)
succeeded, errorMessage = plistNSData.writeToFile_options_error_(pathToPlist, Foundation.NSAtomicWrite, None)
Important note: Blogger is probably cutting off the ends of lines on the display, and wrapping others. But a copy-and-paste should get you what you need. You also have to fill in the path to your plist of choice there, and this does not do anything other than read the plist, and write it back down unchanged. But if you are looking for a quick cut-and-paste that is probably what you want.
For a more complicated example let's make sure that Acrobat has not been set as the default handler for PDFs. This pulls out most of the stops and checks for all types of problems, so should be a much better example to follow for production code:
#!/usr/bin/python

'''This script sets the default file opener for PDFs to Preview'''

import os, sys, Foundation

# get the path to this user's LaunchServices preference file
pathToLaunchServicesPlist = os.path.expanduser("~/Library/Preferences/com.apple.LaunchServices.plist")
if not os.path.isfile(pathToLaunchServicesPlist):
    raise Exception("The LaunchServices preferences file seems missing: %s" % pathToLaunchServicesPlist)

# read out the data in the file
plistNSData, errorMessage = Foundation.NSData.dataWithContentsOfFile_options_error_(pathToLaunchServicesPlist, Foundation.NSUncachedRead, None)
if errorMessage is not None or plistNSData is None:
    raise Exception("Unable to read in the data from the plist file: %s\nReceived error message: %s" % (pathToLaunchServicesPlist, errorMessage))

# convert the data into a useable form
launchServicesPreferences, plistFormat, errorMessage = Foundation.NSPropertyListSerialization.propertyListFromData_mutabilityOption_format_errorDescription_(plistNSData, Foundation.NSPropertyListMutableContainers, None, None)
if errorMessage is not None or launchServicesPreferences is None:
    raise Exception("Unable to read the data as a plist: %s\nReceived error message: %s" % (pathToLaunchServicesPlist, errorMessage))

# launchServicesPreferences is now a tree of objects that we can modify with normal python methods
#   but we have to check to make sure it looks like we expect

# check to make sure that the root is a dict like we expect it to be
# Note that the root is actually a NSDictionary object,
# but this is bridged to work everywhere a python dict object would.
# But it is not actually a dict object
if not hasattr(launchServicesPreferences, "has_key"):
    raise Exception("The plist does not have a dictionary as its root as expected: %s" % pathToLaunchServicesPlist)

# confirm the LSHandlers item at the first level, and that it reacts like a python list (really a bridged NSArray)
if not "LSHandlers" in launchServicesPreferences or not hasattr(launchServicesPreferences["LSHandlers"], "append"):
    raise Exception("The plist is missing the LSHandlers section, or it was not an array: %s" % pathToLaunchServicesPlist)

# iterate over the array to find any that set the handler for pdfs
for handlerSetting in launchServicesPreferences["LSHandlers"]:
    if hasattr(handlerSetting, "has_key") and "LSHandlerContentType" in handlerSetting and handlerSetting["LSHandlerContentType"] == "com.adobe.pdf":
        handlerSetting["LSHandlerRoleAll"] = "com.apple.preview"

# the setting (if it was set) should now be changed in our in-memory version, we only need to save this back to disk

# convert the tree back to a NSData using the same format we read it in with
plistNSData, errorMessage = Foundation.NSPropertyListSerialization.dataFromPropertyList_format_errorDescription_(launchServicesPreferences, plistFormat, None)
if errorMessage is not None or plistNSData is None:
    raise Exception("Unable to serialize preferences data. Got error message: %s\nTrying to serialize data:\n%s" % (errorMessage, launchServicesPreferences))

# write the data back down to disk (NSAtomicWrite is a write option; NSUncachedRead only applies to reads)
succeeded, errorMessage = plistNSData.writeToFile_options_error_(pathToLaunchServicesPlist, Foundation.NSAtomicWrite, None)
if errorMessage is not None or not succeeded:
    raise Exception("Unable to write preferences back to disk to: %s\nReceived error message: %s" % (pathToLaunchServicesPlist, errorMessage))

sys.exit(0)
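For anyone without the Cocoa bridge handy, the same two ideas (remember the format you found, and check the type of the root) can be roughly sketched with Python's standard plistlib module instead. This is a different technique from the NSPropertyListSerialization approach above, it assumes Python 3, and plistlib does not read the old-style NeXT format. The helper names read_plist and write_plist are made up for this sketch:

```python
import plistlib


def read_plist(data):
    """Parse plist bytes, returning (contents, format) so the format can be reused."""
    # binary plists start with the magic bytes "bplist00"; treat anything else as XML
    if data.startswith(b"bplist00"):
        plist_format = plistlib.FMT_BINARY
    else:
        plist_format = plistlib.FMT_XML
    return plistlib.loads(data, fmt=plist_format), plist_format


def write_plist(contents, plist_format):
    """Serialize contents back into bytes using the format it was read in with."""
    return plistlib.dumps(contents, fmt=plist_format)


# round-trip a plist whose root is an array rather than a dict
original = plistlib.dumps(["a", "b"], fmt=plistlib.FMT_BINARY)
contents, plist_format = read_plist(original)
if not isinstance(contents, dict):
    print("root is a %s, not a dict" % type(contents).__name__)
rewritten = write_plist(contents, plist_format)
```

Checking the root with isinstance lets a script say exactly what it found instead of failing later on a missing key.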

Wednesday, March 17, 2010

Troubleshooting an odd symlink bug

About a week ago an odd bug was brought to my attention that occurs when people try to install the Puppet package into an image made with InstaDMG. The bug started out in private emails, but we got it moved over to the developer mailing list, and you can take a look at it. A group of us banged our collective heads over it for a while, and finally I found it by just going over every step to see what was wrong.
The problem manifested itself as the Puppet installer overwriting the softlink that you normally find at '/usr/lib/ruby/site_ruby', and instead putting a folder with the desired contents there. Replacing this symlink apparently broke other things, and thus began the bug-hunt. The bug was reported against InstaDMG because the installer works fine when used on a booted volume. My bet is that a similar problem would have manifested if someone had tried installing this to a volume other than the boot volume, thus clearing InstaDMG in this bug, but we didn't think of that at the time.
My first instinct was that there was something wrong with the code in the 'installer' program when faced with the complex series of softlinks that it had to follow (a listing of that appears in a moment). I even created a script that mounted a dmg and tried to re-create the problem in a much simpler manner, but with no success. I did repeat the observed behavior, and knew that there was a problem in there somewhere, so I decided to try and figure out what was different about the softlink chain in this case from my test case. So I carefully followed the chain of symlinks on a mounted volume (InstaDMG output dmg, since I have a few of those lying around).
Here is what I found:
/usr/lib/ruby -> ../../System/Library/Frameworks/Ruby.framework/Versions/Current/usr/lib/ruby
/System/Library/Frameworks/Ruby.framework/Versions/Current -> 1.8
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/site_ruby -> ../../../../../../../../../../Library/Ruby/Site
So if you follow this set of rules, on a booted volume '/usr/lib/ruby/site_ruby' winds up pointing at '/Library/Ruby/Site'. My understanding is that this is the same in both 10.5.x and 10.6.x, but was different in 10.4. But if you are careful you would count the number of back-references in that last link. There are 10. But if you count the number of folders in the chain back from the 'site_ruby' folder you will only find 9. When you are booted this does not matter: once you are at the root directory you can just keep back-referencing all you like, and you still wind up in the same place. As a quick demonstration you can do this in the Terminal: 'cd /; cd ..; pwd' and you will still be at root. But when the volume is mounted this means that the 'site_ruby' link winds up pointing back outside the image.
So this explains the bad behavior: when the installer goes to look for the folder at this point it finds a broken symlink, so instead replaces that broken symlink with a valid folder. A pretty reasonable thing for the installer to do. I might have made this a failing error if I were the one programming it, but I am sure a lot of smart people came together in a meeting at Apple (or possibly NeXT) at some point in the past and decided that this was the correct behavior, and I can't call them wrong. When I started looking back, it seems that this extra back-reference has been in place since 10.5.0, and has been kept all the way through 10.6.2 (and it might continue). It has been masked all along because it only becomes a problem when you are not booted from the volume.
I really should write up a small tool to comb through the whole filesystem and see if there are any other similar problems in any other symlinks, and report those as well in another Radar report, but I think I will leave that for another day. But I thought I would get this out there so if anyone else runs into some other similar bug they might remember this.
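A minimal sketch of what such a tool might look like, in Python. It only does a lexical count of the '..' components in relative symlinks (it does not account for other symlinks inside a target path), and the function name is made up for this sketch:

```python
import os


def find_escaping_symlinks(volume_root):
    """Return (link_path, target) pairs for relative symlinks that climb above volume_root."""
    volume_root = os.path.realpath(volume_root)
    escaping = []
    # os.walk does not descend into symlinked folders by default, so every
    # 'folder' it hands back is a real path below volume_root
    for folder, subfolders, files in os.walk(volume_root):
        relative = os.path.relpath(folder, volume_root)
        start_depth = 0 if relative == "." else len(relative.split(os.sep))
        for name in subfolders + files:
            link_path = os.path.join(folder, name)
            if not os.path.islink(link_path):
                continue
            target = os.readlink(link_path)
            if os.path.isabs(target):
                continue  # absolute targets are a different problem
            depth = start_depth
            for part in target.split("/"):
                if part == "..":
                    depth -= 1
                    if depth < 0:
                        # one more back-reference than there are folders above the link
                        escaping.append((link_path, target))
                        break
                elif part not in ("", "."):
                    depth += 1
    return escaping
```

Run against the mount point of an image, something like find_escaping_symlinks("/Volumes/SomeImage") (a placeholder path) would list every relative link that has more back-references than folders above it.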

Wednesday, March 3, 2010

One installer issue down, one to go

I wrote recently about the two issues I have been trying to solve with some bad installers in 10.6. Well, with rev261 of InstaDMG I now have one of the issues solved. The solution is exactly as I described: replace the launchdaemon offering the installd service with one that is chrooted into my install target. I ran a pair of tests with the new version of InstaDMG on a 10.6.2 vanilla image: one with the new code, and one with it disabled (there is a switch for that). The results were exactly as I had hoped for: the iLife Support Update 9.0.3 components get installed with the new code, but get left out of the one without the new code.
So I am marking one of those two issues worked around. I wish that the solution to the other one would suddenly present itself, but I am not going to hold my breath, as I am pretty convinced that that one is going to take Apple making a change to solve.
As always, if you want it solved, then tell Apple how this is affecting you, and how many purchases it is affecting. This is not strictly a bug with Apple's code (at least not in the installer binary), so they are likely to not see it worth the engineer time to make the changes unless we can show them reason that it is worth the time (that would otherwise go into other features/changes/fixes).

Tuesday, March 2, 2010

JavaScript reference for Apple .pkg makers

I have never seen it mentioned anywhere and just stumbled across Apple's developer documentation on the JavaScript objects that are available to installer writers. I probably have totally missed something obvious telling me where to find it, and I have always just used what I have gleaned from taking Apple's installers apart, but now I have actual documentation:
For those writing scripts that check for the presence of things to decide what to install (I am looking at you iLife Support team) I will pointedly reference the "target" item, and its "mountpoint" property.

Saturday, February 27, 2010

instaDMG and the installer

Introduction
I have been working with the command-line installer for a while, usually to further InstaDMG's compatibility with badly written installers (many of them from Apple). Since I am taking another round at trying to solve some of the problems that I had solved in 10.5 where my solution broke (the 10.6 installer does not like chroot jails), I thought that it would be a good idea for me to review the problem, and doing so in type might slow me down enough to get it right... or not.
The problems that I am running up against with installers come in two basic flavors. One of the classes is solidly the fault of the installer writers, the other is a problem with how the installer evaluates things (but is a bit complicated). To be more specific:
1) Installer scripts and the "target" volume
When the installer runs the various scripts that can be present in a package it passes the script the target volume as one of the arguments (to be specific the third argument, or $3 in bash parlance). However this bit of information is not well communicated in Apple's (spotty) documentation for package developers, and many package developers are writing for the common case (installing from the running OS to the running OS). They might also think that the "install only on root volume" flag that they set on the package ensures this for them, and so write scripts that assume that the target volume and the boot volume are the same thing.
But in the case of a Direct-To-DMG install like InstaDMG the target volume is not the booted volume, and the command-line installer skips over things like the preflight and volume-check requirements. So all of the regular files wind up going to the correct volume, but the scripts wind up doing whatever they were doing on the boot volume. This can have some really nasty implications, such as one VPN installer that would try to load a kext, but since parts of the install were split between two volumes it would usually wind up crashing the system. Or the iTunes installer that opens a daemon to talk to iPhones/iPods from the unbooted system (that one has a bug logged). But the real killer was when the 10.5.7 and 10.5.8 updates started doing things like this.
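As a rough sketch of what a well-behaved package script could do with that third argument, here is a Python version (package scripts can be any executable). The helper name and the com.example.tool.plist file are made up for illustration and do not come from any real package:

```python
#!/usr/bin/python
import os
import sys


def path_on_target(target_volume, absolute_path):
    """Re-root a path (written as it would look when booted) onto the target volume."""
    return os.path.join(target_volume, absolute_path.lstrip("/"))


def main(arguments):
    # the installer hands package scripts the target volume as the third argument
    target_volume = arguments[3] if len(arguments) > 3 else "/"
    # hypothetical file that this imaginary package wants to modify
    preferences = path_on_target(target_volume, "/Library/Preferences/com.example.tool.plist")
    print("Would modify: %s" % preferences)
    return preferences


if __name__ == "__main__":
    main(sys.argv)
```

The point is simply that every path the script touches is built from the target volume, so the same script does the right thing whether the target is "/" or a mounted image.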
I managed to work around that in 10.5 by wrapping a chroot jail around the installer for everything but the initial OS install. This worked beautifully for this class of problem since now scripts that are run by the installer process see the root of my target volume as the root. And since I have a fully installed OS at that location they are free to use whatever tools are available, including the ones that they just installed. This works out remarkably well, even in cases that I thought would still go south on me (like the VPN installer loading a kext).
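As a rough sketch of the idea (not InstaDMG's actual code), the wrapper boils down to running the installer through chroot with the package path expressed relative to the jail. The helper below is hypothetical and only prints the command it would run; it assumes the package has already been copied onto the target volume.

```shell
#!/bin/sh
# Hypothetical helper: print the chroot-wrapped installer command for a
# package that already lives on the target volume. The output is for
# display only; paths containing spaces would need quoting before use.
chroot_install_command() {
    volume="${1%/}"      # drop one trailing slash from the volume path
    package="$2"
    if [ -z "$volume" ]; then
        echo "refusing to build a jail rooted at /" >&2
        return 1
    fi
    case "$package" in
        "$volume"/*) ;;  # package is inside the jail, carry on
        *) echo "package must be on the target volume" >&2; return 1 ;;
    esac
    # Inside the jail the volume root is "/", so strip the volume prefix.
    jail_path="${package#"$volume"}"
    printf '/usr/sbin/chroot %s /usr/sbin/installer -pkg %s -target /\n' \
        "$volume" "$jail_path"
}
```

Note that the target is always "/" because, from inside the jail, the target volume is the root. Actually running the printed command requires root privileges.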
I realize that this is a bit of a hack, and from Apple's installer team's perspective they are probably unable to deal with the root problem here. There are legitimate cases when you might be installing a program onto a non-boot volume that does not have a serviceable OS on it (and thus can't use the chroot trick). The user could be short on space on the root device and wanting to install to a second volume for instance. Or this could be for a "network" install. So they can't just start wrapping every installer script in a chroot jail rooted on the target volume. But since I am in the driver's seat for InstaDMG installs, I can make sure that this works, and there was much joy when I got it working.
However the joy came to a screeching halt when I started working with the Developer Seeds of MacOS X 10.6. Snow Leopard's installer has some really neat tricks up its sleeve that allow it to install a new OS over the top of an old one (while the old one is still running) and recover if something happens in the middle of the install. However, things blew up badly with 10.6 when I tried to use the same chroot jail trick. And during the Seeding process how things went wrong kept changing. After trying to run on that treadmill for a while I gave up and turned off the chroot jails for 10.6 installs.
Now that things have settled down, I have been tinkering again to see if I can get them back to working. I had an idea during Macworld about how I could fix this (more about that later), but so far it has just taken me to a different dead end. I have finally sorted it out to an easily repeatable single show-stopper, and you can demonstrate it for yourself with these steps:
  1. On a computer that has two volumes with 10.6.x installed on them, boot to one of these volumes. For this walk-through I will be using the name "UNBOOTED" for the other volume, and it will be auto-mounted at /Volumes/UNBOOTED.
  2. Find any package you would like to use and copy it to the non-booted volume, into the "tmp" folder on that volume. The location and name are not actually important, but for this walk-through it will be at /Volumes/UNBOOTED/tmp/simple.pkg.
  3. Run the following command:
    /usr/sbin/chroot /Volumes/UNBOOTED/ /usr/sbin/installer -pkg /tmp/simple.pkg -target /
You should get a single error line back:
installer: Error trying to locate volume at /
The best I have figured out is that the installer is choking on the fact that there is nothing in /dev/ on the target volume (since it is not booted). I have gone through a lot of permutations on trying to both troubleshoot and fix this, and so far I have gotten nothing but rock walls:
  • None of the ways of referencing a target work. I tried all of them: "/", ".", "/dev/disk1s1", "disk1s1".
  • Instruments does not seem to work on chroot, and tries to load a bundle from the target volume and fails since I don't have XCode or other tools installed on that volume. But since it is trying to load them there I don't think I am going to get real values anyways.
  • I can't hard-link onto the volume from the "real" /dev/ entries on my booted volume since that would cross a device boundary (I did try, even knowing it would not work).
  • I can't soft-link to /dev/ since from within the chroot jail I can't see it.
  • The "-volinfo" flag in the installer just pauses for a while before returning nothing (since it can't find anything).
  • This problem seems to be directly in the installer, since I can disable the LaunchDaemon and it does not change the behavior.
At this point I have come to the conclusion that this is not something I can solve with the installer. I will be re-posting this simpler example as another bug with Apple to go next to the less-well-described one I posted way-back-when. I would appreciate anyone else filing duplicates, especially ones with impact statements (how many computers does this affect, and how is it going to impact you buying future hardware from Apple). My radar number for this is: 7699285.
2) Package Requirements always look at the booted volume
There is a remarkably similar problem in the "Package Requirements" section of the installer, but this one has a twist. This one is very similar in that the problems are not necessarily created by Apple's installer, they are more likely the fault of the package creators (often other people at Apple), but the mistakes are common enough that they need systematic work-arounds at the installer level.
The problem is in the system that installers use to determine what sub-packages should be installed by default, offered to the user, or required. This system consists of a configuration file, including some JavaScript, that is read in, combined together, and run by a special system in the installer. At the end of that process the installer has a nice tree of what is allowed to install, what is installed by default, and what is required to install. Things work out very well with this in most cases. But things can go south the same way as before when the installer creators make the (bad) assumption that their installers are always going to be run on the boot volume.
An example can be found in the iLife Support 9.0.3 Update. This update installs two components to help the iLife programs view some types of files, namely the "iLife Media Browser" and the "iLife Slideshow" bundle. It is a well-written installer that checks to see what versions of those bundles are already installed, so it can abort the install if you already have newer versions of them. It will even selectively install only one of those bundles if you already have a newer version of the other. However, when checking for these version numbers it checks the ones on the root volume.
So in the case of InstaDMG, if your "host" OS (the one you are booted from) already has the update installed (generally I advise people to keep up to date with their host OS), then it will report that you already have this installed and bail out without installing anything, even though the target volume has older versions of these bundles and needs the update.
You can comically demonstrate this problem for yourself by grabbing the iLife Support 9.0.3 Update and running this command (run this from 10.6.2 or a 10.5.8 version that already has the update installed):
sudo installer -pkg "/Volumes/iLife Support 9.0.3/iLifeSupport903.pkg" -target /dev/null
You should get a message:
installer: Error - A newer version of this software is already installed.
Note that last bit, the "-target /dev/null" part. This means that we want to install this into a black hole. There is no version of the iLife Support bundles there at all, and there will never be. You can also do this to a real, empty volume, or one with an older version of the OS (10.6.2 comes with a newer version of this bundle). The installer is only ever looking at the booted volume when evaluating this.
I had hoped that when I put in the chroot jails that this would solve this problem as well, and was surprised when it did not. Since then I have figured out the how-and-why of this: the JavaScript interpreter is being run from a process that is launched via a MachServices port connection in conjunction with either a LaunchDaemon or a LaunchAgent (depending on whether the package needs root privileges presumably). Those two are, respectively:
/System/Library/LaunchDaemons/com.apple.installd.plist
/System/Library/LaunchAgents/com.apple.installd.user.plist
So even if the installer process is running inside the chroot jail, the call to a Mach Port punches a nice little hole through the wall of my nice little jail and the Package Requirements are still run looking at the "host" OS's root.
This brings me to my little epiphany at Macworld: launchd plists can include a "RootDirectory" entry to cause them to be chrooted when they launch. So I can play a game where I unload the system's com.apple.installd LaunchDaemon, and load a modified version of my own with the RootDirectory key pointed where I want it. This is a bit of a dangerous game since I can't do this just for my own process, but have to hijack the Daemon that services all installs on the system. I have code that reliably restores the proper LaunchDaemon when I am done with it, and have experimented a little with this: GUI installers get hung, so it looks pretty harmless even if someone is running InstaDMG in the background and forgets and tries to run an installer on top.
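For illustration only, here is a sketch of where the RootDirectory key sits in a launchd job definition. The helper name, the Label, and the Program path are all placeholders; this is not the contents of Apple's real com.apple.installd.plist, which would need its existing keys (including its MachServices entry) preserved in any modified copy.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal launchd job definition that is
# chrooted to the given directory via the RootDirectory key.
# Label and Program are placeholders, NOT Apple's installd settings.
# The directory is inserted as-is, so it must not contain characters
# that need XML escaping (such as & or <).
jailed_job_plist() {
    root_directory="$1"
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.jailed-job</string>
    <key>Program</key>
    <string>/path/to/daemon</string>
    <key>RootDirectory</key>
    <string>${root_directory}</string>
</dict>
</plist>
EOF
}
```

Calling jailed_job_plist /Volumes/UNBOOTED writes the definition to standard output, where it can be redirected to a file. Swapping the system's daemon for a modified one, and restoring it afterwards, is the dangerous part described above and is deliberately left out of this sketch.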
I have not been able to get through a full test of this, as at the moment I am concentrated completely on 10.6, and the first bug prevents the installer from even getting this far in the process when used in a chroot environment. But I am going to try this out a little more for 10.5 and other solutions tomorrow to see if I can use this segment of the fix even if I can't get installer scripts fixed.
Summary:
I really do wish that Apple would solve these problems for me. My thinking at this point is that they are the only ones that can solve the first problem at all (short of me writing my own version of the installer). And I really wish that they would stop making this sort of mess with the installers that they are putting out. And the thing I really hate about this is that otherwise I really like the architecture and implementations of both the installer and most .pkg's that Apple puts out. I just sit at the middle of a big edge case and that has been feeling rather sharp recently.

Monday, February 15, 2010

On Saturday I gave a presentation about the state of imaging on the Mac, and for that presentation I created a movie demoing creating an image from start to finish using InstaDMG and InstaUp2Date. The movie shows the whole process, but since the full run (including capturing an image of the installer DVD) took 2 hours 41 minutes, I had to speed up sections of it quite a bit. So the final movie takes a bit over 2 minutes to run. For your viewing pleasure:

Sunday, February 14, 2010

Adobe's installer session at Macworld

This last week some brave members of the Creative Suite installer team from Adobe courageously sat in front of a group of attendees at the Macworld conference. I was in the crowd as they talked about the concerns they have heard from the MacOS X admin community about their installers (and to a lesser extent the products themselves). They had been convinced by John Welsh to come and speak at the conference, and the "pitchfork and torches" crowd (myself included) came ready to voice their frustrations.
The speakers from Adobe did a great job of summarizing the problems that we have been having, and demonstrated beautifully that they had not only been listening to our complaints, but had actually heard and understood them. During the development of CS4 their focus had been on solving the issues the individual users had had with the previous installers, and they talked about how their focus for the "next major version" of the Creative Suite had turned to solving the problems that Mac Admins have had.
I was very impressed both by the frankness of the folks presenting, and by what they had to say. The biggest news to me was that they are working to make the Creative Suite Deployment Toolkit produce native installers on both platforms (so .msi's on Windows and .pkg's on MacOS). This was absolutely wonderful news to me, and by itself will solve most of my problems with their installers.
I was a little confused by their reasoning about keeping the custom installer system that they are using now as the generic installer, and having the Deployment Toolkit produce native installers from that. As I understood it the reasoning was that going fully native would require them to maintain 2 separate installers with separate problems and limitations. But to my thinking they now have to spend resources on 4 projects rather than 2: Windows custom, Windows .msi, Mac custom, Mac .pkg. But as long as they are going to give me a way to produce .pkg installers for what I want, I am mollified.
The only negative part of it for me was that the Acrobat team was conspicuously absent from the group on stage, and every time Acrobat was talked about it was to say that they were not expecting much to change with it in the "next major version". This was disappointing, as Acrobat has been one of the worst offenders of the bunch. I keep telling myself that only having to deal with one problem child in the Creative Suite is a vast improvement over the current state of affairs.
On a note of specific interest to me: I talked to them about making sure that the .pkg's that the new Deployment Toolkit will produce are compatible with InstaDMG and SIU's NetRestore from DVD features. The lead engineer initially was not aware of the Direct-to-DMG idea, but with only a minimal explanation from me he quickly warmed to the idea and had a private aside with another member of the team. After the session I had a conversation with some Adobe administrative staff who have been using InstaDMG for their internal work at Adobe and they were going to offer their input to the engineering staff. So we have some hope that the .pkg's produced will just work, greatly simplifying the job of installing the "next major version" of the Creative Suite products.

Tuesday, February 9, 2010

InstaDMG Quick-start

With a recent addition to InstaDMG it now is even easier to create a basic "Vanilla" image. In fact it is now down to just three commands:
svn checkout http://instadmg.googlecode.com/svn/trunk instadmg
sudo ./instadmg/AddOns/InstaUp2Date/importDisk.py --automatic --legacy
sudo ./instadmg/AddOns/InstaUp2Date/instaUp2Date --process 10.6_vanilla
Just "cd" into the directory where you want InstaDMG to go, make sure you have the appropriate MacOS X Installer DVD in the drive, and let it go to work. But be prepared for some waits: the first command should only take 5-10 seconds, but the other two take at least 45 minutes apiece. I have a script that has been doing all of this in about 2 hours 41 minutes. After you have done this once you don't need the first two commands, and can get by with only the third command. I have a nice screen recording of this process that I have edited down to less than 2 minutes that I will be posting after I use it in my presentation at Macworld.