Thursday, April 8, 2010

XenServer: The host is still booting

Had a strange issue with some 5.5u2 servers in a pool yesterday that I thought I might share. After rebooting two of the pool slave boxes, both came up (after a very long boot) unmanageable: in both cases, most xe commands issued at the console returned "The host is still booting" as an error message. After some frustrating head-banging on the keyboard, the issue was tracked down to the servers not being able to resolve the host name of the pool master.

Though this situation may be isolated to environments like ours, the issue was quite frustrating, and very easy to fix once the cause was found. In our situation all servers are in a subdomain (for this article's sake, let us say servers.dhcollier.corp). All XenServers were connected to the network and in all cases had working DNS, but Xen was attempting to reach servername rather than servername.servers.dhcollier.corp (where the DNS entries are sitting).
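A quick way to confirm this kind of mismatch from the console is to compare how the short name and the fully qualified name resolve. This is only a rough sketch: it assumes getent is available on the host, and master01 is a made-up host name standing in for your pool master.

```shell
# Report whether a name resolves through the system resolver.
# Assumption: getent is installed; "master01" below is a hypothetical name.
check_name() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "$1: resolves"
    else
        echo "$1: does NOT resolve"
    fi
}

# Example usage:
#   check_name master01
#   check_name master01.servers.dhcollier.corp
```

If the short name fails while the fully qualified name resolves, a missing search domain is a likely suspect.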

To resolve the "still booting" issue on these disconnected XenServer pool slaves, each was temporarily promoted to pool master (at this point they were disconnected from the pool and thought they were alone), the DNS suffix was added to the search order, and the master role was returned to the true master server by doing the following:

xe pool-emergency-transition-to-master
xe pif-param-set uuid=[management nics UUID] other-config:domain=servers.dhcollier.corp
xe pool-emergency-reset-master master-address=[name of real pool master]

Right away (and after the subsequent reboots) they contacted the pool master without issue.
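The [management nics UUID] placeholder in the commands above has to be looked up first. One way that should work, once xe responds again after the emergency transition, is to filter xe pif-list on the management flag. The function name and the host UUID argument here are my own, purely for illustration.

```shell
# Print the UUID of the management PIF for a given host UUID.
# Assumption: xe is usable again (i.e. after pool-emergency-transition-to-master).
# mgmt_pif_uuid is a hypothetical helper name, not part of xe.
mgmt_pif_uuid() {
    xe pif-list management=true host-uuid="$1" --minimal
}

# Example usage (the host UUID is whatever `xe host-list` reports for this box):
#   pif=$(mgmt_pif_uuid "<host uuid>")
#   xe pif-param-set uuid="$pif" other-config:domain=servers.dhcollier.corp
```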

You can also specify multiple search domains by separating the entries with a comma, such as:
xe pif-param-set uuid=[management nics UUID] other-config:domain=servers.dhcollier.corp,lab.dhcollier.corp,dhcollier.corp
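If you keep the search domains in a list, a small helper can join them with commas and print the resulting command for review before you run it. This is just a sketch: print_domain_cmd is a name I made up, and it only echoes the command rather than touching the host.

```shell
# Join search domains with commas and print (not run) the xe command.
# print_domain_cmd is a hypothetical helper; the first argument is the PIF UUID,
# the remaining arguments are the search domains in priority order.
print_domain_cmd() {
    pif_uuid=$1
    shift
    # "$*" joins the remaining arguments using the first character of IFS.
    domains=$(IFS=,; echo "$*")
    echo "xe pif-param-set uuid=$pif_uuid other-config:domain=$domains"
}

# Example usage:
#   print_domain_cmd "<pif uuid>" servers.dhcollier.corp lab.dhcollier.corp dhcollier.corp
```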

See CTX118840 for more info on this setting.

1 comment:

  1. Here's how I got fixed: