Installing Solaris 10 1/06 and Later over PXE

Boot Solaris OS on x86 platform from a network

If you just need to boot or manually install an x86 machine from the network, and you find that the Network-Based Installations book does not really cover the network boot process, while utilities like setup_install_server and add_install_client look messy, do too much, and seem scary, then the following notes may help.
These notes explain the x86 boot process, how to configure TFTP and DHCP servers on Solaris OS for PXE use, and how to work in GRUB.

1. x86 machine boot process

First of all, the x86 machine's BIOS must support the Preboot eXecution Environment (PXE), a facility that initializes the network before the actual boot occurs. How does it work? Let's go through the boot process.
  1. After power-on, the BIOS initializes the processor, tests memory, and initializes other I/O devices:
    Phoenix ServerBIOS 3 Release 6.0
    Copyright 1985-2002 Phoenix Technologies Ltd.
    All Rights Reserved
    Production RELEASE: System BIOS Revision = V1.34.4.2
    SP Interface (PRS) Revision = 97
    SP - BIOS Interface Active
    +==============================+
    | Sun Microsystems |
    | Sun Fire V20z |
    +==============================+
    CPU 0 = AMD Opteron(tm) Processor 244
    1 CPU Detected, E4
    PCIX - Slot1: PCIX-66 Slot2: PCIX-133
    672M System RAM Passed
    Press <F2> to enter SETUP Press <F12> to Network Boot
    On some machines PXE can be enabled or disabled in the BIOS, some machines display a boot menu or let you modify the boot sequence, and on some machines the F12 key enters PXE (i.e. Network Boot).
    If you work through a management interface that cannot send the F12 key directly to the console, you can use Esc-Shift-2 (i.e. Esc-@) instead.
    Whatever the key is, we eventually enter PXE ...

  2. PXE in turn uses the Dynamic Host Configuration Protocol (DHCP) [2] to discover the network configuration and find out where the bootable image is:
    Broadcom NetXtreme Ethernet Boot Agent v7.6.6
    Copyright (C) 2000-2004 Broadcom Corporation
    All rights reserved.

    Broadcom UNDI PXE-2.1 v7.6.6
    CLIENT MAC ADDR: 00 C0 9F 9E 40 09 GUID: F40A0DA0 197C 11DA 9FDF 0060B0B39DD0
    DHCP /
    In particular, PXE broadcasts a DHCPDISCOVER message:
    IP:   ----- IP Header -----
    IP:
    IP: Protocol = 17 (UDP)
    IP: Destination address = 255.255.255.255, BROADCAST
    IP:
    UDP: ----- UDP Header -----
    UDP:
    UDP: Source port = 68
    UDP: Destination port = 67 (BOOTPS)
    UDP:
    DHCP: ----- Dynamic Host Configuration Protocol -----
    DHCP:
    DHCP: Message type = DHCPDISCOVER
    DHCP: Requested Options:
    DHCP: 1 (Subnet Mask)
    DHCP: 2 (UTC Time Offset)
    DHCP: 3 (Router)
    DHCP: 5 (IEN 116 Name Servers)
    DHCP: 6 (DNS Servers)
    DHCP: 11 (RFC 887 Resource Location Servers)
    DHCP: 12 (Client Hostname)
    DHCP: 13 (Boot File size in 512 byte Blocks)
    DHCP: 15 (DNS Domain Name)
    DHCP: 16 (SWAP Server)
    DHCP: 17 (Client Root Path)
    DHCP: 18 (BOOTP options extensions path)
    DHCP: 43 (Vendor Specific Options)
    DHCP: 54 (DHCP Server Identifier)
    DHCP: 60 (Client Class Identifier =)
    DHCP: 67 (Option BootFile Name)
    DHCP: 128 (Site Option)
    DHCP: 129 (Site Option)
    DHCP: 130 (Site Option)
    DHCP: 131 (Site Option)
    DHCP: 132 (Site Option)
    DHCP: 133 (Site Option)
    DHCP: 134 (Site Option)
    DHCP: 135 (Site Option)
    DHCP: Maximum DHCP Message Size = 1260 bytes
    DHCP: Unrecognized Option = 97, length = 17 octets
    DHCP: Unrecognized Option = 93, length = 2 octets
    DHCP: Value = 0x00 0x00 (unprintable)
    DHCP: Unrecognized Option = 94, length = 3 octets
    DHCP: Value = 0x01 0x02 0x01 (unprintable)
    DHCP: Client Class Identifier = "PXEClient:Arch:00000:UNDI:002001"
    Such a DHCP trace can be obtained with the snoop -v dhcp command on the DHCP server machine, if it runs Solaris OS.
    Because the discover message is a broadcast, the DHCP server must be running in the same network segment as the PXE client machine.
    See the "Configuring DHCP server on Solaris OS for PXE use" section for how to get DHCP working.

  3. After receiving the discover message, the DHCP server offers an IP address together with options describing the network configuration and the bootable image:
    DHCP: ----- Dynamic Host Configuration Protocol -----
    DHCP:
    DHCP: Hardware address type (htype) = 1 (Ethernet (10Mb))
    DHCP: Hardware address length (hlen) = 6 octets
    DHCP: Relay agent hops = 0
    DHCP: Transaction ID = 0xa09e4009
    DHCP: Time since boot = 4 seconds
    DHCP: Flags = 0x8000
    DHCP: Client address (ciaddr) = 0.0.0.0
    DHCP: Your client address (yiaddr) = 10.18.138.12
    DHCP: Next server address (siaddr) = 10.18.138.10
    DHCP: Relay agent address (giaddr) = 0.0.0.0
    DHCP: Client hardware address (chaddr) = 00:C0:9F:9E:40:09
    DHCP:
    DHCP: ----- (Options) field options -----
    DHCP:
    DHCP: Message type = DHCPOFFER
    DHCP: DHCP Server Identifier = 10.18.138.10
    DHCP: Site Option = 150, length = 16 octets
    DHCP: Value = "menu.lst.pxe"
    DHCP: Subnet Mask = 255.255.255.192
    DHCP: Router at = 10.18.138.1
    DHCP: Boot File Name = solaris/boot/grub/pxegrub
    If several DHCP servers are running in a segment and all of them make offers, PXE most likely selects the first offer containing the Boot Server Address and Boot File Name options. These two options are the minimal set that must be specified for PXE; without them PXE ends with an error.
    The Boot Server Address is carried in the siaddr field, shown in the trace as "DHCP: Next server address (siaddr) = 10.18.138.10"; refer to RFC 2131 [4].

  4. PXE is happy with the offer, so it sends a DHCPREQUEST informing everybody about its choice, gets an acknowledgement, and tries to fetch the "Boot File Name" file from the "Boot Server Address" server via the Trivial File Transfer Protocol (TFTP):
    DHCP: ----- Dynamic Host Configuration Protocol -----
    DHCP:
    DHCP: Message type = DHCPREQUEST
    DHCP: Requested IP Address = 10.18.138.12
    DHCP:
    DHCP: ----- Dynamic Host Configuration Protocol -----
    DHCP:
    DHCP: Message type = DHCPACK
    DHCP: DHCP Server Identifier = 10.18.138.10
    Loading bootable image ...
    Broadcom UNDI PXE-2.1 v7.6.6
    CLIENT MAC ADDR: 00 C0 9F 9E 40 09 GUID: F40A0DA0 197C 11DA 9FDF 0060B0B39DD0
    CLIENT IP: 10.18.138.12 MASK: 255.255.255.192
    ...
    The TFTP server does not have to be in the same subnet as the client; it just has to be reachable with the network settings provided.
    In the Solaris x86 world, the bootable image is the GRand Unified Bootloader (GRUB) [3], more precisely its PXE version, which is usually located at boot/grub/pxegrub in a Solaris OS distribution.
    Thus, if a TFTP server is running at the "Boot Server Address" (siaddr) and the "Boot File Name" option points to a valid and accessible pxegrub image:
    # tftp 10.18.138.10
    tftp> get solaris/boot/grub/pxegrub
    Received 120133 bytes in 0.1 seconds
    then PXE should be able to download and run it.
    See the "Configuring TFTP server on Solaris OS for PXE use" section for how to set up TFTP.

  5. The pxegrub program runs ...
    After it starts, pxegrub normally tries to load its menu over TFTP:
    • first it searches for a menu.lst.{client_ID} file,
      which is menu.lst.01{mac_address} on Ethernet networks,
      i.e. menu.lst.0100C09F9E4009 in our example;
    • if that is not there, it looks for the file name specified in DHCP option #150,
      which is menu.lst.pxe in our example;
    • next it looks for the boot/grub/menu.lst file;
    • and if everything fails, GRUB just displays a command line.

    The menu file contains GRUB options and commands.
    For example, the following file makes GRUB display 2 menu items, one for a Solaris Single-user Session and another for Solaris Install; install_media is an NFS path to a Solaris OS distribution:
    default 0
    timeout -1
    title Solaris Single-user Session
    kernel /solaris/boot/multiboot kernel/unix -v -s -B console=ttya
    module /solaris/boot/x86.miniroot
    title Solaris Install
    kernel /solaris/boot/multiboot kernel/unix -v -B console=ttya,install_media=10.18.138.10:/export/install/i86pc/os/nv/40
    module /solaris/boot/x86.miniroot
    Here we choose "Solaris Single-user Session", since we just want a Solaris OS shell ...
    See the "In the GRUB" section for details about GRUB commands and functions.

  6. OS startup
    GRUB loads the OS kernel and modules, and passes control to the OS kernel ...
    SunOS Release 5.11 Version snv_40 32-bit
    Copyright 1983-2006 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Booting to milestone "milestone/single-user:default".
    Configuring devices.
    Searching for installed OS instances...
    Starting shell.
    #
Here we go, boot process complete.
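
The menu lookup order that pxegrub follows in step 5 can be sketched in Python as below. This is an illustration only: the function name pxegrub_menu_candidates is hypothetical, and the sketch merely lists the file names in the order described above, without performing any TFTP transfer.

```python
def pxegrub_menu_candidates(mac, option_150=None):
    """Return menu file names in the order step 5 says pxegrub tries them.

    mac        -- client MAC address, e.g. "00:C0:9F:9E:40:09"
    option_150 -- value of DHCP option #150, if the server sent one
    """
    # On Ethernet the client ID is "01" followed by the MAC without separators.
    digits = mac.replace(":", "").replace("-", "").replace(" ", "").upper()
    candidates = ["menu.lst.01" + digits]
    if option_150:
        candidates.append(option_150)
    candidates.append("boot/grub/menu.lst")
    # If none of these can be fetched, GRUB falls back to its command line.
    return candidates


for name in pxegrub_menu_candidates("00:C0:9F:9E:40:09", "menu.lst.pxe"):
    print(name)
```

A per-client file therefore takes priority over the shared menu named in option #150, which is convenient when a single machine needs a different menu.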

2. Configuring TFTP server on Solaris OS for PXE use

The Trivial FTP server in.tftpd (1M) on Solaris OS is one of the inetd (1M) controlled services in the Service Management Facility smf (5).
So the TFTP service should be defined in the SMF repository and should be up and running. Let's check that ...
# svcs -a | grep -i tftp
The command prints nothing, so there is no TFTP service yet. First define it in the /etc/inetd.conf file:
# TFTPD - tftp server (primarily used for booting)
tftp dgram udp6 wait root /usr/sbin/in.tftpd in.tftpd -s /tftpboot
where /tftpboot is the root directory of the TFTP service; that is where the bootable images should be.
Then the /etc/inetd.conf definitions need to be converted into service manifests and imported into the SMF repository, because the /etc/inetd.conf file is maintained only for legacy programs, while the effective service parameters live in the SMF repository:
# inetconv 
tftp -> /var/svc/manifest/network/tftp-udp6.xml
Importing tftp-udp6.xml ...Done
Actually, inetconv does not always import the generated manifests, even when the message "Importing tftp-udp6.xml ...Done" is printed. In that case the generated manifest file can be imported into the SMF repository manually:
# svccfg -v import /var/svc/manifest/network/tftp-udp6.xml
svccfg: Taking "initial" snapshot for svc:/network/tftp/udp6:default.
svccfg: Taking "last-import" snapshot for svc:/network/tftp/udp6:default.
svccfg: Refreshed svc:/network/tftp/udp6:default.
svccfg: Successful import.
Now the tftp/udp6 service should appear in the services list as an online service:
# svcs tftp/udp6
STATE STIME FMRI
online 16:22:30 svc:/network/tftp/udp6:default

# inetadm -l tftp/udp6
SCOPE NAME=VALUE
name="tftp"
endpoint_type="dgram"
proto="udp6"
isrpc=FALSE
wait=TRUE
exec="/usr/sbin/in.tftpd -s /tftpboot"
user="root"
By the way, the service can be removed with the svccfg delete tftp/udp6 command.
Simply removing the corresponding line from the /etc/inetd.conf file will not help, even if inetconv is executed afterwards and the inetd service is restarted.

The next question is which files should be in /tftpboot in order to boot other machines.
A minimal set is:
  • boot/grub/pxegrub -- the GRand Unified Bootloader (GRUB) program image for PXE use;
  • boot/multiboot -- the kernel ELF executable, to which GRUB passes control after the boot command;
  • boot/x86.miniroot -- a module that provides a small (minimal) subset of Solaris OS functionality and needs to be loaded together with the kernel before boot.
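
As a quick sanity check, the minimal set above can be verified with a few lines of Python. This is only a sketch: the helper name missing_boot_files is hypothetical, and the directory /tftpboot/solaris is assumed to be where the Solaris OS distribution is made available, as in this article's example.

```python
from pathlib import Path

# Minimal file set named in the text, relative to the distribution directory.
MINIMAL_BOOT_FILES = (
    "boot/grub/pxegrub",
    "boot/multiboot",
    "boot/x86.miniroot",
)


def missing_boot_files(distribution_dir):
    """Return the minimal boot files that are absent under distribution_dir."""
    root = Path(distribution_dir)
    return [name for name in MINIMAL_BOOT_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumed location of the distribution under the TFTP root.
    missing = missing_boot_files("/tftpboot/solaris")
    print("missing:", missing if missing else "nothing")
```

An empty result only means the files exist; whether TFTP clients can actually read them still depends on the in.tftpd configuration.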
It is also useful to have a GRUB menu file; otherwise you will have to type all GRUB commands manually.

A boot server can be set up simply, for example, by:
  • mounting a Solaris OS distribution on /tftpboot/solaris;
  • creating a /tftpboot/menu.lst.pxe file that provides PXE clients with a menu containing 2 items, one for a Solaris Single-user Session and another for Solaris Install
    (install_media is an NFS path to the Solaris OS distribution):
    default 0
    timeout -1
    title Solaris Single-user Session
    kernel /solaris/boot/multiboot kernel/unix -v -s -B console=ttya
    module /solaris/boot/x86.miniroot
    title Solaris Install
    kernel /solaris/boot/multiboot kernel/unix -v -B console=ttya,install_media=10.18.138.10:/export/install/i86pc/os/nv/40
    module /solaris/boot/x86.miniroot
See "In the GRUB" section for details about GRUB commands and functions.
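
If several such menu files have to be maintained, they could be generated from a small description. The following Python sketch is one possible way to do that; the function name render_menu and the tuple layout of the entries are assumptions made for illustration, and each kernel command is deliberately kept on a single line.

```python
def render_menu(entries, default=0, timeout=-1):
    """Render GRUB menu text from (title, kernel_args, module_path) tuples."""
    lines = [f"default {default}", f"timeout {timeout}"]
    for title, kernel_args, module_path in entries:
        lines.append(f"title {title}")
        # The whole kernel command stays on one line of the menu file.
        lines.append(f"kernel {kernel_args}")
        lines.append(f"module {module_path}")
    return "\n".join(lines) + "\n"


# Values taken from the example menu in this article.
MEDIA = "10.18.138.10:/export/install/i86pc/os/nv/40"
menu = render_menu([
    ("Solaris Single-user Session",
     "/solaris/boot/multiboot kernel/unix -v -s -B console=ttya",
     "/solaris/boot/x86.miniroot"),
    ("Solaris Install",
     "/solaris/boot/multiboot kernel/unix -v -B console=ttya,install_media=" + MEDIA,
     "/solaris/boot/x86.miniroot"),
])
print(menu, end="")
```

The output could then be written to /tftpboot/menu.lst.pxe, or to a per-client menu.lst.01{mac_address} file.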

3. Configuring DHCP server on Solaris OS for PXE use

Assume that the DHCP [2] server in.dhcpd (1M), which is an smf (5) service, is unconfigured and therefore disabled:
# dhcpconfig -Sq
dhcpconfig: Error - failed to read DHCP server parameters.

# svcs dhcp-server
STATE STIME FMRI
disabled 12:42:43 svc:/network/dhcp-server:default
So DHCP must be initialized first. The simplest way of doing that is to create the DHCP configuration database as a number of plain text files in /var/dhcp:
# dhcpconfig -D -r SUNWfiles -p /var/dhcp
Created DHCP configuration file.
Created dhcptab.
Added "Locale" macro to dhcptab.
Added server macro to dhcptab.
DHCP server started.
By the way, dhcpconfig -U unconfigures everything.
Note that dhcpconfig -D starts the DHCP service automatically after initialization; dhcpconfig -Sd or svcadm -v disable dhcp-server stops it.

Now that DHCP is initialized, a few words about what the DHCP server on Solaris OS does after receiving a DISCOVER message from a client.
The scenario is as follows:
  1. scan the network definition table (refer to the dhcp_network (4) and pntadm (1M) manual pages);
  2. find there an available IP address for the client whose hardware address is in the DISCOVER message;
  3. expand the corresponding macros (a "macro" is a set of DHCP options with their values, refer to the dhtadm (1M) and dhcptab (4) pages),
    usually these macros are:
    • auto-expanding macros:
      • a macro whose name equals the "Client Class Identifier",
        for example PXE normally specifies "PXEClient:Arch:00000:UNDI:002001" as the "Client Class ID" in a DISCOVER message, so the DHCP server configuration may have a "PXEClient:Arch:00000:UNDI:002001" macro, which will be expanded for all PXE clients;
      • a macro named after the Network Address, which expands for all clients of this network;
      • a macro named after a Client ID, which expands only for that particular client.
        In Ethernet networks a Client ID is "01{MAC_address}", for example Client ID 0100C09F9E4009 corresponds to the MAC address 00:C0:9F:9E:40:09.
    • the macro associated with the given address in the network definition table, see pntadm -P {network name}.
  4. respond with an OFFER message containing all of the options from the expanded macros.
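
The macro expansion in step 3 can be modelled roughly as below. This Python sketch is not how in.dhcpd is implemented: dhcptab is represented as a plain dictionary, the function names are hypothetical, and the rule that a macro expanded later overrides an option set earlier is an assumption made for the illustration.

```python
def macro_names(client_class, network, mac, address_macro):
    """List the macro names looked up for one client, in the order given above."""
    client_id = "01" + mac.replace(":", "").upper()
    return [client_class, network, client_id, address_macro]


def expand_macros(dhcptab, names):
    """Merge the options of every macro that exists in the (simplified) dhcptab."""
    options = {}
    for name in names:
        # Assumption: options from later macros replace earlier ones.
        options.update(dhcptab.get(name, {}))
    return options


# Simplified stand-in for the dhcptab built later in this section.
dhcptab = {
    "10.18.138.0": {"Subnet": "255.255.255.192", "Router": "10.18.138.1"},
    "netboot": {
        "BootSrvA": "10.18.138.10",
        "BootFile": "solaris/boot/grub/pxegrub",
        "GrubMenu": "menu.lst.pxe",
    },
}
names = macro_names("PXEClient:Arch:00000:UNDI:002001", "10.18.138.0",
                    "00:C0:9F:9E:40:09", "netboot")
offer_options = expand_macros(dhcptab, names)
print(offer_options)
```

In this model the client receives the network options from the network macro and the boot options from the macro attached to its address, which matches the OFFER shown in the boot process trace.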
The mapping between symbolic DHCP option names and numeric DHCP option IDs can be found in the /etc/dhcp/inittab file and in the dhcptab (4) table; custom symbol definitions can be viewed with the dhtadm -P command.

Finally, to make PXE work, we need the DHCP service to respond with:
  • network configuration parameters, e.g. IP address, Subnet, Router, ...;
  • and boot parameters, the minimal set of which is "boot server address" and "bootable image file name".
Usually pxegrub is used as the bootable image; in that case DHCP option #150 can additionally be populated with the "GRUB menu file name".

There are many ways to build a DHCP configuration; here is one example:
if
  • the network is 10.18.138.0/26;
  • the router there is 10.18.138.1;
  • the TFTP server is running at 10.18.138.10;
  • the bootable GRUB image is /tftpboot/solaris/boot/grub/pxegrub on this TFTP server,
    where /tftpboot is the TFTP root directory;
  • the GRUB menu file we would like to use is /tftpboot/menu.lst.pxe;
  • the MAC address of the workstation we want to boot is 00:C0:9F:9E:40:09;
  • and we want to give the address 10.18.138.12 to this workstation,
then
  1. check the /etc/netmasks file; if the network is subnetted, it is better to reflect that in /etc/netmasks:
    # grep 10.18.138.0 /etc/netmasks 
    10.18.138.0 255.255.255.192

  2. configure the 10.18.138.0/26 network in DHCP:
    # dhcpconfig -N 10.18.138.0/26 -m 255.255.255.192 -t 10.18.138.1
    Added network macro to dhcptab.
    Created network table.
    which is equivalent to:
    # dhtadm -A -m 10.18.138.0 -d \
    :Subnet=255.255.255.192:Router=10.18.138.1:
    # pntadm -C 10.18.138.0

  3. define a custom symbol "GrubMenu" for DHCP option #150:
    # dhtadm -A -s GrubMenu -d Site,150,ASCII,1,0
    where
    the Site category means that the option is site-specific,
    code 150 is the DHCP option code,
    the ASCII data type states that the value is a text string,
    granularity 1 means that a value consists of 1 text string, and maximum 0 means that the "GrubMenu" option may contain an unlimited number of values.

  4. define a "netboot" macro with the appropriate boot parameters:
    # dhtadm -A -m netboot -d \
    :BootSrvA=10.18.138.10:BootFile="solaris/boot/grub/pxegrub":GrubMenu="menu.lst.pxe":

  5. add the client definition to the network table:
    # pntadm -A 10.18.138.12 -i 0100C09F9E4009 -m netboot 10.18.138.0

  6. start the DHCP server:
    # dhcpconfig -Se
    DHCP server enabled.
    DHCP server started.
Now a PXE-capable machine with the hardware address 00:C0:9F:9E:40:09, connected to our subnet, should be able to boot from the network.

Solaris OS also includes a GUI tool for DHCP management, /usr/sadm/admin/bin/dhcpmgr, for those who prefer a GUI.
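
When the same procedure has to be repeated for other networks or clients, the command lines from steps 2 to 6 can be assembled from the input parameters. The Python sketch below only builds and prints the commands shown above and does not run them; the function name dhcp_setup_commands is hypothetical, and quoting the macro definition with single quotes (instead of a line continuation) is a choice made here for convenience.

```python
import ipaddress
import shlex


def dhcp_setup_commands(network, router, tftp_server, boot_file,
                        grub_menu, mac, client_ip):
    """Return the dhcpconfig/dhtadm/pntadm command lines for one PXE client."""
    net = ipaddress.ip_network(network)
    if ipaddress.ip_address(client_ip) not in net:
        raise ValueError(f"{client_ip} is not inside {net}")
    # Ethernet client ID: "01" followed by the MAC without separators.
    client_id = "01" + mac.replace(":", "").upper()
    macro = (f':BootSrvA={tftp_server}:BootFile="{boot_file}"'
             f':GrubMenu="{grub_menu}":')
    return [
        f"dhcpconfig -N {net.with_prefixlen} -m {net.netmask} -t {router}",
        "dhtadm -A -s GrubMenu -d Site,150,ASCII,1,0",
        f"dhtadm -A -m netboot -d {shlex.quote(macro)}",
        f"pntadm -A {client_ip} -i {client_id} -m netboot {net.network_address}",
        "dhcpconfig -Se",
    ]


commands = dhcp_setup_commands(
    network="10.18.138.0/26", router="10.18.138.1",
    tftp_server="10.18.138.10", boot_file="solaris/boot/grub/pxegrub",
    grub_menu="menu.lst.pxe", mac="00:C0:9F:9E:40:09",
    client_ip="10.18.138.12",
)
for command in commands:
    print("#", command)
```

Deriving the netmask from the prefix length and checking that the client address belongs to the network catches the most common typing mistakes before anything is written to dhcptab.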

4. In the GRUB

GNU GRUB  version 0.95  (616K lower / 2095552K upper memory)                

[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename. ESC at any time exits. ]

grub>

Regardless of how commands are entered, either through a menu or by typing them at the "grub>" prompt, the main purpose of GRUB is to:
  • identify the root device, i.e. where to load the kernel and modules from:
    • it can be a local device, even if GRUB was initially launched from the network,
      for example (hd0,0,a) -- (first BIOS hard disk - hd0, first fdisk partition - 0, Solaris/BSD slice 0 - a):
      grub> root (hd0,0,a)
      (hd0,0,a): Filesystem type is ufs, partition type 0xbf
    • or it can be the network device, even if GRUB was launched from a local device.
      In other words, if for instance we already have Solaris OS installed but for some reason want to load the kernel from the network,
      or we do not have PXE and want to start GRUB from a floppy and then get the kernel from the network,
      then we can:
      • first initialize the network device,
        either using DHCP
        grub> dhcp
        Probing pci nic...
        [UNDI] ROM MBA v7.6.6 Slot 0211 by Broadcom Corporation
        Address: 10.18.138.12
        Netmask: 255.255.255.192
        Server: 10.18.138.10
        Gateway: 10.18.138.1
        or manually
        grub> ifconfig --address=10.18.138.12 --gateway=10.18.138.1 --mask=255.255.255.192 --server=10.18.138.10
      • then set the network device (nd) as the root device
        grub> root (nd)
        Filesystem type is tftp, using whole disk
      Additionally, it is possible to change just the TFTP server address with the tftpserver command;

  • load the OS kernel, which is an ELF executable, e.g.
    grub> kernel /solaris/boot/multiboot kernel/unix -v -s -B console=ttya
    [Multiboot-elf, <0x1000000:0x13ca3:0x128a5>]
    loads the solaris/boot/multiboot executable in 32-bit mode (kernel/unix) and passes the verbose (-v) and single-user (-s) options to the kernel, see the kernel (1M) manual page.
    The -B console=ttya option may be needed if you access the console via a serial-line management interface;

  • load gzip-compressed modules, which need to be loaded before the root filesystem is mounted, for example
    grub> module solaris/boot/x86.miniroot
    [Multiboot-module]
    loads the minimal set of modules required to start the operating system;

  • pass control to the kernel
    grub> boot
NOTE: it is possible to tell the kernel to start an OS installation by specifying the path to the Solaris OS distribution media via the install_media parameter, whose value can be "cdrom" or a valid NFS path to a Solaris OS distribution, e.g.
grub> kernel /solaris/boot/multiboot kernel/unix -v -B console=ttya,install_media=10.18.138.10:/export/install/i86pc/os/nv/40
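
The structure of the kernel command, with its flags followed by a comma-separated list of -B properties, can be captured in a small helper. The Python sketch below is illustrative only; the function name grub_kernel_line and its parameters are hypothetical, and the default paths are the ones used throughout this article.

```python
def grub_kernel_line(flags=(), boot_props=None,
                     multiboot="/solaris/boot/multiboot", unix="kernel/unix"):
    """Build a GRUB 'kernel' command as a single line.

    flags      -- kernel flags such as "-v" or "-s"
    boot_props -- mapping of boot properties passed with -B, in order
    """
    parts = ["kernel", multiboot, unix, *flags]
    if boot_props:
        # All properties go into one -B argument, separated by commas.
        props = ",".join(f"{key}={value}" for key, value in boot_props.items())
        parts += ["-B", props]
    return " ".join(parts)


single_user = grub_kernel_line(["-v", "-s"], {"console": "ttya"})
install = grub_kernel_line(["-v"], {
    "console": "ttya",
    "install_media": "10.18.138.10:/export/install/i86pc/os/nv/40",
})
print(single_user)
print(install)
```

Both results are single lines, so they can be typed at the "grub>" prompt or placed in a menu file as they are.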