PXE very slow during installation

Hi all,

I have a problem with my PXE server.
I have finished configuring the PXE server with DHCP.
When I create a new machine in vSphere and turn it on, the installation process takes more than 2 hours.

On my ESX server I have configured a new switch dedicated to this installation (a closed switch), and I've added the PXE server and the new server to this switch.

While the new server is booting, the PXE stages are very slow.
When it gets to the installation progress screen (I chose @Everything in ks.cfg), it shows more than 2 hours for a full RHEL installation.
Does anyone have an idea what could cause the slow installation?

Thx,

I've added my dhcpd.conf, my TFTP configuration, and the VM's .vmx file.

#____ DHCP CONFIGURATION FILE.
allow bootp;
allow booting;

default-lease-time 86400;
max-lease-time 604800;

authoritative;

subnet 172.16.50.0 netmask 255.255.255.0 {
        range 172.16.50.20 172.16.50.50;
        filename "/tftpboot/pxelinux.0";
        next-server 172.16.50.10;
        option subnet-mask 255.255.255.0;
}

#____ TFTP CONFIGURATION FILE 
# /etc/default/tftpd-hpa
TFTP_USERNAME="tftp"
TFTP_DIRECTORY="/tftpboot/"
TFTP_ADDRESS="172.16.50.10:69"
RUN_DAEMON="yes"
OPTIONS="-c -l -s /tftpboot"

#____ VM MACHINE CONFIGURATION FILE
#!/usr/bin/vmware
.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "7"
memsize = "1024"
numvcpus = "1"
displayName = "rheltest"
guestOS = "rhel5"

ide0:0.present = "TRUE"
ide0:0.deviceType = "cdrom-raw"
ide0:0.autodetect = "TRUE"
ide0:0.startConnected = "false"

extendedConfigFile = "rheltest.vmxf"
virtualHW.productCompatibility = "hosted"
tools.upgrade.policy = "manual"

uuid.location = "56 4d d2 cc 1a 8a 3c 60-36 c8 8f 0f 71 63 35 7c"
uuid.bios = "56 4d d2 cc 1a 8a 3c 60-36 c8 8f 0f 71 63 35 7c"
cleanShutdown = "FALSE"
replay.supported = "FALSE"
sched.swap.derivedName = "/vmfs/volumes/47b17e0f-f7a4e222-1b37-0010f30f45e4/rheltest/rheltest-1f2189a6.vswp"
vmotion.checkpointFBSize = "4194304"
tools.remindInstall = "TRUE"
hostCPUID.0 = "0000000a756e65476c65746e49656e69"
guestCPUID.0 = "0000000a756e65476c65746e49656e69"
userCPUID.0 = "0000000a756e65476c65746e49656e69"
hostCPUID.1 = "000006f6000208000004e33dbfebfbff"
guestCPUID.1 = "000006f600010800800022010febfbff"
userCPUID.1 = "000006f6000208000004e33dbfebfbff"
hostCPUID.80000001 = "00000000000000000000000120100800"
guestCPUID.80000001 = "00000000000000000000000120100800"
userCPUID.80000001 = "00000000000000000000000120100800"
evcCompatibilityMode = "FALSE"

#ethernet0.virtualDev = "vlance"
tools.syncTime = "FALSE"
debugStub.linuxOffsets = "0x0,0xffffffff,0xc65ee28,0x0,0xc65ee3c,0x0,0xfc05215c,0xffffffff,0x0,0x0,0xc65ee08 ,0x0,0xc65ee18,0x0"

scsi0.present = "TRUE"
scsi0:0.present = "TRUE"

scsi0.sharedBus = "none"
scsi0.virtualDev = "lsilogic"
scsi0:0.fileName = "rheltest.vmdk"
scsi0:0.deviceType = "scsi-hardDisk"

scsi0:0.redo = ""
scsi0.pciSlotNumber = "16"

Ethernet0.present = "TRUE"

ethernet0.virtualDev = "e1000"
ethernet0.networkName = "VM Network 2"
ethernet0.addressType = "generated"
floppy0.present = "FALSE"

ethernet0.generatedAddress = "00:0c:29:63:35:7c"
ethernet0.pciSlotNumber = "17"
ethernet0.generatedAddressOffset = "0"

If I'm understanding you right, it's not the PXE communication that's happening slowly, but everything after it?

Hi,
Thanks for the reply.
Yes and no.
When I create a new VM and power it on, it takes more than 1 minute to load the stage2.img file from the PXE server.

My first thought is the network, maybe a duplex mismatch. I don't see anything in the config file that would make it slow.

DHCP/BOOTP/PXE, while theoretically nearly instantaneous, can in fact be annoyingly slow because servers and clients these days do a lot of extra checking with ARP to make sure the address they've been assigned isn't already in use.

How large is stage2.img anyway?

Hi all,
Thanks for the reply.

First, the size of stage2.img is 87 MB, so if we do the math it should download in a few seconds with a gigabit NIC.

Second, on this ESX box I have only those two machines.
I have also created a specific switch in ESX for this purpose, and I didn't connect it to a physical NIC because I don't want any lag from the LAN.

Third, I have stopped the unnecessary services on my Debian server to isolate the problem, but it is still working like a turtle.
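
For reference, the arithmetic behind the first point can be sketched roughly as follows. It assumes that "87m" means 87 × 10^6 bytes and that the link runs at a full 1 Gbit/s with no protocol overhead, so the result is only a lower bound. The 60-second figure is the stage2.img load time reported earlier in the thread; the function names are made up for this illustration.

```python
# Back-of-the-envelope transfer numbers for stage2.img.
# Assumptions: "87m" = 87 * 10**6 bytes, link rate = 1 Gbit/s, zero protocol overhead.

STAGE2_BYTES = 87 * 10**6
GIGABIT = 1e9  # bits per second


def ideal_transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Time to move size_bytes over a link running at full line rate."""
    return size_bytes * 8 / link_bits_per_second


def effective_mbit_per_second(size_bytes: int, elapsed_seconds: float) -> float:
    """Average throughput actually achieved, in Mbit/s."""
    return size_bytes * 8 / elapsed_seconds / 1e6


if __name__ == "__main__":
    ideal = ideal_transfer_seconds(STAGE2_BYTES, GIGABIT)
    observed = effective_mbit_per_second(STAGE2_BYTES, 60)
    print(f"ideal time at 1 Gbit/s: {ideal:.2f} s")
    print(f"throughput if the load takes 60 s: {observed:.1f} Mbit/s")
```

Under those assumptions the ideal time is well under one second, while a one-minute load corresponds to roughly 11.6 Mbit/s, which suggests the limit is somewhere other than the raw link speed.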

TFTP can be quite slow because it runs over UDP in lock-step (each block has to be acknowledged before the next one is sent) and the default block size is typically 512 bytes. Try increasing the block size.
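
A rough way to see why the block size matters: classic TFTP sends one block and waits for its acknowledgement before sending the next, so the transfer time is approximately the number of blocks multiplied by the round-trip time. The sketch below assumes the 87 MB stage2.img is 87 × 10^6 bytes and uses a hypothetical round-trip time of 0.35 ms. Neither the RTT nor the resulting times were measured on this setup. 1468 bytes is only an example of a larger block size that still fits in a standard 1500-byte Ethernet MTU; whether this client and server actually negotiate it (the blksize option from RFC 2348) is not confirmed here.

```python
# Lock-step TFTP estimate: one block in flight at a time.
# Assumptions: file size 87 * 10**6 bytes, RTT 0.35 ms (hypothetical, not measured).

def tftp_block_count(size_bytes: int, block_size: int) -> int:
    """Number of DATA packets needed for a file of size_bytes.

    A TFTP transfer ends with a block shorter than block_size, so a file that
    is an exact multiple of block_size still needs one final empty block.
    """
    return size_bytes // block_size + 1


def tftp_lockstep_seconds(size_bytes: int, block_size: int, rtt_seconds: float) -> float:
    """Estimated transfer time when each block waits for its ACK."""
    return tftp_block_count(size_bytes, block_size) * rtt_seconds


if __name__ == "__main__":
    size = 87 * 10**6
    rtt = 0.00035  # hypothetical round-trip time in seconds
    for block_size in (512, 1468):
        blocks = tftp_block_count(size, block_size)
        seconds = tftp_lockstep_seconds(size, block_size, rtt)
        print(f"blksize {block_size}: {blocks} blocks, about {seconds:.1f} s")
```

The point of the model is that the link speed does not appear in it at all: with small blocks, latency per acknowledgement dominates, so a gigabit NIC alone does not guarantee a fast TFTP transfer.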

Hi,

I know UDP is quite slow, but this is not the problem.
The problem is that you can start 5 test system installations via PXE and each run finishes only after 2 hours. That's wrong; it should take no more than 90 minutes.

How do you know? Did you implement any of the suggested fixes?

By what arithmetic do you arrive at this answer?