Archive for October, 2014

De-bloating the Dell Server Update Utility (SUU) DVD Image

Dell issues a quarterly Server Update Utility (SUU) image, which is used to update most firmware on PowerEdge servers (and some other Dell devices). Because I run FreeBSD on my servers (which Dell does not support), I have to boot the Dell CDU CD to get a standalone Linux system suitable for launching SUU. Unfortunately, the SUU ISO image has become increasingly bloated over time and is now too big either to burn to a double-layer DVD or to upload to the 8GB vFlash card in the iDRAC. I suppose there’s some method for dealing with this if you’re running a Dell-supported operating system, but we FreeBSD users are left out. Here is a list of the last 4 quarters of SUU images, showing their sizes:

01/03/2014 08:08 AM 7,986,208,768 SUU_740_Q42013_A00.ISO
04/13/2014 08:00 AM 8,434,493,440 SUU_14.03.00_A00.iso
07/26/2014 06:36 AM 9,057,501,184 SUU_14.07.00_A00.ISO
10/21/2014 03:23 AM 9,922,859,008 SUU_14.10.200.117.iso

The main cause of the bloat is that the disc contains two versions of every update utility, one for Linux systems and one for Windows systems. Since the CDU provides a Linux system, we can delete all of the Windows files with no impact. I found it easiest to copy the entire SUU DVD to a scratch directory and then delete all the .exe files from the \repository directory. There are quite a few of them:

F:\repository>dir *.exe
Volume in drive F is SUU743_117
Volume Serial Number is 442E-5D5D

Directory of F:\repository

[snip]

400 File(s) 5,490,684,272 bytes
0 Dir(s) 0 bytes free

Once I deleted these unnecessary files, I burned the remaining files (preserving the directory structure) to a DVD with ImgBurn; a single-layer DVD is now sufficient. There are more Windows files in other directories (for example, a Java runtime), but it isn’t necessary to delete those to get the size below the limit of a single-layer DVD. Booting CDU and then switching to my modified SUU disc worked fine, and installed the few updates I was missing on my PowerEdge R710.
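For anyone who wants to check the numbers before burning, here is a minimal Python sketch of the arithmetic. It is only an estimate: it subtracts the .exe total from the ISO size and ignores filesystem overhead. The function names and the single-layer capacity constant (the DVD+R figure, which is slightly smaller than DVD-R) are my own assumptions for illustration, not anything Dell or ImgBurn provides.

```python
import os

# Assumed nominal capacity of single-layer DVD+R media
# (2,295,104 sectors of 2,048 bytes); DVD-R holds slightly more.
SINGLE_LAYER_DVD_BYTES = 2_295_104 * 2_048


def exe_bytes(repository_dir):
    """Total size of the .exe files directly inside repository_dir."""
    total = 0
    with os.scandir(repository_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.lower().endswith(".exe"):
                total += entry.stat().st_size
    return total


def trimmed_estimate(iso_bytes, removed_bytes):
    """Rough size of the image once removed_bytes of files are deleted."""
    return iso_bytes - removed_bytes


# Figures quoted in this post for SUU_14.10.200.117.iso.
ISO_BYTES = 9_922_859_008
EXE_BYTES = 5_490_684_272

remaining = trimmed_estimate(ISO_BYTES, EXE_BYTES)
print(f"{remaining:,} bytes remain; fits single layer: "
      f"{remaining <= SINGLE_LAYER_DVD_BYTES}")
```

With the figures from the listings above, roughly 4.43 billion bytes remain, which is under the assumed single-layer limit. To run the check against your own scratch copy, call exe_bytes() on its repository directory (the path is whatever you chose) and pass the result to trimmed_estimate().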

I don’t know why Dell doesn’t create separate SUU ISO images for Windows and Linux – it would cut people’s download times in half. Until they decide to do something, the above method should give you a usable SUU DVD.

Troubleshooting Catalyst 4948-10GE red status LED and no console output

This is not intended as a complete DIY. It requires equipment most of my readers won’t have, such as a hot air PCB rework station with magnifier. I am posting it to give you an idea of what is involved in the repair of these devices, and to provide info to any readers who do have the necessary equipment and just need to know the repair procedure.

I have been encountering more and more dead Catalyst 4948-10GE switches lately. These usually have a solid red Status LED and do not display any messages on the console when power is applied. This means that the switch did not make any progress at all in booting (one of the first steps in the boot process is to change the Status LED from red to orange). Catalyst 4948-10GE switches with this type of fault are frequently listed on eBay in the $250-$350 price range (usually marked “For parts or not working”). When troubleshooting these, the problem is often defective memory. Unfortunately, this memory is soldered to the circuit board in the switch, so it isn’t simply a matter of removing a faulty memory module and replacing it with a known good one. The old memory needs to be de-soldered and new memory soldered in, and you need to have a source for the obsolete memory chips needed for replacements.

These switches have 256MB of ECC memory, implemented with five 32M x 16-bit memory chips (64MB each) such as the Micron MT46V32M16-6T F. Three of the chips are located on the top side of the motherboard next to the power supply, and the other two are located on the underside of the board:

Top side of board

Bottom side of board
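
The chip count can be sanity-checked with a little arithmetic. The sketch below assumes the usual reading of the MT46V32M16 part number (32 Mi addresses of 16 bits each) and assumes that the capacity beyond the 256MB the switch reports is given over to ECC. How the fifth chip's bits are actually wired is not something I have verified.

```python
# Capacity arithmetic for the five memory chips.
# Assumption: MT46V32M16 is organized as 32 Mi addresses x 16 bits.
WORDS_PER_CHIP = 32 * 1024 * 1024
BITS_PER_WORD = 16
CHIPS = 5
USABLE_MB = 256  # memory size stated for the switch


def chip_megabytes(words=WORDS_PER_CHIP, bits=BITS_PER_WORD):
    """Capacity of one chip in megabytes (2**20 bytes)."""
    return words * bits // 8 // (1024 * 1024)


raw_mb = CHIPS * chip_megabytes()
overhead_mb = raw_mb - USABLE_MB  # assumed to be reserved for ECC

print(f"per chip: {chip_megabytes()}MB, raw: {raw_mb}MB, "
      f"beyond usable: {overhead_mb}MB")
```

Under those assumptions, four chips account for the 256MB of usable memory and the remaining capacity is exactly one chip's worth.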

In each of the boards I have repaired, the fault has always been in one of the bottom two chips. This makes sense as there is no airflow across the bottom of the board, so those chips are more likely to overheat than the ones on the top of the board. Cisco has announced an issue with an unspecified memory supplier (often rumored to be Micron), and the Catalyst 49xx family is on that list. However, the switches that I am seeing failures on are not on a Cisco support contract, and I haven’t read anything about Cisco fixing equipment not on a support contract for free, so I’ve been repairing them myself.

The first step is to remove the two existing memory chips from the underside of the board and clean and prepare the board for the new chips:

Memory removed

The next step is to solder the replacement chips into place:

New memory installed

Of course, you need to ensure that the chips are installed in the correct orientation, that all pins are soldered to their respective pads (66 pins per chip), and that there are no shorts between pins. You also need to avoid damaging any of the neighboring components or the circuit board itself while doing this.

If all goes well, when you reinstall the board in the chassis and apply power, you will be greeted with the appropriate console messages and the switch will boot up normally. If not, remove the board and examine the area around the replaced chips under a magnifier to double-check for bad connections or solder bridges.