Hardware Validation and Burn-In

When building a new machine or adding new disks, it is wise to do some hardware validation and burn-in testing. This can verify that the machine has sufficient cooling and that disks have no obvious faults before loading all of your data onto it.

The s-tui and stress-ng packages are good for stress testing the CPU and thermal behavior under load. The s-tui application brings up a nice text-mode summary of all processors, loads, and temperatures while saturating the CPU so you can see where temperatures stabilize.

% apt install stress-ng
% apt install s-tui
% apt install tmux

% s-tui
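
s-tui covers the interactive case. For an unattended run of fixed length, stress-ng can also be called directly. The following is a minimal sketch, not a tuned burn-in profile: the function name burn_cpu, the DRY_RUN variable, and the 10-minute default are all assumptions made for illustration.

```shell
# burn_cpu MINUTES: load every online CPU for the given number of minutes.
# burn_cpu and DRY_RUN are hypothetical names used only in this sketch.
# Set DRY_RUN=1 to print the command instead of running it.
burn_cpu() {
    minutes=${1:-10}
    # --cpu 0 starts one worker per online CPU;
    # --metrics-brief prints a short summary when the run ends.
    set -- stress-ng --cpu 0 --timeout "${minutes}m" --metrics-brief
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running burn_cpu 30 in one tmux pane while watching temperatures in another gives a rough picture of where the machine settles under sustained load.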

The first phase of testing drives is to run the SMART self-tests. Run a short self-test first, followed by a conveyance test; each should take only a few minutes. Finally, run the long test, which can take many hours on large disks. smartctl starts each test in the background and returns immediately, so wait for one test to finish before starting the next.

% smartctl -t short /dev/adaX
% smartctl -t conveyance /dev/adaX
% smartctl -t long /dev/adaX
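
Since a self-test runs inside the drive after smartctl has already exited, it can help to poll the drive until the test is done and then print the self-test log. The sketch below assumes an ATA drive whose `smartctl -c` output contains the phrase "Self-test routine in progress" while a test is running; SCSI and NVMe devices report status differently. The function names are hypothetical.

```shell
# selftest_in_progress: reads `smartctl -c` output on stdin and
# succeeds (exit status 0) if a self-test is still running.
# Assumption: ATA-style status text; other drive types differ.
selftest_in_progress() {
    grep -q 'Self-test routine in progress'
}

# wait_for_selftest DEVICE [INTERVAL_SECONDS]
# Polls the drive until no self-test is running, then prints the
# self-test log so the result of the finished test can be read.
wait_for_selftest() {
    dev=$1
    interval=${2:-60}
    while smartctl -c "$dev" | selftest_in_progress; do
        sleep "$interval"
    done
    smartctl -l selftest "$dev"
}
```

For example, wait_for_selftest /dev/adaX 300 checks every five minutes, which is reasonable for the long test.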

If the drive is not in service yet, use badblocks to run a data-destructive test of the entire disk. This may take a day or two for larger disks.

Run tmux to get several virtual consoles, so that you can run several copies of badblocks in parallel and reconnect if you lose your SSH connection. Type Ctrl+B " (double-quote) to split the screen and open a new shell, and Ctrl+B ; to switch back to the previously active pane. Type tmux attach to reconnect to a running session.

Run badblocks on the drive. THIS WILL DESTROY ANY DATA ON THE DISK. It writes a series of test patterns to every location on the disk and reads each one back to verify it. For drives larger than 2TB, it may be necessary to specify a larger block size for the test.

% badblocks -ws /dev/adaX
% badblocks -b 4096 -ws /dev/adaX   # bigger block size
% badblocks -b 8192 -ws /dev/daX    # for drives over 14TB
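
The block-size rules of thumb above can be written as a small helper. This is only a sketch of those thresholds, not a statement of the exact limits in badblocks: the function name is hypothetical, 1TB is assumed to mean 10^12 bytes, and 1024 is the badblocks default block size used when no -b option is given.

```shell
# badblocks_block_size BYTES: print a block size for badblocks -b,
# following this document's rules of thumb (hypothetical helper).
# Assumption: 1TB = 10^12 bytes.
badblocks_block_size() {
    bytes=$1
    tb=1000000000000
    if [ "$bytes" -gt $(( 14 * tb )) ]; then
        echo 8192    # drives over 14TB
    elif [ "$bytes" -gt $(( 2 * tb )) ]; then
        echo 4096    # drives over 2TB
    else
        echo 1024    # badblocks default
    fi
}
```

The disk size in bytes has to come from the operating system. On Linux, blockdev --getsize64 prints it, so the result could be passed along as badblocks -b "$(badblocks_block_size "$size")" -ws with the device name appended.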

When done, assuming everything completed successfully, get a SMART report and look for anything unusual in the RAW_VALUE column, such as nonzero counts of reallocated or pending sectors.

% smartctl -A /dev/adaX
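
Scanning the attribute table by eye is easy to get wrong, so a small filter can help. The sketch below reads `smartctl -A` output and prints any watched attribute whose raw value is nonzero. It assumes the usual ATA table layout with RAW_VALUE in the tenth column. The attribute names are common ones, but they vary by vendor, so treat the list as a starting point rather than a complete check. The function name is hypothetical.

```shell
# flag_smart_attrs: reads `smartctl -A` output on stdin and prints
# "NAME RAW_VALUE" for each watched attribute with a nonzero raw value.
# Assumption: ATA attribute table with RAW_VALUE as the 10th field.
flag_smart_attrs() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ &&
        $10 + 0 > 0 { print $2, $10 }
    '
}
```

Used as smartctl -A /dev/adaX | flag_smart_attrs, no output means none of the watched counters has moved. Any line printed is worth a closer look before the disk goes into service.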