Hard-drive interview questions
Top frequently asked hard-drive interview questions
IBM still develops and sells tape drives today. Their capacity seems to be on a par with today's hard drives, but seek times are far longer and transfer rates significantly lower than those of hard drives.
So when are tape drives preferable to hard drives (or SSDs) today?
Source: (StackOverflow)
I have an old hard disk (Maxtor 250 GB) from about 3 years ago that started giving errors and now sits in a drawer in my desk. It has some confidential data on it, but it's unlikely that it can be read because the disk started to go bad. However, before I dispose of it, I want to destroy the disk to make sure that the data can't be recovered.
What is the best way to destroy the disk such that the data can't be read? (I live in Arizona and was thinking of leaving it in the yard when we have those 125 F days...?)
What is the best way to dispose of the disk after it's destroyed? (I believe that it's environmentally unsound to chuck it in the trash.)
Source: (StackOverflow)
This is a Canonical Question about RAID levels.
What are:
- the RAID levels typically used (including the RAID-Z family)?
- the deployments they are commonly found in?
- the benefits and pitfalls of each?
Source: (StackOverflow)
I've got an application which requires data recording in an outdoor environment, and I am interested in the reliability of SSDs vs. HDDs when placed in cold (down to -20) or hot (+50) ambient environments. Intuition leads me to believe SSDs will be more reliable, with the possible exception of high temperatures. Air-conditioned enclosures are not an option.
Does anyone have any information on disk reliability in these situations?
Source: (StackOverflow)
man smartctl
states (SNIPPED for brevity):
The first category, called "online" testing. The second category of testing is called "offline" testing. Normally, the disk will suspend offline testing while disk accesses are taking place, and then automatically resume it when the disk would otherwise be idle. The third category of testing (and the only category for which the word "testing" is really an appropriate choice) is "self" testing.
Enables or disables SMART automatic offline test, which scans the drive every four hours for disk defects. This command can be given during normal system operation.
Who runs the test - the drive firmware? What sort of tests are these - does the firmware read from and write to the disk - what exactly goes on? Is it safe to invoke testing while the OS (Linux) is running, or can one schedule a test for later - and if so, how does that happen - when you reboot, at the BIOS prompt ('offline test')? Where are the results displayed - in the SMART logs?
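For context, a minimal sketch of how these tests are typically driven with smartmontools on Linux (/dev/sda is just an example device): the drive's own firmware executes the test in the background and the OS merely issues the command, so it is generally safe to start one while the system is in normal use.

# Start a short (minutes) or extended (hours) self-test; the drive firmware runs it
smartctl -t short /dev/sda
smartctl -t long /dev/sda

# Check progress ("Self-test execution status") and then read the results
smartctl -c /dev/sda
smartctl -l selftest /dev/sda

# Enable the automatic offline data collection described in the man page
smartctl -o on /dev/sda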
Source: (StackOverflow)
On Linux I can use smartctl
to get a hard drive's vendor, model, firmware revision and serial number:
# smartctl -a /dev/sdb
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
Device: SEAGATE ST9300603SS Version: 0006
Serial number: 6SE1ZCSR0000B121LU63
Device type: disk
Transport protocol: SAS
Is the hard drive's serial number (here 6SE1ZCSR0000B121LU63
) guaranteed to be globally unique? Is it only unique for a specific vendor? Or even a specific model?
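Incidentally, the same firmware-reported string can be read from a couple of other places on Linux for comparison (smartmontools and udev assumed; /dev/sdb is just an example device). Note that udev's ID_SERIAL is typically a vendor/model-plus-serial composite, while ID_SERIAL_SHORT is the raw value reported by the drive.

# Print only the identity section, including the serial number
smartctl -i /dev/sdb | grep -i serial

# Ask udev for the serial properties it derived for the same device
udevadm info --query=property --name=/dev/sdb | grep ID_SERIAL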
Source: (StackOverflow)
A friend is talking with me about the problem of bit rot - bits on drives randomly flipping, corrupting data. Incredibly rare, but with enough time it could be a problem, and it's impossible to detect.
The drive wouldn't consider it to be a bad sector, and backups would just think the file has changed. There's no checksum involved to validate integrity. Even in a RAID setup, the difference would be detected but there would be no way to know which mirror copy is correct.
Is this a real problem? And if so, what can be done about it? My friend is recommending ZFS as a solution, but I can't imagine flattening our file servers at work and installing Solaris and ZFS on them...
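To illustrate the kind of end-to-end checksumming being suggested, here is a rough sketch of how ZFS surfaces (and, with redundancy, repairs) silent corruption; the pool name tank is just an example, and ZFS also runs on FreeBSD and Linux, not only Solaris.

# Read every allocated block in the pool and verify it against its stored checksum
zpool scrub tank

# Mismatches show up in the CKSUM column; with a mirror or raidz vdev the bad
# copy is rewritten from a good one, otherwise the affected files are listed
zpool status -v tank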
Source: (StackOverflow)
I will begin by stating that I do not believe this is a duplicate of Why is business-class storage so expensive?.
My question is specifically about SAS drive enclosures, and justifying their expense.
Examples of the types of enclosures I'm referring to are:
- HP D2700
- Dell MD1220
- IBM EXP3524
Each of the above is a 2U direct-attached external SAS drive enclosure with space for around 24 x 2.5" drives.
I'm talking about the bare enclosure, not the drives. I am aware of the difference between enterprise class hard drives and consumer class.
As an example of "ball-park" prices, the HP D2700 (25 x 2.5" drives) is currently around $1750 without any drives (checked Dec 2012 on Amazon US). A low end HP DL360 server is around $2000, and that contains CPU, RAM, motherboard, SAS RAID controller, networking, and slots for 8 x 2.5" drives.
When presenting clients or management with a breakdown of costs for a proposed server with storage, it seems odd that the enclosure is a significant item, given that it is essentially passive (unless I am mistaken).
My questions are:
Have I misunderstood the components of a SAS drive enclosure? Isn't it just a passive enclosure with a power supply, SAS cabling, and space for lots of drives?
Why is it seemingly so expensive, especially when compared to a server? Given all the components that an enclosure does not have (motherboard, CPU, RAM, networking, video), I would expect an enclosure to be significantly less expensive.
Currently our strategy when making server recommendations to our clients is to avoid recommending an external drive enclosure because of the price of the enclosures. However, assuming one cannot physically fit enough drives into the base server, and the client does not have a SAN or NAS available, then an enclosure is a sensible option. It would be nice to be able to explain to the client why the enclosure costs as much as it does.
Source: (StackOverflow)
What are the pros and cons of consumer SSDs vs. fast 10-15k RPM spinning drives in a server environment? We cannot use enterprise SSDs in our case as they are prohibitively expensive. Here are some notes about our particular use case:
- Hypervisor with 5-10 VMs max. No individual VM will be crazy I/O-intensive.
- Internal RAID 10, no SAN/NAS...
I know that enterprise SSDs:
- are rated for longer lifespans
- and perform more consistently over long periods
than consumer SSDs... but does that mean consumer SSDs are completely unsuitable for a server environment, or will they still perform better than fast spinning drives?
Since we're protected via RAID/backup, I'm more concerned about performance over lifespan (as long as lifespan isn't expected to be crazy low).
Source: (StackOverflow)
I am using Ubuntu 12.04 and can't write to any file, even as root, or do any other operation that requires writing. Neither can any process that needs to write, so they're all failing. df
says I've got plenty of room:
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       30G   14G   15G  48% /
udev            984M  4.0K  984M   1% /dev
tmpfs           399M  668K  399M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            997M     0  997M   0% /run/shm
All of the results I find for "can't write to disk" are about legitimately full disks. I don't even know where to start here. The problem appeared out of nowhere this morning.
PHP's last log entry is:
failed: No space left on device (28)
Vim says:
Unable to open (file) for writing
Other applications give similar errors.
After deleting ~1 GB just to be sure, the problem remains. I've also rebooted.
df -i
says
Filesystem     Inodes   IUsed  IFree IUse% Mounted on
/dev/xvda1    1966080 1966080      0  100% /
udev           251890     378 251512    1% /dev
tmpfs          255153     296 254857    1% /run
none           255153       4 255149    1% /run/lock
none           255153       1 255152    1% /run/shm
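For reference, the df -i output points at inode exhaustion rather than a full disk, which produces the same "No space left on device" error. A rough sketch of how one might find where the inodes went (GNU find assumed; the paths below are only examples):

# Count files per directory on the root filesystem and list the worst offenders
find / -xdev -type f -printf '%h\n' | sort | uniq -c | sort -rn | head -20

# On an Ubuntu box running PHP, the session directory is a classic culprit
ls /var/lib/php5 | wc -l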
Source: (StackOverflow)
People keep telling me that in order to improve an SQL server's performance, I should buy the fastest hard disks possible and use RAID 5, etc.
So I was thinking: instead of spending all that money on RAID 5 and super-duper fast hard disks (which isn't cheap, by the way), why not just get tons of RAM? We know that an SQL server loads the database into memory. Memory is way faster than any hard disk.
Why not stuff in like 100 GB of RAM on a server? Then just use a regular SCSI hard disk with RAID 1. Wouldn't that be a lot cheaper and faster?
Source: (StackOverflow)
The main advantage of SSD drives is better performance. I am interested in their reliability.
Are SSD drives more reliable than normal hard drives? Some people say they must be because they have no moving parts, but I am concerned that this is a new technology that has possibly not completely matured yet.
Source: (StackOverflow)
Is it safe to backup data to a hard drive and then leave it for a number of years?
Assuming the file system format can still be read, is this a safe thing to do? Or is it better to periodically rewrite the data (every 6 months or so) to make sure it remains valid?
Or is this a stupid question?
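Whatever refresh interval is chosen, one way to make "remains valid" checkable is to keep a checksum manifest alongside the backup and verify it whenever the drive is reconnected. A minimal sketch (paths are examples; assumes the drive is mounted at the same place later):

# When writing the backup, record a checksum for every file
find /mnt/backup -type f -exec sha256sum {} + > /mnt/backup-manifest.sha256

# Years later, re-read every file and report anything that no longer matches
sha256sum -c --quiet /mnt/backup-manifest.sha256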
Source: (StackOverflow)
Google did a very thorough study on hard drive failures which found that a significant portion of hard drives fail within the first 3 months of heavy usage.
My coworkers and I are thinking we could implement a burn-in process for all our new hard drives that could potentially save us some heartache from losing time on new, untested drives. But before we implement a burn-in process, we would like to get some insight from others who are more experienced:
- How important is it to burn in a hard drive before you start using it?
- How do you implement a burn-in process?
- How long do you burn in a hard drive?
- What software do you use to burn in drives?
- How much stress is too much for a burn-in process?
EDIT:
Due to the nature of the business, RAIDs are impossible to use most of the time. We have to rely on single drives that get mailed across the nation quite frequently. We back up drives as soon as we can, but we still encounter failure here and there before we get an opportunity to back up data.
UPDATE
My company has had a burn-in process in place for a while now, and it has proven to be extremely useful. We immediately burn in all new drives that we get in stock, allowing us to find many errors before the warranty expires and before installing them into new computer systems. It has also proven useful for verifying that a drive has gone bad. When one of our computers starts encountering errors and a hard drive is the main suspect, we'll rerun the burn-in process on that drive and look at any errors to make sure the drive actually was the problem before starting the RMA process or throwing it in the trash.
Our burn-in process is simple. We have a designated Ubuntu system with lots of SATA ports, and we run badblocks in read/write mode with 4 passes on each drive. To simplify things, we wrote a script that prints a "DATA WILL BE DELETED FROM ALL YOUR DRIVES" warning and then runs badblocks on every drive except the system drive.
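For anyone wanting to reproduce something similar, a stripped-down sketch of that kind of burn-in loop (destructive; the device list and log path are examples, and a real script should exclude the system drive as described above):

# DESTRUCTIVE: badblocks -w overwrites every block with four test patterns
# (0xaa, 0x55, 0xff, 0x00) and reads each back to verify -- the "4 passes"
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    badblocks -wsv -b 4096 -o /root/burnin-$(basename $dev).log $dev &
done
wait

# An empty log file means no bad blocks; also check the drive's own view
smartctl -H -l error /dev/sdb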
Source: (StackOverflow)