Types of SCSI Drives
By Erik Rodriguez
This article describes the use, purpose, and configuration of SCSI drives.
What is SCSI?
SCSI (pronounced "scuzzy") stands for Small Computer System Interface. It has been around for ages and is arguably the best type of drive used in servers. The difference between SCSI and IDE or SATA lies in how the drives communicate with the CPU. SCSI uses different standards to move data, and those standards offer speed and reliability. While newer formats such as ATA 133 and SATA 150 have emerged, SCSI has several major advantages in a server environment. Like everything else, it also has some drawbacks. The table below outlines the main pros and cons of SCSI technology.
Pros | Cons
Read/Write Simultaneously | Price
High Spindle Speed | Noise Level
High Cache | Lower Data Capacity
High Life Expectancy | Controller Card Required
Highly Reliable | Physical Dimensions
Very Low Latency | Delayed Start up (Spin Up)
Pros
SCSI drives can read and write simultaneously. Standard IDE drives cannot perform this task as they must: write--stop--read, write--stop--read, etc. This is a serious advantage because servers are constantly transmitting and saving data simultaneously. If SCSI drives were unable to perform this way, clients accessing the server would notice a delay in saving and accessing files.
SCSI also has an advantage in spindle speed. Most modern IDE and ATA drives operate at 7200 RPM. Slower drives run at 5400 RPM, and SATA Raptor drives run at 10,000 RPM! Most SCSI drives (decent ones) operate at 10,000 RPM or faster. The more expensive drives even run at 15,000 RPM. The faster rotation speed of the internal platters helps keep latency low when accessing data.
IDE and SATA drives today usually contain 8 MB of cache (buffer). The basic or lower-end drives have 2 or 4 MB. Some SCSI drives have as much as 16 MB of cache. This, combined with the fact that some SCSI controller cards can contain 128 MB of host-based cache, makes SCSI an attractive solution for data access. Remember that anything chip-based is quickly read by the controller card's CPU and processed immediately. This is more efficient than what IDE drives have to work with: their small cache fills up quickly, so the main CPU ends up doing more work reading and unloading the IDE drive cache.
Remember that SCSI drives are designed for a server environment. They are tested under high loads in the design phase to ensure durability in production for the end user. The Seagate Cheetah drives I use in my servers have a Mean Time Between Failures (MTBF) of 1,200,000 hours. I don't know how they arrive at this figure, but basically, it means the drive is designed to live a long time. Every drive does tend to go out from time to time. Sometimes this happens for unknown reasons, or it may be caused by poor management by a system administrator. In any case, hard drive manufacturers value customer loyalty. It is not in their best interest to produce junk just to move units. They want SCSI drives to perform as expected.
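One common way to read an MTBF figure is as a failure rate across a large population of drives, not as the lifespan of a single drive. The sketch below converts the 1,200,000 hour figure into an annualized failure rate. It assumes a constant failure rate (the exponential model) and drives that run 24 hours a day; the function names are made up for illustration, and real-world failure rates can differ from what a spec sheet implies.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous operation


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Chance that one drive fails within a year of 24x7 use.

    Assumption: failures follow an exponential model (constant rate).
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours: float, drive_count: int) -> float:
    """Average number of failures per year across a fleet of drives."""
    return drive_count * HOURS_PER_YEAR / mtbf_hours


afr = annualized_failure_rate(1_200_000)
fleet = expected_failures_per_year(1_200_000, 1000)
print(f"AFR: {afr:.2%}")                                  # AFR: 0.73%
print(f"Failures per year in 1000 drives: {fleet:.1f}")   # 7.3
```

Read this way, the huge number is less mysterious: it suggests that under these assumptions fewer than 1 drive in 100 should fail per year.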
The fastest SCSI drives, which operate at 15,000 RPM, have average latency times as low as 2 ms. 10,000 RPM Raptor SATA drives have latency times as low as 2.99 ms, and standard IDE drives running at 7200 RPM have an average latency of 4.2 ms. The low latency is a result of large amounts of cache, high spindle speed, and drive geometry.
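The spindle speed part of these latency figures can be checked with simple arithmetic. Average rotational latency is conventionally taken as the time for half a revolution, since on average the data you want is half a turn away from the head. The sketch below is only that calculation; it ignores seek time and cache effects, and the function name is hypothetical.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Time for half a revolution, in milliseconds."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


for rpm in (5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} RPM: {average_rotational_latency_ms(rpm):.2f} ms")
# Output:
#   5400 RPM: 5.56 ms
#   7200 RPM: 4.17 ms
#  10000 RPM: 3.00 ms
#  15000 RPM: 2.00 ms
```

The half-revolution figure lines up with the roughly 2 ms quoted for 15,000 RPM drives and the 4.2 ms quoted for 7200 RPM drives.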
Cons
The maximum capacity available with this technology is 300 GB. 300 GB SCSI drives have been out for nearly a decade, and they are still expensive. Newer SAS technology has improved drive capacity to 2 TB. SAS is not a cheap alternative either, though: there is a very big price jump from SATA to SAS.
It only takes a few 10,000 RPM drives and a few high-output case fans to bring a rack mount server up to the 45-50 dB level. A normal conversation is around 60 dB, so 45-50 is more than noticeable.
Types of SCSI
There are many types of SCSI drives. Each drive requires a special type of host adapter to work with your machine. Old SCSI technology (SCSI-1, SCSI-2, FAST 20) isn't really used anymore. There are several types of SCSI technologies in common use:
- Wide Ultra SCSI
- Wide Ultra 2 SCSI
- Ultra 160 SCSI (abbreviated U160)
- Ultra 320 SCSI (abbreviated U320)
Okay, so the numbers obviously signify the maximum speeds attainable by these drives. Here is the fact of the matter: the figures the manufacturer publishes are maximum values you may see if you live on Mars. Just like everything else in the computer world, you're lucky to get half of what they claim. With that being said, I'll explain the speeds of these technologies.
- Wide Ultra = 40 MB/sec
- Wide Ultra2 = 80 MB/sec
- Wide Ultra3 (Ultra 160) = 160 MB/sec
- Ultra 320 (sometimes called Ultra4) = 320 MB/sec
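To put those numbers in perspective, here is a rough sketch of how long it would take to move 1 GB (taken as 1024 MB) over each bus, first at the published maximum and then at half of it, in line with the "lucky to get half" rule of thumb. The 50% efficiency value is only that rule of thumb, not a measurement, and the names in the code are invented for this example.

```python
# Published maximum bus rates in MB/sec, taken from the list above.
BUS_RATES_MB_PER_SEC = {
    "Wide Ultra": 40,
    "Wide Ultra2": 80,
    "Ultra 160": 160,
    "Ultra 320": 320,
}


def transfer_seconds(size_mb: float, rate_mb_per_sec: float,
                     efficiency: float = 1.0) -> float:
    """Seconds needed to move size_mb at a fraction of the rated speed."""
    return size_mb / (rate_mb_per_sec * efficiency)


for name, rate in BUS_RATES_MB_PER_SEC.items():
    best = transfer_seconds(1024, rate)
    # Assumption: real throughput is about half of the published figure.
    realistic = transfer_seconds(1024, rate, efficiency=0.5)
    print(f"{name:<12} best case {best:5.1f} s, at half speed {realistic:5.1f} s")
```

Keep in mind that the bus rate is shared by every drive on the channel, and a single drive often cannot fill it on its own.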
Note that LVD stands for Low Voltage Differential, and is simply a way the drive communicates with the machine. There are drives labeled "U160 LVD" or "U320 LVD." Don't be confused by this term!
SCSI Connector Types
Standard IDE drives use a 40-pin female cable. Each IDE hard drive has the same universal connector for data, and the standard Molex power connector fits next to the data pins.
SCSI drives come in many different configurations, and the connector depends on the specs or design of the drive. There are several types of SCSI connectors in common use:
- Ultra-Wide SCSI-1 (50 pin)
- Ultra-Wide SCSI-3 (68 pin)
- SCA SCSI (80 pin)
The two most common drives used in servers today are SCSI-3 and SCA SCSI. SCA drives are used in high-end servers or servers that require hot-swappable capability.
SCA SCSI
SCA stands for Single Connector Attachment. A single 80-pin connector carries both data and power to the drive. This eliminates the problem of plugging and unplugging wires when a drive is added or removed. SCA drives are also used in SANs (Storage Area Networks) because they are easy to "swap" in and out. Most SCA drives have an MTBF (Mean Time Between Failures) rating of 1,200,000 hours. As for how they come up with that huge number, I have no idea.
Fiber (Fibre) Channel
Fiber channel, sometimes spelled fibre, is a special type of SCSI drive that uses a different signaling method to move data. These drives are not commonly used in servers. They work a bit differently than the rest of the drives, and, therefore, I am not going to go into detail about them. I just wanted to mention that they exist and are different from the rest of the SCSI family.
RAID
Originally, this term meant Redundant Array of Inexpensive Disks, but it has evolved to mean Redundant Array of Independent Disks. Like SCSI, there are several different types of RAID configurations. In servers, RAID is commonly used to protect the data. SCSI RAID is a great way to combine the speed of SCSI and the reliability of RAID. There are 3 commonly used RAID configurations: RAID 0, 1, and 5.
RAID level 0, also called striping, is not used on servers with a medium to high work load. It splits data across the drives in the array, so RAID 0 offers no redundancy: every drive is required for data to be read and written correctly. If one of the drives fails, the whole array is lost and the server will crash. RAID 0 is handy on servers with small loads or workstations where fast access is required. I run all my local servers in RAID 0 because I don't put a very large load on them, and I want a fast response time from them. It is also useful when you want to hold a large amount of data using two small drives. For instance, if you have (2) 100 GB hard drives, you could set them in RAID level 0 to create (1) 200 GB drive.
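A small sketch may help show why RAID 0 is both fast and fragile. It assumes the simplest possible layout, where the stripe unit is a single block and consecutive logical blocks alternate between drives; real controllers use larger, configurable stripe sizes. The function names are hypothetical.

```python
from typing import List, Tuple


def raid0_locate(logical_block: int, drive_count: int) -> Tuple[int, int]:
    """Map a logical block to (drive index, block number on that drive).

    Assumption: stripe unit of one block, round-robin across drives.
    """
    drive = logical_block % drive_count
    block_on_drive = logical_block // drive_count
    return drive, block_on_drive


def raid0_capacity_gb(drive_sizes_gb: List[int]) -> int:
    """Usable size: every drive contributes as much as the smallest one."""
    return min(drive_sizes_gb) * len(drive_sizes_gb)


# Consecutive blocks land on alternating drives, so both drives work at once.
for block in range(6):
    print(block, raid0_locate(block, drive_count=2))

print(raid0_capacity_gb([100, 100]))  # 200
```

Because every other block lives on the other drive, losing either drive leaves holes throughout every file, which is why there is nothing to recover from.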
RAID level 1, also called data mirroring, is usually used on a server with a more serious workload. The purpose of RAID 1 is to duplicate the data across two identical drives to provide a copy. This can also be done with more than 2 drives. If one drive goes out for any reason, the data requested is pulled from the second drive. This is also known as redundancy. While this does slow the access time down, it is a great way to feel comfortable about the safety of your data.
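RAID 5, also called parity RAID, requires a minimum of 3 drives. With 3 drives, each stripe holds data on two drives and parity for that data on the third, and the drive holding the parity rotates from stripe to stripe, so the parity is spread across all of the drives. If any one drive fails, its contents can be rebuilt from the other two. This is the best solution for data recovery in the server environment. However, it is typically expensive, and it does not perform well in "write-intensive" environments because every write also has to update parity.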
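Parity in RAID 5 is typically a bytewise XOR of the data blocks in a stripe, which is what makes rebuilding a lost drive possible. The sketch below shows the idea for a single stripe on 3 drives. It is an illustration only, with made-up names and tiny 4-byte blocks, and it leaves out how a real controller rotates parity between drives.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def raid5_usable_gb(drive_size_gb: int, drive_count: int) -> int:
    """Usable size: one drive's worth of space goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return drive_size_gb * (drive_count - 1)


# One stripe: two data blocks plus the parity block computed from them.
data_a = b"SCSI"
data_b = b"RAID"
parity = xor_blocks(data_a, data_b)

# Pretend the drive holding data_b died: rebuild it from what is left.
rebuilt_b = xor_blocks(data_a, parity)
print(rebuilt_b == data_b)        # True

print(raid5_usable_gb(100, 3))    # 200
```

The same XOR is also why writes cost more: changing one data block means the parity block for that stripe has to be recalculated and rewritten too.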