Techniques, Problems, and Solutions
• Have you had a USB drive suddenly stop working?
• Do you need to continuously back up data or plan to archive a
large amount of material?
by Rick Davis
These situations highlight some important issues in the field of data
storage. Finding the answers can require information from
several areas of technology, but understanding these topics will help
you prepare for any serious storage need.
In the field of computers and technology, many items carry
the name storage, memory, or
something similar; it is data
storage that matters here. Data
storage holds the instructions that form a
computer program and the data that the
programs manipulate. Without this
kind of storage, computers would be
little more than basic calculators.
Just as there are a variety of terms to
label these types of storage, there exists
an equal variety of data storage types.
For purposes of archiving and backups,
data storage refers specifically to data
stored separately from the computer’s
motherboard (which would be the primary storage) or, in many cases, stored
outside the computer itself, with varying
degrees of portability. These include hard
disks (secondary storage), optical disks
and USB drives (off-line storage),
or a combination of these reached
over a network or some other link that is not
directly part of the computer itself.
1) and then eight of these bits comprise
one byte. Now for the sticky part.
One KB (kilobyte), as a computer
counts it, is 2^10 or 1,024 bytes.
However, the prefix kilo is defined as
10^3 or 1,000 units (which has caused a
variety of headaches). It’s important to
understand that every computer operating system imposes a “management
surcharge” in order to index and store
your data. That means, for example, that
for every 4,096 bytes of your data on
the disk, the OS will wrap it in 64 more
so that it can be quickly accessed, verified, etc. This causes confusion when
a 100 meg hard drive “formats down” to
about 98.4 megs. Chalk it up to the cost
of doing business on the disk!
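The arithmetic behind that "formats down" figure can be sketched in a few lines of Python. This is only an illustration: the 4,096 and 64 byte figures come straight from the example above, not from any particular file system, and the function names are made up for this sketch.

```python
# Binary vs. decimal kilobyte, as described in the text.
BINARY_KB = 2 ** 10    # 1,024 bytes, the computer's view
DECIMAL_KB = 10 ** 3   # 1,000 bytes, the SI prefix "kilo"

# Assumption: the article's example of 64 overhead bytes per 4,096 data bytes.
DATA_BYTES_PER_BLOCK = 4096
OVERHEAD_BYTES_PER_BLOCK = 64


def usable_fraction(data_bytes=DATA_BYTES_PER_BLOCK,
                    overhead_bytes=OVERHEAD_BYTES_PER_BLOCK):
    """Share of the raw disk space that is left for your own data."""
    return data_bytes / (data_bytes + overhead_bytes)


def formatted_capacity(raw_capacity):
    """Capacity left after the overhead, in the same unit as raw_capacity."""
    return raw_capacity * usable_fraction()


print(f"Binary KB is {BINARY_KB - DECIMAL_KB} bytes larger than decimal KB")
print(f"A 100 meg drive keeps about {formatted_capacity(100):.2f} megs")
```

With these figures the result lands just under 98.5 megs, in line with the "about 98.4" quoted above.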
Let’s get back to measuring the
size of things. Every new unit of measure is simply the old unit multiplied by
1,000. For example, 1,000 kilobytes
(KB) equals one megabyte (MB); 1,000
megabytes is one gigabyte (GB); and
from there, we move to terabytes (TB)
and (just recently) petabytes.
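As a rough sketch of that ladder of units, the following Python function keeps dividing by 1,000 until the number is small enough to read. The function name and the optional `step` parameter are assumptions added for illustration; passing `step=1024` gives the binary reading discussed earlier.

```python
# Unit names in ascending order, each 1,000 times the one before it.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]


def format_size(num_bytes, step=1000):
    """Express a byte count in the largest unit that keeps it below `step`."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at the last unit we know.
        if size < step or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= step


print(format_size(700_000_000))      # the capacity of a CD-ROM
print(format_size(4_700_000_000))    # the capacity of a DVD-ROM
```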
Equally significant to capacity is the
density of the data being stored. Density
is simply a measure of the amount of
data that can be stored in a specific
amount of storage medium. For example,
a CD-ROM can store 700 MB of data
while a DVD-ROM (which utilizes the
exact same physical space) can store 4.7
GB (4,700 MB) of data. Higher density is
better, but it comes at higher costs.
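To put a number on the CD versus DVD comparison, here is a small Python sketch. It uses only the two capacities quoted above; the function name is hypothetical.

```python
def density_ratio(capacity_a_mb, capacity_b_mb):
    """How many times more data medium A holds than medium B in the same space."""
    return capacity_a_mb / capacity_b_mb


CD_ROM_MB = 700      # capacity quoted in the article
DVD_ROM_MB = 4700    # 4.7 GB expressed in MB

ratio = density_ratio(DVD_ROM_MB, CD_ROM_MB)
print(f"A DVD-ROM holds about {ratio:.1f} times as much as a CD-ROM")
```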
er: latency (the access time to the stored
data) and throughput (the rate at which
the data can be transferred). Both
usually have separate values for
reading and for writing data, and their
initial values almost always differ
from their sustained, or average, values.
Latency is usually measured in
fractions of a second and is generally not a
concern (except in unusually demanding
applications). Throughput, on the other
hand, has become more significant.
When the common file size was under 1
MB, it was fine if the computer needed a
full second or two to transfer the data.
But now, with files reaching into the
tens of GB and beyond, even a small
difference in transfer rate can be the difference between hours and days when
large volumes are being transferred.
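A quick way to see this effect is to estimate transfer time as latency plus size divided by throughput. The Python sketch below does that; the two transfer rates are hypothetical values chosen only for illustration, not measurements of any real device.

```python
def transfer_time_seconds(size_bytes, throughput_bytes_per_s, latency_s=0.0):
    """Estimated time to move size_bytes at a sustained throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + size_bytes / throughput_bytes_per_s


ONE_TB = 10 ** 12                  # 1 TB in bytes, using the 1,000 multiplier

# Hypothetical sustained rates, in bytes per second.
SLOW_RATE = 20 * 10 ** 6           # 20 MB per second
FAST_RATE = 60 * 10 ** 6           # 60 MB per second

for label, rate in (("slow", SLOW_RATE), ("fast", FAST_RATE)):
    hours = transfer_time_seconds(ONE_TB, rate) / 3600
    print(f"{label}: about {hours:.1f} hours to move 1 TB")
```

Latency is included as a parameter but left at zero here, since for large transfers it is tiny next to the time spent moving the data.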
Density and Capacity
Probably the first thing anyone
looks for when considering data storage
is its capacity, or the amount of data that
can be stored. The density of the storage
medium not only plays a part in pricing,
but also in the feasibility of its use.
Storage capacity has travelled an
interesting road because computers are
binary in nature (either 0 or 1), even
though most of the world uses the SI
system (a base 10 system). It starts out
easily enough in binary with a bit (0 or
68 August 2007
There are a few things to keep in
mind when comparing the types of
storage. First, it is not easy to compare
data that is archived with data that
needs to be accessed on a regular basis.
And things will get more complicated
when storing data for multiple users.
Latency and Throughput
The two remaining factors to consid-
Hard disks are always needed for
the regular operation of a computer
system. And now it may be possible to
archive large amounts of data on these
as well. Individual drives are now
approaching the 1 TB mark, and smaller models are becoming inexpensive.