Alasir Enterprises
Main Page >  Misc >  MicroHouse PC Hardware Library Volume I: Hard Drives  


Today, PC operating systems use four common file systems:

  FAT (File Allocation Table). The standard file system used by DOS, Windows 95 (Non-OSR2 release), OS/2, and Windows NT. FAT partitions support file names of 11 characters maximum (8+3 character extension) under DOS, and 255 characters under Windows 95 or NT 4.0 or later versions. Under the standard FAT system, 12-bit or 16-bit numbers are used to identify allocation units, resulting in a maximum volume size of 2GB.
  FAT32 (File Allocation Table, 32-bit). An optional file system used by Windows 95 OSR2 (also called OEM Service Release 2 or Windows 95B) or later versions. Under FAT32, file allocation units are stored as 32-bit numbers, allowing for a single volume of 2TB or 2,048GB in size. FAT32 support will likely be added to Windows NT in the future.
  HPFS (High Performance File System). A file system that’s accessible only under OS/2 and Windows NT 3.51 or earlier. DOS applications running under OS/2 or Windows NT, or via a network, can access files in HPFS partitions, but straight DOS cannot. File names can be 256 characters long, and volume size is limited to 8GB.
  NTFS (Windows NT File System). A UNIX-like file system that’s accessible only under Windows NT. DOS cannot access these partitions, but DOS applications running under Windows NT or accessing a Windows NT volume from the network can. File names can be up to 255 characters long, and volumes can be far larger than the 2GB limit of standard FAT.

Of these four file systems, the FAT file system is still by far the most popular (and recommended). The main problem with the original 16-bit FAT file system is that disk space is used in groups of sectors called allocation units or clusters. Because the total number of clusters is limited to 65,536 (the most that can be represented with a 16-bit number), larger drives require that the disk be broken into larger clusters. These larger clusters cause disk space to be used inefficiently. FAT32 solves this problem by allowing the disk to be broken up into over 4 billion clusters, so the cluster sizes can be kept smaller. Most FAT32 and NTFS volumes use 4KB clusters.
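The relationship between the cluster-count limit and the cluster size can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the 65,536 figure is the one quoted above, and clusters are assumed to be a power-of-two multiple of a 512-byte sector. Real formatting tools may choose a larger cluster than this minimum.

```python
SECTOR_BYTES = 512            # assumed sector size
FAT16_MAX_CLUSTERS = 2 ** 16  # 65,536, the limit quoted in the text
MB = 1024 * 1024


def min_cluster_bytes(volume_bytes, max_clusters):
    """Smallest power-of-two cluster size (in bytes) that keeps the
    number of clusters on the volume at or below max_clusters."""
    cluster = SECTOR_BYTES
    while volume_bytes > cluster * max_clusters:
        cluster *= 2
    return cluster


for size_mb in (256, 512, 1024, 2048):
    kb = min_cluster_bytes(size_mb * MB, FAT16_MAX_CLUSTERS) // 1024
    print(f"{size_mb}MB volume -> {kb}KB clusters")
```

Each doubling of the volume size forces a doubling of the cluster size once the 16-bit limit is reached, which is why a 2GB FAT16 volume ends up with 32KB clusters.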

The term cluster was changed to allocation unit in DOS 4 and later versions. The newer term is appropriate because a single cluster is the smallest unit of the disk that DOS can allocate when it writes a file. A cluster is equal to one or more sectors, and although a cluster can be a single sector in some cases (specifically 1.2MB and 1.44MB floppies), it is usually more than one. Having more than one sector per cluster reduces the size and processing overhead of the FAT and enables DOS to run faster because it has fewer individual units of the disk to manage. The tradeoff is in wasted disk space. Because DOS and Windows can manage space only in full cluster units, every file consumes space on the disk in increments of one cluster.
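To make the whole-cluster rule concrete, here is a minimal Python sketch. The function names are hypothetical and the sizes are chosen only for illustration; it rounds a file's size up to the next full cluster and reports the unused remainder.

```python
def allocated_bytes(file_bytes, cluster_bytes):
    """Disk space a file occupies when space is handed out in whole clusters."""
    clusters = -(-file_bytes // cluster_bytes)  # integer ceiling division
    return clusters * cluster_bytes


def slack_bytes(file_bytes, cluster_bytes):
    """Unused space between the end of the file and the end of its last cluster."""
    return allocated_bytes(file_bytes, cluster_bytes) - file_bytes


# A 1,000-byte file on volumes with 32KB and 4KB clusters.
for cluster in (32 * 1024, 4 * 1024):
    print(cluster, allocated_bytes(1000, cluster), slack_bytes(1000, cluster))
```

With 32KB clusters the 1,000-byte file occupies 32,768 bytes, while with 4KB clusters it occupies only 4,096 bytes.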

Smaller clusters generate less slack (space wasted between the actual end of each file and the end of the cluster). With larger clusters, the wasted space grows larger. For hard disks, the cluster size varies with the size of the partition. Table 3.5 shows the default cluster sizes FDISK selects for a particular partition volume size.

Table 3.5  Default Cluster Sizes.

Hard Disk Partition Size    Cluster (Allocation Unit) Size        FAT Type

0 – 15MB                    8 sectors or 4,096 bytes (4KB)        12-bit
16 – 128MB                  4 sectors or 2,048 bytes (2KB)        16-bit
129 – 256MB                 8 sectors or 4,096 bytes (4KB)        16-bit
257 – 512MB                 16 sectors or 8,192 bytes (8KB)       16-bit
513 – 1,024MB               32 sectors or 16,384 bytes (16KB)     16-bit
1,025 – 2,048MB             64 sectors or 32,768 bytes (32KB)     16-bit
0 – 260MB                   1 sector or 512 bytes                 32-bit
260MB – 8GB                 8 sectors or 4,096 bytes (4KB)        32-bit
8 – 16GB                    16 sectors or 8,192 bytes (8KB)       32-bit
16 – 32GB                   32 sectors or 16,384 bytes (16KB)     32-bit
32 – 2,048GB                64 sectors or 32,768 bytes (32KB)     32-bit
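The table can also be expressed as a small lookup, sketched here in Python. The names are hypothetical, and two details are assumptions rather than something the table states: each row's upper limit is treated as inclusive, and a size that falls between two rows is assigned to the next row up.

```python
SECTOR_BYTES = 512  # assumed sector size

# Rows transcribed from Table 3.5:
# (largest partition size in MB, sectors per cluster)
FAT12_16_ROWS = [(15, 8), (128, 4), (256, 8), (512, 16), (1024, 32), (2048, 64)]
FAT32_ROWS = [
    (260, 1),
    (8 * 1024, 8),
    (16 * 1024, 16),
    (32 * 1024, 32),
    (2048 * 1024, 64),
]


def default_cluster_bytes(partition_mb, fat32=False):
    """Default cluster size in bytes for a partition of the given size."""
    rows = FAT32_ROWS if fat32 else FAT12_16_ROWS
    for upper_mb, sectors in rows:
        if partition_mb <= upper_mb:
            return sectors * SECTOR_BYTES
    raise ValueError("partition size is outside the range covered by the table")


print(default_cluster_bytes(2000))              # 12-bit/16-bit FAT rows
print(default_cluster_bytes(2000, fat32=True))  # 32-bit FAT rows
```

For a 2,000MB partition the lookup gives 32,768 bytes under FAT16 and 4,096 bytes under FAT32.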

In most cases, these cluster sizes, which are selected by the FORMAT command, are the minimum possible for a given partition size. Under FAT16, therefore, 8KB clusters are the smallest possible for a partition larger than 256MB. Note that FDISK creates a FAT using 12-bit numbers if the partition is smaller than 16MB, while all other FATs are created using 16-bit numbers, unless Large Disk Support is specifically enabled in Windows 95 OSR2 or later.

The effect of the larger cluster sizes on larger disk partitions can be substantial. There can be a significant amount of slack space in the leftover portions of larger clusters. On average the amount of slack space for a file is one half of the space of the last cluster the file uses. To calculate an estimate of slack space on an entire drive use the following formula:

   slack space = #_files * cluster_size / 2

A drive partition of over 1GB and up to 2GB using FAT16 (thus 32KB clusters) containing 10,000 files wastes about 16KB per file or 160,000KB (160MB) total [10000*32KB/2]. If you were to repartition the drive into two separate partitions of less than or equal to 1GB each, then the cluster size would be cut in half, as would the total wasted slack space. You would therefore gain 80MB of disk space. The tradeoff is that managing multiple partitions is not as convenient as a single large partition. The only way you can control cluster or allocation unit sizing using FAT16 is by changing the sizes of the partitions.

If you were to reformat the drive using FAT32, the wasted space would drop to only 2KB per file or about 20MB total! In other words, by converting to FAT32, you would end up with approximately 140MB more disk space free in this example.
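The three scenarios in this example can be checked with a short Python sketch of the slack-space estimate. The function name is hypothetical, and the estimate rests on the same assumption as the formula: on average, half of each file's last cluster is wasted.

```python
def estimated_slack_kb(num_files, cluster_kb):
    """Estimated total slack in KB: number of files * cluster size / 2."""
    return num_files * cluster_kb / 2


files = 10_000
scenarios = {
    "one FAT16 partition, 32KB clusters": 32,
    "two FAT16 partitions, 16KB clusters": 16,
    "FAT32, 4KB clusters": 4,
}

for label, cluster_kb in scenarios.items():
    print(f"{label}: {estimated_slack_kb(files, cluster_kb):,.0f}KB wasted")
```

The estimates come to 160,000KB, 80,000KB, and 20,000KB respectively, matching the roughly 160MB, 80MB, and 20MB figures used in the text.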

NTFS, HPFS, and FAT32 all dramatically reduce the slack space but also increase file management overhead because many more allocation units must be managed.

Despite the problem with slack space, the basic FAT file system is still often the most recommended for compatibility reasons. All the operating systems can access FAT volumes, and the file structures and data-recovery procedures are well known. Also note that data recovery can be difficult to impossible under the HPFS and NTFS systems; for those systems, good backups are imperative.
