Tuesday, August 16, 2011


What Does Defrag Really Do?

We've all heard it from the "IT Guy" at one point in time: "You need to defrag your hard drive!"  But what does that really mean and what does it do?  This post answers the question of what defragmenting your hard drive really does.

What is Fragmentation?
To understand what defragmenting does, it stands to reason that we should first examine what fragmentation is and why it's a bad thing.  How much fragmentation can exist has a lot to do with the type of file system you have.  Generally speaking, there are two types of fragmentation: internal and external.  External fragmentation is unused space that exists between files, whereas internal fragmentation is unused space that exists within files (for example, the unused tail of a file's last block).
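To put a number on internal fragmentation, here is a minimal Python sketch that computes the wasted space in a file's last block.  The 4096-byte block size and the name slack_bytes are assumptions picked for illustration; real file systems use a range of block sizes.

```python
def slack_bytes(file_size, block_size=4096):
    """Unused bytes at the end of a file's last block (internal fragmentation).

    block_size=4096 is an assumed value for illustration only.
    """
    blocks_needed = -(-file_size // block_size)  # ceiling division
    return blocks_needed * block_size - file_size


print(slack_bytes(10_000))  # 3 blocks = 12288 bytes allocated, so 2288 wasted
print(slack_bytes(4096))    # fits exactly, so 0 wasted
```

Note that defragmenting does nothing about this kind of waste, since the space is already inside the file's own blocks.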

However, the kind of fragmentation that defragmenting fixes is data fragmentation.  The file systems that most people are familiar with (FAT, NTFS, ext3, UFS, etc.) do not have external fragmentation.  What they have done instead is trade external fragmentation for data fragmentation.

NTFS and FAT use central structures (the Master File Table and the File Allocation Table, respectively) that point to file data.  Because each file is chopped up into smaller file blocks, any unused gap between files is at least one block in size, so it can always be put to use.  Therefore, if a file expands or a new file is created without enough contiguous free space, the file's blocks can be stored non-contiguously.  External fragmentation problem solved!  But over time files can develop summer blocks: some are here, some are there. :)

The problem with summer blocks is that they make reads slow because the disk has to look all over the place just to read what is logically a single file.  And that's the problem with data fragmentation: it makes reads slow.  Both sequential reads and random-access reads become slower with increased data fragmentation due to the additional seeks that inevitably have to be done to read files.
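One rough way to see the cost is to count how many separate contiguous runs a file occupies, since each new run means roughly one more seek.  The sketch below is only an illustration: the block numbers are made up and count_extents is a hypothetical helper, not part of any real tool.

```python
def count_extents(blocks):
    """Count the contiguous runs in a file's list of block numbers."""
    if not blocks:
        return 0
    extents = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:  # a jump on disk starts a new run
            extents += 1
    return extents


contiguous = [10, 11, 12, 13, 14, 15]
fragmented = [10, 11, 40, 41, 7, 90]  # summer blocks

print(count_extents(contiguous))  # 1
print(count_extents(fragmented))  # 4
```

Both files are six blocks long, but the second one sends the disk head to four different places.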

FAT is arguably the biggest offender when it comes to data fragmentation.  The reason has to do with how space is allocated to files.  In FAT, files are allocated sequentially, one after another.  So when a file needs additional blocks, they usually cannot be allocated contiguously and the file must fragment.  Also, when files are deleted or shrink, holes are left behind, which further contributes to the problem through what is known as free space fragmentation.  NTFS improves on this with smarter cluster allocation that allows for some growth and contraction (for more background on NTFS, see my post on file systems), but it is still more susceptible to fragmentation than, say, ext3.
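Here is a toy simulation of that behavior in Python.  It is a simple "take the first free blocks" allocator on a 12-block disk, written to illustrate the idea; it is not FAT's actual on-disk logic, and the function names are made up for this example.

```python
def allocate(disk, name, count):
    """Give `name` the first `count` free blocks, scanning from the start."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise ValueError("disk full")
    for i in free[:count]:
        disk[i] = name


def delete(disk, name):
    """Free every block owned by `name`, leaving holes behind."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def blocks_of(disk, name):
    """List the block numbers owned by `name`."""
    return [i for i, owner in enumerate(disk) if owner == name]


disk = [None] * 12          # None marks a free block
allocate(disk, "A", 3)
allocate(disk, "B", 3)
allocate(disk, "C", 3)
allocate(disk, "A", 2)      # A grows, but B and C are in the way
delete(disk, "B")           # leaves a 3-block hole
allocate(disk, "D", 4)      # D does not fit in the hole

print(blocks_of(disk, "A"))  # [0, 1, 2, 9, 10]
print(blocks_of(disk, "D"))  # [3, 4, 5, 11]
```

File A fragments because it grew with neighbors packed right behind it, and file D fragments because the hole left by B was too small.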

Ext3 and OS X (HFS+)
Ext3 is commonly used with Linux operating systems.  It works differently from NTFS or FAT and is much less susceptible to data fragmentation.  Why?  It has to do with the positioning of files on the disk.  Instead of placing files close together, it scatters them across the disk to allow for growth and contraction.  The end result is that files rarely need to be fragmented because there is ample unallocated space between them.  Of course, this changes as the disk fills up.  But it does mean that in many cases, data fragmentation is not much of an issue with ext3.
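The idea of leaving room between files can be sketched with another toy allocator.  This one drops each new file into the middle of the largest free gap.  That placement rule is an assumption chosen to keep the example short; it is not how ext3 actually picks locations, and the function names are hypothetical.

```python
def free_runs(disk):
    """Return (start, length) for each run of free blocks."""
    runs, start = [], None
    for i, owner in enumerate(disk + ["end"]):  # sentinel closes the last run
        if owner is None and start is None:
            start = i
        elif owner is not None and start is not None:
            runs.append((start, i - start))
            start = None
    return runs


def create(disk, name, count):
    """Place a new file in the middle of the largest free run."""
    start, length = max(free_runs(disk), key=lambda run: run[1])
    if length < count:
        raise ValueError("no free run large enough")
    first = start + (length - count) // 2
    for i in range(first, first + count):
        disk[i] = name


def grow(disk, name, count):
    """Extend a file in place if the blocks right after it are free."""
    last = max(i for i, owner in enumerate(disk) if owner == name)
    wanted = range(last + 1, last + 1 + count)
    if wanted.stop <= len(disk) and all(disk[i] is None for i in wanted):
        for i in wanted:
            disk[i] = name
        return True
    return False


disk = [None] * 24
for name in ("A", "B", "C"):
    create(disk, name, 3)

print(grow(disk, "A", 2))  # True: A had free space right behind it
print([i for i, owner in enumerate(disk) if owner == "A"])  # [10, 11, 12, 13, 14]
```

Because the files start out spread apart, A can grow without fragmenting.  On a nearly full disk the gaps shrink and grow would start returning False, which is the point where fragmentation creeps back in.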

OS X tries to eliminate file system fragmentation by rewriting fragmented files so that they are contiguous.  It is quite effective at reducing data fragmentation.  However, it leaves another problem: free space fragmentation.  This only becomes a problem in HFS+ when the disk fills up and there is not enough room for contiguous file allocation (similar to how ext3 can become fragmented).

The Bottom Line
All of the file systems listed in this post can suffer from (data) fragmentation.  Because FAT and NTFS allocate files sequentially on the disk, they will suffer the most fragmentation (FAT more so than NTFS).  Ext3 and HFS+ actively manage fragmentation, and although they can still become fragmented, they are less likely to do so.

So, what does defrag do?  It undoes the fragmentation on a drive by rearranging file blocks so that files are allocated contiguously for faster access.
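As a rough illustration, here is what that rearranging looks like on a toy disk in Python.  This is a simplified sketch, not how any real defragmenter works: it builds a fresh layout instead of moving blocks in place, and it only tracks which file owns each block, whereas a real tool must also keep each file's blocks in their logical order.

```python
def defragment(disk):
    """Return a layout with each file contiguous and free space at the end."""
    order = []  # file names in order of first appearance on disk
    for owner in disk:
        if owner is not None and owner not in order:
            order.append(owner)

    compacted = []
    for name in order:
        compacted.extend([name] * disk.count(name))

    return compacted + [None] * (len(disk) - len(compacted))


before = ["A", "B", None, "A", "B", None, "A", None]
after = defragment(before)

print(after)  # ['A', 'A', 'A', 'B', 'B', None, None, None]
```

After the pass, every file sits in one contiguous run and the free space is one contiguous run too, so free space fragmentation is cleaned up at the same time.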
