Introduction

We all store more and more of our lives in digital form: spreadsheets, résumés, wedding speeches, novels, tax information, schedules, and of course digital photographs and video. All of this data is easy to store, transmit, copy, and share, but how easy is it to get back?

All of this data can be a harsh reminder that computers are not without fault. For years, storage costs have been dropping while at the same time the amount of storage in any one computer has been increasing almost exponentially. We are at a point where a single hard drive can contain multiple terabytes of information, and with a single mishap, lose it all forever. Everyone knows someone who has had the misfortune of having a computer stop working and wanting their information back.

It’s always been possible to safeguard your data, but now it’s not only necessary thanks to the explosion of personal data, it’s also more affordable than ever. When you think of the costs of backing up your data, just remember what it would cost you if you were to ever lose it all. This guide will walk you through saving your data in multiple ways, with the end goal being to have a backup system that is simple, effective, and affordable. In this day and age, you really can have it all.

It’s prudent at this point to define what a backup is, because there are a lot of misconceptions out there which can cause much consternation when the unthinkable happens, and people who thought they were protected find out they were not.

Backups are simply duplicates of data which are archived, and which can be restored to a previous point in time. The key is the data must be duplicated, and you have to be able to go back to an earlier time. Anything that doesn’t meet both of those requirements is not a backup.

As an example, many people trust their data to network storage devices with RAID (Redundant Array of Independent Disks). Without going into the intricacies of the various forms of RAID, no Network Attached Storage (NAS) device of this kind is a backup on its own. RAID is designed to protect a system from a hard disk failure and nothing more. Depending on the RAID level, it either mirrors data across disks, or calculates parity from the data, which can be used to reconstruct whatever was stored on a failed disk. While RAID is an excellent mechanism to keep a system operational in the event of a disk failure, it is not a backup: if a file is changed or deleted, the change is immediately written to all disks, so there is no way to roll it back. RAID is excellent for use as a file share, and can even be used effectively as the target for backups, but important data kept on the array still requires a file backup system.
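To make the parity idea concrete, here is a minimal Python sketch of XOR parity, the calculation used by single-parity RAID levels such as RAID 5. The three "disks", their four-byte blocks, and the helper name xor_blocks are invented for illustration; a real array works on much larger stripes and rotates parity across the disks.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one block of the same stripe.
data_disks = [b"tax ", b"2013", b".xls"]
parity = xor_blocks(data_disks)  # stored on a fourth disk

# Disk 1 fails: XOR the surviving data blocks with the parity block.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
assert rebuilt == b"2013"

# A user overwrites the block on disk 1; the array updates parity right away.
data_disks[1] = b"OOPS"
parity = xor_blocks(data_disks)

# Rebuilding now returns the overwritten data, not the earlier version.
rebuilt_after_edit = xor_blocks([data_disks[0], data_disks[2], parity])
assert rebuilt_after_edit == b"OOPS"
```

The second half of the sketch is the important part for this guide: parity faithfully reconstructs whatever was last written, including a mistake, so it offers no way back to an earlier point in time.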

Another similar example is cloud storage. Properly configured, cloud storage can be a backup target, and some services can even perform proper backups, but the average person with the average Google Drive or OneDrive account can’t simply copy their files there and hope they are protected. As with RAID, it is more robust file storage than any single hard drive, but if you delete a file, or overwrite one with another, it can be difficult or impossible to go back to a previous version.

Both RAID and cloud storage suffer from the same problem: you can’t go back to an earlier time, and therefore neither is a true backup. True backups allow you to recover from practically any scenario: fire, flood, theft, equipment failure, or the inevitable user error. This guide will walk you through several methods of performing backups, starting with simple ones and moving up to elaborate systems that will truly protect your data. These methods work for home and business alike; only the type of equipment is likely to differ.

There is some common terminology used in backups that should be defined before we start discussing the intricacies of backups:

  • Archive Flag: A bit setting on all files which states whether or not the file has been modified since the last time the flag was cleared.
  • Full Backup: A backup of all files which resets the archive flag.
  • Differential Backup: A backup of all files with the archive flag set, but it does not clear the archive flag.
  • Incremental Backup: A backup of all files with the archive flag set which resets the archive flag.
  • Image or System Based Backup: A complete disk level backup which would allow you to image a machine back to a previous state.
  • Deduplication: A software algorithm which removes all duplicate file parts to reduce the amount of storage required.
  • Source Deduplication: Removing duplicate file information from files on the client end. This requires more CPU and memory usage on the client, but allows for a much smaller file size to be transferred to the backup target.
  • Target Deduplication: Removing duplicate file information from files on the target end. This saves client CPU and memory usage, and is used to reduce the amount of storage space required on the backup target.
  • Block Level: A backup or system process which accesses a sequence of bytes of data directly on the disk.
  • File Level: A backup or system process which accesses files by querying the Operating System for the entire file.
  • Versioning: A list of previous versions of a file or folder.
  • Recovery Point Objective (RPO): The amount of time since the last backup deemed safe to lose in a disaster scenario. For example, if you perform backups nightly, you can meet an RPO of about 24 hours, because the most you can lose is whatever was created since the previous night’s backup. Anything created in between backups is assumed to be recoverable through other methods, or an acceptable loss.
  • Recovery Time Objective (RTO): The amount of time deemed acceptable between the loss of data and the recovery of data. For home use, there’s really no RTO but many commercial companies will have this defined either with in-house IT or with a Service Level Agreement (SLA) to a support company.
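The difference between full, differential, and incremental backups is easiest to see by following the archive flag through a few nights. The Python sketch below is only a simulation under stated assumptions: the File class, the file names, and the run_week helper are hypothetical, and no real backup software or file system attribute is being called.

```python
from dataclasses import dataclass


@dataclass
class File:
    name: str
    archive: bool = True  # new or modified files have the flag set


def full_backup(files):
    """Copy every file, then clear the archive flag on all of them."""
    copied = [f.name for f in files]
    for f in files:
        f.archive = False
    return copied


def differential_backup(files):
    """Copy flagged files but leave the archive flag alone."""
    return [f.name for f in files if f.archive]


def incremental_backup(files):
    """Copy flagged files, then clear the archive flag on them."""
    copied = [f.name for f in files if f.archive]
    for f in files:
        f.archive = False
    return copied


def run_week(nightly):
    """Full backup on Sunday, then two nights using the given strategy."""
    files = [File("a.txt"), File("b.txt"), File("c.txt")]
    sunday = full_backup(files)
    files[0].archive = True  # a.txt is edited on Monday
    monday = nightly(files)
    files[1].archive = True  # b.txt is edited on Tuesday
    tuesday = nightly(files)
    return sunday, monday, tuesday


print(run_week(differential_backup))
# (['a.txt', 'b.txt', 'c.txt'], ['a.txt'], ['a.txt', 'b.txt'])
print(run_week(incremental_backup))
# (['a.txt', 'b.txt', 'c.txt'], ['a.txt'], ['b.txt'])
```

In this simulation the differential set grows each night because the flag is never cleared, so a restore needs only the full backup plus the latest differential. The incremental sets stay small, but a restore needs the full backup plus every incremental taken since.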
 

133 Comments


  • jimhsu - Thursday, May 22, 2014 - link

    Availability, capacity, cost: pick two. Sounds like (for a business) you need an enterprise-grade solution, and if you need next-day availability, CrashPlan won't be able to deliver that. CrashPlan does offer initial drive seeding and backup-to-door; however, those also take a week. If a single day of downtime is unacceptable, you probably need something in-house combined with professional services that offer overnight rush delivery -- and that's $$$.
  • dstarr3 - Wednesday, May 21, 2014 - link

    I took a much simpler approach. I have a hard drive which is just a clone of my entire computer, and I keep it in my desk at work. Once a week, I bring it home, run error checks, and do another clone onto the disk, take it back to work the next morning. I also have a local backup disk for files, a portable hard drive. The benefit is that one of my backups is off-site, and both of the backups are never plugged in during non-use, so there's no threat of power surges killing the drives. I'm only susceptible to fire or theft at this point, and that would have to happen to both my home and my work simultaneously to be a problem.
  • rooman - Wednesday, May 21, 2014 - link

    A drive at work is a good idea; an alternative (work isn't always an option) is to store the cloned drive in a safety deposit box which provides an extremely secure location. One probably wouldn't clone once a week, but once a quarter would protect against the worst case of total data loss.
  • dstarr3 - Wednesday, May 21, 2014 - link

    Yeah, I considered a safety deposit box, as well, but in some areas (like mine), it's bloody impossible to get one. haha
  • BeethovensCat - Saturday, May 24, 2014 - link

    I did the same - am using SyncToy between my internal data drives and FreeFileSync between my computer and two external HDDs. The external HDD is entirely encrypted with TrueCrypt. I have a couple of external HDDs that are copies of my data drives (leave Windows on C: alone). Then I just take a drive to work once a week or two. Daily syncs (why bother with a backup program, when one can use a sync program?) to an encrypted USB stick. Works well and with 2TB of HDD costing around $100 there is no excuse for not having a couple of those.
  • Kevin G - Wednesday, May 21, 2014 - link


    Overall an excellent article!

    Backups using Shadow Volumes should note some of its limitations: you'll need to have enough free disk space to store another copy of your largest file. For example, say you downloaded a 10 GB installer for a new game; you'd need to have another 10 GB of free disk space for Windows' Shadow volume to back it up. With the move to SSDs, this could be an issue in some cases.

    I do agree that RAID IS NOT A BACKUP but when backing up to a NAS, the NAS should be using RAID 1/5/6 etc. A paragraph on the introductory page does go into these points but I've always felt the need to discuss backup reliability in this context. It helps clear up potential questions like 'if RAID isn't a backup, why are you backing up to RAID storage?' The answer is in the same paragraph, as RAID protects against disk failure. Just in my experience, I typically need to hammer in the idea of 'what good is backing up to a hard drive if a hard drive dies?' as the case for RAID 1/5/6 on a NAS. This idea can be obtained from the context of the article but I've found this needs more emphasis in my experience.

    The issue of RAID disk failure leads into one topic that I've found missing: media reliability. The article mentions hard drives, USB sticks, optical and the cloud as targets for backup storage. (For consumer usage, I would say it is safe to omit tape but it still exists.) How long the media is stored and its ability to retain data over time does matter. This is more of a long term problem with USB and optical media as after several years, corruption can creep in. Hard drives of course can fail but typically they're in an active environment so that you'd know exactly when it failed. With RAID, it is possible to recover from failed media but once an optical disk rots or a USB flash stick is dead, the data on it is gone. This article does cover the media reliability of the cloud which is unique: you continually have to pay. Stop paying and you lose your backup data. There is one open question though with cloud backups as none have been around for a long time. Issues like outages are also possible with the cloud but so far many of the backup providers have been good in this regard.
  • ltcommanderdata - Wednesday, May 21, 2014 - link

    For the built-in backup options for Windows 7/8 and OS X, is there a way to limit the size of the backup without having to partition, so that multiple computers can back up to one drive without directly competing with each other for space?

    For Time Machine you mention that it'll automatically delay the scheduled backup if the backup location is unavailable. Does Windows 7/8 do the same? I'm thinking of laptops that are always out and about so hopefully they won't throw up distracting error prompts when the network store location is not available.
  • Brett Howse - Wednesday, May 21, 2014 - link

    I don't have any Windows 7 machines to test, so I can't answer that. Windows 8 has an offline cache though, which allows backups/restores when the device is disconnected:

    Advanced Settings in File History

    File History allows you to fine tune how it works including:
    Which target storage device is used
    How frequently files are checked and backed up
    How much space is used locally to cache backup versions of your files when the target backup device is disconnected
    How long backup files are retained
    Which folders in your libraries are excluded from backup
  • sepffuzzball - Wednesday, May 21, 2014 - link

    Have to say...I've been running Windows Server Essentials 2012 since I was sad about WHS going by the wayside, and I love it.

    I'm running it in a VM on my ESXi server, it backs up all my clients with no issues. Then the WSE backs that up to a different storage pool (Solaris/ZFS), and then that gets kicked off-site.

    Now I just need to find out a cheap solution to backup off-site the ~40TB worth of stuff on the file server (and then the upload speeds to actually back it up!).
  • coburn_c - Wednesday, May 21, 2014 - link

    I just lost 6TB to a failed RAID 5 array. Thank you Seagate/China. The RMA drives are Malaysian, so maybe that gives hope. Anyway, you can talk backups all you want but backing up 6TB is neither time nor cost economical.
