Data Loss and the Importance of Backups

gdon_2003 · May 21, 2020, 4:37am

When I logged in today I saw the system message that there was a server failure and then a restore failure, so 8 days of data were lost. I used to go to people’s houses and fix Apple computers. You don’t know how many people I made cry when I had to replace their hard drive. I would restore the OS, but customer data, i.e. backups, were the customer’s responsibility. Hardly ever did they have a good backup.

Computers can be lost, destroyed or stolen. Hard drives can crash, never to be resurrected. The moral is to have a good backup. That means opening up your wallet and spending some money. You have a lot of money invested as well as time. So be smart: back up your data and make sure you can do a bare metal restore of your OS. Bare metal restore means a way to restore your operating system onto a blank or replacement drive.

So are you backed up? If not, do something about it before you are sorry and crying over your Shapeoko about your lost projects.

Personally I use Norton 360 for backup to two USB drives (two backups). There are cloud backup services over the internet for a very cheap rate. But with Norton 360 and the online backup services you still need to restore an OS before your data can be restored. Where is the CD/DVD/flash drive that came with your PC, or the utility to make a restoration CD/DVD/flash drive? These OS restore disks must be bootable, so take a few minutes to figure out whether you will be annoyed or crying. Being annoyed at the wasted time to restore is better than crying over everything lost.

NickT · May 21, 2020, 10:39am

This concept can never be overstated. I’ve seen it myself time and time again in corporate environments. Test your backups as well; how else do you know they work?

CrookedWoodTex · May 21, 2020, 11:29am

Well, that’s something I’ve heard well into the '80s, and it never ceases to bring out the incredulity.

There’s no way to test a whole backup in the real world, and no way to know what key file is not going to restore (for whatever reason.) We can talk/debate that forever, but I won’t. The statement is never supported by a process, but there is the theory.

Yeah, backup. Backup in different ways, because one way may not work.

I use “Backblaze” on one computer, and “Carbonite” on two others. The difference being that the one computer has large external hard drives. Each computer also has “Time Machine” (or equivalent) running to an external hard drive.

I don’t know which one will be the solution to a problem, but hopefully I won’t suffer a “server” problem that has caught C3D with their pants down.

wb9tpg · May 21, 2020, 11:33am

If you’re looking for a more robust solution I’d put the server in the cloud with Amazon or Azure. It does not make much sense to run one yourselves anymore. I was an IT Architect before I retired and picked up a CNC addiction (terminal condition I think).

stutaylo · May 21, 2020, 11:41am

After being burned by having everything on one hard drive that failed, I now have everything in 4 places:
Google Drive (cloud)
Dropbox (cloud)
External Hard Drive (physical)
Laptop internal SSD (physical solid)

Sounds over the top… But I have photos + Video of close friends and family that have passed, files that I have spent Hours upon Hours creating, Priceless family Memoirs… well worth the cost of backing up in multiple places

Moded1952 · May 21, 2020, 5:03pm

There’s no way to test a whole backup in the real world

I think that’s hyperbolic to the point of being flat out false.

You test a whole backup by spinning up a new server (e.g. on a cloud provider like Amazon AWS, Microsoft Azure or Google Cloud Platform) and restoring the backup to it. Even if you need a stupidly large server or a set of large servers, you can rent them for a few hours, with high-speed SSDs for a few dollars.

Do this every couple of months and you have reasonable certainty that your process is functional.

and no way to know what key file is not going to restore (for whatever reason.)

This sounds like a backup strategy based on outdated technology. If your backups are properly hashed and stored somewhere that guarantees data integrity (e.g. a file server using ZFS, BTRFS or ReFS, or any cloud provider), you can absolutely know that your files will restore properly.
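
For anyone not on a checksumming filesystem, the same idea can be approximated by hand. This is a minimal sketch, not what any particular backup tool does: it records a SHA-256 digest per file when the backup is made and re-checks the digests later. The function names (`sha256_of`, `build_manifest`, `verify`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose contents changed."""
    problems = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(rel_path)
    return problems
```

Build the manifest from the source data, keep it alongside the backup, and run `verify` against the backup copy (or a test restore) whenever you want to check it. An empty list means every recorded file is present and unchanged.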

Regarding general backup strategies, the common recommendation I’ve heard is the 3-2-1 strategy:

3 copies
2 different types of media (e.g. HDD, SSD/flash, optical, tape)
1 copy offsite (e.g. cloud, HDD at a friend’s house)
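
The three rules are simple enough to check mechanically. A rough sketch, assuming the original data counts as one of the three copies and that "cloud" is treated as its own media type (both are assumptions, and the `Copy` class and `check_3_2_1` function are hypothetical names):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    media: str      # e.g. "hdd", "ssd", "optical", "tape", "cloud"
    offsite: bool


def check_3_2_1(copies: list[Copy]) -> list[str]:
    """Return the rules a set of copies fails; an empty list means it passes."""
    failures = []
    if len(copies) < 3:
        failures.append("fewer than 3 copies")
    if len({c.media for c in copies}) < 2:
        failures.append("fewer than 2 media types")
    if not any(c.offsite for c in copies):
        failures.append("no offsite copy")
    return failures


# The four-location setup described earlier in the thread.
four_places = [
    Copy("Google Drive", "cloud", offsite=True),
    Copy("Dropbox", "cloud", offsite=True),
    Copy("External hard drive", "hdd", offsite=False),
    Copy("Laptop internal SSD", "ssd", offsite=False),
]
print(check_3_2_1(four_places))  # prints []
```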

Personally I store all my data on a NAS based on the ZFS filesystem and have a second NAS running ZFS for backup (which I’m aware ignores the strategy I just recommended, I don’t have enough money for a second form of media given the scale of the data I store).

CrookedWoodTex · May 21, 2020, 6:49pm

Ok, quite a bit hypocritical, and now you’ve come back to the real world.

Moded1952 · May 21, 2020, 7:18pm

I’m bored so in the mood to rant a bit.

Ok, quite a bit hypocritical

Not at all. I fully stand by my recommendation as a generally good rule to follow for the audience I’m recommending it to (normal people who don’t have crazy amounts of data).

For most people this shouldn’t be a problem. Simply dump your data to another computer, a flash drive and Google Drive/Dropbox every month. Bam. Done.

I however am not part of the audience I’m recommending this to. I have 70TB of personal data and storing it remotely would cost ~$350/month so it’s simply infeasible.

There’s also only a single alternative medium that makes the slightest degree of sense and that is tape. Unfortunately tape is also extremely costly. The drives alone cost ~$3k, which is enough to buy a third NAS.

and now you’ve come back to the real world

FWIW, despite being in the “real world”, I’m able to reliably test and restore my backups.

Thanks to using ZFS, I can replicate my entire datastore to another computer over the network with a single command. There are strong integrity guarantees that render it essentially impossible for the backup to be received and written in an incorrect state.

Once the backup is written by the receiving computer, it takes a single command for the computer to read and verify every byte of data that it’s storing.
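
The exact commands aren't given here, but the usual ZFS tools for these two steps are `zfs send` piped into `zfs receive`, and `zpool scrub`. The sketch below only builds the command lines so they can be reviewed before anything is run. The pool, snapshot and host names are placeholders, and it assumes the snapshot already exists (e.g. from `zfs snapshot -r`) and that SSH access to the backup machine is set up.

```python
import shlex


def replicate_command(source: str, snapshot: str, host: str, target: str) -> str:
    """Shell pipeline that sends a recursive snapshot stream to another machine."""
    send = ["zfs", "send", "-R", f"{source}@{snapshot}"]
    receive = ["ssh", host, "zfs", "receive", target]
    return f"{shlex.join(send)} | {shlex.join(receive)}"


def scrub_command(pool: str) -> list[str]:
    """Command that makes ZFS read and checksum-verify every block in the pool."""
    return ["zpool", "scrub", pool]


# Placeholder names, not taken from the setup described in this post.
print(replicate_command("tank", "2020-05-21", "backup-nas", "backup/tank"))
print(shlex.join(scrub_command("backup")))
```

Depending on the state of the target dataset, `zfs receive` may need extra flags, so check the man pages for your ZFS version before running either line.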

The restore process is basically the inverse of the backup process: a single command that streams everything over the network. I have a 10GbE NIC so in theory, I can restore the entire 70TB backup, guaranteed to be byte-for-byte correct, in ~15h (in practice it takes longer, as different parts of the backup have different I/O characteristics).
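
The ~15h figure is just size divided by line rate. A quick check, assuming decimal terabytes and a link that sustains its full 10 Gbit/s with no protocol overhead (which is the "in theory" part):

```python
def transfer_hours(size_tb: float, link_gbit_per_s: float) -> float:
    """Hours to move size_tb decimal terabytes at the link's full line rate."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


print(f"{transfer_hours(70, 10):.1f} h")  # prints 15.6 h
```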

Plus:

- The backup is mounted, so I can access all of the files within instantly. No need for extracting or selectively restoring particular files with some clunky tool.
- I have point-in-time snapshots going back ~6 months:
  - I can restore the entire filesystem to its state at any of those points.
  - I can browse the entire filesystem as it was at any of those points and retrieve any subset of files.

As for software, most of the important stuff is in Docker containers. I can restore any application from a backup by simply pointing Docker at the backup directory.

Also FWIW, Discourse (the software Carbide 3D uses for this forum) has built-in backup functionality, which should be fully functional.

CrookedWoodTex · May 21, 2020, 8:21pm

However, at the same time, I think you’ve just proven my point.

Moded1952 · May 21, 2020, 8:26pm

If the point was “Yeah, backup. Backup in different ways, because one way may not work.”, I absolutely agree.

If the point was that a single backup can’t be reliably tested and maintained, I disagree.

robgrz · May 21, 2020, 8:43pm

I agree completely. I keep multiple computers fully up and running with everything I need. (Mostly a byproduct of needing multiple computers for software testing)

Big USB hard drives are cheap enough that I image my primary laptop to a USB drive every month and throw it in a pile at home.

CrookedWoodTex · May 21, 2020, 8:53pm

“There ya go again.” [Ronald Reagan] You forgot to quote the part about “real world” where most of us other folks live that barely have 15M service and few would have 3 or 4 computers laying about for anything to do with a CNC.

Same old speech; new faces/voices. I’ve been hearing it for 40 years or so.

The real point is that we depend on these backup services to do what they say they’ll do. We pay them to ensure that the backups are reliable. That’s about all we can do in our real world. YMMV.

Moded1952 · May 21, 2020, 9:42pm

barely have 15M service

Then drop a flash drive or external HDD at a friend’s place, a relative’s place or a drawer at your workplace. A remote backup doesn’t have to be done over the internet, it just has to be somewhere other than your house (or wherever the machine you’re backing up is).

and few would have 3 or 4 computers laying about

Then buy an external HDD instead. It costs $60 to buy a 2TB HDD from Amazon. That’s likely big enough to back up every computer in your house.

Hell, if the amount of important data you have is small, buy a spindle of Blu-Rays. You can get a spindle of 25 25GB BD-R’s for about $10.
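
To put the two options side by side, here is the cost per terabyte using only the prices quoted above (decimal units; the function name is just for illustration):

```python
def cost_per_tb(price_usd: float, capacity_gb: float) -> float:
    """Price per decimal terabyte of raw capacity."""
    return price_usd / (capacity_gb / 1000)


hdd = cost_per_tb(60, 2000)      # one 2TB external HDD at $60
bd_r = cost_per_tb(10, 25 * 25)  # spindle of 25 discs, 25GB each, at $10

print(f"HDD:  ${hdd:.0f}/TB")   # prints HDD:  $30/TB
print(f"BD-R: ${bd_r:.0f}/TB")  # prints BD-R: $16/TB
```

The spindle only holds 625GB in total, so it wins on price per terabyte but only suits small amounts of data.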

The real point is that we depend on these backup services to do what they say they’ll do. We pay them to ensure that the backups are reliable.

If you’re relying on these services alone, I think you misunderstand what it is you’re paying for. You’re not paying them to ensure that backups are reliable, you’re paying them to store your data and return it to you on a best-effort basis. They offer no guarantees and accept no liability. If they fail you, your recourse is at best a refund. If all your data is lost, you’re entitled to maybe a couple hundred in compensation. Is that enough to make up for the loss of your data?

Don’t believe me? Look at the terms. What happens when they lose all your data? Spoiler: nothing. Even if you get lawyers involved, the terms remove all liability. “But what about their reputation!?” you say? Have a Google around. These companies have failed people in the past. They’re still afloat.

Plus, if your internet is as slow as you say, it’s going to take days to weeks to restore your backup if something goes wrong. You can’t even properly test it.
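
A rough estimate of what that looks like, reading "15M" as 15 Mbit/s of download and assuming the link is fully used around the clock. The backup sizes are hypothetical examples:

```python
def restore_days(size_gb: float, link_mbit_per_s: float) -> float:
    """Days to download size_gb decimal gigabytes at a constant link speed."""
    bits = size_gb * 1e9 * 8
    return bits / (link_mbit_per_s * 1e6) / 86400


for size_gb in (500, 1000, 2000):
    print(f"{size_gb} GB: {restore_days(size_gb, 15):.1f} days")
```

At that speed a 1TB restore works out to roughly six days of continuous downloading, before any throttling or other household traffic.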

An untested strategy isn’t a backup strategy at all. You have no idea if the restore will work. You’re essentially just praying that everything goes as you expect it to.

A service like Backblaze can absolutely be part of a backup strategy but it’s definitely not a backup strategy on its own, especially if you’re not in a position to restore the backup.

That’s about all we can do in our real world.

For the price of a year of Backblaze, you can buy a 2TB external HDD from Amazon, likely more than enough to back up all your computers.

Buy two of them. Keep one at home and back up to it with backup software (which your computer likely has built-in) every few weeks. Leave another one with a friend or one of the other places I listed. Swap the two HDDs every few weeks/months.

You can even test the backup. Find a technical friend with a spare hard drive (or buy one, again, less than $100, you can use it as a third backup drive), swap it for the one in the computer you’re backing up and try restoring to it.

Also a big plus: these backups are fast and close. If your house burns down, all you have to do is go to your friend/relative/workplace and pick up the HDD. You have all your data mere minutes/hours after your main computer was destroyed.

There’s a lot more you can do than just pray that Backblaze will work.

gdon_2003 · May 22, 2020, 4:14am

I did not mean to stir up a hornet’s nest. My point is you should be backing up your data. I have seen far too many people that did not back up anything. I worked for Sun Microsystems and then Oracle after they bought us. I worked in some of the biggest data centers in the world. Even people that spend millions on servers and backup services have failures and failures to restore.

So just think about what you are doing and prepare for the worst. It is ok to hope for the best but that often leads to data loss. As some have stated, the USB-C 2 TB drives are dirt cheap, and even an unproven backup strategy is better than no strategy.

To test backups done with applications, make a new user on your system and restore your data to that user. You can see your data and prove your restore works. Most computers made in the last 5 years have over 1 TB drives. Now for normal folks that is a lot of storage, and you may not have that much data, so you have room to test a restore to your own hard drive. After successfully restoring you can delete the data and the user, and you have made sure your backup and restore work.
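
Eyeballing the restored files works, but the comparison can also be automated. A minimal sketch using Python's standard `filecmp` module, assuming the original data and the test restore are both reachable as ordinary folders (the function name is made up):

```python
import filecmp
from pathlib import Path


def restore_differences(original: Path, restored: Path) -> list[str]:
    """List relative paths that are missing from, or differ in, the restored tree."""
    problems = []

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        # Files or folders present in the original but absent from the restore.
        problems.extend((prefix / name).as_posix() for name in cmp.left_only)
        # Re-compare common files byte for byte; dircmp on its own only
        # does a shallow check based on file size and modification time.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        problems.extend((prefix / name).as_posix() for name in mismatch + errors)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(original, restored), Path("."))
    return sorted(problems)
```

Call it with your real data folder and the restored copy in the test user's home folder. An empty list means everything in the original came back byte for byte.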

If you have a lot of data to back up then buy yourself more storage and protect your investment in that data.

Working for Sun/Oracle I used the ZFS file systems and they are robust but not infallible. So having a mirror or a RAID set helps protect you against single disk failure. For most people this is not practical, but if you think your data is valuable then prove it by investing in your data.

Just think about what you are doing and decide if your data is valuable or is expendable and go from there.

PhilG · May 22, 2020, 5:52am

I think it has all been said. I’ll tack on my philosophy for what it’s worth.

(1) Have at least 3 backups

(2) Have (at least) one of these an automated backup system (because humans are bad at doing things regularly)

(3) Have (at least) one of these a backup that is not in the same building

(4) Have one backup that is just plain filesystem - no compression, no de-dup, and (if you can get away with it) no encryption. Because it’s the simplest way of getting some/all things back should there be a backup problem

PhilG · May 22, 2020, 5:53am

Almost forgot one more. Back up your cloud files (looking at you Fusion 360) because you never know when they may be going away.

NickT · May 22, 2020, 7:50pm

I wasn’t going to continue this thread, but what the heck… For the “common” user this certainly does not apply, however… There ARE supported processes…

I work for one of the largest pharma companies on the planet. And we certainly do test our backups on a very regular basis. Now this does mean we hire a DR facility for a few weeks and build an entire mock-up of our environment: AD, DNS, WINS, backup system, whatever it takes… 220, 221. Thankfully I don’t take part in these anymore, but we would be at these facilities for more hours than I care to say to ensure everything works the way we said it would…

These days with virtual machines it’s much easier and we can fail over between our own data centers.
The cost of doing these tests is far outweighed by the cost of not doing them, being surprised when a system crashes beyond repair, and then finding out your DR strategy doesn’t work.

This topic is amusing in the sense that it is one of those topics that fires off fierce debate. My bag of popcorn is full.

system · June 20, 2020, 4:37am

This topic was automatically closed after 30 days. New replies are no longer allowed.