Curtis' Random Thoughts
Written by W. Curtis Preston
Tuesday, 22 January 2013 23:19
Stephen Manley published a blog post today called "Tape is Alive? Inconceivable!" To which I have to reply with a quote from Inigo Montoya: "You keep using that word. I do not think it means what you think it means." I say that because, for me, it's very conceivable that tape continues to play the role that it does in today's IT departments. Yes, its role is shrinking in the backup space, but it's far from "dead," which is what Stephen's blog post suggests it should be.
He makes several good points as to why tape should be dead by now. I like and respect Stephen very much, and I'd love to have this discussion over drinks at EMC World or VMworld sometime. I hope he and his employer see this post as a window into how people who don't live in the disk echo chamber think about tape.
Stephen makes a few good points about disk in his post. The first point is that the fastest way to recover a disk system is to have a replicated copy standing by ready to go. Change where you're mounting your primary data and you're up and running. He's right. He's also right about snapshots or CDP being the fastest way to recover from logical corruption, and the fastest way to do granular recovery of files or emails.
In my initial post on the LinkedIn discussion that started this whole thing, I make additional "pro-disk" points. First, I say that tape is very bad at what most of us use it for: receiving backups across a network -- especially incremental backups. I also mention that tape cannot be RAID-protected, where disk can be, and that disk enables deduplication, CDP, near-CDP, and replication -- all better ways to get your data offsite than handing a tape to a dude in a truck. I summarize with the statement that I believe disk is the best place for day-to-day backups.
Disk has all of the above going for it. But it doesn't have everything going for it, and that's why tape isn't dead yet -- nor will it be any time soon.
I do have an issue or two with the paragraph in Stephen's post called "Archival Recovery." First, there is no such thing. It may seem like semantics, but one does not recover from archives; one retrieves from archives. If one is using archive software to do one's archives, there is no "recover" or "restore" button in the GUI. There is only "retrieve." Stephen seems to be hinting at the fact that most people use their backups as archives -- a practice he and I agree is bad. Where we disagree is whether moving many-years-old backup data to disk solves anything. My opinion is that the problem is not that the customer has really old backups on tape. The problem is that they have really old backups. Doing a retrieval from backups is always going to be a really bad thing (regardless of the media you use) and could cost your company millions of dollars in fines and billions of dollars in lost lawsuits if you're unable to do it quickly enough. (I'll be making this point again later.)
Disk is the best thing for backups, but not everyone can afford the best. Even companies that fill their data centers with deduplicated disk and the like still tend to use tape somewhere -- mainly for cost reasons. They put the first 30-90 days on deduped disk, then they put the next six months on tape. Why? Because it's cheaper. If it wasn't cheaper, there would be no reason that they do this. (This is also the reason why EMC still sells tape libraries -- because people still want to buy them.)
Just to compare cost: at $35 per 1.5 TB tape, storing 20 PB on LTO-5 tapes costs about $467K with no compression, or about $233K with 2:1 compression. In contrast, the cheapest disk system I could find (the Promise VTrak 32 TB unit) would cost me over $12M to store that same amount of data. Even if I got a 20:1 dedupe ratio in software (which very few people get), it would still cost over $600K (plus the cost of the capacity-based dedupe license from my backup software company).
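If you want to check the math, here's a quick back-of-the-envelope sketch in Python. The tape figures come straight from the paragraph above; the per-unit disk price is my own assumption, reverse-engineered from the $12M total.

    # Media cost comparison for 20 PB of backups (2013 street prices).
    data_tb = 20 * 1000                       # 20 PB, using 1,000 TB/PB

    # Tape: LTO-5 at $35 per 1.5 TB cartridge.
    tapes_native = data_tb / 1.5              # no compression
    tapes_2to1 = data_tb / 3.0                # 2:1 compression
    print(f"LTO-5 native: {tapes_native:,.0f} tapes, ${tapes_native * 35:,.0f}")
    print(f"LTO-5 2:1:    {tapes_2to1:,.0f} tapes, ${tapes_2to1 * 35:,.0f}")

    # Disk: assumed ~$19,000 per 32 TB unit (a hypothetical figure).
    arrays = data_tb / 32
    disk_raw = arrays * 19_000
    print(f"Disk raw:     {arrays:,.0f} units, ${disk_raw:,.0f}")
    print(f"Disk 20:1 dedupe: ${disk_raw / 20:,.0f} plus the dedupe license")

Run it and you get roughly $467K/$233K for tape versus roughly $12M raw (or $600K deduped) for disk -- the numbers above.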
It's also the cheapest way to get data offsite and keep it there. Making another copy on tape at $.013/GB (current LTO-5 pricing) and paying ~$1/tape/month to Iron Mountain is much cheaper than buying another disk array (deduped or not) and replicating data to it. The disk array is much more expensive than a tape, and then you need to pay for bandwidth -- and you have to power the equipment providing that bandwidth and power the disks themselves. The power alone for that equipment will cost more than the Iron Mountain bill for the same amount of data -- and then you have the bill for the bandwidth itself.
Now let's talk about long-term archives. This is data stored for a long time that doesn't need to be in a library. It can go on a shelf and that'll be just fine. Therefore, the only cost for this data is the cost of the media and the cost of cooling/dehumidifying something that doesn't generate heat. I can put it on a tape and never touch it for 30 years, and it'll be fine (Yes, I'm serious; read the rest of the post). If I put it on disk, I'm going to need to buy a new disk every five years and copy it. So, even if the media were the same price (which it most certainly is not), the cost to store it on disk would be six times the cost of storing it on tape.
Never underestimate the bandwidth of a truck. 'Nuf said. Lousy latency, yes. But definitely unlimited bandwidth.
Integrity of Initial Write
LTO is two orders of magnitude better at writing bits than the enterprise-grade SATA disks that most data protection data is stored on. The undetectable bit error rate of enterprise SATA is 1:10^15, and LTO is 1:10^17. That's one undetectable error every 100 TB with SATA disk and one undetectable error every 10 PB with LTO. (If you need better than that, you can have one error every exabyte with the Oracle and IBM enterprise drives.) I would also argue that if one error every 10 PB is too much, you can make two copies -- at a cost that's still an order of magnitude less than doing it on disk. There's that cost argument again.
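The round numbers above are order-of-magnitude figures; here's the exact conversion from an error rate to "capacity per undetected error" if you want to check it:

    # Convert an undetectable-bit-error rate of 1 in 10^n bits into TB of
    # data read per expected undetected error (decimal TB).
    def tb_per_undetected_error(n):
        return (10 ** n) / 8 / 1e12           # bits -> bytes -> TB

    for name, n in [("enterprise SATA", 15), ("LTO", 17), ("Oracle/IBM", 19)]:
        print(f"{name}: one undetected error per ~{tb_per_undetected_error(n):,.0f} TB")

That works out to roughly 125 TB, 12.5 PB, and 1.25 EB, respectively.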
As I have previously written, tape is also much better than disk at holding onto data for periods longer than five years. This is due to the physics of how disks and tapes are made and operated. There is a formula (KuV/kBT) that I explain in a previous blog post: the bigger your magnetic grains are, the better, and the cooler your device is, the better. The resulting value of this formula gives you an understanding of how well the device will keep its bits in place over long periods of time and not suffer what is commonly called "bit rot." Disks use significantly smaller magnetic grains than tape, and disks run at very high operating temperatures, while tape is stored at ambient temperature. The result is that disk cannot be trusted to hold onto data for more than five years without suffering bit rot. If you're going to store data longer than five years on disk, you must move it around. And remember that every time you move it around, you're subject to the lower write integrity of disk.
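For reference, here is that ratio in standard notation. (The flip-rate law and the stability threshold below are standard magnetic-recording results I'm citing from memory; treat the threshold as a rule of thumb, not a spec.)

    Δ = K_u·V / (k_B·T)

where K_u is the anisotropy energy density of the medium, V is the grain volume, k_B is Boltzmann's constant, and T is the absolute temperature. The rate at which a grain spontaneously flips follows the Néel-Arrhenius law, r = f_0·e^(-Δ), where f_0 is an "attempt frequency" on the order of 10^9 Hz. Because the flip rate decays exponentially in Δ, bigger grains and cooler media hold their bits exponentially longer; Δ of roughly 60 or more is the figure often quoted for decade-scale retention.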
I know that those who are proponents of disk-based systems will say that because it's on disk you can scan it regularly. People who say that obviously don't know that you can do the same thing on tape. Any modern tape drive supports the SCSI verify command that will compare the checksums of the data stored on tape with the actual data. And modern tape libraries have now worked this into their system, automatically verifying tapes as they have time.
Only optical (i.e. non-magnetic) formats (e.g. BluRay, UDO) do a better job of holding onto data for decades. Unfortunately they're really expensive. Last I checked, UDO media was 75 times more expensive than tape.
Air Gap [Update: I added this a day after writing the initial post because I forgot to add it]
One thing tape can do that replicated disk systems cannot do is create a gap of air between the protected data and the final copy of its backup. Give the final tape copy to Iron Mountain and you create a barrier to someone destroying that backup maliciously. One bad thing about replicated backups is that a malicious sysadmin can delete the primary system, backup system, and replicated backup system with a well-written script. That's not possible with an air gap.
People that don't like tape also like to bring up device obsolescence. They say things like "you can't even get a device to read the tape you wrote 10 years ago." They're wrong. Even if you completely failed to plan, there is a huge market for older tape drives and you can find any tape drive used in the last 20-30 years on eBay if you have no other choice. (I know because I just did it.)
Second, if you're keeping tapes from twenty-year-old tape drives, you should be keeping the drives. Duh. And if those drives aren't working, there are companies that will repair them for you. No problem, easy peasy. Device obsolescence is a myth.
Suppose you have a misbehaving disk from many years ago. There are no disk repair companies. There are only data recovery companies that charge astronomical amounts of money to recover data from that drive.
Now consider what you do if you have a malfunctioning tape -- which is rare, because there's not much to malfunction. I have been able to "repair" all of the physically malfunctioning tapes I have ever encountered (which is only a few out of the hundreds of thousands of tapes I've handled). The physical structure of a modern tape spool is not that difficult to understand, take apart, and reassemble.
Now consider what happens when your old tape drive malfunctions, which is much more likely. You know what you do? Use a different drive! If you don't have another drive, you can just send the one that's malfunctioning to a repair shop that will cost you far less than what a data recovery company will cost you. If you're in a hurry, buy another one off eBay and have them rush it to you. Better yet, always have a spare drive.
This isn't really a disk-vs-tape issue, but I just had to comment on the customer that Stephen quoted in his blog post as saying, "I'm legally required to store data for 30 years, but I'm not required by law or business to ever recover it. That data is perfect for tape." That may be a statement that amuses someone who works for a disk company, but I find it both idiotic and irresponsible. If one is required by law to store data for 30 years, then one is required by law to be able to retrieve that data when asked for it. This could be a request from a government agency, or an electronic discovery request in a lawsuit. If you are unable to retrieve that data when asked, you run afoul of that agency and will be fined or worse. If you are unable to retrieve the data for an electronic discovery request in a lawsuit, you risk receiving an adverse inference instruction from the judge that could cost you the lawsuit. So whoever said that has no idea what he/she is talking about.
Think I'm exaggerating? Just ask Morgan Stanley, which up until the mid-2000s used its backups as archives. The SEC asked them for a bunch of emails, and their inability to retrieve those emails resulted in a $15M fine. They also had a little over 1,400 backup tapes, and they needed months to pull emails off of them to satisfy an electronic discovery request in a major lawsuit brought by Coleman Holdings in 2005. (They needed this time because they stored the data via backup software, not archive software.) The judge said "archive searches are quick and inexpensive. They do not cost 'hundreds of thousands of dollars' or 'take several months.'" (He obviously had never tried to retrieve emails off of backup tapes.) He issued an adverse inference instruction to the jury that said that this was a ploy by Morgan Stanley to hide emails, and that they should take that into consideration in the verdict. They did: Morgan Stanley lost the case, and Coleman Holdings was awarded a $1.57B judgment.
Doing a retrieval for a lawsuit or a government agency request is a piece of cake -- regardless of the medium you use -- if you use archive software. If you use backup software to store data for many years, it won't matter what medium you use either -- retrieval will take forever. (I do feel it important to mention that there is one product I know of that will truly help you in this case, and that's Index Engines. It's a brute-force approach, but it's manageable. They support disk and tape.)
Why isn't tape dead? Because there are plenty of things that it is better at than disk. Yes, there are plenty of things that disk is better at than tape. But move all of today's production, backup, and archive data to disk? Inconceivable!
Written by W. Curtis Preston
Tuesday, 04 December 2012 20:11
Keep it up, Apple, and I'm going back to Windows.
I was a Windows customer for many years. Despite running virus/malware protection and being pretty good at doing the right things security-wise, I had to completely rebuild Windows at least once a year -- and it usually happened when I really didn't have the time for it. It happened one too many times, and I said, "that's it," and bought my first MacBook Pro. (The last Windows OS I ran on bare metal was Windows XP.)
I made the conversion to MacOS a little over four years ago. During all this time, I have never -- never -- had to rebuild MacOS. When I get a new Mac, I just use Time Machine to move the OS, apps, and data to the new machine. When a new version of the OS comes out, I just push a button and it upgrades itself. I cannot say enough nice things about how much easier it is to have a Mac than a Windows box. (I just got an email today from a Windows user complaining about what he was told about transferring his apps and user data to his new Windows 8 machine. He was told that it wasn't possible.)
My first Mac was a used MacBook Pro for roughly $600, for which I promptly got more RAM and a bigger disk drive. I liked it. I soon bought a brand new MacBook Pro with a 500 GB SSD drive, making it cost much more than it would have otherwise. (In hindsight, I should've bought the cheapest one I could and then upgraded the things I didn't like.) It wasn't long before I realized that I hadn't put enough RAM in it, so I added more. (I didn't account for the amount of RAM that Parallels would take.)
My company's second Mac was an iMac. After we started doing video editing on that, we decided to max out its RAM. Another MacBook Pro had more RAM installed in it because Lion wanted more than Snow Leopard, and on another MacBook Pro we replaced the built-in hard drive with an SSD unit and upgraded its RAM. We are still using that original MacBook Pro and it works fine -- because we upgraded to more RAM and a better disk -- because we could. It's what people that know how to use computers do -- they upgrade or repair the little parts in them to make them better.
The first expensive application we bought (besides Microsoft Office) was Final Cut Pro 7, and I bought it at Fry's Electronics -- an authorized reseller of Apple products. I somehow managed to pay $1000 for a piece of software that Apple was going to replace in just a few days with a completely different product. Not an upgrade, mind you, a complete ground-up rework of that product. Again, anyone who followed that world knows what's coming next. I wish I had known at the time.
First, Apple ruins Final Cut Pro
For those who don't follow the professional video editing space, Final Cut Pro was the industry standard for a long time. Other products eventually passed it up in functionality and speed, but a lot of people hung onto Final Cut Pro 7 anyway because (A) they knew it already and (B) it worked with all their existing and past project files. They waited for years for a 64-bit upgrade to Final Cut Pro 7.
Apple responded by coming out with Final Cut Pro X, a product that was closer in functionality to iMovie than Final Cut Pro -- and couldn't open Final Cut Pro 7 projects. (In case you missed that, the two reasons that people were holding onto Final Cut Pro 7 were gone. They didn't know how to use the new product because it was a night-and-day different product, and it couldn't open the old product's projects.) FCP X was missing literally dozens of features that were important to the pro editing community. (They have since brought back a lot of those missing features, but not all of them.) And the day they started selling FCP X, they stopped selling FCP 7. Without going into the details, suffice it to say that there was a mass exodus, and Adobe and Avid both had a very good year. (Both products offered, and may still offer, big discounts to FCP customers who want to jump ship.)
But what really killed me is what happened to me personally. I thought that while Apple was addressing the concerns that many had with FCP X, I'd continue using FCP 7. So I called them to pay for commercial support for FCP 7 so I could call and ask stupid questions -- of which I had many -- as I was learning to use the product. Their response was to say that support for FCP 7 was unavailable. I couldn't pay them to take my calls on FCP 7. What?
So here I am with a piece of software that I just paid $1000 for and I can't get any help from the company that just sold it to me. I can't return it to Fry's because it's opened software. I can't return it to Apple because I bought it at Fry's. I asked Apple to give me a free copy of FCP X to ease the pain, and they told me they'd look into it and then slowly stopped returning my emails. Thanks a bunch, Apple. (Hey Apple: If you're reading this, it's never too late to make an apology & give me that free copy of FCP X.)
Apple ruins the MacBook Pro
Have you seen the new MBP? Cool, huh? Did you know that if you want the one with the Retina display, you'd be getting the least upgradeable, least repairable laptop in history? That's what iFixit had to say after they tore down the 15" and 13" MBPs. You won't be able to upgrade the RAM because it's soldered to the motherboard. You'll have to replace the entire top just to replace the screen -- because Apple fused the two together.
When I mention this to Apple fans and employees, what I get is, "well it's just like the iPad!" You're right. The 15-inch MacBook Pro is a $2200 iPad. This means that they can do things like they do in the iPad, where they charge you hundreds of dollars to go from a 16 GB SSD chip to a 64 GB SSD chip, although the actual difference in cost is a fraction of that. Except now we're not talking hundreds of dollars -- we're talking thousands. This means that you'll be forced to buy the most expensive one you can afford, because if you do like I did and underestimate how much RAM you'll need, you'll be screwed. (Apple charges $200 more to go from the 8 GB version to the 16 GB version, despite the fact that buying the equivalent RAM directly from Crucial would cost you about $30 -- not $200.)
Apple's response is also that they'll let the market decide. You can have the MBP with the Retina Display and no possibility of upgrade or the MBP without the Retina Display and the ability to upgrade.
First, I want to say that that's not a fair fight. Second, can you please show me on the Apple website where they show any difference between the two MBPs other than CPU speed and the display? Everyone is going to buy the cheaper laptop with the cooler display, validating Apple's theory that you'll buy whatever they tell you to buy. (Update: If you do order one of the Retina laptops, it does say in the memory and hard drive sections, "Please note that the memory is built into the computer, so if you think you may need more memory in the future, it is important to upgrade at the time of purchase." But I don't think the average schmo is going to know what that means.)
Apple Ruins the iMac
I just found out today that they did the same thing they did above, but with the iMac. And they did this to make the iMac thinner. My first question is why the heck did the iMac need to be thinner? There's already a giant empty chunk of air behind my current iMac because it's so stinking thin. What exactly are they accomplishing by making it thinner?
One of the coolest things about the old iMac was how easy it was to upgrade the RAM. There was a special door on the bottom to add more RAM. Two screws and you're in like Flynn. Now it's almost as bad as the MacBook Pros, according to the folks over at iFixit. First, they removed the optical drive. Great, just like FCP. They made it better by removing features! Their teardown analysis includes sentences like the following:
"To our dismay, we're forced to break out our heat gun and guitar picks to get past the adhesive holding the display down."
"Repair faux pas alert! To save space and eliminate the gap between the glass and the pixels, Apple opted to fuse the front glass and the LCD. This means that if you want to replace one, you'll have to replace both."
"Putting things back together will require peeling off and replacing all of the original adhesive, which will be a major pain for repairers."
"The speakers may look simple, but removing them is nerve-wracking. For seemingly no reason other than to push our buttons, Apple has added a barb to the bottom of the speaker assemblies that makes them harder-than-necessary to remove."
"Good news: The iMac's RAM is "user-replaceable." Bad news: You have to unglue your screen and remove the logic board in order to do so. This is just barely less-terrible than having soldered RAM that's completely non-removable."
It is obvious to me that Apple doesn't care at all about upgradeability and repairability. Because otherwise they wouldn't design a system that requires ungluing a display just to upgrade the RAM! How ridiculous is that? And they did all this to make something thinner that totally didn't need to be thinner. This isn't a laptop. There is absolutely no benefit to making it thinner. You should have left well enough alone.
Will they screw up the Mac Pro, too?
I have it on good authority that they are also doing a major redesign of the Mac Pro (the tower config). This is why we have waited to replace our iMac w/a Mac Pro, even though the video editing process could totally use the juice. But now I'm scared that they'll come out with another non-repairable product.
Keep it up, Apple, and I'm gone
Mac OS may be better than Windows in some ways, but it also comes with a lot of downsides. I continually get sick of not being able to integrate my Office suite with many of today's cool cloud applications, for example. I still have to run a copy of Windows in Parallels so I can use Visio and Dragon Naturally Speaking.
You are proving to me that you do not want intelligent people as your customers. You don't want people that try to extend the life of their devices by adding a little more RAM or a faster disk drive. You want people that will go "ooh" and "ahh" when you release a thinner iMac, and never ask how you did that, or that don't care that they now have to pay extra for a DVD drive that still isn't Blu-Ray.
Like I said when I started this blog post: I like my Mac. I love my new iPad Mini. But I am really starting to hate Apple.
Written by W. Curtis Preston
Monday, 03 December 2012 22:31
"The Cloud" has changed the way I do business, but I'm not always sure how I should back up the data I have "up there." So I thought I'd write a blog post about my research to address this hole in our plan.
Truth in IT, Inc. is run almost entirely in the cloud. We have a few MacBooks and one iMac & a little bit of storage where we do our video editing of our educational & editorial content, as well as our ridiculous music video parodies. But that's it. Everything else is "out there" somewhere. We use all of the following:
Liquidweb.com: Managed web hosting
Virtualpbx.com: Phone System
Sherweb: Hosted Exchange Services
Quickbooks Online: Online bookkeeping & payroll
Q-Commission: Online commission management (talks to Salesforce & Quickbooks)
Act-On: Marketing automation system
iCloud: Syncs & stores data from mobile devices
File synchronization system with history*
A cloud backup service for our laptops*
We have data in Salesforce that is nowhere else, and the same is true of our web servers, email servers, & laptops. Did you know that using Salesforce's backups to recover data that you deleted is not included in your contract, and that if you need them to recover your data (due to your error) it will cost at least $10,000?!
I want my own backups of my data in the cloud. I don't think we're alone in this regard. I therefore took a look at what our options are. The process was interesting. The following is a copy of an actual chat session I had with one of our providers:
curtis preston says:
one question i've wondered about is how people back up the email that is hosted with you
<support person> says:
You mean when they lose it?
curtis preston says:
Let me put it plainly/bluntly: Scenario is you do something wrong and the exchange server i'm hosted on dies and your backups are bad. What can I do in advance to prepare for that?
<support person> says:
Well, when using Outlook there is always a copy on the computer that is made that could be used
<support person> says:
And to be extra-sure you can create backups from time to time
<support person> says:
but we have a 7 days of backup on a server so the chance both the main server and the backup cannot be backup is pretty low
<support person> says:
Everything is really well backup here you don't have to worry
And that pretty much sums up the attitude of most of the vendors: "We've got it. Don't worry. That's the whole reason you went to the cloud!" Here's my problem with that. Maybe they do have it; maybe they don't. If it turns out they don't know how to do IT, there's a good chance they also don't know how to configure a backup system. I'd like to have my own copy in someone else's system, and I don't mind paying for the privilege. It turned out that all but hosted Exchange had what I would consider a decent answer. (As far as I can tell, it's not the fault of our provider; multi-tenant Exchange has some things ripped out of it that create this problem.)
Backups for cloud apps
There are actually a lot of solutions out there to back up cloud applications. Here's what I found:
Salesforce can be automatically and regularly backed up via backupify.com, asigra.com, or ownbackup.com.
Gmail & Google Apps can be backed up via backupify.com.
Quickbooks Online can be backed up via OE Companion.
Hosted servers or virtual servers can be backed up via any cloud backup service that supports the operating system that you're using.
Laptops and desktops can also easily be backed up by most cloud backup services.
If you're using a file synchronization service, those files will also be backed up via whatever you choose for your backup solution for your laptops & desktops.
Offline copies of Outlook data can be used to restore lost Exchange data, but it seems clunky, and you need to make the offline copy manually.
Does the lack of backups for the cloud serve as a barrier to the cloud for you or your company? Or are you in the cloud and you have the same worries as me? Is there a particular app that worries you? Tell me about it in the comment section.
*I don't give the name of either of these for various reasons.
Written by W. Curtis Preston
Friday, 30 November 2012 22:45
Disaster recovery experts do not agree on whether you should have one-and-only-one recovery time objective (RTO) and recovery point objective (RPO) for each application, or two of them. What am I talking about? Let me explain.
What are RTO and RPO, you ask? RTO is the amount of time it should take to restore your data and return the application to a ready state (e.g. "This server must be up w/in four hours"). RPO is the amount of data you can afford to lose (e.g. "You must restore this app to within one hour of when the outage occurred").
Please note that no one is suggesting you have one RTO/RPO for your entire site. What we're talking about is whether or not each application should have one RTO/RPO or two. We're also not talking about whether or not to have different values for RTO and RPO (e.g. 12-hour RPO and 4-hour RTO). Most people do that.
In defense of two RTOs/RPOs (for each app)
If you lose a building (e.g. via a bomb blast or major fire) or a campus (e.g. via an earthquake or tsunami), it's going to take a lot longer to get up and running than if you just have a triple-disk failure in a RAID6 array. In addition, you might have an onsite solution that gets you a nice RPO or RTO as long as the building is still intact. But when the building ceases to exist, most people are left with the latest backup tape they sent to Iron Mountain. This is why most people feel it's acceptable to have two RTOs/RPOs: one for onsite "disasters" and another for true, site-wide disasters.
In defense of one RTO/RPO (for each app)
It is an absolute fact that RTOs and RPOs should be based on the needs of the business unit that is using any given application. Those who feel that there can only be one RTO/RPO say that the business can either be down for a day or it can't (24-hour RTO). It can either lose a day of data or it can't (24-hour RPO). If they can only afford to be down for one hour (1-hour RTO), it shouldn't matter what the cause of the outage is -- they can't afford one longer than an hour.
I'm with the first team
While I agree with the second team that the business can either afford (or not) a certain amount of downtime and/or data loss, I also understand that backup and disaster recovery solutions come with a cost. The shorter the RTO & RPO, the greater the cost. In addition, solutions that are built to survive the loss of a datacenter or campus are more expensive than those that are built to survive a simple disk or server outage. They cost more in terms of the software and hardware to make it possible -- and especially in terms of the bandwidth required to satisfy an aggressive RTO or RPO. You can't do an RPO of less than 24-36 hours with trucks; you have to do it with replication.
This is how it plays out in my head. Let's say a given business unit says that one hour of downtime costs $1M. This is after considering all of the factors, including loss of revenue, damage to the brand, etc. So they decide that they can't afford more than one hour of downtime. No problem. Now we go and design a solution to meet a 1-hour RTO. Now suppose that the solution to satisfy that one-hour RTO costs $10M. After hearing this, the IT department looks at alternatives, and it finds out that we can do a 12-hour RTO for $100K and a 6-hour RTO for $2M.
So for $10M, we are assured that we will lose only $1M in an outage. For $2M we can have a 6-hour RTO, and for $100K we can have a 12-hour RTO. That means that a severe outage would cost me $10M-$11M ($10M + 1 hour of downtime at $1M), or $2M-$8M ($2M + 6 hours of downtime), or $100K-$12.1M ($100K + 12 hours of downtime). A gambler would say that you're looking at definitely spending $10M, $2M, or $100K, and possibly losing another $1M, $6M, or $12M on top of that. I would probably take option two or three -- probably three. I'd then take the $9.9M I saved and make it work for me, hoping to make more for the company with that $9.9M than the $12M we would lose if we have a major outage.
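Here's the same trade-off in a few lines of Python, so you can plug in your own numbers; the downtime cost and solution prices are the ones from the example above.

    # Worst-case cost of each RTO option: solution price plus downtime.
    downtime_cost_per_hr = 1_000_000
    options = {1: 10_000_000, 6: 2_000_000, 12: 100_000}   # RTO hours -> price

    for rto_hrs, solution_cost in options.items():
        worst_case = solution_cost + rto_hrs * downtime_cost_per_hr
        print(f"{rto_hrs:>2}-hr RTO: ${solution_cost:,} spent for certain, "
              f"${worst_case:,} all-in if the disaster actually happens")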
Now what if I told you that I could also give you an onsite 1-hour RTO for another $10K? Wouldn't you want to spend another $10K to prevent a loss greater than $1M, knowing full well that this solution will only work if the datacenter remains intact? Of course you would.
So we'll have a 12-hour RTO for a true disaster that takes out my datacenter, but we'll have a 1-hour RTO as long as the outage is local and doesn't take out the entire datacenter.
Guess what. You just agreed to have two RTOs. (All the same logic applies to RPOs, by the way.)
If everything cost the same, then I'd agree that each application should have one -- and only one -- RTO and RPO. However, things do not cost the same. That's why I'm a firm believer in having two completely different sets of RTOs and RPOs: one that you will live up to in most situations (e.g. a dead disk array) and another that you hope you never have to live up to (loss of an entire building or campus).
What do you think? Weigh in on this in the comments section.
Written by W. Curtis Preston
Friday, 24 August 2012 04:45
In case you missed it, Amazon just announced a new storage cloud service called Glacier. It's designed as a target for archive and backup data at a cost of $.01/GB/mth. That's right, one penny per month per GB. I think my first tweet on this sums up my feelings on this matter: "Amazon glacier announcement today. 1c/GB per month for backup archive type data. Wow. Seriously."
I think Amazon designed and priced this service very well. The price includes unlimited transfers of data into the service. The price also includes retrieving/restoring up to 5% of your total storage per month, and it includes unlimited retrievals/restores from Glacier into EC2. If you want to retrieve/restore more than 5% of your data in a given month, additional retrievals/restores are priced at $.05/GB-$.12/GB depending on the amount you're restoring. Since most backup and archive systems store, store, store and backup, backup, backup and never retrieve or restore, I'd say that it's safe to say that most people's cost will be only $.01/GB/month. (There are some other things you can do to drive up costs, so make sure you're aware of them, but I think as long as you take them into consideration in the design of your system, they shouldn't hit you.)
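To put numbers on that, here's a rough sketch of a monthly bill under this pricing. (The real overage charge was based on your peak retrieval rate and is more complicated than this; treat the flat overage rate as an approximation.)

    # Rough monthly Glacier bill: $0.01/GB/month to store, retrievals free
    # up to 5% of stored data, overage approximated at a flat $/GB rate.
    def glacier_monthly_bill(stored_gb, retrieved_gb, overage_rate=0.05):
        storage = stored_gb * 0.01
        free_gb = stored_gb * 0.05
        return storage + max(0.0, retrieved_gb - free_gb) * overage_rate

    # 10 TB stored; a 200 GB restore stays inside the free 5% (512 GB),
    # so the month costs $102.40 -- storage only.
    print(glacier_monthly_bill(10 * 1024, 200))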
This low price comes at a cost, starting with the fact that retrievals take a while. Each retrieval request initiates a retrieval job, and each job takes 3-5 hours to complete. That's 3-5 hours before you can begin downloading the first byte to your datacenter. Then it's available for download for another 24 hours.
This is obviously not for mission critical data that needs to be retrieved in minutes. If that doesn't meet your needs, don't use the service. But my thinking is that it is perfectly matched to the way people use archive systems, and to a lesser degree how they use backup systems.
It's better suited for archive, which is why Amazon uses that term first to describe this system. It also properly uses the term retrieve instead of restore. (A retrieve is what an archive system does; a restore is what a backup system does.) Good on ya, Amazon! Glacier could be used for backup, as long as you're going to do small restores, and RTOs of many, many hours are OK. But it's perfect for archives.
We need software! (But not from Amazon!)
Right now Glacier is just an API; there is no backup or archive software that writes to that API. A lot of people on Twitter and on Glacier's forum seem to think this is lame and that Amazon should come out with some backup software.
First, let me say that this is how Amazon has always done things. Here's where you can put some storage (S3), but it's just an API. Here's where you can put some servers (EC2), but what you put in those virtual servers is up to you. This is no different.
Second, let me say that I don't want Amazon to come out with backup software. I want all commercial backup software apps and appliances to write to Glacier as a backup target. I'm sure Jungledisk, which currently writes to S3, will add Glacier support posthaste. So will all the other backup software products that currently know how to write to S3. They'll never do that, though, if they have to compete with Amazon's own backup app. These apps and appliances writing to Glacier will add deduplication and compression, significantly dropping the effective price of Glacier -- and making archives and backups use far less bandwidth.
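To make "it's just an API" concrete, here's a minimal sketch in Python of an archive/retrieve round trip. (I'm showing it with the boto3 AWS SDK purely as an illustration; the vault name is made up and error handling is omitted.)

    # Minimal Glacier round trip: create a vault, upload an archive,
    # then kick off the multi-hour retrieval job and poll it.
    import boto3

    glacier = boto3.client("glacier", region_name="us-east-1")
    glacier.create_vault(vaultName="offsite-archive")

    # Upload returns an archive ID that you must keep track of yourself;
    # there is no real-time directory listing to browse later.
    with open("backup.tar", "rb") as f:
        archive_id = glacier.upload_archive(
            vaultName="offsite-archive", body=f)["archiveId"]

    # Retrieval is a job, not a read: initiate it, wait the 3-5 hours
    # discussed above, then download within the 24-hour window.
    job_id = glacier.initiate_job(
        vaultName="offsite-archive",
        jobParameters={"Type": "archive-retrieval", "ArchiveId": archive_id},
    )["jobId"]

    if glacier.describe_job(vaultName="offsite-archive", jobId=job_id)["Completed"]:
        data = glacier.get_job_output(
            vaultName="offsite-archive", jobId=job_id)["body"].read()

A backup product has to build everything else -- cataloging, scheduling, dedupe, retention -- on top of those few calls, which is exactly why I want the commercial backup apps to do it.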
We all have questions that the Amazon announcement did not answer. I have asked these questions of Amazon and am awaiting an answer. I'll let you know what they say.
Is this on disk, tape, or both? (I've heard unofficially that the official answer is no answer, but I'll wait to see what they say to me directly.)
The briefing says that it distributes my data across multiple locations. Are they saying that every archive will be in at least two locations, or are they saying they're doing some type of multiple-location redundancy? (Think RAID across locations.)
It says that downloads are available for 24 hours. What if it takes me longer than 24 hours to download something?
What about tape-based seeding for large archives, or tape-based retrieval of large archives?
ZDNet's Cost Article
Jack Clark of ZDNet wrote an article that said that Glacier's 1c/GB/mth pricing was ten times that of tape. Suffice it to say that I believe his numbers are way off. I'm writing a blog post to respond to his article, but it will be a long one and a difficult read with lots of numbers and math. I know you can't wait.
Written by W. Curtis Preston
Friday, 24 August 2012 04:45
Jack Clark of ZDNet wrote an article entitled AWS Glacier's dazzling price benefits melt next to the cost of tape, where he compares what he believes is the cost of storing 10 PB on tape for five years, versus the cost of doing the same with Amazon's Glacier service. His conclusion is that Amazon's 1c/GB price is ten times the cost of tape.
I mean no disrespect, but I don't believe Jack Clark has ever had anything to do with a total cost of ownership (TCO) study of anything in IT. Because if he had, he'd know that the acquisition cost of the hardware is only a fraction of the TCO of any given IT system. If only IT systems cost only what they cost when you buy them... If only.
So what does it really cost to store 10 PB on tape? Let's take a look at two published TCO studies to find out. Before looking at these studies, let me say that since both were sponsored by tape companies, the point of them was to prove that tape systems are cheaper than disk systems. If these studies are biased in any way, it would be toward understating the price of tape, since the purpose of these two, uh, independent studies is to prove that tape is cheaper. (In fact, I wrote about one of the reports being significantly biased in favor of tape.)
Clipper Group Report
The first report we'll look at is the Clipper Group report that said that tape was 15 times cheaper than disk. It's a very different report, but I'm going to use the graph on page 3, as it gives what it believes to be the TCO of storing a TB of data on tape for a year, based on four different three-year "cycles" of a 12-year period.
As you can see, the cost per TB is much higher in the first three years, because it includes the cost of buying a tape library that is much larger than it needs to be for that period -- because you must plan for growth. (This, of course, is one of the major advantages of the Glacier model -- you only pay for what you use.) But to get close to Mr. Clark's five-year period, I need to use two three-year periods.
The other problem with the report is that they use graphs without showing the actual numbers, and they use scales that make the tape numbers look really small. You can see how difficult it is to figure out the actual numbers for tape. It is easy, however, to figure out the cost numbers for disk and then divide them by the multiplier shown in the graph.
The disk number for the first three-year period looks to be about $2600, which is said to be 9x the price of tape. I divide that $2600 by 9 and I get $288/TB for that three-year period, which matches up with the line for tape on the graph. Divide it by 3 and we get $96/TB per year. The disk cost of the second period is $1250/TB. Divide it by 15 and you get $83/TB for that three-year period; divide that by 3 to get $27/TB per year. If I average those two together, I get $61/TB per year. Since Amazon Glacier stores your data in multiple locations, we'll need two copies, so the cost is $122/TB per year. Since Jack Clark used 10 PB for five years, we'll multiply this by 10,000 to get to 10 PB, then by five to get to five years. This gives us a cost of roughly $6.1M to store 10 PB on tape for five years, based on the numbers from the Clipper Group study.
Let's look at a more recent report that compares a relatively new idea: using a disk front end to LTFS-based tape. The first fully-baked system of this type is from Crossroads, and they just happen to have created a TCO study that compares the cost of storing 2 PB on their system (a combination of disk and tape) vs. storing it on disk for ten years. Awesome! Their 10-year cost for this is $1.64M. Dividing that $1.64M by 2,000 TB gives us $820/TB for ten years, or $82/TB for one year. Double it like we did the last number, and we have $164/TB/yr for two copies. Multiply that by 10,000 (10 PB) and then again by five (five years) and you get roughly $8.2M for 10 PB for five years, based on the Crossroads report.
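For those who want to check my math, here's the whole calculation in a few lines of Python (the Glacier line is included for scale; small differences from the figures above are rounding):

    # Scale a $/TB/year figure to two copies of 10 PB held for five years.
    def five_year_cost(per_tb_per_year):
        return per_tb_per_year * 2 * 10_000 * 5   # 2 copies, 10,000 TB, 5 yrs

    clipper = five_year_cost((96 + 27) / 2)       # avg of the two 3-yr cycles
    crossroads = five_year_cost(1_640_000 / 2_000 / 10)   # $1.64M, 2 PB, 10 yrs
    glacier = 10_000 * 1_000 * 0.01 * 60          # GB x $0.01/GB/mo x 60 months

    print(f"Clipper-based tape TCO:    ${clipper:,.0f}")     # ~$6.15M
    print(f"Crossroads-based tape TCO: ${crossroads:,.0f}")  # ~$8.2M
    print(f"Glacier storage alone:     ${glacier:,.0f}")     # $6.0M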
On a side note, the Crossroads Strongbox system has the ability to replicate backups between two locations using their disk front end. This makes this system a lot more like what Amazon is offering with their Glacier service. (As opposed to traditional use of tape like the Clipper Group report was based on, where you'd also have to pay for someone like Iron Mountain to move tapes around as well.)
According to two TCO studies, storing two copies of 10 PB of data on tape for five years costs the same or more than it costs to store that same data on Amazon's Glacier.
You don't have to buy everything up front; you pay only for what you use. You don't have to plan for anything but bandwidth. Yes, this will only work for data whose usage pattern matches what they offer, but they sure have made it cheap -- and you don't have to manage it!
Written by W. Curtis Preston
Thursday, 05 July 2012 04:37
Dell continues its global domination plans via more acquisitions. Last year, they acquired Ocarina & Compellent. This year they've acquired Wyse (they're still in business?) and SonicWALL.
But now they've entered into my world. They have announced that they are acquiring AppAssure and Quest Software. For those of you who haven't been following these companies, AppAssure is a near-CDP backup software product that has had quite a bit of success. Quest has a bunch of products, but among them are NetVault (a traditional network backup software product) and vRanger (a purpose-built backup product for virtualization).
The AppAssure acquisition is a very solid one. I have personally watched that company aggressively increase its market share. The near-CDP story plays very well when backing up physical servers, and even better when backing up virtual servers. It's no small deal that they can recover a server in minutes. It's interesting that AppAssure has been doing so well, given the difficulty that other CDP and CDP-like products have had.
For those unfamiliar with Quest's history, the two backup products they have are the result of relatively recent acquisitions. NetVault is a general-purpose network backup product that has been around a while, but has never garnered much market share. I know they've been working to bring it up to speed with other products in the space, and have definitely increased marketing activities for it compared to its previous owner, BakBone. vRanger was the king of the mountain in virtualization backup at one point, but they seem to have been out-marketed by Veeam lately -- but they're not taking that lying down. Just look at product manager John Maxwell's comments on my last blog post to see that! Perhaps having Dell's marketing budget behind these products will finally get them the attention they are looking for.
Dell's challenge will be similar to Quest's: integrating all of these products into a single coherent product line. This is a challenge they already know well.
Written by W. Curtis Preston
Tuesday, 03 July 2012 17:17
Finally the world can see what we've been showing in our Backup Central Live! shows. This video is the first music parody song we produced, and the first music video that we did. I hope you enjoy it.
Yesterday (Backup Version) from TrueBit.tv on Vimeo.
The musician, singer, and actor for the video is none other than Cameron Romney, a talented young man I met in my daughters' show choir. He's 16 years old and did all the instruments, vocals, and sound mixing for this song. In case you're curious, yes, he's related to Mitt Romney. First cousin, twice removed, or something like that.
This is the first in a series of these videos that I'll be publishing on TrueBit.TV. Make sure to check out our educational videos at TrueBit.TV.
Written by W. Curtis Preston
Wednesday, 27 June 2012 13:37
When I'm talking about backing up virtual servers & desktops, I always get asked, "What do you think about Veeam/vRanger/PHD Virtual/VMPro? Should I buy that instead of NBU/TSM/CV/BE/AS/DP/etc.'s VADP agent?"
The first thing I can say is that it has definitely become a trend to install a purpose-built backup app for VMware/Hyper-V. On one hand, it's hard to argue with success. People who have moved to such products have often found their backups much easier than they were before. On the other hand, since most of them are moving from the agent-in-the-guest approach, anything would be better than that. Some of them are also moving from their attempt to use Very Crappy Backup (VCB). That's another one that is not hard to compete with.
These purpose-built products do have some really awesome features. For one, they often work around the limitation that VMware creates by using the VSS_BT_COPY backup type: they do something to make sure that the transaction logs in guests get truncated. Some of them also have really interesting features, such as being able to run a guest directly from a backup, which leads to all sorts of interesting recovery possibilities.
My concern is that many of these products are missing what I would consider to be core functionality for a centralized backup product. When I look closely at these products, they tend to be missing one or all of the features listed below. What's worse, when these shortcomings are pointed out, some of their representatives look at you like "why would you want that?" I've been doing nothing but backups for almost 20 years, and I think any decent backup product should have all of these features:
Support for more than one platform
Very few shops are 100% virtualized. If you can't also back up the few physical machines in the environment, you force customers to run some other product to back those up. More backup apps means more complexity, and more confusion with each failed backup and restore. It's simply math.
Centralized management
It's very common to require more than one backup server to handle a given environment. Do you have some sort of centralized management that allows you to see all of your backup servers and manage them all from one place?
Backup duplication
Backup functionality #1 is to support the copying of the data from the backup source to the destination, and then to another destination. Many of these purpose-built backup tools are really good at the first step, but have absolutely nothing for the second step. They tell you that you can run another backup to another destination, which is not good for a number of reasons. Or they tell you that you can buy a 3rd-party dedupe product that can replicate your backups for you. I'm sorry, this should be in your product. You should not require me to buy other people's products to do what your product should do on its own. (The lack of this feature is why so many people buy inexpensive products like Backup Exec to back up the datastores created by these products.)
Backing up its own database
If you have a database that stores configuration and/or history information about your product, backing that database up should be built into your product. Period. Telling someone they can run a cron job just doesn't cut it. And if how/when to run that cron job isn't even in your documentation, shame on you.
Tape support
This is at the end of the list for a reason. I'll admit that this is the least important on the list, but I do believe that a backup product should have the ability to copy to tape for long-term storage purposes. Not everyone can store their backup on disk as long as they need to. Again, many people use Backup Exec to handle this, but I believe it should be built into any backup product.
My position above has made me very unpopular with some of the purpose-built folks -- some have been very upset with me. That doesn't change the fact that the above features should be considered table stakes for any backup application. You could argue that tape support isn't table stakes anymore, but I disagree. That is, it is still table stakes if you want to be considered a full-fledged backup app.
Let's see how things go.
Written by W. Curtis Preston
Thursday, 10 May 2012 17:10
Sometimes I walk to work in the mornings (I live just over 2 miles from the office), and if I leave early enough I walk down this particular sidewalk that has just been sprinkled with water. On each side of this sidewalk is grass, and watered grass here in Southern California tends to mean that snails will be there. The snails take advantage of the wet, cool sidewalk (and the fact that the sun isn't overhead), and they decide to cross from one bunch of grass to another bunch of grass.
On any given morning there will be several hundred snails crossing from one side to another. 200 or so will be crossing from the left to the right, and 200 or so will be crossing from right to left. You know, 'cause the grass on the other side is, well... you know. The funniest one I saw was a snail that had made it 95% of the way from one side to the other, and then changed his mind and turned around.
I got to thinking about backup software (as one does), and all of the people I know who are moving from product A to product B. Then there's a bunch of other people moving from product C to product A, while others are moving from product B to product C -- and they all think that this will make their lives soooooo much better.
I've done hundreds of backup assessments over the years. I can only think of one or two where the gist of the recommendation was, "Your backup software sucks. You should change it."
I can, however, think of many, many times where the problem was "you're not streaming your tape drives," or "you're manually specifying an include list and you should use the auto-selection feature," or "you're making too many full backups," or "you're using the scheduler in a way that it wasn't designed to work," and on and on and on.
Changing backup products is one of the riskiest things you can do to your backup environment. The learning curve of the new backup product is almost definitely going to reduce your recoverability for a significant period of time.
What would be much better is to bring in an expert in that product for a few weeks and have him/her tell you how best to use the product you already have. The learning curve is much easier, the cost is much lower, and the period of instability will be much shorter.
Don't be a snail. Learn what the grass on your side of the sidewalk really tastes like before you start crossing the sidewalk. Remember that some snails die along the way.
Written by W. Curtis Preston
Saturday, 07 April 2012 01:18
I was helping a guy on a plane understand what "the cloud" is. Once I did that, we began a discussion on trust. I shared with him my opinion that we have been trusting other vendors since IT began. We trust every hardware and software vendor not to put backdoors in their products designed to do things we don't know about. We trust technicians to know enough not to use bad passwords. (Of course, sometimes we're wrong.) I don't see trusting a cloud vendor as being so terribly different.
I'm sure a bunch of you will focus on that first paragraph, and not on what this blog post is actually about. But here goes anyway.
Eventually we got to the part of the discussion where he mentioned that "our IT department would never allow that." He explained how he has to carry three laptops (personal, corporate 1, and corporate 2) whenever he travels, and how he has to dial four digits on his phone before he makes any calls. I'm guessing that we just hit the tip of the iceberg of how his IT department is soooo security conscious that they have forgotten their primary purpose -- to enable people to do work. (BTW, this guy wasn't working on missile launch codes or anything. I forget what he does for a living, but I remember wondering why security was that important for this particular company.)
I ranted a little bit about that to him, to which he replied, "well, they are in charge." I asked who he meant, and he said, "IT."
I just about lost it.
If you are in IT and you think you are in charge, you are wrong. The only thing you are in charge of is helping people get their job done. We buy decent laptops & desktops, so they'll stay up and people can get their job done. We make backups so when things go wrong, we can get people their work back, and let them get their job done. The only reason we do security things is to keep our company from losing the efforts of the people that work there.
Sometimes IT people forget that we are there to serve the business. If you enact a security policy that's so rigid that it slows down people's work, you forgot your job. If you turn on a backup system that slows down the servers, and by association the work of the people, you forgot your job.
You are not in charge. The business is. I feel better now.
Written by W. Curtis Preston
Wednesday, 21 March 2012 00:15
The FCC gives discounts to schools and libraries if they want to buy a tape-based backup system, but not if they want to use disk or any type of cloud-based architecture.
No, this is not me saying this is an example of how tape is better. It's me, an American citizen expressing frustration at how inefficient my government is -- at least in this case.
For those (like me) who don't live in this world, here's what I'm talking about. According to their website, "The Schools and Libraries Program of the Universal Service Fund, commonly known as "E-Rate," is administered by the Universal Service Administrative Company (USAC) under the direction of the Federal Communications Commission (FCC), and provides discounts to assist most schools and libraries in the United States to obtain affordable telecommunications and Internet access."
If you download the list of things that are eligible for the E-rate program, you will find that "tape backup" is eligible, but online backup solutions are specifically not eligible. There is no mention of disk-based backup devices. Here's the best part. Tape backup is defined as "QIC, DAT, 8mm, DLT, AIT, and ADR." ADR was end-of-lifed nine years ago; QIC & AIT were EOLed three years ago. Note that there is no mention of LTO, a format that was released 12 years ago and currently owns 90% of the market. So to say that the FCC is a bit behind the times is an understatement.
By the way, they also list floppy disks and CD-Rs as the only examples of removable storage. No mention of DVDs or BluRays -- and when was the last time you saw a floppy drive?
Thanks to Christina Weil (@c_weil) for pointing this out via Twitter.
Amazing, just amazing.
Update (3/22): This program is aimed at getting schools and libraries connected, so I've been told that the network parts of this document are mostly up to date. The only reason backup is in the document in the first place is to help ensure that the connectivity systems remain available and connected. What I think happened is that the network vendors knew about this program and made sure their parts got updated, while the tape/storage folks have ignored it (or not known about it).
Written by W. Curtis Preston
Saturday, 10 March 2012 07:25
I don't care if you use disk, tape, or the cloud to back up your systems. (In case you think I'm swayed by advertising, I have advertisers from all of those categories.)
Having said that, it bothers me when I see misinformation being used to sway you one way or the other. This is why I wrote this article disproving the Gartner 71% tape failure "quote," and this article disproving the Yankee Group 42% failure "quote." And since the author of another such article used my comment system to link to it, I also thought I'd write this blog article dispelling the misinformation in it.
He said it's been a long time since people have seen tape used for backups.
The live survey of the hundreds of attendees at last year's Backup Central Live shows found that 82% of them still use tape as their final destination for backups. So much for not seeing tape in a while.
He said IT pros are still skeptical that removable drives have a legitimate place in backup
Yes, we are. I think that 3.5" removable disk drives are a very bad place for backups. 2.5" drives, maybe; 3.5" drives, not so much. They're simply not designed to be hauled around constantly. Adding to that is this fact: every portable hard drive I have ever used for backing up my laptop has died long before the drive it was backing up. Every single one.
He said cloud backup is shiny and new and that's why people are choosing it.
No, it's because it's a complete and total outsourcing of the backup function. Backups can be onsite and offsite without anyone ever touching a disk drive or tape drive. AND you will be constantly notified whether or not your backups are working. You even get notified if you shut off all your backups! That's not the case with any backup software product I've ever used. There are a lot of reasons to use cloud backup over removable disk drives or tape -- so many, in fact, that I strongly recommend cloud backup for small to medium sized companies.
He said disk is cheaper for small companies
Yes, it is. It is cheaper to acquire the drives as long as you never need to add capacity. If you do need to add capacity, however, disk costs will double; tape costs will not. It'll cost you about $.02/GB to add more capacity to a tape-based system. (Having said that, I do not recommend backing up directly to tape; I haven't in a while.) I also priced a slightly different tape-based system than the one he quoted in his article, and it was approximately the same cost.
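To put rough numbers on that expansion argument, here's a back-of-the-envelope sketch. The $.02/GB tape media figure is the one from above; the disk price per GB is purely an assumption I'm using for illustration, so plug in your own quotes:

```python
# Back-of-the-envelope expansion costs. The $0.02/GB tape media figure
# comes from the text above; the disk price is an assumed street price
# for illustration only.
EXTRA_CAPACITY_GB = 10_000           # suppose we need 10 TB more

TAPE_MEDIA_PER_GB = 0.02             # just buy more cartridges
DISK_PER_GB = 0.10                   # assumption: drives + enclosure

print(f"Tape expansion: ${EXTRA_CAPACITY_GB * TAPE_MEDIA_PER_GB:,.0f}")  # $200
print(f"Disk expansion: ${EXTRA_CAPACITY_GB * DISK_PER_GB:,.0f}")        # $1,000
```

The point isn't the exact prices; it's that expanding tape means buying media only, while expanding disk means buying more drives (and often another shelf to put them in).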
As to comparing their 10-bay disk systems against an autoloader, I don't see how you can do that. Backup software products simply don't know what to do with 10 removable disk drives, but they do know what to do with autoloaders. (I'm sure he knows of a backup software product that will work with his configuration, but I don't know of one.)
He said disk is more reliable than tape
Baloney. I've written about this before. Tape has a much higher reliability rate than SATA disk -- one hundred times more reliable. I've already shown above that the statistics he quotes in his article are bunk. Almost every restore failure I've ever seen was the fault of anything but the media being used. (I still don't think tape should be used as the initial target for backups, but it is a very reliable place to put the second copy.)
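For what it's worth, the "one hundred times" figure lines up with published unrecoverable bit error rates. This is a sketch using the commonly cited orders of magnitude as assumptions:

```python
# Where the "one hundred times" figure comes from: unrecoverable bit
# error rates (UBER). These are commonly cited orders of magnitude,
# used here as assumptions -- check your drive's actual spec sheet.
sata_uber = 1e-15   # roughly one bad bit per 10^15 bits read (SATA disk)
lto_uber  = 1e-17   # roughly one bad bit per 10^17 bits read (LTO tape)

print(f"Tape is ~{sata_uber / lto_uber:.0f}x more reliable per bit read")  # ~100x
```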
He said LTO-5 speed is 140 MB/s, but only with compression
Sorry, Charlie. That's the native speed. It's up to 280 MB/s with compression. With the 1.5:1 compression ratio I see all the time, it's 210 MB/s easy. Having said that, it's very hard to feed data to a drive that fast, which is why I don't recommend tape as the initial target for backups. But if you've already made a copy on disk, you should have no problem streaming that tape drive.
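The arithmetic, for anyone checking my math:

```python
# LTO-5 throughput math from the paragraph above.
native_mb_s = 140          # LTO-5 native (uncompressed) speed
drive_max_mb_s = 280       # the drive's ceiling with compression
ratio = 1.5                # the compression ratio I see all the time

effective = min(native_mb_s * ratio, drive_max_mb_s)
print(f"Effective throughput at {ratio}:1 = {effective:.0f} MB/s")  # 210 MB/s
```

If your backup source can't sustain that rate, the drive stops and restarts constantly, which is exactly why staging to disk first matters.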
He said single file restores are faster from disk
Yes, they are. That extra minute it takes the tape to load and get to the single file would probably put most companies out of business. Seriously, you're going to make a case out of a minute of tape loading time?
He said upgrading/replacing is cheaper with disk
Are you kidding me? I addressed this already. If you need more capacity than the initial purchase, it's much cheaper to expand a tape system. Most customers go years without upgrading their drives.
He said doing synthetic fulls is easier on disk
Yes, it is. And CDP and CDP-like technologies are also only possible on disk. Disk has a lot of things going for it.
He said tape has to be replaced more often
Again, baloney. A tape that is used once a week will last four years, according to the chart quoted in his own article. That is longer than most disks I've used.
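The math behind that four-year figure is simple; the per-cartridge rating below is an assumption standing in for the chart in his article:

```python
# Rough lifespan math. The ~208-use rating is an assumption standing
# in for the chart quoted in the article.
rated_uses = 208          # assumed cartridge rating (use/load cycles)
uses_per_year = 52        # one backup per week
print(rated_uses / uses_per_year, "years")   # 4.0 years
```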
I don't recommend using tape as the initial target for backups, but I still think it's a great place to put the next copy. And if you are a smaller company, I think the best thing you can do is use a cloud backup service that totally automates everything and allows for a local copy of your data. But any backup system that requires small companies to manually swap removable media just to make backups happen is a bad idea.
Written by W. Curtis Preston
Saturday, 10 March 2012 01:37
I just wrote a blog post about how Gartner never said that 71% of tape restores fail. They never said anything like it. Another statistic that is often quoted is "The Yankee Group said that 42% of tape restores fail." Guess what? They never said that, either.
What they did say, in a 2004 paper, is that 40.7% of 362 IT executives surveyed believed they had suffered at least one restore failure in the previous year due to tape unreliability. That's not even close to saying that 42% of all tape restores fail, but who needs truth, right?
Also, I'd like to point out that these were IT executives. What this stat really means is that 40.7% of them were told that they had restores fail in the previous year due to bad tapes. That's not quite the same thing as it actually happening. How many backup people even know when the real reason for a failure is their own misconfiguration? And of those who do, how many would admit that to their boss, rather than saying "the dang tape failed again"?
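And here's why the two statements are so far apart: even a tiny per-restore failure rate makes it likely that a shop sees at least one failure in a year. Both numbers below are illustrative assumptions, not survey data:

```python
# How "40.7% of shops had at least one failed restore last year" can
# be true even when nearly every individual restore succeeds. Both
# inputs are illustrative assumptions.
p_fail = 0.001            # assume 0.1% of individual restores fail
restores_per_year = 500   # assume a mid-sized shop's annual volume

p_at_least_one = 1 - (1 - p_fail) ** restores_per_year
print(f"P(at least one failure in a year) = {p_at_least_one:.1%}")  # ~39.4%
```

In other words, a failure rate of one tenth of one percent per restore is enough to produce the Yankee Group's survey number. That's a very long way from 42% of restores failing.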
Now the only statistic left is the Strategic Research one, but I can't find anything on that one.
It appears that at least 66% of all tape statistics are made up. ;)
Written by W. Curtis Preston
Friday, 09 March 2012 23:41
How many times have you read that Gartner said 71% of tape restores fail? Google it. You'll find dozens of references to this Gartner "statistic." It was cited again recently in an article by Highly Reliable Systems, along with a bunch of other stats about how tape sucks. I saw Dave Russell of Gartner last week and asked him about this statistic. He said he had never heard it, but that he would look into it. It turns out that the only way he could find it was to Google it. He searched Gartner's entire archive and could find no paper that ever suggested a 71% failure rate for tape restores.
He said, "I am somewhere between annoyed and pretty darn angry about what I believe are continued misquotes re. Gartner and tape failure rates. I’ve been the lead analyst for backup and recovery technologies since 2005, and none of what’s out there have been published during my watch." The only report that referenced tape and the number 71% was a report David did in March of 2006. Here is what it said:
New, and less-expensive, disk options make the use of disk for faster recovery a more viable option than backup to tape. In a poll of 252 attendees at the 2005 Gartner PlanetStorage conference, 26 percent reported that half or more of their recoveries were currently done from disk. That number jumped to 62 percent when the time frame was extended to 2007. As they look five years into the future to 2010, 71 percent expect that tape will be used mostly for archiving and disaster recovery.
I did a bunch of web searches for "Gartner 71% tape restores fail," and found that if I search for those words prior to March of 2006, I don't find much. I do find an article from Jon Toigo in 2005 that says he hears IT people quoting a 10% failure rate from Gartner, but he believes that number is fictitious (which it probably was). I also find a whitepaper from Exabyte that refers to a 2002 article from Adam Couture of Gartner Group. I just asked David Russell to see if he can find that article. I also found another whitepaper, from Tandberg, citing similar numbers and the same paper. Maybe that one has some basis in reality. Most interestingly, I did find this page, which claims to be the text of a February 2003 article from Computer Technology Review that says that "A recent study [it doesn't cite the study] found that while tape backups are used extensively, restoring data from a tape backup system fails an astounding 70 percent of the time. The reasons for such an alarming rate of failure range significantly--and may vary from bad tapes or tape drives to the inability to find the backup tapes or careless processing by IT staff." (My experience has been that it's far more often careless processing by IT staff than bad tapes.)
The important thing is that prior to March of 2006, a Google search shows no references to Gartner thinking that 71% of tape restores fail. Then David Russell wrote his March 2006 report saying that "71 percent expect that tape will be used mostly for archiving and disaster recovery." If you change your Google search to the year after his paper came out, you find a bunch of references to the 71% number, the first of which comes from this DPM datasheet from Microsoft -- promoting DPM. Then all of a sudden the floodgates are open and everyone is quoting this number -- with no one (including Microsoft) actually giving their source, other than simply saying "Gartner said it." Most of them also seem to quote the Yankee Group (saying 42%) and Strategic Research (54%). I wonder if they ever said what these articles say they said.
Another quote I've seen is this: "according to Ben Matheson, group product manager for Microsoft's Data Protection Manager division, 42% of attempted recoveries from tape backups in the past year have failed." (BTW, note that this is the same number as the Yankee Group number above, so maybe he was just quoting that.) I saw this in an article updated last week. According to LinkedIn, Ben Matheson hasn't worked for Microsoft since February of 2006, so that quote can't be current either. But once you've got a great quote, why let it go? Wait, I may have found our Gartner quote culprit. Let's see: Ben Matheson leaves Microsoft as its DPM product manager in February of 2006, and a new person takes over shortly thereafter. The next month, a Gartner paper is written, and within two months we have the Microsoft DPM product group citing it incorrectly. Could it have been a new gung-ho product manager misquoting Gartner? Then everyone else starts quoting Gartner by quoting this Microsoft paper. Next thing you know, it's real! (This is just conjecture, of course. Don't sue me, person who took over from Ben Matheson.)
We all know tape backups and restores fail, right? So who cares if no one at Gartner said it? The first reason is truth. This statistic is cited so often that it has been accepted as truth, and it isn't.
The second reason is that you can't debate the findings of a report that doesn't exist. If there were a real report, we could check the stats behind the stat and see how many of these "tape restore failures" were caused by human error and had nothing to do with the fact that tape was being used. But since there never was any report, we can't do such a thing.
Please, people. Don't quote third parties like that if you can't cite the source. It's too easy to misquote.
Written by W. Curtis Preston
Thursday, 19 January 2012 08:19
We've got new and exciting content for 2012, and we're starting this year's seminars with San Jose and San Diego next week. These free seminars are first-come, first-served (end users only), and we're almost at capacity in those first two cities, so you'd better act now if you want to go. I've also listed the rest of our backup seminars for Q1. (Other cities will be announced soon.)
See you there!
Written by W. Curtis Preston
Wednesday, 18 January 2012 11:10
Before I say anything about SOPA, let me say that I am not battling SOPA because I'm into illegally downloading books/music/movies. As a provider of content (author of three books), I am strongly FOR paying for the media I use. And don't give me that crap about how it isn't stealing because you were never going to buy it anyway. You didn't have something, now you have it, and you neither paid for it nor obtained the permission of the person who provided it. You can call it what you want; I call it stealing.
Now that THAT is out of the way….
What I'm also very much against is the government wasting time and MY money trying to stop something it will never stop. I'm against SOPA for many of the reasons I'm against the TSA. The TSA is security theater; SOPA is anti-piracy theater. The only thing it will accomplish is letting some government folks say that they did what they could -- all while wasting millions of taxpayer dollars and the time of other companies' IT departments.
Chime in! Especially over at my new domain stopsopacentral.com. ;)
Written by W. Curtis Preston
Monday, 31 October 2011 15:23
I wrote a few months ago about what a difference the cloud has made in how I conduct business. I rarely buy software for my new company anymore; more often, I'm paying for some type of cloud-delivered service.
One of those services that I use (and love) is Dropbox. It is an incredibly easy replacement for a file server when you need to share tens to hundreds of gigabytes of files among multiple users. However, I definitely have some security concerns about it, and not just since the big snafu a few months ago.
One of my issues with Dropbox is that they can access my data. Data is encrypted in transit, but they can access it because they have my password. The same appears to be true of Syncplicity and SugarSync. Why do I think that? Because they have a "reset my password" link. How is the encryption protecting me if they can change my password without a problem? Compare this, for example, to Wuala's answer and BoxCryptor's answer to the question about a lost password.
Even with Wuala, which says it doesn't know my password, how do they share encrypted data with users I specify? If all data is encrypted and decrypted locally, how does the person with whom I'm sharing files decrypt them? I'm curious.
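Here's a minimal sketch of how a client-side ("zero-knowledge") design typically answers both questions. Everything below -- names, parameters, the whole flow -- is my illustration of the general pattern, not how Dropbox, Wuala, or anyone else actually implements it:

```python
# Minimal sketch of client-side key handling, using the Python
# "cryptography" library. Illustration only; no vendor's actual design.
import os, base64
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# 1. A random master key encrypts the actual files.
master_key = Fernet.generate_key()
ciphertext = Fernet(master_key).encrypt(b"my file contents")

# 2. The password never leaves the client; it only derives a key that
#    "wraps" the master key. The provider stores a wrapped blob it
#    cannot open -- which is also why a casual "reset my password"
#    link is impossible without losing access to the data.
salt = os.urandom(16)
kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                 salt=salt, iterations=480_000)
password_key = base64.urlsafe_b64encode(kdf.derive(b"my passphrase"))
wrapped_master = Fernet(password_key).encrypt(master_key)

# 3. Sharing: wrap the same master key with the recipient's public
#    key. They unwrap it locally with their private key, so the
#    provider never sees a usable key at any point.
recipient = rsa.generate_private_key(public_exponent=65537, key_size=2048)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
shared_blob = recipient.public_key().encrypt(master_key, oaep)
assert recipient.decrypt(shared_blob, oaep) == master_key
```

If a service offers a no-consequences password reset, the simplest explanation is that step 2 isn't happening on your machine -- they hold a copy of the key.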
The last two listed are open source alternatives. They're too limited in functionality for me, but I thought I'd throw them on there anyway.
What do you think about all this? Anyone I left out that I shouldn't have?
Written by W. Curtis Preston
Saturday, 01 October 2011 23:41
Veeam is one of the most innovative backup and recovery tools designed specifically for VMware and Hyper-V. They've also done a really good job of marketing the tool. In a matter of a couple of years, they've gone from "who's Veeam?" to the mindshare leader in this space. I'm not sure what their actual market share is, and there are several other tools also making a name for themselves, but it's hard to think of a product that has more successfully captured the hearts and minds of its target market than Veeam.
They announced their vPower functionality at Tech Field Day in Seattle quite some time ago. To summarize, this is the ability to run a VM from their backup image of that VM. This opens up all sorts of different levels of functionality, such as instant VM recovery and automated, full testing of the viability of your backups of a given VM.
This is why I looked forward to their presentation at Tech Field Day 7. At first, I was not disappointed. They announced support for Hyper-V. Yay! They also announced further refinement of their vPower functionality. (They even gave me credit in one of the PowerPoint slides for a suggestion I made that they acted on.) They also hinted at a new version that is almost out, but they wouldn't really talk about it or show it, and we definitely were not allowed to ask questions about it. Note to future Tech Field Day presenters: I can't think of a better way to frustrate bloggers than to tell them about a new version that you won't talk about, show, or take questions on. To make matters worse, they kept hinting at the new version throughout the presentation while continuing to tell us we couldn't ask about it.
Where the wheels fell off the truck for me was when I brought up the fact that most Veeam customers use Backup Exec to back up Veeam. Another way to say that is that Veeam can't back itself up. This resulted in a 20-minute conversation during which I got quite riled up, while Doug Hazelman kept looking at me like he had no idea why I had such an issue with this. You can watch the whole conversation here, from 1:24 to 1:45. He occasionally snickered, as if to say that the whole discussion was ludicrous. At one point he actually said that the statement that they can't back themselves up was "stupid." Yet he confirmed that the most common practice among Veeam customers is to use Backup Exec to back up Veeam.
Veeam's data is stored in two places: the SQL database and the backup jobs directory. There is no way within the product to make a special backup of the SQL catalog so that it can be easily restored without creating a catch-22. For example, one suggestion was to use one Veeam server to back up another Veeam server. That creates a catch-22 of having to restore one server before you can restore the other. What if both servers are gone? Doug hinted that losing the SQL database just isn't that big of a deal because it's just job configuration information; you could just redo it if you lost it. Is this really a backup company talking to me?
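For what it's worth, the workaround I'd sketch (and this is my sketch, not a Veeam feature; the instance name, database name, and paths are all assumptions) is to dump the configuration database to a flat file on a schedule and sweep it somewhere Veeam doesn't manage, so rebuilding the configuration never depends on a Veeam restore:

```python
# Sketch of breaking the catch-22 by hand: dump the configuration
# database to a flat file, then copy it OUTSIDE the Veeam backup
# chain. Instance/database names and paths are assumptions --
# verify them against your own installation.
import datetime, shutil, subprocess

stamp = datetime.date.today().isoformat()
dump = rf"D:\config-dumps\veeam-config-{stamp}.bak"

tsql = f"BACKUP DATABASE [VeeamBackup] TO DISK = N'{dump}'"
subprocess.run(["sqlcmd", "-S", r".\VEEAMSQL", "-Q", tsql], check=True)

# Sweep the dump to storage Veeam itself doesn't touch (a NAS share,
# cloud bucket, even a USB drive) so a bare-metal rebuild can start
# from this file instead of from a Veeam restore.
shutil.copy2(dump, r"\\nas\offhost\veeam-config")
```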
The second part of their data is the backup jobs history. It has no catalog; everything that Veeam needs to know about the backups is stored with the backups. The question is: what happens if one or more of those files gets corrupted? What happens if some well-meaning admin looking for space deletes some jobs? What happens if a rogue administrator deletes all of them? As far as I could tell, Veeam has no way of recovering from this situation -- which is why most Veeam customers use Backup Exec to back up Veeam.
Doug seemed to think that I was pushing for tape support. In a way, I was. Tape is still the least expensive way to get data offsite. In many organizations, it's the only way to get data offsite. They just have too much data to be able to afford a pipe big enough to replicate their backups -- even if they have been deduplicated. That issue aside, I wasn't pushing so much for tape as I was a method for creating a backup of my backup. Files stored in filesystems get corrupted. It just happened to me today. For no apparent reason, a file whose modification time hadn't changed was telling me that it couldn't be copied. It was a movie file on an iMac. I can play the movie, but I can't copy the file. Weird. That's what files on filesystems do -- and that's why we back them up. But the guys at Veeam just don't seem to get this, and that's why they frustrate me.
On one hand, I think the idea of a backup that can test itself in a totally automated fashion is completely awesome, and a lot of other areas of functionality are very impressive as well. On the other hand, their not understanding the issue I raised (and therefore not addressing it) is really frustrating. I hope we can work this out eventually, but they'll first have to stop calling what I'm saying "stupid." ;)
Written by W. Curtis Preston
Saturday, 01 October 2011 21:58
Dell is going to build a unified storage system that has everything you could ever want in a mid-tier or enterprise-tier storage system. Or so said the presenters at Tech Field Day 7. Only time will tell.
I was one of several bloggers who visited Dell's headquarters in Round Rock, TX (a short drive from Austin) last month, just prior to VMworld. (That's my excuse for this blog entry being so late, BTW.) Dell apparently paid for a double sponsorship from Stephen Foskett of Gestalt IT so that they could talk to us for four hours (instead of the usual two). They had a lot to talk about.
They made sure we knew about all of the major acquisitions that Dell has made over the past few years:
EqualLogic - A scalable iSCSI grid storage array
Exanet - A scalable NAS system
Perot Systems - Professional Services
Ocarina - Deduplication and Compression
Compellent - Midrange storage arrays
RNA Networks - Cloud memory
Scalent Technologies - Datacenter management software
I believe it was Carter George who explained all this, and how Dell was going to integrate these technologies faster and better than any other storage company has ever done. The way he described it, Dell would come out with a totally unified, scalable storage system that supported iSCSI, NAS, dedupe, and compression; that could meet the needs of both the mid-market and the enterprise; and that would be easy to manage in a datacenter -- and cloud ready, too. And they were going to do all of this reeeeal soon. He didn't give dates, but the way he was talking, it sounded like 2012.
Dell, you see, "is starting from scratch." Those other vendors weren't. The problem is that I'm not sure how having several products from several different companies, all of which already have existing customers, is "starting from scratch."
The way this usually goes is that each company becomes a faction in one big project, each wanting to put its technology into the finished product. Each thinks its technology is what will make things better. I have one product in mind from the past that was pieced together from technologies acquired from a bunch of different companies. The result was three levels of abstraction (one from each company) before the data ever got to disk. The result was also a piece of crap.
Maybe Dell will be different. I wish them the best of luck. Good luck tearing down the fiefdoms without damaging egos. Good luck getting people to speak their minds when it's really important -- when the emperor appears to be getting undressed. My personal experience with trying to do that with Dell did not go very well (to put it mildly), so I hope things have changed.
I also have concerns about how Dell salespeople will evolve to sell products that require upfront sales engineering to get the order right. My personal experience with their sales teams so far suggests that they've got as much work to do here as they do with all the products I mentioned earlier.
I have been exposed to EqualLogic, Compellent, and Ocarina before, and I have heard nothing but good things about them from the field. So I think Dell has chosen some really solid building blocks to build a real storage company with. I just don't think it's going to be as easy as the presenters at Tech Field Day were making it sound. I'll be more than happy to be wrong, though.