
The Windows Home Server continues to eat up files, but it appears that this product's very hungry bug isn't going to turn into a butterfly and fly away anytime soon.  (Source: Microsoft)
The very hungry Windows Home Server continues to whet its appetite on unfortunate users' files as the scope of the problem grows and grows

Like the children's book, The Very Hungry Caterpillar, the scope of the Windows Home Server bug simply grew and grew as it ate its way through users' files.  Although Microsoft promised that a solution would be made available to resolve the current issues, a full fix will not be available until at least June 2008.

What was once an attractive home server solution with a wealth of hardware partners has, in essence, become an unsightly pest for many.  Until a solution is found, use of WHS brings with it serious risks of data corruption -- something many consider to be a cardinal sin of networking hardware or software.

One DailyTech reader, Tim Slocum from Roscoe, Illinois, contacted us a couple of weeks back with a rather incredible personal story of data loss, which he hoped would serve as a warning to others.  Slocum was an eager WHS user and states that he copied 16,000+ family pictures and videos to the system.  Around Christmas he discovered that many of these files had become corrupted.  He rebuilt and reformatted the system, only to experience unpleasantly surprising results.

Slocum states in an email to DailyTech, "I then reformatted and rebuilt the system with NO ADD-INS or extra software. Copied all photos to the server, setup PC backups, and let the system sit with no usage because of the lack of trust. This weekend I again noticed photos are now corrupted."

Slocum acknowledges that a family member who works for MS as a consultant has had no issues that he knows of, though he planned on emailing him to verify this.  Slocum adds that while not a "real techie," he is fairly knowledgeable.  He states, "I have been a developer for over 20 years ... last 2-3 have been moving into VB.NET.  So I have some knowledge of testing and have built PC's in the past."

Having worked hard to stabilize his system, Slocum plans to continue his efforts with a third build, this time turning off file duplication, which reportedly may affect the likelihood of corruption.  Slocum feels that WHS is a promising product, but that Microsoft is failing to take its issues seriously enough.

The really surprising part of Slocum's story at the time DailyTech received it was that he did not edit the files.  While some users had alleged corruption on transfers in unverified reports floating around the internet, Microsoft had previously stated that corruption occurred only when files were edited.

Now Microsoft says the problem is that the underpinnings of WHS are broken, and that a fix is required at a very low level.  This will take a good deal of time to develop and validate, according to the WHS Team at Microsoft.  The WHS Team hopes to release beta versions of a patch over the following months, but states that June is the soonest a finished patch might appear.

The WHS Team also warns that some users are mistaking other problems for the issue.  Says the Team, "Some of the instances that were initially attributed to this issue ended up being something else, such as a faulty network card/driver, old routers with outdated firmware, or people incorrectly testing the limits of their home servers."

However, the Team did not rule out that WHS may have other problems causing trouble at a low level, though they state that they feel very confident that they understand the underlying issue that’s currently causing the main known problem.

And it turns out that Slocum was correct -- the knowledge base article has just been updated to encompass file transfers.  The new knowledge base article also has additional information on the cause -- how the NTFS file system, the cache, and the memory manager can get out of whack and begin eating up user data.  The article explains it thusly:

Windows Home Server uses a file system mini-filter driver in addition to the NTFS file system to implement Shared Folders storage technology. File system mini-filter drivers are an extensibility mechanism that is provided by Windows to enable storage scenarios. For distributing data across the different hard drives that are managed by Windows Home Server, the Windows Home Server mini-filter driver redirects I/O between files that are stored on the main hard drive and files that are stored on other hard drives. This redirection mechanism is enabled only when Windows Home Server is managing the Shared Folder storage of multiple physical hard drives. A bug has been discovered in the redirection mechanism which, in certain cases, depending on application use patterns, timing, and workload, may cause interactions between NTFS, the Memory Manager, and the Cache Manager to get out of sync.

A link to a full technical page on the situation can be found here.

While the Windows Home Server Team is working hard to have a fix ready by summertime, WHS users are left in the meantime with the unpleasant reality that editing or storing files on the server may lead to corruption.  And with the scope of the flaws in WHS's low-level file handling growing weekly, like a certain hungry caterpillar, one is left to wonder whether there are more aspects of the problem yet to be discovered.

Comments

Rocks,
By mindless1 on 3/12/2008 4:15:09 AM , Rating: 5
for brains, is what you have if you're still relying on WHS to store anything of value. Yes you paid for it and deserve to be able to use it but to those who claim "oh but my files are ok", well what exactly did you think those who lost files were thinking until they discovered corruption?

Here's how it works: everything seems A-OK, then all of a sudden you don't have the thing which was the whole point of the fileserver - the intact file. A not so similar problem but similar result was seen on home PCs back in the Via 686 southbridge days and nobody tried to defend Via on that, but now if it's MS, well, let's make excuses? Sorry, but it doesn't wash. If more people buy products and they have more revenue, they should be doing enough testing to catch these bugs, because even a bug that only affects a small percentage of people is still a heck of a lot of people.

Then there's the infuriating part, that the WHS team is actually trying to defuse the bug by making a claim like:

"Some of the instances that were initially attributed to this issue ended up being something else, such as a faulty network card/driver, old routers with outdated firmware, or people incorrectly testing the limits of their home servers."

No, dear WHS team, data doesn't get corrupted because of outdated router firmware, the checksums ensure it gets to the WHS box and back. People incorrectly testing limits? So WHS team are you saying there's also ANOTHER bug? Apparently so, there is no "limit testing" that should corrupt data without yet another bug. Either it works or doesn't support what someone is trying to do and won't do it, never ever ever corrupting data as a result.

Surely someone will come along and think I'm a MS basher. Yes in this case it's true because of the product. MS is in the best position to deliver a home server product and that is something some of you obviously feel you can't do without MS, so MS should be the one to develop and sell one, they are certainly entitled to make a profit doing so, if/when it works properly. It is really amazing that after Server 2K3 had been tested so much that they didn't put due diligence into the testing of the added features.

They couldn't have done much testing as the described flaw is not that unlikely to occur. They really didn't test that using the virtualized volume strategy worked ok when this ever growing list of mainstream applications edited or transferred a file? If not, what in the world DID they test?

RE: Rocks,
By kkwst2 on 3/12/2008 12:04:37 PM , Rating: 2
I agree with most of what you're saying. The comments about "pushing the system" and out of date firmware are ridiculous. Outdated firmware shouldn't make you lose data or corrupt files.

A couple points:

The errors do not occur with a single drive system. The error does not occur all the time, even with multi-drives and the affected programs. These types of bugs can be tough. However, apparently the bug was reported before final release. It was minimized at first, blamed on hardware, and could not be reproduced in-house. So they went ahead and released it.

Certainly in retrospect that was a big mistake, but these decisions are usually not made by the people in the trenches that are working to solve problems. As to how much testing was actually done is only speculation. They're finally recognizing the problem and fixing it. I still think it will eventually be a good product.

It seems like something should be done to compensate the early adopters who were screwed by this bug. I would at least refund them for the software.
