reiserfsck --rebuild-tree and when backup saves your life
After a power failure on my main file server, reiserfsck reported that errors were found on a reiserfs partition and that reiserfsck --rebuild-tree was required. Of course, RAID 1 doesn't help in this case. The reiserfsck man page says you'd better back up your data before proceeding with --rebuild-tree. __Believe them__. Following this advice, I fortunately made one last backup before proceeding. The reiserfs partition was perfectly readable, except for some files. Crawling the web, I learned from multiple sites that until reiserfsck --rebuild-tree has finished its job, the partition is completely unusable.
This operation really takes a long time. Don't estimate the remaining time from the number of remaining blocks and the blocks/second rate. Mine was stuck for many hours on pass #2, eating as much CPU as possible. Waiting more than 12 hours for the repair of a 200GB partition was fruitless: it ended up saying there was not enough space left on the drive. Not only had the drive lost power while writing data, but some people here had been uploading lots of data and there was very little space left (around 5MB). That's something like Murphy's law.
The backup was put in place, and I strongly believed the original reiserfs partition was lost forever. I now better understand the point of the blocks reserved for root on ext2/ext3 filesystems. I think I'll check whether this kind of option is available on reiserfs, too.
I relaunched the check, mostly for educational purposes. No one was complaining about missing or corrupted data on the backup. It ran all night, and the next morning the drive was completely fixed. OK, reiserfsck --rebuild-tree did its job. Thanks to the whole reiserfs crew. It didn't eat data, but it took almost 24 hours to do that, an amount of time I couldn't make my users wait. I discovered I'm now too old to live with the fsck adrenaline :) I hereby promise to make stronger backups from now on.
Better to spend a few minutes a day making backups and verifying them than to lose many hours of your life, and hours/weeks/months/years of other people's work, to a power or hard drive failure.
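Since the rebuild failed the first time for lack of free space, a small check that warns before a partition fills up might have helped. Here is a minimal sketch in Python of what such a check could look like. The function name `free_space_report` and the 5% threshold are my own assumptions (the threshold loosely mirrors the default root reserve on ext2/ext3); this is not something reiserfsck or reiserfs provides.

```python
import shutil


def free_space_report(path, min_free_fraction=0.05):
    """Return (free_bytes, free_fraction, ok) for the filesystem holding `path`.

    `min_free_fraction` is an assumed threshold, not a reiserfs setting.
    """
    usage = shutil.disk_usage(path)
    # Guard against a zero-sized filesystem to avoid dividing by zero.
    fraction = usage.free / usage.total if usage.total else 0.0
    return usage.free, fraction, fraction >= min_free_fraction


if __name__ == "__main__":
    # "/" is only an example; point this at the mount point you care about.
    free, fraction, ok = free_space_report("/")
    status = "OK" if ok else "LOW"
    print(f"{status}: {free / 1024**2:.0f} MiB free ({fraction:.1%})")
```

Run from cron, something like this could send a warning long before a partition gets down to its last 5MB.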