Chad Leigh -- Shire.Net LLC
2006-08-30 18:04:42 UTC
Hi
Sol 10 U2.
I have a test box with a single RAID 6 device that is used as the
source device for a ZFS pool called "local". There is no important
data on it yet, but what happened worries me.
First, here is the message I get on boot and when I run a
"zpool status -x" command. I seem to be hosed.
# zpool status -x
pool: local
state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
see: http://www.sun.com/msg/ZFS-8000-CS
scrub: none requested
config:
        NAME      STATE    READ WRITE CKSUM
        local     FAULTED     0     0     6  corrupted data
        c1t1d0    ONLINE      0     0     6
#
How I got here:
The machine was idle. I did a
# shutdown -i 5 -y -g 1
which completely shuts the machine down and powers it off. (Is there
a better way to do this?)
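For example, I wonder whether explicitly exporting the pool before
powering off would have forced everything to be flushed. Something
like the following -- this is just an untested guess on my part, not
something I have verified:
# zpool export local
# shutdown -i 5 -y -g 1
...and then after power-on:
# zpool import local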
The ZFS pool was not being used for anything and had not been used
for quite some time when I did this. According to Areca tech support
(the underlying RAID device used as the pool's source device is an
Areca RAID controller), Solaris has a bug: it does not call the
drivers' flush routines to get the device drivers to flush when you
do this sort of shutdown (reboot and some others do). The RAID
controller has a battery backup, so it should have been OK on reboot.
But I had the case open to check on a BMC/remote console device
(Tyan) that is not working right, and I accidentally disconnected the
battery backup cable. So any data pending on the RAID controller was
lost.
That was all that happened. What concerns me is that the ZFS
metadata is apparently so fragile that such a simple sequence of
events could permanently destroy the whole pool. With UFS we could
have run an fsck, which would probably have fixed whatever the
underlying problems here were. With ZFS we seem to have fragile
metadata that is easily corrupted, and if it really is that easily
corrupted, ZFS is unusable for production. Does ZFS not keep
redundant copies of its metadata, or have any other way to survive
such a simple corruption? The machine had been idle for a few days,
and the ZFS pool should have had zero activity on it. This is a
machine still being tested before it goes into production.
Thanks
Chad
---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net
Please check the Links page before posting:
http://groups.yahoo.com/group/solarisx86/links
Post message: ***@yahoogroups.com
UNSUBSCRIBE: solarisx86-***@yahoogroups.com