Discussion:
downgrade from kernel 3.17 to 3.10
Cristian Falcas
2014-10-21 08:13:48 UTC
Can I downgrade the kernel from 3.17.1 to latest 3.10 if I have a
btrfs partition formatted and used on 3.17.1?

I mean, is there something that could go wrong with the fs if suddenly
I use an older kernel?

I want to downgrade because last night we had some 1200 oopses in 1
hour on the 3.17 kernel, related to "CPU#n stuck" and what seems to be
btrfs work:

NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [ceph-osd:3542]
Modules linked in: iptable_nat nf_nat_ipv4 nf_nat iptable_mangle
nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
nf_conntrack_ipv6 nf_defrag_ipv6 xt_mac xt_physdev veth
ip6table_filter ip6_tables ebtable_nat ebtables ipt_REJECT xt_CHECKSUM
autofs4 openvswitch vxlan udp_tunnel gre libcrc32c xt_state
nf_conntrack xt_comment xt_multiport vfat fat bridge ipv6 stp llc
vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt
iTCO_vendor_support serio_raw rtc_efi pcspkr ipmi_si ipmi_msghandler
i2c_i801 lpc_ich cdc_ether usbnet mii shpchp sg ses enclosure ioatdma
dca i7core_edac edac_core bnx2 ext4 jbd2 mbcache btrfs raid6_pq xor
sr_mod cdrom sd_mod crc_t10dif crct10dif_common pata_acpi ata_generic
ata_piix megaraid_sas mgag200 ttm drm_kms_helper sysimgblt sysfillrect
syscopyarea dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nf_defrag_ipv4]
CPU: 3 PID: 3542 Comm: ceph-osd Tainted: G L 3.17.1-1.el6.elrepo.x86_64 #1
Hardware name: IBM System x3650 M3 -[7945AC1]-/69Y4438 , BIOS -[D6E149AUS-1.09]- 09/21/2010
task: ffff88121cc2b010 ti: ffff8810212a4000 task.ti: ffff8810212a4000
RIP: 0010:[<ffffffff810b7c96>] [<ffffffff810b7c96>] queue_read_lock_slowpath+0x76/0x90
RSP: 0018:ffff8810212a7b58 EFLAGS: 00000206
RAX: 0000000000000c11 RBX: ffff8810212a7ba8 RCX: ffff880773ee1aec
RDX: 0000000000000c16 RSI: 000000000000000a RDI: ffff880773ee1ae8
RBP: ffff8810212a7b58 R08: 000000000000000b R09: ffff880773ee1aac
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000065
R13: 00000019518bb800 R14: ffff88034a151780 R15: ffff8810212a7bb4
FS: 00007f36da014700(0000) GS:ffff88127f260000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000030b145010 CR3: 000000121e0e1000 CR4: 00000000000027e0
Stack:
ffff8810212a7b68 ffffffff8165e78d ffff8810212a7bf8 ffffffffa0174eca
00000001012a7b98 ffff88121cc2b010 ffff8810212a7bb0 ffff88121cc2b010
ffff8806b3b67ab8 ffff880b5005a4a0 ffff8810212a7bc8 ffffffffa0158ad1
Call Trace:
[<ffffffff8165e78d>] _raw_read_lock+0x1d/0x30
[<ffffffffa0174eca>] btrfs_tree_read_lock+0x5a/0x130 [btrfs]
[<ffffffffa0158ad1>] ? free_extent_buffer+0x61/0xc0 [btrfs]
[<ffffffffa010e43b>] btrfs_read_lock_root_node+0x3b/0x50 [btrfs]
[<ffffffffa01182da>] btrfs_search_forward+0x3a/0x340 [btrfs]
[<ffffffffa017d215>] btrfs_log_inode+0x375/0x7b0 [btrfs]
[<ffffffffa015e9d3>] ? extent_write_cache_pages.clone.6+0xf3/0x3f0 [btrfs]
[<ffffffff810afb50>] ? bit_waitqueue+0xb0/0xb0
[<ffffffff8165cb96>] ? mutex_lock+0x16/0x40
[<ffffffffa0175cc1>] ? start_log_trans+0xe1/0x230 [btrfs]
[<ffffffffa017d771>] btrfs_log_inode_parent+0x121/0x340 [btrfs]
[<ffffffffa017da8c>] btrfs_log_dentry_safe+0x6c/0x90 [btrfs]
[<ffffffffa014cfc1>] btrfs_sync_file+0x181/0x340 [btrfs]
[<ffffffff81205861>] vfs_fsync_range+0x21/0x30
[<ffffffff8120588c>] vfs_fsync+0x1c/0x20
[<ffffffff81205a7d>] do_fsync+0x3d/0x70
[<ffffffff81205ac3>] SyS_fdatasync+0x13/0x20
[<ffffffff8165ede9>] system_call_fastpath+0x16/0x1b
Code: 00 00 f0 0f c1 07 3c ff 75 0b 0f 1f 00 f3 90 8b 07 3c ff 74 f8
66 83 47 04 01 c9 c3 f3 90 8b 07 3c ff 74 f8 c9 c3 f3 90 0f b7 01 <66>
39 c2 75 f6 0f 1f 44 00 00 eb b5 90 90 90 90 90 90 90 90 90


Thank you,
Cristian Falcas
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
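For readers triaging a flood of reports like the one above, a minimal Python sketch of how such a log could be summarized follows. The function name `summarize_oops` and both regular expressions are illustrative assumptions, not part of any kernel tooling; they only cover the line shapes visible in this trace (the "soft lockup" banner and `[<address>] symbol+0xoff/0xlen [module]` frames).

```python
import re

# Matches frames such as:
#   [<ffffffffa0174eca>] btrfs_tree_read_lock+0x5a/0x130 [btrfs]
# An optional "? " marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)
# Matches the watchdog banner: "soft lockup - CPU#3 stuck for 22s! [ceph-osd:3542]"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^\]]+)\]")


def summarize_oops(log_text):
    """Return (lockup events, reliable frame symbols grouped by module)."""
    lockups = [
        {"cpu": int(cpu), "seconds": int(secs), "task": task}
        for cpu, secs, task in LOCKUP_RE.findall(log_text)
    ]
    frames_by_module = {}
    for uncertain, symbol, module in FRAME_RE.findall(log_text):
        if uncertain:
            continue  # skip frames flagged with "?"
        # Frames without a [module] tag belong to the core kernel image.
        frames_by_module.setdefault(module or "kernel", []).append(symbol)
    return lockups, frames_by_module
```

Fed the trace above, the "btrfs" group would list the locking and fsync-logging functions, which is what suggests the lockup happened in btrfs code rather than in ceph itself.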
Duncan
2014-10-21 11:26:53 UTC
Post by Cristian Falcas
Can I downgrade the kernel from 3.17.1 to latest 3.10 if I have a btrfs
partition formatted and used on 3.17.1?
I mean, is there something that could go wrong with the fs if suddenly I
use an older kernel?
I want to downgrade because last night we had some 1200 oopses in 1 hour
on the 3.17 kernel, related to "CPU#n stuck" and what seems to be btrfs
work:
You definitely don't want to downgrade that far -- there have been far too
many btrfs fixes since then and you'd be needlessly risking your data.

Much more viable would be to downgrade to the latest 3.16.x stable kernel
(definitely not 3.16.0 or 3.16.1, as they had an open issue much like
3.17.0 does), and then upgrade to the latest 3.17.x in a couple of weeks, as
there are some critical stable fixes in the pipeline for it.

Or if you must, 3.14.x is the latest long-term-stable series, and is
continuing to get btrfs-stable patches along with the other stable
patches it gets.

But I'd definitely not recommend reverting to older than 3.14.x stable
series, because even if it's a stable series and they catch and apply to
stable all the patches that ideally need to be applied back that far, if
you have problems, what you'd be running is simply too far back in
history to get much support on this list for.

Also, keep in mind that the btrfs-is-experimental warnings didn't come
off until 3.12 or so. Any btrfs older than that was officially
experimental when it came out, and even if it's a long-term-stable
kernel, no stable series patches are going to remove the still
experimental nature of btrfs in a kernel that old.

So 3.10, no way if it were /my/ data! Latest 3.14.x stable, I'd
consider. But preferably step back to the latest 3.16.x (past 3.16.2 for
sure) temporarily, and try the latest 3.17.x again in a couple of weeks (or
3.18-live-git now), as there are some critical fixes for 3.17-stable that
are now in 3.18 and still making their way to the stable releases.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
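The version advice in this reply can be condensed into a small check. The following is only a sketch: the function names are hypothetical, the thresholds are taken from this message as stated (including the approximate "3.12 or so" for the experimental warning), and they reflect the situation in October 2014, not any official support matrix.

```python
import re


def parse_kernel_version(release):
    """Extract (major, minor, patch) from e.g. '3.17.1-1.el6.elrepo.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def assess_for_btrfs(release):
    """Classify a kernel release using the thresholds named in the thread."""
    version = parse_kernel_version(release)
    if version < (3, 12, 0):
        # Assumption: 3.12 is used as the cutoff for "3.12 or so".
        return "avoid: btrfs still carried its experimental warning"
    if version < (3, 14, 0):
        return "avoid: older than the 3.14 long-term-stable series"
    if version[:2] == (3, 16) and version[2] < 2:
        return "avoid: 3.16.0 and 3.16.1 had an open issue"
    if version[:2] == (3, 17):
        return "caution: critical 3.17 stable fixes were still pending"
    return "acceptable per the thread's advice"
```

For example, `assess_for_btrfs("3.10.0")` falls into the first branch, while a 3.14.22 or 3.16.6 kernel passes every check.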

Cristian Falcas
2014-10-21 13:18:44 UTC
Thank you for your answer.

I will reformat the disk with a 3.10 kernel in the meantime, because I
don't have any rpms for 3.16 now.
Robert White
2014-10-21 15:13:27 UTC
Post by Cristian Falcas
Thank you for your answer.
I will reformat the disk with a 3.10 kernel in the meantime, because I
don't have any rpms for 3.16 now.
Don't bother reformatting (yet). The on-disk layout is stable between
the releases. It should run fine and all the known-to-date issues with
3.17 only affected read-only snapshots, so your live data is still good.

I'd wait till things shake out before doing anything drastic to the
working set.


Robert White
2014-10-21 15:20:30 UTC
Post by Cristian Falcas
Thank you for your answer.
I will reformat the disk with a 3.10 kernel in the meantime, because I
don't have any rpms for 3.16 now.
More concisely: Don't use 3.10 BTRFS for data you value. There is a
non-trivial chance that the problems you observed are/were due to "bad
things" on the disk written there by 3.10.

There is no value to recreating your file systems under 3.10 as the same
thing is likely to go bad again when you get out of the dungeon.

What are your RPM options? What about just getting the sources from
kernel.org and compiling your own 3.16.5?

Seriously, 3.10.... just... no...

8-)

Cristian Falcas
2014-10-21 15:34:34 UTC
I will start investigating how we can build our own rpms from the 3.16
sources. Until then we are stuck with the ones from the official repos
or elrepo, which means 3.10 is the latest for el6. We have used this until
now, and it seems we were lucky enough not to hit anything bad.

We upgraded to 3.17 because we use ceph on the machine with openstack
and on the ceph site they recommended >3.14. And because we need
writable snapshots, we are forced to use btrfs under ceph.

Thank you all for your advice.
Austin S Hemmelgarn
2014-10-21 15:58:14 UTC
Post by Cristian Falcas
I will start investigating how we can build our own rpms from the 3.16
sources. Until then we are stuck with the ones from the official repos
or elrepo, which means 3.10 is the latest for el6. We have used this until
now, and it seems we were lucky enough not to hit anything bad.
IIRC there is a make target in the kernel sources that generates the
appropriate RPMs for you, although doing so from mainline won't get you
any of the patches from Oracle that they use in el.
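The kernel's build system does ship packaging targets: `rpm-pkg` (source plus binary RPMs) and `binrpm-pkg` (binary RPM only). As a rough sketch, the build could be scripted like this. The helper name `kernel_rpm_commands`, the source directory and the job count are assumptions for illustration, and it presumes a suitable `.config` has already been copied into the source tree.

```python
import shlex


def kernel_rpm_commands(source_dir, jobs=4, binary_only=True):
    """Build the make invocations for packaging a kernel tree as RPMs."""
    target = "binrpm-pkg" if binary_only else "rpm-pkg"
    return [
        # Bring an existing .config up to date, taking defaults for new options.
        ["make", "-C", source_dir, "olddefconfig"],
        # Compile and package; the resulting RPMs land in the rpmbuild tree.
        ["make", "-C", source_dir, f"-j{jobs}", target],
    ]


if __name__ == "__main__":
    # Print the commands rather than running them; pass each list to
    # subprocess.run(command, check=True) to actually build.
    for command in kernel_rpm_commands("linux-3.16.6"):
        print(shlex.join(command))
```

Printing instead of executing keeps the sketch safe to try; a full kernel build takes a long time and needs the usual compiler toolchain and rpm-build installed.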
Chris Murphy
2014-10-21 16:19:25 UTC
Post by Cristian Falcas
I will start investigating how we can build our own rpms from the 3.16
sources. Until then we are stuck with the ones from the official repos
or elrepo, which means 3.10 is the latest for el6. We have used this until
now, and it seems we were lucky enough not to hit anything bad.
We upgraded to 3.17 because we use ceph on the machine with openstack
and on the ceph site they recommended >3.14. And because we need
writable snapshots, we are forced to use btrfs under ceph.
Hmm, well you could use a Fedora kernel, at least it's approximately the same family. The 3.14.22 kernel RPMs are in koji. Scroll down for x86_64. Chances are you only need kernel-3.14.22-100.fc19.x86_64.rpm.

http://koji.fedoraproject.org/koji/buildinfo?buildID=585577


Chris Murphy
Chris Murphy
2014-10-21 16:26:27 UTC
Post by Chris Murphy
Post by Cristian Falcas
I will start investigating how we can build our own rpms from the 3.16
sources. Until then we are stuck with the ones from the official repos
or elrepo, which means 3.10 is the latest for el6. We have used this until
now, and it seems we were lucky enough not to hit anything bad.
We upgraded to 3.17 because we use ceph on the machine with openstack
and on the ceph site they recommended >3.14. And because we need
writable snapshots, we are forced to use btrfs under ceph.
Hmm, well you could use a Fedora kernel, at least it's approximately the same family. The 3.14.22 kernel RPMs are in koji. Scroll down for x86_64. Chances are you only need kernel-3.14.22-100.fc19.x86_64.rpm.
http://koji.fedoraproject.org/koji/buildinfo?buildID=585577
And 3.16.6 is here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=585583

I guess I'm wondering why elrepo offers 3.10 and 3.17 yet nothing in between? Huge gap.

Chris Murphy
Julio E. Gonzalez P.
2014-10-21 16:36:48 UTC
This post might be inappropriate. Click to display it.
Cristian Falcas
2014-10-21 17:55:19 UTC
I'm now rebuilding the 3.16.6 version from Fedora for el6 (I had to
make some small modifications: removing the perl-carp dependency and a
compiler flag). And yes, it is el6, so elrepo is our only source for a
newer kernel.

Is it safe to install the kernel without recompiling it first for the
new platform?



On Tue, Oct 21, 2014 at 7:36 PM, Julio E. Gonzalez P.
Post by Julio E. Gonzalez P.
When you say "el6" you mean "el7", right? The latest kernel for el7 is
3.10.xxxxx
But Red Hat lies a little with kernel version numbers. They say you have a
3.10 kernel, but I think they backport a lot from newer kernels.
Probably the btrfs in Red Hat el7 is not really the btrfs from 3.10; maybe it
is the btrfs from 3.12 or 3.14... who knows... (I want to know more about
this too... also using btrfs in rhel7)
Chris Murphy
2014-10-21 16:07:27 UTC
Post by Cristian Falcas
Thank you for your answer.
I will reformat the disk with a 3.10 kernel in the meantime, because I
don't have any rpms for 3.16 now.
If you've formatted with features in common between 3.10 and 3.17, I don't think you need to do this. The only concern is that your existing file system enters an even more non-deterministic state than usual, one that has therefore not been significantly tested.

The suggestion that you stick to the most recent kernel you can is sound advice, though. Why go all the way back to 3.10 instead of 3.14.22? Every distro should have binaries for it available. They should even have 3.16.6.

One thing I wonder, if going back to kernel 3.14 (or even 3.10), which btrfs-progs to use? Is it OK to use 3.17?


Chris Murphy
Duncan
2014-10-22 02:24:54 UTC
Post by Chris Murphy
One thing I wonder, if going back to kernel 3.14 (or even 3.10), which
btrfs-progs to use? Is it OK to use 3.17?
The goal is to have userspace entirely backward compatible (well, to the
last incompatible device format change, anyway, which was well before
3.0). So barring bugs, btrfs-progs-3.17 should work just fine with
kernel 3.14 or even 3.10.

Tho do keep in mind that -progs-3.12 was the first release using the new
kernel-synced versioning. Before that the newest full -progs release was
ancient, 0.19, from years earlier, tho there was a 0.20-rc1 somewhere
along the line. So you really do want at least -progs-3.12 because older
than that is ancient, and definitely the best-tested -progs-3.12 kernel
combinations will be the 3.12 and 3.13 kernel series which ran
concurrently (there being no -progs-3.13, so kernel 3.13 was concurrent
to -progs-3.12).

IOW your point about staying within well-tested norms applies here as
well. By far the most tested code-paths will be the ones where kernel
and -progs versions were concurrent to each other, so that's what I'd
recommend sticking with if you want to play it safe and well tested.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
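The pairing rule described in this reply (prefer kernel and -progs releases that were concurrent, and never go below -progs 3.12) can be sketched as a small check. The function names are hypothetical, and the version strings are assumed to look like the output of `uname -r` and `btrfs --version`; only the major.minor pair is compared.

```python
import re


def major_minor(text):
    """Pull the first 'X.Y' pair out of a version string."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def progs_pairing(kernel_release, progs_version):
    """Describe how well tested a kernel / btrfs-progs combination is likely to be."""
    kernel = major_minor(kernel_release)
    progs = major_minor(progs_version)
    if progs < (3, 12):
        return "progs too old: predates the kernel-synced 3.12 numbering"
    # There was no -progs-3.13, so kernel 3.13 ran alongside -progs-3.12.
    if kernel == progs or (kernel, progs) == ((3, 13), (3, 12)):
        return "concurrent"
    if progs > kernel:
        return "progs newer than kernel: meant to be compatible, less tested"
    return "progs older than kernel: less tested"
```

So `progs_pairing("3.10.0", "btrfs-progs v3.17")` reports a newer userspace on an older kernel, which per the reply should work barring bugs but sits outside the best-tested combinations.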

Robert White
2014-10-21 15:04:23 UTC
Post by Cristian Falcas
Can I downgrade the kernel from 3.17.1 to latest 3.10 if I have a
btrfs partition formatted and used on 3.17.1?
I went back from 3.17.0 to 3.16.3 when 3.17 acted flaky, and have since
gone up to 3.16.5 with nice results. 3.17.2 is, I think, expected to
contain the actual fixes.

Did you upgrade straight from 3.10 to 3.17?

Going all the way back to 3.10 is probably a bad thing. There's been a
lot of work since then. I'd build a new 3.16.5 and switch to that.

DON'T go back further, and DON'T run btrfsck until the mainline kernel and
tools that fix the "read-only snapshot bug" are in your system. I think that's
going to be 3.17.2 and the 3.17 btrfs tools. If you use older tools you
may end up having to recreate the partition completely.


