Discussion: segmentation fault in qemu-kvm-0.14.0
Peter Lieven
2011-03-08 22:53:59 UTC
Hi,

during testing of qemu-kvm-0.14.0 I can reproduce the following segfault. I have seen a similar crash in 0.13.0 already, but had no time to debug it.
My guess is that this segfault is related to the threaded VNC server which was introduced in qemu 0.13.0. The bug is only triggerable if a VNC
client is attached. It might also be connected to a resolution change in the guest. I have a backtrace attached; the debugger is still running if someone
needs more output.

Reading symbols from /usr/local/bin/qemu-system-x86_64...done.
(gdb) r -net tap,vlan=141,script=no,downscript=no,ifname=tap0 -net nic,vlan=141,model=rtl8139,macaddr=52:54:00:ff:00:93 -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-e6b70e107-e87000e7acf4d4e5-lieven-winxp-r17453,if=ide,boot=on,cache=none,aio=native -m 1024 -monitor tcp:0:4001,server,nowait -vnc :1 -name 'lieven-winxp-test' -boot order=c,menu=on -k de -pidfile /var/run/qemu/vm-265.pid -mem-path /hugepages -mem-prealloc -cpu qemu64,model_id='Intel(R) Xeon(R) CPU E5640 @ 2.67GHz',-nx -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet
Starting program: /usr/local/bin/qemu-system-x86_64 -net tap,vlan=141,script=no,downscript=no,ifname=tap0 -net nic,vlan=141,model=rtl8139,macaddr=52:54:00:ff:00:93 -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-e6b70e107-e87000e7acf4d4e5-lieven-winxp-r17453,if=ide,boot=on,cache=none,aio=native -m 1024 -monitor tcp:0:4001,server,nowait -vnc :1 -name 'lieven-winxp-test' -boot order=c,menu=on -k de -pidfile /var/run/qemu/vm-265.pid -mem-path /hugepages -mem-prealloc -cpu qemu64,model_id='Intel(R) Xeon(R) CPU E5640 @ 2.67GHz',-nx -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet
[Thread debugging using libthread_db enabled]

[New Thread 0x7ffff694e700 (LWP 29042)]
[New Thread 0x7ffff6020700 (LWP 29043)]
[New Thread 0x7ffff581f700 (LWP 29074)]
[Thread 0x7ffff581f700 (LWP 29074) exited]
[New Thread 0x7ffff581f700 (LWP 29124)]
[Thread 0x7ffff581f700 (LWP 29124) exited]
[New Thread 0x7ffff581f700 (LWP 29170)]
[Thread 0x7ffff581f700 (LWP 29170) exited]
[New Thread 0x7ffff581f700 (LWP 29246)]
[Thread 0x7ffff581f700 (LWP 29246) exited]
[New Thread 0x7ffff581f700 (LWP 29303)]
[Thread 0x7ffff581f700 (LWP 29303) exited]
[New Thread 0x7ffff581f700 (LWP 29349)]
[Thread 0x7ffff581f700 (LWP 29349) exited]
[New Thread 0x7ffff581f700 (LWP 29399)]
[Thread 0x7ffff581f700 (LWP 29399) exited]
[New Thread 0x7ffff581f700 (LWP 29471)]
[Thread 0x7ffff581f700 (LWP 29471) exited]
[New Thread 0x7ffff581f700 (LWP 29521)]
[Thread 0x7ffff581f700 (LWP 29521) exited]
[New Thread 0x7ffff581f700 (LWP 29593)]
[Thread 0x7ffff581f700 (LWP 29593) exited]
[New Thread 0x7ffff581f700 (LWP 29703)]
[Thread 0x7ffff581f700 (LWP 29703) exited]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb)

(gdb) thread apply all bt full

Thread 3 (Thread 0x7ffff6020700 (LWP 29043)):
#0 0x00007ffff79c385c in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/libpthread.so.0
No symbol table info available.
#1 0x00000000004d3ae1 in qemu_cond_wait (cond=0x1612d50, mutex=0x1612d80)
at qemu-thread.c:133
err = 0
__func__ = "qemu_cond_wait"
#2 0x00000000004d2b39 in vnc_worker_thread_loop (queue=0x1612d50)
at ui/vnc-jobs-async.c:198
job = 0x7ffff058cd20
entry = 0x0
tmp = 0x0
vs = {csock = -1, ds = 0x15cb380, dirty = {{0, 0, 0, 0,
0} <repeats 2048 times>}, vd = 0x1607ff0, need_update = 0,
force_update = 0, features = 243, absolute = 0, last_x = 0,
last_y = 0, client_width = 0, client_height = 0, vnc_encoding = 7,
major = 0, minor = 0, challenge = '\000' <repeats 15 times>,
info = 0x0, output = {capacity = 3194, offset = 2723,
buffer = 0x1fbbfd0 ""}, input = {capacity = 0, offset = 0,
buffer = 0x0}, write_pixels = 0x4c4bc9 <vnc_write_pixels_generic>,
clientds = {flags = 0 '\000', width = 720, height = 400,
linesize = 2880,
data = 0x7ffff6021010 <Address 0x7ffff6021010 out of bounds>,
pf = {bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004',
depth = 24 '\030', rmask = 0, gmask = 0, bmask = 0, amask = 0,
rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000',
ashift = 24 '\030', rmax = 255 '\377', gmax = 255 '\377',
bmax = 255 '\377', amax = 255 '\377', rbits = 8 '\b',
gbits = 8 '\b', bbits = 8 '\b', abits = 8 '\b'}},
audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt = AUD_FMT_U8,
endianness = 0}, read_handler = 0, read_handler_expect = 0,
modifiers_state = '\000' <repeats 255 times>, led = 0x0,
abort = false, output_mutex = {lock = {__data = {__lock = 0,
__count = 0, __owner = 0, __nusers = 0, __kind = 0,
__spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 39 times>, __align = 0}}, tight = {
type = 7, quality = 255 '\377', compression = 9 '\t',
pixel24 = 1 '\001', tight = {capacity = 3146708, offset = 1376,
buffer = 0x7ffff4684010 ""}, tmp = {capacity = 3194,
offset = 2672, buffer = 0x1fbbfd0 ""}, zlib = {
capacity = 141451, offset = 0,
buffer = 0x1805da0 "B--G\317\253\031=\257f\364\274\232\321\363jF\317\253\241\326y5"}, gradient = {capacity = 0, offset = 0, buffer = 0x0},
levels = {9, 9, 9, 0}, stream = {{
next_in = 0x7ffff4684460 "\002\003\002\003\002\002\002\002\002\002\002\002\002", avail_in = 0, total_in = 9048240,
next_out = 0x1805df6 "d\276\345c\363\216\237\212\377\314\003\236\246\\$\361\025~\032\311\232\067q&_$\231\251y\262*!\231\067\236\067\363\214\347\315\240\361\221,\274T\257\221\314\333\341\251\362R\373\232_\311a\272\002\061sgt\317\030\332\262~\300\063\267]\307\267\343\033", avail_out = 141365,
total_out = 1034664, msg = 0x0, state = 0x16ea550,
zalloc = 0x4ca2d8 <vnc_zlib_zalloc>,
zfree = 0x4ca315 <vnc_zlib_zfree>, opaque = 0x7ffff6015920,
data_type = 0, adler = 1285581469, reserved = 0}, {
next_in = 0x7ffff4684022 "\002\002", avail_in = 0,
total_in = 46491,
next_out = 0x1805da8 "\257f\364\274\232\321\363jF\317\253\241\326y5", avail_out = 141443, total_out = 9905, msg = 0x0, state = 0x17a3f00,
zalloc = 0x4ca2d8 <vnc_zlib_zalloc>,
zfree = 0x4ca315 <vnc_zlib_zfree>, opaque = 0x7ffff6015920,
data_type = 0, adler = 2415253306, reserved = 0}, {
next_in = 0x7ffff4684570 "\017(`", avail_in = 0,
total_in = 2188024,
next_out = 0x1805dbd "z\222\024#\233O\214g\341\352\211/U\310s\017\361\245\020\262\343\211TO\222\371\304\207\fryK|\251E\222\343\311+\237\351`>\355\312sR\265\320\272~\001", avail_out = 141422, total_out = 100512, msg = 0x0,
state = 0x1682950, zalloc = 0x4ca2d8 <vnc_zlib_zalloc>,
zfree = 0x4ca315 <vnc_zlib_zfree>, opaque = 0x7ffff6015920,
data_type = 0, adler = 3999694267, reserved = 0}, {
next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0,
avail_out = 0, total_out = 0, msg = 0x0, state = 0x0,
zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0,
reserved = 0}}}, zlib = {zlib = {capacity = 0, offset = 0,
buffer = 0x0}, tmp = {capacity = 0, offset = 0, buffer = 0x0},
stream = {next_in = 0x0, avail_in = 0, total_in = 0,
next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0,
state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0,
adler = 0, reserved = 0}, level = 0}, hextile = {
send_tile = 0x4cd21d <send_hextile_tile_generic_32>},
mouse_mode_notifier = {notify = 0x80, node = {tqe_next = 0x1612da8,
tqe_prev = 0x7ffff6020700}}, next = {tqe_next = 0x7ffff6020700,
tqe_prev = 0x7ffff7ffe128}}
n_rectangles = 40
saved_offset = 2
flush = true
#3 0x00000000004d302c in vnc_worker_thread (arg=0x1612d50)
at ui/vnc-jobs-async.c:302
queue = 0x1612d50
#4 0x00007ffff79be9ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5 0x00007ffff6c3970d in clone () from /lib/libc.so.6
No symbol table info available.
#6 0x0000000000000000 in ?? ()
No symbol table info available.

Thread 2 (Thread 0x7ffff694e700 (LWP 29042)):
#0 0x00007ffff6c31197 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1 0x0000000000434c12 in kvm_run (env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:582
r = 0
kvm = 0x1172538
run = 0x7ffff7fc7000
fd = 15
#2 0x0000000000435ed4 in kvm_cpu_exec (env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1233
r = 0
#3 0x00000000004364f7 in kvm_main_loop_cpu (env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1419
run_cpu = 1
#4 0x0000000000436640 in ap_main_loop (_env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1466
env = 0x118c2e0
signals = {__val = {18446744067267100671,
18446744073709551615 <repeats 15 times>}}
data = 0x0
#5 0x00007ffff79be9ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#6 0x00007ffff6c3970d in clone () from /lib/libc.so.6
No symbol table info available.
#7 0x0000000000000000 in ?? ()
No symbol table info available.

Thread 1 (Thread 0x7ffff7ff0700 (LWP 29038)):
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
pioh = 0x1652870
ioh = 0x1613330
rfds = {fds_bits = {0 <repeats 16 times>}}
wfds = {fds_bits = {1048576, 0 <repeats 15 times>}}
xfds = {fds_bits = {0 <repeats 16 times>}}
ret = 1
nfds = 21
tv = {tv_sec = 0, tv_usec = 999998}
timeout = 1000
#2 0x0000000000436a4a in kvm_main_loop ()
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1589
mask = {__val = {268443712, 0 <repeats 15 times>}}
sigfd = 19
#3 0x000000000041d785 in main_loop () at /usr/src/qemu-kvm-0.14.0/vl.c:1429
r = 0
#4 0x000000000042166a in main (argc=33, argv=0x7fffffffe658,
envp=0x7fffffffe768) at /usr/src/qemu-kvm-0.14.0/vl.c:3201
gdbstub_dev = 0x0
i = 64
snapshot = 0
linux_boot = 0
icount_option = 0x0
initrd_filename = 0x0
kernel_filename = 0x0
kernel_cmdline = 0x5e57a3 ""
boot_devices = "c\000d", '\000' <repeats 29 times>
ds = 0x15cb380
dcl = 0x0
cyls = 0
heads = 0
secs = 0
translation = 0
hda_opts = 0x0
opts = 0x1171a80
olist = 0x7fffffffe3f0
optind = 33
optarg = 0x7fffffffebce "tablet"
loadvm = 0x0
machine = 0x962cc0
cpu_model = 0x7fffffffeb51 "qemu64,model_id=Intel(R) Xeon(R) CPU", ' ' <repeats 11 times>, "E5640 @ 2.67GHz,-nx"
tb_size = 0
pid_file = 0x7fffffffeb10 "/var/run/qemu/vm-265.pid"
incoming = 0x0
show_vnc_port = 0
defconfig = 1
(gdb) info threads
3 Thread 0x7ffff6020700 (LWP 29043) 0x00007ffff79c385c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
2 Thread 0x7ffff694e700 (LWP 29042) 0x00007ffff6c31197 in ioctl ()
from /lib/libc.so.6
* 1 Thread 0x7ffff7ff0700 (LWP 29038) 0x0000000000000000 in ?? ()
(gdb)


Help appreciated!

Thanks,
Peter
Corentin Chary
2011-03-09 07:13:43 UTC
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
Hi, it looks like something that was already fixed some time ago.
What version exactly are you using? Could you try the one from the
stable-0.14 git branch?
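(For example, assuming the upstream qemu tree on Savannah -- the URL, and
whether you want the qemu or the qemu-kvm tree, may differ for your setup:

  git clone git://git.savannah.nongnu.org/qemu.git
  cd qemu
  git checkout -b stable-0.14 origin/stable-0.14
  ./configure --target-list=x86_64-softmmu && make
)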
Thanks,
--
Corentin Chary
http://xf.iksaif.net
Stefan Weil
2011-03-09 07:26:52 UTC
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following
segfault. i have seen similar crash already in 0.13.0, but had no time
to debug.
my guess is that this segfault is related to the threaded vnc server
which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change
in the guest. i have a backtrace attached. the debugger is still
running if someone
needs more output
[...]
Help appreciated!
Thanks,
Peter
Hi Peter,

did you apply this patch which fixes one of the known vnc problems
(but is still missing in qemu git master):

http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html

Then you can read this thread:

http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html

And finally, the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get here in my
environment. They add assertions for the bad memory accesses which
sometimes occur when a vnc client-server connection exists and the
screen is refreshed after a resolution change.
The code line with the //~ comment also includes a fix which works for me.

Regards,
Stefan W.

@@ -2382,6 +2384,10 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
     int y;
     uint8_t *guest_row;
     uint8_t *server_row;
+
+    size_t guest_size = vd->guest.ds->linesize * vd->guest.ds->height;
+    size_t server_size = vd->server->linesize * vd->server->height;
+
     int cmp_bytes;
     VncState *vs;
     int has_dirty = 0;
@@ -2399,11 +2405,15 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
      * Update server dirty map.
      */
     cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds);
+    if (cmp_bytes > vd->ds->surface->linesize) {
+        //~ fix crash: cmp_bytes = vd->ds->surface->linesize;
+    }
     guest_row = vd->guest.ds->data;
     server_row = vd->server->data;
     for (y = 0; y < vd->guest.ds->height; y++) {
         if (!bitmap_empty(vd->guest.dirty[y], VNC_DIRTY_BITS)) {
             int x;
+            size_t size_offset = 0;
             uint8_t *guest_ptr;
             uint8_t *server_ptr;

@@ -2412,6 +2422,9 @@ static int vnc_refresh_server_surface(VncDisplay *vd)

             for (x = 0; x < vd->guest.ds->width;
                  x += 16, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) {
+                assert(size_offset + cmp_bytes <= guest_size);
+                assert(size_offset + cmp_bytes <= server_size);
+                size_offset += cmp_bytes;
                 if (!test_and_clear_bit((x / 16), vd->guest.dirty[y]))
                     continue;
                 if (memcmp(server_ptr, guest_ptr, cmp_bytes) == 0)
@@ -2427,6 +2440,10 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
         }
         guest_row += ds_get_linesize(vd->ds);
         server_row += ds_get_linesize(vd->ds);
+        assert(guest_size >= ds_get_linesize(vd->ds));
+        guest_size -= ds_get_linesize(vd->ds);
+        assert(server_size >= ds_get_linesize(vd->ds));
+        server_size -= ds_get_linesize(vd->ds);
     }
     return has_dirty;
 }

Michael Tokarev
2011-03-09 07:39:17 UTC
Post by Stefan Weil
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following
segfault. i have seen similar crash already in 0.13.0, but had no time
to debug.
my guess is that this segfault is related to the threaded vnc server
which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change
in the guest. i have a backtrace attached. the debugger is still
running if someone
needs more output
[]
Post by Stefan Weil
Hi Peter,
did you apply this patch which fixes one of the known vnc problems
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html
This patch is not suitable for 0.14, since in current qemu/master quite
a lot of stuff has changed in this area (bitmaps were added); there is no
similar infrastructure in 0.14.
Post by Stefan Weil
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html
And finally the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get here in
my environment. They add assertions for bad memory access
which occurs sometimes when a vnc client-server connection exists and
the screen is refreshed after a resolution change.
The code line with the //~ comment also includes a fix which
works for me.
The same is true for this patch, but to a lesser extent: it can be applied
manually (the bitmap_empty context line).

I wonder if something similar actually exists in 0.13/0.14 too and needs
to be backported to -stable.
Post by Stefan Weil
Regards,
Stefan W.
Thanks!

/mjt
Stefan Weil
2011-03-09 09:22:31 UTC
Post by Michael Tokarev
Post by Stefan Weil
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following
segfault. i have seen similar crash already in 0.13.0, but had no time
to debug.
my guess is that this segfault is related to the threaded vnc server
which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change
in the guest. i have a backtrace attached. the debugger is still
running if someone
needs more output
[]
Post by Stefan Weil
Hi Peter,
did you apply this patch which fixes one of the known vnc problems
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html
This patch is not suitable for 0.14 since in current qemu/master quite
alot of stuff were changed in this area (bitmaps added), there's no
similar infrastructure in 0.14.
Post by Stefan Weil
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html
And finally the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get here in
my environment. They add assertions for bad memory access
which occurs sometimes when a vnc client-server connection exists and
the screen is refreshed after a resolution change.
The code line with the //~ comment also includes a fix which
works for me.
The same is true for this patch, but of a less extent: it can be applied
manually (the bitmap_empty context line).
I wonder if something similar actually exists in 0.13/0.14 too and needs
to be backported to -stable.
Post by Stefan Weil
Regards,
Stefan W.
Thanks!
/mjt
I just tested stable-0.14. It shows the same kind of bug.
Output of qemu run with valgrind:

==18143== Conditional jump or move depends on uninitialised value(s)
==18143== at 0x4027022: bcmp (mc_replace_strmem.c:541)
==18143== by 0x80EEF96: vnc_refresh_server_surface (vnc.c:2292)
==18143== by 0x80EF0F1: vnc_refresh (vnc.c:2322)
==18143== by 0x80FA026: qemu_run_timers (qemu-timer.c:503)
==18143== by 0x80FA34E: qemu_run_all_timers (qemu-timer.c:634)
==18143== by 0x816BBB6: main_loop_wait (vl.c:1383)
==18143== by 0x816BC36: main_loop (vl.c:1424)
==18143== by 0x816FEAF: main (vl.c:3136)
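(Roughly how such a run can be reproduced -- the exact invocation is not in
the mail, and the binary path depends on the build:

  valgrind --track-origins=yes --smc-check=all \
      ./x86_64-softmmu/qemu-system-x86_64 -vnc :1 [...]
)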

Stefan

Peter Lieven
2011-03-09 10:00:48 UTC
Post by Stefan Weil
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
[...]
Help appreciated!
Thanks,
Peter
Hi Stefan,
Post by Stefan Weil
Hi Peter,
did you apply this patch which fixes one of the known vnc problems
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html
It seems that this patch is not compatible with 0.14.0: 0.14.0 uses
vnc_set_bits and vnc_and_bits and no bitmap functions.
Does the problem you observed affect these functions as well?
Then we might need to backport it here. My intention is to get 0.14.0 stable
for daily operation.
Post by Stefan Weil
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html
You mention a patch that was on qemu-devel recently. Which patch do you refer
to?
Post by Stefan Weil
And finally the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get here in
my environment. They add assertions for bad memory access
which occurs sometimes when a vnc client-server connection exists and
the screen is refreshed after a resolution change.
The code line with the //~ comment also includes a fix which
works for me.
I added these checks and also turned the //~ comment into an assert. I will
now run some tests again.
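(Roughly, in vnc_refresh_server_surface() -- a sketch reconstructed from the
assertion text quoted later in this thread, not the literal diff:

    cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds);
    /* fail loudly instead of clamping, so the overflow case becomes visible */
    assert(cmp_bytes <= vd->ds->surface->linesize);
)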

Thanks,
Peter
Peter Lieven
2011-03-15 12:53:20 UTC
Post by Stefan Weil
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following
segfault. i have seen similar crash already in 0.13.0, but had no
time to debug.
my guess is that this segfault is related to the threaded vnc server
which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change
in the guest. i have a backtrace attached. the debugger is still
running if someone
needs more output
[...]
Help appreciated!
Thanks,
Peter
Hi Peter,
did you apply this patch which fixes one of the known vnc problems
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html
And finally the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get here in
my environment. They add assertions for bad memory access
which occurs sometimes when a vnc client-server connection exists and
the screen is refreshed after a resolution change.
The code line with the //~ comment also includes a fix which
works for me.
Regards,
Stefan W.
@@ -2399,11 +2405,15 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
      * Update server dirty map.
      */
     cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds);
+    if (cmp_bytes > vd->ds->surface->linesize) {
+        //~ fix crash: cmp_bytes = vd->ds->surface->linesize;
+    }
Stefan, I can confirm that I ran into issues if this check is missing:

qemu-kvm-0.14.0: ui/vnc.c:2292: vnc_refresh_server_surface: Assertion
`cmp_bytes <= vd->ds->surface->linesize' failed.

Somebody should definitely look into the root cause, or at least your
proposed check+fix should be added.
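(A hypothetical debugging aid, not something from this thread: instead of
aborting, one could log how far cmp_bytes overshoots and then clamp, which
keeps the guest running while still recording the bad case:

    if (cmp_bytes > vd->ds->surface->linesize) {
        fprintf(stderr, "vnc: cmp_bytes %d > surface linesize %d\n",
                cmp_bytes, vd->ds->surface->linesize);
        cmp_bytes = vd->ds->surface->linesize;  /* Stefan's proposed clamp */
    }
)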

Peter
Stefan Weil
2011-03-15 18:52:27 UTC
Post by Peter Lieven
Post by Stefan Weil
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following
segfault. i have seen similar crash already in 0.13.0, but had no
time to debug.
my guess is that this segfault is related to the threaded vnc server
which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution
change in the guest. i have a backtrace attached. the debugger is
still running if someone
needs more output
[...]
#0 0x00007ffff6c31197 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1 0x0000000000434c12 in kvm_run (env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:582
r = 0
kvm = 0x1172538
run = 0x7ffff7fc7000
fd = 15
#2 0x0000000000435ed4 in kvm_cpu_exec (env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1233
r = 0
#3 0x00000000004364f7 in kvm_main_loop_cpu (env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1419
run_cpu = 1
#4 0x0000000000436640 in ap_main_loop (_env=0x118c2e0)
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1466
env = 0x118c2e0
signals = {__val = {18446744067267100671,
18446744073709551615 <repeats 15 times>}}
data = 0x0
#5 0x00007ffff79be9ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#6 0x00007ffff6c3970d in clone () from /lib/libc.so.6
No symbol table info available.
#7 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
pioh = 0x1652870
ioh = 0x1613330
rfds = {fds_bits = {0 <repeats 16 times>}}
wfds = {fds_bits = {1048576, 0 <repeats 15 times>}}
xfds = {fds_bits = {0 <repeats 16 times>}}
ret = 1
nfds = 21
tv = {tv_sec = 0, tv_usec = 999998}
timeout = 1000
#2 0x0000000000436a4a in kvm_main_loop ()
at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1589
mask = {__val = {268443712, 0 <repeats 15 times>}}
sigfd = 19
#3 0x000000000041d785 in main_loop () at
/usr/src/qemu-kvm-0.14.0/vl.c:1429
r = 0
#4 0x000000000042166a in main (argc=33, argv=0x7fffffffe658,
envp=0x7fffffffe768) at /usr/src/qemu-kvm-0.14.0/vl.c:3201
gdbstub_dev = 0x0
i = 64
snapshot = 0
linux_boot = 0
icount_option = 0x0
initrd_filename = 0x0
kernel_filename = 0x0
kernel_cmdline = 0x5e57a3 ""
boot_devices = "c\000d", '\000' <repeats 29 times>
ds = 0x15cb380
dcl = 0x0
cyls = 0
heads = 0
secs = 0
translation = 0
hda_opts = 0x0
opts = 0x1171a80
olist = 0x7fffffffe3f0
optind = 33
optarg = 0x7fffffffebce "tablet"
loadvm = 0x0
machine = 0x962cc0
cpu_model = 0x7fffffffeb51 "qemu64,model_id=Intel(R) Xeon(R) CPU", '
tb_size = 0
pid_file = 0x7fffffffeb10 "/var/run/qemu/vm-265.pid"
incoming = 0x0
show_vnc_port = 0
defconfig = 1
(gdb) info threads
3 Thread 0x7ffff6020700 (LWP 29043) 0x00007ffff79c385c in
2 Thread 0x7ffff694e700 (LWP 29042) 0x00007ffff6c31197 in ioctl ()
from /lib/libc.so.6
* 1 Thread 0x7ffff7ff0700 (LWP 29038) 0x0000000000000000 in ?? ()
(gdb)
Help appreciated!
Thanks,
Peter
Hi Peter,
did you apply this patch, which fixes one of the known vnc problems?
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html
And finally the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get here in
my environment. They add assertions for the bad memory accesses which
sometimes occur when a vnc client-server connection exists and
the screen is refreshed after a resolution change.
The code line with the //~ comment also includes a fix which
works for me.
Regards,
Stefan W.
@@ -2382,6 +2384,10 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
int y;
uint8_t *guest_row;
uint8_t *server_row;
+
+ size_t guest_size = vd->guest.ds->linesize * vd->guest.ds->height;
+ size_t server_size = vd->server->linesize * vd->server->height;
+
int cmp_bytes;
VncState *vs;
int has_dirty = 0;
@@ -2399,11 +2405,15 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
* Update server dirty map.
*/
cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds);
+ if (cmp_bytes > vd->ds->surface->linesize) {
+ //~ fix crash: cmp_bytes = vd->ds->surface->linesize;
+ }
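A minimal, self-contained sketch of the clamp the //~ line hints at, using only the names visible in the hunks above (ds_get_bytes_per_pixel(), the surface linesize); this illustrates the idea and is not the literal patch that was sent:

#include <assert.h>
#include <stddef.h>

/* The dirty-map scan compares 16 pixels at a time; if that is wider than one
 * scanline (e.g. right after a switch to a very narrow resolution), the
 * memcmp()/memcpy() in vnc_refresh_server_surface() would run past the end
 * of the surface, so the compare width has to be clamped. */
static size_t clamped_cmp_bytes(int bytes_per_pixel, int surface_linesize)
{
    size_t cmp_bytes = 16 * (size_t)bytes_per_pixel;

    assert(surface_linesize > 0);
    if (cmp_bytes > (size_t)surface_linesize) {
        cmp_bytes = (size_t)surface_linesize;
    }
    return cmp_bytes;
}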
qemu-kvm-0.14.0: ui/vnc.c:2292: vnc_refresh_server_surface: Assertion
`cmp_bytes <= vd->ds->surface->linesize' failed.
Somebody should definitely look into the root cause, or at least your
proposed check+fix should be added.
Peter
Hi Peter,

I just sent the patch which limits cmp_bytes to qemu-devel
(two mails: for stable-0.14 and for master).

The bug is triggered by certain strange screen resolutions,
for example while booting MIPS Malta with a Debian Linux kernel.

Regards,
Stefan

Jan Kiszka
2011-03-09 07:37:48 UTC
Permalink
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
...
Post by Peter Lieven
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling an IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.

And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like

assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
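A sketch of how that check could be dropped into qemu-kvm for the experiment (the helper name and its placement at the top of qemu_set_fd_handler2() are assumptions; only the trylock trick itself comes from the suggestion above):

#include <assert.h>
#include <pthread.h>

/* qemu-kvm's global lock; upstream qemu would need its own iothread mutex. */
extern pthread_mutex_t qemu_mutex;

/* pthread_mutex_trylock() returns non-zero (EBUSY) when the mutex is already
 * locked, so the assert aborts exactly when the fd-handler table is reached
 * with the global lock free, i.e. when the suspected race is possible.  It
 * cannot tell which thread holds the lock, so it is only a heuristic. */
static void assert_global_lock_held(void)
{
    assert(pthread_mutex_trylock(&qemu_mutex) != 0);
}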

BTW, qemu with just --enable-vnc-thread, ie. without io-thread support,
should always run into this race as it then definitely lacks a global mutex.

Jan
Corentin Chary
2011-03-09 08:50:25 UTC
Permalink
Post by Jan Kiszka
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
Post by Jan Kiszka
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
Post by Jan Kiszka
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
Post by Jan Kiszka
needs more output
...
#0  0x0000000000000000 in ?? ()
No symbol table info available.
#1  0x000000000041d669 in main_loop_wait (nonblocking=0)
    at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
Post by Jan Kiszka
the global mutex held.
And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
Post by Jan Kiszka
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
BTW, qemu with just --enable-vnc-thread, ie. without io-thread support,
Post by Jan Kiszka
should always run into this race as it then definitely lacks a global mutex.

I'm not sure what mutex should be locked here (qemu_global_mutex,
qemu_fair_mutex, lock_iothread).
But here is where it should be locked (other vnc_write calls in this
thread should never trigger qemu_set_fd_handler):

diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index 1d4c5e7..e02d891 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -258,7 +258,9 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
goto disconnected;
}

+ /* lock */
vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ /* unlock */

disconnected:
/* Copy persistent encoding data */
@@ -267,7 +269,9 @@ disconnected:
vnc_unlock_output(job->vs);

if (flush) {
+ /* lock */
vnc_flush(job->vs);
+ /* unlock */
}

vnc_lock_queue(queue)

--
Corentin Chary
http://xf.iksaif.net
Jan Kiszka
2011-03-09 09:04:02 UTC
Permalink
Post by Corentin Chary
Post by Jan Kiszka
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
...
Post by Peter Lieven
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
BTW, qemu with just --enable-vnc-thread, ie. without io-thread support,
should always run into this race as it then definitely lacks a global mutex.
I'm not sure what mutex should be locked here (qemu_global_mutex,
qemu_fair_mutex, lock_iothread).
In upstream qemu, the latter - if it exists (which is not the case in
non-io-thread mode).

In qemu-kvm, those locks are yet unused. Rather the code in qemu-kvm.c
implements the global lock.
Post by Corentin Chary
But here is where is should be locked (other vnc_write calls in this
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index 1d4c5e7..e02d891 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -258,7 +258,9 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
goto disconnected;
}
+ /* lock */
vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ /* unlock */
/* Copy persistent encoding data */
vnc_unlock_output(job->vs);
if (flush) {
+ /* lock */
vnc_flush(job->vs);
+ /* unlock */
}
Particularly this call is critical as it will trigger
vnc_client_write_locked which may NULL'ify the write handler.

But I'm also not sure about vnc_send_framebuffer_update. Someone has to
go through the threaded vnc code again, very thoroughly. It looks fragile.

Jan
Corentin Chary
2011-03-09 09:54:51 UTC
Permalink
Post by Jan Kiszka
Post by Jan Kiszka
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
When using the vnc thread, nothing in the vnc thread will ever be
called directly by an IO-handler. So I don't really see how it would
trigger this.
And since there is a lock for accessing the output buffer (and the
network actually), there should not be any race condition either.

So the real issue is that qemu_set_fd_handler2() is called outside of
the main thread by those two vnc_write() and vnc_flush() calls,
without any kind of locking.
Post by Jan Kiszka
In upstream qemu, the latter - if it exists (which is not the case in
non-io-thread mode).
In qemu-kvm, those locks are yet unused. Rather the code in qemu-kvm.c
implements the global lock.
So there is currently no lock for that when io-thread is disabled :/.
Spice also seems to protect this kind of thing with
qemu_mutex_lock_iothread().

Maybe qemu_mutex_lock_iothread() should also be defined when
CONFIG_VNC_THREAD=y ?
Post by Jan Kiszka
But I'm also not sure about vnc_send_framebuffer_update. Someone has to
go through the threaded vnc code again, very thoroughly. It looks fragile.
While vnc-thread is enabled, vnc_send_framebuffer_update() will always
call vnc_write() with csock = -1, into a temporary buffer. Check
vnc_async_encoding_start() and vnc_async_encoding_end(): they provide
a kind of "sandbox" that prevents the thread from writing anything the
main thread will use. You can also see it as a "transaction": the
thread computes the update in a temporary buffer, and only sends it to
the network (real vnc_write calls with csock correctly set) once it has
successfully finished.

The only two function calls that break this isolation are the two
that I pointed out earlier.
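A toy model of that sandbox (simplified types, not the actual ui/vnc-jobs-async.c code) makes the isolation easier to see: while csock is forced to -1, every write lands in a private buffer and the fd-handler table is never touched:

#include <stddef.h>
#include <string.h>

/* Toy model only: the worker encodes into a private copy whose "socket" is
 * disabled, so nothing shared with the main thread is modified until the
 * finished buffer is published back under the output lock. */
typedef struct {
    int    csock;           /* -1 while the worker owns this copy */
    char   output[4096];
    size_t offset;
} ToyVncState;

static void toy_encoding_start(const ToyVncState *orig, ToyVncState *local)
{
    *local = *orig;
    local->csock  = -1;     /* sandboxed: writes stay in local->output */
    local->offset = 0;
}

static void toy_write(ToyVncState *vs, const void *data, size_t len)
{
    if (vs->offset == 0 && vs->csock != -1) {
        /* this is the only place that would register a write fd handler,
         * i.e. touch global state -- skipped while csock == -1 */
    }
    if (vs->offset + len <= sizeof(vs->output)) {
        memcpy(vs->output + vs->offset, data, len);
        vs->offset += len;
    }
}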
--
Corentin Chary
http://xf.iksaif.net
Jan Kiszka
2011-03-09 09:58:44 UTC
Permalink
Post by Corentin Chary
Post by Jan Kiszka
Post by Jan Kiszka
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
When using the vnc thread, nothing in the vnc thread will never be
called directly by an IO-handler. So I don't really how in would
trigger this.
And since there is a lock for accessing the output buffer (and the
network actually), there should not be any race condition either.
So the real issue, is that qemu_set_fd_handler2() is called outside of
the main thread by those two vnc_write() and vnc_flush() calls,
without any kind of locking.
Yes, that's what I was referring to.
Post by Corentin Chary
Post by Jan Kiszka
In upstream qemu, the latter - if it exists (which is not the case in
non-io-thread mode).
In qemu-kvm, those locks are yet unused. Rather the code in qemu-kvm.c
implements the global lock.
So there is currently no lock for that when io-thread is disabled :/.
Spice also seems to project this kind of thing with
qemu_mutex_lock_iothread().
Maybe qemu_mutex_lock_iothread() should also be defined when
CONFIG_VNC_THREAD=y ?
Post by Jan Kiszka
But I'm also not sure about vnc_send_framebuffer_update. Someone has to
go through the threaded vnc code again, very thoroughly. It looks fragile.
while vnc-thread is enabled vnc_send_framebuffer_update() will always
call vnc_write() with csock = -1 in a temporary buffer. Check
vnc_async_encoding_start() and vnc_async_encoding_end(), they provide
a kind of "sandbox" that prevent the thread to write anything the
main-thread will use. You can also see that as a "transaction": the
thread compute the update in a temporary buffer, and only send it to
the network (real vnc_write calls with csock correctly set) once it's
successfully finished.
The is only two functions calls that break this isolation are the two
that I pointed out earlier.
Probably the best way is to make vnc stop fiddling with
qemu_set_fd_handler2, specifically in threaded mode. Why does it need to
set/reset the write handler all the time?

Jan
Jan Kiszka
2011-03-09 10:02:37 UTC
Permalink
Post by Jan Kiszka
Post by Corentin Chary
Post by Jan Kiszka
Post by Jan Kiszka
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
When using the vnc thread, nothing in the vnc thread will never be
called directly by an IO-handler. So I don't really how in would
trigger this.
And since there is a lock for accessing the output buffer (and the
network actually), there should not be any race condition either.
So the real issue, is that qemu_set_fd_handler2() is called outside of
the main thread by those two vnc_write() and vnc_flush() calls,
without any kind of locking.
Yes, that's what I was referring to.
Post by Corentin Chary
Post by Jan Kiszka
In upstream qemu, the latter - if it exists (which is not the case in
non-io-thread mode).
In qemu-kvm, those locks are yet unused. Rather the code in qemu-kvm.c
implements the global lock.
So there is currently no lock for that when io-thread is disabled :/.
Spice also seems to project this kind of thing with
qemu_mutex_lock_iothread().
Maybe qemu_mutex_lock_iothread() should also be defined when
CONFIG_VNC_THREAD=y ?
Post by Jan Kiszka
But I'm also not sure about vnc_send_framebuffer_update. Someone has to
go through the threaded vnc code again, very thoroughly. It looks fragile.
while vnc-thread is enabled vnc_send_framebuffer_update() will always
call vnc_write() with csock = -1 in a temporary buffer. Check
vnc_async_encoding_start() and vnc_async_encoding_end(), they provide
a kind of "sandbox" that prevent the thread to write anything the
main-thread will use. You can also see that as a "transaction": the
thread compute the update in a temporary buffer, and only send it to
the network (real vnc_write calls with csock correctly set) once it's
successfully finished.
The is only two functions calls that break this isolation are the two
that I pointed out earlier.
Probably the best way is to make vnc stop fiddling with
qemu_set_fd_handler2, specifically in threaded mode. Why does it need to
set/reset the write handler all the time?
The other question is: Who's responsible for writing to the client
socket in threaded mode? Only the vnc thread(s), or also other qemu
threads? In the former case, just avoid setting a write handler at all.

Jan
Corentin Chary
2011-03-09 10:06:58 UTC
Permalink
Post by Jan Kiszka
Probably the best way is to make vnc stop fiddling with
qemu_set_fd_handler2, specifically in threaded mode.
Why does it need to set/reset the write handler all the time?
I didn't write the original code, but it's probably to avoid calling a
write handler when there is no data to write. That would make select()
busy-loop when there is actually nothing to send.
There is also the vnc_disconnect_start() case, when the socket seems to
be broken and we remove all handlers.
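Roughly, the write handler is only kept registered while output is pending, as in this sketch (assuming QEMU 0.14's qemu_set_fd_handler2(), vnc_client_read() and vnc_client_write(); the real code spreads this logic over vnc_write() and vnc_client_write() rather than one helper):

/* Sketch only: register the write handler while data is buffered and drop it
 * once the buffer drains, so select() never spins on an always-writable
 * socket. */
static void toy_update_write_handler(VncState *vs)
{
    if (vs->output.offset > 0) {
        /* pending output: wake us up when the socket becomes writable */
        qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read,
                             vnc_client_write, vs);
    } else {
        /* nothing to send: keep only the read handler */
        qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read, NULL, vs);
    }
}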
Post by Jan Kiszka
The other question is: Who's responsible for writing to the client
socket in threaded mode? Only the vnc thread(s), or also other qemu
threads? In the former case, just avoid setting a write handler at all.
Cheap stuff is done by the main thread (cursor, etc.). The thread
only does framebuffer updates.
--
Corentin Chary
http://xf.iksaif.net
Jan Kiszka
2011-03-09 10:12:53 UTC
Permalink
Post by Corentin Chary
Post by Jan Kiszka
Probably the best way is to make vnc stop fiddling with
qemu_set_fd_handler2, specifically in threaded mode.
Why does it need to set/reset the write handler all the time?
I didn't write the original code, but it's probably to avoid calling a
write handler when there is no data to write. That would make select()
busy loop when there are actually no data to write.
There is also the vnc_disconnect_start() case when the socket seems to
be broken and we remove all handlers.
Post by Jan Kiszka
The other question is: Who's responsible for writing to the client
socket in threaded mode? Only the vnc thread(s), or also other qemu
threads? In the former case, just avoid setting a write handler at all.
Cheap stuff is done by the main thread (cursor, etc...). The thread
only do framebuffer updates.
And both are synchronized with a vnc-private lock only?

The problem with this model is the non-threaded qemu execution model.
Even if we acquire the global mutex to protect handler updates against
the main select loop, this won't help here. So vnc-threading likely
needs to be made dependent on --enable-io-thread.

Jan
Corentin Chary
2011-03-09 10:14:55 UTC
Permalink
Post by Jan Kiszka
Post by Corentin Chary
Cheap stuff is done by the main thread (cursor, etc...). The thread
only do framebuffer updates.
And both are synchronized with a vnc-private lock only?
Yes
Post by Jan Kiszka
The problem with this model is the non-threaded qemu execution model.
Even if we acquire the global mutex to protect handler updates against
the main select loop, this won't help here. So vnc-threading likely
needs to be made dependent on --enable-io-thread.
Plus proper lock_iothread() calls right ?

If yes I'll send a patch for that (or you can do it, as you want).

Thanks,
--
Corentin Chary
http://xf.iksaif.net
Jan Kiszka
2011-03-09 10:17:02 UTC
Permalink
Post by Corentin Chary
Post by Jan Kiszka
Post by Corentin Chary
Cheap stuff is done by the main thread (cursor, etc...). The thread
only do framebuffer updates.
And both are synchronized with a vnc-private lock only?
Yes
Post by Jan Kiszka
The problem with this model is the non-threaded qemu execution model.
Even if we acquire the global mutex to protect handler updates against
the main select loop, this won't help here. So vnc-threading likely
needs to be made dependent on --enable-io-thread.
Plus proper lock_iothread() calls right ?
Yep.
Post by Corentin Chary
If yes I'll send a patch for that (or you can do it, as you want).
Don't wait for me. :)

Jan
Corentin Chary
2011-03-09 10:41:24 UTC
Permalink
The threaded VNC server messes with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.

The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depend on CONFIG_IO_THREAD,
and adds lock and unlock calls around the two faulty calls.

Thanks to Jan Kiszka for helping me solve this issue.

Cc: Jan Kiszka <***@web.de>
Signed-off-by: Corentin Chary <***@gmail.com>
---
configure | 9 +++++++++
ui/vnc-jobs-async.c | 4 ++++
2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
roms="optionrom"
fi

+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+ echo
+ echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+ echo "Please use --enable-io-thread if you want to enable it."
+ echo
+ exit 1
+fi
+

echo "Install prefix $prefix"
echo "BIOS directory `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..093c0d4 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -260,7 +260,9 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
goto disconnected;
}

+ qemu_mutex_lock_iothread();
vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ qemu_mutex_unlock_iothread();

disconnected:
/* Copy persistent encoding data */
@@ -269,7 +271,9 @@ disconnected:
vnc_unlock_output(job->vs);

if (flush) {
+ qemu_mutex_lock_iothread();
vnc_flush(job->vs);
+ qemu_mutex_unlock_iothread();
}

vnc_lock_queue(queue);
--
1.7.3.4
Peter Lieven
2011-03-09 10:50:55 UTC
Permalink
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
qemu-kvm currently doesn't compile with --enable-io-thread. is there an easy fix
for this?

thanks,
peter
Post by Corentin Chary
Thanks to Jan Kiszka for helping me solve this issue.
---
configure | 9 +++++++++
ui/vnc-jobs-async.c | 4 ++++
2 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
roms="optionrom"
fi
+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+ echo
+ echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+ echo "Please use --enable-io-thread if you want to enable it."
+ echo
+ exit 1
+fi
+
echo "Install prefix $prefix"
echo "BIOS directory `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..093c0d4 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -260,7 +260,9 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
goto disconnected;
}
+ qemu_mutex_lock_iothread();
vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ qemu_mutex_unlock_iothread();
/* Copy persistent encoding data */
vnc_unlock_output(job->vs);
if (flush) {
+ qemu_mutex_lock_iothread();
vnc_flush(job->vs);
+ qemu_mutex_unlock_iothread();
}
vnc_lock_queue(queue);
--
1.7.3.4
Corentin Chary
2011-03-09 10:57:36 UTC
Permalink
Post by Peter Lieven
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
qemu-kvm currently doesn't compile with --enable-io-thread. is there an easy fix
for this?
If IO Thread is not available, I'm afraid that --disable-vnc-thread is
the only fix.
Or, you can try to define some global mutex acting like the iothread
locks, but that doesn't sound like an easy fix.
--
Corentin Chary
http://xf.iksaif.net
Stefan Hajnoczi
2011-03-09 11:05:36 UTC
Permalink
On Wed, Mar 9, 2011 at 10:57 AM, Corentin Chary
Post by Corentin Chary
Post by Peter Lieven
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
qemu-kvm currently doesn't compile with --enable-io-thread. is there an easy fix
for this?
If IO Thread is not available, I'm afraid that --disable-vnc-thread is
the only fix.
Or, you can try to define some global mutex acting like iothread
locks, but that doesn't sounds like an easy fix.
Jan or Marcelo can help here but qemu-kvm has an iothread equivalent
built in by default. It should be possible to use that.

Stefan
Jan Kiszka
2011-03-09 11:25:50 UTC
Permalink
Post by Stefan Hajnoczi
On Wed, Mar 9, 2011 at 10:57 AM, Corentin Chary
Post by Corentin Chary
Post by Peter Lieven
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
qemu-kvm currently doesn't compile with --enable-io-thread. is there an easy fix
for this?
If IO Thread is not available, I'm afraid that --disable-vnc-thread is
the only fix.
Or, you can try to define some global mutex acting like iothread
locks, but that doesn't sounds like an easy fix.
Jan or Marcelo can help here but qemu-kvm has an iothread equivalent
built in by default. It should be possible to use that.
qemu_mutex_lock/unlock_iothread is properly provided in qemu-kvm even
without --enable-io-thread. So that tree could temporarily disable the
new configure check until we get rid of the special qemu-kvm bits.
Corentin's patch is against upstream; that adjustment needs to be made
once the commit is merged into qemu-kvm.

Jan
Peter Lieven
2011-03-09 11:32:27 UTC
Permalink
Post by Jan Kiszka
Post by Stefan Hajnoczi
On Wed, Mar 9, 2011 at 10:57 AM, Corentin Chary
Post by Corentin Chary
Post by Peter Lieven
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
qemu-kvm currently doesn't compile with --enable-io-thread. is there an easy fix
for this?
If IO Thread is not available, I'm afraid that --disable-vnc-thread is
the only fix.
Or, you can try to define some global mutex acting like iothread
locks, but that doesn't sounds like an easy fix.
Jan or Marcelo can help here but qemu-kvm has an iothread equivalent
built in by default. It should be possible to use that.
qemu_mutex_lock/unlock_iothread is properly provided in qemu-kvm even
without --enable-io-thread. So that tree could temporarily disable the
new configure check until we got rid of the special qemu-kvm bits.
Corentin's patch is against upstream, that adjustment need to be made
once the commit is merged into qemu-kvm.
do i understand you right, that i should be able to use vnc-thread together with qemu-kvm
just now if I add Corentin's patch without the io-thread dependency?

if yes, i will do and try if I can force a crash again.

br,
peter
Post by Jan Kiszka
Jan
Jan Kiszka
2011-03-09 11:33:31 UTC
Permalink
Post by Peter Lieven
Post by Jan Kiszka
Post by Stefan Hajnoczi
On Wed, Mar 9, 2011 at 10:57 AM, Corentin Chary
Post by Corentin Chary
Post by Peter Lieven
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
qemu-kvm currently doesn't compile with --enable-io-thread. is there an easy fix
for this?
If IO Thread is not available, I'm afraid that --disable-vnc-thread is
the only fix.
Or, you can try to define some global mutex acting like iothread
locks, but that doesn't sounds like an easy fix.
Jan or Marcelo can help here but qemu-kvm has an iothread equivalent
built in by default. It should be possible to use that.
qemu_mutex_lock/unlock_iothread is properly provided in qemu-kvm even
without --enable-io-thread. So that tree could temporarily disable the
new configure check until we got rid of the special qemu-kvm bits.
Corentin's patch is against upstream, that adjustment need to be made
once the commit is merged into qemu-kvm.
do i understand you right, that i should be able to use vnc-thread together with qemu-kvm
just now if I add Corentin's patch without the io-thread dependency?
Yep.
Post by Peter Lieven
if yes, i will do and try if I can force a crash again.
TIA,
Jan
Jan Kiszka
2011-03-09 11:42:56 UTC
Permalink
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
Thanks to Jan Kiszka for helping me solve this issue.
---
configure | 9 +++++++++
ui/vnc-jobs-async.c | 4 ++++
2 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
roms="optionrom"
fi
+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+ echo
+ echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+ echo "Please use --enable-io-thread if you want to enable it."
+ echo
+ exit 1
+fi
+
echo "Install prefix $prefix"
echo "BIOS directory `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..093c0d4 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -260,7 +260,9 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
goto disconnected;
}
+ qemu_mutex_lock_iothread();
Doesn't this come with a risk of an ABBA deadlock between output_mutex
and the global lock? Here you take the global lock while holding
output_mutex, but I bet there is also the other way around when vnc
services are called from the main thread or a vcpu.
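Spelled out as a sketch (only the lock order matters here; the function bodies are placeholders, not actual QEMU code):

/* Suspected ABBA pattern: */
static void worker_thread_side(VncState *vs)
{
    vnc_lock_output(vs);           /* A: per-client output_mutex      */
    qemu_mutex_lock_iothread();    /* B: global lock, as in this hunk */
    /* ... vnc_write()/vnc_flush() ... */
    qemu_mutex_unlock_iothread();
    vnc_unlock_output(vs);
}

static void main_or_vcpu_side(VncState *vs)
{
    qemu_mutex_lock_iothread();    /* B: already held during emulation */
    vnc_lock_output(vs);           /* A: e.g. via a vnc_write() call   */
    /* ... */
    vnc_unlock_output(vs);
    qemu_mutex_unlock_iothread();
}

/* A->B in one thread and B->A in another can leave each thread waiting for
 * the lock the other one already holds. */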
Post by Corentin Chary
vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ qemu_mutex_unlock_iothread();
/* Copy persistent encoding data */
vnc_unlock_output(job->vs);
if (flush) {
+ qemu_mutex_lock_iothread();
vnc_flush(job->vs);
+ qemu_mutex_unlock_iothread();
}
vnc_lock_queue(queue);
The second hunk looks safe.

Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Peter Lieven
2011-03-09 12:50:45 UTC
Permalink
Post by Jan Kiszka
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
Thanks to Jan Kiszka for helping me solve this issue.
---
configure | 9 +++++++++
ui/vnc-jobs-async.c | 4 ++++
2 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
roms="optionrom"
fi
+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+ echo
+ echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+ echo "Please use --enable-io-thread if you want to enable it."
+ echo
+ exit 1
+fi
+
echo "Install prefix $prefix"
echo "BIOS directory `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..093c0d4 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -260,7 +260,9 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
goto disconnected;
}
+ qemu_mutex_lock_iothread();
Doesn't this comes with a risk of an ABBA deadlock between output_mutex
and the global lock? Here you take the global lock while holding
output_mutex, but I bet there is also the other way around when vnc
services are called from the main thread or a vcpu.
so after all i should stay with disabled vnc-thread in qemu-kvm until there is a solution?

br,
peter
Post by Jan Kiszka
Post by Corentin Chary
vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ qemu_mutex_unlock_iothread();
/* Copy persistent encoding data */
vnc_unlock_output(job->vs);
if (flush) {
+ qemu_mutex_lock_iothread();
vnc_flush(job->vs);
+ qemu_mutex_unlock_iothread();
}
vnc_lock_queue(queue);
The second hunk looks safe.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Corentin Chary
2011-03-09 13:21:32 UTC
Permalink
The threaded VNC server messes with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.

The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depend on CONFIG_IO_THREAD,
and adds lock and unlock calls around the two faulty calls.

Thanks to Jan Kiszka for helping me solve this issue.

Cc: Jan Kiszka <***@web.de>
Signed-off-by: Corentin Chary <***@gmail.com>
---
The previous patch was total crap: it introduced race conditions,
and probably crashes on client disconnections.

configure | 9 +++++++++
ui/vnc-jobs-async.c | 24 +++++++++++++++++++-----
2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
roms="optionrom"
fi

+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+ echo
+ echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+ echo "Please use --enable-io-thread if you want to enable it."
+ echo
+ exit 1
+fi
+

echo "Install prefix $prefix"
echo "BIOS directory `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..d0c6f61 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -190,6 +190,18 @@ static void vnc_async_encoding_end(VncState *orig, VncState *local)
queue->buffer = local->output;
}

+static void vnc_worker_lock_output(VncState *vs)
+{
+ qemu_mutex_lock_iothread();
+ vnc_lock_output(vs);
+}
+
+static void vnc_worker_unlock_output(VncState *vs)
+{
+ vnc_unlock_output(vs);
+ qemu_mutex_unlock_iothread();
+}
+
static int vnc_worker_thread_loop(VncJobQueue *queue)
{
VncJob *job;
@@ -211,11 +223,11 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
return -1;
}

- vnc_lock_output(job->vs);
+ vnc_worker_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
goto disconnected;
}
- vnc_unlock_output(job->vs);
+ vnc_worker_unlock_output(job->vs);

/* Make a local copy of vs and switch output buffers */
vnc_async_encoding_start(job->vs, &vs);
@@ -236,7 +248,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
/* output mutex must be locked before going to
* disconnected:
*/
- vnc_lock_output(job->vs);
+ vnc_worker_lock_output(job->vs);
goto disconnected;
}

@@ -255,7 +267,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;

/* Switch back buffers */
- vnc_lock_output(job->vs);
+ vnc_worker_lock_output(job->vs);
if (job->vs->csock == -1) {
goto disconnected;
}
@@ -266,10 +278,12 @@ disconnected:
/* Copy persistent encoding data */
vnc_async_encoding_end(job->vs, &vs);
flush = (job->vs->csock != -1 && job->vs->abort != true);
- vnc_unlock_output(job->vs);
+ vnc_worker_unlock_output(job->vs);

if (flush) {
+ qemu_mutex_lock_iothread();
vnc_flush(job->vs);
+ qemu_mutex_unlock_iothread();
}

vnc_lock_queue(queue);
--
1.7.3.4

Corentin Chary
2011-03-09 13:42:29 UTC
Permalink
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
Thanks to Jan Kiszka for helping me solve this issue.
---
The previous patch was total crap, introduced race conditions,
and probably crashs on client disconnections.
Forget that one too; it also deadlocks on vnc_jobs_join(). I'll need some
more time to fix that.
--
Corentin Chary
http://xf.iksaif.net
Paolo Bonzini
2011-03-09 13:51:10 UTC
Permalink
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
Thanks to Jan Kiszka for helping me solve this issue.
---
The previous patch was total crap, introduced race conditions,
and probably crashs on client disconnections.
configure | 9 +++++++++
ui/vnc-jobs-async.c | 24 +++++++++++++++++++-----
2 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
roms="optionrom"
fi
+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+ echo
+ echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+ echo "Please use --enable-io-thread if you want to enable it."
+ echo
+ exit 1
+fi
+
echo "Install prefix $prefix"
echo "BIOS directory `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..d0c6f61 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -190,6 +190,18 @@ static void vnc_async_encoding_end(VncState *orig, VncState *local)
queue->buffer = local->output;
}
+static void vnc_worker_lock_output(VncState *vs)
+{
+ qemu_mutex_lock_iothread();
+ vnc_lock_output(vs);
+}
+
+static void vnc_worker_unlock_output(VncState *vs)
+{
+ vnc_unlock_output(vs);
+ qemu_mutex_unlock_iothread();
+}
+
static int vnc_worker_thread_loop(VncJobQueue *queue)
{
VncJob *job;
@@ -211,11 +223,11 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
return -1;
}
- vnc_lock_output(job->vs);
+ vnc_worker_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
goto disconnected;
}
- vnc_unlock_output(job->vs);
+ vnc_worker_unlock_output(job->vs);
/* Make a local copy of vs and switch output buffers */
vnc_async_encoding_start(job->vs,&vs);
@@ -236,7 +248,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
/* output mutex must be locked before going to
*/
- vnc_lock_output(job->vs);
+ vnc_worker_lock_output(job->vs);
goto disconnected;
}
@@ -255,7 +267,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset + 1] = n_rectangles& 0xFF;
/* Switch back buffers */
- vnc_lock_output(job->vs);
+ vnc_worker_lock_output(job->vs);
if (job->vs->csock == -1) {
goto disconnected;
}
/* Copy persistent encoding data */
vnc_async_encoding_end(job->vs,&vs);
flush = (job->vs->csock != -1&& job->vs->abort != true);
- vnc_unlock_output(job->vs);
+ vnc_worker_unlock_output(job->vs);
if (flush) {
+ qemu_mutex_lock_iothread();
vnc_flush(job->vs);
+ qemu_mutex_unlock_iothread();
}
vnc_lock_queue(queue);
Acked-by: Paolo Bonzini <***@redhat.com> for stable.

For 0.15, I believe an iohandler-list lock is a better solution.
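One way such a lock could look, as a rough sketch (an assumption about the approach, not code that exists in any tree):

#include <pthread.h>

/* Rough sketch: one dedicated mutex serializing every read or update of the
 * io_handlers list, so a vnc worker could call qemu_set_fd_handler2() without
 * racing the select() preparation in main_loop_wait(). */
static pthread_mutex_t iohandler_lock = PTHREAD_MUTEX_INITIALIZER;

static void iohandler_list_lock(void)
{
    pthread_mutex_lock(&iohandler_lock);
}

static void iohandler_list_unlock(void)
{
    pthread_mutex_unlock(&iohandler_lock);
}

/* qemu_set_fd_handler2() and the handler walk in the main loop would bracket
 * their list accesses with these two calls. */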

Paolo
Corentin Chary
2011-03-09 13:59:09 UTC
Permalink
Post by Paolo Bonzini
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
The IO-Thread provides appropriate locking primitives to avoid that.
This patch makes CONFIG_VNC_THREAD depends on CONFIG_IO_THREAD,
and add lock and unlock calls around the two faulty calls.
Thanks to Jan Kiszka for helping me solve this issue.
---
The previous patch was total crap, introduced race conditions,
and probably crashs on client disconnections.
 configure           |    9 +++++++++
 ui/vnc-jobs-async.c |   24 +++++++++++++++++++-----
 2 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/configure b/configure
index 5513d3e..c8c1ac1 100755
--- a/configure
+++ b/configure
@@ -2455,6 +2455,15 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
   roms="optionrom"
 fi
+# VNC Thread depends on IO Thread
+if test "$vnc_thread" = "yes" -a "$io_thread" != "yes"; then
+  echo
+  echo "ERROR: VNC thread depends on IO thread which isn't enabled."
+  echo "Please use --enable-io-thread if you want to enable it."
+  echo
+  exit 1
+fi
+
 echo "Install prefix    $prefix"
 echo "BIOS directory    `eval echo $datadir`"
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..d0c6f61 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -190,6 +190,18 @@ static void vnc_async_encoding_end(VncState *orig, VncState *local)
     queue->buffer = local->output;
 }
+static void vnc_worker_lock_output(VncState *vs)
+{
+    qemu_mutex_lock_iothread();
+    vnc_lock_output(vs);
+}
+
+static void vnc_worker_unlock_output(VncState *vs)
+{
+    vnc_unlock_output(vs);
+    qemu_mutex_unlock_iothread();
+}
+
 static int vnc_worker_thread_loop(VncJobQueue *queue)
 {
     VncJob *job;
@@ -211,11 +223,11 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
         return -1;
     }
-    vnc_lock_output(job->vs);
+    vnc_worker_lock_output(job->vs);
     if (job->vs->csock == -1 || job->vs->abort == true) {
         goto disconnected;
     }
-    vnc_unlock_output(job->vs);
+    vnc_worker_unlock_output(job->vs);
     /* Make a local copy of vs and switch output buffers */
     vnc_async_encoding_start(job->vs, &vs);
@@ -236,7 +248,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
             /* output mutex must be locked before going to
              */
-            vnc_lock_output(job->vs);
+            vnc_worker_lock_output(job->vs);
             goto disconnected;
         }
@@ -255,7 +267,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
     vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;
     /* Switch back buffers */
-    vnc_lock_output(job->vs);
+    vnc_worker_lock_output(job->vs);
     if (job->vs->csock == -1) {
         goto disconnected;
     }
     /* Copy persistent encoding data */
     vnc_async_encoding_end(job->vs, &vs);
     flush = (job->vs->csock != -1 && job->vs->abort != true);
-    vnc_unlock_output(job->vs);
+    vnc_worker_unlock_output(job->vs);
     if (flush) {
+        qemu_mutex_lock_iothread();
         vnc_flush(job->vs);
+        qemu_mutex_unlock_iothread();
     }
     vnc_lock_queue(queue);
For 0.15, I believe an iohandler-list lock is a better solution.
I believe that's the only solution. Look at this deadlock:

(gdb) bt
#0 0x00007f70c067ad74 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x00007f70c0676219 in _L_lock_508 () from /lib/libpthread.so.0
#2 0x00007f70c067603b in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x00000000004c7559 in qemu_mutex_lock (mutex=0x10f4340) at qemu-thread.c:50
#4 0x00000000005644ef in main_loop_wait (nonblocking=<value optimized out>) at /home/iksaif/dev/qemu/vl.c:1374
#5 0x00000000005653a8 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /home/iksaif/dev/qemu/vl.c:1437
#6 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /home/iksaif/dev/qemu/vl.c:3148
(gdb) t 2
[Switching to thread 2 (Thread 0x7f709262b710 (LWP 19546))]#0 0x00007f70c067ad74 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007f70c067ad74 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x00007f70c0676219 in _L_lock_508 () from /lib/libpthread.so.0
#2 0x00007f70c067603b in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x00000000004c7559 in qemu_mutex_lock (mutex=0x10f4340) at qemu-thread.c:50
#4 0x00000000004c65de in vnc_worker_thread_loop (queue=0x33561d0) at ui/vnc-jobs-async.c:254
#5 0x00000000004c6730 in vnc_worker_thread (arg=0x33561d0) at ui/vnc-jobs-async.c:306
#6 0x00007f70c0673904 in start_thread () from /lib/libpthread.so.0
#7 0x00007f70be5cf1dd in clone () from /lib/libc.so.6
(gdb) t 3
[Switching to thread 3 (Thread 0x7f70b38ff710 (LWP 19545))]#0 0x00007f70c067829c in pthread_cond_wait () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007f70c067829c in pthread_cond_wait () from /lib/libpthread.so.0
#1 0x00000000004c7239 in qemu_cond_wait (cond=0x33561d4, mutex=0x80) at qemu-thread.c:133
#2 0x00000000004c617b in vnc_jobs_join (vs=0x35d9c10) at ui/vnc-jobs-async.c:155
#3 0x00000000004afefa in vnc_update_client_sync (ds=<value optimized out>, src_x=<value optimized out>, src_y=<value optimized out>, dst_x=<value optimized out>, dst_y=<value optimized out>, w=<value optimized out>, h=1) at ui/vnc.c:819
#4 vnc_dpy_copy (ds=<value optimized out>, src_x=<value optimized out>, src_y=<value optimized out>, dst_x=<value optimized out>, dst_y=<value optimized out>, w=<value optimized out>, h=1) at ui/vnc.c:692
#5 0x000000000046b5e1 in dpy_copy (ds=0x3329d40, src_x=<value optimized out>, src_y=<value optimized out>, dst_x=129, dst_y=377, w=2, h=1) at console.h:273
#6 qemu_console_copy (ds=0x3329d40, src_x=<value optimized out>, src_y=<value optimized out>, dst_x=129, dst_y=377, w=2, h=1) at console.c:1579
#7 0x00000000005d6159 in cirrus_do_copy (s=0x3316ea8) at /home/iksaif/dev/qemu/hw/cirrus_vga.c:729
#8 cirrus_bitblt_videotovideo_copy (s=0x3316ea8) at /home/iksaif/dev/qemu/hw/cirrus_vga.c:747
#9 cirrus_bitblt_videotovideo (s=0x3316ea8) at /home/iksaif/dev/qemu/hw/cirrus_vga.c:869
#10 cirrus_bitblt_start (s=0x3316ea8) at /home/iksaif/dev/qemu/hw/cirrus_vga.c:1010
#11 0x00000000005d8b67 in cirrus_mmio_writel (opaque=0x33561d4, addr=128, val=4294967042) at /home/iksaif/dev/qemu/hw/cirrus_vga.c:2819
#12 0x00000000004e04d8 in cpu_physical_memory_rw (addr=4060086592, buf=0x7f70b30fc028 "\002\377\377\377", len=4, is_write=1) at /home/iksaif/dev/qemu/exec.c:3670
#13 0x000000000042c845 in kvm_cpu_exec (env=0x30f8dd0) at /home/iksaif/dev/qemu/kvm-all.c:954
#14 0x000000000040c75e in qemu_kvm_cpu_thread_fn (arg=0x30f8dd0) at /home/iksaif/dev/qemu/cpus.c:832
#15 0x00007f70c0673904 in start_thread () from /lib/libpthread.so.0
#16 0x00007f70be5cf1dd in clone () from /lib/libc.so.6
(gdb) t 4
[Switching to thread 4 (Thread 0x7f70b4103710 (LWP 19544))]#0 0x00007f70c067ad74 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007f70c067ad74 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x00007f70c0676219 in _L_lock_508 () from /lib/libpthread.so.0
#2 0x00007f70c067603b in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x00000000004c7559 in qemu_mutex_lock (mutex=0x10f4340) at qemu-thread.c:50
#4 0x000000000042c703 in kvm_cpu_exec (env=0x30e15f0) at /home/iksaif/dev/qemu/kvm-all.c:924
#5 0x000000000040c75e in qemu_kvm_cpu_thread_fn (arg=0x30e15f0) at /home/iksaif/dev/qemu/cpus.c:832
#6 0x00007f70c0673904 in start_thread () from /lib/libpthread.so.0
#7 0x00007f70be5cf1dd in clone () from /lib/libc.so.6

I currently don't see any solution for that one using the iothread lock
(a minimal model of the cycle is sketched below):
- vnc_dpy_copy can be called with the iothread locked
- vnc_dpy_copy needs to wait for the vnc-thread to finish its work before
being able to do anything
- the vnc-thread needs to lock the iothread to finish its work
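As an illustration, the cycle boils down to the following stand-alone model
in plain pthreads (not QEMU code, just the shape of the problem; compile
with -pthread and it deadlocks by design):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t iothread_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_drained = PTHREAD_COND_INITIALIZER;
static int jobs_pending = 1;

/* vnc worker: needs the iothread lock to finish its work */
static void *vnc_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&iothread_lock);   /* blocks forever: main holds it */
    pthread_mutex_lock(&queue_lock);
    jobs_pending = 0;
    pthread_cond_signal(&queue_drained);
    pthread_mutex_unlock(&queue_lock);
    pthread_mutex_unlock(&iothread_lock);
    return NULL;
}

int main(void)
{
    pthread_t worker;

    /* main thread: vnc_dpy_copy() gets called with the iothread locked */
    pthread_mutex_lock(&iothread_lock);
    pthread_create(&worker, NULL, vnc_worker, NULL);

    /* ... and then waits for the worker, like vnc_jobs_join() does */
    pthread_mutex_lock(&queue_lock);
    while (jobs_pending) {
        pthread_cond_wait(&queue_drained, &queue_lock);
    }
    pthread_mutex_unlock(&queue_lock);
    pthread_mutex_unlock(&iothread_lock);

    puts("never reached");
    return 0;
}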

--
Corentin Chary
http://xf.iksaif.net
Corentin Chary
2011-03-10 12:59:39 UTC
Permalink
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.

Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.

Instead, we now store the data in a temporary buffer, and use a socket
pair to notify the main thread that new data is available. The data
is not directly send through this socket because it would make the code
more complex without any performance benefit.

vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.

Thanks to Jan Kiszka for helping me solve this issue.

Signed-off-by: Corentin Chary <***@gmail.com>
---
ui/vnc-jobs-async.c | 50 ++++++++++++++++++++++++++++++++------------------
ui/vnc-jobs-sync.c | 4 ++++
ui/vnc-jobs.h | 1 +
ui/vnc.c | 16 ++++++++++++++++
ui/vnc.h | 2 ++
5 files changed, 55 insertions(+), 18 deletions(-)

diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..543b5a9 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@

#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"

/*
* Locking:
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond, &queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1 && vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}

/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;

vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs) && !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- * disconnected:
- */
- vnc_lock_output(job->vs);
goto disconnected;
}

@@ -254,24 +269,23 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles >> 8) & 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;

- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
- }
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ if (job->vs->csock != -1) {
+ int notify = !job->vs->jobs_buffer.offset;

-disconnected:
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs, &vs);
- flush = (job->vs->csock != -1 && job->vs->abort != true);
- vnc_unlock_output(job->vs);
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs, &vs);

- if (flush) {
- vnc_flush(job->vs);
+ if (notify) {
+ send(job->vs->jobs_socks[1], (char []){1}, 1, 0);
+ }
}
+ vnc_unlock_output(job->vs);

+disconnected:
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs-sync.c b/ui/vnc-jobs-sync.c
index 49b77af..3a4d7df 100644
--- a/ui/vnc-jobs-sync.c
+++ b/ui/vnc-jobs-sync.c
@@ -36,6 +36,10 @@ void vnc_jobs_join(VncState *vs)
{
}

+void vnc_jobs_consume_buffer(VncState *vs)
+{
+}
+
VncJob *vnc_job_new(VncState *vs)
{
vs->job.vs = vs;
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..7c529b7 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -37,6 +37,7 @@ void vnc_job_push(VncJob *job);
bool vnc_has_job(VncState *vs);
void vnc_jobs_clear(VncState *vs);
void vnc_jobs_join(VncState *vs);
+void vnc_jobs_consume_buffer(VncState *vs);

#ifdef CONFIG_VNC_THREAD

diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..48a81a2 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,6 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)

#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_set_fd_handler2(vs->jobs_socks[0], NULL, NULL, NULL, vs);
+ close(vs->jobs_socks[0]);
+ close(vs->jobs_socks[1]);
+ buffer_free(&vs->jobs_buffer);
#endif
for (i = 0; i < VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
@@ -1226,6 +1230,16 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}

+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_read(void *opaque)
+{
+ VncState *vs = opaque;
+ char dummy;
+
+ recv(vs->jobs_socks[0], (void *)&dummy, 1, 0);
+ vnc_jobs_consume_buffer(vs);
+}
+#endif

/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2539,8 @@ static void vnc_connect(VncDisplay *vd, int csock)

#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ qemu_socketpair(AF_UNIX, SOCK_STREAM, 0, vs->jobs_socks);
+ qemu_set_fd_handler2(vs->jobs_socks[0], NULL, vnc_jobs_read, NULL, vs);
#endif

QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..0f98f2e 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ Buffer jobs_buffer;
+ int jobs_socks[2];
#endif

/* Encoding specific, if you add something here, don't forget to
--
1.7.3.4

Paolo Bonzini
2011-03-10 13:06:39 UTC
Permalink
Post by Corentin Chary
Instead, we now store the data in a temporary buffer, and use a socket
pair to notify the main thread that new data is available.
You can use a bottom half for this instead of a special socket.
Signaling a bottom half is async-signal- and thread-safe.
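For reference, the bottom-half pattern being suggested looks roughly like
this sketch (qemu_bh_new/qemu_bh_schedule/qemu_bh_delete are the real API;
the wrapper names and the vs->bh field are illustrative, and this is
essentially what the reworked patch later in the thread ends up doing):

/* runs later from the main loop, with the iothread lock held */
static void vnc_jobs_bh(void *opaque)
{
    VncState *vs = opaque;

    vnc_jobs_consume_buffer(vs);        /* drain the temporary buffer */
}

/* main thread, once per client */
static void client_setup(VncState *vs)
{
    vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
}

/* vnc worker thread, after filling vs->jobs_buffer under vnc_lock_output() */
static void worker_notify(VncState *vs)
{
    qemu_bh_schedule(vs->bh);   /* wakes the main loop (thread-safety is
                                   discussed below) */
}

/* main thread, when the client disconnects */
static void client_teardown(VncState *vs)
{
    qemu_bh_delete(vs->bh);
}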

Paolo
Anthony Liguori
2011-03-10 13:45:39 UTC
Permalink
Post by Paolo Bonzini
Post by Corentin Chary
Instead, we now store the data in a temporary buffer, and use a socket
pair to notify the main thread that new data is available.
You can use a bottom half for this instead of a special socket.
Signaling a bottom half is async-signal- and thread-safe.
Bottom halves are thread safe?

I don't think so.

They probably should be but they aren't today.

Regards,

Anthony Liguori
Post by Paolo Bonzini
Paolo
Corentin Chary
2011-03-10 13:54:12 UTC
Permalink
Post by Anthony Liguori
Post by Corentin Chary
Instead, we now store the data in a temporary buffer, and use a socket
pair to notify the main thread that new data is available.
You can use a bottom half for this instead of a special socket. Signaling
a bottom half is async-signal- and thread-safe.
Bottom halves are thread safe?
I don't think so.
The bottom halves API is not thread safe, but calling
qemu_bh_schedule_idle() in another thread *seems* to be safe (here, it
would be protected from qemu_bh_delete() by vnc_lock_output()).
--
Corentin Chary
http://xf.iksaif.net
Paolo Bonzini
2011-03-10 13:58:21 UTC
Permalink
Post by Corentin Chary
Post by Anthony Liguori
You can use a bottom half for this instead of a special socket. Signaling
a bottom half is async-signal- and thread-safe.
Bottom halves are thread safe?
I don't think so.
The bottom halves API is not thread safe, but calling
qemu_bh_schedule_idle()
Not _idle please.
Post by Corentin Chary
in another thread *seems* to be safe (here, it
would be protected from qemu_bh_delete() by vnc_lock_output()).
If it weren't protected against qemu_bh_delete, you would already have
the same race between writing to the socket and closing it in another
thread.

Paolo
Paolo Bonzini
2011-03-10 13:56:05 UTC
Permalink
Post by Anthony Liguori
Post by Paolo Bonzini
Post by Corentin Chary
Instead, we now store the data in a temporary buffer, and use a socket
pair to notify the main thread that new data is available.
You can use a bottom half for this instead of a special socket.
Signaling a bottom half is async-signal- and thread-safe.
Bottom halves are thread safe?
I don't think so.
They probably should be but they aren't today.
Creating a new bottom half is not thread-safe, but scheduling one is.
Assuming that you never use qemu_bh_schedule_idle, qemu_bh_schedule
boils down to:

    if (bh->scheduled)
        return;
    bh->scheduled = 1;
    /* stop the currently executing CPU to execute the BH ASAP */
    qemu_notify_event();

You may have a spurious wakeup if two threads race on the same bottom
half (including the signaling thread racing with the IO thread), but
overall you can safely treat them as thread-safe.

Paolo
Peter Lieven
2011-03-10 13:47:16 UTC
Permalink
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.
Instead, we now store the data in a temporary buffer, and use a socket
pair to notify the main thread that new data is available. The data
is not directly send through this socket because it would make the code
more complex without any performance benefit.
vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.
is this compatible with qemu-kvm?

thanks,
peter
Post by Corentin Chary
Thanks to Jan Kiszka for helping me solve this issue.
---
ui/vnc-jobs-async.c | 50 ++++++++++++++++++++++++++++++++------------------
ui/vnc-jobs-sync.c | 4 ++++
ui/vnc-jobs.h | 1 +
ui/vnc.c | 16 ++++++++++++++++
ui/vnc.h | 2 ++
5 files changed, 55 insertions(+), 18 deletions(-)
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..543b5a9 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@
#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"
/*
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond,&queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1&& vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}
/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;
vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs)&& !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- */
- vnc_lock_output(job->vs);
goto disconnected;
}
@@ -254,24 +269,23 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles>> 8)& 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles& 0xFF;
- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
- }
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+ if (job->vs->csock != -1) {
+ int notify = !job->vs->jobs_buffer.offset;
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs,&vs);
- flush = (job->vs->csock != -1&& job->vs->abort != true);
- vnc_unlock_output(job->vs);
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs,&vs);
- if (flush) {
- vnc_flush(job->vs);
+ if (notify) {
+ send(job->vs->jobs_socks[1], (char []){1}, 1, 0);
+ }
}
+ vnc_unlock_output(job->vs);
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs-sync.c b/ui/vnc-jobs-sync.c
index 49b77af..3a4d7df 100644
--- a/ui/vnc-jobs-sync.c
+++ b/ui/vnc-jobs-sync.c
@@ -36,6 +36,10 @@ void vnc_jobs_join(VncState *vs)
{
}
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+}
+
VncJob *vnc_job_new(VncState *vs)
{
vs->job.vs = vs;
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..7c529b7 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -37,6 +37,7 @@ void vnc_job_push(VncJob *job);
bool vnc_has_job(VncState *vs);
void vnc_jobs_clear(VncState *vs);
void vnc_jobs_join(VncState *vs);
+void vnc_jobs_consume_buffer(VncState *vs);
#ifdef CONFIG_VNC_THREAD
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..48a81a2 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,6 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_set_fd_handler2(vs->jobs_socks[0], NULL, NULL, NULL, vs);
+ close(vs->jobs_socks[0]);
+ close(vs->jobs_socks[1]);
+ buffer_free(&vs->jobs_buffer);
#endif
for (i = 0; i< VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
@@ -1226,6 +1230,16 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}
+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_read(void *opaque)
+{
+ VncState *vs = opaque;
+ char dummy;
+
+ recv(vs->jobs_socks[0], (void *)&dummy, 1, 0);
+ vnc_jobs_consume_buffer(vs);
+}
+#endif
/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2539,8 @@ static void vnc_connect(VncDisplay *vd, int csock)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ qemu_socketpair(AF_UNIX, SOCK_STREAM, 0, vs->jobs_socks);
+ qemu_set_fd_handler2(vs->jobs_socks[0], NULL, vnc_jobs_read, NULL, vs);
#endif
QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..0f98f2e 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ Buffer jobs_buffer;
+ int jobs_socks[2];
#endif
/* Encoding specific, if you add something here, don't forget to
Corentin Chary
2011-03-10 12:59:38 UTC
Permalink
Signed-off-by: Corentin Chary <***@gmail.com>
---
osdep.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
qemu_socket.h | 1 +
2 files changed, 84 insertions(+), 0 deletions(-)

diff --git a/osdep.c b/osdep.c
index 327583b..93bfbe0 100644
--- a/osdep.c
+++ b/osdep.c
@@ -147,6 +147,89 @@ int qemu_socket(int domain, int type, int protocol)
return ret;
}

+#ifdef _WIN32
+int qemu_socketpair(int domain, int type, int protocol, int socks[2])
+{
+ union {
+ struct sockaddr_in inaddr;
+ struct sockaddr addr;
+ } a;
+ int listener;
+ socklen_t addrlen = sizeof(a.inaddr);
+ int reuse = 1;
+
+ if (domain == AF_UNIX) {
+ domain = AF_INET;
+ }
+
+ if (socks == 0) {
+ return EINVAL;
+ }
+
+ listener = qemu_socket(domain, type, protocol);
+ if (listener < 0) {
+ return listener;
+ }
+
+ memset(&a, 0, sizeof(a));
+ a.inaddr.sin_family = AF_INET;
+ a.inaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+ a.inaddr.sin_port = 0;
+
+ socks[0] = socks[1] = -1;
+
+ if (setsockopt(listener, SOL_SOCKET, SO_REUSEADDR,
+ (char*) &reuse, (socklen_t) sizeof(reuse)) == -1) {
+ goto error;
+ }
+ if (bind(listener, &a.addr, sizeof(a.inaddr)) < 0) {
+ goto error;
+ }
+
+ memset(&a, 0, sizeof(a));
+ if (getsockname(listener, &a.addr, &addrlen) < 0) {
+ goto error;
+ }
+
+ a.inaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+ a.inaddr.sin_family = AF_INET;
+
+ if (listen(listener, 1) < 0) {
+ goto error;
+ }
+
+ socks[0] = qemu_socket(AF_INET, SOCK_STREAM, 0);
+ if (socks[0] < 0) {
+ goto error;
+ }
+ if (connect(socks[0], &a.addr, sizeof(a.inaddr)) < 0) {
+ goto error;
+ }
+
+ socks[1] = qemu_accept(listener, NULL, NULL);
+ if (socks[1] < 0) {
+ goto error;
+ }
+
+ closesocket(listener);
+ return 0;
+
+error:
+ if (listener != -1)
+ closesocket(listener);
+ if (socks[0] != -1)
+ closesocket(socks[0]);
+ if (socks[1] != -1)
+ closesocket(socks[1]);
+ return -1;
+}
+#else
+int qemu_socketpair(int domain, int type, int protocol, int socks[2])
+{
+ return socketpair(domain, type, protocol, socks);
+}
+#endif
+
/*
* Accept a connection and set FD_CLOEXEC
*/
diff --git a/qemu_socket.h b/qemu_socket.h
index 180e4db..d7eb9a5 100644
--- a/qemu_socket.h
+++ b/qemu_socket.h
@@ -34,6 +34,7 @@ int inet_aton(const char *cp, struct in_addr *ia);

/* misc helpers */
int qemu_socket(int domain, int type, int protocol);
+int qemu_socketpair(int domain, int type, int protocol, int socks[2]);
int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
void socket_set_nonblock(int fd);
int send_all(int fd, const void *buf, int len1);
--
1.7.3.4
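
As a stand-alone illustration of the notification pattern this helper is
meant for (plain POSIX, not QEMU code): a worker thread writes one byte
into a socketpair, the poll loop wakes up, drains the byte and can then
consume whatever buffer the worker filled. Compile with -pthread.

#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static int socks[2];

static void *worker(void *arg)
{
    char one = 1;

    (void)arg;
    /* ... fill a shared buffer under a lock here ... */
    send(socks[1], &one, 1, 0);          /* notify the "main loop" */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    struct pollfd pfd;
    char dummy;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, socks) < 0) {
        return 1;
    }
    pthread_create(&tid, NULL, worker, NULL);

    pfd.fd = socks[0];
    pfd.events = POLLIN;
    poll(&pfd, 1, -1);                   /* the main loop blocks here */
    recv(socks[0], &dummy, 1, 0);        /* drain the notification byte */
    /* ... flush the shared buffer to the client here ... */
    puts("woken up by worker");

    pthread_join(tid, NULL);
    close(socks[0]);
    close(socks[1]);
    return 0;
}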

Corentin Chary
2011-03-10 15:13:02 UTC
Permalink
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.

Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.

Instead, we now store the data in a temporary buffer, and use a bottom
half to notify the main thread that new data is available.

vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.

Signed-off-by: Corentin Chary <***@gmail.com>
---
ui/vnc-jobs-async.c | 48 +++++++++++++++++++++++++++++-------------------
ui/vnc-jobs.h | 1 +
ui/vnc.c | 12 ++++++++++++
ui/vnc.h | 2 ++
4 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..4438776 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@

#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"

/*
* Locking:
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond, &queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1 && vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}

/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;

vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs) && !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- * disconnected:
- */
- vnc_lock_output(job->vs);
goto disconnected;
}

@@ -254,24 +269,19 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles >> 8) & 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;

- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
+ if (job->vs->csock != -1) {
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs, &vs);
+
+ qemu_bh_schedule(job->vs->bh);
}
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
-
-disconnected:
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs, &vs);
- flush = (job->vs->csock != -1 && job->vs->abort != true);
vnc_unlock_output(job->vs);

- if (flush) {
- vnc_flush(job->vs);
- }
-
+disconnected:
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..4c661f9 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -40,6 +40,7 @@ void vnc_jobs_join(VncState *vs);

#ifdef CONFIG_VNC_THREAD

+void vnc_jobs_consume_buffer(VncState *vs);
void vnc_start_worker_thread(void);
bool vnc_worker_thread_running(void);
void vnc_stop_worker_thread(void);
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..f28873b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,7 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)

#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_bh_delete(vs->bh);
+ buffer_free(&vs->jobs_buffer);
#endif
+
for (i = 0; i < VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
}
@@ -1226,6 +1229,14 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}

+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_bh(void *opaque)
+{
+ VncState *vs = opaque;
+
+ vnc_jobs_consume_buffer(vs);
+}
+#endif

/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2536,7 @@ static void vnc_connect(VncDisplay *vd, int csock)

#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
#endif

QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..bca5f87 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ QEMUBH *bh;
+ Buffer jobs_buffer;
#endif

/* Encoding specific, if you add something here, don't forget to
--
1.7.3.4

Corentin Chary
2011-03-14 09:19:56 UTC
Permalink
On Thu, Mar 10, 2011 at 3:13 PM, Corentin Chary
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.
Instead, we now store the data in a temporary buffer, and use a bottom
half to notify the main thread that new data is available.
vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.
---
ui/vnc-jobs-async.c | 48 +++++++++++++++++++++++++++++-------------------
ui/vnc-jobs.h | 1 +
ui/vnc.c | 12 ++++++++++++
ui/vnc.h | 2 ++
4 files changed, 44 insertions(+), 19 deletions(-)
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..4438776 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@
#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"
/*
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond, &queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1 && vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}
/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;
vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs) && !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- */
- vnc_lock_output(job->vs);
goto disconnected;
}
@@ -254,24 +269,19 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles >> 8) & 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;
- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
+ if (job->vs->csock != -1) {
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs, &vs);
+
+ qemu_bh_schedule(job->vs->bh);
}
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
-
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs, &vs);
- flush = (job->vs->csock != -1 && job->vs->abort != true);
vnc_unlock_output(job->vs);
- if (flush) {
- vnc_flush(job->vs);
- }
-
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..4c661f9 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -40,6 +40,7 @@ void vnc_jobs_join(VncState *vs);
#ifdef CONFIG_VNC_THREAD
+void vnc_jobs_consume_buffer(VncState *vs);
void vnc_start_worker_thread(void);
bool vnc_worker_thread_running(void);
void vnc_stop_worker_thread(void);
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..f28873b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,7 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_bh_delete(vs->bh);
+ buffer_free(&vs->jobs_buffer);
#endif
+
for (i = 0; i < VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
}
@@ -1226,6 +1229,14 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}
+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_bh(void *opaque)
+{
+ VncState *vs = opaque;
+
+ vnc_jobs_consume_buffer(vs);
+}
+#endif
/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2536,7 @@ static void vnc_connect(VncDisplay *vd, int csock)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
#endif
QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..bca5f87 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ QEMUBH *bh;
+ Buffer jobs_buffer;
#endif
/* Encoding specific, if you add something here, don't forget to
--
1.7.3.4
Paolo, Anthony, Jan, are you ok with that one ?
Peter Lieve, could you test that patch ?
Thanks

--
Corentin Chary
http://xf.iksaif.net
Peter Lieven
2011-03-14 09:55:52 UTC
Permalink
Post by Corentin Chary
On Thu, Mar 10, 2011 at 3:13 PM, Corentin Chary
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.
Instead, we now store the data in a temporary buffer, and use a bottom
half to notify the main thread that new data is available.
vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.
---
ui/vnc-jobs-async.c | 48 +++++++++++++++++++++++++++++-------------------
ui/vnc-jobs.h | 1 +
ui/vnc.c | 12 ++++++++++++
ui/vnc.h | 2 ++
4 files changed, 44 insertions(+), 19 deletions(-)
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..4438776 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@
#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"
/*
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond, &queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1 && vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}
/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;
vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs) && !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- */
- vnc_lock_output(job->vs);
goto disconnected;
}
@@ -254,24 +269,19 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles >> 8) & 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;
- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
+ if (job->vs->csock != -1) {
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs, &vs);
+
+ qemu_bh_schedule(job->vs->bh);
}
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
-
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs, &vs);
- flush = (job->vs->csock != -1 && job->vs->abort != true);
vnc_unlock_output(job->vs);
- if (flush) {
- vnc_flush(job->vs);
- }
-
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..4c661f9 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -40,6 +40,7 @@ void vnc_jobs_join(VncState *vs);
#ifdef CONFIG_VNC_THREAD
+void vnc_jobs_consume_buffer(VncState *vs);
void vnc_start_worker_thread(void);
bool vnc_worker_thread_running(void);
void vnc_stop_worker_thread(void);
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..f28873b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,7 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_bh_delete(vs->bh);
+ buffer_free(&vs->jobs_buffer);
#endif
+
for (i = 0; i < VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
}
@@ -1226,6 +1229,14 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}
+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_bh(void *opaque)
+{
+ VncState *vs = opaque;
+
+ vnc_jobs_consume_buffer(vs);
+}
+#endif
/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2536,7 @@ static void vnc_connect(VncDisplay *vd, int csock)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
#endif
QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..bca5f87 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ QEMUBH *bh;
+ Buffer jobs_buffer;
#endif
/* Encoding specific, if you add something here, don't forget to
--
1.7.3.4
Paolo, Anthony, Jan, are you ok with that one ?
Peter Lieve, could you test that patch ?
will do. but it will likely take until wednesday.

peter
Post by Corentin Chary
Thanks
--
Corentin Chary
http://xf.iksaif.net
Peter Lieven
2011-03-15 16:55:11 UTC
Permalink
Post by Corentin Chary
On Thu, Mar 10, 2011 at 3:13 PM, Corentin Chary
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.
Instead, we now store the data in a temporary buffer, and use a bottom
half to notify the main thread that new data is available.
vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.
---
ui/vnc-jobs-async.c | 48 +++++++++++++++++++++++++++++-------------------
ui/vnc-jobs.h | 1 +
ui/vnc.c | 12 ++++++++++++
ui/vnc.h | 2 ++
4 files changed, 44 insertions(+), 19 deletions(-)
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..4438776 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@
#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"
/*
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond,&queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1&& vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}
/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;
vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs)&& !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- */
- vnc_lock_output(job->vs);
goto disconnected;
}
@@ -254,24 +269,19 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles>> 8)& 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles& 0xFF;
- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
+ if (job->vs->csock != -1) {
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs,&vs);
+
+ qemu_bh_schedule(job->vs->bh);
}
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
-
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs,&vs);
- flush = (job->vs->csock != -1&& job->vs->abort != true);
vnc_unlock_output(job->vs);
- if (flush) {
- vnc_flush(job->vs);
- }
-
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..4c661f9 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -40,6 +40,7 @@ void vnc_jobs_join(VncState *vs);
#ifdef CONFIG_VNC_THREAD
+void vnc_jobs_consume_buffer(VncState *vs);
void vnc_start_worker_thread(void);
bool vnc_worker_thread_running(void);
void vnc_stop_worker_thread(void);
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..f28873b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,7 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_bh_delete(vs->bh);
+ buffer_free(&vs->jobs_buffer);
#endif
+
for (i = 0; i< VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
}
@@ -1226,6 +1229,14 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}
+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_bh(void *opaque)
+{
+ VncState *vs = opaque;
+
+ vnc_jobs_consume_buffer(vs);
+}
+#endif
/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2536,7 @@ static void vnc_connect(VncDisplay *vd, int csock)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
#endif
QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..bca5f87 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ QEMUBH *bh;
+ Buffer jobs_buffer;
#endif
/* Encoding specific, if you add something here, don't forget to
--
1.7.3.4
Paolo, Anthony, Jan, are you ok with that one ?
Peter Lieve, could you test that patch ?
i have seen one segfault. unfortunately, i had no debugger attached.

but what's equally worse: during the stress test, i managed to trigger the oom-killer.
it seems we have a memory leak somewhere...

peter
Post by Corentin Chary
Thanks
Peter Lieven
2011-03-15 18:07:10 UTC
Permalink
Post by Peter Lieven
Post by Corentin Chary
On Thu, Mar 10, 2011 at 3:13 PM, Corentin Chary
Post by Corentin Chary
The threaded VNC servers messed up with QEMU fd handlers without
any kind of locking, and that can cause some nasty race conditions.
Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
which will wait for the current job queue to finish, can be called with
the iothread lock held.
Instead, we now store the data in a temporary buffer, and use a bottom
half to notify the main thread that new data is available.
vnc_[un]lock_ouput() is still needed to access VncState members like
abort, csock or jobs_buffer.
---
ui/vnc-jobs-async.c | 48
+++++++++++++++++++++++++++++-------------------
ui/vnc-jobs.h | 1 +
ui/vnc.c | 12 ++++++++++++
ui/vnc.h | 2 ++
4 files changed, 44 insertions(+), 19 deletions(-)
diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..4438776 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@
#include "vnc.h"
#include "vnc-jobs.h"
+#include "qemu_socket.h"
/*
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
qemu_cond_wait(&queue->cond,&queue->mutex);
}
vnc_unlock_queue(queue);
+ vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+ bool flush;
+
+ vnc_lock_output(vs);
+ if (vs->jobs_buffer.offset) {
+ vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+ buffer_reset(&vs->jobs_buffer);
+ }
+ flush = vs->csock != -1&& vs->abort != true;
+ vnc_unlock_output(vs);
+
+ if (flush) {
+ vnc_flush(vs);
+ }
}
/*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
VncState vs;
int n_rectangles;
int saved_offset;
- bool flush;
vnc_lock_queue(queue);
while (QTAILQ_EMPTY(&queue->jobs)&& !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vnc_lock_output(job->vs);
if (job->vs->csock == -1 || job->vs->abort == true) {
+ vnc_unlock_output(job->vs);
goto disconnected;
}
vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
if (job->vs->csock == -1) {
vnc_unlock_display(job->vs->vd);
- /* output mutex must be locked before going to
- */
- vnc_lock_output(job->vs);
goto disconnected;
}
@@ -254,24 +269,19 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
vs.output.buffer[saved_offset] = (n_rectangles>> 8)& 0xFF;
vs.output.buffer[saved_offset + 1] = n_rectangles& 0xFF;
- /* Switch back buffers */
vnc_lock_output(job->vs);
- if (job->vs->csock == -1) {
- goto disconnected;
+ if (job->vs->csock != -1) {
+ buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+ buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+ vs.output.offset);
+ /* Copy persistent encoding data */
+ vnc_async_encoding_end(job->vs,&vs);
+
+ qemu_bh_schedule(job->vs->bh);
}
-
- vnc_write(job->vs, vs.output.buffer, vs.output.offset);
-
- /* Copy persistent encoding data */
- vnc_async_encoding_end(job->vs,&vs);
- flush = (job->vs->csock != -1&& job->vs->abort != true);
vnc_unlock_output(job->vs);
- if (flush) {
- vnc_flush(job->vs);
- }
-
vnc_lock_queue(queue);
QTAILQ_REMOVE(&queue->jobs, job, next);
vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..4c661f9 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -40,6 +40,7 @@ void vnc_jobs_join(VncState *vs);
#ifdef CONFIG_VNC_THREAD
+void vnc_jobs_consume_buffer(VncState *vs);
void vnc_start_worker_thread(void);
bool vnc_worker_thread_running(void);
void vnc_stop_worker_thread(void);
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..f28873b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,7 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_destroy(&vs->output_mutex);
+ qemu_bh_delete(vs->bh);
+ buffer_free(&vs->jobs_buffer);
#endif
+
for (i = 0; i< VNC_STAT_ROWS; ++i) {
qemu_free(vs->lossy_rect[i]);
}
@@ -1226,6 +1229,14 @@ static long vnc_client_read_plain(VncState *vs)
return ret;
}
+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_bh(void *opaque)
+{
+ VncState *vs = opaque;
+
+ vnc_jobs_consume_buffer(vs);
+}
+#endif
/*
* First function called whenever there is more data to be read from
@@ -2525,6 +2536,7 @@ static void vnc_connect(VncDisplay *vd, int csock)
#ifdef CONFIG_VNC_THREAD
qemu_mutex_init(&vs->output_mutex);
+ vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
#endif
QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..bca5f87 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
VncJob job;
#else
QemuMutex output_mutex;
+ QEMUBH *bh;
+ Buffer jobs_buffer;
#endif
/* Encoding specific, if you add something here, don't forget to
--
1.7.3.4
Paolo, Anthony, Jan, are you ok with that one ?
Peter Lieve, could you test that patch ?
i have seen one segfault. unfortunatly, i had no debugger attached.
but whats equally worse during stress test, i managed to trigger oom-killer.
it seems we have a memory leak somewhere....
commit c53af37f375ce9c4999ff451c51173bdc1167e67
seems to fix this. remember this is not in 0.14.0, which could cause
a lot of trouble for 0.14.0 users when they do a lot of vnc work...

peter
Post by Peter Lieven
peter
Post by Corentin Chary
Thanks
Peter Lieven
2011-03-09 10:02:43 UTC
Permalink
Post by Jan Kiszka
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
...
Post by Peter Lieven
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
i added it and will run tests now.

thanks,
peter
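
For reference, the check Jan proposes would sit at the top of
qemu_set_fd_handler2() and look roughly like this (a debugging aid only;
as the follow-ups below show, the exact name and visibility of the global
lock differ between qemu-kvm and upstream, so the symbol here is
illustrative):

#include <assert.h>
#include <pthread.h>

extern pthread_mutex_t qemu_mutex;   /* qemu-kvm's global lock */

/* call at the top of qemu_set_fd_handler2(): trylock only succeeds
 * (returns 0) while the lock is free, so hitting the assert means the
 * fd handler list is being modified without the global mutex held */
static void assert_iothread_locked(void)
{
    assert(pthread_mutex_trylock(&qemu_mutex) != 0);
}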
Post by Jan Kiszka
BTW, qemu with just --enable-vnc-thread, ie. without io-thread support,
should always run into this race as it then definitely lacks a global mutex.
Jan
Peter Lieven
2011-03-09 10:16:55 UTC
Permalink
Post by Jan Kiszka
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
...
Post by Peter Lieven
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
qemu_mutex doesn't exist (anymore). i can only find qemu_global_mutex and qemu_fair_mutex.

i'll try with qemu_global_mutex now. if that's not correct please tell.

peter
Post by Jan Kiszka
BTW, qemu with just --enable-vnc-thread, ie. without io-thread support,
should always run into this race as it then definitely lacks a global mutex.
Jan
Jan Kiszka
2011-03-09 10:20:05 UTC
Permalink
Post by Peter Lieven
Post by Jan Kiszka
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
...
Post by Peter Lieven
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
qemu_mutex doesn't exists (anymore). i can only find qemu_global_mutex and qemu_fair_mutex.
In qemu-kvm, qemu-kvm.c contains that lock. In upstream, it's
qemu_global_mutex, but you will have to enable --enable-io-thread and
export it.
Post by Peter Lieven
i try with qemu_global_mutex now. if thats not correct please tell.
I think we know the call path, so you could also wait for Corentin
to send a patch and test that one.

Jan
Peter Lieven
2011-03-09 10:31:52 UTC
Permalink
Post by Jan Kiszka
Post by Peter Lieven
Post by Jan Kiszka
Post by Peter Lieven
Hi,
during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug.
my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone
needs more output
...
Post by Peter Lieven
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#1 0x000000000041d669 in main_loop_wait (nonblocking=0)
at /usr/src/qemu-kvm-0.14.0/vl.c:1388
So we are calling a IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
And there are actually calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
qemu_mutex doesn't exists (anymore). i can only find qemu_global_mutex and qemu_fair_mutex.
In qemu-kvm, qemu-kvm.c contains that lock. In upstream, it's
qemu_global_mutex, but you will have to enable --enable-io-thread and
export it.
qemu-kvm doesn't compile with --enable-io-thread

LINK x86_64-softmmu/qemu-system-x86_64
kvm-all.o: In function `qemu_mutex_unlock_iothread':
/usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1735: multiple definition of `qemu_mutex_unlock_iothread'
cpus.o:/usr/src/qemu-kvm-0.14.0/cpus.c:752: first defined here
kvm-all.o: In function `qemu_mutex_lock_iothread':
/usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1742: multiple definition of `qemu_mutex_lock_iothread'
cpus.o:/usr/src/qemu-kvm-0.14.0/cpus.c:738: first defined here
../compatfd.o: In function `qemu_signalfd':
/usr/src/qemu-kvm-0.14.0/compatfd.c:105: multiple definition of `qemu_signalfd'
../compatfd.o:/usr/src/qemu-kvm-0.14.0/compatfd.c:105: first defined here
collect2: ld returned 1 exit status
make[1]: *** [qemu-system-x86_64] Error 1
make: *** [subdir-x86_64-softmmu] Error 2
Post by Jan Kiszka
Post by Peter Lieven
i try with qemu_global_mutex now. if thats not correct please tell.
I think we knokw know the call path, so you could also wait for Corentin
to send a patch and test that one.
will do as soon as i get it ;-)
Post by Jan Kiszka
Jan
Peter

Paolo Bonzini
2011-03-09 11:20:58 UTC
Permalink
Post by Jan Kiszka
It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
Alternatively, iohandlers could be a(nother) good place to start
introducing fine-grained locks or rwlocks.

Paolo
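
A rough sketch of that direction (illustrative only; qemu_mutex_lock/unlock
and qemu_set_fd_handler2 are existing APIs, the rest is made up): a
dedicated lock taken both by the registration path and by the main-loop
walk over the handler list, so the vnc thread could register handlers
without holding the big lock.

/* initialized once at startup with qemu_mutex_init() */
static QemuMutex iohandlers_lock;

/* registration path, then callable from the vnc worker thread */
static void set_fd_handlers_locked(int fd, IOHandler *fd_read,
                                   IOHandler *fd_write, void *opaque)
{
    qemu_mutex_lock(&iohandlers_lock);
    qemu_set_fd_handler2(fd, NULL, fd_read, fd_write, opaque);
    qemu_mutex_unlock(&iohandlers_lock);
}

/* main_loop_wait() would take the same lock (or a read lock) while it
 * walks io_handlers, dropping it around callbacks that may re-register */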
Jan Kiszka
2011-03-09 11:44:12 UTC
Permalink
Post by Paolo Bonzini
Post by Jan Kiszka
It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like
assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
Alternatively, iohandlers could be a(nother) good place to start
introducing fine-grained locks or rwlocks.
Yeah, could be a good idea. It's a fairly confined area here that needs
protection.

Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux