[ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.4
Ken Brown
2015-07-06 02:15:33 UTC
This test release needs some good testing!
I repeated the emacs experiment discussed in the "[ANNOUNCEMENT] TEST RELEASE:
Cygwin 2.1.0-0.1" thread. In the 32-bit case, the results were more-or-less the
same as before: I forced a stack overflow, emacs recovered, I tried to continue
working, there was a second SIGSEGV, and handle_sigsegv bailed out because
garbage collection was in progress. This time I was unable to prevent the
second SIGSEGV by resetting max-specpdl-size and max-lisp-eval-depth. I'm not
sure what caused the second SIGSEGV, but it might have nothing to do with Cygwin.

In the 64-bit case, however, the recovery from stack overflow never happened
(i.e., the program never reached the siglongjmp). Here's a gdb session:

$ gdb ./emacs.exe
[...]
(gdb) b handle_sigsegv
Breakpoint 3 at 0x1005657b3: file ../../master/src/sysdep.c, line 1643.
(gdb) r -Q
Starting program: /home/kbrown/src/emacs/64build/src/emacs.exe -Q
[At this point I force stack overflow.]

Program received signal SIGSEGV, Segmentation fault.
0x000000010053b08b in builtin_lisp_symbol (index=290)
at ../../master/src/lisp.h:1069
1069 return make_lisp_symbol (lispsym + index);
(gdb) c
Continuing.

Breakpoint 3, handle_sigsegv (sig=11,
siginfo=0x100a3e190 <sigsegv_stack+65232>, arg=0x82de50)
at ../../master/src/sysdep.c:1643
1643 if (!gc_in_progress)
(gdb) l
1638 static void
1639 handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
1640 {
1641 /* Hard GC error may lead to stack overflow caused by
1642 too nested calls to mark_object. No way to survive. */
1643 if (!gc_in_progress)
1644 {
1645 struct rlimit rlim;
1646
1647 if (!getrlimit (RLIMIT_STACK, &rlim))
(gdb)
1648 {
1649 #ifdef CYGWIN
1650 enum { STACK_DANGER_ZONE = 32 * 1024 };
1651 #else
1652 enum { STACK_DANGER_ZONE = 16 * 1024 };
1653 #endif
1654 char *beg, *end, *addr;
1655
1656 beg = stack_bottom;
1657 end = stack_bottom + stack_direction * rlim.rlim_cur;
(gdb)
1658 if (beg > end)
1659 addr = beg, beg = end, end = addr;
1660 addr = (char *) siginfo->si_addr;
1661 /* If we're somewhere on stack and too close to
1662 one of its boundaries, most likely this is it. */
1663 if (beg < addr && addr < end
1664 && (addr - beg < STACK_DANGER_ZONE
1665 || end - addr < STACK_DANGER_ZONE))
1666 siglongjmp (return_to_command_loop, 1);
1667 }
(gdb) n
1647 if (!getrlimit (RLIMIT_STACK, &rlim))
(gdb)
1656 beg = stack_bottom;
(gdb)
1657 end = stack_bottom + stack_direction * rlim.rlim_cur;
(gdb)
1658 if (beg > end)
(gdb)
1660 addr = (char *) siginfo->si_addr;
(gdb)
1663 if (beg < addr && addr < end
(gdb) p beg
$1 = 0x82ca27 ""
(gdb) p addr
$2 = 0x33ff8 ""

Note that addr < beg, so we never reach the siglongjmp.
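
For readers following along, the test on lines 1663-1665 can be restated as a standalone predicate. This is only a sketch: near_stack_boundary and its integer-typed parameters are illustrative names chosen here, not Emacs code, so that the check can be tried outside the debugger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative restatement of the range test in handle_sigsegv.
   The function name and the integer parameters are assumptions made
   for this sketch; Emacs itself works on char pointers.  */
bool
near_stack_boundary (uintptr_t beg, uintptr_t end, uintptr_t addr,
                     uintptr_t danger_zone)
{
  /* Order the two bounds, as lines 1658-1659 do.  */
  if (beg > end)
    {
      uintptr_t tmp = beg;
      beg = end;
      end = tmp;
    }
  /* The fault must lie strictly inside the stack and within
     danger_zone bytes of one of its ends.  */
  return beg < addr && addr < end
         && (addr - beg < danger_zone || end - addr < danger_zone);
}
```

With beg = 0x82ca27 and addr = 0x33ff8 from the session above, the predicate is false for any end that lies above beg, because beg < addr already fails.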

Ken

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Ken Brown
2015-07-06 13:15:32 UTC
Hi Corinna,
Hi Ken,
thanks for further testing this.
Post by Ken Brown
This test release needs some good testing!
I repeated the emacs experiment discussed in the "[ANNOUNCEMENT] TEST
RELEASE: Cygwin 2.1.0-0.1" thread. In the 32-bit case, the results were
more-or-less the same as before: I forced a stack overflow, emacs recovered,
I tried to continue working, there was a second SIGSEGV, and handle_sigsegv
bailed out because garbage collection was in progress. This time I was
unable to prevent the second SIGSEGV by resetting max-specpdl-size and
max-lisp-eval-depth. I'm not sure what caused the second SIGSEGV, but it
might have nothing to do with Cygwin.
In the 64-bit case, however, the recovery from stack overflow never happened
[...]
1647 if (!getrlimit (RLIMIT_STACK, &rlim))
(gdb)
1656 beg = stack_bottom;
(gdb)
1657 end = stack_bottom + stack_direction * rlim.rlim_cur;
(gdb)
1658 if (beg > end)
(gdb)
1660 addr = (char *) siginfo->si_addr;
(gdb)
1663 if (beg < addr && addr < end
(gdb) p beg
$1 = 0x82ca27 ""
(gdb) p addr
$2 = 0x33ff8 ""
I can't reproduce this. It works fine for me. For reference I attached
my simplified testcase again. It's basically the emacs SIGSEGV setup,
main triggers the stack overflow, the handler tries to write a file for
testing if that works from the handler, then it siglongjmps. The main
function tests if it can still fork, and then it repeats the action to
test if we're back to normal in terms of signal handling.
$ ./sigalt
command loop 1 before crash
command loop 1 after crash
In child
In parent
command loop 2 before crash
command loop 2 after crash
In child
In parent
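
For readers without the attachment, a testcase of this general shape might look like the sketch below. It is not the attached sigalt testcase: the file-writing and fork checks are left out, and the names alt_stack, install_handler, recurse and overflow_and_recover are made up for illustration. It assumes a POSIX system that delivers SIGSEGV on stack overflow.

```c
#define _XOPEN_SOURCE 700   /* for sigaltstack, siginfo_t, sigsetjmp */
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf return_to_command_loop;
/* Fixed size chosen for this sketch; SIGSTKSZ need not be a constant.  */
static char alt_stack[64 * 1024];

static void
handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
{
  (void) sig; (void) siginfo; (void) arg;
  /* Runs on alt_stack because of SA_ONSTACK; jump back to safety.  */
  siglongjmp (return_to_command_loop, 1);
}

/* Register the alternate stack and the SIGSEGV handler.
   Returns 0 on success, -1 on failure.  */
int
install_handler (void)
{
  stack_t ss;
  struct sigaction sa;

  memset (&ss, 0, sizeof ss);
  ss.ss_sp = alt_stack;
  ss.ss_size = sizeof alt_stack;
  if (sigaltstack (&ss, NULL) != 0)
    return -1;

  memset (&sa, 0, sizeof sa);
  sigemptyset (&sa.sa_mask);
  sa.sa_sigaction = handle_sigsegv;
  sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
  return sigaction (SIGSEGV, &sa, NULL);
}

/* Recurse until the stack is exhausted.  The volatile buffer keeps
   each frame alive and prevents tail-call elimination.  */
static int
recurse (int depth)
{
  volatile char pad[1024];
  pad[0] = (char) depth;
  if (depth < 0)
    return 0;
  return recurse (depth + 1) + pad[0];
}

/* Returns 1 if the overflow was caught and control came back here.  */
int
overflow_and_recover (void)
{
  /* Second argument 1: restore the signal mask on siglongjmp, so
     SIGSEGV is unblocked again for the next round.  */
  if (sigsetjmp (return_to_command_loop, 1) == 0)
    {
      recurse (0);
      return 0;   /* not expected to be reached */
    }
  return 1;
}
```

Calling install_handler once and then overflow_and_recover twice corresponds to the two "command loop" rounds in the output above.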
(gdb) p beg
$1 = 0x40ac3 <error: Cannot access memory at address 0x40ac3>
(gdb) p addr
$2 = 0x43848 <error: Cannot access memory at address 0x43848>
(gdb) p end
$3 = 0x23cac3 ""
(gdb) p/x rlim.rlim_cur
$5 = 0x1fc000
$ peflags -x ./sigalt
./sigalt: stack reserve size : 2097152 (0x200000) bytes
0x200000 - dead zone 4K - default W8.1 64 bit guardpagesize 3 * 4K ==
0x1fc000, the value rlim.rlim_cur returns. Looks good to me.
(gdb) p beg
$1 = 0x8fc33 ""
(gdb) p addr
$2 = 0x92d5c <error: Cannot access memory at address 0x92d5c>
(gdb) p end
$3 = 0x28cc33 ""
(gdb) p/x rlim.rlim_cur
$4 = 0x1fd000
$ peflags -x ./sigalt
./sigalt: stack reserve size : 2097152 (0x200000) bytes
0x200000 - dead zone 4K - default W8.1 32 bit guardpagesize 2 * 4K ==
0x1fd000.
(gdb) p beg
$1 = 0x2ec43 "\376\356..."
(gdb) p addr
$2 = 0x32d6c ""
(gdb) p end
$3 = 0x22cc43 ""
(gdb) p rlim.rlim_cur
$4 = 2088960
(gdb) p/x rlim.rlim_cur
$5 = 0x1fe000
$ peflags -x ./sigalt
./sigalt: stack reserve size : 2097152 (0x200000) bytes
0x200000 - dead zone 4K - default W7 32 bit guardpagesize 1 * 4K ==
0x1fe000.
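
The three subtractions above follow one pattern, sketched here as a small helper. The name usable_stack_size and the constant PAGE_BYTES are illustrative assumptions; the 4K page size and the guard page counts are the ones quoted in the text.

```c
#include <stddef.h>

enum { PAGE_BYTES = 4096 };   /* 4K pages, as in the figures above */

/* Stack size visible to the program: the reserve size minus the
   4K dead zone and guard_pages guard pages.  */
size_t
usable_stack_size (size_t reserve, unsigned guard_pages)
{
  return reserve - PAGE_BYTES - (size_t) guard_pages * PAGE_BYTES;
}
```

For a 0x200000 reserve this gives 0x1fc000 with 3 guard pages, 0x1fd000 with 2, and 0x1fe000 with 1, matching the three rlim_cur values printed by gdb.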
Post by Ken Brown
Note that addr < beg, so we never reach the siglongjmp.
I have no explanation for this. What OS? What does rlim_cur contain?
What does peflags -x print for this executable?
I'm on W7 64-bit. The problem seems to be that rlim_cur is too big.

$ peflags -x ./emacs
./emacs: stack reserve size : 8388608 (0x800000) bytes

(gdb) p beg
$3 = 0x82ca27 ""
(gdb) p/x rlim.rlim_cur
$2 = 0x850e80

So there's overflow when end is computed:

(gdb) p end
$4 = 0xfffffffffffdbba7 <error: Cannot access memory at address 0xfffffffffffdbba7>

This doesn't happen when I run your testcase with the same 8MB stack size:

$ peflags -x0x800000 ./sigalt.exe
./sigalt.exe: stack reserve size : 8388608 (0x800000) bytes

(gdb) p beg
$1 = 0x82cabb ""
(gdb) p/x rlim.rlim_cur
$2 = 0x7fd000
(gdb) p end
$3 = 0x2fabb
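
The arithmetic can be checked outside gdb. In this sketch uint64_t stands in for a 64-bit char pointer so that the wraparound is well defined; compute_end is an illustrative name, not Emacs code.

```c
#include <stdint.h>

/* end = stack_bottom + stack_direction * rlim_cur, in 64-bit unsigned
   arithmetic.  stack_direction is -1 for a stack that grows downward.  */
uint64_t
compute_end (uint64_t stack_bottom, int stack_direction, uint64_t rlim_cur)
{
  return stack_direction < 0 ? stack_bottom - rlim_cur
                             : stack_bottom + rlim_cur;
}
```

compute_end (0x82ca27, -1, 0x850e80) wraps around to 0xfffffffffffdbba7, while compute_end (0x82cabb, -1, 0x7fd000) gives 0x2fabb. In the first case end is above beg, so the bounds are not swapped and addr = 0x33ff8 stays below beg.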
And last but not least, what is emacs doing there? The stack should be
pretty much in good shape when it's back to the main loop. The stack
is fully committed and has the default number of guard pages at the bottom,
as it is just short of the stack overflow.
For debugging purposes I also added a global variable called "tib" and a
memory info struct called "m" to the testcase which are initialized
right at the start of main. tib points to the start of the TEB (Thread
Environment Block, a Windows per-thread bookkeeping structure) of the
main thread. If you expand it right after it's fetched, you get
(gdb) p *tib
$2 = {ExceptionList = 0x22cd78, StackBase = 0x230000, StackLimit = 0x20c000,
SubSystemTib = 0x0, {FiberData = 0x1e00, Version = 7680},
ArbitraryUserPointer = 0x0, Self = 0x7ffdf000}
Note the values of StackBase and StackLimit and compare with your beg and
end values. StackBase is the upper limit of the stack. It grows downward
from there. StackLimit is the lowest address as yet committed. It's not much
yet as you can see, 0x230000-0x20c000 == 0x24000 == 144K. Since Cygwin
executables have a default stack of 2 Megs, the allocation base of the stack
(gdb) p m
$1 = {BaseAddress = 0x22c000, AllocationBase = 0x30000, AllocationProtect = 4,
RegionSize = 16384, State = 4096, Protect = 4, Type = 131072}
See the value of AllocationBase.
When you hit the breakpoint in handle_sigsegv, the output of tib should
(gdb) p *tib
$2 = {ExceptionList = 0x22cd78, StackBase = 0x230000, StackLimit = 0x32000,
SubSystemTib = 0x0, {FiberData = 0x1e00, Version = 7680},
ArbitraryUserPointer = 0x0, Self = 0x7ffdf000}
Observe the value of StackLimit. For this output I ran the testcase on
in Cygwin restored the stack to its state right before the stack overflow
- At 0x30000 we have the 4K dead zone, which is always only reserved,
never committed.
- At 0x31000 the 4K guard page starts.
- Thus the StackLimit (the start of the committed region of the stack)
starts at 0x32000.
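
The three addresses in this list can be derived from StackBase and the reserve size. The following is a sketch only: struct stack_layout and layout_at_overflow are hypothetical names, and the 4K page size and single guard page are taken from the example above.

```c
#include <stdint.h>

struct stack_layout
{
  uintptr_t allocation_base;  /* start of the 4K dead zone */
  uintptr_t guard_page;       /* start of the guard page(s) */
  uintptr_t stack_limit;      /* lowest committed address */
};

/* Layout of a fully grown, downward-growing stack.  */
struct stack_layout
layout_at_overflow (uintptr_t stack_base, uintptr_t reserve,
                    unsigned guard_pages)
{
  const uintptr_t page = 4096;
  struct stack_layout l;

  l.allocation_base = stack_base - reserve;
  l.guard_page = l.allocation_base + page;
  l.stack_limit = l.guard_page + guard_pages * page;
  return l;
}
```

layout_at_overflow (0x230000, 0x200000, 1) yields 0x30000, 0x31000 and 0x32000 for the three fields.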
#include <windows.h>
NT_TIB *tib;
MEMORY_BASIC_INFORMATION m;
[...]
/* Record (approximately) where the stack begins. */
stack_bottom = &stack_bottom_variable;
tib = (NT_TIB *) __readfsdword(PcTeb);
VirtualQuery (stack_bottom, &m, sizeof m);
I'll try this next and report back.

Ken

Ken Brown
2015-07-06 13:32:47 UTC
Post by Ken Brown
[...]
I'll try this next and report back.
PcTeb seems to be defined only on x86. What should I do on x86_64?

Ken

Ken Brown
2015-07-06 15:54:48 UTC
Does emacs call setrlimit by any chance?
Yes, that's the problem. The initialization code contains essentially the
following:

if (!getrlimit (RLIMIT_STACK, &rlim))
  {
    long newlim;
    /* Approximate the amount regex.c needs per unit of re_max_failures. */
    int ratio = 20 * sizeof (char *);
    /* Then add 33% to cover the size of the smaller stacks that regex.c
       successively allocates and discards, on its way to the maximum. */
    ratio += ratio / 3;
    /* Add in some extra to cover
       what we're likely to use for other reasons. */
    newlim = re_max_failures * ratio + 200000;
    if (newlim > rlim.rlim_max)
      {
        newlim = rlim.rlim_max;
        /* Don't let regex.c overflow the stack we have. */
        re_max_failures = (newlim - 200000) / ratio;
      }
    if (rlim.rlim_cur < newlim)
      rlim.rlim_cur = newlim;

    setrlimit (RLIMIT_STACK, &rlim);
  }

If I disable that code, the problem goes away: rlim_cur is set to the expected
0x7fd000 in handle_sigsegv, and emacs recovers from the stack overflow.
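
As a cross-check, the newlim formula reproduces the rlim_cur seen in gdb if re_max_failures is 40000 and pointers are 8 bytes. The value 40000 is inferred from the arithmetic and is an assumption, not something stated in this thread; regex_stack_limit is an illustrative name. The sketch covers only the case where newlim does not exceed rlim_max.

```c
#include <stddef.h>

/* Mirrors the newlim computation in the initialization code above.
   pointer_size stands in for sizeof (char *).  */
long
regex_stack_limit (long re_max_failures, size_t pointer_size)
{
  int ratio = (int) (20 * pointer_size);
  ratio += ratio / 3;            /* integer division, as in the original */
  return re_max_failures * ratio + 200000;
}
```

regex_stack_limit (40000, 8) gives 8720000, which is 0x850e80 and 331392 bytes more than the 8388608-byte stack reserve.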

I think I probably should disable that code on Cygwin anyway, because there's
simply no need for it. Some time ago I discovered that the default 2MB stack
size was not big enough for emacs on Cygwin, and I made emacs use 8MB instead.
So there's no need to enlarge it further.
Btw., *if* emacs calls setrlimit and then expects getrlimit to return
the *actual* size of the stack, rather than expecting that rlim_cur is
just a default value when setting up stacks, it's really doing something
borderline.
There's simply *no* guarantee that a stack can be extended to this size.
Any mmap() call could disallow growing the stack beyond its initial
size. Worse, on Linux you can even mmap so that the stack doesn't
grow to the supposed initial maximum size at all. The reason is that
Linux doesn't know the concept of "reserved" virtual memory, but the
stack is initially not committed in full either.
If you want to know how big your current stack *actually* is, you can
#include <pthread.h>
static void
handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
{
  pthread_attr_t attr;
  size_t stacksize;
  if (!pthread_getattr_np (pthread_self (), &attr)
      && !pthread_attr_getstacksize (&attr, &stacksize))
    {
      beg = stack_bottom;
      end = stack_bottom + stack_direction * stacksize;
      [...]
Unfortunately this is non-portable as well, as the trailing _np denotes,
but at least there *is* a reliable method on Linux and Cygwin...
Thanks. That fixes the problem too, even with the call to setrlimit left in.
I'll report this to the emacs developers.

Ken

Ken Brown
2015-07-07 18:05:06 UTC
Post by Ken Brown
If you want to know how big your current stack *actually* is, you can
#include <pthread.h>
static void
handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
{
  pthread_attr_t attr;
  size_t stacksize;
  if (!pthread_getattr_np (pthread_self (), &attr)
      && !pthread_attr_getstacksize (&attr, &stacksize))
    {
      beg = stack_bottom;
      end = stack_bottom + stack_direction * stacksize;
      [...]
Unfortunately this is non-portable as well, as the trailing _np denotes,
but at least there *is* a reliable method on Linux and Cygwin...
Thanks. That fixes the problem too, even with the call to setrlimit left
in. I'll report this to the emacs developers.
Excellent, thanks for testing this!
Uh oh. We have a problem there. This only worked accidentally, at least
on x86_64. What happens is that pthread_getattr_np checks the validity
of the "attr" parameter and while doing so it may (validly) raise a SEGV.
Yes, I discovered that too. I was just about to send off an emacs bug
report and patch, but then I decided to test it once more and got the SEGV.
Usually this SEGV is caught by a special SEH handler in Cygwin, which
is used to implement __try/__except blocks in Cygwin. The validity
check returns the matching information "object uninitialized" to the
caller.
Not so here. Since we're still in exception handling while running the
signal handler, another nested SEGV makes the OS kill the process without
calling any SEH exception handler on the way.
The problem is, there doesn't seem to be an elegant way around that on
x86_64. From the application perspective you can just initialize the
pthread_attr_t to 0, as in
pthread_attr_t attr = { 0 };
but that's ... unusual. It's so unusual that nobody will ever think of
it. The other way to "fix" this in the application itself is to call
pthread_getattr_np in the main() function, which works because we're not
running in the context of the exception handler.
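
The second workaround, querying the stack size once outside the handler, might be sketched as follows. The names cached_stack_size, cache_stack_size and get_cached_stack_size are hypothetical, and the _GNU_SOURCE define is what glibc needs to declare pthread_getattr_np; other systems may differ.

```c
#define _GNU_SOURCE   /* for pthread_getattr_np on glibc */
#include <pthread.h>
#include <stddef.h>

/* Filled in once at startup and only read afterwards.  */
static size_t cached_stack_size;

/* Call from main(), before any SIGSEGV can arrive.
   Returns 0 on success, otherwise an error number.  */
int
cache_stack_size (void)
{
  pthread_attr_t attr;
  size_t size;
  int err = pthread_getattr_np (pthread_self (), &attr);

  if (err != 0)
    return err;
  err = pthread_attr_getstacksize (&attr, &size);
  pthread_attr_destroy (&attr);
  if (err == 0)
    cached_stack_size = size;
  return err;
}

/* A signal handler can use this value instead of calling
   pthread_getattr_np itself.  */
size_t
get_cached_stack_size (void)
{
  return cached_stack_size;
}
```

The handler would then compute end from the cached value, so that no pthread call runs while the exception is still being handled.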
Every myfault setup will have to capture the current thread context
and set up a vectored continuation handler. This handler will be
called if no other exception handler feels responsible for an
exception. Fortunately it's called even while another exception is
still handled. The vectored handler then restores the thread context,
just with tweaked instruction pointer.
What bugs me with this solution is not only that it looks rather
hackish, but also that it comes with a performance hit. The fact
that every __try/__except block has to call RtlCaptureContext is
not exactly free of charge...
As you might have noticed, this has nothing to do with the alternate
stack. It's just YA problem which cropped up during this test phase.
Yep. But the good news is that the alternate stack is working.

Ken

Ken Brown
2015-07-07 19:37:25 UTC
I spoke too soon. This *is* a result of the alternate stack handling.
When the exception occurs while running on the alternate stack, the OS
exception handler checks if the stack pointer is valid, and since it's
not in the stack area as stored in the TEB, it treats the stack as
corrupted. That's why it stops calling the SEHs.
In the meantime I found a workaround for this problem with only a very
marginal performance hit. I applied it to the repo and I'm just in the
process of creating new snapshots. If the snapshots work for you, I'll
create another test release.
They work for me. I guess I can go ahead and file that emacs bug
report. Thanks.

Ken


Warren Young
2015-07-08 19:39:52 UTC
...RLIMIT_STACK...RLIM_INFINITY.
This should fix the Emacs crash you reference later in this message without rebuilding Emacs, right?

I’m not an Emacs user, as you know, but is there an STC for the Emacsizens to try on their systems?
- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
MINSIGSTKSZ, SIGSTKSZ.
Since these were entirely missing before, this can’t be tested without rebuilding software, right? When rebuilt, existing Cygwin packages may discover the new APIs via autoconf or similar.

A search for "sigaltstack" on code.openhub.net found only 95 projects with this string in their source code, almost entirely consisting of *receivers* of that call, such as NetBSD, glibc, and a bunch of Linux forks.

Of those projects that I didn’t recognize as a receiver of these API calls, I didn’t recognize any as existing Cygwin user-space packages.

So, is it simply the case that the only people who will care to test this are those who already know they’re trying to call this, and need it to work?
- New API: sethostname.
For what it’s worth, that gave a pretty similar result set: lots of OSes and low-level infrastructure, few user-space packages.
- Enable non-SA_RESTART behaviour on threads other than main thread.
Addresses: https://cygwin.com/ml/cygwin/2015-06/msg00260.html
Nice fix for an obscure problem.

And thanks for providing references, so that those like me who didn’t pay attention to the original thread can catch up and see why we care about these fixes. :)
Ken Brown
2015-07-08 20:12:46 UTC
Post by Warren Young
...RLIMIT_STACK...RLIM_INFINITY.
This should fix the Emacs crash you reference later in this message without rebuilding Emacs, right?
I’m not an Emacs user, as you know, but is there an STC for the Emacsizens to try on their systems?
There's no crash in the current Emacs release. The crashes we were
discussing were in my build of the current Emacs development trunk,
which can use an alternate stack in order to recover from stack
overflow. This has no impact on users of the current Emacs release.

Ken

Eric Blake
2015-07-14 22:03:19 UTC
- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
MINSIGSTKSZ, SIGSTKSZ.
Since these were entirely missing before, this can’t be tested without rebuilding software, right? When rebuilt, existing Cygwin packages may discover the new APIs via autoconf or similar.
A search for "sigaltstack" on code.openhub.net found only 95 projects with this string in their source code, almost entirely consisting of *receivers* of that call, such as NetBSD, glibc, and a bunch of Linux forks.
libsigsegv is a cygwin package (currently 32-bit only) that has
configure checks to use sigaltstack if present; I have not yet tested if
it can be configured to work with the new API, but hope to do so in the
near future. In fact, if sigaltstack works, it may finally be possible
to port libsigsegv to 64-bit cygwin (the reason the current package is
not ported to 64-bit is that libsigsegv is relying on raw assembly and
Windows native calls to emulate the lack of sigaltstack; but if
sigaltstack works, then we don't need to port the 64-bit counterpart for
the 32-bit specific hacks).

I'm not the cygwin packager for libsigsegv, but am one of the upstream
contributors, and so this thread has piqued my interest. Sadly, I'm a
bit late to the testing because I was on vacation last month, and am now
trying to catch up with several things that happened during my
(much-needed) downtime, such as a new upstream release of coreutils.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Ken Brown
2015-07-15 02:07:47 UTC
Post by Eric Blake
- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
MINSIGSTKSZ, SIGSTKSZ.
Since these were entirely missing before, this can’t be tested without rebuilding software, right? When rebuilt, existing Cygwin packages may discover the new APIs via autoconf or similar.
A search for "sigaltstack" on code.openhub.net found only 95 projects with this string in their source code, almost entirely consisting of *receivers* of that call, such as NetBSD, glibc, and a bunch of Linux forks.
libsigsegv is a cygwin package (currently 32-bit only) that has
configure checks to use sigaltstack if present; I have not yet tested if
it can be configured to work with the new API, but hope to do so in the
near future. In fact, if sigaltstack works, it may finally be possible
to port libsigsegv to 64-bit cygwin (the reason the current package is
not ported to 64-bit is that libsigsegv is relying on raw assembly and
Windows native calls to emulate the lack of sigaltstack; but if
sigaltstack works, then we don't need to port the 64-bit counterpart for
the 32-bit specific hacks).
I just did a quick test, and it looks promising. I removed all Cygwin-specific
code from configure.ac and Makefile.am (see attached patch), and it then built
on 64-bit Cygwin. Here's the result of 'make check':

Entering directory
'/home/kbrown/src/cyglibsigsegv/libsigsegv-2.10-1.x86_64/build/tests'
Test passed.
PASS: sigsegv1.exe
Test passed.
PASS: sigsegv2.exe
Doing SIGSEGV pass 1.
Stack overflow 1 caught.
Doing SIGSEGV pass 2.
Stack overflow 2 caught.
Test passed.
PASS: sigsegv3.exe
SKIP: stackoverflow1.exe
SKIP: stackoverflow2.exe
======================
All 3 tests passed
(2 tests were not run)
======================
[...]
Please send the following summary line via email to the main author
Bruno Haible <***@clisp.org> for inclusion into the list of
successfully tested platforms (see PORTING file).

libsigsegv: x86_64-unknown-cygwin | yes | no | 2.10
Post by Eric Blake
I'm not the cygwin packager for libsigsegv,
No one is; it's orphaned.
Post by Eric Blake
but am one of the upstream
contributors, and so this thread has piqued my interest.
So it seems that you would be the obvious person to maintain it, if you have the
time. If you don't have the time, I'd be willing to ITA it just to get it into
the 64-bit distro. But in that case I'd appreciate it if you would review my
build after I send the ITA, since you actually know something about libsigsegv,
and I don't.
Post by Eric Blake
Sadly, I'm a
bit late to the testing because I was on vacation last month, and am now
trying to catch up with several things that happened during my
(much-needed) downtime, such as a new upstream release of coreutils.
Ken
Ken Brown
2015-07-15 14:24:42 UTC
Hi guys,
Post by Ken Brown
Entering directory
'/home/kbrown/src/cyglibsigsegv/libsigsegv-2.10-1.x86_64/build/tests'
Test passed.
PASS: sigsegv1.exe
Test passed.
PASS: sigsegv2.exe
Doing SIGSEGV pass 1.
Stack overflow 1 caught.
Doing SIGSEGV pass 2.
Stack overflow 2 caught.
Test passed.
PASS: sigsegv3.exe
SKIP: stackoverflow1.exe
SKIP: stackoverflow2.exe
Any idea why these two tests have been skipped? That means the
HAVE_STACK_OVERFLOW_RECOVERY autoconf test failed. You removed cygwin
from the explicit
mingw* | cygwin*) sv_cv_have_stack_overflow_recovery=yes ;;
which is the right thing to do, but that means CFG_LEAVE has been
set to leave-none.c, apparently.
I haven't much time to look into that right now, but will later today if
you don't beat me to it.
Got it. What's needed is a Cygwin-specific fault-*.h file which exposes
how to fetch the stack pointer register from mcontext_t. As you can see
from the plethora of fault-*.h files in the src subdir, this is highly
system-specific anyway.
Here's the set of files you need to rebuild libsigsegv for Cygwin 2.1.0,
with all tests running and passing on i686 and x86_64. No other patch
is required.
Great! Thanks.
So, who of you is going to maintain it?
I'll defer to Eric, but if he doesn't want it then I'll take it.

Ken

Eric Blake
2015-07-15 15:08:16 UTC
Post by Ken Brown
Got it. What's needed is a Cygwin-specific fault-*.h file which exposes
how to fetch the stack pointer register from mcontext_t. As you can see
from the plethora of fault-*.h files in the src subdir, this is highly
system-specific anyway.
Here's the set of files you need to rebuild libsigsegv for Cygwin 2.1.0,
with all tests running and passing on i686 and x86_64. No other patch
is required.
Awesome! Makes my work easier.
Post by Ken Brown
Great! Thanks.
So, who of you is going to maintain it?
I'll defer to Eric, but if he doesn't want it then I'll take it.
I'll take it. Expect a package up later this week, as well as a rebuilt
64-bit m4 that uses it.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org