[D-runtime] A mechanism to kill threads

Post by Alex Rønne Petersen
Hi,
In my virtual machine, I need to be able to kill daemon threads on
shutdown. The problem is that VM shutdown is *not* tightly coupled to
druntime shutdown; multiple VM instances can run in a process at any
given time, and can be started/stopped whenever. It doesn't appear
like there is any functionality in core.thread to achieve this.
Is there any reason we don't have a Thread.kill() function?

Because it's incredibly unsafe? You may have a valid use case for it, but it's
_not_ something that you should normally be doing.

- Jonathan M Davis

Alex Rønne Petersen

2012-05-16 18:57:15 UTC

Of course it's unsafe, but no less than arbitrarily suspending a
thread. Once you've suspended a thread, you may as well kill it.
You've effectively halted it anyway. We have lots of unsafe primitives
in core.thread already (and my critical regions and cooperative
suspension patches add even more!).

Is there anything speaking against adding this to core.thread with a
big fat "THIS IS UNSAFE" warning?

Regards,
Alex

_______________________________________________
D-runtime mailing list
D-runtime at puremagic.com
http://lists.puremagic.com/mailman/listinfo/d-runtime

Jonathan M Davis

2012-05-16 21:04:48 UTC

Post by Alex Rønne Petersen
Of course it's unsafe, but no less than arbitrarily suspending a
thread. Once you've suspended a thread, you may as well kill it.
You've effectively halted it anyway. We have lots of unsafe primitives
in core.thread already (and my critical regions and cooperative
suspension patches add even more!).
Is there anything speaking against adding this to core.thread with a
big fat "THIS IS UNSAFE" warning?

Well, I wasn't saying that it's necessarily the case that we shouldn't add it.
I was pointing out that it's a very unsafe thing to do and rarely needed such
that it's not exactly surprising that it hasn't been added previously. If
there's a real use case for it, it makes sense to add it given that it _is_
something that's platform dependent, and part of the whole point of Thread is
to provide a platform-independent API for handling threads. It probably
_should_ have a big fat warning on it though, given that not all developers
seem to get how insanely bad it is to kill a thread (at least, that seems to
be the case given some thread-related questions I've seen in the past).

- Jonathan M Davis

David Simcha

2012-05-16 21:19:02 UTC

For the sake of argument, what are the most non-obvious reasons why killing
threads is bad? The ones I can think of are because the thread may be in
the middle of doing something important and bad things will happen if it's
interrupted and because the thread might hold resources that will never get
freed if it's killed before it gets to free them.

I was thinking at one point that I wanted a kill() primitive when I was
designing std.parallelism, but I don't remember why I wanted it and
obviously I've managed to do without it. Is it ok to kill threads if
they're not doing useful work at the time and they're not holding onto any
resources that will never get freed if you kill them?

Alex Rønne Petersen

2012-05-16 21:29:36 UTC

The only reason I can think of is exactly what you mentioned, but this
is a non-issue because you *really* *should* *not* kill non-daemon
threads, and daemon threads should be designed to handle 'abnormal'
termination gracefully.

That said, I think it's okay to kill a thread which is just waiting
for work, sleeping, or whatever. As long as it isn't holding onto a
lock (or similar), you should be safe. Holding onto a resource like a
lock could easily lead to a deadlock because the lock is never
released when you kill the thread.

I think the TL;DR of all this is that you should really only kill
threads if either a) you know they can tolerate it (daemon threads) or
b) you know specifically what the thread is doing and that it's safe
to interrupt that work and never resume it (speaking of which, killing
a thread isn't really any different from suspending it and just never
resuming it).

Regards,
Alex

Sean Kelly

2012-05-16 21:43:18 UTC

Why does this need to be built into core.thread? Is there some reason you can't implement a kill routine in your app?

Jonathan M Davis

2012-05-16 21:32:16 UTC

It's the same reason that you don't want to continue executing after an
assertion fails. Your program is by definition in an undefined state. In this
case, it would be a single thread which was then in an undefined state. If it
were sufficiently isolated (e.g. doesn't use shared _at all_), it _might_ mean
that the rest of the program is okay, but there's no question that anything
relating to that thread is then in an undefined state, and odds are that that
means that the rest of the program is also in an undefined state, though the
impact is likely to be much less if the thread was well isolated. If you were
using shared much though, there's not really any question that your whole
program is then in an undefined state, and who knows what could happen if you
tried to continue. And even without using shared much, the threading stuff is
still built on top of C primitives which _are_ shared (well, __gshared), so by
definition, there's at least _some_ portion of the rest of your program which
is in an undefined state.

- Jonathan M Davis
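
The isolation point above rests on a D rule: module-level variables are thread-local unless they are marked shared or __gshared. The following is a minimal sketch of that rule, not code from this thread; the names threadLocalCounter, processWideCounter and observeFromNewThread are hypothetical.

```d
import core.thread : Thread;

int threadLocalCounter;            // thread-local by default: each thread gets its own copy
__gshared int processWideCounter;  // one instance for the whole process, unchecked by the type system

// Sets both counters to 1 in the calling thread, then reports what a newly
// started thread observes as [threadLocalCounter, processWideCounter].
int[2] observeFromNewThread()
{
    threadLocalCounter = 1;
    processWideCounter = 1;

    int[2] seen;
    auto t = new Thread({
        seen[0] = threadLocalCounter;  // the new thread's own copy, still at its initial value
        seen[1] = processWideCounter;  // the single process-wide instance
    });
    t.start();
    t.join();
    return seen;
}
```

Only state of the second kind can be left half-updated for other threads to see if the owning thread is killed, which is why a thread that avoids shared data entirely is the least risky candidate.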

Alex Rønne Petersen

2012-05-16 21:36:42 UTC

In my particular case, the threads I'm going to kill are executing
isolated managed code, so going into an undefined state can't really
happen, but you're certainly right that it's entirely possible in
normal C and D code. In fact, even .NET has a zillion warnings about
using Thread.Abort():
http://msdn.microsoft.com/en-us/library/ty8d3wta.aspx

Regards,
Alex

Sean Kelly

2012-05-16 21:47:01 UTC

Right. This is why I don't want the functionality in core.thread. There are potentially legitimate reasons to kill a thread, but exposing it in the library effectively sanctions it as acceptable behavior, when this absolutely isn't the case. Unless there were some reason why it absolutely couldn't be done without support in core.thread, I don't think inherently unsafe operations should be exposed to the user.

Alex Rønne Petersen

2012-05-16 21:51:35 UTC

I'm sure it could be done outside of core.thread, but then anyone who
legitimately knows what they're doing and needs it would have to
reinvent the wheel.

Regards,
Alex

Sean Kelly

2012-05-16 22:36:32 UTC

Post by Alex Rønne Petersen
I'm sure it could be done outside of core.thread, but then anyone who
legitimately knows what they're doing and needs it would have to
reinvent the wheel.

But it's trivial in this case:

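import core.sys.posix.pthread; // pthread_cancel, pthread_self, pthread_t (Posix only)
import core.thread : Thread;

class KillableThread : Thread {
    this(void function() fn) {
        m_fn = fn;
        super(&threadMain);
    }

    // Unsafe: the target may hold locks or be inside the GC.
    // Must not be called before the thread has started and stored its handle.
    void kill() {
        pthread_cancel(m_thisThread);
    }

    // Entry point: record the native handle, then run the user function.
    private void threadMain() {
        m_thisThread = pthread_self();
        m_fn();
    }

    private pthread_t m_thisThread;
    private void function() m_fn;
}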

Which I suppose could be made even easier by adding a protected method to Thread that returns the thread handle.

Alex Rønne Petersen

2012-05-16 22:42:15 UTC

The trouble is that they have to have version blocks for every
platform and every threading library supported. The fact that
core.thread supports pthreads should not even be known to code that
extends the Thread class; it's an implementation detail (other
threading libraries exist, especially on Linux). I believe there is
currently active work to make GDC support other C libraries, and I
imagine that involves other threading libraries on rather exotic
platforms too.

The way I've always seen core.thread is as a fundamental building
block for threading; it should hide the native threading interface
completely so people shouldn't have to mess with those at all.

Regards,
Alex

Sean Kelly

2012-05-16 22:55:55 UTC

Post by Alex Rønne Petersen
The trouble is that they have to have version blocks for every
platform and every threading library supported. The fact that
core.thread supports pthreads should not even be known to code that
extends the Thread class; it's an implementation detail (other
threading libraries exist, especially on Linux). I believe there is
currently active work to make GDC support other C libraries, and I
imagine that involves other threading libraries on rather exotic
platforms too.
The way I've always seen core.thread is as a fundamental building
block for threading; it should hide the native threading interface
completely so people shouldn't have to mess with those at all.

I tend to agree, though for some of the really risky stuff I think it may be fair to expect the user to be familiar with the platform specifics of what he's trying to do. Killing threads, for example, works differently on Posix vs. Win32 and is (surprise surprise) actually a bit safer. Regarding killing threads in general though, I wonder if there isn't perhaps a safer application-specific approach that you could use. Could these threads periodically check some global "cancel" state and exit if set? Is it truly necessary to have an external thread violently kill them?
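
The global "cancel" state suggested above could look roughly like the following. This is a minimal sketch under assumptions, not code from the thread: cancelRequested, requestCancel, runUntilCancelled and stopWorker are hypothetical names, and only core.atomic and core.thread from druntime are used.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;

// Hypothetical process-wide cancel flag; `shared` makes it visible to all threads.
shared bool cancelRequested = false;

// Asks every cooperating worker to stop at its next check.
void requestCancel()
{
    atomicStore(cancelRequested, true);
}

// Runs `step` repeatedly, checking the flag between steps, and returns the
// number of steps that completed before cancellation was observed.
size_t runUntilCancelled(scope void delegate() step)
{
    size_t completed = 0;
    while (!atomicLoad(cancelRequested))
    {
        step();
        ++completed;
    }
    return completed;
}

// Requests cancellation, then waits for the worker to notice and exit.
void stopWorker(Thread worker)
{
    requestCancel();
    worker.join();
}
```

The flag is only examined between calls to step, so a step that never returns keeps the worker alive indefinitely and stopWorker would block in join.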

Alex Rønne Petersen

2012-05-16 23:06:12 UTC

Trust me, I've evaluated every option I could think of. Even
cooperative suspension/killing won't cut it because a while (true);
inside a VM thread could prevent the cooperative kill from actually
happening at all.

E.g.:

void myThread()
{
    while (run)
    {
        runUserCode(); // if this never returns, the thread will never stop
    }
}

Regards,
Alex

Sean Kelly

2012-05-16 23:22:36 UTC

Post by Alex Rønne Petersen
Trust me, I've evaluated every option I could think of. Even
cooperative suspension/killing won't cut it because a while (true);
inside a VM thread could prevent the cooperative kill from actually
happening at all.
void myThread()
{
    while (run)
    {
        runUserCode(); // if this never returns, the thread will never stop
    }
}

Maybe a standalone function like:

extern (C) int thread_terminateThread(Thread t);

That would at least keep it out of the public interface for the object, and move it into the realm of really unsafe or weird stuff. Obvious caveat still being that the app may well have been left in an invalid state.

Alex Rønne Petersen

2012-05-16 23:26:49 UTC

I agree. I'll have a go at implementing it and we can see where that goes.

Regards,
Alex

Michel Fortin

2012-05-17 21:31:20 UTC

Post by Sean Kelly
extern (C) int thread_terminateThread(Thread t);
That would at least keep it out of the public interface for the object, and move it into the realm of really unsafe or weird stuff. Obvious caveat still being that the app may well have been left in an invalid state.

int thread_terminateThread(string disclaimer)(Thread t) if (disclaimer == "I know it will leak and could deadlock once in a while. I have read the documentation and accepts the consequences.");

--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/

Brad Roberts

2012-05-16 23:51:09 UTC

If you have code like this, how can you make the claims you have about
knowing it to be safe to kill the threads in question? I'm with Sean on
this: killing threads is a really bad idea and having an API is going to
result in pain.

The only safe way to do it is akin to what you describe above. Rely on
some communication path into the other thread to ask it to kill itself
when it's actually safe, not just believed to be.

It's a lot like so-called lockless programming. It's _really_ hard to get
right. Even the experts get it wrong fairly frequently. When you can
avoid it, avoid it. There are safe ways of asking a thread to exit
itself, so that should be the much preferred method.

You're obviously a bright guy and know all this, so what's pushing you so
hard to go in an obviously unsafe direction rather than the obviously safe
one?

Alex Rønne Petersen

2012-05-16 23:58:23 UTC

Post by Brad Roberts
If you have code like this, how can you make the claims you have about
knowing it to be safe to kill the threads in question? I'm with Sean on
this, killing threads is a really bad idea and having an api is going to
result in pain.

I can because I know what the threads are doing. I'm doing this in a
virtual machine; I know the threads are executing isolated, managed
code. Heck, if I wanted, I could map their native instruction pointer
back into my IR to see exactly what operation they were performing.

The important point, however, is that the threads in question can only
mutate state that I know everything about, and I'm doing this daemon
thread killing on shutdown. That means that the managed system is left
in a very fragile state, but it doesn't matter, because the world has
completely stopped at that point (all managed, non-daemon threads have
been joined before killing the daemon threads). This is what I mean
when I say you have to reason about *exactly* what the thread you're
killing is doing, or you shouldn't be killing it at all.

The cases where you can actually kinda-sorta safely kill a thread are
so extremely rare, but they do exist.

Post by Brad Roberts
The only safe way to do it is akin to what you describe above. Rely on
some communication path into the other thread to ask it to kill itself
when it's actually safe, not just believed to be.

The thing is, that's not so useful in practice: Every programmer
working with my VM would have to make their daemon threads
cooperatively check for a shutdown condition frequently. And while
this is arguably a much better programming model than what people do
today, I can't reasonably force it upon people (it would also render
my VM incompatible with the majority of existing programming languages
in the threading department). Besides, cooperative killing means that
the thread may not be killed at all if the managed code so desires; I
can't afford that. I need to be able to guarantee that the VM shuts
down (well, almost; most thread killing APIs don't actually guarantee
all that much, but in practice, they tend to work).

Post by Brad Roberts
It's a lot like so-called lockless programming. It's _really_ hard to get
right. Even the experts get it wrong fairly frequently. When you can
avoid it, avoid it. There are safe ways of asking a thread to exit
itself, so that should be the much preferred method.

I really would if I could! But this is a virtual machine, so asking
the thread at some arbitrary point to just shut itself down is not
practical. I know that the thread is executing managed code, but I
don't know *what* that code means; that's entirely up to the
programmer of the application the VM is executing. But as far as VM
shutdown is concerned, what the code does is irrelevant, since I'm
abandoning the managed world completely on shutdown anyway (in
practice, this equals zeroing all of the managed application's
globals, unregistering all roots, doing a GC cycle, and waiting for
finalizers to complete (the last one is another can of worms
entirely...)).

Post by Brad Roberts
You're obviously a bright guy and know all this, so what's pushing you so
hard to go in an obviously unsafe direction rather than the obviously safe
one?

See above. I believe in my specific case, I have enough control to
make this 'safe enough'.

Regards,
Alex

Sean Kelly

2012-05-16 21:44:12 UTC

If the killed thread even has a chance of allocating memory it could be killed while inside the GC and lock all threads out of the GC forever.

Jonathan M Davis

2012-05-16 22:52:09 UTC

Post by Sean Kelly
If the killed thread even has a chance of allocating memory it could be
killed while inside the GC and lock all threads out of the GC forever.

Ouch.

- Jonathan M Davis

Alex Rønne Petersen

2012-05-16 22:54:44 UTC

Again, this boils down to knowing what you're doing. You should NOT be
killing a thread if there is any chance whatsoever that the thread
could be doing GC allocation while you do it.

That said, I'm not convinced that this is a problem that we cannot
work around; I think that if kill() would acquire the global thread
lock, this problem could be prevented from happening.

Regards,
Alex

Sean Kelly

2012-05-16 22:59:59 UTC

That forces the GC to expose functionality for acquiring any relevant locks for something that really has nothing to do with memory allocation, and still only takes care of one potential deadlock, albeit a common one. Critical sections would help as well (that's basically how pthread_cancel() works), but applying them to this case would be tricky.
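
On Posix, the critical-section idea mentioned above corresponds to a thread's cancel state: while a thread has cancellation disabled, a pthread_cancel() request against it stays pending instead of taking effect. The following is a minimal Posix-only sketch using druntime's bindings in core.sys.posix.pthread; the helper name withCancellationDisabled is hypothetical and not part of druntime.

```d
import core.sys.posix.pthread : pthread_setcancelstate,
    PTHREAD_CANCEL_DISABLE, PTHREAD_CANCEL_ENABLE;

// Runs `work` on the calling thread with cancellation disabled, then restores
// whatever cancel state was in effect before. Returns that previous state.
int withCancellationDisabled(scope void delegate() work)
{
    int previous;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &previous);
    scope (exit)
    {
        int ignored;
        pthread_setcancelstate(previous, &ignored); // restore on every exit path
    }
    work(); // a pending cancel request cannot interrupt this region
    return previous;
}
```

This only protects regions that the target thread itself chooses to wrap, which may be part of why applying the idea to arbitrary code, such as GC allocation, is described as tricky.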

Alex Rønne Petersen

2012-05-16 23:03:50 UTC

So that's the thing: You really need to know what the thread is doing
or you should not be killing it. Predicting GC allocations is hard,
but taking the global thread lock inside kill() eliminates that
problem (at least I *think* it does; it covers any case that I can
think of). In most cases where you kill threads, you either know what
they are doing, or assume that *they* know what they're doing (and
that "what" being something that won't make the planet implode once
they're killed).

Regards,
Alex

Sean Kelly

2012-05-16 21:42:02 UTC

If the thread holds a lock on any mutexes, they will never be released and any other thread trying to acquire those locks will deadlock.

Alex Rønne Petersen

2012-05-16 21:19:26 UTC

Well, I'll see if I can conjure some sort of patch for this.

I do have a more technical concern, though: Would a Thread.kill()
function have to acquire the global thread lock to avoid racing
against a world stop? At least the way I see it, that's going to be
necessary.

Regards,
Alex

Fawzi Mohamed

2012-05-16 21:07:10 UTC

yes, suspending makes the code that is executed during the suspension
"unsafe" (meaning that it can do only things that can be done in a
signal handler).
Killing makes all the rest of the program have such constraints...

Fawzi

Alex Rønne Petersen

2012-05-16 21:39:00 UTC

What do you mean? This might hold true if you're executing in a signal
handler, but if the code you execute after suspending/killing a bunch
of threads does not touch any state that the suspended/killed threads
owned/accessed, you'll be fine.

Regards,
Alex

Sean Kelly

2012-05-16 21:48:07 UTC

Post by Alex Rønne Petersen
What do you mean? This might hold true if you're executing in a signal
handler, but if the code you execute after suspending/killing a bunch
of threads does not touch any state that the suspended/killed threads
owned/accessed, you'll be fine.

But how do you know what the threads own or are accessing? This is the standard library. The user could be doing absolutely anything.

Alex Rønne Petersen

2012-05-16 21:51:02 UTC

It's the user's responsibility, just as with any of the other 'unsafe'
functions. Also, it's the runtime, not the standard library. I think
there is a significant difference between those two terms. You
shouldn't touch the runtime in general unless you're prepared to go
low-level.

Regards,
Alex

Sean Kelly

2012-05-16 22:28:21 UTC

Post by Alex Rønne Petersen
It's the user's responsibility, just as with any of the other 'unsafe'
functions. Also, it's the runtime, not the standard library. I think
there is a significant difference between those two terms. You
shouldn't touch the runtime in general unless you're prepared to go
low-level.

I think that there are three important aspects of library design. The first is to make it useful by exposing any functionality that is commonly needed by users of the library, or in some cases that is not commonly needed but which can't be done without library support. The second is that this should be done with the understanding that the interface of a library provides a framework for how the library is to be used. This is the really tricky part, because a library could expose the same functionality in two different ways and the applications built upon each would be fundamentally different. The third is that the library should be no more complex than absolutely necessary. Extra complexity tends to expose too many assumptions about how the library will be used, and tends to confuse things without really adding much of use.

What I've found is that when people think of something useful for their program, if it's something regarding a library they tend to want to build that functionality into the library even if it doesn't need to be there. The problem is that they may be the only person in the world that needs that functionality, and adding it to the library may increase maintenance cost to no one's actual benefit but that one user. There have been a number of requests like this for Druntime, and some of the few that were accepted have just been deleted after having been deprecated for years, without a single complaint. Assuming that people are actually using Druntime, I'll admit that I prefer to see it shrinking rather than growing. It means that it's being distilled down to only containing the functionality that people actually care about.

Regarding Druntime's "expert only" status? Druntime still follows the core tenet of D which is that the design should encourage people to do things the "right way" by making that approach inherently easy. For example, the GC needs to suspend and scan threads so it can collect unused memory. This functionality is absolutely needed. But an important observation is that the GC only ever needs to suspend and scan all threads in bulk--never just one at a time. So rather than exposing a Thread.suspend() routine as in Java, Druntime only provides a way to suspend all threads at once. This makes it far more likely that the functionality will only be used as was intended--by the GC--rather than by users as a means of implementing synchronization. And if you don't think people use suspend() in this way you need look no farther than Doug Lea's containers library. He's clearly an expert and so Gets it Right, and he's admittedly doing it by necessity since Java doesn't have semaphores or whatever, but I think it's a fair point all the same. What should have happened was for semaphores to be added as a separate API so Doug didn't have to cowboy it with suspend().

Alex Rønne Petersen

2012-05-16 22:57:07 UTC

Post by Sean Kelly
I think that there are three important aspects of library design. The first is to make it useful by exposing any functionality that is commonly needed by users of the library, or in some cases that is not commonly needed but which can't be done without library support. The second is that this should be done with the understanding that the interface of a library provides a framework for how the library is to be used. This is the really tricky part, because a library could expose the same functionality in two different ways and the applications built upon each would be fundamentally different. The third is that the library should be no more complex than absolutely necessary. Doing so tends to expose too many assumptions about how the library will be used, and tends to confuse things without really adding much of use.

The problem is that druntime's core.thread is supposed to hide the
native threading mechanism being used. With the example in your other
email, the abstraction is completely broken, and portability becomes a
*giant* pain (especially because compilers don't set any flags
indicating what threading library the runtime is built for). So, I do
believe kill() would be an important primitive to have, as unsafe as
it is.

Post by Sean Kelly
What I've found is that when people think of something useful for their program, if it's something regarding a library they tend to want to build that functionality into the library even if it doesn't need to be there. The problem is that they may be the only person in the world that needs that functionality, and adding it to the library may increase maintenance cost to no one's actual benefit but that one user. There have been a number of requests like this for Druntime, and some of the few that were accepted have just been deleted after having been deprecated for years, without a single complaint. Assuming that people are actually using Druntime, I'll admit that I prefer to see it shrinking rather than growing. It means that it's being distilled down to only containing the functionality that people actually care about.

What were some of those (out of curiosity)?

I agree that keeping druntime small is a good idea - but IMO that
shouldn't mean that we can't have the abstractions that actually make
druntime useful. Obviously, I'm biased here, since this is a feature
I, specifically, want. But I'm not convinced that it would never be
used by anyone else. D is a systems language intended to be a
competitor to C and C++. A virtual machine is just one example of a
system that would need the functionality to destructively kill
threads, regardless of whether all hell breaks loose as a result.

Post by Sean Kelly
Regarding Druntime's "expert only" status? Druntime still follows the core tenet of D which is that the design should encourage people to do things the "right way" by making that approach inherently easy. For example, the GC needs to suspend and scan threads so it can collect unused memory. This functionality is absolutely needed. But an important observation is that the GC only ever needs to suspend and scan all threads in bulk--never just one at a time. So rather than exposing a Thread.suspend() routine as in Java, Druntime only provides a way to suspend all threads at once. This makes it far more likely that the functionality will only be used as was intended--by the GC--rather than by users as a means of implementing synchronization. And if you don't think people use suspend() in this way you need look no farther than Doug Lea's containers library. He's clearly an expert and so Gets it Right, and he's admittedly doing it by necessity since Java doesn't have semaphores or whatever, but I think it's a fair point all the same. What should have happened was for semaphores to be added as a separate API so Doug didn't have to cowboy it with suspend().

I should make it clear that I don't consider druntime "expert only". I
just consider it a place where you should *seriously RTFM* before you
use anything it exposes.

I agree that exposing a suspend primitive for individual threads is a
very bad idea; all sorts of Bad Things (TM) could happen if the user
starts interfering with individual suspend counts. Keep in mind that
what I'm asking for will: 1) acquire the global thread lock on every
call 2) kill the thread completely and remove it from the global
thread list. This makes it much less error-prone than a simple
suspend().

Regards,
Alex

Sean Kelly

2012-05-16 23:04:48 UTC

Fair enough. I'm still resistant to Thread.kill() however, both because of how unsafe it is and because this is actually the first time anyone has ever requested it.

Post by Sean Kelly
What I've found is that when people think of something useful for their program, if it's something regarding a library they tend to want to build that functionality into the library even if it doesn't need to be there. The problem is that they may be the only person in the world that needs that functionality, and adding it to the library may increase maintenance cost to no one's actual benefit but that one user. There have been a number of requests like this for Druntime, and some of the few that were accepted have just been deleted after having been deprecated for years, without a single complaint. Assuming that people are actually using Druntime, I'll admit that I prefer to see it shrinking rather than growing. It means that it's being distilled down to only containing the functionality that people actually care about.

What were some of those (out of curiosity)?

Runtime.isHalting, for one. There were others I didn't implement for one reason or another, like the ability to join a thread with a timeout.

Post by Alex Rønne Petersen
I agree that keeping druntime small is a good idea - but IMO that
shouldn't mean that we can't have the abstractions that actually make
druntime useful. Obviously, I'm biased here, since this is a feature
I, specifically, want. But I'm not convinced that it would never be
used by anyone else. D is a systems language intended to be a
competitor to C and C++. A virtual machine is just one example of a
system that would need the functionality to destructively kill
threads, regardless of whether all hell breaks loose as a result.

Suspending threads is actually similar, since debuggers do have a valid reason to suspend threads. I suppose I'm just leery of exposing any functionality that could cause a deadlock.

Sean Kelly

2012-05-16 21:40:47 UTC

Ah, but that's why we have thread_suspendAll() but no way to suspend only one specific thread. Because Java exposes Thread.suspend() for a single thread, people seem inclined to call it as a poor man's form of synchronization, with disastrous results. In fact, this is also why thread_suspendAll() is an extern (C) call rather than exposed as a static method in Thread. I was trying to make it as obscure as possible while still allowing it to be called by the GC.


Sean Kelly

2012-05-16 21:38:41 UTC

Post by Alex Rønne Petersen
Is there any reason we don't have a Thread.kill() function?

Because Thread.kill() is inherently unsafe. If a user really wants to expose that functionality he can subclass Thread.

Alex Rønne Petersen

2012-05-16 21:40:24 UTC

So you are against adding this functionality to druntime even with a
big, fat, red warning explaining why it's unsafe and how *not* to use
it?

Regards,
Alex


Sean Kelly

2012-05-16 21:49:26 UTC

You're assuming that people read documentation. In my experience, they don't. But more to the point, since this is something that's fairly easy for a user to add if he needs it, there's no reason to expose it in core.thread.

Alex Rønne Petersen

2012-05-17 00:55:48 UTC

One interesting question in designing a kill function is: Do we want
to guarantee that it runs all the thread shutdown routines (see the
shutdown code in thread_entryPoint)? I'm thinking that guaranteeing
this is basically impossible, so I'm leaning towards no. The most I
can see being realistic is to remove the thread from the global
thread list if enqueueing the termination request (pthread_cancel on
POSIX, TerminateThread on Windows) succeeded.

If we don't guarantee this, how much resource leakage are we looking at?

Regards,
Alex

Alex Rønne Petersen

2012-05-17 01:08:32 UTC

Actually removing the thread from the global thread list would be bad,
since the pthread cleanup callback will most likely access it. So, I
see two ways to solve this:

a) Once the thread has been scheduled for termination, join it
immediately and pray that everything goes well, and then remove it
from the global list. This won't work if the code inside the thread
has altered its cancelability state with pthread_setcancelstate or
pthread_setcanceltype, but I don't think this is worth
worrying about. If someone uses those functions and tries to kill the
thread, they probably know what they are doing and are prepared for
the consequences of calling the kill function with a thread in a
non-cancelable state (it's actually not necessarily problematic as
long as they obey the cancellation request inside the thread -
obviously, it's threading library-specific, though).
b) Enqueue the termination request, and just return immediately. This
means that the thread won't be removed from the global thread list
until something (i.e. thread_suspendAll) notices that it's dead
(!t.isRunning). While this is the easiest solution, I don't think it's
very clever; normally, you'd expect the thread to have terminated by
the time the kill function returns.

Personally, I'm in favor of (a).
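
For reference, option (a) can be sketched with raw pthreads in C. This is only an illustration under stated assumptions, not the druntime implementation: kill_and_join, spin_forever and demo_kill_and_join are hypothetical names, the worker uses the default (deferred) cancelability, and the "remove from the global list" step is only marked by a comment. If the target had disabled cancellation with pthread_setcancelstate, the join below would simply block.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <unistd.h>

/* Worker that never returns on its own; sleep() is a cancellation point. */
static void *spin_forever(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return NULL; /* not reached */
}

/* Hypothetical helper following option (a): enqueue the cancellation
   request, then join immediately. Returns 1 if the thread ended because
   it was cancelled, 0 if it finished normally, -1 on error. */
int kill_and_join(pthread_t t)
{
    void *result = NULL;

    if (pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &result) != 0)
        return -1;
    /* Only here would a runtime drop the thread from its own thread list. */
    return result == PTHREAD_CANCELED ? 1 : 0;
}

/* Starts a worker and kills it using the helper above. */
int demo_kill_and_join(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, spin_forever, NULL) != 0)
        return -1;
    return kill_and_join(t);
}
```

Joining before touching any bookkeeping keeps the thread's entry alive while cleanup callbacks may still run, which is the concern raised about removing it from the list too early.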

The problem with guarantees about thread cleanup still stands, though.

Regards,
Alex


Fawzi Mohamed

2012-05-18 12:46:27 UTC

Hi Alex,

I still think that you don't really understand the constraints that your
code has to follow to be able to call kill safely.

I had said that you can basically use only signal-safe functions after
calling kill.

Sean and Brad already pointed out some possible pitfalls (holding
resources such as locks, allocating GC memory, ...), but to avoid
restricting the rest of the program to signal-safe functions, the
killed thread must *not* be executing any signal-unsafe function.
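
To make the lock pitfall concrete, here is a minimal C sketch using raw pthreads rather than core.thread. The names demo_lock, lock_holder and kill_lock_holder are made up for illustration, and it assumes a POSIX system with default (non-robust) mutexes. A worker takes a mutex and is then cancelled while blocked in sleep(); nothing ever unlocks the mutex afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int lock_taken = 0;

/* Worker: takes the lock, then blocks in sleep(), a cancellation point. */
static void *lock_holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    atomic_store(&lock_taken, 1);
    for (;;)
        sleep(1);
    return NULL; /* not reached */
}

/* Cancels the worker while it owns demo_lock and returns what
   pthread_mutex_trylock reports afterwards (0 would mean the lock
   was free again). Returns -1 if the worker could not be started. */
int kill_lock_holder(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, lock_holder, NULL) != 0)
        return -1;
    while (!atomic_load(&lock_taken))
        sched_yield(); /* wait until the worker really holds the lock */
    pthread_cancel(t);
    pthread_join(t, NULL);
    return pthread_mutex_trylock(&demo_lock);
}
```

On typical implementations kill_lock_holder() reports EBUSY: the mutex stays owned by a thread that no longer exists, so every later lock attempt on it would block.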

A signal-unsafe function might be using resources within the kernel,
and those resources will also never be released.
Even if you are sure that your thread is not using any resource that
you care about later, unless you restrict yourself to signal-safe
functions you cannot know what resources might be used by the kernel.

As a real-life example, acquiring or even releasing(!) a lock is not
signal-safe. Indeed, on OSX, to optimize the usage of "fat" locks,
the OS tries to spinlock and uses a fat lock from a common pool only
when required.
When releasing the lock, the fat lock might go back to the pool.
Now the pool obviously needs to be guarded, and it uses a spinlock for
that.
Now imagine you suspend or kill a thread while the spinlock to the
common pool of fat locks is being held in the kernel...
Any subsequent lock/unlock on *any* lock that requires access to the
common pool will block.
OSX is perfectly standards-compliant in doing that optimization, because
those functions are indeed not signal-safe.
This is not theoretical: I actually had this problem, and it is now fixed
in druntime (at least my fixes to the tango runtime were included in
druntime).

A consequence of this is that in general it is *not* ok to kill a
thread that is idle and holds a lock (likely what you had in mind for
your thread in a known state).

As said, the only thread that can be safely killed is one that holds
no resources that you care about and that might remain uncollected, and
that executes no signal-unsafe functions.
Are there such cases? Yes, indeed: a purely computational thread that
allocates nothing might well fall into that category.
I am not sure your "managed code" thread does.
Normally you would probably be better off forking and having a separate
*process* to kill, or accepting cooperative killing, which might take
some time (depending on how often the thread checks whether it should stop).
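
Cooperative killing can be sketched in C roughly as follows. This is an illustration only: stop_requested, cooperative_worker and run_and_stop are hypothetical names, and the worker is a purely computational loop of the kind described above. The worker polls an atomic flag, so the delay before it stops is bounded by how often it checks.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested = false;
static atomic_long iterations = 0;

/* Purely computational worker: no locks, no allocation, and it checks
   the stop flag once per iteration. */
static void *cooperative_worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested))
        atomic_fetch_add(&iterations, 1);
    return NULL; /* normal return, so normal thread shutdown runs */
}

/* Starts the worker, waits until it has done some work, asks it to
   stop, and joins it. Returns 0 on a clean stop. */
int run_and_stop(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, cooperative_worker, NULL) != 0)
        return -1;
    while (atomic_load(&iterations) == 0)
        sched_yield();
    atomic_store(&stop_requested, true);
    return pthread_join(t, NULL);
}
```

Because the thread leaves through its own return path, it never dies in the middle of a signal-unsafe call; the price is that a thread which never checks the flag cannot be stopped this way.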

ciao
Fawzi

Artur Skawina

2012-05-18 13:49:30 UTC

Post by Fawzi Mohamed
Now imagine you suspend or kill a thread while the spinlock to the
common pool of fat locks is being held in the kernel...

If a thread can be killed while in kernel mode and all kernel-acquired
resources aren't freed, then it's a kernel bug. Period.
If some /userspace/ system library does that kind of "optimization",
that's a bit different, but it still at least means a proper safe API is
missing.

He could set up a timer to periodically check for a 'kill-me' flag, but
that probably means relying on the user code not to actively fight against
it (I don't know if that's an issue; as a "VM" was mentioned, maybe it isn't),
and if signals are allowed to interrupt the critical regions then that
alone won't be enough.

artur

Fawzi Mohamed

2012-05-18 14:43:41 UTC

On Fri, 18 May 2012 15:49:30 +0200

Post by Artur Skawina

Post by Fawzi Mohamed
Now imagine you suspend or kill a thread while the spinlock to the
common pool of fat locks is being held in the kernel...

I tend to agree in the case of the example I gave (and I had the problem
with suspending; I am not sure what happens if one kills the thread).
But I am not sure there is any standard, agreed-upon guarantee when killing
threads in non-signal-safe functions.
Collection of some resources is guaranteed only for process killing.

Steven Schveighoffer

2012-08-23 12:28:46 UTC