Subramanya Sastry
2009-07-25 18:26:20 UTC
I have been trying to lay out clearly all the dynamic features of Ruby so
that I understand them better. Please add to or correct this if I have
understood anything incorrectly.
1. Open classes: This is the well-known case where you can modify classes,
add methods, and redefine methods at runtime. This effectively means that
method calls cannot be resolved at compile time. The optimization for
improved performance is to optimistically assume closed classes but then
have solid mechanisms to back out in some way (either compile-time guards or
run-time invalidation detection and deoptimization).
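For illustration, here is a minimal example of open classes in standard Ruby
(the class name is mine, not from the discussion above):

```ruby
class Greeter
  def hello
    "hi"
  end
end

g = Greeter.new
g.hello   # => "hi"

# Reopen the class and redefine the method at runtime;
# existing instances see the new definition immediately,
# which is why a compiler's early binding must be invalidatable.
class Greeter
  def hello
    "hello, redefined"
  end
end

g.hello   # => "hello, redefined"
```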
2. Duck typing: This is also the well-known case where method arguments need
not have fixed types as long as the argument objects can respond
to a message (method call) and meet the message contract at the time of the
invocation (this could include meeting the contract via dynamic code
generation in method_missing). This means that you cannot statically bind
method names to implementations. The optimization for improved performance
is to rely on profiling and inline caching.
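A small sketch of both halves of this point, including the method_missing
case (class and method names are mine):

```ruby
# Any object that responds to #quack satisfies the contract;
# no static type is required.
def make_it_quack(duck)
  duck.quack
end

class Duck
  def quack
    "quack"
  end
end

# This class has no #quack method at all; it synthesizes the
# behavior on demand via method_missing, so the contract is
# only met at the moment of invocation.
class Robot
  def method_missing(name, *args)
    name == :quack ? "beep" : super
  end

  def respond_to_missing?(name, include_private = false)
    name == :quack || super
  end
end

make_it_quack(Duck.new)   # => "quack"
make_it_quack(Robot.new)  # => "beep"
```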
3. Closures: This is where you can create code blocks, store them, pass them
around, and invoke them. Supporting this requires allocating heap frames
that capture the environment and keep it around for later. Optimizations
for improved performance include (a) lazy frame allocation
(on demand, on call paths where closures are encountered), (b) only allocating
frame space for variables that might be accessed later (in some cases, this
means all variables), (c) inlining the target method and the closure and
eliminating the closure altogether [using a technique from one of my earlier IR
emails], and (d) special-case optimizations like the cases Charlie and Yehuda
have identified.
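The heap-frame requirement can be seen in a few lines (a standard
counter example, not from the discussion above):

```ruby
# The lambda captures counter's local variable `count`. That
# variable must outlive counter's stack frame, so it has to be
# kept on the heap for the closure to read and write later.
def counter
  count = 0
  lambda { count += 1 }
end

c = counter
c.call   # => 1
c.call   # => 2
```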
4. Dynamic dispatch: This is where you use "send" to send method messages.
You can get improved performance via profiling and inline-caching techniques.
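A minimal example of such a call site (the helper name is mine): the message
is a runtime value, so the compiler cannot bind the call statically and must
fall back to lookup plus inline caching keyed on the receiver's class.

```ruby
# The method name arrives as data; resolution happens at call time.
def apply_msg(obj, msg, *args)
  obj.send(msg, *args)
end

apply_msg("hello", :upcase)   # => "HELLO"
apply_msg([3, 1, 2], :sort)   # => [1, 2, 3]
```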
5. Dynamic code gen: This covers the various forms of eval. It means that
eval calls are hard boundaries for optimization, since they can modify the
execution context of the currently executing code. There is no clear way I
can think of at this time to get around the performance penalties
associated with it. But, I can imagine special-case optimizations, including
analyzing the target string where it is known and where the binding
context is local.
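To make the "modifies the execution context" point concrete, a small
example (standard Ruby behavior; eval without an explicit binding runs in
the caller's binding and can mutate existing locals):

```ruby
def demo
  b = 2
  eval("b = b + 2")   # mutates the existing local b in this frame
  b
end

demo   # => 4
```

This is exactly why a compiler cannot keep `b` in a register across the
eval call: the evaluated string may read or write it.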
6. Dynamic/late binding: This is where the execution context comes from an
explicit binding argument (a proc, binding, or closure). This is something I was
not aware of till recently.
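A short example of an explicit binding (the method name is mine): a Binding
object captures a frame's execution context, and eval with that binding runs
code in the captured context rather than the caller's.

```ruby
def make_context
  x = 42
  binding   # capture this frame's execution context
end

ctx = make_context
eval("x", ctx)        # => 42, reads the captured local
eval("x += 1", ctx)   # mutates the captured local
eval("x", ctx)        # => 43
```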
Many performance problems and optimization barriers come about because of a
combination of these techniques.
Consider this code snippet:
-----
def foo(m, expr)
  a = 1
  b = 2
  m.send(m, expr)
  puts "b is #{b}"
end
foo("puts", "b=a+3") # outputs b=a+3\n b is 2
foo("eval", "b=a+3") # outputs b is 4
-----
This code snippet combines dynamic dispatch and dynamic code gen (send +
eval). The net effect is that any send whose target cannot be
determined at compile time becomes a hard optimization barrier, just like
eval. Before the send you have to dump all live variables to the stack/heap,
and after the send you have to restore them from the stack/heap. In
addition, you also have to pick up any additional variables that the eval
might have created on the stack/heap.
One way around this is to use different code paths based on checking whether
the send target is eval or not.
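A hypothetical sketch of that split (the helper name and structure are mine,
not a real JRuby mechanism):

```ruby
# Split dispatch on whether the target message is :eval. In a real
# compiler the two branches would compile differently; here the
# difference is only described in comments.
def guarded_send(obj, msg, *args)
  if msg.to_sym == :eval
    # Slow path: live locals would have to be flushed to the heap
    # around this call, since the eval may read or write them.
    obj.send(msg, *args)
  else
    # Fast path: locals could stay in registers across the call.
    obj.send(msg, *args)
  end
end

guarded_send("hello", :upcase)   # => "HELLO"
```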
Now, consider this code snippet:
------
def foo(n, x)
  proc do
    n + 1
  end
end

def bar(i)
  proc do
    t = foo(i, "hello")
    send("eval", "puts x, n", t)
  end
end
delayed_eval_procs = (1..10).collect { |i| bar(i) }
... go round the world, do things, and come back ...
delayed_eval_procs.each { |p| p.call }
------
This is a contrived example, but it basically means you have to keep
frames around for a long time, until they are GCed. In this case
delayed_eval_procs keeps a live reference to the 20 frames created by foo
and bar.
While the examples here are contrived, since there is no way to "ban" them
from Ruby, the compilation strategies have to be robust enough to be
correct in their presence.
I haven't investigated aliasing yet ... but I suspect it introduces further
challenges.
Subbu.