Charles Oliver Nutter
2009-09-21 16:37:23 UTC
Ok, we've been batting this around a bit and I wanted to write up some
doco on it. We're going to try to formalize Java dispatch for 1.4, so
there's an official definition of how it's supposed to work. This
should also finally address problematic dispatch cases like varargs
and numeric overloads.
*** There are some breaking changes here, so I recommend anyone using
Java integration read this through...
The basic idea is to make Java dispatch more Rubyish by adding some
duck-typing methods.
Java to Java dispatch follows rules specified in the Java Language
Specification, section 15.12.2. Dispatch proceeds through three phases,
each of which just adds more applicable methods to the candidate set:
first with no boxing or varargs, then allowing boxing/unboxing
conversions, and finally allowing varargs methods. Our dispatch
will follow the same logic, with a twist: any given Ruby type could
potentially satisfy multiple target argument types (e.g. a Fixnum
coercing into either a Long or an Integer parameter).
Introducing 'coerce_to?'
In order to reconcile this twist, all coercible classes will implement
a coerce_to? method that takes a target Java type and returns a
numeric score for the coercion to that type. The score's scale has not
been decided, but for discussion purposes we'll consider 0-10 where 0
means "I can't coerce to that" and 10 means "that is my ideal coercion
target". In the case of Fixnum, the "ideal" target would be to do no
coercion at all and pass RubyFixnum as-is to the target method. This
would allow calling APIs that accept JRuby types directly, which is
not possible now. It would also change calling Object targets to pass
RubyFixnum as well; this is a potentially breaking change, so we
should discuss it. Fixnum also has other targets that are "natural"
conversions like int or Long. Additional coercion targets can be added
to any type's coerce_to? method, as long as you also add logic to
another method: to_java.
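To make the scoring idea concrete, here is a rough sketch in plain Ruby. Everything in it is an assumption for discussion: SketchFixnum is a hypothetical stand-in class, symbols stand in for Java types, and the particular scores are made up on the 0-10 discussion scale.

```ruby
# Hypothetical sketch only: symbols stand in for Java types, and the
# scores below are illustrative, not a decided scale.
class SketchFixnum
  SCORES = {
    ruby_fixnum: 10, # pass through unchanged: the "ideal" target
    long: 8,         # a "natural" conversion
    int: 6           # also natural, assumed slightly less preferred
  }.freeze

  def initialize(value)
    @value = value
  end

  # Returns 0 when this type cannot be coerced to the target at all.
  def coerce_to?(target)
    SCORES.fetch(target, 0)
  end
end

n = SketchFixnum.new(123)
n.coerce_to?(:ruby_fixnum) # => 10
n.coerce_to?(:long)        # => 8
n.coerce_to?(:string)      # => 0
```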
The to_java method performs the coercions supported by coerce_to?.
Again in the case of Fixnum, once a "best" target method has been
selected, the arguments are coerced via to_java, passing in the target
Java type desired. The resulting arguments are then passed to the
target Java method. By having this pair of coerce_to? and to_java on
every coercible type, any target method can be called. Because to_java
will be required to support coercion of any type that coerce_to?
accepts, you can now also pre-coerce objects to avoid re-coercing on
every call. But this also means that many cases will no longer
auto-coerce back to Ruby types, so you can expect them to stay
coerced. And this is another area we need to discuss.
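Here is one way the select-then-coerce flow could look, again as a plain-Ruby sketch under assumptions. SketchInteger and select_signature are hypothetical names, symbols stand in for Java types, and summing the per-argument scores is just one possible way to rank overloads; the real ranking rule is not decided.

```ruby
# Hypothetical sketch only; not the proposed implementation.
class SketchInteger
  SCORES = { ruby_fixnum: 10, long: 8, int: 6 }.freeze

  def initialize(value)
    @value = value
  end

  def coerce_to?(target)
    SCORES.fetch(target, 0)
  end

  # Must support every target that coerce_to? scores above 0.
  def to_java(target)
    raise TypeError, "cannot coerce to #{target}" if coerce_to?(target).zero?
    [target, @value] # stand-in for the converted Java value
  end
end

# Picks the signature with the highest total score. A signature is
# rejected if any argument scores 0 for its parameter type.
def select_signature(signatures, args)
  candidates = signatures.select { |sig| sig.size == args.size }
  scored = candidates.map do |sig|
    scores = args.zip(sig).map { |arg, type| arg.coerce_to?(type) }
    [sig, scores.include?(0) ? 0 : scores.sum]
  end
  best_sig, best_score = scored.max_by { |_, score| score }
  raise ArgumentError, "no applicable method" if best_sig.nil? || best_score.zero?
  best_sig
end

overloads = [[:int], [:long]] # stands in for go(int) and go(long)
args = [SketchInteger.new(123)]
sig = select_signature(overloads, args)                    # => [:long]
coerced = args.zip(sig).map { |arg, type| arg.to_java(type) }
```

Because to_java accepts the same targets that coerce_to? scores, the same call could be made ahead of time to pre-coerce an argument.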
Types from Java will not auto-coerce
In order to allow users to pre-coerce objects and to defer coercing
back to Ruby types, most return values will not coerce automatically
anymore. Values coming back as int, Integer, Long, etc. will remain as
their Java types. Calling to_int or to_i or similar Ruby coercion
methods will coerce them back into Fixnum or Float, but it won't
happen automatically. Since most APIs that require a Fixnum attempt to
do a conversion already, either through to_i or to_int, this should
allow most APIs to continue to work just fine. The reason for ending
this auto-coercion is simple: it allows you to control that coercion
completely. In the case of Java Strings, we currently auto-coerce both
ways, making any Java String-returning APIs very expensive since they
must be coerced from a char[]-based String to a byte[]-based Ruby
String. With the auto-coercion change, they would remain a Java String
until you choose to coerce or an API you call calls to_str.
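The deferred conversion can be illustrated in plain Ruby. SketchJavaString below is a hypothetical stand-in for a wrapped Java String (it is not a JRuby class); the point is only that nothing is converted until some Ruby API asks for it via to_str.

```ruby
# Hypothetical stand-in for a Java String that has entered Ruby.
class SketchJavaString
  attr_reader :conversions

  def initialize(chars)
    @chars = chars   # stand-in for the char[]-based Java String
    @conversions = 0 # counts how often we paid for a conversion
  end

  # The conversion to a Ruby String only happens when requested.
  def to_str
    @conversions += 1
    @chars.join
  end
end

js = SketchJavaString.new(%w[h i])
# No conversion has happened yet; js.conversions is still 0 here.
greeting = "say: " + js # String#+ calls to_str implicitly
greeting                # => "say: hi"
js.conversions          # => 1
```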
Now ideally you'd still be able to specify auto-coercions for Java
types re-entering Ruby, and so we're considering a "to_ruby" method
that, when present, is called to do the conversion, e.g. of a Java
String to a Ruby String. But it's questionable whether this should be
allowable globally...I haven't decided whether it's worth the impact,
since people will probably write most APIs expecting coercion or no
coercion, and allowing it to be overridden will almost always break
one or the other.
Wrapped Java types also follow rules
To keep coercion/dispatch uniform, Java types that have entered Ruby
will also implement coerce_to? and to_java using their real type. So a
pre-coerced Java String will have as its ideal coercion target Java
String (just passing as is) and as less-desirable targets the
supertypes of java.lang.String. This uniformity will also enable
another possibility: defining coercions for *Java* types if you know
there's a natural coercion target. For example, you might add coercion
logic to all java.util.List instances to coerce to Object[], something
Java normally does not do for you.
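A plain-Ruby sketch of how wrapped types might score themselves follows. All class names are hypothetical, the Ruby class hierarchy stands in for the Java one, and "10 minus the distance up the hierarchy" is just an assumed scoring rule for illustration.

```ruby
# Hypothetical sketch only: Ruby classes stand in for Java types.
module SketchWrapped
  # The real type scores 10; each step up the hierarchy scores one less.
  def coerce_to?(target)
    distance = self.class.ancestors.index(target)
    return 0 if distance.nil?
    [10 - distance, 1].max
  end

  # A wrapped object is already a Java value, so it passes as-is.
  def to_java(_target)
    self
  end
end

class SketchObject
  include SketchWrapped
end

class SketchCharSequence < SketchObject; end
class SketchString < SketchCharSequence; end

# A user-added coercion: lists may also be passed where Object[] is wanted.
class SketchList < SketchObject
  def initialize(items)
    @items = items
  end

  def coerce_to?(target)
    target == :object_array ? 5 : super
  end

  def to_java(target)
    target == :object_array ? @items.dup : super
  end
end

s = SketchString.new
s.coerce_to?(SketchString)       # => 10
s.coerce_to?(SketchCharSequence) # => 9
s.coerce_to?(SketchObject)       # => 8
SketchList.new([1, 2]).to_java(:object_array) # => [1, 2]
```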
Performance
There will be a few additions to improve Ruby to Java performance as
well. The lack of auto-coercion is one, since String coercions are
very costly. But we will also cache the Java method selected for
dispatch more aggressively, and for core classes we'll avoid doing
the extra dispatching when it's not needed (i.e. use the default
behavior directly when it hasn't been overridden).
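As a rough illustration of the caching idea, here is a plain-Ruby sketch. SketchDispatchCache is a hypothetical name, and keying the cache on the method name plus the argument classes is my assumption; it would only be valid if coercion scores depend on an argument's class rather than its value.

```ruby
# Hypothetical sketch only: remembers which overload was selected for a
# given method name and list of argument classes.
class SketchDispatchCache
  attr_reader :misses

  # The block performs the (expensive) overload selection.
  def initialize(&selector)
    @selector = selector
    @cache = {}
    @misses = 0
  end

  def lookup(name, args)
    key = [name, args.map(&:class)]
    @cache.fetch(key) do
      @misses += 1
      @cache[key] = @selector.call(name, args)
    end
  end
end

cache = SketchDispatchCache.new do |name, args|
  # Stand-in for the real selection logic.
  [name, args.map(&:class)]
end

cache.lookup(:go, [1])
cache.lookup(:go, [2])   # same argument classes: served from the cache
cache.misses             # => 1
cache.lookup(:go, [1.5]) # different argument class: selected again
cache.misses             # => 2
```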
Other improvements
In order to make reflective access to Java methods easier, we will
also add two methods to all Java objects: java_send and java_method.
java_send will be like a normal Ruby send, but it will accept an array
of the exact target method argument types you wish to invoke. This
will basically skip the normal method selection logic and go straight
to a specific one, so you can choose the "int" version of a method if
we normally would try to call the "long" version. And to avoid
repeatedly doing the search for a specific signature, java_method will
allow you to get a reference directly to a specific Java method
overload. Using java_send and java_method should allow access to any
Java method, something that's not always possible right now.
Example:
// Java
public class Overloads {
  public void go(int i) { }
  public void go(long i) { }
}
# Ruby
# calls the "long" version, since it's the most natural conversion for Fixnum
Overloads.new.go(123)
# calls the "int" version explicitly
Overloads.new.java_send(:go, [Java::int], 123)
# calls the "int" version explicitly, with less overhead than java_send
method = Overloads.new.java_method(:go, [Java::int])
method.call(123)
Other coercions based on Ruby conventions
I'm also kicking around the idea that to_ary and friends could be used
by the default coerce_to? and to_java to allow more Ruby types to
automatically enlist in coercions. For example, any type that supports
"to_ary" could additionally coerce to a java.util.List or Object[], by
using the Ruby coercion as an intermediate. This would allow Java
dispatch to also fit into standard Ruby coercion rules very cleanly.
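A default implementation along these lines might look like the following plain-Ruby sketch. The module and class names are hypothetical, symbols stand in for java.util.List and Object[], and the score of 5 is arbitrary.

```ruby
# Hypothetical sketch only: a default coerce_to?/to_java pair that
# enlists any object responding to to_ary.
module SketchDefaultCoercion
  LIST_LIKE_TARGETS = %i[list object_array].freeze

  def coerce_to?(target)
    LIST_LIKE_TARGETS.include?(target) && respond_to?(:to_ary) ? 5 : 0
  end

  def to_java(target)
    raise TypeError, "cannot coerce to #{target}" if coerce_to?(target).zero?
    to_ary.dup # stand-in for building a java.util.List or Object[]
  end
end

# Supports to_ary, so it picks up the list-like coercions for free.
class SketchPair
  include SketchDefaultCoercion

  def initialize(first, second)
    @first = first
    @second = second
  end

  def to_ary
    [@first, @second]
  end
end

# No to_ary, so no list-like coercion is offered.
class SketchScalar
  include SketchDefaultCoercion
end

SketchPair.new(1, 2).coerce_to?(:list)       # => 5
SketchPair.new(1, 2).to_java(:object_array)  # => [1, 2]
SketchScalar.new.coerce_to?(:list)           # => 0
```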
So that's about it...thoughts?
- Charlie
---------------------------------------------------------------------
To unsubscribe from this list, please visit:
http://xircles.codehaus.org/manage_email