[jruby-dev] Proposal for new Java dispatch rules

Discussion:

Charles Oliver Nutter

2009-09-21 16:37:23 UTC

Ok, we've been batting this around a bit and I wanted to write up some
doco on it. We're going to try to formalize Java dispatch for 1.4, so
there's an official definition of how it's supposed to work. This
should also finally address problematic dispatch cases like varargs
and numeric overloads.

*** There are some breaking changes here, so I recommend anyone using
Java integration read this through...

The basic idea is to make Java dispatch more Rubyish by adding some
duck-typing methods.

Java to Java dispatch follows rules specified in the Java Language
Specification, section 15.12.2. Dispatch proceeds through three phases,
each of which adds more applicable methods to consider: first strict
matching without boxing or varargs, then boxing/unboxing conversions,
and then varargs methods. Our dispatch will follow the same logic, with
a twist: any given Ruby type could potentially satisfy multiple target
argument types (e.g. a Fixnum coercing into either a Long or an Integer
call).
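
A rough sketch of the three JLS phases as a widening search, in plain
Ruby. This is only an illustration under assumptions: Java types are
modeled as symbols, and Candidate, WIDENING, BOXING and
first_applicable_phase are hypothetical names, not JRuby or JLS APIs.
It ignores reference subtyping and most-specific-method selection, and
it only considers varargs methods in the third phase.

```ruby
# Hypothetical model of an overload: name, parameter types, varargs flag.
# For a varargs method the last entry in params is the element type.
Candidate = Struct.new(:name, :params, :varargs)

# Assumed conversion tables, heavily abbreviated for illustration.
WIDENING = { int: [:long] }.freeze
BOXING = { int: :Integer, long: :Long, Integer: :int, Long: :long }.freeze

# Phase 1 rule: identical type or primitive widening.
STRICT = ->(param, arg) { param == arg || WIDENING.fetch(arg, []).include?(param) }
# Phase 2 rule: phase 1 plus boxing/unboxing.
LOOSE = ->(param, arg) { STRICT.call(param, arg) || BOXING[arg] == param }

def fixed_match?(cand, args, rule)
  !cand.varargs && cand.params.size == args.size &&
    cand.params.zip(args).all? { |param, arg| rule.call(param, arg) }
end

def varargs_match?(cand, args)
  return false unless cand.varargs
  *fixed, elem = cand.params
  return false if args.size < fixed.size
  fixed.zip(args).all? { |param, arg| LOOSE.call(param, arg) } &&
    args.drop(fixed.size).all? { |arg| LOOSE.call(elem, arg) }
end

# Returns [phase_number, applicable_candidates] for the first phase
# that finds anything, or [nil, []] when nothing is applicable.
def first_applicable_phase(candidates, args)
  phases = [
    ->(c) { fixed_match?(c, args, STRICT) },
    ->(c) { fixed_match?(c, args, LOOSE) },
    ->(c) { varargs_match?(c, args) }
  ]
  phases.each_with_index do |applies, i|
    found = candidates.select { |c| applies.call(c) }
    return [i + 1, found] unless found.empty?
  end
  [nil, []]
end
```

Later phases are only consulted when earlier ones find nothing, which
is why a go(long) overload wins over go(Integer) for an int argument.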

Introducing 'coerce_to?'

In order to reconcile this twist, all coercible classes will implement
a coerce_to? method that takes a target Java type and returns a
numeric score for the coercion to that type. The score's scale has not
been decided, but for discussion purposes we'll consider 0-10 where 0
means "I can't coerce to that" and 10 means "that is my ideal coercion
target". In the case of Fixnum, the "ideal" target would be to do no
coercion at all and pass RubyFixnum as-is to the target method. This
would allow calling APIs that accept JRuby types directly, which is
not possible now. It would also change calling Object targets to pass
RubyFixnum as well; this is a potentially breaking change, so we
should discuss it. Fixnum also has other targets that are "natural"
conversions like int or Long. Additional coercion targets can be added
to any type's coerce_to? method, as long as you also add logic to
another method: to_java.
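
To make the scoring idea concrete, here is a minimal sketch in plain
Ruby. Everything in it is an assumption for discussion: SketchFixnum
stands in for Fixnum, Java types are symbols, the individual scores are
invented (only 0 and 10 have a stated meaning above), and summing the
per-argument scores is just one possible way to rank overloads.

```ruby
# Stand-in for a coercible Ruby type; the scores are made up.
class SketchFixnum
  SCORES = { RubyFixnum: 10, long: 8, Long: 7, int: 5, Integer: 4 }.freeze

  # Returns 0 for "can't coerce", up to 10 for the ideal target.
  def coerce_to?(target)
    SCORES.fetch(target, 0)
  end
end

# Picks the overload (a list of parameter types) with the best total
# score, skipping any overload that has an argument scoring 0.
def best_overload(overloads, args)
  scored = overloads.map do |params|
    next [params, 0] unless params.size == args.size
    scores = args.zip(params).map { |arg, type| arg.coerce_to?(type) }
    [params, scores.include?(0) ? 0 : scores.sum]
  end
  best = scored.reject { |_, score| score.zero? }.max_by { |_, score| score }
  best && best.first
end
```

With these invented scores, best_overload([[:int], [:long]],
[SketchFixnum.new]) selects the long overload, and an overload taking
an unsupported type is never selected.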

The to_java method performs the coercions supported by coerce_to?.
Again in the case of Fixnum, once a "best" target method has been
selected, the arguments are coerced via to_java, passing in the target
Java type desired. The resulting arguments are then passed to the
target Java method. By having this pair of coerce_to? and to_java on
every coercible type, any target method can be called. Because to_java
will be required to support coercion of any type that coerce_to?
accepts, you can now also pre-coerce objects to avoid re-coercing on
every call. But this also means that many cases will no longer
auto-coerce back to Ruby types, so you can expect them to stay
coerced. And this is another area we need to discuss.
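
A small sketch of the coerce_to?/to_java pairing and of pre-coercion,
again in plain Ruby under assumptions: JavaValue and SketchInt are
hypothetical stand-ins, Java types are symbols, and the scores and
range checks are invented for illustration.

```ruby
# Stand-in for a value that has already been coerced to a Java type.
JavaValue = Struct.new(:type, :value) do
  # A pre-coerced value's ideal (and here only) target is its own type.
  def coerce_to?(target)
    target == type ? 10 : 0
  end

  # No work is needed: the value is passed through as-is.
  def to_java(target)
    raise TypeError, "cannot coerce #{type} to #{target}" if coerce_to?(target).zero?
    self
  end
end

# Stand-in for a Ruby integer that can become a Java int or long.
class SketchInt
  SCORES = { long: 8, int: 5 }.freeze
  RANGES = { long: -(2**63)...(2**63), int: -(2**31)...(2**31) }.freeze

  def initialize(value)
    @value = value
  end

  def coerce_to?(target)
    SCORES.key?(target) && RANGES[target].cover?(@value) ? SCORES[target] : 0
  end

  # Must succeed for every target that coerce_to? scores above 0.
  def to_java(target)
    raise TypeError, "cannot coerce to #{target}" if coerce_to?(target).zero?
    JavaValue.new(target, @value)
  end
end
```

Calling SketchInt.new(123).to_java(:int) once and reusing the result
models pre-coercion: later to_java(:int) calls on that result return
the same object without converting again.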

Types from Java will not auto-coerce

In order to allow users to pre-coerce objects and to defer coercing
back to Ruby types, most return values will not coerce automatically
anymore. Values coming back as int or Integer or Long, etc., will remain as
their Java types. Calling to_int or to_i or similar Ruby coercion
methods will coerce them back into Fixnum or Float, but it won't
happen automatically. Since most APIs that require a Fixnum attempt to
do a conversion already, either through to_i or to_int, this should
allow most APIs to continue to work just fine. The reason for ending
this auto-coercion is simple: it allows you to control that coercion
completely. In the case of Java Strings, we currently auto-coerce both
ways, making any Java String-returning APIs very expensive since they
must be coerced from a char[]-based String to a byte[]-based Ruby
String. With the auto-coercion change, they would remain a Java String
until you choose to coerce or an API you call calls to_str.
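
As a sketch of what "remain a Java String until something calls to_str"
could look like, here is a plain-Ruby stand-in. SketchJavaString is a
hypothetical class, not a JRuby proxy; it only counts how often a
conversion to a Ruby String actually happens.

```ruby
# Stand-in for a wrapped java.lang.String that converts lazily.
class SketchJavaString
  attr_reader :conversions

  def initialize(chars)
    @chars = chars      # stands in for the Java string's contents
    @conversions = 0
  end

  # Answered directly, without building a Ruby String.
  def length
    @chars.length
  end

  # Explicit conversion; this is where the coercion cost is paid.
  def to_s
    @conversions += 1
    @chars.dup
  end

  # Implicit conversion hook used by core methods such as String#+.
  alias to_str to_s
end
```

With js = SketchJavaString.new("world"), calling js.length performs no
conversion, while "Hello, " + js triggers exactly one via to_str.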

Now ideally you'd still be able to specify auto-coercions for Java
types re-entering Ruby, and so we're considering a "to_ruby" method
that, when present, is called to do the conversion, e.g. of a Java
String to a Ruby String. But it's questionable whether this should be
allowable globally...I haven't decided whether it's worth the impact,
since people will probably write most APIs expecting coercion or no
coercion, and allowing it to be overridden will almost always break
one or the other.

Wrapped Java types also follow rules

To keep coercion/dispatch uniform, Java types that have entered Ruby
will also implement coerce_to? and to_java using their real type. So a
pre-coerced Java String will have as its ideal coercion target Java
String (just passing as is) and as less-desirable targets the
supertypes of java.lang.String. This uniformity will also enable
another possibility: defining coercions for *Java* types if you know
there's a natural coercion target. For example, you might add coercion
logic to all java.util.List instances to coerce to Object[], something
Java normally does not do for you.
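
One way to picture this, sketched in plain Ruby with the Ruby class
hierarchy standing in for the Java one. SketchProxy, the "10 minus
distance" scoring, and the extra coercion table are all assumptions
made for illustration, not part of the proposal.

```ruby
# Stand-in for a wrapped Java object. Its own class scores 10,
# supertypes score progressively lower, unrelated types score 0.
class SketchProxy
  # extra maps a target to [score, converter]; purely illustrative.
  def initialize(obj, extra = {})
    @obj = obj
    @extra = extra
  end

  def coerce_to?(target)
    depth = @obj.class.ancestors.index(target)
    return [10 - depth, 1].max if depth
    @extra.key?(target) ? @extra[target].first : 0
  end

  def to_java(target)
    # Real type or a supertype: pass the wrapped object as-is.
    return @obj if @obj.class.ancestors.include?(target)
    raise TypeError, "no coercion to #{target}" unless @extra.key?(target)
    @extra[target].last.call(@obj)
  end
end
```

A user-added coercion, such as a list-like object offering itself as
an :object_array target, is modeled by passing an entry in extra.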

Performance

There will be a few additions to improve Ruby to Java performance as
well. The lack of auto-coercion is one, since String coercions are
very costly. But we will also have more aggressive caching of the Java
method selected for dispatch and in core class cases we'll avoid doing
the extra dispatching when it's not needed (i.e. use default behavior
directly when it hasn't been overridden).
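
A minimal sketch of the caching idea in plain Ruby, assuming the cache
is keyed by the classes of the incoming arguments. SketchCallSite is a
hypothetical name, and the sketch does not handle invalidation when
coerce_to? is redefined, which a real implementation would need.

```ruby
# Caches the selected target per tuple of argument classes so that the
# (expensive) selection block runs once per distinct tuple.
class SketchCallSite
  attr_reader :misses

  def initialize(&selector)
    @selector = selector
    @cache = {}
    @misses = 0
  end

  def target_for(args)
    key = args.map(&:class)
    @cache.fetch(key) do
      @misses += 1
      @cache[key] = @selector.call(args)
    end
  end
end
```

Repeated calls with the same argument classes hit the cache; only a
new combination of classes runs the selector again.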

Other improvements

In order to make reflective access to Java methods easier, we will
also add two methods to all Java objects: java_send and java_method.
java_send will be like a normal Ruby send, but it will accept an array
of the exact target method argument types you wish to invoke. This
will basically skip the normal method selection logic and go straight
to a specific one, so you can choose the "int" version of a method if
we normally would try to call the "long" version. And to avoid
repeatedly doing the search for a specific signature, java_method will
allow you to get a reference directly to a specific Java method
overload. Using java_send and java_method should allow access to any
Java method, something that's not always possible right now.

Example:

// Java
public class Overloads {
  public void go(int i) { }
  public void go(long i) { }
}

# Ruby
# calls the 'long' version, since it's the most natural conversion for Fixnum
Overloads.new.go(123)

# calls the "int" version explicitly
Overloads.new.java_send(:go, [Java::int], 123)

# calls the "int" version explicitly, with less overhead than java_send
method = Overloads.new.java_method(:go, [Java::int])
method.call(123)

Other coercions based on Ruby conventions

I'm also kicking around the idea that to_ary and friends could be used
by the default coerce_to? and to_java to allow more Ruby types to
automatically enlist in coercions. For example, any type that supports
"to_ary" could additionally coerce to a java.util.List or Object[], by
using the Ruby coercion as an intermediate. This would allow Java
dispatch to also fit into standard Ruby coercion rules very cleanly.
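
Sketched in plain Ruby, a default implementation along these lines
might look like the following. The module name, the target symbols and
the score of 5 are assumptions for illustration; the frozen copy is
only a stand-in for building a real java.util.List or Object[].

```ruby
# Hypothetical default coercion mixed into any type: if the object
# responds to to_ary, it offers itself for array-like Java targets.
module SketchDefaultCoercion
  ARRAY_TARGETS = %i[java_util_List object_array].freeze

  def coerce_to?(target)
    respond_to?(:to_ary) && ARRAY_TARGETS.include?(target) ? 5 : 0
  end

  def to_java(target)
    raise TypeError, "cannot coerce to #{target}" if coerce_to?(target).zero?
    # Use the Ruby coercion as the intermediate step.
    to_ary.dup.freeze
  end
end

# A user type that only knows the Ruby convention.
class Pair
  include SketchDefaultCoercion

  def initialize(left, right)
    @left = left
    @right = right
  end

  def to_ary
    [@left, @right]
  end
end
```

Pair never mentions Java, yet Pair.new(1, 2).to_java(:object_array)
works because the default goes through to_ary.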

So that's about it...thoughts?

- Charlie

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Logan Barnett

2009-09-21 20:28:33 UTC

Post by Charles Oliver Nutter
Types from Java will not auto-coerce

This will prevent Monkeybars apps from upgrading without some big
changes. Since a Monkeybars app gets its own standalone JRuby, I'm not
too worried about changes that aren't backwards compatible. I'm sure
this will be similar for any app that must integrate heavily with Java.

Is there a way we can have a smart proxy vs. a dumb proxy? With the
talks I've done people seem mostly interested in how trivial it is to
bind to Java in JRuby. While having to do to_ruby or something similar
on any Java proxy isn't a ton of work, it's no longer as simple as
just dropping in the Java object and running with it.

Everything else seems great!

Thomas E Enebo

2009-09-21 20:42:48 UTC

Post by Charles Oliver Nutter
Types from Java will not auto-coerce

I think this is the most contentious aspect and we are meeting
tomorrow face-to-face to try and hammer out what will break and what
we can do to mitigate any breakage. Worst case, we just put the
contentious stuff behind a flag as a way to let people test against what
will come in the future (and not release it for 1.4). If we do a few
mitigating things, the changes may cause little or no breakage. We need
to cover more test cases.

If you can give us any test cases that you think this will break, it
will be helpful in our discussions tomorrow...

Post by Logan Barnett
Is there a way we can have a smart proxy vs. a dumb proxy? With the talks
I've done people seem mostly interested in how trivial it is to bind to Java
in JRuby. While having to do to_ruby or something similar on any Java proxy
isn't a ton of work, it's no longer as simple as just dropping in the Java
object and running with it.

We will brainstorm on this one a bit, but if we can make all Ruby core
types aware of common "it should work" Java types then we may not have
an actual need for explicit coercion. For example, if
RubyFixnum.op_plus(...) accepts a version which works with
java.lang.Number (and subtypes) then things will work for + out of the
box and not break older code.
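
As a loose analogy only: plain Ruby already has a hook that lets core
numeric operators accept foreign number-like objects, namely coerce.
The sketch below uses it with a hypothetical SketchJavaNumber class
standing in for a wrapped java.lang.Number. It is not the Java-side
op_plus overload described above, just a way to see the "it should
work out of the box" behavior.

```ruby
# Stand-in for a wrapped java.lang.Number that has not been coerced.
class SketchJavaNumber
  def initialize(value)
    @value = value
  end

  # Called by Integer#+ and Float#+ when the right-hand side is not a
  # core numeric; returns a pair the operator can then add.
  def coerce(other)
    [other, @value]
  end

  # Explicit conversion for callers that want a Ruby integer.
  def to_i
    @value.to_i
  end
end
```

With this in place, 1 + SketchJavaNumber.new(2) works without the
caller converting anything explicitly.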

It is possible to load a Java class as smart versus dumb as a solution
too. It could work. Is it a good idea? Is it confusing to have
different semantics mixed together (like two Java objects, one
dumb and the other smart)? ...

-Tom

--
blog: http://blog.enebo.com twitter: tom_enebo
mail: tom.enebo-***@public.gmane.org

Logan Barnett

2009-09-22 02:53:29 UTC

Post by Thomas E Enebo
We will brainstorm on this one a bit, but if we can make all Ruby core
types aware of common "it should work" Java types then we may not have
an actual need for explicit coercion. For example, if
RubyFixnum.op_plus(...) accepts a version which works with
java.lang.Number (and subtypes) then things will work for + out of the
box and not break older code.

How does everyone feel about leveraging some more duck-typing in Ruby?
Maybe this was mentioned already, but I'll just repeat it if that's
the case:

I often have code that could be simplified to something like this:
yell_text_field.text = yell_text_field.text.upcase

I really don't care that it's a String, just that it behaves like a
String (:
I'm not sure if that's a lot of work or not, but if that can be
achieved then it really doesn't matter what's being passed around.
Maybe these objects would also respond to kind_of? in a cheating
manner (Java's String returns true when asked if kind_of? a Ruby
String), but I'm not convinced that's a super great idea.

Charles Oliver Nutter

2009-09-22 12:49:51 UTC

Post by Logan Barnett
How does everyone feel about leveraging some more duck-typing in Ruby? [...]
yell_text_field.text = yell_text_field.text.upcase
I really don't care that it's a String, just that it behaves like a String

In the current case, of course, yell_text_field.text auto-coerces back
to a Ruby String, which can then upcase and be re-coerced back to a
Java string for the assignment. The trouble there is that instead of
one new string being constructed (for .upcase) we actually construct
three (the other two being the coerced products).

In the proposed newer case, the .text call should just return a Java
String, which could certainly support all of Ruby String's
non-mutating operations. The result in this case would be that we only
create one new String, for .upcase, and pass everything else straight
through.

Post by Logan Barnett
I'm not sure if that's a lot of work or not, but if that can be achieved
then it really doesn't matter what's being passed around. Maybe these
objects would also respond to kind_of? in a cheating manner (Java's String
returns true when asked if kind_of? a Ruby String), but I'm not convinced
that's a super great idea.

Yes, I'm not convinced it's a good idea either. The main problem I see
is that the Java String needs to reflect its own object hierarchy; it
needs to be < Java Object and CharSequence and so on. By trying to
force it to be kind_of? String we break its relationship with other
Java types.

- Charlie

Charles Oliver Nutter

2009-09-21 21:11:51 UTC

Post by Logan Barnett
This will prevent Monkeybars apps from upgrading without some big
changes. Since a Monkeybars app gets its own standalone JRuby, I'm not
too worried about changes that aren't backwards compatible. I'm sure
this will be similar for any app that must integrate heavily with Java.
Is there a way we can have a smart proxy vs. a dumb proxy? With the
talks I've done people seem mostly interested in how trivial it is to
bind to Java in JRuby. While having to do to_ruby or something similar
on any Java proxy isn't a ton of work, it's no longer as simple as
just dropping in the Java object and running with it.
Everything else seems great!

Yes, as Tom mentioned, this is the most visible and contentious aspect
of the proposal. It is obviously more standard to not automatically
coerce return values, since there's no analog for that in existing
Ruby dispatch logic, and since in some cases you can't even *get*
certain Java objects into Ruby right now. But it does potentially
break cases that expected numbers to come back as Fixnums, strings to
come back as String, and so on.

I have a prototype of this dispatch logic working, and the coerce_to?
and to_java pair really feels "right", but I have not gotten to cases
where there are coercible return values.

- Charlie

Charles Oliver Nutter

2009-09-21 22:01:47 UTC

Here's a description of the logic as it currently exists.

Java methods are pulled out of java.lang.Class and stuffed into the
proxy class in arity-specific groups. This means when calling with N
arguments, we only look for methods that take N arguments (hence the
missing varargs support).

If you are passing zero arguments and there's a single zero-argument
method, we call it without further processing. Likewise, if you are
calling with N arguments and there's only one N-argument overload, we
call that one and hope the types will coerce. This is also a source of
bad error messages, like "expected [java.lang.String], got:
[org.jruby.RubyFixnum]" which aren't very helpful.
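
The arity-based lookup described above can be sketched in plain Ruby
like this. Signature, group_by_arity and lookup are hypothetical names,
and Java types are just symbols; the real logic lives in the proxy
classes.

```ruby
# Hypothetical record of one Java method overload.
Signature = Struct.new(:name, :params)

# Bucket overloads by method name and argument count.
def group_by_arity(signatures)
  signatures.group_by { |sig| [sig.name, sig.params.size] }
end

# Mirrors the current behavior: only same-arity methods are considered,
# and a lone candidate is called without checking argument types.
def lookup(groups, name, args)
  candidates = groups.fetch([name, args.size], [])
  case candidates.size
  when 0 then [:no_match, nil]
  when 1 then [:direct, candidates.first]
  else [:needs_selection, candidates]
  end
end
```

Note how a single String-taking overload is returned as :direct even
for an integer argument, which is where the unhelpful error message
comes from, and how a varargs method is never found when called with
fewer arguments than it declares.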

When coercing, we call JavaUtil.convertArgumentToType, which is
*almost* the "to_java" in the new specification. It takes
ThreadContext, IRubyObject, and Class, where the Class is the target
Java type we want to try to convert to. convertArgumentToType has
several pieces of logic:

* If the argument is a JavaObject (the inner wrapper type) it tries to
coerce the value it wraps with coerceJavaObjectToType, which tries to
do a duck-typed conversion of procs to interfaces, or else just leaves
it as a JavaObject
* If the argument is a JavaProxy (the concrete supertype for all
wrapped Java objects in Ruby-land) it uses the real Java object it
wraps (without checking if it is of the appropriate type)
* If the argument's dataWrapStruct is a JavaObject, it uses the
unwrapped value without further processing (differing from the first
JavaObject case above)
* Otherwise, it falls back on a hardcoded set of coercions for core types

The hardcoded coercion logic uses arg.getMetaClass().index
(ClassIndex) as follows:

* for NilClass, use the result of coerceNilToType
* for Fixnum, Bignum, and Float, use getNumericConverter(target
class).coerce(arg, target)
* for String, use coerceStringToType
* for TrueClass, return Boolean.TRUE
* for FalseClass, return Boolean.FALSE
* for Time, return ((RubyTime)arg).getJavaDate()
* for all others, call coerceOtherToType

The coerceOtherToType method tries to do a duck-typing conversion of
procs to interface impls, and otherwise tries to call to_java_object.

In short, it's a big old mess. All this basically fits into the new
logic, but it's sprinkled all over the codebase. That makes it harder
to unravel and recompose it with the new protocol, but it needs to be
done anyway...

- Charlie

Yoko Harada

2009-09-23 15:44:08 UTC

On Mon, Sep 21, 2009 at 12:37 PM, Charles Oliver Nutter

Post by Charles Oliver Nutter
Introducing 'coerce_to?'
In order to reconcile this twist, all coercible classes will implement
a coerce_to? method that takes a target Java type and returns a
numeric score for the coercion to that type. The score's scale has not
been decided, but for discussion purposes we'll consider 0-10 where 0
means "I can't coerce to that" and 10 means "that is my ideal coercion
target". In the case of Fixnum, the "ideal" target would be to do no
coercion at all and pass RubyFixnum as-is to the target method. This
would allow calling APIs that accept JRuby types directly, which is
not possible now. It would also change calling Object targets to pass
RubyFixnum as well; this is a potentially breaking change, so we
should discuss it. Fixnum also has other targets that are "natural"
conversions like int or Long. Additional coercion targets can be added
to any type's coerce_to? method, as long as you also add logic to
another method: to_java.

Let me clarify. There are two types of variables when users use JRuby:
Java-originated variables used in Ruby, and Ruby-originated variables
used in Java. The new coerce_to? method is for the latter, Ruby-originated
variables used in Java, right? Currently, JRuby Embed uses the
JavaEmbedUtils#rubyToJava method to convert the types of all variables
used in Ruby so that a Java program can use them. Will coerce_to?
affect the rubyToJava method?

Post by Charles Oliver Nutter
Types from Java will not auto-coerce
In order to allow users to pre-coerce objects and to defer coercing
back to Ruby types, most return values will not coerce automatically
anymore. Values coming back as int or Integer or Long, etc., will remain as
their Java types. Calling to_int or to_i or similar Ruby coercion
methods will coerce them back into Fixnum or Float, but it won't
happen automatically. Since most APIs that require a Fixnum attempt to
do a conversion already, either through to_i or to_int, this should
allow most APIs to continue to work just fine. The reason for ending
this auto-coercion is simple: it allows you to control that coercion
completely. In the case of Java Strings, we currently auto-coerce both
ways, making any Java String-returning APIs very expensive since they
must be coerced from a char[]-based String to a byte[]-based Ruby
String. With the auto-coercion change, they would remain a Java String
until you choose to coerce or an API you call calls to_str.

I think this change will affect older code. If users specify Ruby
types when passing variables from Java to Ruby, will that cover the
change? For example, via a new method "javaToRuby(Ruby runtime, Object
value, Class rubyType)", whereas right now we only have
"javaToRuby(Ruby runtime, Object value)".

Post by Charles Oliver Nutter
// Java
public class Overloads {
  public void go(int i) { }
  public void go(long i) { }
}
# Ruby
# calls the 'long' version, since it's the most natural conversion for Fixnum
Overloads.new.go(123)
# calls the "int" version explicitly
Overloads.new.java_send(:go, [Java::int], 123)
# calls the "int" version explicitly, with less overhead than java_send
method = Overloads.new.java_method(:go, [Java::int])
method.call(123)

This is a nice feature. Then how about implementing interfaces in
Ruby, as in the example below?

// Java
public interface Overloads {
  public void go(int i);
  public void go(long i);
}

# Ruby
class Overloads
  include Java::Overloads
  def go(i, [Java::int])
    # implementation for int
  end
  def go(i, [Java::long])
    # implementation for long
  end
end
Overloads.new

If this is possible, it would be useful when we implement interfaces
defined in some specification.

-Yoko

Charles Oliver Nutter

2009-09-23 21:35:16 UTC

Post by Yoko Harada
Java-originated variables used in Ruby, and Ruby-originated variables
used in Java. The new coerce_to? method is for the latter, Ruby-originated
variables used in Java, right? Currently, JRuby Embed uses the
JavaEmbedUtils#rubyToJava method to convert the types of all variables
used in Ruby so that a Java program can use them. Will coerce_to?
affect the rubyToJava method?

Well sort of. This is for calling Java methods from Ruby and figuring
out based on the target object's class and the incoming arguments
which method to call and how to get the arguments converted to those
target argument types.

Post by Yoko Harada
I think this change will affect older code. If users specify Ruby
types when passing variables from Java to Ruby, will that cover the
change? For example, via a new method "javaToRuby(Ruby runtime, Object
value, Class rubyType)", whereas right now we only have
"javaToRuby(Ruby runtime, Object value)".

This is still all on the Ruby side...but the implication would be that
when you do something like

str = java.lang.System.get_property('foo')

The 'foo' would coerce to a Java string on the way in, but the return
value would not auto-coerce back to a Ruby String. It would ideally
look and feel like a Ruby String and have all the expected
non-mutative methods working fine, and you could call to_s or to_str
to actually force it to become a Ruby String. The intention here is
that wherever possible we could just leave objects as-is and not pay
the coercion cost until the user actually wants to pay it, rather than
forcing strings to coerce on every call.

Post by Yoko Harada
This is a nice feature. Then how about implementing interfaces in
Ruby, as in the example below?
// Java
public interface Overloads {
  public void go(int i);
  public void go(long i);
}
# Ruby
class Overloads
  include Java::Overloads
  def go(i, [Java::int])
    # implementation for int
  end
  def go(i, [Java::long])
    # implementation for long
  end
end
Overloads.new
If this is possible, it would be useful when we implement interfaces
defined in some specification.

This is definitely going to be supported in some version of the JRuby
compiler, both for runtime-created classes and for ahead-of-time
compiled code. It will probably look something like this, but it's
still a work in progress. You could look at the ruby2java gem for now
to see one possible way of specifying interface impls and signatures.

- Charlie

Logan Barnett

2009-09-24 14:14:59 UTC

Post by Charles Oliver Nutter
str = java.lang.System.get_property('foo')
The 'foo' would coerce to a Java string on the way in, but the return
value would not auto-coerce back to a Ruby String. It would ideally
look and feel like a Ruby String and have all the expected
non-mutative methods working fine, and you could call to_s or to_str
to actually force it to become a Ruby String. The intention here is
that wherever possible we could just leave objects as-is and not pay
the coercion cost until the user actually wants to pay it, rather than
forcing strings to coerce on every call.

This works for me! It will cause a few bugs, but nothing we can't fix
quickly while moving forward. Thanks guys!
