Discussion:
Hash clash in polymorphic variants
Jon Harrop
2008-01-10 17:09:13 UTC
ISTR advice that constructors sharing the first few characters should be
avoided in order to reduce the likelihood of clashing hash values for
polymorphic variants. Is that right?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
Eric Cooper
2008-01-10 20:35:34 UTC
Post by Jon Harrop
ISTR advice that constructors sharing the first few characters should be
avoided in order to reduce the likelihood of clashing hash values for
polymorphic variants. Is that right?
I don't think it's worth worrying about.

I wrote a program a while ago to look into this. I never saw any
"human-sensible" collisions (between two identifiers that a person
might have chosen). And if you're producing gensyms in a program, you
can just check ahead of time.

To find a collision with a given identifier, consider each bignum N
that differs by a multiple of 2^31 from the identifier's hash value.
Compute the radix-223 representation of N. If that forms a legal
OCaml identifier, then you've found a collision.

For example, Eric_Cooper collides with azdwbie, c7diagq, hlChrkt,
NSaServ, and SaupDOF, to pick just a few.
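
The hash described above can be sketched in OCaml as follows. This is a minimal reimplementation, assuming the radix-223 accumulation reduced to 31 bits and a 64-bit OCaml where native ints are wider than 31 bits. The names `hash_variant` and `collide` are illustrative, and the code is not taken from the compiler sources.

```ocaml
(* Sketch of the variant-tag hash: base-223 accumulation over the
   characters of the tag name, reduced modulo 2^31. Assumes 64-bit OCaml. *)
let hash_variant (s : string) : int =
  let accu = ref 0 in
  String.iter (fun c -> accu := 223 * !accu + Char.code c) s;
  (* Keep the low 31 bits. Integer overflow wraps modulo a power of two,
     so the low bits are still correct for long names. *)
  let h = !accu land ((1 lsl 31) - 1) in
  (* Reinterpret the result as a signed 31-bit value. *)
  if h > 0x3FFFFFFF then h - (1 lsl 31) else h

(* Two distinct names collide when their hashes are equal. *)
let collide (a : string) (b : string) : bool =
  a <> b && hash_variant a = hash_variant b

let () =
  Printf.printf "Foo -> %d\n" (hash_variant "Foo");
  Printf.printf "Eric_Cooper / azdwbie collide: %b\n"
    (collide "Eric_Cooper" "azdwbie")
```

Under these assumptions, `hash_variant "Foo"` gives 3505894, the same number that `Obj.magic` reveals for `` `Foo `` later in this thread, and Eric_Cooper and azdwbie do hash to the same value.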
--
Eric (call me SaupDOF) Cooper e c c @ c m u . e d u

Jon Harrop
2008-01-10 21:24:26 UTC
David Allsopp
2008-01-10 21:40:38 UTC
Post by Jon Harrop
Post by Eric Cooper
Post by Jon Harrop
ISTR advice that constructors sharing the first few characters should
be avoided in order to reduce the likelihood of clashing hash values
for polymorphic variants. Is that right?
I don't think it's worth worrying about.
I wrote a program a while ago to look into this. I never saw any
"human-sensible" collisions (between two identifiers that a person
might have chosen). And if you're producing gensyms in a program, you
can just check ahead of time.
I'm interested in automatically translating the GL_* enum from OpenGL into
polymorphic variants. So although it is generated code I have little
control over it, e.g. I cannot change the translation as OpenGL gets
extended because code will already be using the existing names.
Still, maybe I'm over-reacting. ;-)
I presume you're worried about the bindings clashing internally rather than
someone who uses the library happening to use a variant that clashes?

You can do something about it - when you're generating your bindings, you
can use the hash_variant() C function to detect the collisions yourself. If
you detect one, you can either issue *your own* warning while generating the
bindings allowing you to specify specific renaming for the program
generating your bindings or you could append digits to the names until the
collisions disappear (which is likely, though not guaranteed, to happen
quickly).
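
The "append digits until the collisions disappear" approach might look like the sketch below. It assumes the same radix-223, 31-bit hash on a 64-bit OCaml rather than calling the C function, and `assign_tags` is a hypothetical helper for a binding generator, not part of any existing tool.

```ocaml
(* Assumed reimplementation of the tag hash (radix 223, 31 bits). *)
let hash_variant (s : string) : int =
  let accu = ref 0 in
  String.iter (fun c -> accu := 223 * !accu + Char.code c) s;
  let h = !accu land ((1 lsl 31) - 1) in
  if h > 0x3FFFFFFF then h - (1 lsl 31) else h

(* For each input name, pick a tag whose hash has not been used yet,
   appending 1, 2, ... to the name on a clash and warning on stderr.
   Returns (original name, chosen tag) pairs in input order.
   A name that appears twice in the input is also treated as a clash. *)
let assign_tags (names : string list) : (string * string) list =
  let seen : (int, string) Hashtbl.t = Hashtbl.create 64 in
  let rec pick name n =
    let candidate = if n = 0 then name else name ^ string_of_int n in
    let h = hash_variant candidate in
    match Hashtbl.find_opt seen h with
    | None ->
      Hashtbl.add seen h candidate;
      candidate
    | Some other ->
      Printf.eprintf "warning: `%s clashes with `%s\n" candidate other;
      pick name (n + 1)
  in
  List.rev
    (List.fold_left (fun acc name -> (name, pick name 0) :: acc) [] names)

let () =
  assign_tags [ "Eric_Cooper"; "azdwbie"; "Foo" ]
  |> List.iter (fun (name, tag) -> Printf.printf "%s -> `%s\n" name tag)
```

The result depends on the order in which names are processed, so a generator using this scheme would need to process existing names first and in a stable order if already published tags must keep their spelling.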

It's slightly ugly, but then the possibility of collisions in the first
place is IMHO ugly too!


David

Kuba Ober
2008-01-11 13:30:29 UTC
Post by David Allsopp
Post by Jon Harrop
Post by Eric Cooper
Post by Jon Harrop
ISTR advice that constructors sharing the first few characters should
be avoided in order to reduce the likelihood of clashing hash values
for polymorphic variants. Is that right?
I don't think it's worth worrying about.
I'm interested in automatically translating the GL_* enum from OpenGL into
polymorphic variants. So although it is generated code I have little
I presume you're worried about the bindings clashing internally rather than
someone who uses the library happening to use a variant that clashes?
You can do something about it - when you're generating your bindings, you
can use the hash_variant() C function to detect the collisions yourself. If
you detect one, you can either issue *your own* warning while generating
the bindings allowing you to specify specific renaming for the program
generating your bindings or you could append digits to the names until the
collisions disappear (which is likely, though not guaranteed, to happen
quickly).
It's slightly ugly, but then the possibility of collisions in the first
place is IMHO ugly too!
Are those collisions of any real importance? I mean, do they break anything?
If all they do is imply linearly searching a list of a few elements, for the
colliding entry, then it's a non-issue?

Cheers, Kuba

Jon Harrop
2008-01-11 13:48:15 UTC
Post by Kuba Ober
Are those collisions of any real importance? I mean, do they break
anything? If all they do is imply linearly searching a list of a few
elements, for the colliding entry, then it's a non-issue?
It would prevent code from compiling so it would be a complete show-stopper.

In this case, there is a chance that a hash clash in names that I have no
control over would break my OpenGL bindings at some point in the future.

A theoretical solution would be to grow the bindings and avoid clashes in
identifiers included in later versions of OpenGL by adding random suffixes.
Although this works in theory, in practice it places the burden of a linear
search on the programmer who must then sift through the bindings to find out
if the identifier they want to use happens to have had an internal clash in
my bindings and, therefore, would require them to use a different identifier.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Kuba Ober
2008-01-11 16:14:03 UTC
Post by Jon Harrop
Post by Kuba Ober
Are those collisions of any real importance? I mean, do they break
anything? If all they do is imply linearly searching a list of a few
elements, for the colliding entry, then it's a non-issue?
It would prevent code from compiling so it would be a complete
show-stopper.
So what you're saying is that the implementation uses the hash with bucket
size of 1? That's kinda poor decision, methinks.

Maybe perfect hashes should be used, computed at link time (and at runtime
whenever a module is linked in). The perfect hashing function could probably
implement some sort of a table, so that no real code would need to be
generated, just recomputing of decision tree table. Gperf code could be
adapted for that. The benefit is that there would be no collisions, the hashed
data structure would be very compact, and the cost to regenerate the hash is
amortized. Ideally, one would generate the actual perfect hashing function,
but this is currently only possible in bytecode, right? I mean, toplevel won't
run in native code? Or am I mistaken?

Kuba

David Allsopp
2008-01-11 18:40:53 UTC
Post by Kuba Ober
Post by Jon Harrop
Post by Kuba Ober
Are those collisions of any real importance? I mean, do they break
anything? If all they do is imply linearly searching a list of a few
elements, for the colliding entry, then it's a non-issue?
It would prevent code from compiling so it would be a complete
show-stopper.
So what you're saying is that the implementation uses the hash with bucket
size of 1? That's kinda poor decision, methinks.
I think you're missing the context - there's no hash table. See 18.3.6 in
the manual - the hashed values (and resulting collisions) are to do with the
internal representation of polymorphic variants.

The compiler cannot process code that uses two polymorphic variants whose
tag names will have the same internal representation (and therefore be
incorrectly viewed as having the same value). The test is probably performed
somewhere in the type checker...

An alternative implementation might have been to lookup the tags (in a
perfect hash table) using a system similar to caml_named_value but I imagine
that the present method was preferred because it's simpler (and quite
possibly faster) and collisions are rare (as Eric pointed out) - although in
Jon's case the lack of a guarantee is unfortunate.

Incidentally, and off-the-subject here, using a hash table with a bucket
size of 1 is very important if you need performance guarantees on your hash
table and have some other way of coping with collisions.


David

Kuba Ober
2008-01-14 12:20:01 UTC
Post by David Allsopp
Post by Kuba Ober
Post by Jon Harrop
Post by Kuba Ober
Are those collisions of any real importance? I mean, do they break
anything? If all they do is imply linearly searching a list of a few
elements, for the colliding entry, then it's a non-issue?
It would prevent code from compiling so it would be a complete
show-stopper.
So what you're saying is that the implementation uses the hash with bucket
size of 1? That's kinda poor decision, methinks.
I think you're missing the context - there's no hash table. See 18.3.6 in
the manual - the hashed values (and resulting collisions) are to do with
the internal representation of polymorphic variants.
The compiler cannot process code that uses two polymorphic variants whose
tag names will have the same internal representation (and therefore be
incorrectly viewed as having the same value). The test is probably
performed somewhere in the type checker...
Yeah, I sort of put the wagon ahead of the horse. Of course the hashing
function doesn't imply a hash table.

What I meant was simply that instead of using some fixed hash function, one
could use a perfect hashing function which is optimal for its known set of
inputs, and won't ever generate a collision.

The tables that such a function uses to hash its input have to be generated at
link-time, which means run-time too.

Cheers, Kuba

Stefan Monnier
2008-01-14 14:44:58 UTC
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash function, one
could use a perfect hashing function which is optimal for its known set of
inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.


Stefan

Kuba Ober
2008-01-14 14:56:25 UTC
Post by Stefan Monnier
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash function,
one could use a perfect hashing function which is optimal for its known
set of inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.
As I've said in the cited post, the perfect hash generator would have to be
invoked at link time, which shouldn't be a big deal.

Cheers, Kuba

David Allsopp
2008-01-14 15:37:50 UTC
Post by Kuba Ober
Post by Stefan Monnier
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash
function, one could use a perfect hashing function which is optimal
for its known set of inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.
As I've said in the cited post, the perfect hash generator would have to
be invoked at link time, which shouldn't be a big deal.
Assuming you're talking hypothetically and designing a new runtime then,
yes, it's not a big deal.

However, this scheme could not just be dropped into the present system - it
would not work with dynamic linking because once you've hashed a polymorphic
variant tag-name you drop the name so you can't re-hash when you update your
perfect hashing function... unless you can devise a perfect hashing scheme
that hashes all the old keys to their old values and new ones to
non-clashing new values ;o)

Internally, `Foo is indistinguishable from the int 3505894* - so if
caml_hash_variant("Foo") suddenly changes value mid-program then any
previous instances of `Foo in memory cease to be equal to it!


David


* Try:
# (Obj.magic `Foo : int);;
- : int = 3505894
# (Obj.magic 3505894) = `Foo;;
- : bool = true

I don't know whether caml_hash_variant varies between versions or even
platforms, so the actual number may be different on other systems.

Kuba Ober
2008-01-14 15:44:05 UTC
Post by David Allsopp
Post by Kuba Ober
Post by Stefan Monnier
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash
function, one could use a perfect hashing function which is optimal
for its known set of inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.
As I've said in the cited post, the perfect hash generator would have to
be invoked at link time, which shouldn't be a big deal.
Assuming you're talking hypothetically and designing a new runtime then,
yes, it's not a big deal.
However, this scheme could not just be dropped into the present system - it
would not work with dynamic linking because once you've hashed a
polymorphic variant tag-name you drop the name so you can't re-hash when
you update your perfect hashing function...
A trivial solution to that is to keep both, as obviously each time an
equivalent of dlopen() is made, everything has to be rehashed. gperf
is "slightly" memory-hungry, so surely it'd need to be something using a
different algorithm. I'm talking hypothetically, but I also think it's a
weird design decision to use those possibly-colliding hashes. String
sorting/comparison isn't exactly a CPU killer, so couldn't the original names
have been used instead? I admit to not knowing too many details of the
current implementation of course ;(

Cheers, Kuba

David Allsopp
2008-01-14 16:03:35 UTC
Post by Kuba Ober
A trivial solution to that is to keep both, as obviously each time an
equivalent of dlopen() is made, everything has to be rehashed. gperf
is "slightly" memory-hungry, so surely it'd need to be something using a
different algorithm. I'm talking hypothetically, but I also think it's a
weird design decision to use those possibly-colliding hashes.
I agree that it's a bit weird - but the clashes are very rare (and the
function was designed to keep them rare for "normal" usage).
Post by Kuba Ober
String sorting/comparison isn't exactly a CPU killer, so couldn't the
original names have been used instead?
String comparison is much slower than integer comparison... we're talking
about one CPU instruction compared to a for loop! Jon would never use them
again :o) Not to mention the storage overhead of keeping the tag names in
memory - not great if you've got long lists of `YetAnotherTag.


David

Stefan Monnier
2008-01-14 15:45:18 UTC
Post by Kuba Ober
Post by Stefan Monnier
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash function,
one could use a perfect hashing function which is optimal for its known
set of inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.
As I've said in the cited post, the perfect hash generator would have to be
invoked at link time, which shouldn't be a big deal.
That would require postponing the execution of the hash-function to
link-time or run-time. Run-time is clearly undesirable, and link-time
adds yet-more complexity to the linker.

It's not a bad idea, obviously, but AFAICT the linker currently is kept
very simple.


Stefan

Jacques Garrigue
2008-01-15 03:36:21 UTC
Post by Kuba Ober
Post by Stefan Monnier
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash function,
one could use a perfect hashing function which is optimal for its known
set of inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.
As I've said in the cited post, the perfect hash generator would have to be
invoked at link time, which shouldn't be a big deal.
Unfortunately, this would make marshalling between different programs
much more complicated...

Another advantage of knowing the hash function at compile time is
that you can generate efficient code for pattern matching. Since you
already know the ordering of tags, it is easy to generate a decision
tree. I didn't check very recently about efficiency for polymorphic
variants, but the depth of the decision tree is logarithmic in the
number of tags involved in the pattern matching, and if you can keep
it below 3 or 4 (about 10 tags) you can be actually faster than a
jump table.
Another comparison is with the old implementation for method calls.
Originally ocaml used your idea for methods: method hashes were
generated at initialization time. The scheme for dispatch was a two
level array, compressed by reusing buckets so that you don't use too
much memory. This meant actually 3 array accesses for a method call.
The current scheme reuses variant hashes, and implements a simple
dichotomic search, together with an index cache for each call site.
This doesn't look very efficient, but on small method tables, the
search is almost as fast as the old approach, and if the cache hits
this is much faster...
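
The dichotomic search with a per-call-site cache can be illustrated roughly as below. This is only a sketch of the idea: the table layout, the `cache` record, and the names `find_index` and `lookup` are assumptions made for illustration and do not reflect the actual runtime data structures.

```ocaml
(* A method table: (tag hash, implementation) pairs sorted by hash. *)
type 'a table = (int * 'a) array

(* Binary (dichotomic) search for [key]; raises Not_found if absent. *)
let find_index (tbl : 'a table) (key : int) : int =
  let rec go lo hi =
    if lo > hi then raise Not_found
    else
      let mid = lo + (hi - lo) / 2 in
      let k = fst tbl.(mid) in
      if k = key then mid
      else if k < key then go (mid + 1) hi
      else go lo (mid - 1)
  in
  go 0 (Array.length tbl - 1)

(* One cache per call site: the index that matched on the previous call. *)
type cache = { mutable index : int }

(* Try the cached index first; fall back to the search on a miss. *)
let lookup (cache : cache) (tbl : 'a table) (key : int) : 'a =
  let i = cache.index in
  if i >= 0 && i < Array.length tbl && fst tbl.(i) = key then snd tbl.(i)
  else begin
    let j = find_index tbl key in
    cache.index <- j;
    snd tbl.(j)
  end

let () =
  let tbl = [| (10, "draw"); (20, "move"); (30, "resize") |] in
  let site = { index = -1 } in
  print_endline (lookup site tbl 20);  (* miss: binary search *)
  print_endline (lookup site tbl 20)   (* hit: one comparison *)
```

A cache hit costs a single comparison, while a miss costs a number of comparisons logarithmic in the table size, which is the trade-off described above.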

Now concerning the risks of name conflicts. The main point of
polymorphic variants is that there is only a conflict if the two tags
appear in the same type. And logically the type should stay small.
If you want to put all GLenum's inside the same type, then you may
well end up with conflicts. But what LablGL shows is that in practice
only a small number of tags are used together. So if you can partition
your set of tags so that each type has at most 64 tags, then you get
a conflict probability of less than 1 per million for each type. This
seems safe enough. But if you have one type with 2000 tags, then the
probability is 1 per thousand. Not that much, but it can happen.
(p(n) is n*n / 2**32)
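
The estimate follows from the usual birthday approximation: n tags form about n*n/2 pairs, and each pair shares a 31-bit hash with probability 1/2**31. A small sketch to evaluate it (the function name `clash_probability` is illustrative):

```ocaml
(* Approximate probability that at least two of n tags in one type share
   a hash value, using the estimate p(n) = n*n / 2**32 quoted above. *)
let clash_probability (n : int) : float =
  float_of_int (n * n) /. (2.0 ** 32.0)

let () =
  List.iter
    (fun n -> Printf.printf "n = %4d  p = %.2e\n" n (clash_probability n))
    [ 64; 2000 ]
```

For n = 64 this gives roughly 9.5e-7, and for n = 2000 roughly 9.3e-4, in line with the figures above.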

Jacques Garrigue

Jon Harrop
2008-01-15 04:59:03 UTC
Post by Jacques Garrigue
Unfortunately, this would make marshalling between different programs
much more complicated...
Do people marshal polymorphic variants between different programs?
Post by Jacques Garrigue
Another advantage of knowing the hash function at compile time is
that you can generate efficient code for pattern matching. Since you
already know the ordering of tags, it is easy to generate a decision
tree. I didn't check very recently about efficiency for polymorphic
variants, but the depth of the decision tree is logarithmic in the
number of tags involved in the pattern matching, and if you can keep
it below 3 or 4 (about 10 tags) you can be actually faster than a
jump table.
For 3-16 tags on AMD64, jump tables (ordinary variants) are 2x slower than
decision trees (polymorphic variants) when branches are taken at random.
However, jump tables are consistently up to 2x faster when a single branch is
taken repeatedly. So caching jump tables is more effective at run-time
optimizing pattern matches over ordinary variants than branch prediction is
at optimizing decision trees for pattern matches over polymorphic variants.

So the advantage of a decision tree is probably insignificant on real code
because it will lie between these two extremes.
Post by Jacques Garrigue
Now concerning the risks of name conflicts. The main point of
polymorphic variants is that there is only a conflict if the two tags
appear in the same type. And logically the type should stay small.
If you want to put all GLenum's inside the same type, then you may
well end up with conflicts. But what LablGL shows is that in practice
only a small number of tags are used together.
Can LablGL's design support OpenGL extensions?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Jacques Garrigue
2008-01-15 09:01:42 UTC
Post by Jon Harrop
Post by Jacques Garrigue
Unfortunately, this would make marshalling between different programs
much more complicated...
Do people marshal polymorphic variants between different programs?
Do people marshal data between different programs (or different
versions of the same program)?
Post by Jon Harrop
For 3-16 tags on AMD64, jump tables (ordinary variants) are 2x slower than
decision trees (polymorphic variants) when branches are taken at random.
However, jump tables are consistently up to 2x faster when a single branch is
taken repeatedly. So caching jump tables is more effective at run-time
optimizing pattern matches over ordinary variants than branch prediction is
at optimizing decision trees for pattern matches over polymorphic variants.
So the advantage of a decision tree is probably insignificant on real code
because it will lie between these two extremes.
Since the goal was never to be faster than ordinary variants, but just
obtain comparable speed, this seems good :-)
Post by Jon Harrop
Post by Jacques Garrigue
Now concerning the risks of name conflicts. The main point of
polymorphic variants is that there is only a conflict if the two tags
appear in the same type. And logically the type should stay small.
If you want to put all GLenum's inside the same type, then you may
well end up with conflicts. But what LablGL shows is that in practice
only a small number of tags are used together.
Can LablGL's design support OpenGL extensions?
I'm not sure what this means.
Since LablGL was coded by hand, adding extensions would mean modifying
it.
One might want to add a way to detect whether an extension is
available or not, but making it static does not seem a good idea: one
wouldn't even be able to compile code using an extension that is not
available.
Also, one might want to make code generation automatic, particularly
for C wrappers, to allow adding cases to functions easily. This should
be doable, but there is no infrastructure for that currently
(using CPP macros was simpler to start with...)

Jacques Garrigue

Jon Harrop
2008-01-15 18:17:32 UTC
Post by Jacques Garrigue
Post by Jon Harrop
Post by Jacques Garrigue
Unfortunately, this would make marshalling between different programs
much more complicated...
Do people marshal polymorphic variants between different programs?
Do people marshal data between different programs (or different
versions of the same program)?
I suspect OCaml's marshalling is used almost entirely between same versions of
the same programs.

In particular, I was advised against marshalling data between different
versions of the same program because this is unsafe (not just type safety but
the format used by Marshal is not ossified).
Post by Jacques Garrigue
Post by Jon Harrop
So the advantage of a decision tree is probably insignificant on real
code because it will lie between these two extremes.
Since the goal was never to be faster than ordinary variants, but just
obtain comparable speed, this seems good :-)
Yes. This would probably also work ok if you used a symbol table to store
exact identifier names rather than just a hash. The symbol's index in the
table would serve the same purpose as the hash.
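
The symbol-table idea can be sketched as a simple interning table. The `symtab` type and the `create` and `intern` functions are hypothetical names for illustration, not an existing OCaml facility.

```ocaml
(* Interning table: each distinct tag name receives the next free index,
   and that index would play the role the hash plays today. *)
type symtab = {
  ids : (string, int) Hashtbl.t;  (* exact name -> index *)
  mutable next : int;             (* next unused index *)
}

let create () : symtab = { ids = Hashtbl.create 64; next = 0 }

(* Return the index for [name], allocating a fresh one on first use. *)
let intern (tab : symtab) (name : string) : int =
  match Hashtbl.find_opt tab.ids name with
  | Some i -> i
  | None ->
    let i = tab.next in
    Hashtbl.add tab.ids name i;
    tab.next <- i + 1;
    i

let () =
  let tab = create () in
  (* Names that clash under the 31-bit hash still get distinct indices. *)
  Printf.printf "%d %d %d\n"
    (intern tab "Eric_Cooper") (intern tab "azdwbie") (intern tab "Eric_Cooper")
```

Indices assigned this way depend on the order in which names are first seen, so two separately built programs would not necessarily agree on them. That is the difficulty with separate compilation and marshalling raised earlier in the thread.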
Post by Jacques Garrigue
Post by Jon Harrop
Post by Jacques Garrigue
Now concerning the risks of name conflicts. The main point of
polymorphic variants is that there is only a conflict if the two tags
appear in the same type. And logically the type should stay small.
If you want to put all GLenum's inside the same type, then you may
well end up with conflicts. But what LablGL shows is that in practice
only a small number of tags are used together.
Can LablGL's design support OpenGL extensions?
I'm not sure what this means.
OpenGL has an extension mechanism that can be queried at run-time. If a given
extension is available then you can do things that you could not do before,
such as pass a GLenum to a function that might not have accepted it without
the extension.
Post by Jacques Garrigue
Since LablGL was coded by hand, adding extensions would mean modifying
it.
Exactly, that is a limitation of LablGL's design and, therefore, I think it
was quite wrong of you to claim "LablGL shows is that in practice only a
small number of tags are used together" when LablGL's use of small, closed
sum types is actually a design limitation that would not be there if it
supported all of OpenGL, i.e. the extension mechanism.

Incidentally, Xavier made a statement based upon what appears to me to be a
similar logical error in the CUFP notes from last year that I read recently:

"On the other hand, certain features seem somewhat unsurprisingly to be
unimportant to industrial users. GUI toolkits are not an issue, because GUIs
tend to be built using more mainstream tools; it seems that different
competencies are involved in Caml and GUI development and companies "don't
want to squander their precious Caml expertise aligning pixels". Rich
libraries don't seem to matter in general; presumably companies are happy to
develop these in-house. And no-one wants yet another IDE; the applications of
interest are usually built using a variety of languages and tools anyway, so
consistency of development environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)

Xavier appears to have taken the biased sample of industrialists who already
use OCaml despite its limitations and has drawn the conclusion that these
limitations are not important to industrialists. I was really horrified to
see this because, in my experience, companies are turning away from OCaml in
droves because of exactly the limitations Xavier enumerated and I for one
would dearly love to see them fixed.

OCaml will continue to go from strength to strength regardless but its uptake
would be vastly faster if these problems were addressed. To take them point by
point:

. GUIs are incredibly important (LablGTK is the world's favorite OCaml
library!) and tens of thousands of OCaml programmers are crying out for
proper LablGTK documentation as a first priority, many of whom are in
industry.

. Rich libraries are incredibly important and OCaml has the potential to
become a hugely successful commercial platform where people can buy and sell
cross-platform libraries but OCaml needs support for shared run-time DLLs (or
something equivalent) before this can happen.

. An easy-to-use IDE would be an excellent way to kick-start people learning
OCaml even if an industrial-strength IDE is intractable.
Post by Jacques Garrigue
Also, one might want to make code generation automatic, particularly
for C wrappers, to allow adding cases to functions easily. This should
be doable, but there is no infrastructure for that currently
(using CPP macros was simpler to start with...)
Yes. A better FFI could also be enormously beneficial. Improving upon OCaml's
FFI is one of the most alluring aspects of a reimplementation on LLVM, IMHO.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Gerd Stolpmann
2008-01-15 19:20:09 UTC
Post by Jon Harrop
Incidentally, Xavier made a statement based upon what appears to me to be a
"On the other hand, certain features seem somewhat unsurprisingly to be
unimportant to industrial users. GUI toolkits are not an issue, because GUIs
tend to be built using more mainstream tools; it seems that different
competencies are involved in Caml and GUI development and companies "don't
want to squander their precious Caml expertise aligning pixels". Rich
libraries don't seem to matter in general; presumably companies are happy to
develop these in-house. And no-one wants yet another IDE; the applications of
interest are usually built using a variety of languages and tools anyway, so
consistency of development environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
An interesting thesis, right? Although I wouldn't go that far, there is
some truth in it. The point, IMHO, is that OCaml will never replace
other languages in the sense that a company who uses language X for
years in product Y rewrites the code in OCaml. For what reason? The
company would run into big educational problems (learning a new
environment), would have high initial costs, and it is questionable
whether the result is better. Of course, for rewriting existing software
the company would profit from GUIs, from rich libraries etc. But I think
this does not happen.

What I see, however, is that OCaml is used where new software is
developed, in ambitious projects that start from scratch. It is simply a
fact that GUIs are not crucial in these areas (at least for the
companies I know). GUIs are seen as standard tools where nothing new
happens where OCaml could shine. If you need one, you develop it in one
of the mainstream languages.

IDEs aren't interesting right now because OCaml is mainly used by
(computer & related) scientists (and I include scientists working for
companies outside academia). IDEs are nice for beginners and for people
who do not want to know what's happening inside. They are not
interesting for companies that invent completely new types of products,
because they've hired experts that can live without (and want to live
without).
Post by Jon Harrop
Xavier appears to have taken the biased sample of industrialists who already
use OCaml despite its limitations and has drawn the conclusion that these
limitations are not important to industrialists. I was really horrified to
see this because, in my experience, companies are turning away from OCaml in
droves because of exactly the limitations Xavier enumerated and I for one
would dearly love to see them fixed.
Which companies?

I fully understand that OCaml is not well-suited for the average
company. But it is not because of missing GUIs and IDEs, but because the
language itself is too ambitious. Sorry to say that, but this is not the
mainstream and it will never be.

(I have a good friend who works for an average company, so I know what
I'm talking about. They program business apps for a commercial platform
from CA. A horrible language, but they can manage it. They are experts
in the models they use, and simply take a platform from industry.)
Post by Jon Harrop
OCaml will continue to go from strength to strength regardless but its uptake
would be vastly faster if these problems are addressed. To take them point by
point:
. GUIs are incredibly important (LablGTK is the world's favorite OCaml
library!) and tens of thousands of OCaml programmers are crying out for
proper LablGTK documentation as a first priority, many of whom are in
industry.
See this as opportunity for your next book :-)

GTK is already poorly documented, so this is not only the problem of the
LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
real problem.
Post by Jon Harrop
. Rich libraries are incredibly important and OCaml has the potential to
become a hugely successful commercial platform where people can buy and sell
cross-platform libraries but OCaml needs support for shared run-time DLLs (or
something equivalent) before this can happen.
Do you dream or what?

I don't think that selling libraries in binary form is that important...
It is difficult anyway to do that, and why do you expect you could be
successful in a niche language? As a customer I would demand to get the
source code - to lower the risks of the investment into a small
platform.
Post by Jon Harrop
. An easy-to-use IDE would be an excellent way to kick-start people learning
OCaml even if an industrial-strength IDE is intractable.
Post by Jacques Garrigue
Also, one might want to make code generation automatic, particularly
for C wrappers, to allow adding cases to functions easily. This should
be doable, but there is no infrastructure for that currently
(using CPP macros was simpler to start with...)
Yes. A better FFI could also be enormously beneficial. Improving upon OCaml's
FFI is one of the most alluring aspects of a reimplementation on LLVM, IMHO.
A general question to you: When you are complaining about so many
aspects of OCaml, why don't you invest time & money to fix them? We
would all be very thankful.

Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
***@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------

Jon Harrop
2008-01-15 22:04:45 UTC
Permalink
This post might be inappropriate. Click to display it.
Kuba Ober
2008-01-16 13:48:14 UTC
Permalink
Post by Jon Harrop
Post by Gerd Stolpmann
Post by Jon Harrop
Incidentally, Xavier made a statement based upon what appears to me to
be a similar logical error in the CUFP notes from last year that I read
"On the other hand, certain features seem somewhat unsurprisingly to
be unimportant to industrial users. GUI toolkits are not an issue,
because GUIs tend to be built using more mainstream tools; it seems
that different competencies are involved in Caml and GUI development
and companies "don't want to squander their precious Caml expertise
aligning pixels". Rich libraries don't seem to matter in general;
presumably companies are happy to develop these in-house. And no-one
wants yet another IDE; the applications of interest are usually built
using a variety of languages and tools anyway, so consistency of
development environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
An interesting thesis, right? Although I wouldn't go that far, there is
some truth in it. The point, IMHO, is that OCaml will never replace
other languages in the sense that a company who uses language X for
years in product Y rewrites the code in OCaml. For what reason? The
company would run into big educational problems (learning a new
environment), would have high initial costs, and it is questionable
whether the result is better. Of course, for rewriting existing software
the company would profit from GUIs, from rich libraries etc. But I think
this does not happen.
I believe many more companies would migrate to OCaml if it had
well-documented GUI APIs and rich libraries. Indeed, Microsoft are gambling
on people migrating to F# in exactly the same way.
Post by Gerd Stolpmann
What I see, however, is that OCaml is used where new software is
developed, in ambitious projects that start from scratch. It is simply a
fact that GUIs are not crucial in these areas (at least for the
companies I know).
But the companies you know were already self-selected to be the ones who do
not care about OCaml's limitations, so it is a biased sample?
Post by Gerd Stolpmann
GUIs are seen as standard tools where nothing new happens where OCaml
could shine.
I have no doubt that OCaml would shine in GUIs just as it does elsewhere.
In fact, after some initial thinking and looking around it seems that the
only "sane" GUI for OCaml, at this time, is Qt, but someone has to write a
machine translator to port it from C++ to OCaml. Qt is reasonably well
designed, and has the richest feature set of all GUI toolkits, even if you
combined all the competition and treated it as one "other" toolkit.

Using Qt with some machine (or not!) generated bindings is just a huge
waste -- it's a nice, clean design, which has recently been tweaked for
performance (some Qt4 apps start in 50% of the time just by having been
ported to Qt4 from Qt3).

Cheers, Kuba

Dario Teixeira
2008-01-16 15:02:54 UTC
Permalink
Hi,
Post by Kuba Ober
In fact, after some initial thinking and looking around it seems that the
only "sane" GUI for OCaml, at this time, is Qt, but someone has to write a
machine translator to port it from C++ to OCaml. Qt is reasonably well
designed, and has the richest feature set of all GUI toolkits, even if you
combined all the competition and treated it as one "other" toolkit.
Using Qt with some machine (or not!) generated bindings is just a huge
waste -- it's a nice, clean design, which has recently been tweaked for
performance (some Qt4 apps start in 50% of the time just by having been
ported to Qt4 from Qt3).
I'm inclined to agree. I would even go as far as saying that the lack of
Qt bindings is perhaps the biggest open sore as far as Ocaml library support
is concerned.

The guys at Trolltech, however, seem quite keen on having Qt on as many
platforms as possible (Qt-Jambi, which brings Qt to the JVM is one of their
products). Couldn't this whole auto-generation of bindings be made easier
if they got involved? I am sure they already have plenty of tools in
place to facilitate it. Even if they were not to commit actual manpower
to the effort, they might still be able to help.

And incidentally, the aforementioned Qt-Jambi, together with the Ocamljava
project might provide a last-resort solution in the absence of native bindings.
Another possibility might be the Qyoto/Kimono project (which brings Qt/KDE
into .net) together with the OcamlIL project (if it's still alive). You would
then use Mono to run Ocaml programmes.

cheers,
Dario



Jon Harrop
2008-01-16 19:00:25 UTC
Permalink
Post by Dario Teixeira
I'm inclined to agree. I would even go as far as saying that the lack of
Qt bindings is perhaps the biggest open sore as far as Ocaml library
support is concerned.
As I understand it, OCaml's FFI makes writing Qt bindings an enormous
undertaking which is why we don't have any.

I'm happy with GTK for now and would rather see OpenGL 2 bindings instead.
Post by Dario Teixeira
The guys at Trolltech, however, seem quite keen on having Qt on as many
platforms as possible (Qt-Jambi, which brings Qt to the JVM is one of their
products). Couldn't this whole auto-generation of bindings be made easier
if they got involved? I am sure they already have plenty of tools in
place to facilitate it. Even if they were not to commit actual manpower
to the effort, they might still be able to help.
I found TrollTech's customer support awful as a customer so I very much doubt
they will go out of their way to help a really obscure virgin corner of the
Qt market. That was a few years ago though.
Post by Dario Teixeira
And incidentally, the aforementioned Qt-Jambi, together with the Ocamljava
project might provide a last-resort solution in the absence of native
bindings. Another possibility might be the Qyoto/Kimono project (which
brings Qt/KDE into .net) together with the OcamlIL project (if it's still
alive). You would then use Mono to run Ocaml programmes.
I evaluated various such options recently and decided that Mono is truly awful
(very poorly written, unreliable and slow) and LLVM is absolutely superb
(extremely well-written C++ with complete native OCaml bindings!). Moreover,
Mono appears to have no future in its current form whereas LLVM has serious
backers and is improving at a tremendous rate.

Even if you don't want to implement a whole new language or backend, using
LLVM's JIT compilation for code generation has great potential for OCaml,
e.g. regexps. I highly recommend giving it a play!
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Kuba Ober
2008-01-17 13:09:17 UTC
Permalink
Post by Dario Teixeira
Post by Kuba Ober
Using Qt with some machine (or not!) generated bindings is just a huge
waste -- it's a nice, clean design, which has recently been tweaked for
performance (some Qt4 apps start in 50% of the time just by having been
ported to Qt4 from Qt3).
I'm inclined to agree. I would even go as far as saying that the lack of
Qt bindings is perhaps the biggest open sore as far as Ocaml library
support is concerned.
The guys at Trolltech, however, seem quite keen on having Qt on as many
platforms as possible (Qt-Jambi, which brings Qt to the JVM is one of their
products). Couldn't this whole auto-generation of bindings be made easier
if they got involved?
At some point, in order to "naturally" use Qt and benefit from its
performance, the machine translation will be easier than any bindings you
could think of. IMHO, of course. Qt's code itself will become smaller in
Ocaml - I've hacked at porting QObject, and so far I've got the line count to
50% of Trolltech's. And I'm a total noob.

Cheers, Kuba

Kuba Ober
2008-01-18 05:33:38 UTC
Permalink
Post by Jon Harrop
Post by Gerd Stolpmann
Post by Jon Harrop
Incidentally, Xavier made a statement based upon what appears to me to
be a similar logical error in the CUFP notes from last year that I read
"On the other hand, certain features seem somewhat unsurprisingly to
be unimportant to industrial users. GUI toolkits are not an issue,
because GUIs tend to be built using more mainstream tools; it seems
that different competencies are involved in Caml and GUI development
and companies "don't want to squander their precious Caml expertise
aligning pixels". Rich libraries don't seem to matter in general;
presumably companies are happy to develop these in-house. And no-one
wants yet another IDE; the applications of interest are usually built
using a variety of languages and tools anyway, so consistency of
development environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
An interesting thesis, right? Although I wouldn't go that far, there is
some truth in it. The point, IMHO, is that OCaml will never replace
other languages in the sense that a company who uses language X for
years in product Y rewrites the code in OCaml. For what reason? The
company would run into big educational problems (learning a new
environment), would have high initial costs, and it is questionable
whether the result is better. Of course, for rewriting existing software
the company would profit from GUIs, from rich libraries etc. But I think
this does not happen.
I believe many more companies would migrate to OCaml if it had
well-documented GUI APIs and rich libraries. Indeed, Microsoft are gambling
on people migrating to F# in exactly the same way.
Post by Gerd Stolpmann
What I see, however, is that OCaml is used where new software is
developed, in ambitious projects that start from scratch. It is simply a
fact that GUIs are not crucial in these areas (at least for the
companies I know).
But the companies you know were already self-selected to be the ones who do
not care about OCaml's limitations, so it is a biased sample?
Post by Gerd Stolpmann
GUIs are seen as standard tools where nothing new happens where OCaml
could shine.
I have no doubt that OCaml would shine in GUIs just as it does elsewhere.
Post by Gerd Stolpmann
If you need one, you develop it in one of the mainstream languages.
Actually I would either choose F# on Windows or give up on any other OS.
Post by Gerd Stolpmann
IDEs aren't interesting right now because OCaml is mainly used by
(computer & related) scientists (and I include scientists working for
companies outside academia).
Many of the world's most sophisticated IDEs are targeted solely at
technical users. Look at Mathematica's notebook interface, for example. I
believe that is a great example to aspire to.
Post by Gerd Stolpmann
IDEs are nice for beginners and for people
who do not want to know what's happening inside. They are not
interesting for companies that invent completely new types of products,
because they've hired experts that can live without (and want to live
without).
I couldn't disagree more. Pharmaceuticals are a trillion dollar industry
where many scientists would benefit enormously from being able to use a
tool like OCaml without knowing anything about how it works in order to
create their next generation products (drugs). The same is true of most
industries where scientists and engineers work, and there are many such
industries and they are extremely profitable.
Post by Gerd Stolpmann
Post by Jon Harrop
Xavier appears to have taken the biased sample of industrialists who
already use OCaml despite its limitations and has drawn the conclusion
that these limitations are not important to industrialists. I was
really horrified to see this because, in my experience, companies are
turning away from OCaml in droves because of exactly the limitations
Xavier enumerated and I for one would dearly love to see them fixed.
Which companies?
General Electric, Microsoft, Wolfram Research and various bioinformatics
institutes for example.
Look at General Electric. They build some of the world's most sophisticated
medical scanners and that large-scale embedded market is ideal for using
languages like OCaml for its high-performance numerics because you have
complete control over the environment. However, they desperately need GUI
toolkits to provide a front-end for users.
I'd like to know what Alex Barretta makes of this, for example. His glass
cutters must have the same characteristics in this respect...
Post by Gerd Stolpmann
I fully understand that OCaml is not well-suited for the average
company. But it is not because of missing GUIs and IDEs, but because the
language itself is too ambitious. Sorry to say that, but this is not the
mainstream and it will never be.
I still think OCaml has the best chance of any FPL to become a mainstream
tool in technical computing.
Indeed, I recently tried to quantify how far OCaml has already come and I
believe it is already as popular as C# among technical users, for example.
That is quite an achievement!
Post by Gerd Stolpmann
(I have a good friend who works for an average company, so I know what
I'm talking about. They program business apps for a commercial platform
from CA. A horrible language, but they can manage it. They are experts
in the models they use, and simply take a platform from industry.)
Yes. I do not believe OCaml will make significant inroads into displacing
COBOL and relatives but there are a lot of other big opportunities out
there for such a language.
Post by Gerd Stolpmann
Post by Jon Harrop
OCaml will continue to go from strength to strength regardless but its
uptake would be vastly faster if these problems are addressed. To take
them point by point:
. GUIs are incredibly important (LablGTK is the world's favorite OCaml
library!) and tens of thousands of OCaml programmers are crying out for
proper LablGTK documentation as a first priority, many of whom are in
industry.
See this as opportunity for your next book :-)
Indeed. Even after the announcement that Microsoft are productizing F#,
OCaml for Scientists continues to be our biggest earning product.
Consequently, I am very tempted to write a "sequel" that covers many of the
important aspects of the language that I did not cover in the original,
including GUI programming, XML, parallelism and so forth. If anyone has
ideas for subjects they would like to see covered, please e-mail me!
Post by Gerd Stolpmann
GTK is already poorly documented, so this is not only the problem of the
LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
real problem.
Yes. I'm really not sure what the best course of action would be here.
Would Qt bindings be preferable? Is it worth the hassle? How long would it
be before they reached the maturity of GTK?
Making bindings for Qt is basically putting a beautiful architecture to waste.
Qt's architecture is good enough to be actually machine-translated into OCaml.
This would be an involved project, but not impossible.

Using Qt from OCaml via a set of bindings can be a short-term stop-gap measure
for trivial applications, but I would never deploy a Qt application written in
OCaml if the application was any bigger on the GUI side than a couple of simple
dialog boxes. There is a binding generator (forgot its name) which can
generate OCaml bindings for Qt, but you have to give it a list of
classes/methods/signals/slots to generate bindings for. So perfect for
trivial applications, but not much else.

Qt, when you start to think of its API in how it may look in OCaml, becomes
pretty cool, and I'm sure there are a few improvements to it you can make to
leverage the power given to you by OCaml, once you loose the shackles of C++.

Cheers, Kuba

Kuba Ober
2008-01-18 05:19:15 UTC
Permalink
Post by Gerd Stolpmann
Post by Jon Harrop
Incidentally, Xavier made a statement based upon what appears to me to be
a similar logical error in the CUFP notes from last year that I read
"On the other hand, certain features seem somewhat unsurprisingly to be
unimportant to industrial users. GUI toolkits are not an issue, because
GUIs tend to be built using more mainstream tools; it seems that
different competencies are involved in Caml and GUI development and
companies "don't want to squander their precious Caml expertise aligning
pixels". Rich libraries don't seem to matter in general; presumably
companies are happy to develop these in-house. And no-one wants yet
another IDE; the applications of interest are usually built using a
variety of languages and tools anyway, so consistency of development
environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
An interesting thesis, right? Although I wouldn't go that far, there is
some truth in it. The point, IMHO, is that OCaml will never replace
other languages in the sense that a company who uses language X for
years in product Y rewrites the code in OCaml. For what reason? The
company would run into big educational problems (learning a new
environment), would have high initial costs, and it is questionable
whether the result is better. Of course, for rewriting existing software
the company would profit from GUIs, from rich libraries etc. But I think
this does not happen.
What I see, however, is that OCaml is used where new software is
developed, in ambitious projects that start from scratch. It is simply a
fact that GUIs are not crucial in these areas (at least for the
companies I know). GUIs are seen as standard tools where nothing new
happens where OCaml could shine. If you need one, you develop it in one
of the mainstream languages.
IDEs aren't interesting right now because OCaml is mainly used by
(computer & related) scientists (and I include scientists working for
companies outside academia). IDEs are nice for beginners and for people
who do not want to know what's happening inside. They are not
interesting for companies that invent completely new types of products,
because they've hired experts that can live without (and want to live
without).
Post by Jon Harrop
Xavier appears to have taken the biased sample of industrialists who
already use OCaml despite its limitations and has drawn the conclusion
that these limitations are not important to industrialists. I was really
horrified to see this because, in my experience, companies are turning
away from OCaml in droves because of exactly the limitations Xavier
enumerated and I for one would dearly love to see them fixed.
Which companies?
I fully understand that OCaml is not well-suited for the average
company. But it is not because of missing GUIs and IDEs, but because the
language itself is too ambitious. Sorry to say that, but this is not the
mainstream and it will never be.
(I have a good friend who works for an average company, so I know what
I'm talking about. They program business apps for a commercial platform
from CA. A horrible language, but they can manage it. They are experts
in the models they use, and simply take a platform from industry.)
Post by Jon Harrop
OCaml will continue to go from strength to strength regardless but its
uptake would be vastly faster if these problems are addressed. To take
them point by point:
. GUIs are incredibly important (LablGTK is the world's favorite OCaml
library!) and tens of thousands of OCaml programmers are crying out for
proper LablGTK documentation as a first priority, many of whom are in
industry.
See this as opportunity for your next book :-)
GTK is already poorly documented, so this is not only the problem of the
LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
real problem.
Post by Jon Harrop
. Rich libraries are incredibly important and OCaml has the potential to
become a hugely successful commercial platform where people can buy and
sell cross-platform libraries but OCaml needs support for shared run-time
DLLs (or something equivalent) before this can happen.
Do you dream or what?
I don't think that selling libraries in binary form is that important...
It is difficult anyway to do that, and why do you expect you could be
successful in a niche language? As a customer I would demand to get the
source code - to lower the risks of the investment into a small
platform.
Yeah, I wouldn't be using Qt if there was no source code for it. Quite a few
times over the years I had to tweak away at the implementation details.

In fact, I would never specify *any* mission-critical libraries or frameworks
if they didn't come with full sources.

Cheers, Kuba

Kuba Ober
2008-01-18 05:39:48 UTC
Permalink
Post by Kuba Ober
Post by Gerd Stolpmann
Post by Jon Harrop
. Rich libraries are incredibly important and OCaml has the potential
to become a hugely successful commercial platform where people can buy
and sell cross-platform libraries but OCaml needs support for shared
run-time DLLs (or something equivalent) before this can happen.
Do you dream or what?
I don't think that selling libraries in binary form is that important...
It is difficult anyway to do that, and why do you expect you could be
successful in a niche language? As a customer I would demand to get the
source code - to lower the risks of the investment into a small
platform.
Yeah, I wouldn't be using Qt if there was no source code for it. Quite a
few times over the years I had to tweak away at the implementation details.
In fact, I would never specify *any* mission-critical libraries or
frameworks if they didn't come with full sources.
In other words, Jon: if you tried to sell me source-code-less libraries, I
simply wouldn't buy, and no amount of persuading could change that. I'd still
keep buying your books, though :)

Just look at what happened to scores of Delphi and OCX controls which became
abandonware, and how much of this stuff eventually had to be simply
reimplemented by the same people who originally bought the controls so as not
to have to implement them in the first place. I detest closed-source controls
and libraries; I simply don't use them. The whole idea of "here's the OCX and a
typelib, and a help file, take it or leave it" is preposterous. Well, maybe
it's fine if you're being contracted for a one-off job where the client has no
clue, and your morals don't seem to interfere -- sure then you can reuse all
the source-less crap you want. But as a part of a long term strategy? No way.

If there was one decision Trolls made right, it was to include the source
code.

Cheers, Kuba


Jacques GARRIGUE
2008-01-16 03:26:27 UTC
Permalink
Post by Jon Harrop
Post by Jacques Garrigue
Post by Jon Harrop
Post by Jacques Garrigue
Unfortunately, this would make marshalling between different programs
much more complicated...
Do people marshal polymorphic variants between different programs?
Do people marshal data between different programs (or different
versions of the same program)?
I suspect OCaml's marshalling is used almost entirely between same
versions of the same programs.
I'm not so sure. Actually, I do it all the time when recompiling
ocaml. Otherwise I would have to bootstrap after any modification in
the compiler. Fortunately, this is not the case, and one only needs to
bootstrap when the data structures are modified (or semantics changed).
Post by Jon Harrop
In particular, I was advised against marshalling data between different
versions of the same program because this is unsafe (not just type
safety but the format used by Marshal is not ossified).
Marshalling data between different versions of the same program is ok,
but you're on your own concerning compatibility. You must be careful
concerning changes in ocaml versions, but I don't remember any change
in representation, and if one were to happen it would be amply
documented.
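To make the caveat concrete, here is a minimal sketch of a round trip through
Marshal. The type shape and the function roundtrip are made up for
illustration; only Marshal.to_string and Marshal.from_string are standard
library calls. Marshal stores no type information, so the reader has to state
the type it expects, which is why compatibility between versions is left to the
programmer.

```ocaml
(* Hypothetical type, for illustration only. *)
type shape = [ `Circle of float | `Square of float ]

(* Serialize to a string and read it back. Marshal records no type
   information, so the result annotation is trusted, not checked. *)
let roundtrip (v : shape list) : shape list =
  Marshal.from_string (Marshal.to_string v []) 0

let () =
  let v = [ `Circle 1.0; `Square 2.0 ] in
  assert (roundtrip v = v)
```

If the reading program declared shape differently, the annotation would still
be accepted and the mismatch would only show up later, if at all.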
Post by Jon Harrop
Post by Jacques Garrigue
Post by Jon Harrop
So the advantage of a decision tree is probably insignificant on real
code because it will lie between these two extremes.
Since the goal was never to be faster than ordinary variants, but just
obtain comparable speed, this seems good :-)
Yes. This would probably also work ok if you used a symbol table to store
exact identifier names rather than just a hash. The symbol's index in the
table would serve the same purpose as the hash.
No, because in order to produce efficient code you have to know the
hash at compile time, and in your scheme you only know it at link time
or runtime.
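For reference, the hash is a function of the tag's name alone, which is what
lets the compiler compute it without seeing any other module. The sketch below
is modelled on the hash_variant function from the compiler sources; treat that
correspondence as an assumption to check against your OCaml version. It assumes
a 64-bit platform, and the tag names in the example are arbitrary.

```ocaml
(* Hash of a polymorphic variant tag name: radix-223 value of the
   characters, reduced to 31 bits and then sign-adjusted.
   Assumes a 64-bit platform, where 1 lsl 31 does not overflow. *)
let hash_variant (s : string) : int =
  let accu = ref 0 in
  for i = 0 to String.length s - 1 do
    accu := 223 * !accu + Char.code s.[i]
  done;
  (* Keep the low 31 bits. *)
  accu := !accu land ((1 lsl 31) - 1);
  (* Map to the signed 31-bit range. *)
  if !accu > 0x3FFFFFFF then !accu - (1 lsl 31) else !accu

let () =
  List.iter
    (fun tag -> Printf.printf "`%s -> %d\n" tag (hash_variant tag))
    [ "Foo"; "Bar"; "Eric_Cooper" ]
```

Because the result depends only on the string, two separately compiled programs
agree on the value of a tag, whereas an index into a symbol table would depend
on what else had been linked or loaded.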
Post by Jon Harrop
OpenGL has an extension mechanism that can be queried at
run-time. If a given extension is available then you can do things
that you could not do before, such as pass a GLenum to a function
that might not have accepted it without the extension.
Post by Jacques Garrigue
Since LablGL was coded by hand, adding extensions would mean modifying
it.
Exactly, that is a limitation of LablGL's design and, therefore, I think it
was quite wrong of you to claim "LablGL shows is that in practice only a
small number of tags are used together" when LablGL's use of small, closed
sum types is actually a design limitation that would not be there if it
supported all of OpenGL, i.e. the extension mechanism.
I don't see your point. Even with the extension mechanism, extra
GLenum's are still only allowed for some specific functions. So you
can still define some subsets of GLenum that should be conflict free,
you don't need to prohibit all conflicts in GLenum. This is what I
mean by lablGL's design.

The problem with lablGL and extensions is the implementation, not the
API design. What we would need is some kind of AOP approach to the
stubs, where you could describe what functions are extended by which
extensions.
Post by Jon Harrop
Incidentally, Xavier made a statement based upon what appears to me to be a
"On the other hand, certain features seem somewhat unsurprisingly to be
unimportant to industrial users. GUI toolkits are not an issue, because GUIs
tend to be built using more mainstream tools; it seems that different
competencies are involved in Caml and GUI development and companies "don't
want to squander their precious Caml expertise aligning pixels". Rich
libraries don't seem to matter in general; presumably companies are happy to
develop these in-house. And no-one wants yet another IDE; the applications of
interest are usually built using a variety of languages and tools anyway, so
consistency of development environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
Xavier appears to have taken the biased sample of industrialists who already
use OCaml despite its limitations and has drawn the conclusion that these
limitations are not important to industrialists. I was really horrified to
see this because, in my experience, companies are turning away from OCaml in
droves because of exactly the limitations Xavier enumerated and I for one
would dearly love to see them fixed.
I don't agree with all these points (otherwise I wouldn't be
maintaining a GUI toolkit), but there is some truth in it. I actually
got similar reactions from industry in Japan, if for different
reasons: they don't need the GUI, because they prefer to do it
themselves, to differentiate from others. People doing in-house
programming have a different point of view. I remember somebody from a
bank who told me he wrote a program to be used in all their branches
using labltk. In this case you don't need anything flashy, it just has
to be functional (err, to work).

Concerning IDEs, since eclipse is more and more used, good support
for it seems a must. But you won't have me use anything other than
emacs and ocamlbrowser!
Post by Jon Harrop
Post by Jacques Garrigue
Also, one might want to make code generation automatic, particularly
for C wrappers, to allow adding cases to functions easily. This should
be doable, but there is no infrastructure for that currently
(using CPP macros was simpler to start with...)
Yes. A better FFI could also be enormously beneficial. Improving
upon OCaml's FFI is one of the most alluring aspects of a
reimplementation on LLVM, IMHO.
The current FFI works well, but it's true that the way it cuts the
work in small pieces (stubs in C on one side, externals on the other)
makes it difficult to automate its use. In my experience it is very
flexible, but badly lacks abstraction.

Jacques Garrigue

Yaron Minsky
2008-01-16 03:34:54 UTC
Permalink
Post by Jacques GARRIGUE
I'm not so sure. Actually, I do it all the time when recompiling
ocaml. Otherwise I would have to bootstrap after any modification in
the compiler. Fortunately, this is not the case, and one only needs to
bootstrap when the data structures are modified (or semantics changed).
I agree. We quite often use marshal to share data between different
programs that share a common library.
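
A minimal sketch of that kind of sharing, assuming both programs are compiled against the same type definition. The type `shape` and the functions `save` and `load` are hypothetical names for illustration. Note that `Marshal.from_string` is not type-safe: the annotation on `load` is a promise made by the programmer, not something the runtime checks.

```ocaml
(* Hypothetical type that would live in the library shared by both programs. *)
type shape = [ `Circle of float | `Square of float ]

(* Writer side: serialise a value to a string (could equally be a channel). *)
let save (v : shape) : string = Marshal.to_string v []

(* Reader side: the type annotation is trusted, not verified. *)
let load (s : string) : shape = Marshal.from_string s 0

let () =
  match load (save (`Circle 1.5)) with
  | `Circle r -> Printf.printf "circle %g\n" r
  | `Square a -> Printf.printf "square %g\n" a
```

Because a polymorphic variant tag is represented by a hash of its name, two separately compiled programs agree on the representation as long as they agree on the tag names.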
Post by Jacques GARRIGUE
I don't agree with all these points (otherwise I wouldn't be
maintaining a GUI toolkit), but there is some truth in it. I actually
got similar reactions from industry in Japan, if for different
reasons: they don't need the GUI, because they prefer to do it
themselves, to differentiate from others. People doing in-house
programming have a different point of view. I remember somebody from a
bank who told me he wrote a program to be used in all their branches
using labltk. In this case you don't need anything flashy, it just has
to be functional (err, to work).
We started out doing entirely back-end processes using OCaml, but as time
went on, we started building more and more GUIs. The fact that OCaml has
lablgtk makes it much more useful for us, without a doubt. The main reason
we like to do GUIs in OCaml is that we see a lot of value in sharing type
definitions and code between the GUIs and the back-end services they connect
to.

y
Jon Harrop
2008-01-16 03:42:20 UTC
Permalink
Post by Yaron Minsky
We started out doing entirely back-end processes using OCaml, but as time
went on, we started building more and more GUIs. The fact that OCaml has
lablgtk makes it much more useful for us, without a doubt. The main reason
we like to do GUIs in OCaml is that we see a lot of value in sharing type
definitions and code between the GUIs and the back-end services they
connect to.
Yes, this is exactly the kind of thing I was referring to. I think a lot of
people want simple GUIs that are perfectly feasible to construct entirely in
OCaml and the overhead of splitting a project across languages is much
higher. Fortunately, LablGTK makes this feasible in OCaml.

There must be some reason why LablGTK is so popular! ;-)
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Jon Harrop
2008-01-16 04:40:09 UTC
Permalink
Post by Jacques GARRIGUE
Post by Jon Harrop
I suspect OCaml's marshalling is used almost entirely between same
versions of the same programs.
I'm not so sure. Actually, I do it all the time when recompiling
ocaml. Otherwise I would have to bootstrap after any modification in
the compiler. Fortunately, this is not the case, and one only needs to
bootstrap when the data structures are modified (or semantics changed).
Interesting.
Post by Jacques GARRIGUE
Post by Jon Harrop
Yes. This would probably also work ok if you used a symbol table to store
exact identifier names rather than just a hash. The symbol's index in the
table would serve the same purpose as the hash.
No, because in order to produce efficient code you have to know the
hash at compile time, and in your scheme you only know it at link time
or runtime.
You could still use the same hashing scheme but you could fall back to linear
search of symbols by name in the event of a clash.
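
The fallback idea can be sketched as a table of buckets keyed by hash, where each bucket keeps the exact names. This is only an illustration of the proposal, not how OCaml works: `table`, `add`, `find` and `weak_hash` are hypothetical names, and the lookup happens at run time, which is exactly the cost the objection above is about.

```ocaml
module IntMap = Map.Make (struct
  type t = int
  let compare = compare
end)

(* Each hash maps to the exact names that produced it. *)
type table = string list IntMap.t

let add hash name (t : table) : table =
  let bucket =
    match IntMap.find_opt (hash name) t with Some b -> b | None -> []
  in
  if List.mem name bucket then t else IntMap.add (hash name) (name :: bucket) t

(* Look up by hash first, then search the bucket linearly by name.
   The bucket holds more than one name only when hashes clash. *)
let find hash name (t : table) : string option =
  match IntMap.find_opt (hash name) t with
  | None -> None
  | Some bucket -> List.find_opt (fun n -> n = name) bucket

(* A deliberately weak stand-in hash so that clashes actually occur. *)
let weak_hash = String.length

let () =
  let t = IntMap.empty |> add weak_hash "foo" |> add weak_hash "bar" in
  assert (find weak_hash "bar" t = Some "bar");
  assert (find weak_hash "baz" t = None)
```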
Post by Jacques GARRIGUE
Post by Jon Harrop
Exactly, that is a limitation of LablGL's design and, therefore, I think
it was quite wrong of you to claim "LablGL shows is that in practice
only a small number of tags are used together" when LablGL's use of
small, closed sum types is actually a design limitation that would not be
there if it supported all of OpenGL, i.e. the extension mechanism.
I don't see your point. Even with the extension mechanism, extra
GLenum's are still only allowed for some specific functions. So you
can still define some subsets of GLenum that should be conflict free,
you don't need to prohibit all conflicts in GLenum. This is what I
mean by lablGL's design.
Provided you can enumerate which tags can be used with which functions
including the presence of extensions, yes. I suppose that would be possible
and you could end up with many small sets of tags and much less chance of
clashing.
Post by Jacques GARRIGUE
The problem with lablGL and extensions is the implementation, not the
API design. What we would need is some kind of AOP approach to the
stubs, where you could describe what functions are extended by which
extensions.
I think it would be better to remove all complexity from the C stubs, have
them all autogenerated and then write a higher-level API on top entirely in
OCaml. GLCaml is the start of a good foundation for OpenGL, IMHO. I think it
would be very productive to merge the projects at some point.
Post by Jacques GARRIGUE
...
I don't agree with all these points (otherwise I wouldn't be
maintaining a GUI toolkit), but there is some truth in it. I actually
got similar reactions from industry in Japan, if for different
reasons: they don't need the GUI, because they prefer to do it
themselves, to differentiate from others. People doing in-house
programming have a different point of view. I remember somebody from a
bank who told me he wrote a program to be used in all their branches
using labltk. In this case you don't need anything flashy, it just has
to be functional (err, to work).
Concerning IDEs, since eclipse is more and more used, good support
for it seems a must. But you won't have me use anything other than
emacs and ocamlbrowser!
Visual Studio's Intellisense makes GUI programming much easier in F# than
ocamlbrowser+ocaml. I think the single most productive thing that could be
added to ocamlbrowser is hyperlinks from the quoted definitions to all
related definitions.

Now that I come to think of it, you can just run ocamldoc on the LablGTK
sources and use a browser to do that. Is the ocamldoc HTML output for the
latest LablGTK2 on the web anywhere?
Post by Jacques GARRIGUE
Post by Jon Harrop
Yes. A better FFI could also be enormously beneficial. Improving
upon OCaml's FFI is one of the most alluring aspects of a
reimplementation on LLVM, IMHO.
The current FFI works well, but it's true that the way it cuts the
work in small pieces (stubs in C on one side, externals on the other)
makes it difficult to automate its use. In my experience it is very
flexible, but badly lacks abstraction.
What sorts of abstractions would you like?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Eric Cooper
2008-01-16 16:03:04 UTC
Permalink
Post by Jon Harrop
Is the ocamldoc HTML output for the latest LablGTK2 on the web anywhere?
In Debian, it's included in the liblablgtk2-ocaml-dev package in
/usr/share/doc/liblablgtk2-ocaml-dev/html/api/, and similarly for the
other OCaml -dev packages.
--
Eric Cooper e c c @ c m u . e d u

Richard Jones
2008-01-16 10:50:36 UTC
Permalink
Post by Jon Harrop
. GUIs are incredibly important (LablGTK is the world's favorite OCaml
library!) and tens of thousands of OCaml programmers are crying out for
proper LablGTK documentation as a first priority, many of whom are in
industry.
GTK itself is horribly undocumented. However SooHyoung Oh has done an
excellent job translating the C-based GTK 2.0 tutorial into OCaml,
here:

http://plus.kaist.ac.kr/~shoh/ocaml/lablgtk2/lablgtk2-tutorial/
Post by Jon Harrop
. Rich libraries are incredibly important and OCaml has the
potential to become a hugely successful commercial platform where
people can buy and sell cross-platform libraries but OCaml needs
support for shared run-time DLLs (or something equivalent)
before this can happen.
My requirement is similar to this: (1) to be able to take OCaml
libraries and automatically generate C bindings from them (ie.
translate the OCaml .mli file into a .h file, and generate stubs).
(2) to be able to ship the library as a DLL / .so file. Efficiency is
not so much of a concern for me - eg. if the generated stubs worked by
copying all strings passed, that would be OK for my requirements.

I actually did a little bit of work on a stub/wrapper generator, and I
think it is possible to implement it, especially now that ocamlopt can
generate PIC.

Rich.
--
Richard Jones
Red Hat

Jon Harrop
2008-01-14 17:14:39 UTC
Permalink
Post by Stefan Monnier
Post by Kuba Ober
What I meant was simply that instead of using some fixed hash function,
one could use a perfect hashing function which is optimal for its known
set of inputs, and won't ever generate a collision.
The problem is that the set of inputs is not known at compile time, only
at link time.
Yes. I think this is another case where OCaml would really benefit from a
symbol table and this is something else that seems much easier to do with JIT
compilation.

Also, what happens if you try to dynamically load two libraries that use
polymorphic variants that clash?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Alain Frisch
2008-01-14 17:36:16 UTC
Permalink
Post by Jon Harrop
Also, what happens if you try to dynamically load two libraries that use
polymorphic variants that clash?
AFAIK, this is ok. The problematic clashes can always be detected at
type-checking time.

-- Alain

Jacques Garrigue
2008-01-11 00:15:19 UTC
Permalink
Post by Jon Harrop
ISTR advice that constructors sharing the first few characters should be
avoided in order to reduce the likelihood of clashing hash values for
polymorphic variants. Is that right?
Not at all. If the first characters are identical it just means that an
identical value will be added to the hashes of the suffixes, which
actually means that you lower the probability of getting conflicts :-)
The hash function guarantees that all keys of strictly fewer than 5
characters will map to different hash values.
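
A sketch of the hash, written from memory of the compiler sources and therefore an assumption rather than a quotation: the tag name is read as a number in radix 223 and reduced modulo 2^31, which matches the description given earlier in the thread. The code assumes a 64-bit OCaml, and `diff_mod` is a hypothetical helper used only to illustrate the point about shared prefixes for suffixes of equal length.

```ocaml
let mask = (1 lsl 31) - 1

let hash_variant (s : string) : int =
  let accu = ref 0 in
  for i = 0 to String.length s - 1 do
    (* Treat the name as a number written in radix 223. *)
    accu := (223 * !accu) + Char.code s.[i]
  done;
  (* Keep only the low 31 bits. *)
  accu := !accu land mask;
  (* Reinterpret the result as a signed 31-bit value. *)
  if !accu > 0x3FFFFFFF then !accu - (1 lsl 31) else !accu

(* Difference of two tag hashes, as a residue modulo 2^31. *)
let diff_mod a b = (hash_variant a - hash_variant b) land mask

let () =
  (* A common prefix shifts both hashes by the same amount, so the
     difference between the hashes is unchanged modulo 2^31. *)
  assert (diff_mod "GL_ab" "GL_cd" = diff_mod "ab" "cd");
  Printf.printf "%d %d\n" (hash_variant "GL_POINTS") (hash_variant "GL_LINES")
```

Two tags clash exactly when their difference is zero modulo 2^31, so adding the same prefix to two equal-length names cannot create a clash that the bare names did not already have.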

The probability of getting clashes being really low, you should not be
concerned by this. Just check afterwards. A simple way to do it is to
produce a big type containing all the tags, and feed it to ocamlc.
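
One way to produce such a type, as a minimal sketch: the function `big_type` and the type name `all_tags` are hypothetical, and the tag list would come from whatever generates the bindings. The output is meant to be written to a file and compiled with ocamlc, which should reject the type if two of its tags share a hash value.

```ocaml
(* Build the source text of one polymorphic variant type listing every tag. *)
let big_type (tags : string list) : string =
  "type all_tags = [ "
  ^ String.concat " | " (List.map (fun t -> "`" ^ t) tags)
  ^ " ]\n"

let () = print_string (big_type [ "Points"; "Lines"; "Triangles" ])
```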
Post by Jon Harrop
I'm interested in automatically translating the GL_* enum from
OpenGL into polymorphic variants. So although it is generated code I
have little control over it, e.g. I cannot change the translation as
OpenGL gets extended because code will already be using the existing
names.
In the event you get a conflict when OpenGL is extended, you can still
add a special case for the newly added tags. I hope this does not
happen, but the birthday theorem tells you that when you get enough
participants, clashes are hard to avoid.
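
To put a rough number on the birthday argument, here is a sketch that assumes the hash spreads tag names uniformly over 2^31 values. That is an idealisation, not a measured property of the real hash, and `clash_probability` is a hypothetical helper using the usual approximation 1 - exp(-n(n-1) / (2 * 2^31)).

```ocaml
(* Approximate probability that at least two of n tags share a hash,
   assuming hashes are uniform over 2^31 values. *)
let clash_probability (n : int) : float =
  let space = 2.0 ** 31.0 in
  let n = float_of_int n in
  1.0 -. exp (-.(n *. (n -. 1.0)) /. (2.0 *. space))

let () =
  List.iter
    (fun n -> Printf.printf "%6d tags: %.6f\n" n (clash_probability n))
    [ 100; 1_000; 10_000; 100_000 ]
```

Under that assumption, a thousand tags clash with a probability of roughly 0.02%, while the probability grows quickly once the number of tags reaches the tens of thousands.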

Cheers,

Jacques Garrigue
