Why can't I use constructors as functions?

Discussion:

Krishnaswami, Neel

2001-07-30 18:51:07 UTC

Hello,

I'm curious as to the reason why I can't use a datatype constructor
as a function. Eg, in SML I can write a function like this:

datatype peano = Zero | Succ of peano

fun fold succ zero n =
case n of
Zero => zero
| (Succ n') => succ (fold n')

fun add a b = fold Succ a b (* Use the Succ constructor as a funtion *)

If I try something similar in Caml,

type peano = Zero | Succ of peano

let rec fold succ zero n =
match n with
| Zero -> zero
| Succ(n') -> succ (fold succ zero n')

I can't write an add function like I can in SML:

# let add a b = fold (Succ) a b;;
Characters 20-24:
The constructor Succ expects 1 argument(s),
but is here applied to 0 argument(s)

Instead I need to wrap it in a function:

# let add a b = fold (fun x -> Succ x) a b
val add : peano -> peano -> peano = <fun>

Why was this design choice made?

--
Neel Krishnaswami
***@cswcasa.com

-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-***@inria.fr Archives: http://caml.inria.fr

Xavier Leroy

2001-08-03 08:16:51 UTC

Permalink

Post by Krishnaswami, Neel
I'm curious as to the reason why I can't use a datatype constructor
fun add a b = fold Succ a b (* Use the Succ constructor as a funtion *)
If I try something similar in Caml,
# let add a b = fold (fun x -> Succ x) a b

The old Caml V3.1 implementation treated constructors as functions like SML.
In Caml Light, I chose to drop this equivalence for several reasons:

- Simplicity of the compiler. Internally, constructors are not
functions, and a special case is needed to transform Succ into
(fun x -> Succ x) when needed. This isn't hard, but remember that
Caml Light was really a minimal, stripped-down version of Caml.

- Constructors in Caml Light and OCaml really have an arity, e.g.
C of int * int is really a constructor with two integer arguments,
not a constructor taking one argument that is a pair. Hence, there
would be two ways to map the constructor C to a function:
fun (x,y) -> C(x,y)
or
fun x y -> C(x,y)
The former is more natural if you come from an SML background
(where constructors have 0 or 1 argument), but the latter fits better
the Caml Light / OCaml execution model, which favors curried
functions. By not treating constructors like functions, we avoid
having to choose...

- Code clarity. While using a constructor as a function is sometimes
convenient, I would argue it is often hard to read. Writing
"fun x -> Succ x" is more verbose, but easier to read, I think.

- Xavier Leroy
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-***@inria.fr Archives: http://caml.inria.fr

Andreas Rossberg

2001-08-03 09:19:04 UTC

Permalink

This post might be inappropriate. Click to display it.

Markus Mottl

2001-08-03 10:00:17 UTC

Permalink

Post by Xavier Leroy
- Constructors in Caml Light and OCaml really have an arity, e.g.
C of int * int is really a constructor with two integer arguments,
not a constructor taking one argument that is a pair. Hence, there
fun (x,y) -> C(x,y)
or
fun x y -> C(x,y)

Why? To me only the latter seems to be consistent with the
constructor. I'd expect the first form in the case of "C of (int * int)",
which is indeed represented differently to "C of int * int".

Actually, this proves that we already have ambiguity, e.g.:

C (1, 2)

How does the definition of this variant look like?

C of int * int

or

C of (int * int)

?

Nobody can tell...

Therefore, I'd rather propose that it be required to write:

C 1 2

if the definition is "C of int * int". I know, this would break a lot
(maybe almost all) code, but could be automatically transformed if
required. Maybe the choice of the type constructor for pairs "*" wasn't
so good: people really confuse this with tuples. Another symbol would
seem more approriate ("&", "^", ...?).

The language would seem much more regular to me if functions and
constructors were treated in a similar way. Would this be too big a
change to the core language?

Regards,
Markus Mottl

--
Markus Mottl ***@oefai.at
Austrian Research Institute
for Artificial Intelligence http://www.oefai.at/~markus
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-***@inria.fr Archives: http://caml.inria.fr

Marcin 'Qrczak' Kowalczyk

2001-08-05 12:57:06 UTC

Permalink

Post by Markus Mottl
The language would seem much more regular to me if functions and
constructors were treated in a similar way. Would this be too big a
change to the core language?

It's a change to the syntax, not to the core language. Revised syntax
uses curried constructors, with declaration of a constructor looking
like this:
... | C of int and string | ...

--
__("< Marcin Kowalczyk * ***@knm.org.pl http://qrczak.ids.net.pl/
\__/
^^ SYGNATURA ZASTĘPCZA
QRCZAK

-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-***@inria.fr Archives: http://caml.inria.fr

Xavier Leroy

2001-08-06 08:14:37 UTC

Permalink

| The old Caml V3.1 implementation treated constructors as functions like
I think you're forgetting that at some stage you must have re-introduced
it.

You're correct, the "constructor as functions" was implemented in Caml
Light version 0.7. Tremendous psychological pressure must have been
exerted on me to make me implement this, causing me to forget all
about it afterwards :-)

I would be happy to have constructors curried or uncurried, but I don't
see that exposing a distinct notion of arity serves any useful purpose.

It serves the following purpose: all implementations want to represent
specially a constructor that takes a tuple of arguments, e.g.
C of int * int
representing it as one block tagged C with two integer fields, rather
than the "normal" representation as a block tagged C containing a
pointer to a block representing the pair. This optimization is
crucial both for memory size (cuts down memory use by a factor of 5/3)
and for speed.

In combination with modules, it's very hard to perform this
representation trick transparently, i.e. as a compiler optimization.
The reason is that a module implementation can define

type t = C of int * int
type u = int * int
let f x = (* code assuming that C(x,y) is represented as 1 block *)

yet its interface can abstract over u:

type u
type t = C of u
val f : ...

and clients of the module assume a different representation for C,
namely that it has only one field containing "the" argument of C.
Some form of representation coercion or run-time type inspection is
necessary to deal with this case, and it can be extremely complex
(see the TIL compiler for an example).

By exposing a notion of arity for constructors, we prevent the module
implementation above from matching the module signature "type t = C of u",
thus working around the problem.

More generally, I think the representation trick for constructors
taking several arguments is so crucial and so hard to perform as a
transparent optimization that it deserves to be exposed in the
language as a primitive concept of "constructor with arity".

Isn't it conceptually simpler to have constructors behave as regular
objects as much as possible? (Of course, not in pattern matching etc.)

As you said, constructors already have a special status in
pattern-matching, so it doesn't seem much more conceptually difficult
to treat them as a primitive concept distinct (and essentially
orthogonal) from functions.

- Xavier Leroy
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-***@inria.fr Archives: http://caml.inria.fr