The short answer is that I'm most directly referring to "the types in
the programmer's head".
In the database theory world, we speak of three levels: conceptual,
logical, physical. In a dbms, these might roughly be compared to
business entities described in a requirements doc, (or in the
business owners' heads), the (logical) schema encoded in the
dbms, and the b-tree indicies the dbms uses for performance.
So when you say "latent types", I think "conceptual types."
That sounds plausible, but of course at some point we need to pick a
term and attempt to define it. What I'm attempting to do with "latent
types" is to point out and emphasize their relationship to static types,
which do have a very clear, formal definition. Despite some people's
skepticism, that definition gives us a lot of useful stuff that can be
applied to what I'm calling latent types.
The thing about this is, both the Lisp and the Haskell programmers
are using conceptual types (as best I can tell.)
Well, a big difference is that the Haskell programmers have language
implementations that are clever enough to tell them, statically, a very
precise type for every term in their program. Lisp programmers usually
don't have that luxury. And in the Haskell case, the conceptual type
and the static type match very closely.
There can be differences, e.g. a programmer might have knowledge about
the input to the program or some programmed constraint that Haskell
isn't capable of inferring, that allows them to figure out a more
precise type than Haskell can. (Whether that type can be expressed in
Haskell's type system is a separate question.)
In that case, you could say that the conceptual type is different than
the inferred static type. But most of the time, the human is reasoning
about pretty much the same types as the static types that Haskell
infers. Things would get a bit confusing otherwise.
And also, the
conceptual/latent types are not actually a property of the
program;
That's not exactly true in the Haskell case (or other languages with
static type inference), assuming you accept the static/conceptual
equivalence I've drawn above.
Although static types are often not explicitly written in the program in
such languages, they are unambiguously defined and automatically
inferrable. They are a static property of the program, even if in many
cases those properties are only implicit with respect to the source
code. You can ask the language to tell you what the type of any given
term is.
they are a property of the programmer's mental
model of the program.
That's more accurate. In languages with type inference, the programmer
still has to figure out what the implicit types are (or ask the language
to tell her).
You won't get any argument from me that this figuring out of implicit
types in a Haskell program is quite similar to what a Lisp, Scheme, or
Python programmer does. That's one of the places the connection between
static types and what I call latent types is the strongest.
(However, I think I just heard a sound as though a million type
theorists howled in unison.[*])
[*] most obscure inadvertent pun ever.
with or without static analysis
with or without runtime type information (RTTI or "tags")
with or without (runtime) safety
with or without explicit type annotations
with or without type inference
Wow. And I don't think that's a complete list, either.
Yup.
I would be happy to abandon "strong/weak" as terminology
because I can't pin those terms down. (It's not clear what
they would add anyway.)
I wasn't following the discussion earlier, but I agree that strong/weak
don't have strong and unambiguous definitions.
Uh, oh, a new term, "manifest." Should I worry about that?
Well, people have used the term "manifest type" to refer to a type
that's explicitly apparent in the source, but I wouldn't worry about it.
I just used the term to imply that at some point, the idea of "latent
type" has to be converted to something less latent. Once you explicitly
identify a type, it's no longer latent to the entity doing the identifying.
But if we ask Javascript what it thinks the type of timestwo is, by
evaluating "typeof timestwo", it returns "function". That's because the
value bound to timestwo has a tag associated with it which says, in
effect, "this value is a function".
Well, darn. It strikes me that that's just a decision the language
designers
made, *not* to record complete RTTI.
No, there's more to it. There's no way for a dynamically-typed language
to figure out that something like the timestwo function I gave has the
type "number -> number" without doing type inference, by examining the
source of the function, at which point it pretty much crosses the line
into being statically typed.
(Is it going to be claimed that
there is an *advantage* to having only incomplete RTTI? It is a
serious question.)
More than an advantage, it's difficult to do it any other way. Tags are
associated with values. Types in the type theory sense are associated
with terms in a program. All sorts of values can flow through a given
term, which means that types can get complicated (even in a nice clean
statically typed language). The only way to reason about types in that
sense is statically - associating tags with values at runtime doesn't
get you there.
This is the sense in which the static type folk object to the term
"dynamic type" - because the tags/RTTI are not "types" in the type
theory sense.
Latent types as I'm describing them are intended to more closely
correspond to static types, in terms of the way in which they apply to
terms in a program.
Hmmm. Another place where the static type isn't the same thing as
the runtime type occurs in languages with subtyping.
Yes.
Question: if a language *does* record complete RTTI, and the
languages does *not* have subtyping, could we then say that
the runtime type information *is* the same as the static type?
You'd still have a problem with function types, as I mentioned above,
as well as expressions like the conditional one I gave:
(flag ? 5 : "foo")
The problem is that as the program is running, all it can ever normally
do is tag a value with the tags obtained from the path the program
followed on that run. So RTTI isn't, by itself, going to determine
anything similar to a static type for such a term, such as "string |
number".
One way to think about "runtime" types is as being equivalent to certain
kinds of leaves on a statically-typed program syntax tree. In an
expression like "x = 3", an inferring compiler might determine that 3 is
of type "integer". The tagged values which RTTI uses operate at this
level: the level of individual values.
However, as soon as you start dealing with other kinds of nodes in the
syntax tree -- nodes that don't represent literal values, or compound
nodes (that have children) -- the possibility arises that the type of
the overall expression will be more complex than that of a single value.
At that point, RTTI can't do what static types do.
Even in the simple "x = 3" case, a hypothetical inferring compiler might
notice that elsewhere in the same function, x is treated as a floating
point number, perhaps via an expression like "x = x / 2.0". According
to the rules of its type system, our language might determine that x has
type "float", as opposed to "number" or "integer". It might then either
treat the original "3" as a float, or supply a conversion when assigning
it to "x". (Of course, some languages might just give an error and
force you to be more explicit, but bear with me for the example - it's
after 3:30am again.)
Compare this to the dynamically-typed language: it sees "3" and,
depending on the language, might decide it's a "number" or perhaps an
"integer", and tag it as such. Either way, x ends up referring to that
tagged value. So what type is x? All you can really say, in the RTTI
case, is that x is a number or an integer, depending on the tag the
value has. There's no way it can figure out that x should be a float at
that point.
Of course, further down when it runs into the code "x = x / 2.0", x
might end up holding a value that's tagged as a float. But that only
tells you the value of x at some point in time, it doesn't help you with
the static type of x, i.e. a type that either encompasses all possible
values x could have during its lifetime, or alternatively determines how
values assigned to x should be treated (e.g. cast to float).
BTW, it's worth noting at this point, since it's implied by the last
paragraph, that a static type is only an approximation to the values
that a term can have during the execution of a program. Static types
can (often!) be too conservative, describing possible values that a
particular term couldn't actually ever have. This gives another hint as
to why RTTI can't be equivalent to static types. It's only ever dealing
with the concrete values right in front of it, it can't see the bigger
static picture.
Anton