Post by tm
[snip]
Post by BGB
what it is they are working on at the moment;
which technologies are already commonly in use;
which technologies they are most personally familiar with;
how the people around them may feel about their choices (what will their
coworkers/boss/friends/... think? how will it reflect back on
themselves? what if the people around them don't really like it? ...);
...
new and unfamiliar or unorthodox solutions may be avoided or regarded as
undesirable in many contexts...
You are right, but: The better is the enemy of the good.
Not that Seed7 is better than Java in every aspect.
But the extensibility of Seed7 will allow it to acquire features
that other (non-extensible) languages cannot easily acquire.
Many see extensibility as unimportant or undesirable, but when you
look at programming language history you see a trend towards it:
- In early BASIC variables were single letters or one letter
followed by one digit (these variables were predefined and other
names were not allowed).
- Old BASIC and FORTRAN did not support user defined data
structures.
- In early Pascal a programmer was not able to define a function
like write (which was overloaded for various types and supported
the formatting operator, e.g. write(i:5); ).
- Many of the oldest languages do not support overloading of
functions and operators.
- Older languages supporting overloading do not allow user defined
operator symbols.
Many languages have predefined features which are hardcoded in the
interpreter or compiler. The user is not allowed to define
similar features. The trend is towards allowing the user more and
more. This means that the barrier between predefined and user
defined things is moving.
yes, but this does not mean one should up and abandon older language
syntax and semantics for sake of unproven designs...
Many people talk about dream languages and dream features. Contrary
to such dream languages and dream features Seed7 works. Interpreter,
compiler, libraries, examples, documentation and much more can be
downloaded and tested. IMHO the implementation shows, that my
designs work, so they are not unproven. OTOH reality can never
compete with dreams...
Maybe you see the use of keywords instead of curly braces as
unproven. Statements with keywords can be found in many languages.
E.g.: Pascal, Ada, Modula2, Oberon, Eiffel, Python, Ruby and several
other languages use statements with keywords instead of curly
braces. Do you think that these languages use unproven designs?
It is really strange that you think that an extensible language
uses the wrong syntax and semantics. Seed7 is extensible, so it
is possible to change syntax and semantics.
AFAIK, apart from Python, none of these is particularly near the top of
the most-used-languages lists...
this may be informative:
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
if one simply takes the languages near the top of the list as being most
authoritative, they have much less chance of losing out or having
obscure syntax (or being unfamiliar to developers).
ok, admittedly, I also used JavaScript and ActionScript as reference
languages, but this is more because they are closer to the origins of
BGBScript and also to my particular usage domain.
Post by tm
What do you think about features like multiple dispatch or the
Seed7 templates: Using type parameters and functions which
return a type?
they are useful features...
my languages (BS and the BS2 language still in design) also have
first-class types.
multiple dispatch is not supported though.
What about user defined statements and operators?
neither is currently supported...
user defined operators (in a sense more general than, say, operator
overloading) would lead to issues with parsing, as either the parser
would have to be aware of each new operator, or have special syntax to
indicate them.
Seed7 has solved these issues. If you separate syntax parsing from
semantic parsing everything becomes simple. Seed7 uses syntax
http://seed7.sourceforge.net/manual/syntax.htm
looking at it, I would guess in your case, the parser *is* aware of any
new syntax?...
Yes, new syntax must be defined before it can be used.
In that sense the parser is aware of new syntax.
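To make the "must be defined before it can be used" point concrete, here is a minimal sketch in Python of an expression parser whose operator table can be extended at run time. It is not how Seed7 or BGBScript are implemented; the names OPERATORS, define_operator and evaluate are made up for this illustration, and the grammar is limited to non-negative integers and infix operators.

```python
import re

# Hypothetical operator table: symbol -> (priority, function).
# A higher priority binds tighter. Nothing here is Seed7's real mechanism.
OPERATORS = {
    "+": (1, lambda a, b: a + b),
    "*": (2, lambda a, b: a * b),
}

def define_operator(symbol, priority, func):
    """Register a new infix operator; must happen before it is parsed."""
    OPERATORS[symbol] = (priority, func)

def tokenize(text):
    # Try longer symbols first so a user-defined "**" wins over "*".
    symbols = sorted(OPERATORS, key=len, reverse=True)
    pattern = "|".join([r"\d+"] + [re.escape(s) for s in symbols])
    return re.findall(pattern, text)

def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def parse(min_priority):
        # Precedence climbing over whatever OPERATORS holds right now.
        nonlocal pos
        left = int(tokens[pos])
        pos += 1
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            priority, func = OPERATORS[tokens[pos]]
            if priority < min_priority:
                break
            pos += 1
            right = parse(priority + 1)  # treat every operator as left-associative
            left = func(left, right)
        return left

    return parse(0)
```

With only the two built-in entries, evaluate("1 + 2 * 3") gives 7. After define_operator("**", 3, pow), the same parser accepts "2 ** 3 * 2" without any change to its code. The parser itself never hardcodes an operator; it only consults the table, which is the sense in which it is "aware" of new syntax.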
yeah.
in my case, this presents a problem, since I don't want to force any
include-like features (these hurt parser and compiler performance, as
they force grinding through piles of irrelevant crap).
a similar problem exists for compile-time macros, as there is no real
good way to handle compiler-macros apart from either building code
directly in the context of the running system (say, typical of Lisp or
Scheme systems) or (somehow) including all of the macros prior to
compiling the code.
this was a problem I first ran into while implementing a static Scheme
compiler (many years ago), and I have not yet found an ideal solution to
this problem (hence no macros or similar are currently planned for
BGBScript2, at least if being built standalone, but may or may-not be
supported in a scripting context, or may be supported by a "compiler
plug-in" system...).
it is a similar issue to supporting "nlambda" (used in some Scheme
variants) where one could have a lambda representing a macro (and
creating a problem at compile time of knowing whether a given lambda
would be either a function or a macro).
technically, my parsers for both BGBScript and BGBScript2 (which is
basically just another special mode of my C/C#/Java parser) allow
registration of new syntax forms, only that these can't be done
in-language (but have to be done by registering callbacks with the
compiler).
Post by BGB
I meant it would be "aware" in the same sense that a C parser is aware
of typedefs, or a C++ parser of class and template definitions...
No. A typedef changes the semantics. Variable and function
declarations also change the semantics. C connects the syntax (how
to define a variable) with the semantics (which types were defined).
Several languages with hardcoded parsers have such connections.
IMHO syntax and semantics should be clearly separated. E.g.: The
parser should be able to recognize a declaration independent of the
type. This is easy when declarations are introduced with a keyword.
Several language designers have recognized this and therefore many
modern languages introduce declarations with a keyword instead of a
type name.
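A small sketch of the difference, in Python, assuming a toy token list rather than a real C grammar. The function names and the "var" keyword are hypothetical; the point is only that the C-style rule needs a table of known types to classify "a * b ;", while the keyword-style rule does not.

```python
def classify_c_style(tokens, known_types):
    """C-like rule: 'a * b ;' declares pointer b only if 'a' names a type.

    The parser therefore needs semantic information (the typedef table)
    just to decide what kind of statement it is looking at.
    """
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        return "declaration" if tokens[0] in known_types else "expression"
    raise ValueError("this sketch only handles the 'a * b ;' shape")

def classify_keyword_style(tokens):
    """Keyword rule: a statement is a declaration iff it starts with 'var'.

    No type table is consulted, so syntax and semantics stay separate.
    """
    return "declaration" if tokens[0] == "var" else "expression"
```

The same four tokens "a * b ;" come back as a declaration when known_types contains "a" and as an expression when it does not, whereas "var b : a ;" is classified without looking anything up.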
I am going the Java and C# route, whereby it can be statically
determined from the syntax what is a type, although it creates a few
restrictions:
no syntax forms can be defined which would create ambiguity as to
whether or not a given identifier represents a type;
some declaration forms which would work in C or C++ just will not work
(for example, I use a C#-like "delegate" system for declaring function
pointers, as the C/C++ syntax no longer works...).
I eventually ended up changing a few traditional syntax forms (such as
type-casts) to deal with the matter that the casts made a few conflicts
with other syntax forms (messing up my ability to support calling
expressions or support curried functions).
so, vs doing casts like:
y=(type)x;
they are done like:
y=x as type;
more like in ActionScript (I ripped the syntax from AS3).
mostly since I didn't want to be limited to using temporary variables to
do curried functions and similar.
so, unlike in C and C++, the BS2 parser remains independent of
declaration context, while still maintaining a mostly traditional syntax
cosmetic.
Post by BGB
I explicitly avoided this in my language, since it creates dependency
issues (the parser has to be aware of the declaration of a feature to
parse it), and also because it slows parsing (the compiler has to
endlessly check whether or not identifiers represent known typedefs, ...).
Exactly what I said.
Connecting syntax and semantics is not a good idea.
yes, ok.
Post by BGB
so, the current strategy is more to consider the use of more traditional
operator overloading (no built-in syntax for operator overloading
exists as of yet though, as the usual way for doing this sort
of thing in my VM in general is by registering callbacks and similar...).
Why should a VM know about operator overloading? This can and should
be resolved long before the VM runs.
operator overloading needs to know types;
I delay full type resolution until very late (typically during linking
or JIT), mostly so that it can deal more cleanly with information which
may not be visible in the scope until link-time or JIT.
Doesn't this mean you have to wait until linking or JIT to get error
messages about overloading problems?
currently, yes, but if the static compiler behaves similarly to MSVC, it
shouldn't be too much of an issue.
the static compiler would then compile all modules in an assembly, link
them, and emit warnings/errors for any type or overloading errors seen
at this stage (the static linker would probably also go and resolve most
type and class references, and possibly also support symbol-stripping).
note: my bytecode will handle references very differently from the JVM
and MSIL, where symbols need to remain present, but the symbol string
may be absent (reducing a symbol to an abstract handle).
Post by BGB
this is of great use to using a "magic sticky glue" system for
interfacing code in different programming languages.
Different programming languages may have totally different ideas
about types. E.g.: String representations differ heavily in many
languages. Structs may have additional fields to support dynamic
dispatch. So you will probably end with basic int and float types
and pointers to them. Even char * has a lot of issues since it
cannot hold binary information and the memory management must be
done outside.
note my use of the term "magic sticky glue":
in a lot of cases I (already) glue together different languages in ways
which may seem absurd, and use heuristic approaches to figure out how to
shove together the different type-system mechanics...
for example, gluing together "char *" between C and BGBScript currently
involves the code trying to make an "educated guess" as to what "char *"
means in this context.
guesses like this are also used to glue together BGBScript and C
structs, C and BGBScript function pointers and closures, ...
theoretically, it could all blow up in one's face, but usually it all
works out fairly well...
this is why code-mining tools are needed for C, as otherwise there would
not be enough info to make these sorts of inferences, but when
one starts mining info from the source-code there are all sorts of nifty
inferences which can be made...
BGBScript2, being statically typed, will likely be actually a lot easier
to glue onto C than BGBScript was (which was by default
dynamically-typed, and one has to make a lot more wild guesses as to the
intended semantics...).
Post by BGB
it is also a big problem in that it also prevents directly calling from
Java into C without having to first have a method declaration somewhere
in Java-land...
I don't see a problem with such method declarations. They contain
information which allows the Java compiler to produce useful error
messages. And they also provide useful information for the human
reader.
but having to type out all this crap, or do all the copy/paste/edit and
JNI crap, is tedious...
with BGBScript all this was automatic and BGBScript was much further
away from C semantically than was Java...
Post by BGB
the mechanism could be extended some to support static type-checking,
and by adding "operator resolution handlers" which could then look for
operator declarations.
Seed7 supports static type-checking automatically without support
from a VM.
yes, but probably also has the cost that the compiler has to know about
all of the types as well...
You probably never saw how fast Seed7 interpreter and compiler work.
dunno...
I am aiming for low-millisecond range (< 10ms) compile times or better
for BGBScript2 (unlike the current high-millisecond or multi-second
times I am getting from my C compiler...).
if I can get microsecond compile times, this is better, but not really
necessary.
basically, I want to keep from-source building a reasonable option
without introducing long-delays in program startup or similar.
Post by tm
[snip]
Post by BGB
actually, a similar mechanism is used for class resolution, so for
example the various VMs register callbacks to try seeing if they know
about a given class.
I guess that you mean "dynamic dispatch" when you write
"class resolution". Operator overloading and dynamic dispatch
are two sides of the same coin. Overloading is done with static
types (which are known at compile-time) and dynamic dispatch uses
dynamic types (usually called classes).
class resolution basically means "for a given class QName (Qualified
Name), get the associated class handle".
Ok, my guess was wrong.
yep.
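For readers following along, class resolution in this sense (qualified name in, class handle out, with each VM registering a callback) could be sketched roughly as below. This is a Python illustration under assumed names (ClassRegistry, register_resolver, resolve, info); it is not the actual VM interface discussed in the thread.

```python
class ClassRegistry:
    """Maps qualified class names to opaque handles via registered callbacks."""

    def __init__(self):
        self._resolvers = []   # callbacks, tried in registration order
        self._by_name = {}     # qualified name -> handle
        self._by_handle = {}   # handle -> whatever the resolver returned

    def register_resolver(self, resolver):
        """resolver(qname) returns class info, or None if it does not know qname."""
        self._resolvers.append(resolver)

    def resolve(self, qname):
        """Return the handle for qname, asking each resolver until one answers."""
        if qname in self._by_name:
            return self._by_name[qname]
        for resolver in self._resolvers:
            info = resolver(qname)
            if info is not None:
                handle = len(self._by_handle) + 1
                self._by_name[qname] = handle
                self._by_handle[handle] = info
                return handle
        return None  # no registered VM claims this class

    def info(self, handle):
        return self._by_handle[handle]
```

Each language runtime would register one resolver; a lookup returns the same handle on repeated calls and None when no runtime recognizes the name.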
Post by BGB
user-defined statements pose similar problems to the operators case.
unless a defined context-independent syntax exists, it would also cause
context-dependence issues for the parser, which is something I would
prefer to avoid.
A context-independent syntax is the key to success.
When there is not a clear separation between syntax and semantics
everything becomes very complex.
Many languages mix syntax and semantics in complicated ways without
getting real advantages from it. Seed7 separates these things and
has no disadvantage from it. This is also the reason that will make
it hard for other languages to become extensible.
fair enough...
I tried to think some about the issue, but there is no clean or obvious
way to deal with the issue in my existing code...
I "could" add an analogue of Lisp-style macros, but I really don't want
to deal with this at the moment (even though early forms of BGBScript
did have Lisp-like macros...).
another option could be to support "super-generics", which could behave
more like Lisp-style macros.
my_for<int i=0, i<10, i++> { stdout.println("Hello %d", i); };
Can it be that you introduce a new "unproven" construct?
First you complain that Seed7 statements differ from Java and C#.
Now you introduce a new statement with "super-generics" yourself.
What is the difference between you and me introducing new
statements? Is your construct okay because it is called
"super-generics" or because it uses angle brackets? Or is the
difference that Seed7 allows defining new statements now, while
your dream language might support them at some unspecified date in
the future?
partly it is the syntax and names, and association with Java/C# generics
and C++ templates, but granted, I really don't like this syntax FWIW...
as for implementation timeframe:
most of the code is being reused, so most of the parser for the language
has already been implemented (it is basically just a special-case mode
in my Java/C# parser).
the backend will be a little more work, but:
the backend compiler machinery will mostly be reused from my existing C
and BGBScript backends (the BS2 bytecode is basically just a hacked
fusion of my RPNIL and BGBScript bytecode formats).
if I stay on it, I will probably have it implemented within a few months
or so...
Post by BGB
if a person really wants to get fancy, they could use overloaded
operators to implement faux custom expressions and statements, but this
sort of nastiness is bad enough in a lot of C++ code...
The urban myth that operator overloading is bad...
There are many ways to write unmaintainable programs and the
developer is always responsible for writing good code.
IMO, even many standard C++ features, such as iostream, are a bit nasty...
This comes from the fact that they reused an existing operator
instead of defining a new one. Allowing user defined operators is
just superior to overloading predefined operators.
my strategy here is likely to define a number of "spare" extended
operators, which can be overloaded for new tasks...
I will probably use the BGBScript extended operators, which are mostly
ones like:
+. and .+ and similar...
I used +`, ... in the C parser, but this was mostly because people were
complaining that +. and similar could potentially break some obscurely
written C code ("1.+2" and "1+.2").
in BS2, "1." will be discouraged (should be "1.0" instead, or I could
require that it be "1.0" and allow that "1." may be parsed as "1"
followed by the '.' operator), and ".2" as a number will simply not be
allowed (only "0.2" will be valid).
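The lexing conflict can be reproduced with a small regex tokenizer. The sketch below is Python and uses assumed names (C_NUMBER, STRICT_NUMBER, tokenize); it models only integers, simple decimals and the operators passed in, so it is an approximation of both the C rules and the BS2 rules described above.

```python
import re

# C-like numbers: "1." and ".2" are both valid literals.
C_NUMBER = r"\d+\.\d*|\.\d+|\d+"
# Stricter rule as described for BS2: digits required on both sides of the dot.
STRICT_NUMBER = r"\d+\.\d+|\d+"

def tokenize(text, number_pattern, operators):
    """Split text into number and operator tokens, longest operator first."""
    longest_first = sorted(operators, key=len, reverse=True)
    alternatives = [number_pattern] + [re.escape(op) for op in longest_first]
    return re.findall("|".join(alternatives), text)

PLAIN = ["+"]
EXTENDED = ["+", "+.", ".+"]

# Adding "+." changes how existing C-style input is split:
old = tokenize("1+.2", C_NUMBER, PLAIN)       # ['1', '+', '.2']
new = tokenize("1+.2", C_NUMBER, EXTENDED)    # ['1', '+.', '2']

# With the strict number rule, "1." is no longer a literal, so ".+" is seen:
strict = tokenize("1.+2", STRICT_NUMBER, EXTENDED)  # ['1', '.+', '2']
```

In this sketch "1+.2" silently changes meaning once "+." exists, which matches the complaint about obscurely written C code. Disallowing ".2" and "1." as literals removes the overlap between number syntax and the extended operators.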
Post by BGB
current advisory would be to just use plain methods or similar, and
closures if one really needs to pass code blocks.
Closures are also a key concept used in Seed7. The body of a loop
and a loop condition are examples of closures. In Seed7 a closure
is an expression. No special closure notation (like brackets) is
needed. This way a while loop looks just like a while loop from a
conventional (non-extensible) language.
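As an illustration of a loop built from closures, here is a sketch in Python of a user-defined "repeat ... until" statement, which Python does not have built in. The names repeat_until and count_up are invented for the example. Note that Python needs explicit def or lambda notation to pass the condition and body, which is exactly the kind of special closure notation that the Seed7 approach described above avoids.

```python
def repeat_until(body, condition):
    """User-defined loop: run body at least once, then until condition() holds.

    Both parameters are zero-argument closures that capture the caller's
    variables, so the loop can read and update state it does not own.
    """
    while True:
        body()
        if condition():
            break

def count_up(limit):
    seen = []
    i = 0

    def body():
        nonlocal i          # the closure updates the enclosing variable
        seen.append(i)
        i += 1

    repeat_until(body, lambda: i >= limit)
    return seen
```

Calling count_up(3) collects [0, 1, 2], and count_up(0) still runs the body once and returns [0], which is the defining behaviour of a repeat-until loop.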
this has been considered, but would be a problem in that, like macros,
it would be required to be known at compile time that a method expects a
closure...
so, in the current BS2 syntax, an explicit declaration is needed, ...
That takes away many of the advantages of closures. Especially
they cannot be used to define statements which look similar to
predefined statements. Except when the statement blocks used by
predefined statements also require this special closure notation.
yes, but there is no way to handle this case without creating ambiguous
syntax, so it is a moot point...
I currently allow:
"fun{...}" for an argument-free closure of statements, and
"fun[...]" for an argument-free expression closure.
for closures with arguments, there will not be a difference between
statement and expression closures, since it will be possible to infer
the difference from the syntax.
"fun ..." is not allowed since given the syntax, it would not likely be
possible to unambiguously parse this case.
or such...