Post by Chad Perrin
Post by Lester L. Martin II
Server side syntax highlighting is an excellent idea, though I'm not
quite sure at this time how to implement it. The other issue is that
fossil would basically need to cache the results of running a
highlight for the lifetime of the program, up until something
invalidates the cache. Line numbering does/will need serious work to
integrate with syntax highlighting regardless of the approach.
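To make the caching idea concrete, here is a minimal sketch. It is written in JavaScript only for illustration (Fossil itself is C), and `HighlightCache`, `highlightFn`, and the use of the artifact hash as the key are all assumptions of mine, not anything that exists in Fossil.

```javascript
// Hypothetical cache of highlighted output, keyed by artifact hash.
// highlightFn stands in for whatever highlighter ends up being used.
class HighlightCache {
  constructor(highlightFn) {
    this.highlightFn = highlightFn;
    this.entries = new Map();
  }

  // Return cached markup for this artifact, highlighting on first request.
  get(artifactHash, source) {
    if (!this.entries.has(artifactHash)) {
      this.entries.set(artifactHash, this.highlightFn(source));
    }
    return this.entries.get(artifactHash);
  }

  // Drop everything, e.g. when the highlighter or its settings change.
  invalidate() {
    this.entries.clear();
  }
}
```

Since an artifact's content does not change for a given hash, the only invalidation this sketch needs is for changes to the highlighter itself.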
Yeah, that'd probably be more work overall, and would likely reduce the
customization for syntax highlighting allowed to people deploying Fossil
repositories to the web. It would make how line numbering and syntax
highlighting integrate much more "deterministic", though, in that Fossil
devs would have a clearer view of everything that happens when trying to
account for it in updates to Fossil source. It's a trade-off, as with
most such decisions.
I'm not sure syntax highlighting is Fossil's task, though integrating
easily with other things that do syntax highlighting sounds like it is
something of benefit to Fossil. That said, I would not want to be
responsible for writing syntax parsers in C so as to generate pretty
content. It might be horrible to offload this to the client via JS,
but that might actually be the best solution just because it keeps
Fossil flexible.
Post by Chad Perrin
Post by Lester L. Martin II
Post by Chad Perrin
Thus, you would have HTML for a line of code that looks something
<tr>
<td class="line-no">$num</td>
<td class="code-line">
<span class="color-type">uint16_t</span> <span class="color-label">get_next</span><span class="color-delim">() {</span>
</td>
</tr>
Basically how GitHub and several other things implement it.
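As a rough sketch of how rows like the above could be produced for unhighlighted source, something along these lines would do. The function names are hypothetical; only the `line-no` and `code-line` class names come from the example, and no syntax coloring is attempted here.

```javascript
// Hypothetical helper: escape text so it is safe to place inside HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build one <tr> per source line, reusing the class names from the example.
function renderNumberedTable(source) {
  const lines = source.split("\n");
  // A trailing newline yields an empty last element; drop it.
  if (lines.length > 1 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  const rows = lines.map(
    (line, i) =>
      `<tr><td class="line-no">${i + 1}</td>` +
      `<td class="code-line">${escapeHtml(line)}</td></tr>`
  );
  return `<table>\n${rows.join("\n")}\n</table>`;
}
```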
I guess my intuition about how to handle it is in good company, for some
definition of "good".
The way GitHub does it is fine; however, their approach likely predates
anything as capable as CSS line counters. Either that or they tried such
an approach and found an issue that I can't think of at the moment. This
is addressed a bit further down.
Post by Chad Perrin
Post by Lester L. Martin II
The issue with just applying highlights first is how line endings will
be tracked, since html elements need not necessarily be rendered
similarly by all highlighting libraries. Detecting line endings in a
generic way after markup has been applied will be very difficult and
likely library specific. I keep using Prism.js as my go-to for
illustration, but I would bet that the differences between hljs and
prism are enough that the JS needing to be written to (hopefully)
detect marked up line endings would be different for each. We then
get into a "supports $library" case vs a generic case like it has been
so far without syntax highlights, and how it'd remain if we didn't go
forward with syntax highlighting when lines are numbered.
If you mean that syntax highlighting libraries might insert literal
newlines into the file when marking it up for highlighting, that's
pretty awful, and could indeed screw up the whole exercise.
I mean that a syntax highlighting library can do it however it likes,
and while I'd think most wouldn't insert a literal newline, I don't
think I could plausibly count on `<br>` being a consistent marker for
numbering. The other problem is if the syntax highlighter fails halfway
through but doesn't undo its work, leaving things partially
highlighted; you're then in for some confusion in the JS you write
yourself. This might not be common but it is quite possible.
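To illustrate why this is fiddly even in the friendly case, here is a sketch that splits already-highlighted markup into per-line fragments. It assumes the highlighter emits nothing but `<span ...>` and `</span>` tags and leaves line breaks as literal newlines; that is an assumption, not a guarantee from any library, and `splitHighlightedLines` is a name I made up.

```javascript
// Split highlighted HTML into one fragment per source line, closing any
// spans still open at a line break and reopening them on the next line.
// Assumes only <span ...> / </span> tags and literal "\n" line breaks.
function splitHighlightedLines(html) {
  const tokenRe = /<span\b[^>]*>|<\/span>|\n|[^<\n]+|</g;
  const open = []; // opening tags of the spans currently open
  const lines = [];
  let current = "";

  for (const [token] of html.matchAll(tokenRe)) {
    if (token === "\n") {
      // Balance the finished line, then carry open spans onto the next.
      lines.push(current + "</span>".repeat(open.length));
      current = open.join("");
    } else {
      if (token === "</span>") {
        open.pop();
      } else if (token.startsWith("<span")) {
        open.push(token);
      }
      current += token;
    }
  }
  // Close anything the highlighter left open on the last line.
  lines.push(current + "</span>".repeat(open.length));
  return lines;
}
```

A multi-line comment wrapped in one span is the case this handles; anything using `<br>`, nested block elements, or other tags would need different code, which is exactly the per-library problem.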
Post by Chad Perrin
Post by Lester L. Martin II
We still would end up depending on the "Line numbers" checkbox being a
call into JS to add those in for everything but the server-side case.
I'd rather not have to write JS to try to target 2 different
highlighting engines (or possibly more dependent upon what other users
prefer). Then that means that we'd need to check the JS code written
against say... the latest 3 versions of each highlighting engine in
our "support list". At that point it could be said that our hold ups
in deploying a new version are tied up in making sure integration with
several external resources will move along properly. We'd also get
into a case of saying "supports up to $version_number of this library"
(and more of those statements for other libraries supported). At this
point I came to the conclusion it's a huge undertaking and would
require extensive long term management, and believe at that point, it
might be best to "bless" a certain syntax highlighting library and
forgo anything else. If that library was included in fossil, then we
wouldn't need to worry about having to possibly push a fix to allow
the newest version to work.
This pretty much makes the detriments of a server-side approach that I
described earlier apply to the client-side approach, too. There are
other concerns that apply to the server-side, too, though, such as the
fact I suspect more rewriting of Fossil source would be required, though
I'm just guessing at this point. I'm beginning to think that the best
approach might be to ship a JS syntax highlighting library with Fossil,
or just bless a single library, and allow people deploying their own
repositories to the web to monkey with that at their own peril, in the
short term. Building in some server-side syntax highlighting with the
ability to ignore that and use client-side of one's own choosing (again
at one's own peril) might be the "correct" long-term approach for how to
handle syntax highlighting.
I'm of the opinion of blessing not only a single library, but a single
version of that library per Fossil release. Point blank, I do not see a
way to get line numbering and syntax highlighting to work without
depending on either a server-side or client-side external solution, and
I believe producing code to cover each kind, and all the error
conditions that can occur due to syntax highlighting, is outside the
scope of Fossil. Writing our own syntax highlighter might be a great
long term approach, but I also don't think anyone wants to duplicate
the work of the likes of hljs, prism, or, for server side examples,
pygments and hugo's chroma, in C, nor create something akin to hugo's
solution in that it depends on yet another project's syntax definition
files.
Post by Chad Perrin
Post by Lester L. Martin II
1. Move towards server side highlighting implementing a caching
mechanism.
This seems like something that should be done eventually, while making
some intermediate approach available in the meantime with no guarantees
of future compatibility -- an optional, experimental, biohazard-warning
approach just to fill in the gap until the server side is available.
Until someone wishes to create the server side solution, whether it
depends on an external solution or is built in to Fossil, I'd opt that
the intermediate approach should (with some work, and sadly while
supporting only one library and one version of that library) be
considered "good enough", and not necessarily experimental, so long as
one follows a guide that someone (likely myself) would create on
setting this up.
Post by Chad Perrin
Post by Lester L. Martin II
2. Chase multiple versions of differing libraries and maintain our
own JS that either calls the library's line numbering function
or uses our own stuff to affix numbering after the other has been
done.
Sorry, I'm not sure what you're saying here. If you're saying that
syntax highlighting libraries have their own line numbering
functionality, it might make sense to just defer to that in cases where
syntax highlighting is used, and thus obviate most of this discussion.
Some syntax highlighting libraries have their own ways of adding in
line numbering, yes (prism). Those that don't tend to have solutions
provided by others to get line numbering. By "chase" I meant that if we
support more than one library, and more than one version of each
library, we would be steadily chasing compatibility with such a wide
range of "stuff" that it'd be a complete mess, and would likely hold up
the project at release time while we check against everything the
community wants.
Post by Chad Perrin
Post by Lester L. Martin II
3. Bless a certain highlighting library and/or version of that library
with possible inclusion into fossil itself or a vivid notice that
only $version is supported at this point in time.
That seems like the reasonable short-term solution, to me, but probably
not as an intended long-term official solution. There are reasons to
favor server-side functionality for these things "eventually", and avoid
pushing all this off to the end user. If syntax highlighting is
considered a nonessential option, though, a simpler solution would be to
just make some minor server-side changes to allow people deploying repos
to the web to do the work of experimenting and integrating as they feel
inclined to do so.
A solution for line-numbering in the case of "well, we use the pre-code
tag convention, and the rest is up to you" might be to just use JS to
apply a specially-styled ordered list to the entire block of code after
any hooked-in JS syntax highlighting code and call it done. When you
customize, you get what you get. Right?
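The ordered-list idea might look roughly like the following. This is the deliberately naive version: it assumes no tag in the highlighted markup spans a line break, and the `code-lines` class and `line-N` ids are names I picked for illustration.

```javascript
// Wrap each line of already-highlighted markup in an <li> so the browser
// supplies the numbering. Naive on purpose: a span that crosses a line
// break would be cut in half by this split.
function wrapLinesInOrderedList(highlightedHtml) {
  const items = highlightedHtml
    .split("\n")
    .map((line, i) => `<li id="line-${i + 1}">${line}</li>`);
  return `<ol class="code-lines">${items.join("")}</ol>`;
}
```

Run after whatever highlighter is hooked in, this is the "you get what you get" approach: cheap to write, and only as correct as the markup it is handed.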
Unless someone really wants to write our own syntax highlighting
library, be it in C or JS, 3 might need to suffice as the long term
solution. The `<pre><code>` tag convention, probably without introducing
a `<code>` tag per source line, would work if a JS library supports
highlighting with line numbers (natively or via other means), though it
would likely mean losing "?ln" query capabilities unless we go fully
forward with the "bless a certain highlighting library" part.
Post by Chad Perrin
Post by Lester L. Martin II
4. Relegate line numbering with syntax highlighting to a no go.
That's definitely a short-term hack kind of "solution", and probably not
something that should be an official implementation decision five years
down the road.
Actually, if we don't want to introduce a "for syntax highlighting,
you may only use this library and expect it to work, and it must be this
version" rule, and we can't live with *broken* highlighting when lines
are numbered, and we wish to keep "?ln" queries (though it might be
possible to keep those even with the explicit external dependency when
highlighting is enabled), then 4 is our only option.
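If "?ln" handling did move to the client, the parsing side is the easy part. A sketch, assuming the value is a comma-separated list of `N` or `N-M` entries; that syntax is my assumption for illustration and may not match what Fossil actually accepts, and both function names are hypothetical.

```javascript
// Parse an "ln" query value into inclusive [from, to] ranges.
// Assumed syntax: comma-separated entries, each "N" or "N-M".
function parseLineRanges(value) {
  const ranges = [];
  for (const part of value.split(",")) {
    const m = /^(\d+)(?:-(\d+))?$/.exec(part.trim());
    if (!m) continue; // skip anything that is not N or N-M
    const from = Number(m[1]);
    const to = m[2] === undefined ? from : Number(m[2]);
    if (from >= 1 && to >= from) {
      ranges.push([from, to]);
    }
  }
  return ranges;
}

// True when lineNo falls inside any parsed range.
function isLineSelected(ranges, lineNo) {
  return ranges.some(([from, to]) => lineNo >= from && lineNo <= to);
}
```

The hard part remains mapping those numbers onto rendered lines once a highlighter has rewritten the markup.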
Post by Chad Perrin
As a target, I would suggest the emitted html look as much like this as
view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c
The actual code block begins at line 821.
In ref to your mentioning line 821, I don't happen to see said line
in the linked file.
Post by Chad Perrin
This style of markup is a de-facto standard and leads to a linking
style that would greatly aid migration from git if fossil could adhere
to it.
GitHub isn't "the standard", they're just predominant. Anything we do
could borrow ideas from them, but should not necessarily lean towards
a complete re-implementation of what they do. If anything we should
start on a solid list of syntax highlighting features needed, and build
from there. I'd always say do better than what people view as the
predominant way of doing things, except perhaps where RFCs are
concerned: adhere strictly to RFCs.
As to linking of lines of code within a project, that's yet another
feature and doesn't necessarily have anything to do with syntax
highlighting either (though it would perhaps cause some concern
with line numbering and syntax highlighting too).
Post by Chad Perrin
Post by Lester L. Martin II
As a target, I would suggest the emitted html look as much like this
view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c
The actual code block begins at line 821.
This style of markup is a de-facto standard and leads to a linking
style that would greatly aid migration from git if fossil could adhere
to it.
My example was nothing but off the top of my head equivalent to
pseudocode (except I think the code was all valid HTML around valid C).
Only the class names change between my version and this version, apart
from some extra details like data-line-number and id properties, in any
case. That means I was evidently thinking identically (in principle) to
the thoughts of whoever wrote the code that produced your example.
I'm not sure how this has any effect on migration from git to fossil,
though. Git export and Fossil import wouldn't touch this code. Are you
talking about some kind of external tools being able to interact with
this code in the browser? If so, the classes involved probably come
from whatever JS library is used for syntax highlighting anyway, rather
than from something like code internal to Fossil (unless syntax
highlighting gets implemented in C as part of Fossil).
I guess the upshot is that I'm not sure what you mean, and all I've been
able to do so far is guess.
I believe he meant a migration from GitHub (seemingly the de facto
synonym for git itself). Git itself has no standard way of displaying
code, and interoperability with GitHub as to class definitions and such
would only help migrations *if* GitHub allows exporting the wiki pages
you create over there in a format usable by Fossil, and would only come
into play *if* your wiki links reference certain lines of code in your
codebase.
Alright, this has become quite a deep conversation.
At this time, I believe there is no sure way to go forward with
syntax highlighting in relation to line numbers. I believe syntax
highlighting with line numbers should be tabled until Fossil's creator
chimes in, either fielding an idea of his own or validating one of
ours as compatible with the direction of the project.
My proposal is to depend on external JS, keep this out of the realm of
Fossil, and get Fossil to where it supports this way of doing it. I'd
rather have no syntax highlighting when lines are numbered than broken
syntax highlighting. With this proposal we *will* need to reimagine how
"?ln" queries work on artifact content pages. At the moment it doesn't
matter whether we render the code as a table, or as a `<pre>` with
multiple `<code>` blocks, or what; what matters is a decision on how to
do syntax highlighting together with line numbering without breaking
either. Let's find which way is going to be okay going forward to work
on that problem, then worry about what html will actually be generated.
This is starting to become a thread about src/info.c's `artifact_page`
function and what should be done going forward to improve it more
broadly.
I would propose at this point we do the following:
First, consider syntax highlighting tackled. The other cases are
cases where having it would be nice, but are not absolutely
necessary and are not easily implemented in a non-broken manner
or only implementable via a complete replacement of the code dealing
with line numbering (possibly moving it out to JS) and a
re-implementation of how "?ln" queries are handled.
Second, get together a list of further features we'd want
syntax highlighting capability tacked on to and discuss each feature
individually and come to a consensus that is agreeable with the
project's direction. I could include in this list off the top of my
head: syntax highlighting for line numberings, syntax highlighting
for diff viewing. Both of those require such a rework of how things
are done, that we can either choose to depend on one JS library,
or need to figure out what the project's direction would support us
doing.
Third, start working on the features in relation to syntax highlighting
after having found a solution compatible with the project's direction.
Wash, rinse, repeat, up until we can stop talking about syntax
highlighting because at that point it's done, and not only done but
well done, and strong in capability within the scope of the Fossil
project.
Finally, if JS is going to be responsible for line numbering anyway,
perhaps line numbering should be on by default, and our only
consideration should be how we get "?ln" queries working again, and
probably diff viewing as well.
--
Lester L. Martin II