Discussion:
[hybi] Requirement: WebSocket HTTP coexistence (was: Requirement for Maciej security issue)
Salvatore Loreto
2010-02-26 08:44:48 UTC
Hi,

(as this is an important aspect of the protocol, I have changed the
subject so as to better track the discussion and the eventual consensus.)

in the current version of the draft we have the following requirements:

REQ. 6: The Web Socket Protocol MUST be designed in such a way that
        its servers can share a port with HTTP servers, as by default
        the Web Socket Protocol uses port 80 for regular Web Socket
        connections and port 443 for Web Socket connections tunnelled
        over TLS.


however during the HyBi BoF there was not a clear consensus on reusing
HTTP ports.
So I'd like to check what the real consensus is on this specific point:
reusing HTTP ports (80 & 443)
as well as the reasons for doing so.


Moreover, in the mail discussion based on Roberto's, Joe's and Tim's
suggestions, Maciej has drafted the following requirements, which in my
opinion are much clearer and more generic.



Requirement X: The WebSocket protocol MUST be designed so that it is
               possible and practical to share a single host and port
               with HTTP. Consideration should be given to making it
               possible to provide WebSocket services via modules that
               plug in to existing popular Web servers.

Reason: TBD...


Requirement Y: The WebSocket protocol MUST be designed so that it is
               possible and practical to make a standalone WebSocket
               server that is not embedded into an existing HTTP server.

Reason: there could be scenarios where it is not opportune or
possible to set up a proxy on the same HTTP server.
-> possible problem highlighted by Joe: in this case the same-origin
policy in the hosting application SHOULD NOT match port numbers.



Requirement Z: The WebSocket protocol SHOULD be designed in a way that
               facilitates re-use of existing well-debugged software
               when implementing it.


Reason: the re-use of existing well-debugged software decreases the
        number of implementation errors as well as the possibility of
        introducing security holes, and at the same time speeds up
        development, especially when the Web Socket server is
        implemented as a module that plugs in to existing popular
        Web servers.



All thoughts and comments are very welcome and appreciated.

cheers
Sal
KOMATSU Kensaku
2010-02-26 11:30:22 UTC
I'm a web application developer, so I'll comment from that perspective.

I like requirement X. ("to share a single host and port with HTTP")
Post by Salvatore Loreto
Requirement X: The WebSocket protocol MUST be designed so that it is
possible and practical to share a single host and port with HTTP.
Consideration should be given to making it possible to provide
WebSocket services via modules that plug in to existing popular Web
servers.
Reason: TBD...
Reason:
I'm now interested in WebSocket pipelining, because this technology
has the potential to make web services faster. In my tests
(text-mining web services with hundreds of sentences), WebSocket
pipelining is 10-40x faster than today's XHR environment (for
example, from the U.S. to Japan, WebSocket pipelining => less than
1 second, XHR => more than 30 seconds, from the Kaazing blog :) ).
(server => apache2 + mod_pywebsocket, client => chrome4+)
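
As a rough back-of-envelope illustration of why pipelining wins (the
numbers below are hypothetical, not measurements from my test):

    # Serialized XHR pays one full round trip per request; a pipelined
    # WebSocket pays roughly one round trip total, then streams.
    rtt = 0.18          # assumed US <-> Japan round-trip time, seconds
    n = 200             # assumed "hundreds of sentences", one request each

    xhr_serial = n * rtt            # each request waits for the last reply
    ws_pipelined = rtt + n * 0.001  # one handshake RTT, then streamed frames

    print(xhr_serial)    # 36.0 seconds, the same order as the 30+ s above
    print(ws_pipelined)  # ~0.38 seconds, the same order as the <1 s above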

To make better web services, I'll keep testing this technology
and also serve web applications with it (to build actual use cases).
But for backward compatibility I have to serve the same services via
HTTP (even though it's slower than the ws service), and this
situation won't change for the next 10 years.

In this scenario, since most content (for example, libraries) is
shared between the ws pipeline and the HTTP RESTful services, it's
reasonable to manage applications and content under the same server
(as mod_pywebsocket does). If these services are served by different
servers, it adds complexity to deploying and maintaining them.

I don't reject the other requirements; in some situations (sorry, I
can't describe them concretely) those requirements are reasonable for
service developers.

Best regards.
--
Kensaku KOMATSU
http://code.google.com/p/websocket-sample/
Christopher Blizzard
2010-02-26 17:15:53 UTC
Post by Salvatore Loreto
however during the HyBi BoF there was not a clear consensus on reusing
HTTP ports.
So I'd like to check what the real consensus is on this specific
point: reusing HTTP ports (80 & 443)
as well as the reasons for doing so.
Note that I'm not wearing my Mozilla hat for this response, but as
someone who has done web programming in the past.

I'd personally like to be able to run a WS-capable server next to an
existing Apache installation. On the browser side I'd love to see the
security model reflect this, i.e. that connections to the well-known
WS port on the same host are allowed under the same domain
restrictions as if you were connecting to the original web server.

This is largely for the convenience of deployment in low-fi situations.
A lot of the people on this list are on the high end of the scale with
their own custom servers and large environments. I am not one of those
people, but I do want to see people be able to adopt this technology
with as low friction as possible, and I think that side-by-side
deployment is an important part of that story.

So my personal opinion is that upgrading existing port 80 is fine, but I
would also love to see an explicit well-known port added and have the
browser APIs support side-by-side install as well.

--Chris
Maciej Stachowiak
2010-02-26 17:25:50 UTC
Post by Christopher Blizzard
Post by Salvatore Loreto
however during the HyBi BoF there was not a clear consensus on
reusing HTTP ports.
So I'd like to check what the real consensus is on this specific
point: reusing HTTP ports (80 & 443)
as well as the reasons for doing so.
Note that I'm not wearing my Mozilla hat for this response, but as
someone who has done web programming in the past.
I'd personally like to be able to run a WS-capable server next to an
existing Apache installation. On the browser side I'd love to see
the security model reflect this, i.e. that connections to the
well-known WS port on the same host are allowed under the same
domain restrictions as if you were connecting to the original web
server.
The way WebSocket is currently designed, you can always attempt to
connect to any server, and it always has to explicitly grant
permission to a particular Origin. So on the one hand, there's no
automatic same-origin grant, but on the other hand connecting to a
WebSocket on a different port than your Web server is no harder than
connecting to one on the same port.

If we did have an automatic same-origin exception to the need for
explicit permission grant, I think it would be unwise to ignore the
port, because that could endanger existing configurations where you
have two independent Web servers running on the same host but
different ports. But since an explicit permission grant is needed to
connect to a WebSocket server in any case, it doesn't really matter.
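
A minimal sketch of what that explicit grant looks like on the server
side (the allowlist and helper name are hypothetical; the handshake
fields are those of the current draft):

    # The server grants access explicitly by naming the Origin it
    # accepts. There is no automatic same-origin exception, and the
    # port plays no special role in the decision.
    ALLOWED_ORIGINS = {"http://example.com"}   # hypothetical allowlist

    def handshake_response(headers):
        origin = headers.get("origin", "")
        if origin not in ALLOWED_ORIGINS:
            return None                        # no grant, no connection
        return ("HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
                "Upgrade: WebSocket\r\n"
                "Connection: Upgrade\r\n"
                "WebSocket-Origin: " + origin + "\r\n"
                "WebSocket-Location: ws://example.com/\r\n\r\n")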
Post by Christopher Blizzard
This is largely for the convenience of deployment in low-fi
situations. A lot of the people on this list are on the high end of
the scale with their own custom servers and large environments. I
am not one of those people, but I do want to see people be able to
adopt this technology with as low friction as possible, and I think
that side-by-side deployment is an important part of that story.
So my personal opinion is that upgrading existing port 80 is fine,
but I would also love to see an explicit well-known port added and
have the browser APIs support side-by-side install as well.
Having a recommended non-HTTP / non-HTTPS port might be helpful in
some cases, but doesn't seem essential for this use case. The browser
APIs support connecting to any port. However, only one port can be the
default, so I think the usefulness of an extra port that is "well-
known" but not the default is limited. You'd still have to explicitly
choose to connect to it.

Regards,
Maciej
Greg Wilkins
2010-02-27 07:24:24 UTC
I agree with the sentiment of this requirement, but I don't like
the wording.

Specifically "share a port" is not very well defined and could be
interpreted to mean being able to send HTTP and hybi traffic
at the same time to the same port.


Historically websocket design started as a standalone
mechanism to give browsers access to raw sockets. However
it became apparent that this was not seen as secure or
likely to be allocated, so the design was changed to
"take over" a HTTP connection.

Thus I'm wondering if the requirement actually is:

The Web Socket Protocol MUST be able to be used as
an Upgrade protocol of a HTTP connection as defined by
RFC2616. Specifically the protocol must work within
the security model of a HTTP connection.

I know this feels a little solution-ish but I do think it
is a requirement, as the protocol must work with HTTP
authentication, origin policies, cookie policies etc. etc.
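
For reference, the Upgrade exchange in the current draft looks roughly
like this (path and host made up):

    GET /chat HTTP/1.1
    Host: example.com
    Connection: Upgrade
    Upgrade: WebSocket
    Origin: http://example.com

    HTTP/1.1 101 Web Socket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    WebSocket-Origin: http://example.com
    WebSocket-Location: ws://example.com/chat

Everything up to and including the 101 response is HTTP-shaped;
everything after it is websocket framing.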


regards
Maciej Stachowiak
2010-02-27 07:55:28 UTC
Post by Greg Wilkins
I agree with the sentiment of this requirement, but I don't like
the wording.
Specifically "share a port" is not very well defined and could be
interpreted to mean being able to send HTTP and hybi traffic
at the same time to the same port.
Historically websocket design started as a standalone
mechanism to give browsers access to raw sockets. However
it became apparent that this was not seen as secure or
likely to be allocated, so the design was changed to
"take over" a HTTP connection.
The Web Socket Protocol MUST be able to be used as
an Upgrade protocol of a HTTP connection as defined by
RFC2616. Specifically the protocol must work within
the security model of a HTTP connection.
I know this feels a little solution-ish but I do think it
is a requirement, as the protocol must work with HTTP
authentication, origin policies, cookie policies etc. etc.
I don't think we should encode the exact solution in the requirement,
even if we think there is only one solution. I think we all have a
reasonable shared understanding of what "share a port" means, but if
you have a more precise way to phrase it without getting into the
details of the mechanism, that would be good.

I'm not sure what "must work within the security model of a HTTP
connection" means or how it relates to the rest of the requirement. At
least from the browser point of view, WebSocket does *not* "work
within the security model of an HTTP connection", because any Web page
can attempt to connect to any server, and access is granted explicitly
to the page's Origin and not at all based on the same-origin policy.
Even if you attempt a WebSocket connection to your own host and port,
the server has to grant access to its own Origin explicitly.

If you can clarify what you mean by the security statement, perhaps it
makes sense as a separate requirement.

Regards,
Maciej
Greg Wilkins
2010-02-27 08:05:43 UTC
Post by Maciej Stachowiak
I don't think we should encode the exact solution in the requirement,
even if we think there is only one solution. I think we all have a
reasonable shared understanding of what "share a port" means, but if you
have a more precise way to phrase it without getting into the details of
the mechanism, that would be good.
Maciej, I agree that even if there is only 1 solution, it's best
to avoid making that a requirement. I still think that "share a
port" is wrong, even if we know what it means.

So how about:

The Web Socket Protocol MUST be designed to be able
to take over an established HTTP connection.

Note that HTTP can work on ports other than 80 and 443 and
it's not the port that is shared, but the HTTP connection.
Post by Maciej Stachowiak
I'm not sure what "must work within the security model of a HTTP
connection" means or how it relates to the rest of the requirement. At
least from the browser point of view, WebSocket does *not* "work within
the security model of an HTTP connection", because any Web page can
attempt to connect to any server, and access is granted explicitly to
the page's Origin and not at all based on the same-origin policy. Even
if you attempt a WebSocket connection to your own host and port, the
server has to grant access to its own Origin explicitly.
If you can clarify what you mean by the security statement, perhaps it
makes sense as a separate requirement.
I agree this is probably a separate requirement.

What I'm just saying is that the protocol needs to be designed
to work within the existing web security model with respect to:

Cookies sent in the handshake will need
to respect the existing HTTP model. Ie we can't require that
cookies are sent (or are not sent) on any grounds other than
domain, path and age.

Credentials sent in the handshake (eg BASIC or DIGEST)
will be according to the HTTP spec.

The protocol must work within the origin policy currently
being extended for other web resources.

I guess this can be stated that nothing before the 101 shall
be contrary to standard HTTP.
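
Concretely, that would allow a handshake request such as the following
(values hypothetical), with the Cookie and Authorization headers
processed under the ordinary HTTP rules:

    GET /chat HTTP/1.1
    Host: example.com
    Connection: Upgrade
    Upgrade: WebSocket
    Origin: http://example.com
    Cookie: session=abc123
    Authorization: Basic dXNlcjpwYXNzd29yZA==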

cheers
Maciej Stachowiak
2010-02-27 09:57:01 UTC
(Consolidating subthreads)

Hi Greg,

I'm wary of getting into too much back-and-forth on the exact wording,
as I do not wish to monopolize the thread. However, I do feel it is
important to properly reflect the use cases submitted by Roberto and
Joe, and I am not sure your wording is quite there. Let me try one
more time to see if we can get on the same page, and then I'll take a
break and let the rest of the WG comment.
Post by Greg Wilkins
Post by Maciej Stachowiak
I don't think we should encode the exact solution in the requirement,
even if we think there is only one solution. I think we all have a
reasonable shared understanding of what "share a port" means, but if you
have a more precise way to phrase it without getting into the
details of
the mechanism, that would be good.
Maciej, I agree that even if there is only 1 solution, it's best
to avoid making that a requirement. I still think that "share a
port" is wrong, even if we know what it means.
The Web Socket Protocol MUST be designed to be able
to take over an established HTTP connection.
I like this version better. But I still feel it is leaning a bit too
much towards specific solutions. However, I think we may be using some
terms differently, leading to miscommunication. See below.
Note that HTTP can work on ports other than 80 and 443 and
it's not the port that is shared, but the HTTP connection.
Ah, I see. When I say "share a port", I do not mean "share the default
port assignment". I mean there is a single server listening on a
particular port (whether or not that is the default well-known port
for the protocol) which can act as both a conforming HTTP server and a
conforming WebSocket server.
I believe it follows from this that there must be a way for a server
to accept either kind of connection and dispatch to the appropriate
piece of code to handle it. And it seems like this implies that the
initial handshake must look like HTTP and be compatible with HTTP
processing, which I think is the core of what you are trying to say.
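
A rough sketch of the dispatch that implies (handler names are
hypothetical; header handling is simplified):

    # One listener, two protocols: the request head is parsed as HTTP,
    # then the connection is handed to the websocket or the HTTP path.
    def read_head(conn):
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        head, _, rest = data.partition(b"\r\n\r\n")
        lines = head.decode("latin-1").split("\r\n")
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        return lines[0], headers, rest

    def dispatch(conn):
        request_line, headers, rest = read_head(conn)
        if headers.get("upgrade", "").lower() == "websocket":
            handle_websocket(conn, request_line, headers, rest)  # hypothetical
        else:
            handle_http(conn, request_line, headers, rest)       # hypothetical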

I think your phrasing, "take over an established HTTP connection", is
a little too specific about the solution. Let's say the way we find a
way to let Web servers cleanly dispatch to a WebSocket component
without actually having an "established HTTP connection". I wouldn't
rule that out a priori. At the same time, I think it might be a little
too weak. I specifically said "possible and practical", because I
think a design that is in theory workable on the server side, but in
practice a huge pain, would not be a design we should be happy with.

With those things in mind, here's my attempt at rewording:


Requirement: The WebSocket protocol MUST be designed so that it
is possible and practical to use a single host and port to serve HTTP
requests and WebSocket requests at the same time. Consideration MUST
be given to making it possible to provide WebSocket services via
modules that plug in to existing popular Web servers.

Reason: Some server developers would like to integrate WebSocket
support into existing HTTP servers. In addition, the default HTTP and
HTTPS ports are often favored for traffic that has to go through a
firewall, so service providers will likely want to be able to use
WebSocket over ports 80 and 443, even when running a Web server on the
same host.
Post by Greg Wilkins
Post by Maciej Stachowiak
If you can clarify what you mean by the security statement, perhaps it
makes sense as a separate requirement.
I agree this is probably a separate requirement.
What I'm just saying is that the protocol needs to be designed
Cookies sent in the handshake will need
to respect the existing HTTP model. Ie we can't require that
cookies are sent (or are not sent) on any grounds other than
domain, path and age.
Here's how I'd word it:

Requirement: Any use of HTTP cookies in the WebSocket protocol
MUST be consistent with the ordinary requirements for HTTP cookies.

Reason: Cookies are often used to identify a login session or to
store session state. They provide a convenient way to identify the
user. It may be useful for a WebSocket server to use cookies to
identify a user or store state, but to maintain security it is
important for the processing to be consistent.
Post by Greg Wilkins
Credentials sent in the handshake (eg BASIC or DIGEST)
will be according to the HTTP spec.
Requirement: Any use of HTTP authentication credentials in the
WebSocket protocol MUST be consistent with the ordinary requirements
for HTTP authentication.

Reason: HTTP authentication is sometimes used to identify a login
session. It provides a convenient way to identify the user. It may be
useful for a WebSocket server to use HTTP authentication, but to
maintain security it is important for the processing to be consistent.
Post by Greg Wilkins
The protocol must work within the origin policy currently
being extended for other web resources.
I disagree with this requirement. WebSocket protocol does not use the
same-origin policy normally applied to other Web resources. I believe
the design it does use is better overall, despite being different.
Post by Greg Wilkins
I guess this can be stated that nothing before the 101 shall
be contrary to standard HTTP.
I like the requirements that tie very directly to use cases much more
than solution-oriented ones like this.
Post by Greg Wilkins
Maciej,
I think we are agreeing on substance... just not on words.
I agree a standalone websocket server should not
have to support HEAD.
So if you ignore what I said about *is-a* HTTP server, do you agree
A web socket server MUST support only those parts of HTTP that are
necessary for a websocket connection to be securely established.
ie this is saying that a websocket server does not need
to support full RFC2616. It would only need to support the bits
necessary to establish a websocket connection - whatever they may
end up being.
I feel like your phrasing is a very roundabout way of getting at that
point. I think Joe's point was that we should impose the minimum
requirements necessary on standalone WebSocket servers. But your
wording appears to be about particular requirements that we should
impose on them. I think it could be argued that the current WebSocket
protocol requires a WebSocket server to support *no* parts of HTTP at
all. I think you can write a conforming WebSocket server that violates
nearly every MUST for origin servers in RFC2616. I'd argue that's a
good thing, if we can maintain that while also being more consistent
with HTTP processing. So I'd word it like this:

Requirement: The WebSocket protocol MUST be designed so that it
is possible and practical to make a standalone WebSocket server that
is not embedded into an existing HTTP server. In particular the
protocol design MUST NOT require supporting any parts of HTTP that are
not strictly necessary to implement the protocol.

Reason: In some cases it is not desirable or necessary to
integrate with an existing Web server. In this case, it would be
burdensome to require supporting any HTTP requirements that are not
strictly necessary to operation of the protocol.


Regards,
Maciej
Greg Wilkins
2010-02-27 12:17:38 UTC
Post by Maciej Stachowiak
I'm wary of getting into too much back-and-forth on the exact wording,
as I do not wish to monopolize the thread.
sure - but others are free to contribute even with the back and
forthing :) I think we are converging and if others disagree
on where we are going, then they should speak up!
Post by Maciej Stachowiak
I think your phrasing, "take over an established HTTP connection", is a
little too specific about the solution.
I agree "take over" kind of implies upgrade. But what I'm trying
to capture with these words is:

+ it is connection based - not request based.
+ if operating in "shared port" mode, you start with HTTP
and move to WS.

But I also agree that we should not require that ws always starts with a
HTTP connection. We may eventually come up with a mechanism that allows
standalone ws servers to altogether avoid HTTP.
Post by Maciej Stachowiak
Requirement: The WebSocket protocol MUST be designed so that it is
possible and practical to use a single host and port to serve HTTP
requests and WebSocket requests at the same time. Consideration MUST be
given to making it possible to provide WebSocket services via modules
that plug in to existing popular Web servers.
I think the phrase "server HTTP requests and WebSocket requests
at the same time" is confusing for the reasons I stated above.

I think the following phrases capture what we have both been
saying, but I'm not sure if they are one or many requirements:


The WebSocket protocol MUST be designed so that it is
possible and practical to implement a server that can
establish both HTTP and Websocket connections on the
same host and port.

When operating on standard HTTP ports, the server MUST
initially handle a new connection as a HTTP connection.

A web socket server MUST support only those parts of HTTP
that are necessary for a websocket connection to be securely
established and to reject unacceptable connections.

Consideration MUST be given to making it possible to provide
WebSocket services via modules that plug in to existing
web infrastructure.
Post by Maciej Stachowiak
Reason: Some server developers would like to integrate WebSocket
support into existing HTTP servers.
It's not that we'd like to integrate, but that we need to
in order to support the valid desire of websocket to be a
protocol upgrade of HTTP. There are good reasons that
ws has been shifted towards using the HTTP ports and there
are good reasons that HTTP supports such protocol upgrades
Post by Maciej Stachowiak
Requirement: Any use of HTTP cookies in the WebSocket protocol MUST
be consistent with the ordinary requirements for HTTP cookies.
+1
Post by Maciej Stachowiak
Post by Greg Wilkins
Credentials sent in the handshake (eg BASIC or DIGEST)
will be according to the HTTP spec.
Requirement: Any use of HTTP authentication credentials in the
WebSocket protocol MUST be consistent with the ordinary requirements for
HTTP authentication.
+1
Post by Maciej Stachowiak
Post by Greg Wilkins
The protocol must work within the origin policy currently
being extended for other web resources.
I disagree with this requirement. WebSocket protocol does not use the
same-origin policy normally applied to other Web resources. I believe
the design it does use is better overall, despite being different.
I'm not suggesting the same origin policy - rather the enhanced
cross domain origin policies that are already being developed
and deployed by browsers:

http://www.w3.org/TR/access-control/
http://developer.mozilla.org/En/HTTP_access_control


How about something a little weaker

If possible and practical, the protocol SHOULD use existing
and emerging standards for an origin based cross domain
security policy.

Reason: we don't want to re-invent the wheel.



cheers
Maciej Stachowiak
2010-02-27 12:38:19 UTC
Post by Greg Wilkins
I think the following phrases capture what we have both been
The WebSocket protocol MUST be designed so that it is
possible and practical to implement a server that can
establish both HTTP and Websocket connections on the
same host and port.
When operating on standard HTTP ports, the server MUST
initially handle a new connection as a HTTP connection.
A web socket server MUST support only those parts of HTTP
that are necessary for a websocket connection to be securely
established and to reject unacceptable connections.
Consideration MUST be given to making it possible to provide
WebSocket services via modules that plug in to existing
web infrastructure.
I think those are closer to statements I could get behind. One general
comment: to go in a requirements document, they should be phrased as
requirements on the protocol, on the design, or on the specification,
rather than as requirements on servers.

I also still disagree with stating in the requirements document that a
WebSocket server MUST support any part of HTTP. The current WebSocket
draft does not require standalone servers to meet any HTTP
requirements (maybe some trivial ones, but there's no need to be
anything close to a conforming HTTP implementation). Maybe we will
find it is required while reviewing and revising the protocol, maybe
not. I feel like Joe's original use case called for minimizing burden
on standalone servers. But your framing of the issue seems to be about
requiring them to do more. I would really like to have a statement
that is directly about enabling standalone servers and not making them
needlessly hard to implement, rather than one about what parts of HTTP
need to be implemented.

I may comment further later, just wanted to take this opportunity to
indicate partial agreement.

Regards,
Maciej
Greg Wilkins
2010-02-27 13:48:52 UTC
Maciej

in response to your latest comments... how about:

The WebSocket protocol MUST be designed so that it is
possible and practical to establish both HTTP and Websocket
connections to the same host and port.

When operating on standard HTTP ports, an implementation
of the protocol MUST initially handle a new connection as a
HTTP connection.

Consideration MUST be given to making it possible and
practical to create standalone implementations of the
protocol without requiring a fully conforming HTTP
implementation. An implementation of the protocol SHALL NOT
be required to implement any part of the HTTP protocol that
is not necessary for the establishment of a websocket connection.

Consideration MUST be given to providing WebSocket services via
modules that plug in to existing web infrastructure.


cheers
Jamie Lokier
2010-02-27 20:15:45 UTC
Post by Greg Wilkins
When operating on standard HTTP ports, an implementation
of the protocol MUST initially handle a new connection as a
HTTP connection.
So a standalone WebSocket server on port 80 (a "standard HTTP port")
MUST initially handle a new connection as an HTTP connection.

Does that mean it MUST do things like process TRACE requests and
OPTIONS *? Does that mean it must process the Expect header (in an
ideal world, "Expect: websocket" would work...), and that it must
check the Host header?

If it is not necessary to implement HTTP that is not necessary for
establishing WebSocket connections - what is necessary? If it's the
minimal handshake only, then it may be incompatible with speculative
extensions which we can request from "full" HTTP-compatible WebSocket
servers with ease. And they won't work with requests from some
clients that HTTP+WebSocket servers would be guaranteed to work with,
which is begging for interoperability problems.

Basically I see two requirements:

1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.

(1a. Same for standalone WebSocket clients?)

2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compatible during the handshake, and easy to write
using standard HTTP components to implement that handshake -
while being fully compatible with standalone WebSocket clients
and servers which do not implement full HTTP.

-- Jamie
Maciej Stachowiak
2010-02-28 00:18:06 UTC
Post by Jamie Lokier
1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.
(1a. Same for standalone WebSocket clients?)
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compatible during the handshake, and easy to write
using standard HTTP components to implement that handshake -
while being fully compatible with standalone WebSocket clients
and servers which do not implement full HTTP.
This lines up much more with how I see the requirements than Greg's
wording.

(For clients implemented inside the browser, I wouldn't worry about
it; it's pretty arbitrary whether they are considered "standalone" or also
HTTP clients, and port sharing is not an important consideration. If
you want any client implementability requirements, I would ask for
"must be feasible to implement in a Web browser". I don't know what
kinds of non-browser clients people want so I don't know the
requirements there.)

Regards,
Maciej
Justin Erenkrantz
2010-02-28 04:12:10 UTC
Post by Jamie Lokier
 1. Standalone WebSocket servers must be easy to write and the
    specification easy to understand.
    (1a. Same for standalone WebSocket clients?)
 2. Servers which serve HTTP and WebSocket on the same port must be
    fully HTTP compatible during the handshake, and easy to write
    using standard HTTP components to implement that handshake -
    while being fully compatible with standalone WebSocket clients
    and servers which do not implement full HTTP.
Post by Maciej Stachowiak
This lines up much more with how I see the requirements than Greg's wording.
(For clients implemented inside the browser, I wouldn't worry about it; it's
pretty arbitrary whether they are considered "standalone" or also HTTP clients,
and port sharing is not an important consideration. If you want any client
implementability requirements, I would ask for "must be feasible to
implement in a Web browser". I don't know what kinds of non-browser clients
people want so I don't know the requirements there.)
The serf client framework (which supports a PoC BWTP implementation;
so I think an async WS implementation is pretty feasible once the
drafts become more intelligible) has the same HTTP header ordering
semantics as Apache httpd - ie, they are not deterministic. (Again,
not surprising as serf came out of httpd. *grin*)

So, I think #2 might be:

 2. During the initial handshake between servers and clients over an
existing HTTP connection, all messages must be fully HTTP compatible
until the upgrade handshake is successfully completed and accepted.

My $.02. -- justin
Maciej Stachowiak
2010-02-28 06:40:06 UTC
Post by Justin Erenkrantz
Post by Maciej Stachowiak
Post by Jamie Lokier
1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.
(1a. Same for standalone WebSocket clients?)
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compatible during the handshake, and easy to write
using standard HTTP components to implement that handshake -
while being fully compatible with standalone WebSocket clients
and servers which do not implement full HTTP.
This lines up much more with how I see the requirements than Greg's wording.
(For clients implemented inside the browser, I wouldn't worry about it; it's
pretty arbitrary whether they considered "standalone" or also HTTP clients,
and port sharing is not an important consideration. If you want any client
implementability requirements, I would ask for "must be feasible to
implement in a Web browser". I don't know what kinds of non-browser clients
people want so I don't know the requirements there.)
The serf client framework (which supports a PoC BWTP implementation;
so I think an async WS implementation is pretty feasible once the
drafts become more intelligible) has the same HTTP header ordering
semantics as Apache httpd - ie, they are not deterministic. (Again,
not surprising as serf came out of httpd. *grin*)
It's not clear to me why you'd base a WebSocket client on httpd code.
For BWTP it might make sense, since the client may receive messages
that look like HTTP requests and be obliged to send responses that
look like HTTP responses. But that doesn't happen in WebSocket. Does
anyone have an example of an actual (not just planned) WebSocket
client that reuses HTTP server code?
Post by Justin Erenkrantz
2. During the initial handshake between servers and clients over an
existing HTTP connection, all messages must be fully HTTP compatible
until the upgrade handshake is successfully completed and accepted.
I think that wording expresses a mechanism, not a use case. I think
the requirements should be about what we want to enable people to do
with the protocol, not about how we want to do it. That's kind of the
point of having a separate requirements gathering phase, right?

Regards,
Maciej
Justin Erenkrantz
2010-02-28 07:18:07 UTC
It's not clear to me why you'd base a WebSocket client on httpd code. For
BWTP it might make sense, since the client may receive messages that look
like HTTP requests and be obliged to send responses that look like HTTP
responses. But that doesn't happen in WebSocket. Does anyone have an example
of an actual (not just planned) WebSocket client that reuses HTTP server
code?
While we're getting off-track here, serf does not reuse any httpd code,
but was a further refinement of ideas presented in httpd 2.0 applied
to a client library. (serf can be used by httpd as the proxy, and
replacing the core of httpd with serf has been discussed. Again,
off-topic and beside the point...)

My issue is that I think it is very common for many HTTP frameworks -
client and intermediaries and server - to purposely throw away any
header-ordering - which is just one example of a common optimization
that is broken by the strictness requirements in the WS drafts. We
need to be cognizant of what optimizations and implementations do in
the real-world and make sure that we don't knee-cap them without any
solid reason.
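
To make the mismatch concrete: most HTTP stacks normalize headers into
an unordered, case-insensitive map at parse time, so an ordering rule
in the handshake cannot even be expressed on top of them. A minimal
sketch of that typical framework behaviour:

    # Header order is discarded right here, so a handshake rule such as
    # "Upgrade must precede Connection" is unenforceable at this layer.
    def parse_headers(raw):
        headers = {}
        for line in raw.split("\r\n"):
            if not line:
                continue
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        return headers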
Post by Maciej Stachowiak
Post by Justin Erenkrantz
 2. During the initial handshake between servers and clients over an
existing HTTP connection, all messages must be fully HTTP compatible
until the upgrade handshake is successfully completed and accepted.
I think that wording expresses a mechanism, not a use case. I think the
requirements should be about what we want to enable people to do with the
protocol, not about how we want to do it. That's kind of the point of having
a separate requirements gathering phase, right?
I will repeat that I think folks are misguided when they say that no
one will write frameworks - either client or server-side - that offer
*both* HTTP and WS support. I think most of us are hesitant to
re-invent the wheel - I think the majority of implementers on this
list are going to do their best to reuse their existing HTTP stacks as
much as possible. As such, I believe that the implementations that
are not strongly related in some way to an existing HTTP stack are
going to be in the minority by far. So, we are going to be beholden
to the interop-capabilities of our existing HTTP stacks - so placing
additional requirements above and beyond HTTP/1.1 in the upgrade
process should be a non-starter...but sadly, the current draft tries
to impose rules that simply don't exist in HTTP/1.1. -- justin
Maciej Stachowiak
2010-02-28 18:26:50 UTC
Post by Justin Erenkrantz
My issue is that I think it is very common for many HTTP frameworks -
client and intermediaries and server - to purposely throw away any
header-ordering - which is just one example of a common optimization
that is broken by the strictness requirements in the WS drafts. We
need to be cognizant of what optimizations and implementations do in
the real-world and make sure that we don't knee-cap them without any
solid reason.
For WebKit's client implementation of WebSocket, header-ordering
constraints were not a problem in practice. There were already much
more fundamental reasons we could not reuse most of the HTTP client
stack. There's really no need to talk HTTP and WebSocket on the same
port on the client side, however.
Post by Justin Erenkrantz
I will repeat that I think folks are misguided when they say that no
one will write frameworks - either client or server-side - that offer
*both* HTTP and WS support. I think most of us are hesitant to
re-invent the wheel - I think the majority of implementers on this
list are going to do their best to reuse their existing HTTP stacks as
much as possible. As such, I believe that the implementations that
are not strongly related in some way to an existing HTTP stack are
going to be in the minority by far. So, we are going to be beholden
to the interop-capabilities of our existing HTTP stacks - so placing
additional requirements above and beyond HTTP/1.1 in the upgrade
process should be a non-starter...but sadly, the current draft tries
to impose rules that simply don't exist in HTTP/1.1. -- justin
For WebKit's existing WebSocket implementation, this was not really an
issue. The fact that full duplex messaging occurs after the handshake
made it impossible to effectively reuse our HTTP client code, because
HTTP client libraries are simply not set up to support that. I expect
the same will likely be true of any browser-hosted WebSocket
implementation. Are there any real implementations that ran into this
problem - that they would have been able to reuse some HTTP client
code, but header ordering constraints or the like made it impossible?

I'm dubious about where you're going with this, because what you
suggest may happen with client implementations is contrary to our
actual client implementation experience.

Regards,
Maciej
Justin Erenkrantz
2010-02-28 23:01:01 UTC
Post by Maciej Stachowiak
I'm dubious about where you're going with this, because what you suggest may
happen with client implementations is contrary to our actual client
implementation experience.
Yet, I have an existence proof in the other direction. The client
HTTP implementation I work on (serf) can easily handle async protocols
like BWTP. So, the fact that your implementation couldn't do so
doesn't mean that it is impossible and infeasible either - which has
been the tone of the messages from yourself and others. -- justin
Maciej Stachowiak
2010-02-28 23:15:46 UTC
Post by Justin Erenkrantz
Post by Maciej Stachowiak
I'm dubious about where you're going with this, because what you suggest may
happen with client implementations is contrary to our actual client
implementation experience.
Yet, I have an existence proof in the other direction. The client
HTTP implementation I work on (serf) can easily handle async protocols
like BWTP. So, the fact that your implementation couldn't do so
doesn't mean that it is impossible and infeasible either - which has
been the tone of the messages from yourself and others. -- justin
If you can design an HTTP client library to support full-duplex, then
more of it may be reusable. Once you've solved that problem, adapting
the library to deal with other specialized WebSocket requirements
should be trivial by comparison.

That being said, it wasn't feasible to reuse much code from the HTTP
client stack in our case. It also didn't seem like we would have
gotten much benefit, since producing and parsing the handshake are
about the simplest problems our implementation has to solve.

Regards,
Maciej
Justin Erenkrantz
2010-02-28 23:25:54 UTC
Post by Maciej Stachowiak
If you can design an HTTP client library to support full-duplex, then more
of it may be reusable. Once you've solved that problem, adapting the library
to deal with other specialized WebSocket requirements should be trivial by
comparison.
Which has been my point entirely - I added support for BWTP to serf in
a few hours. Sadly, due to the incomprehensibility of the current
IDs, I simply could not do so for WS. So, if the goal is to make it
trivial to write WS implementations, the current drafts are a
miserable failure in my experience.
Post by Maciej Stachowiak
That being said, it wasn't feasible to reuse much code from the HTTP client
stack in our case. It also didn't seem like we would have gotten much
benefit, since producing and parsing the handshake are about the simplest
problems our implementation has to solve.
Sure - I readily admit that not every HTTP client or server codebase
is going to be in a position to handle WS, but there certainly will be
reuse out there - especially from those on an async network
stack...which, in my experience, most high-performance frameworks are
already on just due to the scale issues. -- justin
Maciej Stachowiak
2010-02-28 23:34:53 UTC
Post by Justin Erenkrantz
Post by Maciej Stachowiak
If you can design an HTTP client library to support full-duplex, then more
of it may be reusable. Once you've solved that problem, adapting the library
to deal with other specialized WebSocket requirements should be trivial by
comparison.
Which has been my point entirely - I added support for BWTP to serf in
a few hours. Sadly, due to the incomprehensibility of the current
IDs, I simply could not do so for WS. So, if the goal is to make it
trivial to write WS implementations, the current drafts are a
miserable failure in my experience.
Our implementors apparently had an easier time than you (although
there was also significant feedback that led to improvements in the
spec). I'm sure further feedback on problems would be useful. You
should also feel free to use our WebSocket client test suite:
<http://trac.webkit.org/browser/trunk/LayoutTests/websocket>.
Post by Justin Erenkrantz
Post by Maciej Stachowiak
That being said, it wasn't feasible to reuse much code from the HTTP client
stack in our case. It also didn't seem like we would have gotten much
benefit, since producing and parsing the handshake are about the simplest
problems our implementation has to solve.
Sure - I readily admit that not every HTTP client or server codebase
is going to be in a position to handle WS, but there certainly will be
reuse out there - especially from those on an async network
stack...which, in my experience, most high-performance frameworks are
already on just due to the scale issues.
I'm not sure what distinction you are trying to draw. The HTTP stack
in most browsers (and certainly in WebKit) is definitely "async" in
the sense that I understand the word.

Regards,
Maciej
Greg Wilkins
2010-02-28 06:25:15 UTC
Post by Jamie Lokier
1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.
This is essentially meaningless. Can you ever imagine a requirements
document saying: servers must be hard to write and the specification
hard to understand? One could argue it is easier to reuse an existing
HTTP stack than reinvent connection/thread/parsing standalone.

There is a specific requirement we are trying to capture. Some on
this list want the handshake to be fully legal HTTP, and in response
to that others are concerned that we might require full HTTP
compliance.

I think what we are trying to capture is the requirement that the parts
of HTTP that are used are compliant, but that does not mean we require
all of HTTP. If that is what we want, then we should say that
explicitly rather than imply it with our own definitions of "easy".
Post by Jamie Lokier
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compatible during the handshake, and easy to write
using standard HTTP components to implement that handshake -
while being fully compatible with standalone WebSocket clients
and servers which do not implement full HTTP.
I think this is a better approach and covers most of the clauses
already proposed. But it suffers from the problems
that Maciej raises of speaking about servers and implementations.

So, merging all these bits together and wording them as Maciej suggests:


The protocol MUST allow HTTP and websocket connections to
be served from the same port. When operating on the same port
as HTTP, the protocol MUST be HTTP compatible until both
ends have established the websocket protocol.

The protocol MUST make it possible and practical to establish
websockets connections without requiring a fully conforming HTTP
implementation at either end of the connection.

The protocol MUST make it possible and practical to reuse
existing HTTP components where appropriate.


regards
Ian Hickson
2010-02-28 08:27:11 UTC
I was out of town the last few days, so instead of peppering replies
across all the threads that happened while I was away, I'll just send this:
CGIs, today (and PHP, and whatever else) still need the framework to
support them (ie, a webserver that supports CGI, PHP, etc). On a shared
host provider, you typically don't get to install servers, you pick one
that supports what you need - CGI, PHP, or in this case WebSocket
servlets.
The point is that all you should need to use Web Sockets is a shell and a
scripting language.
Even really quite creative people rarely reimplement HTTP to "do web
stuff", so I fail to see why you think this will radically change with
WebSockets.
HTTP is hugely complicated compared to WebSockets. IMHO that's a bug, not
a feature. By making WebSockets trivial to implement, we enable these
"quite creative people" to write their own servers.
Okay, so for the sake of example I'll pretend to agree that the average
weekend hacker will willingly implement their own HTTP framework rather
than just grab some existing one.
There's no need to implement an HTTP framework to use Web Sockets.
What you seem to be attempting to do is make the protocol sufficiently
hard to implement that you'll either put people off doing so, or else
force them somehow into doing it "right", such that the resulting
weekend hacking project is secure against various XSS style attacks and
other potential hazards.
What I'm attempting to do is make the protocol easier to implement safely
than to implement it unsafely.
The thing is, I think people will be just fine with knowing that if they
knock something together without reading the spec - particularly the
Security Considerations section - then it might turn out to be insecure
in some respects.
But the point is they won't know that. Random authors don't think about
how if they implement their HTML+HTTP+CGI feedback forms incorrectly
they'll make it possible for spammers to launch campaigns from their site.
They don't think about how if they don't generate unique tokens for each
user/form combination they'll expose their users to XSRF attacks. They
often don't even think about how if they don't escape their user's input
they'll be vulnerable to XSS or SQL injection attacks.
I think you're basically safe from people hunting you down with
pitchforks and burning torches, screaming at you because they got
something working, but it isn't secure against some XSS attack.
I'm not worried about me, I'm worried about _them_. I don't _want_ them to
have an XSS attack.
I'm not certain you understand the complexity of a WebSocket
application. Not parsing. Parsing is trivial. Grammars are trivial.
But the threading and synchronization issues are not trivial at all.
For small-scale operations, there's really no need for things to be
particularly complicated. Sure, if you have hundreds, thousands, millions,
or billions of users then it is hard work -- and the protocol's complexity
pales in comparison to the scaling issues. But when there are no scaling
issues, when you have four users _total_, it doesn't have to be hard and
the protocol _can_ be a significant burden.
I think your argument that websocket should be easy to implement by
application developers was probably a lot more valid before websocket
was moved to share port 80 with HTTP.
Now that it is sharing a port with HTTP, I believe the vast majority of
users will be working with HTTP servers that provide websocket support
(just as they provide CGI or servlet support).
I wouldn't expect any of the small-scale authors I've been talking about
to use a privileged port (<1024), least of all port 80.
[...] I don't think we should compromise any real security or
interoperability requirements to specifically target such developers.
On the contrary, targeting such developers actually makes it even more
important that we nail down the security and interop well.
is it perhaps something like
"it should be possible for the protocol to reuse the existing web
infrastructure (only server side or also intermediaries?)"
As far as I can tell the discussion has been specifically about
intermediaries on the server side.
sorry to ask, but what exactly do you mean by intermediaries on the
server side?
There are three kinds of intermediaries.

1. Explicit client-side intermediaries. These are the ones that the user
would configure their browser to use (either implicitly or through proxy
auto-configuration). These are easy to deal with, and are not a problem.

2. Server-side intermediaries. These are the ones under the control of the
author: reverse-proxies, load-balancers, and the like. It would be optimal
if they could be reused easily, though frankly, at the end of the day, it
is technically possible for the author to wholesale-replace everything on
their network if necessary. It would obviously be a huge impediment to
author adoption.

3. Man-in-the-middle intermediaries. These are typically at the ISP, doing
firewalling, traffic shaping, silent caching, and the like. Sometimes
these are (incorrectly) called "transparent proxies", though that term is
used in the HTTP spec to mean #1 above. These are a huge problem, because
neither the client nor the server knows about them, and they cannot be
updated, upgraded, or replaced by either the author or the client, so if
the protocol cannot tunnel through them, there is no way to deploy the
protocol to the user.
Greg Wilkins
2010-02-28 08:56:02 UTC
Post by Ian Hickson
There are three kinds of intermediaries.
[snip]
2. Server-side intermediaries. These are the ones under the control of the
author: reverse-proxies, load-balancers, and the like. It would be optimal
if they could be reused easily, though frankly, at the end of the day, it
is technically possible for the author to wholesale-replace everything on
their network if necessary. It would obviously be a huge impediment to
author adoption.
I have never ever ever seen such an intermediary in the wild.
However I have seen:

2a. Server-side intermediaries. These are the ones under the control
of system administrators in a different department who are under no
direct control or influence of the author: reverse-proxies, load balancers,
and the like. It would be optimal if they could be re-used with only
incremental upgrades, because frankly, at the end of the day, it is
procedurally and commercially impossible for the author to wholesale-replace
anything on their networks, even if necessary. It would obviously be a
huge impediment to author adoption.


cheers
Martin Tyler
2010-02-28 09:37:02 UTC
Post by Greg Wilkins
Post by Ian Hickson
There are three kinds of intermediaries.
[snip]
2. Server-side intermediaries. These are the ones under the control of the
author: reverse-proxies, load-balancers, and the like. It would be optimal
if they could be reused easily, though frankly, at the end of the day, it
is technically possible for the author to wholesale-replace everything on
their network if necessary. It would obviously be a huge impediment to
author adoption.
I have never ever ever seen such an intermediary in the wild.
2a. Server-side intermediaries. These are the ones under the control
of system administrators in a different department who are under no
direct control or influence of the author: reverse-proxies, load balancers,
and the like. It would be optimal if they could be re-used with only
incremental upgrades, because frankly, at the end of the day, it is
procedurally and commercially impossible for the author to wholesale-replace
anything on their networks, even if necessary. It would obviously be a
huge impediment to author adoption.
Totally agree with Greg here. These things are controlled by different
people, people who do not like to change things and are usually covered by
some company-wide policy that is very hard to get around.
Jamie Lokier
2010-02-28 14:57:34 UTC
Permalink
Post by Ian Hickson
CGIs, today (and PHP, and whatever else) still need the framework to
support them (ie, a webserver that supports CGI, PHP, etc). On a shared
host provider, you typically don't get to install servers, you pick one
that supports what you need - CGI, PHP, or in this case WebSocket
servlets.
The point is that all you should need to use Web Sockets is a shell and a
scripting language.
Even really quite creative people rarely reimplement HTTP to "do web
stuff", so I fail to see why you think this will radically change with
WebSockets.
HTTP is hugely complicated compared to WebSockets. IMHO that's a bug, not
a feature. By making WebSockets trivial to implement, we enable these
"quite creative people" to write their own servers.
I understand and respect those positions.

However, please take a quick look at HTTP's history, because the same
will probably occur to WebSocket.

In the old days, you could write an HTTP server as a short shell script.
Seriously, I saw it done. NCSA Mosaic + shell scripts for servers.

HTTP was seen as a *much* simpler, "lighter" protocol than FTP for
fetching files. That was one of its touted benefits. Simplicity,
ease of implementation.

Nowadays, it's reversed. I see people installing FTP servers because
HTTP is seen as heavy and FTP is seen as light. Bizarre but true.

Now, the only HTTP servers I see implemented in short scripts - and in
fact in some script libraries that are more widely used - are quite
buggy. Nobody notices because they get away without testing the buggy
parts. (For example, treating HEAD the same as GET, treating OPTIONS
the same as GET, ignoring the Expect header, not passing 1xx headers,
stripping spaces either side of a header name, claiming HTTP/1.1
compliance while not remotely complying, etc.).

I also see, on embedded systems lists, people asking where they can
get a small HTTP server to do this, that or the other. It never
occurs to them to just write one (except in one case I can think of,
where the device's badly written HTTP server for diagnostics was the
reason it kept crashing.) Even though it's actually easy to write a
conforming HTTP/0.9 server.

I think the same will happen with WebSocket: Simple at first, but
before you know it, it'll have evolved - and the web will be full of
dirty workarounds for buggy clients and servers - just like what
happened with HTTP.

When that happens, almost nobody will write an implementation raw in a
few lines of script, and a situation similar to HTTP today will occur.
There will be lots of WebSocket implementations available to choose
from - at least 6 in every major language and another 20 written in C
- each targeted at a different audience, whether it's
scalability, ease of use from a scripting perspective, minimal size,
single-threaded, etc. - so almost everyone will use one of the
available ones, because it'd be a waste of their time not to.
Post by Ian Hickson
Okay, so for the sake of example I'll pretend to agree that the average
weekend hacker will willingly implement their own HTTP framework rather
than just grab some existing one.
There's no need to implement an HTTP framework to use Web Sockets.
+1, fwiw, I do agree with this.
Post by Ian Hickson
What you seem to be attempting to do is make the protocol sufficiently
hard to implement that you'll either put people off doing so, or else
force them somehow into doing it "right", such that the resulting
weekend hacking project is secure against various XSS style attacks and
other potential hazards.
What I'm attempting to do is make the protocol easier to implement safely
than to implement it unsafely.
I think it'll succeed, but only because these attack-blocking
strategies are too complex to implement in a few lines of script by an
amateur programmer, so they'll cut & paste other people's 5 pages of
code (the old buggy one from 2010 probably), when they decide not to
use libraries or front-ends.

Nearly all such programmers will learn, after one or two tries, that
it's easier to get an off-the-shelf implementation and use it anyway.
Then they can concentrate on their application. This is based on
observing what people do with other protocols - not just HTTP, but at
all layers.

We see people using SOAP after all. Clearly people see simplicity as
whatever pre-packaged APIs they can easily use, not on-the-wire
simplicity. And for whatever reasons, sockets are harder to use than
pre-packaged messaging APIs.
Post by Ian Hickson
The thing is, I think people will be just fine with knowing that if they
knock something together without reading the spec - particularly the
Security Considerations section - then it might turn out to be insecure
in some respects.
But the point is they won't know that. Random authors don't think about
how if they implement their HTML+HTTP+CGI feedback forms incorrectly
they'll make it possible for spammers to launch campaigns from their site.
They don't think about how if they don't generate unique tokens for each
user/form combination they'll expose their users to XSRF attacks. They
often don't even think about how if they don't escape their user's input
they'll be vulnerable to XSS or SQL injection attacks.
I agree with the desire to protect users in this way.

But even random CGI authors tend to use a "CGI library" to read their
environment variables.
Post by Ian Hickson
I'm not certain you understand the complexity of a WebSocket
application. Not parsing. Parsing is trivial. Grammars are trivial.
But the threading and synchronization issues are not trivial at all.
For small-scale operations, there's really no need for things to be
particularly complicated. Sure, if you have hundreds, thousands, millions,
or billions of users then it is hard work -- and the protocol's complexity
pales in comparison to the scaling issues. But when there are no scaling
issues, when you have four users _total_, it doesn't have to be hard and
the protocol _can_ be a significant burden.
+1, I agree with the general sentiment. But I think it's doomed to
fail. Especially if the attack-blocking strategies are included, it's
simply too much for many programmers at that point to write their own.

So I think the attack-blocking strategies are fine, but partly because
the added complexity will put some people off writing their own servers.

Let's put it this way: If plain sockets were perceived as easy to
program, why are there people considering using WebSockets for
application-to-application communications (no browser) - instead of
just using plain sockets? Even though plain sockets are obviously
much less work than do-it-yourself WebSockets?
Post by Ian Hickson
I think your argument that websocket should be easy to implement by
application developers was probably a lot more valid before websocket
was moved to share port 80 with HTTP.
Now that it is sharing a port with HTTP, I believe the vast majority of
users will be working with HTTP servers that provide websocket support
(just as they provide CGI or servlet support).
I wouldn't expect any of the small-scale authors I've been talking about
to use a privileged port (<1024), least of which port 80.
[...] I don't think we should compromise any real security or
interoperability requirements to specifically target such developers.
On the contrary, targeting such developers actually makes it even more
important that we nail down the security and interop well.
There are three kinds of intermediaries.
1. Explicit client-side intermediaries. These are the ones that the user
would configure their browser to use (either implicitly or through proxy
auto-configuration). These are easy to deal with, and are not a problem.
2. Server-side intermediaries. These are the ones under the control of the
author: reverse-proxies, load-balancers, and the like. It would be optimal
if they could be reused easily, though frankly, at the end of the day, it
is technically possible for the author to wholesale-replace everything on
their network if necessary. It would obviously be a huge impediment to
author adoption.
3. Man-in-the-middle intermediaries. These are typically at the ISP, doing
firewalling, traffic shaping, silent caching, and the like. Sometimes
these are (incorrectly) called "transparent proxies", though that term is
used in the HTTP spec to mean #1 above. These are a huge problem, because
neither the client nor the server knows about them, and they cannot be
updated, upgraded, or replaced by either the author or the client, so if
the protocol cannot tunnel through them, there is no way to deploy the
protocol to the user.
Well put. There are a couple more types, which don't affect the HTTP
handshake but definitely affect reliable WebSocket usage:

4. TCP relays (a couple of flavours, depending on whether they relay
half-closes (shutdown) or full-closes only). Those are used on some
mobile/wireless networks, and some tunnelled environments. This
doesn't affect the protocol bytes, but it does affect orderly
close strategy, and any ideas people may have had about TCP ACKs.

5. Not-quite-intermediary but affects the stream: Stateful firewalls
and NATs, whose ports remain open only as long as there is regular
keepalive traffic - and even then occasionally close randomly - and
always do so if the number of connections (from everyone) exceeds a limit.

4 and 5 are currently things that applications themselves must code
for if they want reliable operation.
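
To make 4 and 5 concrete, here is a rough sketch (Python/asyncio; the
connect/recv/send trio is hypothetical, not from any draft) of the logic
that currently lands in every application wanting reliable operation: a
keepalive timer to hold NAT mappings open, wrapped in a reconnect loop
for the closes that happen anyway.

import asyncio

KEEPALIVE_INTERVAL = 30   # seconds; must beat the most aggressive NAT timer
RETRY_DELAY = 5           # pause before re-opening a dropped connection

def handle(msg):
    print(msg)

async def run(connect):
    while True:                                   # outlive the random closes of item 5
        try:
            conn = await connect()                # hypothetical WebSocket client
            while True:
                try:
                    msg = await asyncio.wait_for(
                        conn.recv(), timeout=KEEPALIVE_INTERVAL)
                except asyncio.TimeoutError:
                    await conn.send("keepalive")  # refresh NAT/firewall state
                    continue
                handle(msg)
        except (ConnectionError, OSError):        # item 4: no orderly close to rely on
            await asyncio.sleep(RETRY_DELAY)      # then reconnect from scratch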
Post by Ian Hickson
Post by Jamie Lokier
1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.
This is essentially meaningless. Can you ever imagine a requirements
document saying: servers must be hard to write and the specification
hard to understand?
Heh, I've seen a few specs which imply it :-)

Some people have said WebSocket's draft falls into this category at
the moment... because people have different ideas about what
constitutes hard to write (some see reusing Apache as easier than
writing a socket program) and hard to understand (opinions vary).
Post by Ian Hickson
I agree. That's why I prefer the way I phrased it originally, something
along the lines of "it should be possible to implement a fully-conforming
server in a few dozen lines of code in Python or Perl" or some such.
Oh, that's easy in Perl in 2 lines :-)

use Net::WebSocket::Simple qw(server send);
server(port => 8910, on_message => sub { send("Response: " . uc($_)) });

It seems to me the dominant culture in Perl and Python worlds is to
use libraries. Maybe that's just the vocal bloggers, though.

-- Jamie
Ian Hickson
2010-02-28 22:18:22 UTC
Permalink
Post by Jamie Lokier
Post by Ian Hickson
HTTP is hugely complicated compared to WebSockets. IMHO that's a bug,
not a feature. By making WebSockets trivial to implement, we enable
these "quite creative people" to write their own servers.
However, please take a quick look at HTTP's history, because the same
will probably occur to WebSocket.
We (as the working group) decide whether it happens or not. If we avoid
adding features like content negotiation that are so complicated that
hardly anybody ends up using them, if we avoid adding five ways to encode
frames, if we avoid defining complicated header syntaxes like continuation
lines, if we define error-handling behaviour up front, if we avoid leaving
things as basic as what character encoding to use undefined, if we avoid
making our initial protocol non-extensible, in short, if we learn from the
mistakes that the HTTP working group(s) have made over the years, we can
keep Web Socket simple and it _never_ needs to get complicated.

So yes, I'm quite familiar with HTTP's history. I'm also quite confident
that the same fate does not need to befall Web Sockets. We don't need to
put multiplexing in the base protocol. We don't need to put metadata
mechanisms in the base protocol. We can make compression an optional
server-opt-in feature in the second version that has just one algorithm
and that isn't required for simple servers. We can avoid designing
theoretical Architectures like REST and just provide trivial tools on
which people can build their applications to whatever level of complexity
they like.
Post by Jamie Lokier
I also see, on embedded systems lists, people asking where they can get
a small HTTP server to do this, that or the other. It never occurs to
them to just write one (except in one case I can think of, where the
device's badly written HTTP server for diagnostics was the reason it
kept crashing.) Even though it's actually easy to write a conforming
HTTP/0.9 server.
People don't know about 0.9; when they think of HTTP they think of 1.1,
and I think it's quite reasonable to not want to implement an HTTP 1.1
server. If we design Web Sockets right, we won't have that problem.
Post by Jamie Lokier
I think the same will happen with WebSocket: Simple at first, but before
you know it, it'll have evolved - and the web will be full of dirty
workarounds for buggy clients and servers - just like what happened with
HTTP.
The reason it happened with HTTP is the same reason it happened with HTML
and CSS -- HTTP didn't define error handling behaviour, so when a peer did
the wrong thing, there was no rule saying what you had to do in response,
and whatever the dominant server or client behaviour was ended up being
the de-facto standard. If we define error handling rules so that there's
no sequence of bytes for which the behaviour is undefined, we side-step
this problem completely.
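
In code, that rule amounts to a parser with no silent fallback: every
byte either advances the state machine or kills the connection. A sketch
in Python, assuming the current draft's 0x00 ... 0xFF text framing, and
(as a simplification here, not a statement about the draft) treating
every other frame type as fatal:

class ProtocolError(Exception):
    pass

def parse_frames(stream):
    # Yield text frames from an iterable of bytes; raise (=> close the
    # connection) on anything the grammar does not explicitly allow.
    it = iter(stream)
    for first in it:
        if first != 0x00:
            raise ProtocolError("unknown frame type 0x%02x" % first)
        buf = bytearray()
        for b in it:
            if b == 0xFF:
                break
            buf.append(b)
        else:
            raise ProtocolError("connection ended mid-frame")
        try:
            yield buf.decode("utf-8")   # the encoding is defined, not guessed
        except UnicodeDecodeError:
            raise ProtocolError("frame is not valid UTF-8")

Feed it b"\x00hello\xff" and you get "hello"; feed it anything the
grammar doesn't allow and the connection dies by rule, not by accident.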
Post by Jamie Lokier
Post by Ian Hickson
For small-scale operations, there's really no need for things to be
particularly complicated. Sure, if you have hundreds, thousands,
millions, or billions of users then it is hard work -- and the
protocol's complexity pales in comparison to the scaling issues. But
when there are no scaling issues, when you have four users _total_, it
doesn't have to be hard and the protocol _can_ be a significant
burden.
+1, I agree with the general sentiment. But I think it's doomed to
fail. Especially if the attack-blocking strategies are included, it's
simply too much for many programmers at that point to write their own.
If we don't manage it, then fair enough, but I think giving up before we
try is too pessimistic for my tastes.
Post by Jamie Lokier
Let's put it this way: If plain sockets were perceived as easy to
program, why are there people considering using WebSockets for
application-to-application communications (no browser) - instead of just
using plain sockets? Even though plain sockets are obviously much less
work than do-it-yourself WebSockets?
That's what I've been asking! The answer I've gotten has been "we don't
expect the servers to provide a TCP socket solution, but they'll provide
one for their scripts running in Web browsers so at least we'll be able to
reuse that".
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Rob Sayre
2010-02-28 22:25:40 UTC
Permalink
provide trivial tools on which people can build their applications
Yes.

"You can Solve Any Problem... if you're willing to make the problem
small enough."

- Rob
Justin Erenkrantz
2010-02-28 23:21:39 UTC
Permalink
Post by Ian Hickson
and that isn't required for simple servers. We can avoid designing
theoretical Architectures like REST and just provide trivial tools on
which people can build their applications to whatever level of complexity
they like.
Comments like this reflect a poor understanding of the history of
HTTP. HTTP/0.9 and HTTP/1.0 had serious scalability issues because
people just followed your suggested model and implemented things
willy-nilly without an understanding of what side effects it might
have. The only way out of that rabbit hole was to try to come up with
a new architectural model (REST) that explained where things went
wrong - out of that came some of the changes in HTTP/1.1 that made
life easier on everyone.

If we continue to create specification drafts that present serious
scalability challenges because the editor doesn't understand (or
accept) the intrinsic problems, we're going to find ourselves back in
the dark days of HTTP/0.9 and 1.0. That's a mistake I don't want to
see happen again. Those days were not fun. -- justin
Jamie Lokier
2010-03-01 02:42:14 UTC
Permalink
Post by Justin Erenkrantz
Post by Ian Hickson
and that isn't required for simple servers. We can avoid designing
theoretical Architectures like REST and just provide trivial tools on
which people can build their applications to whatever level of complexity
they like.
Comments like this reflect a poor understanding of the history of
HTTP. HTTP/0.9 and HTTP/1.0 had serious scalability issues because
people just followed your suggested model and implemented things
willy-nilly without an understanding of what side effects it might
have. The only way out of that rabbit hole was to try to come up with
a new architectural model (REST) that explained where things went
wrong - out of that came some of the changes in HTTP/1.1 that made
life easier on everyone.
If we continue to create specification drafts that present serious
scalability challenges because the editor doesn't understand (or
accept) the intrinsic problems, we're going to find ourselves back in
the dark days of HTTP/0.9 and 1.0. That's a mistake I don't want to
see happen again. Those days were not fun. -- justin
Without going into details, I do see WebSocket's tendency to have at
least one active TCP connection per open tab in a browser (more than one
on pages built out of components from different sources), each with
scripted keepalives back and forth, becoming a scalability issue on the
client side with mobile browsers, and users who open a hundred tabs in
their browser.

No more so than XHR polling. But I believe we can do better than that.

-- Jamie
Jamie Lokier
2010-03-01 03:09:03 UTC
Permalink
Post by Jamie Lokier
Post by Justin Erenkrantz
Post by Ian Hickson
and that isn't required for simple servers. We can avoid designing
theoretical Architectures like REST and just provide trivial tools on
which people can build their applications to whatever level of complexity
they like.
Comments like this reflect a poor understanding of the history of
HTTP. HTTP/0.9 and HTTP/1.0 had serious scalability issues because
people just followed your suggested model and implemented things
willy-nilly without an understanding of what side effects it might
have. The only way out of that rabbit hole was to try to come up with
a new architectural model (REST) that explained where things went
wrong - out of that came some of the changes in HTTP/1.1 that made
life easier on everyone.
If we continue to create specification drafts that present serious
scalability challenges because the editor doesn't understand (or
accept) the intrinsic problems, we're going to find ourselves back in
the dark days of HTTP/0.9 and 1.0. That's a mistake I don't want to
see happen again. Those days were not fun. -- justin
Without going into details, I do see WebSocket's tendency to have at
least one active TCP connection per open tab in a browser (more than one
on pages built out of components from different sources), each with
scripted keepalives back and forth, becoming a scalability issue on the
client side with mobile browsers, and users who open a hundred tabs in
their browser.
Some back-of-the-envelope keepalive calculations:

1. Browser open with 10 tabs (normal usage on my phone, expect more
like this in future).

Let's suppose each is a well written WebSocket-using page, and
there are no third party components, so all the page components
can share a single WebSocket.

Keepalive messages are sent and received every 30 seconds, on each
connection. Each keepalive causes a TCP ACK.

That's one keepalive packet every 0.75 seconds average. That's a
bit power hungry, but we can just about live with it with a good signal.

2. Desktop browser open with 100 tabs (normal usage on my desktop is 200).
This isn't everyone's usage, but many people do browse like this.

Let's suppose 10% of those are "mashup" type pages, with an
embedded map or tic-tac-toe game or something, call it 3 connections.

The remainder: let's say 50% static pages, 50% using WebSocket (year 2012).

That's 75 continuously open WebSocket TCP connections. (Wow!)

Keepalive messages are sent and received every 30 seconds, on each
connection. Each keepalive causes a TCP ACK.

That's 10 packets per second continuously just from keepalives!

3. Household with 3 running browsers (per person), similar to 2 above.

=> 30 packets per second continuously just from keepalives.
=> 225 open TCPs going through the home router. Some routers will break.

4. Household with 3 light users - only 10 tabs per browser, same mix of pages.

=> 3 packets per second continuously. That's acceptable.
=> 22 open TCPs through the home routers. That's acceptable.
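
(For anyone who wants to vary the assumptions, the arithmetic behind
these numbers is just this: each connection sends and receives a
keepalive every 30 seconds, each of those is ACKed, so a connection
costs 4 packets per interval; the page mix in cases 2-4 is 10% mashup
pages at 3 connections plus 45% single-WebSocket pages.)

INTERVAL = 30.0               # seconds between keepalives per connection
PACKETS_PER_KEEPALIVE = 4     # send + receive, each with a TCP ACK

def packets_per_second(connections):
    return connections * PACKETS_PER_KEEPALIVE / INTERVAL

def connections(tabs, mashup=0.10, websocket=0.45):
    return tabs * (mashup * 3 + websocket * 1)

print(packets_per_second(10))                    # case 1: ~1.3/s, one per 0.75 s
print(connections(100), packets_per_second(75))  # case 2: 75 conns, 10/s
print(packets_per_second(3 * 75))                # case 3: 30/s
print(packets_per_second(connections(3 * 10)))   # case 4: ~3/s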

Problems:

On the mobile phone, it's quite power hungry. Waking up the packet
radio is one of the major power consumers. The number of _bytes_
isn't very important; the number of packets is.

At home (with 3 people and lots of browser tabs), it's a lot of
continuous traffic, and we haven't counted actual messages, just idle
keepalives.

If we decrease the number of tabs in the desktop browsers at home to
10 per desktop, it's acceptable. But that shouldn't be necessary.

If we increased the keepalive interval to 1 minute (which is too long
for some networks), these numbers still don't look good.

The issue here is entirely down to being unable to share keepalives
across multiple WebSocket instances.

-- Jamie
Dave Cridland
2010-03-01 11:10:09 UTC
Permalink
Post by Jamie Lokier
Keepalive messages are sent and received every 30 seconds, on each
connection. Each keepalive causes a TCP ACK.
That's deeply unrealistic now, thank heavens.

2 minutes is much more sane, and it's increasing rapidly, as
protocols other than HTTP have reached the public demand stages -
XMPP, IMAP, etc all benefit from long-lived, dormant connections.

So keepalives should, in principle, gradually fade away as reasonable
network behaviour takes over.

This behavioural flux, though, is an excellent argument for putting
keepalive behaviour into the core, as the people designing the
frameworks are likely to have good ideas of the defaults, and
keepalive frequencies will need to be controlled per-deployment,
typically, rather than per-application.

Dave.
--
Dave Cridland - mailto:***@cridland.net - xmpp:***@dave.cridland.net
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
M***@nokia.com
2010-03-01 13:36:16 UTC
Permalink
Hi,

I haven't followed the discussions very deeply but the keepalive issue caught my eye.
Post by Jamie Lokier
On the mobile phone, it's quite power hungry. Waking up the
packet radio is one of the major power consumers. The number
of _bytes_ isn't very important; the number of packets is.
That's very true. Wide area cellular radios (WCDMA, HSPA, LTE, CDMA-2000, WiMAX) all have pretty good power-save modes, but sending/receiving a packet always requires activating the radio, and after that it stays up for a while (from a couple of seconds up to 30). So, with any cellular wireless interface a keepalive rate of one per 30 seconds is really bad for the battery. In practice you could only run such an application for a few hours before you have to recharge. A keepalive rate of one per 10 minutes or so is still tolerable; the users may not notice anything exceptional in how fast the battery runs out. Things are improving with radios and batteries, but this will still be a major constraint for the next few years.

If WebSocket connection is mainly used for apps where user is somehow interactive and closes the app/connection, this is not so bad. But if the intention is that some background widget etc. uses this all the time, then things may get infeasible. Users will not use such widgets when they notice that they have to charge the device constantly.

How large the packets are may not matter that much, but there are some mainstream radio systems (WCDMA-based) where sending very small IP packets (~100-200 bytes) is possible without reserving a dedicated channel, thus requiring less radio signaling and power. So it would be very good if the keepalive packet were as small as possible. (A full HTTP request is already a problem, but since WebSocket is different I hope it's also more compact in this respect.) So keep this in mind too.
Post by Jamie Lokier
At home (with 3 people and lots of browser tabs), it's a lot
of continuous traffic, and we haven't counted actual messages,
just idle keepalives.
If we decrease the number of tabs in the desktop browsers at
home to 10 per desktop, it's acceptable. But that shouldn't
be necessary.
This is a different topic, but if IPv4 address scarcity eventually gets really bad (and IPv6 does not get deployed fast enough to help), keeping a large number of open TCP connections per subscriber/home may become costly too for the ISPs, since each requires a mapping in their big NAT box.
Post by Jamie Lokier
If we increased the keepalive interval to 1 minute (which is
too long for some networks), these numbers still don't look good.
True.
Post by Jamie Lokier
The issue here is entirely down to being unable to share
keepalives across multiple WebSocket instances.
That would be useful. Having said that, even if each instance uses a separate keepalive, an implementation may still be able to "synchronize" them so that they happen at the same time. That would help with waking up the radio to some extent too. But even better would be if a single keepalive could be used.

Here's my wishlist for the keepalive:
- Make it as small as possible in terms of bytes.
- Provide a way for the client (and the server?) to determine the required rate, so that an optimal rate [as low as possible vs. keeping all the boxes in the middle happy] can be used (a rough sketch follows below).
- If possible, allow sharing keepalives. (Don't know if it's feasible in this context. I understand this also requires sharing a TCP connection, at least where NAT/FW-type middleboxes are concerned.)
- If possible, allow switching to some other protocol mechanism for keepalive purposes, say XMPP. (This might be out-of-scope for WebSocket work but in theory this could be possible.)
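
For the second item, one possible shape - purely a sketch, nothing like
this exists in any draft: the client probes for the longest interval the
path tolerates and settles just below the first one that fails.

MIN_INTERVAL = 30          # seconds: floor for hostile middleboxes
MAX_INTERVAL = 15 * 60     # ceiling: a radio-friendly rate on cellular

def next_interval(current, survived):
    # Double the keepalive interval while idle gaps survive; after the
    # first failure, drop back to half the interval that failed.
    if survived:
        return min(current * 2, MAX_INTERVAL)
    return max(current // 2, MIN_INTERVAL)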

I guess now it would be a good time for me to review what the draft actually says about keepalives and comment on that....
Regards,
Markus
Ian Hickson
2010-03-01 22:20:44 UTC
Permalink
Post by M***@nokia.com
If WebSocket connection is mainly used for apps where user is somehow
interactive and closes the app/connection, this is not so bad. But if
the intention is that some background widget etc. uses this all the
time, then things may get infeasible. Users will not use such widgets
when they notice that they have to charge the device constantly.
For server-push updates, the server-sent events feature:

http://dev.w3.org/html5/eventsource/

...is designed to be implementable in such a way that handsets can offload
the keepalive responsibility to the cell network infrastructure, to avoid
this problem. I wouldn't recommend using WebSockets for server-push
updates; when the client isn't sending data, it's best to shut down the
WebSocket connection altogether.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Greg Wilkins
2010-03-01 22:57:33 UTC
Permalink
Post by Ian Hickson
Post by M***@nokia.com
If WebSocket connection is mainly used for apps where user is somehow
interactive and closes the app/connection, this is not so bad. But if
the intention is that some background widget etc. uses this all the
time, then things may get infeasible. Users will not use such widgets
when they notice that they have to charge the device constantly.
http://dev.w3.org/html5/eventsource/
...is designed to be implementable in such a way that handsets can offload
the keepalive responsibility to the cell network infrastructure, to avoid
this problem. I wouldn't recommend using WebSockets for server-push
updates; when the client isn't sending data, it's best to shut down the
WebSocket connection altogether.
Ian,

I had seen this server-sent "feature" some time ago and had erroneously
thought it was some early attempt at server push that had been replaced
by websocket.

I find it very very strange that HTML 5 contains two very similar
mechanisms. You'd think that a full duplex mechanism like websocket
could easily handle the simplex needs of server-sent-events.

Note also that I don't see anything in server-sent-events that
will prevent an intermediary caching the content in a way
that will prevent events from being delivered. I expect
intermediaries will need to be updated to have special case
handling for this content type. It also appears that it
cannot deliver binary data.

The connectionless push feature would be really worthwhile
to have for full duplex as well as simplex communication, although
the current spec appears to imply this is a proprietary mechanism
rather than an open standard?


Is there much development happening with this feature?


regards
Maciej Stachowiak
2010-03-01 23:58:40 UTC
Permalink
Post by Greg Wilkins
Post by Ian Hickson
Post by M***@nokia.com
If WebSocket connection is mainly used for apps where user is somehow
interactive and closes the app/connection, this is not so bad. But if
the intention is that some background widget etc. uses this all the
time, then things may get infeasible. Users will not use such widgets
when they notice that they have to charge the device constantly.
http://dev.w3.org/html5/eventsource/
...is designed to be implementable in such a way that handsets can offload
the keepalive responsibility to the cell network infrastructure, to avoid
this problem. I wouldn't recommend using WebSockets for server-push
updates; when the client isn't sending data, it's best to shut down the
WebSocket connection altogether.
Ian,
I had seen this server-sent "feature" some time ago and had
erroneously thought it was some early attempt at server push that had
been replaced by websocket.
I find it very very strange that HTML 5 contains two very similar
mechanisms. You'd think that a full duplex mechanism like websocket
could easily handle the simplex needs of server-sent-events.
Note also that I don't see anything in server-sent-events that
will prevent an intermediary caching the content in a way
that will prevent events from being delivered. I expect
intermediaries will need to be updated to have special case
handling for this content type. It also appears that it
cannot deliver binary data.
The connectionless push feature would be really worthwhile
to have for full duplex as well as simplex communication, although
the current spec appears to imply this is a proprietary mechanism
rather than an open standard?
Is there much development happening with this feature?
WebKit supports EventSource and has for some time. Even in light of
WebSocket, I believe it is useful because:

1) The half duplex model presents less of a deployment challenge. We know
it can go through unmodified HTTP proxies and man-in-the-middle
pseudo-proxies without breaking mysteriously. It's just syntactic sugar on
top of a streaming GET.

2) This model can plausibly convert to an alternate tunnel for
notifications, such as SMS to cell phones. There have been proposals
for how to do this though no formal spec yet. I believe there is
interest in standards activity in this area.

As for caching, it would be the server's responsibility to ensure that
its response is not unduly cached, as with XMLHttpRequest-based data
services. It seems to me this can be achieved with Cache-Control
headers. Servers need to make the response uncacheable anyway to avoid
caching at the client.
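
To show how thin the sugar is, here is a minimal event-stream responder
(Python stdlib; the port and the five-tick loop are arbitrary, and error
handling is omitted):

import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class EventStream(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Cache-Control", "no-cache")  # avoid undue caching
        self.end_headers()
        for n in range(5):
            # one event: an optional "id:" line, a "data:" line, a blank line
            self.wfile.write(b"id: %d\ndata: tick %d\n\n" % (n, n))
            self.wfile.flush()
            time.sleep(1)

HTTPServer(("", 8000), EventStream).serve_forever()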

Regards,
Maciej
Greg Wilkins
2010-03-02 06:41:50 UTC
Permalink
Post by Maciej Stachowiak
Post by Greg Wilkins
Post by Ian Hickson
Post by M***@nokia.com
If WebSocket connection is mainly used for apps where user is somehow
interactive and closes the app/connection, this is not so bad. But if
the intention is that some background widget etc. uses this all the
time, then things may get infeasible. Users will not use such widgets
when they notice that they have to charge the device constantly.
http://dev.w3.org/html5/eventsource/
...is designed to be implementable in such a way that handsets can offload
the keepalive responsibility to the cell network infrastructure, to avoid
this problem. I wouldn't recommend using WebSockets for server-push
updates; when the client isn't sending data, it's best to shut down the
WebSocket connection altogether.
Ian,
I had seen this server-sent "feature" some time ago and had erroneously
thought it was some early attempt at server push that had been replaced
by websocket.
I find it very very strange that HTML 5 contains two very similar
mechanisms. You'd think that a full duplex mechanism like websocket
could easily handle the simplex needs of server-sent-events.
Note also that I don't see anything in server-sent-events that
will prevent an intermediary caching the content in a way
that will prevent events from being delivered. I expect
intermediaries will need to be updated to have special case
handling for this content type. It also appears that it
cannot deliver binary data.
The connectionless push feature would be really worthwhile
to have for full duplex as well as simplex communication, although
the current spec appears to imply this is a proprietary mechanism
rather than an open standard?
Is there much development happening with this feature?
WebKit supports EventSource and has for some time. Even in light of
WebSocket, I believe it is useful because:
1) The half duplex model presents less of a deployment challenge. We know it can
go through unmodified HTTP proxies and man-in-the-middle pseudo-proxies
without breaking mysteriously. It's just syntactic sugar on top of a
streaming GET.
Does it? Proxies are under no obligation to pass on even 1 byte
of a response until the very last byte has been received from the
server. It just so happens that most of them don't buffer
significant amounts of content, so "streaming" push like this
mostly works. But there is nothing in the standards to prevent
a valid proxy buffering such streams, and there are enough proxies
out there that do so that pure streaming push solutions mostly
need to have a fallback to long polling when the stream does not
work.

Has there been any study to see how well this content
type actually transmits through the internet?
Post by Maciej Stachowiak
2) This model can plausibly convert to an alternate tunnel for
notifications, such as SMS to cell phones. There have been proposals for
how to do this though no formal spec yet. I believe there is interest in
standards activity in this area.
Which is a really good initiative. If you have any pointers to where
this is taking place, then I'm very interested to know, as I think we
should consider a similar facility for websocket.

As Roy said in his recent post, requiring N connections for N clients
does not scale, and this applies equally to duplex as to one-way
communications.
Post by Maciej Stachowiak
As for caching, it would be the server's responsibility to ensure that
its response is not unduly cached, as with XMLHttpRequest-based data
services. It seems to me this can be achieved with Cache-Control
headers. Servers need to make the response uncacheable anyway to avoid
caching at the client.
Sorry - I meant to say buffering rather than caching (see above).

regards
Maciej Stachowiak
2010-03-02 06:57:27 UTC
Permalink
Post by Greg Wilkins
Post by Maciej Stachowiak
WebKit supports EventSource and has for some time. Even in light of
WebSocket, I believe it is useful because:
1) The half duplex model presents less of a deployment challenge. We know
it can go through unmodified HTTP proxies and man-in-the-middle
pseudo-proxies without breaking mysteriously. It's just syntactic sugar
on top of a streaming GET.
Does it? Proxies are under no obligation to pass on even 1 byte
of a response until the very last byte has been received from the
server. It just so happens that most of them don't buffer
significant amounts of content, so "streaming" push like this
mostly works. But there is nothing in the standards to prevent
a valid proxy buffering such streams, and there are enough proxies
out there that do so that pure streaming push solutions mostly
need to have a fallback to long polling when the stream does not
work.
I'm primarily concerned with whether it breaks in practice, not
whether a proxy could conform to standards and still break it.
Post by Greg Wilkins
Has there been any study to see how well this content
type actually transmits through the internet?
I'm not aware of any quantitative studies. The best evidence I am
aware of that it works is the fact that streaming GET (i.e. via
XMLHttpRequest) is widely used and is not reported to be problematic.
EventSource behaves identically at the protocol level (other than the
funny content type).
Post by Greg Wilkins
Post by Maciej Stachowiak
2) This model can plausibly convert to an alternate tunnel for
notifications, such as SMS to cell phones. There have been
proposals for
how to do this though no formal spec yet. I believe there is
interest in
standards activity in this area.
Which is a really good initiative. If you have any pointers to where
this is taking place, then I'm very interested to know, as I think we
should consider a similar facility for websocket.
This was discussed at TPAC 2009 but I do not have pointers handy.
Post by Greg Wilkins
As Roy said in his recent post, requiring N connections for N clients
does not scale, and this applies equally to duplex as to one-way
communications.
EventSource can in theory work with 0 persistent connections, given
the right kind of way to communicate with the server. But I believe
there are categories of services where a persistent connection is
workable and desirable. I expect not every Web site will want to use
it, but the widespread use of the "long polling" pattern indicates
demand and willingness to keep connections open for a while.
Post by Greg Wilkins
Post by Maciej Stachowiak
As for caching, it would be the server's responsibility to ensure that
its response is not unduly cached, as with XMLHttpRequest-based data
services. It seems to me this can be achieved with Cache-Control
headers. Servers need to make the response uncacheable anyway to avoid
caching at the client.
Sorry - I meant to say buffering rather than caching (see above).
In that case, I see your point. The best information we have is that
undue buffering is not widespread. But it's certainly possible. Are
you aware of any HTTP intermediary software that would buffer an
uncacheable streaming response indefinitely?

Regards,
Maciej
Greg Wilkins
2010-03-02 07:30:11 UTC
Permalink
I'm primarily concerned with whether it breaks in practice, not whether
a proxy could conform to standards and still break it.
I agree that we can't get too hung up about every theoretical
problem that might occur. But in this case, it is definitely
a real problem - albeit not that common. Generally I
see that relying on common implementation rather than
standard specification is going to be at best fragile.
In that case, I see your point. The best information we have is that
undue buffering is not widespread. But it's certainly possible. Are you
aware of any HTTP intermediary software that would buffer an uncacheable
streaming response indefinitely?
I have definitely encountered such proxies in the wild.
In fact for a few years I was living in an apartment block that
provided internet, but they had a transparent proxy that
buffered responses (it could have been squid?) and only flushed on
buffer overflow.

As I was unable to use streaming, I was unable to support it well,
and that is one of the reasons it was not maintained and
eventually dropped from the cometd project.

I know that is only anecdotal, but it was frustrating enough
for me to be very wary of streaming content over protocols
not designed for streaming.


It would be really good if somebody with a widely deployed
push site that supports streaming could comment on any stats
they have about how often they need to fall back to long polling?
My own sites don't even try streaming, so I don't have that info.
Anybody from Google Wave here? What about GChat?



regards
Julian Reschke
2010-03-02 11:01:25 UTC
Permalink
Post by Ian Hickson
Post by M***@nokia.com
If WebSocket connection is mainly used for apps where user is somehow
interactive and closes the app/connection, this is not so bad. But if
the intention is that some background widget etc. uses this all the
time, then things may get infeasible. Users will not use such widgets
when they notice that they have to charge the device constantly.
http://dev.w3.org/html5/eventsource/
...
I had forgotten about this one. It would probably be good if the HTTP
aspects of this spec got more review; I already sent a comment to
public-webapps that the new "Last-Event-ID" header appears to duplicate
functionality you can already get from "If-None-Match".
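
Concretely (hostname and ID invented for illustration), the reconnect
request ends up carrying a new header where an existing validator
arguably covers the same "resume from here" case:

import http.client

conn = http.client.HTTPConnection("example.org")
conn.request("GET", "/stream", headers={
    "Accept": "text/event-stream",
    "Last-Event-ID": "42",        # what the eventsource draft defines
    # "If-None-Match": '"42"',    # the pre-existing HTTP mechanism
})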

Best regards, Julian
M***@nokia.com
2010-03-02 13:21:18 UTC
Permalink
Hi Ian,
Post by Ian Hickson
http://dev.w3.org/html5/eventsource/
...is designed to be implementable in such a way that handsets
can offload the keepalive responsibility to the cell network
infrastructure, to avoid this problem. I wouldn't recommend
using WebSockets for server-push updates; when the client
isn't sending data, it's best to shut down the WebSocket
connection altogether.
This is certainly possible. Mobile operators do use SMS wake-up for some of their own services, such as Multimedia Messaging (MMS). But I foresee severe challenges in using it for Internet/Web applications. SMS is a service offered by mobile operators, and often they charge for it per message and the price varies per operator and region. So, this brings up the question of business relationships etc., which can limit the service uptake considerably.

I'm definitely hoping that WebSocket will provide a reasonable channel for "server push", at least for all cases where HTTP long polling and/or streaming are used at the moment. I'm not holding my breath for getting SMS or any cellular network specific transport widely available for those use cases.

I understand that if there is a middlebox (proxy, NAT, whatever) that simply requires very frequent keep-alive, there's not much we can do in that case. But I also expect that the operators' awareness in these matters will grow, and many (not all) of them will configure their networks well, and in those situations WebSocket should be able to work well.

Markus
Greg Wilkins
2010-03-01 07:26:31 UTC
Permalink
All,

While this thread appears a little unproductive (as I'm not sure
establishing who has the best appreciation of HTTP history is
a milestone for the WG), I actually think there are some
green shoots.

It's really great that Ian is now talking about "we" the working
group and acknowledging that we are in a process that might
add some more features to the spec.

Since we are in the requirements phase of the working group,
I think we all should look at the points being raised here
and see if we can formulate them as requirements on the specification.

eg - we all kind of agree that good error handling is needed
in the protocol.... but can we agree that error handling
includes differentiating orderly/idle/failure closes of
connections? We should write up the requirements for error handling!

eg - there is some consensus that compression should be
supported as an option. Is that something
that we can capture as a requirement?


Also, we all appear to agree that the resulting protocol will
support layers... a base framing layer with things like
content-types, meta-data, fragmentation, multiplexing, etc.
done in higher layers. A fundamental question is - are
these higher layers something for this WG or not? I would
like to think yes, because we can't design a framing transport
in a total vacuum, nor do we wish to encourage 20 different
meta-data layers etc. The SPDY spec appears able to
differentiate itself into a framing layer and a services layer,
so perhaps we can follow that approach?


cheers
Maciej Stachowiak
2010-03-01 08:13:15 UTC
Permalink
Post by Greg Wilkins
All,
While this thread appears a little unproductive (as I'm not sure
establishing who has the best appreciation of HTTP history is
a milestone for the WG), I actually think there are some
green shoots.
It's really great that Ian is now talking about "we" the working
group and acknowledging that we are in a process that might
add some more features to the spec.
Since we are in the requirements phase of the working group,
I think we all should look at the points being raised here
and see if we can formulate them as requirements on the specification.
eg - we all kind of agree that good error handling is needed
in the protocol.... but can we agree that error handling
includes differentiating orderly/idle/failure closes of
connections? We should write up the requirements for error handling!
The most basic error-handling requirement should be that behavior in
the face of any error condition must be well-defined.

A requirement to report specific conditions (error or otherwise) would
be separate. I think as far as differentiating orderly/idle/failure
closes, I'd need to hear more about what counts as what. An idle close
by your opposite endpoint could presumably just be an orderly close.
So I assume idle closes worth distinguishing would be by an
intermediary. Are those actually distinguishable from failure?
Post by Greg Wilkins
eg - there is some consensus that compression should be
supported as an option. Is that something
that we can capture as a requirement?
I think Ian's position was that compression could be added as an
option in a future version of the protocol. (I personally have no
opinion, though I would be inclined to say let's not add it until we
know it's actually needed - see
<http://c2.com/xp/YouArentGonnaNeedIt.html>).
Post by Greg Wilkins
Also, we all appear to agree that the resulting protocol will
support layers... a base framing layer with things like
content-types, meta-data, fragmentation, multiplexing, etc.
done in higher layers. A fundamental question is - are
these higher layers something for this WG or not? I would
like to think yes, because we can't design a framing transport
in a total vacuum, nor do we wish to encourage 20 different
meta-data layers etc. The SPDY spec appears able to
differentiate itself into a framing layer and a services layer,
so perhaps we can follow that approach?
I don't think we are chartered to define protocols that run on top of
the WebSocket protocol, in addition to the WebSocket protocol itself.

Note: I think some of those features could reasonably be added to the
base protocol, but perhaps not in v1.

Regards,
Maciej
Jamie Lokier
2010-03-01 08:53:36 UTC
Permalink
Post by Maciej Stachowiak
Post by Greg Wilkins
All,
While this thread appears a little unproductive (as I'm not sure
establishing who has the best appreciation of HTTP history is
a milestone for the WG), I actually think there are some
green shoots.
It's really great that Ian is now talking about "we" the working
group and acknowledging that we are in a process that might
add some more features to the spec.
Since we are in the requirements phase of the working group,
I think we all should look at the points being raised here
and see if we can formulate them as requirements on the specification.
eg - we all kind of agree that good error handling is needed
in the protocol.... but can we agree that error handling
includes differentiating orderly/idle/failure closes of
connections? We should write up the requirements for error handling!
The most basic error-handling requirement should be that behavior in
the face of any error condition must be well-defined.
A requirement to report specific conditions (error or otherwise) would
be separate. I think as far as differentiating orderly/idle/failure
closes, I'd need to hear more about what counts as what. An idle close
by your opposite endpoint could presumably just be an orderly close.
So I assume idle closes worth distinguishing would be by an
intermediary. Are those actually distinguishable from failure?
Post by Greg Wilkins
eg - there is some consensus that compression should be
supported as an option . Is that something
that we can capture as a requirement?
I think Ian's position was that compression could be added as an
option in a future version of the protocol. (I personally have no
opinion, though I would be inclined to say let's not add it until we
know it's actually needed - see <http://c2.com/xp/YouArentGonnaNeedIt.html>.)
Post by Greg Wilkins
Also, we all appear to agree that the resulting protocol will
support layers... a base framing layer with things like
content-types, meta-data, fragmentation, multiplexing, etc.
done in higher layers. A fundamental question is - are
these higher layers something for this WG or not? I would
like to think yes, because we can't design a framing transport
in a total vacuum, nor do we wish to encourage 20 different
meta-data layers etc. The SPDY spec appears able to
differentiate itself into a framing layer and a services layer,
so perhaps we can follow that approach?
I don't think we are chartered to define protocols that run on top of
the WebSocket protocol, in addition to the WebSocket protocol itself.
Note: I think some of those features could reasonably be added to the
base protocol, but perhaps not in v1.
I think there are some aspects of framing that are difficult to put in
a higher layer without making inefficient use of the lower layer. We
should work out exactly what the minimum capability needed from the
lower layer actually is, without making it so minimal that it causes
gross inefficiency.

Specifically, distinguishing parts of split messages (a higher layer
concept), because that affects the delivery timing (whether to delay
for more, or forward immediately) done by an agent that's only aware
of basic frames/messages (a lower layer concept). The concept needing
to be passed down is a flag (maybe two) influencing delivery
buffering/timing. Without it, messages are either delayed for a long
time, or forwarded too quickly which paradoxically causes TCP to
introduce quarter-second delays all over the place due to partial
segments.
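
As an illustration of the flag being described (a sketch only, in
Node-style JavaScript; the frame object and its 'fin' field are
hypothetical stand-ins for whatever the lower layer would expose),
an intermediary could use it like this:

var pending = [];

function relayFrame(frame, upstream) {
  // 'frame' is assumed to be { fin: boolean, payload: Buffer },
  // produced by some frame parser not shown here.
  pending.push(frame.payload);
  if (frame.fin) {
    // Final fragment: forward the whole message in one write, so the
    // kernel can send full segments rather than trickling partial ones.
    upstream.write(Buffer.concat(pending));
    pending = [];
  }
  // Non-final fragments are held back. Without the flag, the relay
  // must guess between delaying (latency) and forwarding each fragment
  // immediately (partial segments and the quarter-second stalls above).
}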

-- Jamie
Greg Wilkins
2010-03-01 17:55:47 UTC
Permalink
Post by Maciej Stachowiak
A requirement to report specific conditions (error or otherwise) would
be separate. I think as far as differentiating orderly/idle/failure
closes, I'd need to hear more about what counts as what. An idle close
by your opposite endpoint could presumably just be an orderly close. So
I assume idle closes worth distinguishing would be by an intermediary.
Are those actually distinguishable from failure?
I think idle close where the browser/server/network decides to close
the connection because it is idle, is very different to an application
initiated close, where the application has decided that a connection
to the server is no longer needed. Both are very different to
the server closing the connection because of some protocol
violation or a too large a message.

For a network idle close, an app is likely to want to
immediately create a new websocket connection.

For a close during handshake the app is likely
to want to retry, but after a backed off delay to
prevent busy looping.

For an application initiated close, the app is
not likely to reconnect at all.
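
A client-side sketch of those three policies (illustrative only; it
assumes a hypothetical close reason on the event, say e.reason, which
the WebSocket API of the day did not provide):

var delay = 1000;
function connect(url) {
  var ws = new WebSocket(url);
  ws.onopen = function() { delay = 1000; };       // healthy: reset backoff
  ws.onclose = function(e) {
    if (e.reason === 'application') {
      return;                                     // deliberate close: stay closed
    } else if (e.reason === 'idle') {
      connect(url);                               // idle close: reopen immediately
    } else {
      setTimeout(function() { connect(url); }, delay);  // failure: back off
      delay = Math.min(delay * 2, 60000);
    }
  };
}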


I've just blogged about these kinds of use cases; for
IETF IP reasons I'll paste the badly formatted
text of it below. Sorry for the product plugs.

cheers




http://blogs.webtide.com/gregw/entry/websocket_chat
-------------------------------------------------------------

The websocket protocol has been touted as a great leap forward for bidirectional web applications like chat, promising a new era of simple comet applications. Unfortunately there
is no such thing as a silver bullet and this blog will walk through a simple chat room to see where websocket does and does not help with comet applications. In a websocket world,
there is even more need for frameworks like cometd.

Simple Chat

A chat is the "hello world" application of web 2.0, and a simple websocket chat room is included with jetty-7, which now supports websockets. The source of the simple chat can be
seen in svn for the client side and server side. The key part of the client side is to establish a WebSocket connection:

join: function(name) {
this._username=name;
var location = document.location.toString().replace('http:','ws:');
this._ws=new WebSocket(location);
this._ws.onopen=this._onopen;
this._ws.onmessage=this._onmessage;
this._ws.onclose=this._onclose;
},

It is then possible for the client to send a chat message to the server:

_send: function(user,message){
user=user.replace(':','_');
if (this._ws)
this._ws.send(user+':'+message);
},

and to receive a chat message from the server and to display it:

_onmessage: function(m) {
if (m.data){
var c=m.data.indexOf(':');
var from=m.data.substring(0,c).replace('<','&lt;').replace('>','&gt;');
var text=m.data.substring(c+1).replace('<','&lt;').replace('>','&gt;');

var chat=$('chat');
var span
Maciej Stachowiak
2010-03-01 18:18:19 UTC
Permalink
Post by Greg Wilkins
Post by Maciej Stachowiak
A requirement to report specific conditions (error or otherwise) would
be separate. I think as far as differentiating orderly/idle/failure
closes, I'd need to hear more about what counts as what. An idle close
by your opposite endpoint could presumably just be an orderly close. So
I assume idle closes worth distinguishing would be by an
intermediary.
Are those actually distinguishable from failure?
I think idle close where the browser/server/network decides to close
the connection because it is idle, is very different to an application
initiated close, where the application has decided that a connection
to the server is no longer needed. Both are very different to
the server closing the connection because of some protocol
violation or a too large a message.
For a network idle close, an app is likely to want to
immediately create a new websocket connection.
That seems like a reasonable use case, I'm just not sure if it is
possible to distinguish idle close by an intermediary from random
network failure that breaks the connection. At least for
intermediaries that can pass the protocol through without being
modified (e.g. HTTPS or SOCKS proxies).

Regards,
Maciej
Greg Wilkins
2010-03-01 18:31:50 UTC
Permalink
Post by Maciej Stachowiak
That seems like a reasonable use case, I'm just not sure if it is
possible to distinguish idle close by an intermediary from random
network failure that breaks the connection. At least for intermediaries
that can pass the protocol through without being modified (e.g. HTTPS or
SOCKS proxies).
I don't think we can tell the difference - unless we do some heuristic/statistical
analysis to see if a certain path always closes after a particular period of idleness.
Some implementations might do this if other timeouts become discoverable... but you'd not
want to require such analysis.


Luckily the handling for a network failure and an idle timeout in an intermediary
will be pretty much the same - immediately reopen the connection.

If the reopen then closes before handshake completes, that will tell the
app that it was a network failure and not an idle timeout.
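
In client-side JavaScript that test is just a flag (a sketch, using
nothing beyond the basic API):

var opened = false;
var ws = new WebSocket(url);
ws.onopen = function() { opened = true; };
ws.onclose = function() {
  if (!opened) {
    // Closed before the handshake completed: treat it as a network
    // failure and retry after a backed-off delay.
  } else {
    // Was open and then closed: plausibly an idle timeout somewhere
    // on the path; reopening immediately is reasonable.
  }
};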

cheers
Maciej Stachowiak
2010-03-01 19:48:18 UTC
Permalink
Post by Greg Wilkins
Post by Maciej Stachowiak
That seems like a reasonable use case, I'm just not sure if it is
possible to distinguish idle close by an intermediary from random
network failure that breaks the connection. At least for intermediaries
that can pass the protocol through without being modified (e.g. HTTPS or
SOCKS proxies).
I don't think we can tell the difference - unless we do some
heuristic/statistical
analysis to see if a certain path always closes after a particular period of idleness.
Some implementations might do this if other timeouts become discoverable... but you'd not
want to require such analysis.
Luckily the handling for a network failure and an idle timeout in an intermediary
will be pretty much the same - immediately reopen the connection.
If the reopen then closes before handshake completes, that will tell the
app that it was a network failure and not an idle timeout.
So it sounds like the cases to distinguish are "clean close" vs
"unexpected close". It sounds like that is feasible to do with
explicit close messages.

Regards,
Maciej
Julian Reschke
2010-03-01 08:39:37 UTC
Permalink
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.

Best regards, Julian
Ian Hickson
2010-03-01 08:59:26 UTC
Permalink
Post by Julian Reschke
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.
Tim's initial implementation (pre-0.9, though often referred to as 0.9)
didn't have response headers or any way to indicate metadata with a
response. That's what I was referring to.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Ian Hickson
2010-03-01 09:06:10 UTC
Permalink
Post by Ian Hickson
Post by Julian Reschke
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.
Tim's initial implementation (pre-0.9, though often referred to as 0.9)
Actually my bad, what I described is indeed 0.9, not pre-0.9. My
apologies.
Post by Ian Hickson
didn't have response headers or any way to indicate metadata with a
response. That's what I was referring to.
This still holds, however. HTTP 0.9, as originally designed, had no
response metadata, it could only carry HTML with all metadata in the
actual file itself. (I wouldn't be surprised if this is the original
reason for why we have to do so much sniffing these days.)
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Anne van Kesteren
2010-03-01 09:28:08 UTC
Permalink
Post by Ian Hickson
This still holds, however. HTTP 0.9, as originally designed, had no
response metadata, it could only carry HTML with all metadata in the
actual file itself. (I wouldn't be surprised if this is the original
reason for why we have to do so much sniffing these days.)
Julian Reschke
2010-03-01 09:54:10 UTC
Permalink
Post by Ian Hickson
Post by Ian Hickson
Post by Julian Reschke
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.
Tim's initial implementation (pre-0.9, though often referred to as 0.9)
Actually my bad, what I described is indeed 0.9, not pre-0.9. My
apologies.
...
I find this *very* misleading. We are discussing protocol design in the
context of the IETF. The IETF has taken 0.9 as input, and came up with
1.0, which is *very* extensible.

Maybe a lesson for this WG?

Best regards, Julian
Ian Hickson
2010-03-01 10:11:04 UTC
Permalink
Post by Julian Reschke
Post by Ian Hickson
Post by Ian Hickson
Post by Julian Reschke
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.
Tim's initial implementation (pre-0.9, though often referred to as 0.9)
Actually my bad, what I described is indeed 0.9, not pre-0.9. My
apologies.
...
I find this *very* misleading. We are discussing protocol design in the
context of the IETF. The IETF has taken 0.9 as input, and came up with 1.0,
which is *very* extensible.
...and which broke backwards-compatibility with legacy clients at the
time.
Post by Julian Reschke
Maybe a lesson for this WG?
I agree that we can learn from that, though maybe not about what we should
learn. I think it teaches us to make sure our protocol has designed-in
extension mechanisms for future versions that are forwards-compatible,
that we should keep our protocol very simple, even in an "official"
version, and that we should ensure we never break backwards-compatibility
with legacy deployed services.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Julian Reschke
2010-03-01 10:29:53 UTC
Permalink
Post by Ian Hickson
Post by Julian Reschke
Post by Ian Hickson
Post by Ian Hickson
Post by Julian Reschke
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.
Tim's initial implementation (pre-0.9, though often referred to as 0.9)
Actually my bad, what I described is indeed 0.9, not pre-0.9. My
apologies.
...
I find this *very* misleading. We are discussing protocol design in the
context of the IETF. The IETF has taken 0.9 as input, and came up with 1.0,
which is *very* extensible.
...and which broke backwards-compatibility with legacy clients at the
time.
Was this a problem in practice? I honestly don't recall.
Post by Ian Hickson
Post by Julian Reschke
Maybe a lesson for this WG?
I agree that we can learn from that, though maybe not about what we should
learn. I think it teaches us to make sure our protocol has designed-in
extension mechanisms for future versions that are forwards-compatible,
that we should keep our protocol very simple, even in an "official"
version, and that we should ensure we never break backwards-compatibility
with legacy deployed services.
Hm, no. Break backwards compatibility if it brings clear benefits, and
the damage is not too big.

Best regards, Julian
Martin J. Dürst
2010-03-02 09:19:10 UTC
Permalink
Post by Julian Reschke
Post by Ian Hickson
Post by Julian Reschke
I find this *very* misleading. We are discussing protocol design in the
context of the IETF. The IETF has taken 0.9 as input, and came up with 1.0,
which is *very* extensible.
...and which broke backwards-compatibility with legacy clients at the
time.
Was this a problem in practice? I honestly don't recall.
As far as you can see from
http://www.w3.org/DesignIssues/CompatibleProof, there wasn't a problem.
But seen from 0.9, that was by accident (not every detail written down)
and due to how the servers of the time (of which there were very few)
were implemented, not by design.

Regards, Martin.
--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:***@it.aoyama.ac.jp
Greg Wilkins
2010-03-01 10:39:51 UTC
Permalink
Post by Ian Hickson
I agree that we can learn from that, though maybe not about what we should
learn. I think it teaches us to make sure our protocol has designed-in
extension mechanisms for future versions that are forwards-compatible,
that we should keep our protocol very simple, even in an "official"
version, and that we should ensure we never break backwards-compatibility
with legacy deployed services.
Unless those legacy deployed services are using an insecure handshake
that we have to change anyway.

Seriously, I think that any services built on the current rollout
of websocket have to know that it is very early days... and that
they can expect at least one phase of breakage before a standard
is agreed.

cheers
Roy T. Fielding
2010-03-02 05:32:19 UTC
Permalink
Post by Ian Hickson
Post by Ian Hickson
Post by Julian Reschke
Post by Ian Hickson
...
making our initial protocol non-extensible, in short, if we learn from the
...
Sorry? HTTP appears to be very extensible to me.
Tim's initial implementation (pre-0.9, though often referred to as 0.9)
Actually my bad, what I described is indeed 0.9, not pre-0.9. My
apologies.
...
I find this *very* misleading. We are discussing protocol design in the context of the IETF. The IETF has taken 0.9 as input, and came up with 1.0, which is *very* extensible.
Actually, we didn't even create an IETF working group for HTTP
until December 1994, more than a year after everyone was using
the various bits of "HTTP/1.0" as designed on www-talk and mostly
documented on CERN's website. The IETF effort was to make a
standard out of what everyone agreed to implement, not just
document every single misfeature found on the Web.

RFC1945 is the subset of HTTP/1.0 implementations in the wild
that actually worked in practice after two years of deployment.

HTTP/1.1 became necessary after it was made clear that 1.0
in its then-current form was unsuitable for standardization.

My preference is that Internet protocols be proven in practice
before anyone tries to claim them as a standard. Experiments
should be allowed to experiment, and distinct versioning is
necessary to get deployment experience without reducing the
standards process to a bad case of rubber stamping.

....Roy
Alexander Philippou
2010-03-01 12:57:47 UTC
Permalink
Post by Ian Hickson
If we avoid adding features like content negotiation that are so complicated that hardly anybody ends up using them
In Web services, content negotiation (exchanging headers such as Accept and Content-Type) is used to support multiple SOAP/REST message encodings and compression algorithms on a single service endpoint so that responses are encoded depending on what each particular client supports/prefers.

Web Socket is superior to HTTP for use by Web services. It would be sufficient to support content negotiation on a per-connection basis during the initial handshake (instead of on a per-message basis as per HTTP) for Web Socket to be used by Web services.

Can we use the initial HTTP handshake for content negotiation?

Alexander Philippou
Ian Hickson
2010-03-01 22:14:31 UTC
Permalink
Post by Alexander Philippou
If we avoid adding features like content negotiation that are so
complicated that hardly anybody ends up using them
In Web services, content negotiation (exchanging headers such as Accept
and Content-Type) is used to support multiple SOAP/REST message
encodings and compression algorithms on a single service endpoint so
that responses are encoded depending on what each particular client
supports/prefers.
Web Socket is superior to HTTP for use by Web services. It would be
sufficient to support content negotiation on a per-connection basis
during the initial handshake (instead of on a per-message basis as per
HTTP) for Web Socket to be used by Web services.
Can we use the initial HTTP handshake for content negotiation?
Could you elaborate on what the concrete use case is?
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Alexander Philippou
2010-03-02 10:30:32 UTC
Permalink
Post by Ian Hickson
Post by Alexander Philippou
In Web services, content negotiation (exchanging headers such as Accept
and Content-Type) is used to support multiple SOAP/REST message
encodings and compression algorithms on a single service endpoint so
that responses are encoded depending on what each particular client
supports/prefers.
Web Socket is superior to HTTP for use by Web services. It would be
sufficient to support content negotiation on a per-connection basis
during the initial handshake (instead of on a per-message basis as per
HTTP) for Web Socket to be used by Web services.
Can we use the initial HTTP handshake for content negotiation?
Could you elaborate on what the concrete use case is?
Example. Server supports XML, Fast Infoset and JSON as message encodings; it also supports GZIP, DEFLATE and LZF compression. Client connects and specifies in the header that it accepts "application/xml" and "application/fastinfoset" content types, and "gzip" content encoding. Server responds that, for this particular channel, "application/fastinfoset" will be the content type and "gzip" the content encoding for all messages exchanged by both sides.

So content negotiation enables services to be used in a manner that is most efficient for a particular situation. It enables the performance of Web services to be improved while retaining cross-platform interop and support for clients of different capabilities under a single service endpoint.
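
On the wire, the exchange might look like this (illustrative only; the
Accept/Content-Type lines are layered onto the handshake for the sake of
the example, not taken from any draft, and unrelated handshake fields
are omitted):

Client:
    GET /service HTTP/1.1
    Host: example.com
    Upgrade: WebSocket
    Connection: Upgrade
    Accept: application/fastinfoset, application/xml
    Accept-Encoding: gzip

Server:
    HTTP/1.1 101 Web Socket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    Content-Type: application/fastinfoset
    Content-Encoding: gzip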

Does this answer your question or were you asking for something else?

Alexander
Ian Hickson
2010-03-02 10:50:51 UTC
Permalink
Post by Alexander Philippou
Post by Ian Hickson
Post by Alexander Philippou
In Web services, content negotiation (exchanging headers such as Accept
and Content-Type) is used to support multiple SOAP/REST message
encodings and compression algorithms on a single service endpoint so
that responses are encoded depending on what each particular client
supports/prefers.
Web Socket is superior to HTTP for use by Web services. It would be
sufficient to support content negotiation on a per-connection basis
during the initial handshake (instead of on a per-message basis as per
HTTP) for Web Socket to be used by Web services.
Can we use the initial HTTP handshake for content negotiation?
Could you elaborate on what the concrete use case is?
Example. Server supports XML, Fast Infoset and JSON as message
encodings; it also supports GZIP, DEFLATE and LZF compression. Client
connects and specifies in the header that it accepts "application/xml"
and "application/fastinfoset" content types, and "gzip" content
encoding. Server responds that, for this particular channel,
"application/fastinfoset" will be the content type and "gzip" the
content encoding for all messages exchanged by both sides.
Why would we specify more than one compression algorithm?

The message encodings are an application-layer issue (implemented in the
JavaScript on the client), so that seems like something you'd implement in
the subprotocol, not at the WebSocket layer.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Greg Wilkins
2010-03-02 11:49:45 UTC
Permalink
Post by Ian Hickson
Why would we specify more than one compression algorithm?
Because preferred compression algorithms change over time.

gzip was not initially the preferred compression algorithm for HTTP
and may not remain so as content types and encodings used change
over time.
Post by Ian Hickson
The message encodings are an application-layer issue (implemented in the
JavaScript on the client), so that seems like something you'd implement in
the subprotocol, not at the WebSocket layer.
Actually, for this type of per-connection negotiation, so long as
arbitrary headers are allowed in the upgrade request and 101
response (which they are), then I think client and server are
already able to use existing content negotiation (or not) as
they see fit.

Where I think we do need to consider content-type is if
we wish to send polymorphic content, e.g. a stream of images
of various formats. I think that mostly should be handled
by subprotocol, but I do think some base support for metadata
would prevent every subprotocol reinventing that wheel.


cheers
Alexander Philippou
2010-03-02 12:50:59 UTC
Permalink
Post by Ian Hickson
Why would we specify more than one compression algorithm?
They have different performance characteristics. For example:
- LZF: medium compactness, very low processing overhead (also very small code)
- GZIP/DEFLATE: good compactness, medium processing overhead
- LZMA: high compactness, significant processing overhead (asymmetric: more on encoding, less on decoding)
- PPMs: high compactness, significant processing overhead (symmetric: same on encoding/decoding)
Post by Ian Hickson
The message encodings are an application-layer issue (implemented in the
JavaScript on the client), so that seems like something you'd
implement in
the subprotocol, not at the WebSocket layer.
Absolutely agree that the implementation of compression or message encodings should be left to the application layer. Permitting the optional exchange of Accept, Accept-Encoding, Content-Type and Content-Encoding headers during WebSocket's initial HTTP handshake would be sufficient; the rest will be easily implementable by those of us interested in negotiating the content.

Alexander
Ian Hickson
2010-03-02 20:37:12 UTC
Permalink
Post by Alexander Philippou
Post by Ian Hickson
Why would we specify more than one compression algorithm?
- LZF: medium compactness, very low processing overhead (also very small code)
- GZIP/DEFLATE: good compactness, medium processing overhead
- LZMA: high compactness, significant processing overhead (asymmetric: more on encoding, less on decoding)
- PPMs: high compactness, significant processing overhead (symmetric: same on encoding/decoding)
How much of a difference are we talking about? A factor of 2? 10?

If it's less than an order of magnitude, the benefits gained would be
outweighed by the damage of having more than one choice.
Post by Alexander Philippou
Post by Ian Hickson
The message encodings are an application-layer issue (implemented in
the JavaScript on the client), so that seems like something you'd
implement in the subprotocol, not at the WebSocket layer.
Absolutely agree that the implementation of compression or message
encodings should be left to the application layer. Permitting the
optional exchange of Accept, Accept-Encoding, Content-Type and
Content-Encoding headers during WebSocket's initial HTTP handshake would
be sufficient; the rest will be easily implementable by those of us
interested in negotiating the content.
Why can't you do it in the first frame?

Mixing the higher-level protocol negotiation with the security handshake
seems like a bad idea. In particular, part of the security of the
handshake relies on the author having very little control over the ability
to inject content into the handshake.
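
First-frame negotiation might look like this from the client (a sketch;
the JSON message shapes are an ad-hoc subprotocol invented purely for
illustration):

var ws = new WebSocket('ws://example.com/service');
ws.onopen = function() {
  // First frame: advertise capabilities instead of using headers.
  ws.send(JSON.stringify({
    accept: ['application/fastinfoset', 'application/xml'],
    acceptEncoding: ['gzip']
  }));
};
var negotiated = null;
ws.onmessage = function(m) {
  if (!negotiated) {
    // First reply: the server's choice governs all later messages,
    // e.g. { contentType: '...', contentEncoding: 'gzip' }.
    negotiated = JSON.parse(m.data);
    return;
  }
  // ... handle application messages using the negotiated encoding ...
};
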
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Pieter Hintjens
2010-03-02 21:38:26 UTC
Permalink
Post by Ian Hickson
How much of a difference are we talking about? A factor of 2? 10?
If it's less than an order of magnitude, the benefits gained would be
outweighed by the damage of having more than one choice.
Choice = damage? I've rarely heard such a... singular analysis.

I thought choice meant, for example, that if an algorithm had flaws,
it could be replaced. Performance is only one of many aspects. Do
you recall something called GIF?

Not to mention the different qualities of service Alexander explained,
which already should make it clear why choice is valuable.

Even in your strange parallel universe where one size fits all, what
do you actually do if, ten years later, someone finds an algorithm
that is indeed 10x faster? Do you say, "no, it should be 20x"? Or
did you get a message from the future telling you what the compression
algorithm of 2020, or 2060 is going to be?

-Pieter

Roy T. Fielding
2010-03-02 05:01:56 UTC
Permalink
Post by Ian Hickson
Post by Jamie Lokier
Post by Ian Hickson
HTTP is hugely complicated compared to WebSockets. IMHO that's a bug,
not a feature. By making WebSockets trivial to implement, we enable
these "quite creative people" to write their own servers.
However, please take a quick look at HTTP's history, because the same
will probably occur to WebSocket.
We (as the working group) decide whether it happens or not. If we avoid
adding features like content negotiation that are so complicated that
hardly anybody ends up using them, if we avoid adding five ways to encode
frames, if we avoid defining complicated header syntaxes like continuation
lines, if we define error-handling behaviour up front, if we avoid leaving
things as basic as what character encoding to use undefined, if we avoid
making our initial protocol non-extensible, in short, if we learn from the
mistakes that the HTTP working group(s) have made over the years, we can
keep Web Socket simple and it _never_ needs to get complicated.
So yes, I'm quite familiar with HTTP's history.
Apparently not.

HTTP/1.0 had only end-on-close framing before the working group
got involved. Should we have been happy with a framing
that had no distinction between success and error? I added
content-length framing because it was a minimal backwards-compatible
way to indicate early-close, enable safe caching, and introduce
keep-alive connections. However, CL was known at the time to
not be sufficient for dynamic streams, so we also introduced
chunked encoding for 1.1 (when it was first possible to do so),
again in a way that promoted backwards-compatibility.

Those were design choices taken in the context of 1.x compatibility
while HTTPng/MUX was being worked on in parallel. Do you actually
disagree with those decisions?

Content negotiation is so complicated that hardly anybody uses it?
Get a clue. Almost all major corporate websites use content negotiation
for the initial selection of language, and more than half of all
Internet sites use it for error messages. The design came from
CERN libwww, long before the WG was formed, and was intended to
deploy automated user agents in parallel with browsers. The only
applications that don't support it very well are browsers,
because they couldn't figure out a decent config mechanism and
because the early implementations were stupid (see XMosaic).

As an architectural decision, however, I think content negotiation
was a bad trade-off because the shorter latency of a quick
negotiated response was not worth the negative impacts on
latency due to sending too many request headers, the privacy
concerns of same, and effect of variance on caching. That is
why I added other ways of doing the same thing (300 and Alternates),
which failed to be picked up by browsers.

HTTP was defined in the ISO-8859-1 character encoding, since 1992.
The fact that it is no longer a suitable default is not HTTP's
problem (UTF-8 did not exist at the time).

Lack of error-handling requirements is because error-handling is
specific to implementation purpose (not protocol) and HTTP is used
by more bizarrely-unique applications than any other application
protocol (not just browsers). Repeated insistence that such a thing
is an aspect of "good" protocol design does not make it so.

Extensibility (something that HTTP has more of than almost all
other Internet protocols) embodies an inherent trade-off with
multi-organizational trust -- a protocol that can be bent any
which way cannot be trusted as much as a protocol that has
very specific bend points and is amenable to inspection. One
person's extension is another person's denial of service.

But it is largely a waste of time to tell you any of this.
Go read the early www-talk history.
Post by Ian Hickson
I'm also quite confident
that the same fate does not need to befall Web Sockets. We don't need to
put multiplexing in the base protocol. We don't need to put metadata
mechanisms in the base protocol. We can make compression an optional
server-opt-in feature in the second version that has just one algorithm
and that isn't required for simple servers. We can avoid designing
theoretical Architectures like REST and just provide trivial tools on
which people can build their applications to whatever level of complexity
they like.
Yes. You are happily redesigning TCP-for-the-clueless. Big deal.
I still don't understand why you don't just implement TLS directly,
especially if the hostname binding to same-origin can be done
in the initial handshake.

HTTP doesn't do this stuff because it won't SCALE, even with the
vastly superior networks today. It is just a bad idea, in general,
to pretend that N:1 services should involve long-term connections
when N is large. The only way to avoid that "theoretical" problem
is to go connectionless with some magic congestion-free message
passing protocol that flies through NATs. I am pretty sure that
will happen shortly after IPv6 is fully deployed.
Post by Ian Hickson
I also see, on embedded systems lists, people asking where they can get
Post by Jamie Lokier
a small HTTP server to do this, that or the other. It never occurs to
them to just write one (except in one case I can think of, where the
device's badly written HTTP server for diagnostics was the reason it
kept crashing.) Even though it's actually easy to write a conforming
HTTP/0.9 server.
People don't know about 0.9; when they think of HTTP they think of 1.1,
and I think it's quite reasonable to not want to implement an HTTP 1.1
server. If we design Web Sockets right, we won't have that problem.
Post by Jamie Lokier
I think the same will happen with WebSocket: Simple at first, but before
you know it, it'll have evolved - and the web will be full of dirty
workarounds for buggy clients and servers - just like what happened with
HTTP.
The reason it happened with HTTP is the same reason it happened with HTML
and CSS -- HTTP didn't define error handling behaviour, so when a peer did
the wrong thing, there was no rule saying what you had to do in response,
and whatever the dominant server or client behaviour was ended up being
the de-facto standard. If we define error handling rules so that there's
no sequence of bytes for which the behaviour is undefined, we side-step
this problem completely.
HTTP is still trivial. The spec is not, but the spec's condition is
what happens when there are too many editors that haven't written
their own implementations of the protocol. Everyone has an opinion
on HTTP.

If you want to learn from HTTP's mistakes, then do so. Stop parroting
the ignorant assumptions you've heard and actually learn how the Web
works across heterogeneous systems.

....Roy
Scott Ferguson
2010-02-28 17:38:24 UTC
Permalink
Post by Ian Hickson
The point is that all you should need to use Web Sockets is a shell and a
scripting language.
You can't. The fundamental Web Sockets requirements are beyond the
capabilities of a bare scripting language. You cannot use CGI as the Web
Socket plugin API, for example, because it's not powerful enough.
Post by Ian Hickson
HTTP is hugely complicated compared to WebSockets. IMHO that's a bug, not
a feature. By making WebSockets trivial to implement, we enable these
"quite creative people" to write their own servers.
No, WebSockets is fundamentally more complicated than HTTP. See below.
Post by Ian Hickson
For small-scale operations, there's really no need for things to be
particularly complicated. Sure, if you have hundreds, thousands, millions,
or billions of users then it is hard work -- and the protocol's complexity
pales in comparison to the scaling issues. But when there are no scaling
issues, when you have four users _total_, it doesn't have to be hard and
the protocol _can_ be a significant burden.
No, the complexity exists with a single user. It's fundamental to
understanding the Web Socket protocol. This has nothing to do with
scaling and everything to do with implementing Web Sockets for a single
client.

Look at the problems solved by HTTP and Web Sockets abstractly:

HTTP is a trivial, stateless, single-threaded, synchronous protocol. The
abstract model for HTTP looks like:

1. event generated by client (*important* this is the only event in
the entire system), i.e. the GET request
2. server spawns application instance thread #1 (i.e. stateless)
3. single-threaded application processes request, blocking as
necessary (i.e. CGI is trivial single-threaded blocking script)
4. all state closes (Apache modules have even used this to work around
memory leaks.)

Trivial. One event. One thread. You don't need to know or think about
multithreaded programming to solve the problem.

Web Sockets is intrinsically asynchronous, multi-threaded and stateful.
The following is the perspective from the server application (the issues
are symmetrical, though browser implementations do the hard work on the
client end):

1. one of several asynchronous events can occur at any time: a new web
socket event/request, or a server-application event. The server must
handle any of these events at any time. (Compare with the one unique
initiating event in HTTP.)

2. while processing a web socket request (e.g. a slow database call),
the application will probably want to be able to handle other web socket
requests, and send server-initiated requests, rather than blocking while
the slow call completes. (Blocking isn't an issue in HTTP because it's
synchronous.)

3. while processing a server-initiated request (e.g. pushing a large
file), the application will probably want to be able to handle new
web-socket-initiated requests, rather than blocking while the push
completes. (There's no matching complexity in HTTP because HTTP servers
don't initiate requests.)

A CGI script can't begin to solve this problem, because it cannot handle
#1 and specifically cannot handle any request of type #3.

If you don't understand that the complexities of an asynchronous system,
like even the simplest non-degenerate Web Socket server, dwarf the
synchronous parsing of a simple grammar, then you need to spend more
time studying the problem, because these are fundamental concepts.

-- Scott
Ian Hickson
2010-02-28 22:29:19 UTC
Permalink
Post by Scott Ferguson
Post by Ian Hickson
The point is that all you should need to use Web Sockets is a shell and a
scripting language.
You can't.
Given that I've done it, I'd have to say this was inaccurate. With the
current version of the protocol, you can write a server in about 100 lines
of Perl:

http://damowmow.com/playground/demos/websocket/blank-server.pl

This will increase somewhat when we add features to require the server to
prove he read the handshake and if we add a closing handshake, but it'd
still be a far cry from impossible.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Scott Ferguson
2010-03-01 00:47:32 UTC
Permalink
Post by Ian Hickson
Post by Scott Ferguson
Post by Ian Hickson
The point is that all you should need to use Web Sockets is a shell and a
scripting language.
You can't.
Given that I've done it, I'd have to say this was inaccurate. With the
current version of the protocol, you can write a server in about 100 lines
of Perl:
http://damowmow.com/playground/demos/websocket/blank-server.pl
This will increase somewhat when we add features to require the server to
prove he read the handshake and if we add a closing handshake, but it'd
still be a far cry from impossible.
Let's take that example.

1. Any blocking call in application code freezes the entire server.
That's fine for a demo or a toy example, but it's not acceptable for
even the simplest real application. (This was my point #2.)

2. Any large file writes, or blocking by a client freezes the entire
server. (Point #3)

3. There are no server initiated events, which skips the main value of
web sockets (point #1).

Since any actual application is unlikely to restrict itself to those
limitations in capability or quality-of-service, I don't see how this is
an example of an actual server that people will write for themselves.

Err, and you didn't even parse the request.

And you're buffering character by character after your select() until
the buffer is full. Wow.

Ok, that final bit is the whole point I'm making. Even your trivial
example needed to create a state machine to buffer the incoming request
before passing it along to the application to avoid blocking. And, on
the parsing end, you're using Perl libraries to do the actual parsing of
simple things like utf-8. Someone who knows how to parse will write an
equivalent module for you for Web Sockets. So you can assume those
libraries exist instead of tying the spec in knots avoiding it.

Yes, you can build a server around a big state machine, i.e. simulating
multithreading, but you really don't want to do that for anything other
than a trivial demo.

-- Scott
Ian Hickson
2010-03-01 01:18:24 UTC
Permalink
Post by Scott Ferguson
Post by Ian Hickson
http://damowmow.com/playground/demos/websocket/blank-server.pl
1. Any blocking call in application code freezes the entire server.
That's fine for a demo or a toy example, but it's not acceptable for
even the simplest real application. (This was my point #2.)
Not all applications have blocking calls. For example, my tic-tac-toe
server does not:

http://damowmow.com/playground/demos/websocket/tic-tac-toe.pl
http://damowmow.com/playground/demos/websocket/tic-tac-toe.html
Post by Scott Ferguson
2. Any large file writes, or blocking by a client freezes the entire
server. (Point #3)
Not all applications have large file writes. Indeed, few do.
Post by Scott Ferguson
3. There are no server initiated events, which skips the main value of
web sockets (point #1).
That isn't the main value, IMHO.
Post by Scott Ferguson
Since any actual application is unlikely to restrict itself to those
limitations in capability or quality-of-service, I don't see how this is
an example of an actual server that people will write for themselves.
Am I not a person?
Post by Scott Ferguson
Err, and you didn't even parse the request.
Indeed; you don't need to in order to conform to Web Sockets today. (You'll
probably need a minor bit of parsing in the future when I add the
protection against HTTP-to-WebSocket cross-protocol attacks, but that
should only add a few lines.)
Post by Scott Ferguson
And you're buffering character by character after your select() until
the buffer is full. Wow.
Wow?
Post by Scott Ferguson
Ok, that final bit is the whole point I'm making. Even your trivial
example needed to create a state machine to buffer the incoming request
before passing it along to the application to avoid blocking.
Yes?
Post by Scott Ferguson
And, on the parsing end, you're using Perl libraries to do the actual
parsing of simple things like utf-8. Someone who knows how to parse will
write an equivalent module for you for Web Sockets. So you can assume
those libraries exist instead of tying the spec in knots avoiding it.
UTF-8 support is far more likely to be available in any random language or
development environment than WebSocket support for the forseeable future.
Post by Scott Ferguson
Yes, you can build a server around a big state machine, i.e. simulating
multithreading, but you really don't want to do that for anything other
than a trivial demo.
"Trivial demos" as you put it, or small-scale applications, as I would put
it, are the main target audience I'm interested in. There are many
more applications in the long tail than in the head.

I'm perfectly happy for this working group to work on a specification that
targets a different target audience; I'm not trying to force a particular
target audience on the working group. Personally, I'm interested in
working on something that targets the long tail. If this is not something
this working group is interested in, that's fine; I've no interest in
blocking other work.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Jamie Lokier
2010-03-01 03:13:34 UTC
Permalink
Post by Ian Hickson
Post by Scott Ferguson
3. There are no server initiated events, which skips the main value of
web sockets (point #1).
That isn't the main value, IMHO.
If that's not the main value, what is?

For client-initiated sporadic request-response applications,
HTTP is surely more efficient than WebSocket - as packets are the
dominant cost factor in modern networks for sporadic traffic.
So there is no reason to use WebSocket for those.

-- Jamie
Ian Hickson
2010-03-01 03:26:13 UTC
Permalink
Post by Jamie Lokier
Post by Ian Hickson
Post by Scott Ferguson
3. There are no server initiated events, which skips the main value of
web sockets (point #1).
That isn't the main value, IMHO.
If that's not the main value, what is?
The persistent connection. For example, a multiplayer game can have a
single connection for each user, such that whenever a user does anything
it can determine which user sent the message trivially, and it can notify
all the other users. Consider a tic-tac-toe game: there's nothing
happening except when the user sends a message. It's orders of magnitude
easier to implement a tic-tac-toe server with Web Sockets than over HTTP.
The same applies to many other scenarios; chat, any turn-based game, etc.

In any case, it's trivial to add a timeout to a select() call and send
messages in response to something other than an incoming message, so I
really don't think that's a real concern.
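
In an event-driven server the equivalent of a select() timeout is just a
timer; a Node-style sketch (the clients list and encodeTextFrame helper
are assumptions for illustration, not real library calls):

var clients = [];  // sockets whose handshake has completed (assumed)

// Server-initiated traffic, unrelated to any incoming message.
setInterval(function() {
  var frame = encodeTextFrame('tick');  // hypothetical framing helper
  clients.forEach(function(socket) {
    socket.write(frame);
  });
}, 30000);
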
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Jamie Lokier
2010-03-01 03:54:50 UTC
Permalink
Post by Ian Hickson
Post by Jamie Lokier
Post by Ian Hickson
Post by Scott Ferguson
3. There are no server initiated events, which skips the main value of
web sockets (point #1).
That isn't the main value, IMHO.
If that's not the main value, what is?
The persistent connection. For example, a multiplayer game can have a
single connection for each user, such that whenever a user does anything
it can determine which user sent the message trivially, and it can notify
all the other users. Consider a tic-tac-toe game: there's nothing
happening except when the user sends a message.
Eh? I think you're making up reasons now.

HTTP is identical in *all* those respects, except for "it can notify
all the other users". Nothing happens until someone moves, then you
receive a request, with the user trivially determined by query path.

I think you'll find at least 99% of people who've written something
with HTTP would disagree with the sentiment that it's anything but
trivial to track a sequence of requests from the same origin script.

The only actual difference, which is the interesting bit, is the
ability to notify all the other users of the game update. And that is
a server-initiated event, right? We all agree those are good.

(Well, the other difference is the WebSocket version will break on a
home router if nobody moves for 2 minutes due to the NAT timeout.)

-- Jamie
Ian Hickson
2010-03-01 04:58:19 UTC
Permalink
Post by Jamie Lokier
Post by Ian Hickson
The persistent connection. For example, a multiplayer game can have a
single connection for each user, such that whenever a user does
anything it can determine which user sent the message trivially, and
there's nothing happening except when the user sends a message.
Eh? I think you're making up reasons now.
...
Post by Jamie Lokier
HTTP is identical in *all* those respects, except for "it can notify all
the other users". Nothing happens until someone moves, then you receive
a request, with the user trivially determined by query path.
Could you show me how you would implement this using HTTP?:

http://damowmow.com/playground/demos/websocket/tic-tac-toe.pl

An actual running example would be ideal. (That shouldn't be hard: the
above took me -- a tech writer -- only a few hours to write from scratch,
so doing it for HTTP should take you -- an actual engineer -- mere minutes
if HTTP is really no more complicated.)

I cannot see a sane way to do it. If it's as easy to do in HTTP as in Web
Socket, then maybe there isn't a need for this protocol at all. That
certainly would make our lives easier.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Maciej Stachowiak
2010-03-01 07:19:54 UTC
Permalink
Post by Ian Hickson
Post by Jamie Lokier
Post by Ian Hickson
The persistent connection. For example, a multiplayer game can have a
single connection for each user, such that whenever a user does
anything it can determine which user sent the message trivially, and
there's nothing happening except when the user sends a message.
Eh? I think you're making up reasons now.
...
Post by Jamie Lokier
HTTP is identical in *all* those respects, except for "it can notify all
the other users". Nothing happens until someone moves, then you receive
a request, with the user trivially determined by query path.
http://damowmow.com/playground/demos/websocket/tic-tac-toe.pl
An actual running example would be ideal. (That shouldn't be hard: the
above took me -- a tech writer -- only a few hours to write from scratch,
so doing it for HTTP should take you -- an actual engineer -- mere minutes
if HTTP is really no more complicated.)
I cannot see a sane way to do it. If it's as easy to do in HTTP as in Web
Socket, then maybe there isn't a need for this protocol at all. That
certainly would make our lives easier.
You'd have to encode the full game state in a Cookie. That's simple
enough for Tic Tac Toe, because the game state is trivial. For a more
complicated game, or a multiplayer game, not so much. You'd need a
session cookie that indexes to some persistent state on the server
side. Coding that would surely be more complicated and would likely
require some form of framework.

Regards,
Maciej
Pieter Hintjens
2010-03-01 07:35:01 UTC
Permalink
Post by Maciej Stachowiak
You'd need a session cookie that
indexes to some persistent state on the server side. Coding that would
surely be more complicated and would likely require some form of framework.
The lack of state at the server is what lets HTTP scale. Keeping
state in the server via the connection is easier, under two
conditions. One, when you have some kind of a state-machine threaded
connection framework to run with. I've built such frameworks (iMatix
SMT) since 1996. When you have such a beast, making a stateful server
is trivial, and fun. But it is much, much more work than making a
stateless server that has to process cookies.

Second condition, your connection ceiling is modest. A single HTTP
server can handle hundreds of thousands of clients. A stateful
connected server cannot.

I personally prefer by far the connected stateful server, built on a
solid multithreading state machine framework. It produces robust,
fast engines. But it is significantly complex, out of reach for most
engineers. For examples, look at iMatix OpenAMQ.

If Web( )Socket(s) demands a connected stateful server, it puts itself
in the "very hard" category for server implementors. You can write
trivial servers, you can write real servers, but there is nothing in
between. A 100-line demo cannot evolve into a real server. Oh, yes,
you can do things like spawn a process per connection.

Whereas HTTP does actually scale like that. You can actually write a
single threaded HTTP server that is fast and scalable. And again, it
is one or two orders of magnitude easier to add state to HTTP than to
build a real multithreaded connected stateful server, even a trivial
one.

-Pieter
Martin J. Dürst
2010-03-02 08:18:58 UTC
Permalink
Post by Maciej Stachowiak
Post by Ian Hickson
I cannot see a sane way to do it. If it's as easy to do in HTTP as in Web
Socket, then maybe there isn't a need for this protocol at all. That
certainly would make our lives easier.
You'd have to encode the full game state in a Cookie. That's simple
enough for Tic Tac Toe, because the game state is trivial. For a more
complicated game, or a multiplayer game, not so much. You'd need a
session cookie that indexes to some persistent state on the server side.
Coding that would surely be more complicated and would likely require
some form of framework.
Hello Maciej,

Do you want to say that having to use a framework is a problem? In that
case, I'd disagree. (If not, then could you explain what your point
was?) There are dozens of frameworks for HTTP, for all kinds of needs.
I'd definitely prefer using a good framework (let's say Ruby on Rails or
some such) to having to code something like Ian's Tic-Tac-Toe server
(http://damowmow.com/playground/demos/websocket/tic-tac-toe.pl) by hand,
if that were the alternatives I had.

I think I have said so earlier, but I'd also highly prefer to use a
server-side framework for WebSockets rather than having to write code
like the one in the Tic-Tac-Toe examples. Essentially, DRY (don't repeat
yourself) or DRO (don't repeat others).

Regards, Martin.
--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:***@it.aoyama.ac.jp
Ian Hickson
2010-03-02 08:26:41 UTC
Permalink
Post by Martin J. Dürst
I think I have said so earlier, but I'd also highly prefer to use a
server-side framework for WebSockets rather than having to write code
like the one in the Tic-Tac-Toe examples. Essentially, DRY (don't repeat
yourself) or DRO (don't repeat others).
Nobody is saying that using a framework should be disallowed. Personally,
I'm interested in designing a protocol so simple that a framework is
unnecessary, but that doesn't mean that there should be no frameworks.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Greg Wilkins
2010-03-02 09:00:39 UTC
Permalink
Post by Ian Hickson
Post by Martin J. Dürst
I think I have said so earlier, but I'd also highly prefer to use a
server-side framework for WebSockets rather than having to write code
like the one in the Tic-Tac-Toe examples. Essentially, DRY (don't repeat
yourself) or DRO (don't repeat others).
Nobody is saying that using a framework should be disallowed. Personally,
I'm interested in designing a protocol so simple that a framework is
unnecessary, but that doesn't mean that there should be no frameworks.
Ian,

the features needed by non-trivial web applications
are going to have to be implemented somewhere: either
in the base protocol, in a framework or in the
application.

To advocate an ultra simple protocol and no frameworks
is to advocate that all features must be implemented
in the application. If you want no frameworks,
then surely you should be advocating a rich feature
set for the base protocol?

Or are you saying that the features commonly used by web
applications (eg threading, dispatch, authentication,
authorization, encoding, internationalization, logging,
content type handling, error handling, etc.) are
just not needed?


As I've said before, I think your ideas on web
application architecture are not exactly
mainstream, and I fear that you are attempting to use
the websocket protocol to drive an agenda
for change.

Protocols and standards should be following the
existent use-cases and web application architectures.
They should not be developed in anticipation of
speculative brave new worlds of web architecture.

If you have some architectural ideas that you wish
to promote, then I suggest trying them out on some
real world applications somewhat more challenging
than noughts and crosses for 2 users. If the
ideas have merit, then the approach will take
off and standards and protocols will be developed
to follow.



regards
Greg Wilkins
2010-03-01 07:59:23 UTC
Permalink
Post by Ian Hickson
Post by Jamie Lokier
Post by Ian Hickson
The persistent connection. For example, a multiplayer game can have a
single connection for each user, such that whenever a user does
anything it can determine which user sent the message trivially, and
there's nothing happening except when the user sends a message.
Eh? I think you're making up reasons now.
HTTP is identical in *all* those respects, except for "it can notify all
the other users". Nothing happens until someone moves, then you receive
a request, with the user trivially determined by query path.
http://damowmow.com/playground/demos/websocket/tic-tac-toe.pl
Sorry, I don't have a tic-tac-toe example, but if you
care to look at http://live.chess.com/ there are currently
2834 users online playing 785 games. This is a reasonable
load - all over HTTP.

So it's possible with HTTP - but I'm not going to say it
was painless or simple.

However, websocket would not have helped one bit with
that site! Most of the work in a site like chess.com
is not about how to make it work well, but about how to
make it fail well.

So handling orderly close is needed, plus you have
to deal with slow clients that block writes to them
and handle that in a way that does not block all other
users in same game/chat.

The one pain point websocket might help with for this
site is to avoid silly nginx turning HTTP/1.1 into
HTTP/1.0 and stressing the network stack with too many
connections. But even this indicates that it is
relatively easy to stress a TCP/IP stack even with
a modest site like this - so don't tell me that
connections are cheap and free and there is no need
to manage them as a resource.


The messaging layer of this site (cometd) is currently
being upgraded to use websockets and I expect that
chess.com might be an early adopter of the work.
But using websocket has not made cometd any simpler,
nor allowed us to remove any timeouts or messages.
We still need all the same mechanisms for tracking
sessions, handling timeouts and failures. In actual
fact, because there is no request/response paradigm,
we have extra work to do!

Websocket is not some kind of silver bullet that is
going to make creating such sites any easier.

It might take some load off the stressed firewall,
but it will be almost entirely transparent to the
application developers.

It is just not going to be the enabler that allows
such sites to be easily created by semi-technical
developers. It solves few of the real problems that you
need to deal with. But that does not mean that it
is not a good step, and perhaps there is still
time to add some support for features like orderly
close?



cheers
Mridul Muralidharan
2010-03-01 14:54:42 UTC
Permalink
I hope you are not serious with this reasoning!
Server-side notifications would be the compelling use case for websockets - everything else is already handled by existing protocols.

Regards,
Mridul



----- Original Message ----
Sent: Mon, 1 March, 2010 8:56:13 AM
Subject: Re: [hybi] Various comments on recent threads
Post by Jamie Lokier
Post by Ian Hickson
Post by Scott Ferguson
3. There are no server-initiated events, which skips the main value of
web sockets (point #1).
That isn't the main value, IMHO.
If that's not the main value, what is?
The persistent connection. For example, a multiplayer game can have a
single connection for each user, such that whenever a user does anything
it can determine which user sent the message trivially, and it can notify
all the other users. Consider a tic-tac-toe game: there's nothing
happening except when the user sends a message. It's orders of magnitude
easier to implement a tic-tac-toe server with Web Sockets than over HTTP.
The same applies to many other scenarios; chat, any turn-based game, etc.
In any case, it's trivial to add a timeout to a select() call and send
messages in response to something other than an incoming message, so I
really don't think that's a real concern.
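
For instance, a minimal sketch of that pattern in Python (the listening
socket, the one-second tick, and the 0x00/0xFF framing bytes are
illustrative assumptions, not anything mandated by the draft):

    import select
    import socket
    import time

    def serve(listener: socket.socket, tick: float = 1.0) -> None:
        # React both to incoming messages and to the mere passage of time.
        clients: list[socket.socket] = []
        next_tick = time.monotonic() + tick
        while True:
            timeout = max(0.0, next_tick - time.monotonic())
            readable, _, _ = select.select([listener] + clients, [], [], timeout)
            for sock in readable:
                if sock is listener:
                    conn, _ = listener.accept()
                    clients.append(conn)
                else:
                    data = sock.recv(4096)
                    if not data:
                        sock.close()
                        clients.remove(sock)  # peer closed
                    # ...otherwise dispatch the incoming message...
            if time.monotonic() >= next_tick:
                next_tick += tick
                for c in clients:
                    c.sendall(b"\x00tick\xff")  # server-initiated message
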
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Martin J. Dürst
2010-03-02 08:54:27 UTC
Permalink
Post by Ian Hickson
"Trivial demos" as you put it, or small-scale applications, as I would put
it, are the main target audience I'm interested in. There are many
more applications in the long tail than in the head.
I'm perfectly happy for this working group to work on a specification that
targets a different target audience; I'm not trying to force a particular
target audience on the working group. Personally, I'm interested in
working on something that targets the long tail. If this is not something
this working group is interested in, that's fine; I've no interest in
blocking other work.
Not every application in the long tail is a 'trivial' or 'small-scale'
(in terms of complexity) application. Indeed many large-scale (in terms
of users) applications are in some ways simpler because scalability
issues restrict their complexity (at least for some axes of complexity).

Also, some people write small-scale apps and know that they'll never
have more than a few users. Others start with something small-scale but
really wouldn't mind becoming the next Twitter or Facebook or Google or
whatnot (and even if they don't get THAT far, they may soon be less
'small-scale' than they imagined in the first place). I'm sure these
people would prefer something that at least didn't come with a label of
"caution, not designed to scale". (We can't claim something scales until
it's actually deployed and does scale.)

I don't think this WG should target "small-scale" or "large-scale"
audiences specifically. Technology is successful if it works on a large
range of scales. That's why it's good to have people with all kinds of
interests in this WG.

Regards, Martin.
--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:***@it.aoyama.ac.jp
Maciej Stachowiak
2010-02-28 18:36:59 UTC
Permalink
Post by Greg Wilkins
Post by Jamie Lokier
1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.
This is essentially meaningless. Can you ever imagine a requirements
document saying: servers must be hard to write and the specification
hard to understand? One could argue it is easier to reuse an existing
HTTP stack than to reinvent connection/thread/parsing code standalone.
It's entirely possible to create a version of this protocol that is
*not* friendly to the standalone use case. If we designed the protocol
solely to work well as part of an HTTP server, it's actually pretty
likely that we'd make it needlessly complicated for authors of
standalone servers. In particular, your suggestion that "it is easier
to reuse an existing HTTP stack" is one that I think standalone server
authors would probably not agree with.
Post by Greg Wilkins
There is a specific requirement we are trying to capture. Some on
this list want the handshake to be fully legal HTTP, and in response
to that others are concerned that we might required full HTTP
compliance.
I think what we are trying to capture is the requirement that the parts
of HTTP that are used are compliant, but that does not mean we require
all of HTTP. If that is what we want, then we should say that
explicitly rather than imply it with our own definitions of "easy".
I don't think that's all of what we are trying to capture. We'd like
it to be reasonable to implement a standalone server, regardless of
whether the difficulties are imposed by HTTP-related processing or
something else. For example, if the message framing was unduly
elaborate, then I'd say we are failing the requirements for standalone
servers.
Post by Greg Wilkins
Post by Jamie Lokier
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compatible during the handshake, and easy to write
using standard HTTP components to implement that handshake -
while being fully compatible with standalone WebSocket clients
and servers which do not implement full HTTP.
I think this is a better approach and most of the clauses
already proposed. But it suffers from the problems
that Maciej raises of speaking about servers and implementations.
The protocol MUST allow HTTP and websocket connections to
be served from the same port. When operating on the same port
as HTTP, the protocol MUST be HTTP compatible until both
ends have established the websocket protocol.
The protocol MUST make it possible and practical to establish
websockets connections without requiring a fully conforming HTTP
implementation at either end of the connection.
The protocol MUST make it possible and practical to reuse
existing HTTP components where appropriate.
I think this captures the combo server requirements reasonably well,
but is not so good at capturing the standalone server requirements. I
think we should listen to those who have implemented or plan to
implement a standalone server with regards to those.

Regards,
Maciej
Tim Bray
2010-02-28 19:18:33 UTC
Permalink
I was thinking that I should put together a standalone client & server in
Ruby trying to use as much of Net::HTTP and Rack respectively as possible,
to explore how much re-use is possible.
Two questions:
- has anyone done this already?
- Is this so far outside the design space as to be silly?
It'd only be a few hours work I think.
- Tim
Post by Greg Wilkins
Basically I see ...
It's entirely possible to create a version of this protocol that is *not*
friendly to the standalone use case. If we designed the protocol solely to
work well as part of an HTTP server, it's actually pretty likely that we'd
make it needlessly complicated for authors of standalone servers. In
particular, your suggestion that "it is easier to reuse an existing HTTP
stack" is one that I think standalone server authors would probably not
agree with.
Post by Greg Wilkins
There is a specific requirement we are trying to capture. Some on
this list want the handshake...
I don't think that's all of what we are trying to capture. We'd like it to
be reasonable to implement a standalone server, regardless of whether the
difficulties are imposed by HTTP-related processing or something else. For
example, if the message framing was unduly elaborate, then I'd say we are
failing the requirements for standalone servers.
Post by Greg Wilkins
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compat...
I think this captures the combo server requirements reasonably well, but is
not so good at capturing the standalone server requirements. I think we
should listen to those who have implemented or plan to implement a
standalone server with regards to those.

Regards,
Maciej
Maciej Stachowiak
2010-02-28 19:34:37 UTC
Permalink
Post by Tim Bray
I was thinking that I should put together a standalone client &
server in Ruby trying to use as much of Net::HTTP and Rack
respectively as possible, to explore how much re-use is possible.
- has anyone done this already?
- Is this so far outside the design space as to be silly?
It'd only be a few hours work I think.
Server: Definitely not outside of the design space. Interesting
questions:
- What things are unusually hard to implement?
- Does it seem harder or simpler to make a standalone implementation
this way, compared to coding directly against the wire protocol?

Client: Not sure if any practical reuse of HTTP is possible, but it
would be interesting to find out.


Regards,
Maciej
Post by Tim Bray
- Tim
Post by Maciej Stachowiak
Post by Greg Wilkins
Basically I see ...
It's entirely possible to create a version of this protocol that is
*not* friendly to the standalone use case. If we designed the
protocol solely to work well as part of an HTTP server, it's
actually pretty likely that we'd make it needlessly complicated for
authors of standalone servers. In particular, your suggestion that
"it is easier to reuse an existing HTTP stack" is one that I think
standalone server authors would probably not agree with.
Post by Greg Wilkins
There is a specific requirement we are trying to capture. Some on
this list want the handshake...
I don't think that's all of what we are trying to capture. We'd
like it to be reasonable to implement a standalone server,
regardless of whether the difficulties are imposed by HTTP-related
processing or something else. For example, if the message framing
was unduly elaborate, then I'd say we are failing the requirements
for standalone servers.
Post by Greg Wilkins
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compat...
I think this captures the combo server requirements reasonably
well, but is not so good at capturing the standalone server
requirements. I think we should listen to those who have
implemented or plan to implement a standalone server with regards
to those.
Regards,
Maciej
Greg Wilkins
2010-02-28 20:06:02 UTC
Permalink
Post by Maciej Stachowiak
Client: Not sure if any practical reuse of HTTP is possible, but it
would be interesting to find out.
Surely there are parts of an HTTP client that would be good to
reuse:

+ URL handling
+ Cookie handling
+ authentication
+ Origin handling
+ generating upgrade request
+ parsing 101 response.
+ connection handling
+ IO layer

We've not yet implemented a WSClient in Java, but when we do,
my expectation is that we will reuse much of our HttpClient.
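
For example - a rough sketch only, in Python for brevity (stdlib
http.client; host, path and origin are hypothetical placeholders, and
the header set follows the current draft) - the handshake can ride on
a stock HTTP client before the code drops down to the raw socket:

    import http.client

    def ws_connect(host: str, path: str, origin: str):
        # Reuse the stock HTTP client for the Upgrade request/response...
        conn = http.client.HTTPConnection(host, 80)
        conn.putrequest("GET", path, skip_host=True, skip_accept_encoding=True)
        conn.putheader("Host", host)
        conn.putheader("Upgrade", "WebSocket")
        conn.putheader("Connection", "Upgrade")
        conn.putheader("Origin", origin)
        conn.endheaders()
        resp = conn.getresponse()
        if resp.status != 101:
            raise OSError(f"handshake refused: {resp.status} {resp.reason}")
        # ...then take over the raw socket for framed websocket traffic.
        return conn.sock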

cheers
Maciej Stachowiak
2010-02-28 20:57:36 UTC
Permalink
Post by Greg Wilkins
Post by Maciej Stachowiak
Client: Not sure if any practical reuse of HTTP is possible, but it
would be interesting to find out.
Surely there are parts of an HTTP client that would be good to reuse:
+ URL handling
+ Cookie handling
+ authentication
+ Origin handling
+ generating upgrade request
+ parsing 101 response.
+ connection handling
+ IO layer
We've not yet implemented a WSClient in java, but when we do,
my expectation is that we will reuse much of our HttpClient.
I just sent a walkthrough of WebKit's WebSocket implementation. We
don't reuse most of those parts of the HTTP stack. We do reuse the
storage for cookies and http auth credentials, but not parsing or
generation of the relevant headers. We don't reuse most of the other
things you mentioned at all.

Reusing connection handling in particular is not possible because we
don't want to go into the per-host pool of persistent connections, and
because we need to make sure per the spec that there is only one
pending connection at a time. Most other things I covered in my
overview of how our client code is put together. Note: somewhat to my
surprise, very little of our client code is specific to being hosted in
a browser.

Regards,
Maciej
Ian Hickson
2010-02-28 22:37:07 UTC
Permalink
Post by Greg Wilkins
Surely there are parts of an HTTP client that would be good to reuse:
+ URL handling
There's no need for URL handling in a standalone (non-browser) client,
actually. (In a browser, the URL handling isn't part of the HTTP stack.)
Post by Greg Wilkins
+ Cookie handling
+ authentication
These are only required if the subprotocol uses it for authentication, in
which case they can actually more or less be hard-coded. For example,
there's no Set-Cookie support in Web Socket, and (for now at least) no
part of HTTP authentication is reused.
Post by Greg Wilkins
+ Origin handling
There's really nothing to this in a non-browser client except outputting
essentially a hard-coded string.
Post by Greg Wilkins
+ generating upgrade request
This is just a hard-coded string.
Post by Greg Wilkins
+ parsing 101 response.
This is a trivial operation that the Web Socket spec defines in detail;
it wouldn't be correct to reuse an HTTP stack to do this.
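
For reference, the hard-coded exchange being discussed (as in the
current draft; path, host and origin here are placeholders) is on the
order of:

    GET /demo HTTP/1.1
    Upgrade: WebSocket
    Connection: Upgrade
    Host: example.com
    Origin: http://example.com

    HTTP/1.1 101 Web Socket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    WebSocket-Origin: http://example.com
    WebSocket-Location: ws://example.com/demo
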
Post by Greg Wilkins
+ connection handling
+ IO layer
An HTTP stack's connection and IO layers are actually somewhat unlikely to
be appropriate for Web Socket, since HTTP is not full-duplex.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Justin Erenkrantz
2010-02-28 23:07:51 UTC
Permalink
Post by Ian Hickson
An HTTP stack's connection and IO layers are actually somewhat unlikely to
be appropriate for Web Socket, since HTTP is not full-duplex.
Serf was designed to be async, so it is pretty trivial to handle
full-duplex communication. I expect the more legacy client frameworks
(WebKit, etc.) may have serious issues in an async world. But, I
believe it is a fallacy to declare that all implementations of WS will
not share any code with HTTP frameworks. It depends upon the
frameworks - I realize that your personal hobby horses may be stuck in
a model where full-duplex I/O isn't possible, but mine are all
designed for high-volume async I/O.

As I have pointed out repeatedly, the current drafts assume
synchronous network I/O rather than async. So, most of the prose is
useless to me - which is why I strongly believe that the drafts should
switch away from the current text to help implementers start from
scratch rather than pigeonholing them into synchronous I/O models. --
justin
Greg Wilkins
2010-03-01 08:09:40 UTC
Permalink
Post by Ian Hickson
An HTTP stack's connection and IO layers are actually somewhat unlikely to
be appropriate for Web Socket, since HTTP is not full-duplex.
HTTP at the IO level is full duplex - or at least frequently implemented
that way. E.g. 102 Processing responses can be sent while the body of
a request is being received.

Anyway, this is a pretty meaningless debate. Sure some clients
are not going to reuse HTTP client code. But some are.

In Jetty, our HTTP client and HTTP server share about 75% of
their code base. Websockets are now supported as part of
the server's ability to handle Upgrade as per RFC2616, and the
resulting endpoint reuses all the async IO layer of Jetty.

When we do our websocket client, I expect that it will
take a similar approach - because while a good websocket
connection will start with an Upgrade request and 101
response, I fully expect that we will need to handle
"bad" responses like 403's and 500's. I don't want
to have to rewrite that handling.

So the point is - some want to reuse code and some
don't. We need a spec that makes either approach
possible and practical.



cheers
Salvatore Loreto
2010-03-01 17:58:06 UTC
Permalink
Hi,

I want to take the chance to remind everyone that HyBi has a goal to
submit a document, as a working group item, describing the Design Space
characterization.

at the BoF we discussed this:
http://www.ietf.org/id/draft-loreto-design-space-bidirectional-00.txt

so any initiative to investigate the Design Space (and also to take on
the draft editorship) is very welcome.

cheers
Sal
Post by Maciej Stachowiak
Post by Tim Bray
I was thinking that I should put together a standalone client &
server in Ruby trying to use as much of Net::HTTP and Rack
respectively as possible, to explore how much re-use is possible.
- has anyone done this already?
- Is this so far outside the design space as to be silly?
It'd only be a few hours work I think.
- What things are unusually hard to implement?
- Does it seem harder or simpler to make a standalone implementation
this way, compared to coding directly against the wire protocol?
Client: Not sure if any practical reuse of HTTP is possible, but it
would be interesting to find out.
Regards,
Maciej
Post by Tim Bray
- Tim
Post by Maciej Stachowiak
Post by Greg Wilkins
Basically I see ...
It's entirely possible to create a version of this protocol that is
*not* friendly to the standalone use case. If we designed the
protocol solely to work well as part of an HTTP server, it's
actually pretty likely that we'd make it needlessly complicated for
authors of standalone servers. In particular, your suggestion that
"it is easier to reuse an existing HTTP stack" is one that I think
standalone server authors would probably not agree with.
Post by Greg Wilkins
There is a specific requirement we are trying to capture. Some on
this list want the handshake...
I don't think that's all of what we are trying to capture. We'd like
it to be reasonable to implement a standalone server, regardless of
whether the difficulties are imposed by HTTP-related processing or
something else. For example, if the message framing was unduly
elaborate, then I'd say we are failing the requirements for
standalone servers.
Post by Greg Wilkins
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compat...
I think this captures the combo server requirements reasonably well,
but is not so good at capturing the standalone server requirements.
I think we should listen to those who have implemented or plan to
implement a standalone server with regards to those.
Regards,
Maciej
Salvatore Loreto
2010-03-01 17:36:30 UTC
Permalink
Hi,

thanks for the discussion, feedback and comments;
it seems to me that there is acceptable consensus on the following
requirements


Req. X The WebSocket protocol MUST allow HTTP and WebSocket
connections to be served from the same port.
Consideration MUST be given
* to providing WebSocket services via modules that plug in
to existing web infrastructure.
* to making standalone implementations of the protocol
possible and practical, without requiring a fully
conforming HTTP implementation.

Reason: Some server developers would like to integrate WebSocket support
into existing HTTP servers.
In addition, the default HTTP and HTTPS ports are often
favoured for traffic that has to go through a firewall,
so service providers will likely want to be able to use
WebSocket over ports 80 and 443, even when running a Web server
on the same host. However, there could be scenarios where
it is not opportune or possible to set up a proxy on the same
HTTP server.




Req. Y When sharing host and "well known" port with HTTP, the
WebSocket protocol MUST be HTTP compatible until both ends have
established the WebSocket protocol.

Reason: when operating on the standard HTTP ports, existing web
infrastructure may handle the traffic according to existing standards prior to
the establishment of the new protocol.



Req. Z The protocol SHOULD make it possible and practical to reuse
existing HTTP components where appropriate.

Reason: the reuse of existing well-debugged software decreases the
number of implementation errors as well as the possibility of
introducing security holes, and at the same time speeds up
development, especially when the Web Socket server is
implemented as a module that plugs in to existing popular
Web servers.



any objection to having them inserted in the draft instead of

REQ. 6: The Web Socket Protocol MUST be designed in such a way that
its servers can share a port with HTTP servers, as by default the
Web Socket Protocol uses port 80 for regular Web Socket
connections and port 443 for Web Socket connection
tunnelled over TLS.



cheers
Sal
Roberto Peon
2010-03-01 17:56:33 UTC
Permalink
I think that the proposed reqs X, Y, Z are much clearer than the current Req 6.
No objection here.
-=R

On Mon, Mar 1, 2010 at 9:36 AM, Salvatore Loreto <
Post by Salvatore Loreto
Hi,
thanks for the discussion, feedback and comments
it seems to me that there is an acceptable consensus on the following
requirements
Req. X The WebSocket protocol MUST allow HTTP and WebSocket connections
to be served from the same port.
Consideration MUST be given
* to providing WebSocket services via modules that plug in to
existing web infrastructure.
* to making standalone implementations of the protocol possible
and practical, without requiring a fully conforming HTTP
implementation.
Reason: Some server developers would like to integrate WebSocket support
into existing HTTP servers.
In addition, the default HTTP and HTTPS ports are often
favoured for traffic that has to go through a firewall,
so service providers will likely want to be able to use
WebSocket over ports 80 and 443, even when running a Web server
on the same host. However, there could be scenarios where it is
not opportune or possible to set up a proxy on the same
HTTP server.
Req. Y When sharing host and "well known" port with HTTP, the WebSocket
protocol MUST be HTTP compatible until both ends have
established the WebSocket protocol.
Reason: when operating on the standard HTTP ports, existing web
infrastructure may handle the traffic according to existing standards prior to
the establishment of the new protocol.
Req. Z The protocol SHOULD make it possible and practical to reuse
existing HTTP components where appropriate.
Reason: the reuse of existing well-debugged software decreases the
number of implementation errors as well as the possibility of
introducing security holes, and at the same time speeds up
development, especially when the Web Socket server
is implemented as a module that plugs in to existing popular Web
servers.
any objection to have them inserted in the draft instead of
REQ. 6: The Web Socket Protocol MUST be designed in such a way that
its servers can share a port with HTTP servers, as by default the
Web Socket Protocol uses port 80 for regular Web Socket
connections and port 443 for Web Socket connection
tunnelled over TLS.
cheers
Sal
Justin Erenkrantz
2010-03-01 18:46:50 UTC
Permalink
Post by Roberto Peon
I think that Propsed reqs X,Y,Z are much clearer than the current Req 6.
No objection here.
Ditto... -- justin
Maciej Stachowiak
2010-02-28 00:14:48 UTC
Permalink
Post by Salvatore Loreto
Maciej
The WebSocket protocol MUST be designed so that it is
possible and practical to establish both HTTP and Websocket
connections to the same host and port.
When operating on standard HTTP ports, an implementation
of the protocol MUST initially handle a new connection as an
HTTP connection.
Consideration MUST be given to making it possible and
practical to implement standalone implementations of the
protocol without requiring a fully conforming HTTP
implementation. An implementation of the protocol SHALL NOT
be required to implement any part of the HTTP protocol that
is not necessary for the establishment of a websocket connection.
Consideration MUST be given to provide WebSocket services via
modules that plug in to existing web infrastructure.
The second one is still written as a requirement on implementations,
not on the spec. (I also don't understand the rationale for the second
one - what does it give us that the first one doesn't, other than
assuming a solution?)

The second sentence of the third one is also apparently written as a
requirement on implementations.

Try expressing these as "the protocol MUST" or "the specification
MUST" rather than "an implementation MUST" or "an implementation SHALL
NOT".

Regards,
Maciej
Greg Wilkins
2010-02-28 05:52:58 UTC
Permalink
The second one is still written as a requirement on implementations, not
on the spec. (I also don't understand the rationale for the second one -
what does it give us that the first one doesn't, other than assuming a
solution?)
It was trying to express that it is HTTP mechanisms that must be
used to select whether a connection is HTTP or websocket. But I think
Jamie's words better capture this requirement (and the first and last
requirement as well). See my response to his post.
The second sentence of the third one is also apparently written as a
requirement on implementations.
Attempted fix in response to Jamie

cheers
Justin Erenkrantz
2010-02-28 04:00:38 UTC
Permalink
Post by Greg Wilkins
Specifically "share a port" is not very well defined and could be
interpreted to mean being able to send HTTP and hybi traffic
at the same time to the same port.
While I think there is more-or-less implicit consensus that we don't
want that behavior, it may be good to explicitly codify this in a
requirements document.

Yet, on the client-side, I could see this being an easy trap to fall
into depending upon how you architect your connection pools to the
server. It should be clear that there is a one-way upgrade here, but
I do sort of wonder if there's a useful reduction of network
overhead/latency if there were a way to shove an HTTP/1.1-formatted
request into an already-upgraded WS connection. Yes, this might be a
sub-protocol on top of WS; but I could see some real-world benefits if
we *did* allow sharing of HTTP and hybi traffic over the same
port...so maybe I'm not quite so sure I'd dismiss it entirely out of
hand... -- justin
Jamie Lokier
2010-02-28 13:56:23 UTC
Permalink
Post by Justin Erenkrantz
Yet, on the client-side, I could see this being an easy trap to fall
into depending upon how you architect your connection pools to the
server. It should be clear that there is a one-way upgrade here, but
I do sort of wonder if there's a useful reduction of network
overhead/latency if there were a way to shove an HTTP/1.1-formatted
request into an already-upgraded WS connection. Yes, this might be a
sub-protocol on top of WS; but I could see some real-world benefits if
we *did* allow sharing of HTTP and hybi traffic over the same
port...so maybe I'm not quite so sure I'd dismiss it entirely out of
hand... -- justin
+1 from me. This is quite close to what would be achieved with BWTP
and/or SPDY over WebSocket.

It's also potentially a route to "real" HTTP pipelining that works.
Oh dear, I think it's getting a bit far out of the scope of the WG :-)
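
Concretely, the idea amounts to something like this sketch (Python;
assuming the draft's 0x00/0xFF text framing, with the tunnelled request
and the sub-protocol that would answer it both entirely hypothetical):

    import socket

    def send_http_over_ws(ws: socket.socket, request: bytes) -> None:
        # Wrap one complete HTTP/1.1 request in one websocket text frame;
        # a sub-protocol at the far end would unwrap it, run it through an
        # HTTP stack, and send the response back the same way.
        ws.sendall(b"\x00" + request + b"\xff")

    # Given an already-upgraded connection `ws`:
    # send_http_over_ws(ws, b"GET /api/status HTTP/1.1\r\nHost: example.com\r\n\r\n")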

-- Jamie
Jamie Lokier
2010-02-28 13:50:40 UTC
Permalink
Post by Maciej Stachowiak
Post by Jamie Lokier
1. Standalone WebSocket servers must be easy to write and the
specification easy to understand.
(1a. Same for standalone WebSocket clients?)
2. Servers which serve HTTP and WebSocket on the same port must be
fully HTTP compatible during the handshake, and easy to write
using standard HTTP components to implement that handshake -
while being fully compatible with standalone WebSocket clients
and servers which do not implement full HTTP.
This lines up much more with how I see the requirements than Greg's
wording.
(For clients implemented inside the browser, I wouldn't worry about
it; it's pretty arbitrary whether they are considered "standalone" or also
HTTP clients, and port sharing is not an important consideration. If
you want any client implementability requirements, I would ask for
"must be feasible to implement in a Web browser". I don't know what
kinds of non-browser clients people want so I don't know the
requirements there.)
Think of all the things people do with non-browser HTTP clients today.
Here is a very small subset:

- Facebook posting/reading apps.
- Twitter posting/reading apps.
- Fetch and show Webmail availability.
- Google Maps mashup apps.
- SOAP / XML-RPC callers.
- Financial quote trackers.
- News ticker tapes.
- Software package updaters.

All but the last will have no choice but to become standalone
WebSocket clients to the extent that the services they depend on are
only exported over WebSocket in future.

We have recently seen an example on this list of someone using
WebSocket for some kind of web searching/indexing and getting 30 times
speedup compared with XHR. (Of course they could just use old-fashioned
sockets and get 50 times speedup! It's an idiotic world.)

If that's the sort of thing which excites people, expect a lot more
services which are currently using XHR to be available as proprietary
protocols over WebSocket in a year or two.

I think that standalone WebSocket clients (number of distinct
programs) are going to be more prevalent than standalone WebSocket
servers (number of distinct internet-facing parts, not the WSCGI nor
whatever backends). We should definitely have it as a consideration,
even if it's only "let's think about whether this is relevant".

-- Jamie