Discussion:
Collaboration on standard Wayland protocol extensions
Drew DeVault
2016-03-27 20:34:37 UTC
Permalink
Greetings! I am the maintainer of the Sway Wayland compositor.

http://swaywm.org

It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!

I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
use-cases:

- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration

I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.

How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.

--
Drew DeVault
Martin Peres
2016-03-27 20:50:48 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
--
Drew DeVault
We had discussions about it years ago and here are the results of them:
http://mupuf.org/blog/2014/02/19/wayland-compositors-why-and-how-to-handle/
http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/

And here is the software we created, under the name "Wayland Security
Modules":
http://www.x.org/wiki/Events/XDC2014/XDC2014DodierPeresSecurity/xorg-talk.pdf
https://github.com/mupuf/libwsm

This approach has generally be liked by KDE, but not by Gnome who, last
i heard, did not care about cross-platform apps doing privileged
operations. This may have changed since they also decided to work on
sandboxing (xdg-app) and implemented something like the following
approach when they said they would never do because it changed the API:
http://mupuf.org/blog/2014/05/14/introducing-sandbox-utils-0-6-1/

I really wish we can have everyone onboard on one solution to get these
cross-platform apps and so far, I do not see any better solution than WSM.

Martin
Drew DeVault
2016-03-27 21:00:47 UTC
Permalink
Thanks for the links! I'll read through them. I figured that a
discussion like this had happened in the past around how to give clients
privledge, but I couldn't find anything that would allow them to
actually do the thing they were given permission to. We should flesh out
both parts of this model. I read over libwsm and it seems like a fairly
sane approach. I'd like to read the arguments for/against it.
This approach has generally be liked by KDE, but not by Gnome who, last i
heard, did not care about cross-platform apps doing privileged operations.
This may have changed since they also decided to work on sandboxing
(xdg-app) and implemented something like the following approach when they
http://mupuf.org/blog/2014/05/14/introducing-sandbox-utils-0-6-1/
I would hope that our friends at Gnome aren't planning on implementing
software to cover every screen capturing use case in mutter! I'd like to
find a way to use OBS (https://obsproject.com/) from Wayland, for
example.
I really wish we can have everyone onboard on one solution to get these
cross-platform apps and so far, I do not see any better solution than WSM.
Well, I'm definitely on board. Sway is clearly a smaller project than
Gnome or KDE and I would rather not build the "Sway Desktop
Environment". I think we can arrive at some solutions that are in line
with the Unix way AND meet the goals of the big DEs.

--
Drew DeVault
Jasper St. Pierre
2016-03-27 23:41:43 UTC
Permalink
You're probably referring to my response when you say "GNOME does not
care about cross-platform apps doing privileged operations". My
response wasn't meant to be speaking on behalf of GNOME. These are my
opinions and mine alone.

My opinion is still as follows: having seen how SELinux and PAM work
out in practice, I'm skeptical of any "Security Module" which
implements policy. The "module" part of it rarely happens, since
people simply gravitate towards a standard policy. What's interesting
to me isn't a piece of code that allows or rejects operations, it's
the resulting UI *around* those operations and managing them, since
that's really, at the end of the day, all the user cares about.

It would be a significant failure to me if we didn't have a standard
way for a user to examine or recall the policy of an application,
using whatever API they wanted. If every module implements its own
policy store separately, such a UI would be extremely difficult to
build.

From what I read, Wayland Security Modules didn't seem to even provide
that as a baseline, which is why I believe they're tackling the
problem from the wrong angle.
Post by Martin Peres
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
--
Drew DeVault
http://mupuf.org/blog/2014/02/19/wayland-compositors-why-and-how-to-handle/
http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/
And here is the software we created, under the name "Wayland Security
http://www.x.org/wiki/Events/XDC2014/XDC2014DodierPeresSecurity/xorg-talk.pdf
https://github.com/mupuf/libwsm
This approach has generally be liked by KDE, but not by Gnome who, last i
heard, did not care about cross-platform apps doing privileged operations.
This may have changed since they also decided to work on sandboxing
(xdg-app) and implemented something like the following approach when they
http://mupuf.org/blog/2014/05/14/introducing-sandbox-utils-0-6-1/
I really wish we can have everyone onboard on one solution to get these
cross-platform apps and so far, I do not see any better solution than WSM.
Martin
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
--
Jasper
Drew DeVault
2016-03-28 02:33:52 UTC
Permalink
Post by Jasper St. Pierre
My opinion is still as follows: having seen how SELinux and PAM work
out in practice, I'm skeptical of any "Security Module" which
implements policy. The "module" part of it rarely happens, since
people simply gravitate towards a standard policy. What's interesting
to me isn't a piece of code that allows or rejects operations, it's
the resulting UI *around* those operations and managing them, since
that's really, at the end of the day, all the user cares about.
It has been done successfully, though. Consider the experience for iOS
and Android permissions. When an application needs to do something
sensitive, a simple dialog pops up explaining what it's asking for, and
allowing the user to consent once or forever. It's pretty simple and I
think we can accomplish something similar.
Post by Jasper St. Pierre
It would be a significant failure to me if we didn't have a standard
way for a user to examine or recall the policy of an application,
using whatever API they wanted. If every module implements its own
policy store separately, such a UI would be extremely difficult to
build.
Ah, but here we are, all talking about it together. Let's make a
solution that works for all of us, then.
Post by Jasper St. Pierre
From what I read, Wayland Security Modules didn't seem to even provide
that as a baseline, which is why I believe they're tackling the
problem from the wrong angle.
What are your specific concerns with it? I would tend to agree. I think
that it's not bad as an implementation of this mechanic, but I agree
that it's approaching the problem wrong. I think it would be wiser to
start with how clients ask the compositor for permissions and how they
receive them, then leave the details libwsm implements up to the
compositors.

I think a protocol extension would work just fine to implement a
permission requesting/granting dialogue between clients and compositors.

--
Drew DeVault
Jasper St. Pierre
2016-03-28 05:21:52 UTC
Permalink
Post by Drew DeVault
What are your specific concerns with it? I would tend to agree. I think
that it's not bad as an implementation of this mechanic, but I agree
that it's approaching the problem wrong. I think it would be wiser to
start with how clients ask the compositor for permissions and how they
receive them, then leave the details libwsm implements up to the
compositors.
I think a protocol extension would work just fine to implement a
permission requesting/granting dialogue between clients and compositors.
That's what we should be doing, and that's why I'm not a huge fan of
WSM -- it provides a solution for the stuff that doesn't matter, and
doesn't make any progress on the part we need to tackle. I won't enjoy
using libwsm because it adds complexity and error cases (e.g. what
happens with no modules, like on a misconfigured system?), without
solving the actual problem.

Also, as I've mentioned in my emails before, APIs aren't exclusively
used through Wayland, they might also be on other systems like DBus,
which already has its own confusing policy system. It gets even worse
when protocols might cross both systems. So libwsm is already far in
the negative points bucket to me -- a Wayland-protocol centric
solution that ignores other IPCs and APIs, is configurable for no
purpose as far as I can tell, and still doesn't have an approachable
story about how it provides more security to the user.

I would rather the effort be spent making secure interfaces, exactly
as you've described.
Post by Drew DeVault
--
Drew DeVault
--
Jasper
Drew DeVault
2016-03-28 13:03:55 UTC
Permalink
Post by Jasper St. Pierre
I would rather the effort be spent making secure interfaces, exactly
as you've described.
Agreed. I think it should be pretty straightforward:

Client->Server: What features do you support?
Server->Client: These privledged features are available.
Client->Server: I want this feature (nonblocking)
[compositor prompts user to agree]
Server->Client: Yes/no
[compositor enables the use of those protocols for this client]

I can start to write up some XML to describe this formally. We can take
some inspiration from the pointer-constraints protocol and I'll also
rewrite that protocol with this model in mind. Does anyone see anything
missing from this exchange?

--
Drew DeVault
Martin Peres
2016-03-28 19:50:39 UTC
Permalink
Post by Drew DeVault
Post by Jasper St. Pierre
I would rather the effort be spent making secure interfaces, exactly
as you've described.
Client->Server: What features do you support?
Server->Client: These privledged features are available.
Client->Server: I want this feature (nonblocking)
[compositor prompts user to agree]
Server->Client: Yes/no
[compositor enables the use of those protocols for this client]
That looks like the bind operation to me. Why do you need a
new protocol?
Post by Drew DeVault
I can start to write up some XML to describe this formally. We can take
some inspiration from the pointer-constraints protocol and I'll also
rewrite that protocol with this model in mind. Does anyone see anything
missing from this exchange?
So, you are OK with being asked *every time* if you accept that VLC
is trying to go fullscreen? I most definitely am not :D

This is why we wanted to let distro devs decide for their users what
the default policy should be. We then need to have a good UI for users
to realize that the application is running in fullscreen mode (following
what chrome and firefox are doing is likely a good idea).

However, Jasper has a point that we need to be sure that we can
override the policy in a consistent way across all backends. I have a
plan for this.
Drew DeVault
2016-03-28 20:47:58 UTC
Permalink
Post by Martin Peres
Post by Drew DeVault
Client->Server: What features do you support?
Server->Client: These privledged features are available.
Client->Server: I want this feature (nonblocking)
[compositor prompts user to agree]
Server->Client: Yes/no
[compositor enables the use of those protocols for this client]
That looks like the bind operation to me. Why do you need a
new protocol?
How do you propose clients would communicate their intentions to
compositors? I may be misunderstanding.
Post by Martin Peres
So, you are OK with being asked *every time* if you accept that VLC
is trying to go fullscreen? I most definitely am not :D
No, the compositor can remember your choice.
Post by Martin Peres
(following what chrome and firefox are doing is likely a good idea).
I agree, permission management on Firefox is good.

--
Drew DeVault
Martin Peres
2016-03-28 23:15:00 UTC
Permalink
Post by Drew DeVault
Post by Martin Peres
Post by Drew DeVault
Client->Server: What features do you support?
Server->Client: These privledged features are available.
Client->Server: I want this feature (nonblocking)
[compositor prompts user to agree]
Server->Client: Yes/no
[compositor enables the use of those protocols for this client]
That looks like the bind operation to me. Why do you need a
new protocol?
How do you propose clients would communicate their intentions to
compositors? I may be misunderstanding.
I was proposing for applications to just bind the interface and see if
it works or not. But Giulio's proposal makes sense because it could be
used to both grant and revoke rights on the fly.
Post by Drew DeVault
Post by Martin Peres
So, you are OK with being asked *every time* if you accept that VLC
is trying to go fullscreen? I most definitely am not :D
No, the compositor can remember your choice.
So, you would be asking users to click on "I agree" and "remember my
choice" every time they set up a computer then :D Not nearly as bad, but
still bad.
Post by Drew DeVault
Post by Martin Peres
(following what chrome and firefox are doing is likely a good idea).
I agree, permission management on Firefox is good.
Right, we should really aim for something similar, UX-wise.
Drew DeVault
2016-03-29 03:23:31 UTC
Permalink
I was proposing for applications to just bind the interface and see if it
works or not. But Giulio's proposal makes sense because it could be used to
both grant and revoke rights on the fly.
I think both solutions have similar merit and I don't feel strongly
about either one.
Post by Drew DeVault
No, the compositor can remember your choice.
So, you would be asking users to click on "I agree" and "remember my choice"
every time they set up a computer then :D Not nearly as bad, but still bad.
Well, I imagine we'd store their choices in a file somewhere. They can
bring that file along for the ride.

--
Drew DeVault
Giulio Camuffo
2016-03-29 05:25:19 UTC
Permalink
Post by Drew DeVault
I was proposing for applications to just bind the interface and see if it
works or not. But Giulio's proposal makes sense because it could be used to
both grant and revoke rights on the fly.
I think both solutions have similar merit and I don't feel strongly
about either one.
If the client just binds the interface the compositor needs to
immediately create the resource and send the protocol error, if the
client is not authorized. It doesn't have the time to ask the user for
input on the matter, while my proposal gives the compositor that.
Post by Drew DeVault
Post by Drew DeVault
No, the compositor can remember your choice.
So, you would be asking users to click on "I agree" and "remember my choice"
every time they set up a computer then :D Not nearly as bad, but still bad.
Well, I imagine we'd store their choices in a file somewhere. They can
bring that file along for the ride.
--
Drew DeVault
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Pekka Paalanen
2016-03-29 09:17:46 UTC
Permalink
On Tue, 29 Mar 2016 08:25:19 +0300
Post by Giulio Camuffo
Post by Drew DeVault
I was proposing for applications to just bind the interface and see if it
works or not. But Giulio's proposal makes sense because it could be used to
both grant and revoke rights on the fly.
I think both solutions have similar merit and I don't feel strongly
about either one.
If the client just binds the interface the compositor needs to
immediately create the resource and send the protocol error, if the
client is not authorized. It doesn't have the time to ask the user for
input on the matter, while my proposal gives the compositor that.
More precisely, you cannot gracefully fail to use an interface exposed
via wl_registry. It either works, or the client gets disconnected.
Protocol error always means disconnection, and wl_registry has no other
choice to communicate a "no, you can't use this".

Checking "whether an interface works or not" is also not trivial. It
would likely lead to adding a "yes, this works" event to all such
interfaces, since anything less explicit is harder than necessary. But
why do that separately in every interface rather than in a common
interface?

Btw. I did say in the past that I didn't quite understand or like
Giulio's proposal, but I have come around since. For the above reasons,
it does make sense on a high level.


Thanks,
pq
Drew DeVault
2016-03-29 11:42:57 UTC
Permalink
Post by Giulio Camuffo
If the client just binds the interface the compositor needs to
immediately create the resource and send the protocol error, if the
client is not authorized. It doesn't have the time to ask the user for
input on the matter, while my proposal gives the compositor that.
Understood. I'm on board.
Martin Peres
2016-03-28 19:50:23 UTC
Permalink
Post by Jasper St. Pierre
You're probably referring to my response when you say "GNOME does not
care about cross-platform apps doing privileged operations". My
response wasn't meant to be speaking on behalf of GNOME. These are my
opinions and mine alone.
I must have mis-remembered then. Sorry.
Post by Jasper St. Pierre
My opinion is still as follows: having seen how SELinux and PAM work
out in practice, I'm skeptical of any "Security Module" which
implements policy. The "module" part of it rarely happens, since
people simply gravitate towards a standard policy. What's interesting
to me isn't a piece of code that allows or rejects operations, it's
the resulting UI *around* those operations and managing them, since
that's really, at the end of the day, all the user cares about.
The UI is definitely the most important part of this work. I think
we gave already many ideas for the UI part in [1].

As much as possible, we want to avoid having the traditional
ACL-style because not only is it not easy to discover and tweak
but it is also going in the way of users. Instead, we would like
to be as unintrusive as possible.

We thus wanted to let distros take care of most of the policies (which
does not amount to much and will likely come with the application
anyway). However, some distros or devices come with a system
that already defines security policies and they will likely not want
a proliferation of storage places. Hence why we allowed for
multiple backends. But this is an exception rather than the rule.

What we envisioned was that when an app is using a privileged
interface, there would be a new icon in the notification area telling
which app is using what priviledged interface.

Also, when right clicking on a running window, there would be a
"capabilities" entry which, when clicked, would show on top of the
application what are the current capabilities allowed and what is
not. There, the user would be able to change the policy and have
multiple options for each capability:
- Default: (soft/hard allow/deny)
- One time granting (until the app is closed)
- One time revoke (revoke the rights immediatly)
Post by Jasper St. Pierre
It would be a significant failure to me if we didn't have a standard
way for a user to examine or recall the policy of an application,
using whatever API they wanted. If every module implements its own
policy store separately, such a UI would be extremely difficult to
build.
Yes, you are most definitely right. We only provided a default policy,
but we also need a way to get feedback from the user about his/her
preferences.

One thing that is going to get problematic is what happens when
one updates a piece of software and the policy does not match
anymore. Since it is a pretty coarse grained policy, it should not
be an issue!

In any case, the user preference could be stored and managed by
libwsm and modules would only be called if no user preference is
found.
Post by Jasper St. Pierre
From what I read, Wayland Security Modules didn't seem to even provide
that as a baseline, which is why I believe they're tackling the
problem from the wrong angle.
We attacked the problem from a UX point of view and then a distro
point of view. The custom configuration was a bit left on the side
since experimented users could write their own file and more novice
users would likely be more than OK with a runtime GUI to manage
the rights and allow granting privileges when needed.

We most definitely do not wish to ever have something as annoying
as the UAC in Linux, constantly asking the user for permissions on
things the user has no understanding about.

I really think that the 4 levels proposed by libwsm make a lot of sense
to reduce annoyance as much as possible [2]

[1] http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/
[2] https://github.com/mupuf/libwsm
Martin Peres
2016-03-30 06:28:03 UTC
Permalink
Post by Martin Peres
We thus wanted to let distros take care of most of the policies (which
does not amount to much and will likely come with the application
anyway). However, some distros or devices come with a system
that already defines security policies and they will likely not want
a proliferation of storage places. Hence why we allowed for
multiple backends. But this is an exception rather than the rule.
Why should every distribution decide on some policy? The default way
should work sanely and the way that a user would experience it makes
sense. I help out with Mageia (+GNOME), I'm 98% sure Mageia has 0
interest in creating/developing such a policy.
In WSM, you can set default behaviours for interfaces. This should cover
your use case.

However, remember this: If it is not the user or the distribution, then
you are basically trusting the developer of the application... which
basically means we are back to the security of X11.
e.g. Linus complaining about (IIRC) needing to provide a root password
after plugging in a printer. If we create such a situation again I might
even understand why he's rants :-P
This would be utterly ridiculous, and this is what we addressed here:
http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/
Jasper St. Pierre
2016-03-30 06:33:03 UTC
Permalink
I really hope that distributions don't see security policies as a
differentiator. This is how we got SELinux vs. AppArmor and real-world
apps having to ship both kinds of policies (or Fedora flat out
ignoring any idea of third-parties and such and including literally
every application ever in its contrib policy file
https://github.com/fedora-selinux/selinux-policy/tree/f23-contrib).
Post by Martin Peres
Post by Martin Peres
We thus wanted to let distros take care of most of the policies (which
does not amount to much and will likely come with the application
anyway). However, some distros or devices come with a system
that already defines security policies and they will likely not want
a proliferation of storage places. Hence why we allowed for
multiple backends. But this is an exception rather than the rule.
Why should every distribution decide on some policy? The default way
should work sanely and the way that a user would experience it makes
sense. I help out with Mageia (+GNOME), I'm 98% sure Mageia has 0
interest in creating/developing such a policy.
In WSM, you can set default behaviours for interfaces. This should cover
your use case.
However, remember this: If it is not the user or the distribution, then you
are basically trusting the developer of the application... which basically
means we are back to the security of X11.
e.g. Linus complaining about (IIRC) needing to provide a root password
after plugging in a printer. If we create such a situation again I might
even understand why he's rants :-P
http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/
--
Jasper
Martin Peres
2016-03-30 06:35:49 UTC
Permalink
Post by Jasper St. Pierre
I really hope that distributions don't see security policies as a
differentiator. This is how we got SELinux vs. AppArmor and real-world
apps having to ship both kinds of policies (or Fedora flat out
ignoring any idea of third-parties and such and including literally
every application ever in its contrib policy file
https://github.com/fedora-selinux/selinux-policy/tree/f23-contrib).
I would also *hate* that. I hate distros that do not ship software as
vanilla as possible :s

However, what may save us here is that the policy is very high-level and
it should be quite hard to differentiate!
Carsten Haitzler (The Rasterman)
2016-03-27 23:55:33 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
i can tell you that screen capture is a security sensitive thing and likely
won't get a regular wayland protocol. it definitely won't from e. if you can
capture screen, you can screenscrape. some untrusted game you downloaded for
free can start watching your internet banking and see how much money you have
in which accounts where...

the simple solution is to build it into the wm/desktop itself as an explicit
user action (keypress, menu option etc.) and now it can't be exploited as it's
not pro grammatically available. :)

i would imagine the desktops themselves would in the end provide video capture
like they would stills.

of course you have the more nasty variety of screencapture which is "screen
sharing" where you don't want to just store to a file but broadcast live. and
this then even gets worse - you would want to be able to inject events -
control the mouse, keyboard etc. from an app. this is a nasty slippery slope at
least i don't want to walk down any time soon. this is a bit of a pandoras box
of security holes to open up.
Post by Drew DeVault
- Output configuration
why? currently pretty much every desktop provides its OWN output configuration
tool that is part of the desktop environment. why do you want to re-invent
randr here allowing any client to mess with screen config. after YEARS of games
using xvidtune and what not to mess up screen setups this would be a horrible
idea. if you want to make a presentation tool that uses 1 screen for output and
another for "controls" then that's a matter of providing info that multiple
displays exist and what type they may be (internal, external) and clients can
tag surfaces with "intents" eg - this iss a control surface, this is an
output/display surface. compositor will then assign them appropriately.

same for games. same for media usage. etc. - there is little to no need for
clients to go messing with screen setup. this is a desktop/compositor task that
will be handled by that DE as it sees fit (some may implement a wl protocol but
only on a specific FD - maybe a socketpair to a forked child) or something dbus
or some private protocol or maybe even build it directly in to the compositor.
the same technique can be used to allow extended protocol for specific clients
too (socketpair etc.) but just don't expose at all what is not needed.
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
that seems sensible and over time i can imagine this will expand.
Post by Drew DeVault
- Input device configuration
as above. i see no reason clients should be doing this. surface
intents/roles/whatever can deal with this. compositor may alter how an input
device works for that surface based on this.
Post by Drew DeVault
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
as above. anything apps have no business messing with i have no interest in
having any protocol for. input device config, screen setup config etc. etc. for
sure. screen capture is a nasty one and for now - no. no access. for the common
case the DE can do it. for screen sharing kind of things... you also need input
control (take over mouse and be able to control from app - or create a 2nd
mouse pointer and control that... keyboard - same, etc. etc. etc.). this is a
nasty little thing and in implementing something like this you are also forcing
compositors to work ion specific ways - eg screen capture will likely FORCE the
compositor to merge it all into a single ARGB buffer for you rather than just
assign it to hw layers. or perhaps it would require just exposing all the
layers, their config and have the client "deal with it" ? but that means the
compositor needs to expose its screen layout. do you include pointer or not?
compositor may draw ptr into the framebuffer. it may use a special hw layer.
what about if the compositor defers rendering - does a screen capture api force
the compositor to render when the client wants? this can have all kinds of
nasty effects in the rendering pipeline - for use our rendering pipeline iss
not in the compositor but via the same libraries clients use so altering this
pipeline affects regular apps as well as compositor. ... can of worms :)
Post by Drew DeVault
--
Drew DeVault
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) ***@rasterman.com
Drew DeVault
2016-03-28 02:29:57 UTC
Permalink
Post by Carsten Haitzler (The Rasterman)
i can tell you that screen capture is a security sensitive thing and likely
won't get a regular wayland protocol. it definitely won't from e. if you can
capture screen, you can screenscrape. some untrusted game you downloaded for
free can start watching your internet banking and see how much money you have
in which accounts where...
Right, but there are legitimate use cases for this feature as well. It's
also true that if you have access to /dev/sda you can read all of the
user's files, but we still have tools like mkfs. We just put them behind
some extra security, i.e. you have to be root to use mkfs.
Post by Carsten Haitzler (The Rasterman)
the simple solution is to build it into the wm/desktop itself as an explicit
user action (keypress, menu option etc.) and now it can't be exploited as it's
not pro grammatically available. :)
i would imagine the desktops themselves would in the end provide video capture
like they would stills.
I'd argue that this solution is far from simple. Instead, it moves *all*
of the responsibilities of your entire desktop into one place, and one
codebase. And consider the staggering amount of work that went into
making ffmpeg, which has well over 4x the git commits as enlightenment.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- Output configuration
why? currently pretty much every desktop provides its OWN output configuration
tool that is part of the desktop environment. why do you want to re-invent
randr here allowing any client to mess with screen config. after YEARS of games
using xvidtune and what not to mess up screen setups this would be a horrible
idea. if you want to make a presentation tool that uses 1 screen for output and
another for "controls" then that's a matter of providing info that multiple
displays exist and what type they may be (internal, external) and clients can
tag surfaces with "intents" eg - this iss a control surface, this is an
output/display surface. compositor will then assign them appropriately.
There's more than desktop environments alone out there. Not everyone
wants to go entirely GTK or Qt or EFL. I bet everyone on this ML has
software on their computer that uses something other than the toolkit of
their choice. Some people like piecing their system together and keeping
things lightweight, and choosing the best tool for the job. Some people
might want to use the KDE screengrab tool on e, or perhaps some other
tool that's more focused on doing just that job and doing it well. Or
perhaps there's existing tools like ImageMagick that are already written
into scripts and provide a TON of options to the user, which could be
much more easily patched with support for some standard screengrab
protocol than to implement all of its features in 5 different desktops.

We all have to implement output configuration, so why not do it the same
way and share our API? I don't think we need to let any client
manipulate the output configuration. We need to implement a security
model for this like all other elevated permissions.

Using some kind of intents system to communicate things like Impress
wanting to use one output for presentation and another for notes is
going to get out of hand quickly. There are just so many different
"intents" that are solved by just letting applications configure outputs
when it makes sense for them to. The code to handle this in the
compositor is going to become an incredibly complicated mess that rivals
even xorg in complexity. We need to avoid making the same mistakes
again. If we don't focus on making it simple, then in 15 years we're
going to be writing a new protocol and making a new set of mistakes. X
does a lot of things wrong, but the tools around it have a respect for
the Unix philosophy that we'd be wise to consider.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
that seems sensible and over time i can imagine this will expand.
Cool. Suggestions for what sort of capability thiis protocol should
have, what kind of surface roles we will be looking at? We should
consider a few things. Normal windows, of course, which on compositors
like Sway would be tiled. Then there's floating windows, like
gnome-calculator, that are better off being tiled. Modals would be
something that pops up and prevents the parent window from being
interacted with, like some sort of alert (though preventing this
interactivity might not be the compositor's job). Then we have some
roles like dmenu would use, where the tool would like to arrange itself
(perhaps this would demand another permission?) Surfaces that want to be
fullscreen could be another. We should also consider additional settings
a surface might want, like negotiating for who draws the decorations or
whether or not it should appear in a taskbar sort of interface.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- Input device configuration
as above. i see no reason clients should be doing this. surface
intents/roles/whatever can deal with this. compositor may alter how an input
device works for that surface based on this.
I don't feel very strongly about input device configuration as a
protocol here, but it's something that many of Sway's users are asking
for. People are trying out various compositors and may switch back and
forth depending on their needs and they want to configure all of their
input devices the same way.

However, beyond detailed input device configuration, there are some
other things that we should consider. Some applications (games, vnc,
etc) will want to capture the mouse and there should be a protocol for
them to indicate this with (perhaps again associated with special
permissions). Some applications (like Krita) may want to do things like
take control of your entire drawing tablet.
Post by Carsten Haitzler (The Rasterman)
[snip] screen capture is a nasty one and for now - no. no access [snip]
Wayland has been in the making for 4 years. Fedora is thinking about
shipping it by default. We need to quit with this "not for now" stuff
and start thinking about legitimate use-cases that we're killing off
here. The problems are not insurmountable and they are going to kill
Wayland adoption. We should not force Wayland upon our users, we should
make it something that they *want* to switch to. I personally have
gathered a lot of interest in Sway and Wayland in general by
livestreaming development of it from time to time, which has led to more
contributors getting in on the code and more people advocating for us to
get Wayland out there.
Post by Carsten Haitzler (The Rasterman)
for the common case the DE can do it. for screen sharing kind of
things... you also need input control (take over mouse and be able to
control from app - or create a 2nd mouse pointer and control that...
keyboard - same, etc. etc. etc.). [snip]
Screen sharing for VOIP applications is only one of many, many use-cases
for being able to get the pixels from your screen. VNC servers,
recording video to provide better bug reports or to demonstrate
something, and so on. We aren't opening pandora's box here, just
supporting video capture doens't mean you need to support all of these
complicated and dangerous things as well.
Post by Carsten Haitzler (The Rasterman)
nasty little thing and in implementing something like this you are also forcing
compositors to work ion specific ways - eg screen capture will likely FORCE the
compositor to merge it all into a single ARGB buffer for you rather than just
assign it to hw layers. or perhaps it would require just exposing all the
layers, their config and have the client "deal with it" ? but that means the
compositor needs to expose its screen layout. do you include pointer or not?
compositor may draw ptr into the framebuffer. it may use a special hw layer.
what about if the compositor defers rendering - does a screen capture api force
the compositor to render when the client wants? this can have all kinds of
nasty effects in the rendering pipeline - for use our rendering pipeline iss
not in the compositor but via the same libraries clients use so altering this
pipeline affects regular apps as well as compositor. ... can of worms :)
All of this would still be a problem if you want to support video
capture at all. You have to get the pixels into your encoder somehow.
There might be performance costs, but we aren't recording video all the
time.

We can make Wayland support use-cases that are important to our users or
we can watch them stay on xorg perpetually and end up maintaining two
graphical stacks forever.

--
Drew DeVault
Carsten Haitzler (The Rasterman)
2016-03-28 05:13:21 UTC
Permalink
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
i can tell you that screen capture is a security sensitive thing and likely
won't get a regular wayland protocol. it definitely won't from e. if you can
capture screen, you can screenscrape. some untrusted game you downloaded for
free can start watching your internet banking and see how much money you
have in which accounts where...
Right, but there are legitimate use cases for this feature as well. It's
also true that if you have access to /dev/sda you can read all of the
user's files, but we still have tools like mkfs. We just put them behind
some extra security, i.e. you have to be root to use mkfs.
yes but you need permission and that is handled at kernel level on a specific
file. not so here. compositor runs as a specific user and so you cant do that.
you'd have to do in-compositor security client-by-client.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
the simple solution is to build it into the wm/desktop itself as an explicit
user action (keypress, menu option etc.) and now it can't be exploited as
it's not pro grammatically available. :)
i would imagine the desktops themselves would in the end provide video
capture like they would stills.
I'd argue that this solution is far from simple. Instead, it moves *all*
of the responsibilities of your entire desktop into one place, and one
codebase. And consider the staggering amount of work that went into
making ffmpeg, which has well over 4x the git commits as enlightenment.
you wouldn't recreate ffmpeg. ffmpec produce libraries like avcodec. like a
reasonable developer we'd just use their libraries to do the encoding - we'd
capture frames and then hand off to avcodec (ffmpeg) library routines to do the
rest. ffmpeg doesnt need to know how to capture - just to do what 99% of its
code is devoted to doing - encode/decode. :) that's rather simple. already we
have decoding wrapped - we sit on top of either gstreamer, vlc or xine as the
codec engine and just glue in output and control api's and events. encoding is
just the same but in reverse. :) the encapsulation is simple.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- Output configuration
why? currently pretty much every desktop provides its OWN output
configuration tool that is part of the desktop environment. why do you want
to re-invent randr here allowing any client to mess with screen config.
after YEARS of games using xvidtune and what not to mess up screen setups
this would be a horrible idea. if you want to make a presentation tool that
uses 1 screen for output and another for "controls" then that's a matter of
providing info that multiple displays exist and what type they may be
(internal, external) and clients can tag surfaces with "intents" eg - this
iss a control surface, this is an output/display surface. compositor will
then assign them appropriately.
There's more than desktop environments alone out there. Not everyone
wants to go entirely GTK or Qt or EFL. I bet everyone on this ML has
software on their computer that uses something other than the toolkit of
their choice. Some people like piecing their system together and keeping
things lightweight, and choosing the best tool for the job. Some people
might want to use the KDE screengrab tool on e, or perhaps some other
tool that's more focused on doing just that job and doing it well. Or
perhaps there's existing tools like ImageMagick that are already written
into scripts and provide a TON of options to the user, which could be
much more easily patched with support for some standard screengrab
protocol than to implement all of its features in 5 different desktops.
the expectation is there won't be generic tools but desktop specific ones. the
CURRENT ecosystem of tools exist because that is the way x was designed to
work. thus the srate of software matches its design. wayland is different. thus
tools and ecosystem will adapt.

as for output config - why would the desktops that already have their own tools
then want to support OTHER tools too? their tools integrate with their settings
panels and look and feel right and support THEIR policies.

let me give you an example:

Loading Image...

bottom-right - i can assign special scale factors and different toolkit
profiles per screen. eg one screen can be a desktop, one a media center style,
one a mobile "touch centric ui" etc. etc. - this is part of the screen setup
tool. a generic tool will miss features that make the desktop nice and
functional for its purposes. do you want to go create some kind of uber
protocol that every de has to support with every other de's feature set in it
and limit de's to modifying the protocol because they now have to go through a
shared protocol in libwayland that they cant just add features to as they
please? ok - so these features will be added adhoc in extra protocols so now
you have a bit of a messy protocol with 1 protocol referring to another... and
the "kde tool" messes up on e or the e tool messes up in gnome because all
these extra features are either not even supported by the tool or existing
features don't work because the de doesn't support those extensions?

just "i want to use the kde screen config tool" is not reason enough for there
to be a public/shared/common protocol. it will fall apart quickly like above
and simply mean work for most people to go support it rather than actual value.
Post by Drew DeVault
We all have to implement output configuration, so why not do it the same
way and share our API? I don't think we need to let any client
no - we don't have to implement it as a protocol. enlightenment needs zero
protocol. it's done by the compositor. the compositors own tool is simply a
settings dialog inside the compositor itself. no protocol. not even a tool.
it's the same as edit/tools -> preferences in most gui apps. its just a dialog
the app shows to configure itself.

chances are gnome likely will do this via dbus (they love dbus :)). kde - i
don't know. but not everyone is implementing a wayland protocol at all so
assuming they are and saying "do it the same way" is not necessarily saving any
work.
Post by Drew DeVault
manipulate the output configuration. We need to implement a security
model for this like all other elevated permissions.
like above. if gnome uses dbus - they will use polkit etc. etc. to decide that.
enlightenment doesn't even need to because there isn't even a protocol nor an
external tool - it's built directly in.
Post by Drew DeVault
Using some kind of intents system to communicate things like Impress
wanting to use one output for presentation and another for notes is
going to get out of hand quickly. There are just so many different
"intents" that are solved by just letting applications configure outputs
even impress doesnt configure outputs. thank god for that.
Post by Drew DeVault
when it makes sense for them to. The code to handle this in the
compositor is going to become an incredibly complicated mess that rivals
even xorg in complexity. We need to avoid making the same mistakes
again. If we don't focus on making it simple, then in 15 years we're
going to be writing a new protocol and making a new set of mistakes. X
does a lot of things wrong, but the tools around it have a respect for
the Unix philosophy that we'd be wise to consider.
how would it be complex. a compositor is already, if decent, going to handle
multiple outputs. it's either going to auto-configure new ones to extend/clone
or maybe pop up a settings dialog. e already does this for example and
remembers config for that screen (edid+output) so plug it in a 2nd time and it
automatically uses the last stored config for that. so the screen will "work"
as basicalyl a biu product of making a compositor that can do multiple outputs.

then intents are only a way of deciding where a surface is to be displayed -
rather than on the current desktop/screen.

so simply mark a surface as "for presentation" and the compositor will put it
on the non-internal display (chosen maybe by physical size reported in edid as
the larger one, or by elimination - its on the screen OTHER than the
internal... maybe user simply marks/checkboxes that screen as "use this
screen for presenting" and all apps that want so present get their content
there etc.)

so what you are saying is it's better to duplicate all this logic of screen
configuration inside every app that wants to present things (media players -
play movie on presentation screen, ppt/impress/whatever show presentation there,
etc. etc.) and how to configure the screen etc. etc., rather than have a simple
tag/intent and let your de/wm/compositor "deal with it" universally for all
such apps in a consistent way?
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
that seems sensible and over time i can imagine this will expand.
Cool. Suggestions for what sort of capability thiis protocol should
have, what kind of surface roles we will be looking at? We should
consider a few things. Normal windows, of course, which on compositors
like Sway would be tiled. Then there's floating windows, like
ummm whats the difference between floating and normal? apps like gnome
calculator just open ... normal windows.
Post by Drew DeVault
gnome-calculator, that are better off being tiled. Modals would be
something that pops up and prevents the parent window from being
interacted with, like some sort of alert (though preventing this
interactivity might not be the compositor's job). Then we have some
yeah - good old "transient for" :)
Post by Drew DeVault
roles like dmenu would use, where the tool would like to arrange itself
(perhaps this would demand another permission?) Surfaces that want to be
fullscreen could be another. We should also consider additional settings
a surface might want, like negotiating for who draws the decorations or
whether or not it should appear in a taskbar sort of interface.
xdg shell should be handling these already - except dmenu. dmenu is almost a
special desktop component. like a shelf/panel/bar thing.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- Input device configuration
as above. i see no reason clients should be doing this. surface
intents/roles/whatever can deal with this. compositor may alter how an input
device works for that surface based on this.
I don't feel very strongly about input device configuration as a
protocol here, but it's something that many of Sway's users are asking
for. People are trying out various compositors and may switch back and
forth depending on their needs and they want to configure all of their
input devices the same way.
they are going to have to deal with this then. already gnome and kde and e will
all configure mouse accel/left/right mouse on their own based on settings. yes
- i can RUN xset and set it back later but its FIGHTING with your DE. waqyland
is the same. use the desktop tools for this :) yes - it'll change between
compositors. :) at least in wayland you cant fight with the compositor here.
for sway - you are going ot have to write this yourself. eg - write tools that
talk to sway or sway reads a cfg file you edit or whatever. :)
Post by Drew DeVault
However, beyond detailed input device configuration, there are some
other things that we should consider. Some applications (games, vnc,
etc) will want to capture the mouse and there should be a protocol for
them to indicate this with (perhaps again associated with special
permissions). Some applications (like Krita) may want to do things like
take control of your entire drawing tablet.
as i said. can of worms. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
[snip] screen capture is a nasty one and for now - no. no access [snip]
Wayland has been in the making for 4 years. Fedora is thinking about
shipping it by default. We need to quit with this "not for now" stuff
and start thinking about legitimate use-cases that we're killing off
here. The problems are not insurmountable and they are going to kill
Wayland adoption. We should not force Wayland upon our users, we should
make it something that they *want* to switch to. I personally have
gathered a lot of interest in Sway and Wayland in general by
livestreaming development of it from time to time, which has led to more
contributors getting in on the code and more people advocating for us to
get Wayland out there.
you have no idea how many non-security-sensitive things need fixing first
before addressing the can-of-worms problems. hell nvidia just released drivers
that requrie compositors to re-do how they talk to egl/kms/drm to work that's
not compatible with existing drm dmabuf buffers etc. etc.

there's lots of things to solve like window "intents/tags/etc." that are not
security sensitive.

even clients and decorations. tiled wm's will not want clients to add
decorations with shadows etc. - currently clients will do csd because csd is
what weston chose and gnome has followed and enlightenment too. kde do not want
to do csd. i think that's wrong. it adds complexity to wayland just to "not
follow the convention". but for tiling i see the point of at least removing the
shadows. clients may choose to slap a title bar there still because it's useful
displaying state. but advertising this info from the compositor is not
standardized. what do you advertise to clients? where/when? at connect time? at
surface creation time? what negotiation is it? it easily could be that 1
screen or desktop is tiled and another is not and you dont know what to tell
the client until it has created a surface and you know where that surface would
go. perhaps this might be part of a larger set of negotiation like "i am a
mobile app so please stick me on the mobile screen" or "i'm a desktop app -
desktop please" then with the compositor saying where it decided to allocate
you (no mobile screen available - you are on desktop) and app is expected to
adapt...

these are not security can-of-worms things. most de's are still getting to the
point of "usable" atm without worrying about all of these extras yet.

there's SIMPLE stuff like - what happens when compositor crashes? how do we
handle this? do you really want to lose all your apps when compositors crash?
what should clients do? how do we ensure clients are restored to the same place
and state? crash recovery is important because it is always what allows
updates/upgrades without losing everything. THIS stuff is still "un solved".
i'm totally not concerned about screen casting or vnc etc. etc. until all of
these other nigglies are well solved first.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
for the common case the DE can do it. for screen sharing kind of
things... you also need input control (take over mouse and be able to
control from app - or create a 2nd mouse pointer and control that...
keyboard - same, etc. etc. etc.). [snip]
Screen sharing for VOIP applications is only one of many, many use-cases
for being able to get the pixels from your screen. VNC servers,
recording video to provide better bug reports or to demonstrate
something, and so on. We aren't opening pandora's box here, just
supporting video capture doens't mean you need to support all of these
complicated and dangerous things as well.
apps can show their own content for their own bug reporting. for system-wide
reporting this will be DE integrated anyway. supporting video capture is a a
can of worms. as i said - single buffer? multiple with metadata? who does
conversion/scaling/transforms? what is the security model? and as i said - this
has major implications to the rendering back-end of a compositor.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
nasty little thing and in implementing something like this you are also
forcing compositors to work ion specific ways - eg screen capture will
likely FORCE the compositor to merge it all into a single ARGB buffer for
you rather than just assign it to hw layers. or perhaps it would require
just exposing all the layers, their config and have the client "deal with
it" ? but that means the compositor needs to expose its screen layout. do
you include pointer or not? compositor may draw ptr into the framebuffer.
it may use a special hw layer. what about if the compositor defers
rendering - does a screen capture api force the compositor to render when
the client wants? this can have all kinds of nasty effects in the rendering
pipeline - for use our rendering pipeline iss not in the compositor but via
the same libraries clients use so altering this pipeline affects regular
apps as well as compositor. ... can of worms :)
All of this would still be a problem if you want to support video
capture at all. You have to get the pixels into your encoder somehow.
There might be performance costs, but we aren't recording video all the
time.
there's a difference. when its an internal detail is can be changed and
adapted to how the compositor and its rendering subsystem work. when its a
protocol you HAVE to support THAT protocol and the way THAT protocol defines
things to work or apps break.

keep it internal - you can break at will and adapt as needed, make it public
and you are boxed in by what the public api allows.
Post by Drew DeVault
We can make Wayland support use-cases that are important to our users or
we can watch them stay on xorg perpetually and end up maintaining two
graphical stacks forever.
priorities. there are other issues that should be solved first before worrying
about the pandoras box ones.
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) ***@rasterman.com
Drew DeVault
2016-03-28 13:00:34 UTC
Permalink
Post by Carsten Haitzler (The Rasterman)
yes but you need permission and that is handled at kernel level on a specific
file. not so here. compositor runs as a specific user and so you cant do that.
you'd have to do in-compositor security client-by-client.
It is different, but we should still find a way to do it. After all,
we're going to be in a similar situation eventually where we're running
sandboxed applications and the compositor is granting rights from the
same level of privledge as the kernel provides to root users (in this
case, the role is almost of a hypervisor and a guest).
Post by Carsten Haitzler (The Rasterman)
you wouldn't recreate ffmpeg. ffmpec produce libraries like avcodec. like a
reasonable developer we'd just use their libraries to do the encoding - we'd
capture frames and then hand off to avcodec (ffmpeg) library routines to do the
rest. ffmpeg doesnt need to know how to capture - just to do what 99% of its
code is devoted to doing - encode/decode. :) that's rather simple. already we
have decoding wrapped - we sit on top of either gstreamer, vlc or xine as the
codec engine and just glue in output and control api's and events. encoding is
just the same but in reverse. :) the encapsulation is simple.
True, that most of the work is in the avcodec. However, there's more to
it than that. The entire command line interface of ffmpeg would be
nearly impossible to build into the compositor effectively. With ffmpeg
I can capture x, flip it, paint it sepia, add a logo to the corner, and
mux it with my microphone and a capture of the speakers (thanks,
pulseaudio) and add a subtitle track while I'm at it. Read the ffmpeg
man pages. ffmpeg-all(1) is 23,191 lines long on my terminal (that's
just the command line interface, not avcodec). There's no way in hell
all of the compositors/DEs are going to be able to fulfill all of its
use cases, nor do I think we should be trying to.

Look at things like OBS. It lets you specify detailed encoding options
and composites a scene from multiple video sources and audio sources,
as well as letting the user switch between different scenes with
configurable transitions. It even lets you embed a web browser into the
final result! All of this with a nice GUI to top it off. Again, we can't
possibly hope to effectively implement all of this in the compositor/DE,
or the features of the other software that we haven't even thought of.
Post by Carsten Haitzler (The Rasterman)
the expectation is there won't be generic tools but desktop specific ones. the
CURRENT ecosystem of tools exist because that is the way x was designed to
work. thus the srate of software matches its design. wayland is different. thus
tools and ecosystem will adapt.
That expectation is misguided. I like being able to write a script to
configure my desktop layout between several presets. Here's an example -
a while ago, I used a laptop at work that could be plugged into a
docking station. I would close the lid and use external displays at my
desk. I wanted to automatically change the screen layout when I came and
went, so I wrote a script that used xrandr to do it. It detected when
there were new outputs plugged in, then disabled the laptop screen and
enabled+configured the two new screens in the correct position and
resolution. This was easy for me to configure to behave the way I wanted
because the tooling was flexible and cross-desktop. Sure, we could make
each compositor support it, but each one is going to do it differently
and in their own subtly buggy ways and with their own subset of the
total possible features and use-cases, and none of them are going to
address every possible scenario.
Post by Carsten Haitzler (The Rasterman)
as for output config - why would the desktops that already have their own tools
then want to support OTHER tools too? their tools integrate with their settings
panels and look and feel right and support THEIR policies.
Base your desktop's tools on the common protocol, of course. Gnome
settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
seem to work fine configuring your outputs with the same protocol today.
Yes, the protocol is meh and the implementation is a mess, but the
clients of that protocol aren't bad by any stretch of the imagination.
Post by Carsten Haitzler (The Rasterman)
http://devs.enlightenment.org/~raster/ssetup.png
[snip]
This is a very interesting screenshot, and I hadn't considered this. I
don't think it's an unsolvable problem, though - we can make the
protocol flexible enough to allow compositor-specific metadata to be
added and configurable. These are the sorts of requirements I want to be
gathering to design this protocol with.
Post by Carsten Haitzler (The Rasterman)
no - we don't have to implement it as a protocol. enlightenment needs zero
protocol. it's done by the compositor. the compositors own tool is simply a
settings dialog inside the compositor itself. no protocol. not even a tool.
it's the same as edit/tools -> preferences in most gui apps. its just a dialog
the app shows to configure itself.
I currently do several things in different processes/binaries that
enlightenment does in the compositor, things like the bar and the
wallpaper. I don't want to make an output configuration GUI tool nested
into the compositor, it's out of scope.
Post by Carsten Haitzler (The Rasterman)
chances are gnome likely will do this via dbus (they love dbus :)). kde - i
don't know. but not everyone is implementing a wayland protocol at all so
assuming they are and saying "do it the same way" is not necessarily saving any
work.
We're all writing wayland compositors here. We may not all have dbus or
whatever else in common, but we do have the wayland protocol in common,
and it can support this use-case. It makes sense to use it.
Post by Carsten Haitzler (The Rasterman)
then intents are only a way of deciding where a surface is to be displayed -
rather than on the current desktop/screen.
so simply mark a surface as "for presentation" and the compositor will put it
on the non-internal display (chosen maybe by physical size reported in edid as
the larger one, or by elimination - its on the screen OTHER than the
internal... maybe user simply marks/checkboxes that screen as "use this
screen for presenting" and all apps that want so present get their content
there etc.)
Man, this is going to get really complicated. How do you decide what
display is "internal" or not? What if the user wants to present on their
primary display? What about applications that use the entire output for
things other then presentations? What if the application wants to use
several outputs, and for different purposes? What language are you going
to use to describe these settings to the user in a way that makes more
sense than the clients describing for themselves why they need to use a
particular output?
Post by Carsten Haitzler (The Rasterman)
so what you are saying is it's better to duplicate all this logic of screen
configuration inside every app that wants to present things (media players -
play movie on presentation screen, ppt/impress/whatever show presentation there,
etc. etc.) and how to configure the screen etc. etc., rather than have a simple
tag/intent and let your de/wm/compositor "deal with it" universally for all
such apps in a consistent way?
No. Applications want to be full screen or they don't want to be. If
they want to pick a particular output, we can easily let them do so.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Cool. Suggestions for what sort of capability thiis protocol should
have, what kind of surface roles we will be looking at? We should
consider a few things. Normal windows, of course, which on compositors
like Sway would be tiled. Then there's floating windows, like
ummm whats the difference between floating and normal? apps like gnome
calculator just open ... normal windows.
Gnome calculator doesn't like being tiled: Loading Image...

There are probably some other applications that would very much like to
be shown at a particular aspect ratio or resolution.
Post by Carsten Haitzler (The Rasterman)
xdg shell should be handling these already - except dmenu. dmenu is almost a
special desktop component. like a shelf/panel/bar thing.
dmenu isn't the only one, though, that may want to arrange itself in
special ways. Lemonbar and rofi also come to mind.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
[input is] something that many of Sway's users are asking for.
they are going to have to deal with this then. already gnome and kde and e will
all configure mouse accel/left/right mouse on their own based on settings. yes
- i can RUN xset and set it back later but its FIGHTING with your DE. waqyland
is the same. use the desktop tools for this :) yes - it'll change between
compositors. :) at least in wayland you cant fight with the compositor here.
for sway - you are going ot have to write this yourself. eg - write tools that
talk to sway or sway reads a cfg file you edit or whatever. :)
I've already written this into sway, fwiw, in your config file. I think
this is fine, too, and I intend to keep supporting configuring outputs
like that. But consider the use case of Krita, or video games like Osu!
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
However, beyond detailed input device configuration, there are some
other things that we should consider. Some applications (games, vnc,
etc) will want to capture the mouse and there should be a protocol for
them to indicate this with (perhaps again associated with special
permissions). Some applications (like Krita) may want to do things like
take control of your entire drawing tablet.
as i said. can of worms. :)
It's a can of worms we should deal with, and one that I don't think it's
hard to deal with. libinput lets you configure a handful of details
about input devices. Let's expose these things in a protocol.
Post by Carsten Haitzler (The Rasterman)
you have no idea how many non-security-sensitive things need fixing first
before addressing the can-of-worms problems. hell nvidia just released drivers
that requrie compositors to re-do how they talk to egl/kms/drm to work that's
not compatible with existing drm dmabuf buffers etc. etc.
Why do those things need to be dealt with first? Sway is at a good spot
where I can start thinking about these sorts of things. There are
enough people involved to work on multiple things at once. Plus,
everyone thinks nvidia's design is bad and we're hopefully going to see
something from them that avoids vendor-specific code.

I don't see these problems as a can of worms. I see them as problems
that are solvable and necessary to solve, and now is a good time to
solve them. My compositor is coming up on version 1.0. Supporting the
APIs is the driver's problem, we've described the spec and as soon as
they implement it, it will Just Work(tm).
Post by Carsten Haitzler (The Rasterman)
even clients and decorations. tiled wm's will not want clients to add
decorations with shadows etc. - currently clients will do csd because csd is
what weston chose and gnome has followed and enlightenment too. kde do not want
to do csd. i think that's wrong.
What is a can of worms is the argument over whether or not we should use
CSD or SSD. I fall in the latter camp, but I don't think we need to
fight over it now. We should be able to agree that a protocol for
negotiating whether or not borders are drawn would be reasonable. Is it
a GTK app that does nothing interesting with its titlebar? Well, if the
compositor wants to draw its borders, then let it do so. Does it do
fancy GTK stuff with the borders? Well, no, mister compositor, I want to
do fancy things. Easy enough.
Post by Carsten Haitzler (The Rasterman)
it adds complexity to wayland just to "not follow the convention". but
for tiling i see the point of at least removing the shadows. clients
may choose to slap a title bar there still because it's useful
displaying state. but advertising this info from the compositor is not
standardized. what do you advertise to clients? where/when? at connect
time? at surface creation time? what negotiation is it? it easily
could be that 1 screen or desktop is tiled and another is not and you
dont know what to tell the client until it has created a surface and
you know where that surface would go. perhaps this might be part of a
larger set of negotiation like "i am a mobile app so please stick me
on the mobile screen" or "i'm a desktop app - desktop please" then
with the compositor saying where it decided to allocate you (no mobile
screen available - you are on desktop) and app is expected to adapt...
In Wayland you create a surface, then assign it a role. Extra details
can go in between, or go in the call that gives it a role. Right now
most applications are creating their surface and then making it a shell
surface. The compositor can negotiate based on its own internal state
over whether a given output is tiled or not, or in cases like AwesomeWM,
whether a given workspace is tiled or not. And I don't think the
decision has to be final. If the window is moved to another output or
really if any of the circumstances change, they can renegotiate and the
surface can start drawing its own decorations.
Post by Carsten Haitzler (The Rasterman)
there's SIMPLE stuff like - what happens when compositor crashes? how do we
handle this? do you really want to lose all your apps when compositors crash?
what should clients do? how do we ensure clients are restored to the same place
and state? crash recovery is important because it is always what allows
updates/upgrades without losing everything. THIS stuff is still "un solved".
i'm totally not concerned about screen casting or vnc etc. etc. until all of
these other nigglies are well solved first.
I'm still not on board with all of this "first" stuff. I don't see any
reason why we have to order ourselves like this. It all needs to get
done at some point. Right now we haven't standardized anything, and each
compositor is using its own unique, incompatible way of taking
screenshots and recording videos, and each is probably introducing some
kind of security problem.
Post by Carsten Haitzler (The Rasterman)
apps can show their own content for their own bug reporting. for system-wide
reporting this will be DE integrated anyway. supporting video capture is a a
can of worms. as i said - single buffer? multiple with metadata? who does
conversion/scaling/transforms? what is the security model? and as i said - this
has major implications to the rendering back-end of a compositor.
The compositor hands RGBA (or ARGB, whatever, I don't care, we just pick
one) data to the client that's recording. This problem doesn't have to
be complicated. As for the "major implications"...
Post by Carsten Haitzler (The Rasterman)
there's a difference. when its an internal detail is can be changed and
adapted to how the compositor and its rendering subsystem work. when its a
protocol you HAVE to support THAT protocol and the way THAT protocol defines
things to work or apps break.
You STILL have to get the pixels into the encoder on the compositor
side. You will ALWAYS have to do that if you want to support video
captures, regardless of who's doing it. At some point you're going to
have to get the pixels you're rendering and hand them off to someone, be
that libavcodec or a privledged client.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
We can make Wayland support use-cases that are important to our users or
we can watch them stay on xorg perpetually and end up maintaining two
graphical stacks forever.
priorities. there are other issues that should be solved first before worrying
about the pandoras box ones.
These are not pandora's box. These are small, necessary features.

--
Drew DeVault
Carsten Haitzler (The Rasterman)
2016-03-28 14:03:00 UTC
Permalink
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
yes but you need permission and that is handled at kernel level on a
specific file. not so here. compositor runs as a specific user and so you
cant do that. you'd have to do in-compositor security client-by-client.
It is different, but we should still find a way to do it. After all,
we're going to be in a similar situation eventually where we're running
sandboxed applications and the compositor is granting rights from the
same level of privledge as the kernel provides to root users (in this
case, the role is almost of a hypervisor and a guest).
should we? is it right to create yet another rsecurity model in userspace
"quickly" just to solve things that dont NEED solving at least at this point.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
you wouldn't recreate ffmpeg. ffmpec produce libraries like avcodec. like a
reasonable developer we'd just use their libraries to do the encoding - we'd
capture frames and then hand off to avcodec (ffmpeg) library routines to do
the rest. ffmpeg doesnt need to know how to capture - just to do what 99%
of its code is devoted to doing - encode/decode. :) that's rather simple.
already we have decoding wrapped - we sit on top of either gstreamer, vlc
or xine as the codec engine and just glue in output and control api's and
events. encoding is just the same but in reverse. :) the encapsulation is
simple.
True, that most of the work is in the avcodec. However, there's more to
it than that. The entire command line interface of ffmpeg would be
nearly impossible to build into the compositor effectively. With ffmpeg
I can capture x, flip it, paint it sepia, add a logo to the corner, and
mux it with my microphone and a capture of the speakers (thanks,
pulseaudio) and add a subtitle track while I'm at it. Read the ffmpeg
man pages. ffmpeg-all(1) is 23,191 lines long on my terminal (that's
just the command line interface, not avcodec). There's no way in hell
all of the compositors/DEs are going to be able to fulfill all of its
use cases, nor do I think we should be trying to.
Look at things like OBS. It lets you specify detailed encoding options
and composites a scene from multiple video sources and audio sources,
as well as letting the user switch between different scenes with
configurable transitions. It even lets you embed a web browser into the
final result! All of this with a nice GUI to top it off. Again, we can't
possibly hope to effectively implement all of this in the compositor/DE,
or the features of the other software that we haven't even thought of.
adding watermarks can be done after encoding as another pass (encode in high
quality). hell watermarks can just be a WINDOW (surface) on the screen. you
don't need options. ass for audio - not too hard to do along with it. just
offer to record an input device - and choose (input can be current mixed output
or a mic ... or both).
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
the expectation is there won't be generic tools but desktop specific ones.
the CURRENT ecosystem of tools exist because that is the way x was designed
to work. thus the srate of software matches its design. wayland is
different. thus tools and ecosystem will adapt.
That expectation is misguided. I like being able to write a script to
configure my desktop layout between several presets. Here's an example -
a while ago, I used a laptop at work that could be plugged into a
docking station. I would close the lid and use external displays at my
desk. I wanted to automatically change the screen layout when I came and
went, so I wrote a script that used xrandr to do it. It detected when
there were new outputs plugged in, then disabled the laptop screen and
enabled+configured the two new screens in the correct position and
resolution. This was easy for me to configure to behave the way I wanted
because the tooling was flexible and cross-desktop. Sure, we could make
each compositor support it, but each one is going to do it differently
and in their own subtly buggy ways and with their own subset of the
total possible features and use-cases, and none of them are going to
address every possible scenario.
exactly what you describe is how e works out of the box. no sscripts needed.
requiring people write script to do their screen configuration is just wrong.
taking the position of "well i give up and won't bother and will just make my
users write scripts instead" iss sticking your head in the sand and not solving
the problem. you are now asking everyone ELSE who writes a compositor to
implement a protocol because YOU wont solve a problem that others have solved
in a user friendly manner.

i've been doing x11 wm's since 1996. i've seen the bad, the ugly and the
horrible. there is no way i want any kind of protocol for configuring the
screen. not after having seen just how much it is abused when there and what a
horrible state things are left in when it's there.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
as for output config - why would the desktops that already have their own
tools then want to support OTHER tools too? their tools integrate with
their settings panels and look and feel right and support THEIR policies.
Base your desktop's tools on the common protocol, of course. Gnome
settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
seem to work fine configuring your outputs with the same protocol today.
Yes, the protocol is meh and the implementation is a mess, but the
clients of that protocol aren't bad by any stretch of the imagination.
no tools. why do it? it's built in. in order for screen config "magic" to
work set of metadata attached to screens. you can set priority (screens get
numbers from highest to lowest priority at any given time allowing behaviour
like your "primary" screen to migrate to an external one then migrate back when
external monitor is attached etc.) sure we can start having that metadata
separate but then ALTERNATE TOOLS won't be able to configure it thus breaking
the desktop environment not providing metadata and other settings associated
with a display. this breaks functionality for users who then complain about
things not working right AND then the compositor has to now deal with these
"error cases" too because a foreign tool will be messing with its data/setup.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
http://devs.enlightenment.org/~raster/ssetup.png
[snip]
This is a very interesting screenshot, and I hadn't considered this. I
don't think it's an unsolvable problem, though - we can make the
protocol flexible enough to allow compositor-specific metadata to be
added and configurable. These are the sorts of requirements I want to be
gathering to design this protocol with.
as above. i have seen screen configuration used and abused over the years where
i just do not want to have a protocol for messing around with it for any
client. give them an inch and they'll take a mile.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
no - we don't have to implement it as a protocol. enlightenment needs zero
protocol. it's done by the compositor. the compositors own tool is simply a
settings dialog inside the compositor itself. no protocol. not even a tool.
it's the same as edit/tools -> preferences in most gui apps. its just a
dialog the app shows to configure itself.
I currently do several things in different processes/binaries that
enlightenment does in the compositor, things like the bar and the
wallpaper. I don't want to make an output configuration GUI tool nested
into the compositor, it's out of scope.
and that's perfectly fine - that is your choice. do not force your choice on
other compositors. you can implement all the protocol you want in any way you
want for your wm's tools.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
chances are gnome likely will do this via dbus (they love dbus :)). kde - i
don't know. but not everyone is implementing a wayland protocol at all so
assuming they are and saying "do it the same way" is not necessarily saving
any work.
We're all writing wayland compositors here. We may not all have dbus or
whatever else in common, but we do have the wayland protocol in common,
and it can support this use-case. It makes sense to use it.
gnome does almost everything with dbus. they love dbus. a lot of gnome is
centred around dbus. they likely will choose dbus to do this. likely. i
personally wouldn't choose to use dbus.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
then intents are only a way of deciding where a surface is to be displayed -
rather than on the current desktop/screen.
so simply mark a surface as "for presentation" and the compositor will put
it on the non-internal display (chosen maybe by physical size reported in
edid as the larger one, or by elimination - its on the screen OTHER than the
internal... maybe user simply marks/checkboxes that screen as "use this
screen for presenting" and all apps that want so present get their content
there etc.)
Man, this is going to get really complicated. How do you decide what
display is "internal" or not? What if the user wants to present on their
at least e already knows this. its screen management subsystem is perfectly
aware of this. :)
Post by Drew DeVault
primary display? What about applications that use the entire output for
the app can simply not request to present on their "presentation" screen... or
the user would mark their primary screen (internal on laptop maybe) AS their
presentation screen - more metadata to be held by compositor.

now ALL presentation tools behave the same - you dont have to reconfigure each
one separately and deal with the difference and lack or otherwise of features.
it's done in 1 place - compositor, and then all apps that want to do a
similar thing follow and work "as expected". far better than just ignoring the
issue. you yourself already talked about extra tags/hints/whatever - this is
one of those.
Post by Drew DeVault
things other then presentations? What if the application wants to use
several outputs, and for different purposes? What language are you going
to use to describe these settings to the user in a way that makes more
sense than the clients describing for themselves why they need to use a
particular output?
because this require clients DEFINING screen layout. wayland was specifically
designed to HIDE THIS. if the compositor displayed a screen wrapped around a
sphere in real life in a room - then it doesn't have rectangles... how will an
app deal with that? what if the compositor is literally a VR world with
surfaces wrapped around spheres and cubes - the point of wayland's design was
to hide this info from clients completely so the compositor decides based on
environment, not each and every client. this was a basic premise/design in
wayland from the get go and it was a good one. letting apps break this
abstraction breaks this design.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
so what you are saying is it's better to duplicate all this logic of screen
configuration inside every app that wants to present things (media players -
play movie on presentation screen, ppt/impress/whatever show presentation
there, etc. etc.) and how to configure the screen etc. etc., rather than
have a simple tag/intent and let your de/wm/compositor "deal with it"
universally for all such apps in a consistent way?
No. Applications want to be full screen or they don't want to be. If
they want to pick a particular output, we can easily let them do so.
i don't know about you.. but fullscreen to enlightenment means you use up ONE
SCREEN. not all screens. and from user response.. they LOVE IT. it is correct.
it's the right way. so when an app asks to be fullscreen it gets to use the
scren its on - not all. so no. fullscreen does NOT mean they would want to span
all screens (you imply that) and then just draw different areas of their
massive window to correspond to screens (and control those screens,
resolutions, geometries etc.).

what makes sense is an app hints at the purpose of its window and opens n
windows (surfaces). it can ask for fullscreen for each. the hints would allow
the compositor to choose which screen the window/surface is assigned to.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Cool. Suggestions for what sort of capability thiis protocol should
have, what kind of surface roles we will be looking at? We should
consider a few things. Normal windows, of course, which on compositors
like Sway would be tiled. Then there's floating windows, like
ummm whats the difference between floating and normal? apps like gnome
calculator just open ... normal windows.
Gnome calculator doesn't like being tiled: https://sr.ht/Ai5N.png
i think the problem is you are not handling min/max sizing of clients
properly. :) you need to fix sway. gnome calculator is not sizing up its buffer
on surface size. that is a message "i can't be bigger than this - this is my
biggest size. deal with is". you need to deal with it. eg - pad it and make it
sized AT the buffer size :)
Post by Drew DeVault
There are probably some other applications that would very much like to
be shown at a particular aspect ratio or resolution.
as above. buffer size tells you that.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
xdg shell should be handling these already - except dmenu. dmenu is almost a
special desktop component. like a shelf/panel/bar thing.
dmenu isn't the only one, though, that may want to arrange itself in
special ways. Lemonbar and rofi also come to mind.
all of these basically are "desktop components" ala
taskbars/shelves/panels/whatever - i know that for e we don't want to support
such apps. these are built in. i don't know what gnome or kde think but these
go against their design as an integrated desktop environment. YOU need these
because your compositor has no such feature itself. the bigger desktops don't
need it. they MAY support it - may not. i know i don't want to. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
[input is] something that many of Sway's users are asking for.
they are going to have to deal with this then. already gnome and kde and e
will all configure mouse accel/left/right mouse on their own based on
settings. yes
- i can RUN xset and set it back later but its FIGHTING with your DE.
waqyland is the same. use the desktop tools for this :) yes - it'll change
between compositors. :) at least in wayland you cant fight with the
compositor here. for sway - you are going ot have to write this yourself.
eg - write tools that talk to sway or sway reads a cfg file you edit or
whatever. :)
I've already written this into sway, fwiw, in your config file. I think
this is fine, too, and I intend to keep supporting configuring outputs
like that. But consider the use case of Krita, or video games like Osu!
i don't know osu - but i see no reason krita needs to configure a tablet. it
can just deal with input from it. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
However, beyond detailed input device configuration, there are some
other things that we should consider. Some applications (games, vnc,
etc) will want to capture the mouse and there should be a protocol for
them to indicate this with (perhaps again associated with special
permissions). Some applications (like Krita) may want to do things like
take control of your entire drawing tablet.
as i said. can of worms. :)
It's a can of worms we should deal with, and one that I don't think it's
hard to deal with. libinput lets you configure a handful of details
about input devices. Let's expose these things in a protocol.
input is very sensitive. having done this for years and watched how games like
to turn off key repeat then leave it off when they crash... or change mouse
accel then you find its changed everywhere and have to "fix it" etc. etc. - i'd
be loathe to do this. give them TOO much config ability anbd it can become a
security issue.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
you have no idea how many non-security-sensitive things need fixing first
before addressing the can-of-worms problems. hell nvidia just released
drivers that requrie compositors to re-do how they talk to egl/kms/drm to
work that's not compatible with existing drm dmabuf buffers etc. etc.
Why do those things need to be dealt with first? Sway is at a good spot
where I can start thinking about these sorts of things. There are
enough people involved to work on multiple things at once. Plus,
everyone thinks nvidia's design is bad and we're hopefully going to see
something from them that avoids vendor-specific code.
because these imho are far more important. you might be surprised at how few
people are involved.
Post by Drew DeVault
I don't see these problems as a can of worms. I see them as problems
that are solvable and necessary to solve, and now is a good time to
solve them. My compositor is coming up on version 1.0. Supporting the
APIs is the driver's problem, we've described the spec and as soon as
they implement it, it will Just Work(tm).
Post by Carsten Haitzler (The Rasterman)
even clients and decorations. tiled wm's will not want clients to add
decorations with shadows etc. - currently clients will do csd because csd is
what weston chose and gnome has followed and enlightenment too. kde do not
want to do csd. i think that's wrong.
What is a can of worms is the argument over whether or not we should use
CSD or SSD. I fall in the latter camp, but I don't think we need to
fight over it now. We should be able to agree that a protocol for
negotiating whether or not borders are drawn would be reasonable. Is it
a GTK app that does nothing interesting with its titlebar? Well, if the
compositor wants to draw its borders, then let it do so. Does it do
fancy GTK stuff with the borders? Well, no, mister compositor, I want to
do fancy things. Easy enough.
not so simple. with more of the ui of an app being moved INTO the border
(titlebar etc.) this is not a simple thing to just turn it off. you then turn
OFF necessary parts of the ui or have to push the problem out to the app to
"fallback". only having CSD solves all that complexity and is more efficient
than SSD when it comes to things like assigning hw layers or avoiding copies of
vast amounts of pixels. i was against CSD to start with too but i see their
major benefits.

of course the shadow padding area is something i do see as optional and
something to hint at that would be useful. i can't see gnome dropping CSD
especially given how integrated to the ui it's becoming. i can tel you that i'm
strongly considering going the same way and fully integrating into CSD for many
good reasons that go far beyond just a desktop.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
it adds complexity to wayland just to "not follow the convention". but
for tiling i see the point of at least removing the shadows. clients
may choose to slap a title bar there still because it's useful
displaying state. but advertising this info from the compositor is not
standardized. what do you advertise to clients? where/when? at connect
time? at surface creation time? what negotiation is it? it easily
could be that 1 screen or desktop is tiled and another is not and you
dont know what to tell the client until it has created a surface and
you know where that surface would go. perhaps this might be part of a
larger set of negotiation like "i am a mobile app so please stick me
on the mobile screen" or "i'm a desktop app - desktop please" then
with the compositor saying where it decided to allocate you (no mobile
screen available - you are on desktop) and app is expected to adapt...
In Wayland you create a surface, then assign it a role. Extra details
can go in between, or go in the call that gives it a role. Right now
most applications are creating their surface and then making it a shell
surface. The compositor can negotiate based on its own internal state
over whether a given output is tiled or not, or in cases like AwesomeWM,
whether a given workspace is tiled or not. And I don't think the
decision has to be final. If the window is moved to another output or
really if any of the circumstances change, they can renegotiate and the
surface can start drawing its own decorations.
yup. but this signalling/negotiation has to exist. currently it doesnt. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
there's SIMPLE stuff like - what happens when compositor crashes? how do we
handle this? do you really want to lose all your apps when compositors
crash? what should clients do? how do we ensure clients are restored to the
same place and state? crash recovery is important because it is always what
allows updates/upgrades without losing everything. THIS stuff is still "un
solved". i'm totally not concerned about screen casting or vnc etc. etc.
until all of these other nigglies are well solved first.
I'm still not on board with all of this "first" stuff. I don't see any
reason why we have to order ourselves like this. It all needs to get
done at some point. Right now we haven't standardized anything, and each
compositor is using its own unique, incompatible way of taking
screenshots and recording videos, and each is probably introducing some
kind of security problem.
you aren't going to talk me into implementing something that is important for
you and not a priority for e until such a time as i'm satisfied that the other
issues are solved. you are free to do what you want, but standardizing things
takes a looong time and a lot of experimentation, discussion, and repeating
this. we have resources on wayland and nothing you described is a priority for
them. there are far more important things to do that are actual business
requirements and so the people working need to prioritize what is such a
requirement as opposed to what is not. resources are not infinite and free.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
apps can show their own content for their own bug reporting. for system-wide
reporting this will be DE integrated anyway. supporting video capture is a a
can of worms. as i said - single buffer? multiple with metadata? who does
conversion/scaling/transforms? what is the security model? and as i said -
this has major implications to the rendering back-end of a compositor.
The compositor hands RGBA (or ARGB, whatever, I don't care, we just pick
one) data to the client that's recording. This problem doesn't have to
be complicated. As for the "major implications"...
let me complicate it for you. let's say i'm playing a video fullscreen. you now
have to convert argb to yuv then encode when it would have been far more
efficient to get access directly to the yuv buffer before it was even scaled to
screen size... :) so you have just specified a protocol that is by design
inefficient when it could be more efficient.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
there's a difference. when its an internal detail is can be changed and
adapted to how the compositor and its rendering subsystem work. when its a
protocol you HAVE to support THAT protocol and the way THAT protocol defines
things to work or apps break.
You STILL have to get the pixels into the encoder on the compositor
side. You will ALWAYS have to do that if you want to support video
captures, regardless of who's doing it. At some point you're going to
have to get the pixels you're rendering and hand them off to someone, be
that libavcodec or a privledged client.
yes - but when, how often and via what mechanisms pixels get there is a very
delicate thing.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
We can make Wayland support use-cases that are important to our users or
we can watch them stay on xorg perpetually and end up maintaining two
graphical stacks forever.
priorities. there are other issues that should be solved first before
worrying about the pandoras box ones.
These are not pandora's box. These are small, necessary features.
i disagree. i've been doing graphics for long enough to smell the nasties from
a mile off. it's not trivial. the decisions that are made now will haunt us
for a lifetime. they are not internal details that can be fixed easily. even
internal details are hard to fix once enough code relies on them...

so far we don't exactly have a lot of inter-desktop co-operation happening.
it's pretty much everyone for themselves except for a smallish core protocol.
do NOT try and solve security sensitive AND performance sensitive AND design
limiting/dictating things first and definitely don't do it without everyone on
the same page.
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) ***@rasterman.com
Drew DeVault
2016-03-28 14:55:05 UTC
Permalink
Post by Carsten Haitzler (The Rasterman)
should we? is it right to create yet another rsecurity model in userspace
"quickly" just to solve things that dont NEED solving at least at this point.
I don't think that the protocol proposed in other branches of this
thread is complex or short sighted. Can you hop on that branch and
provide feedback?
Post by Carsten Haitzler (The Rasterman)
adding watermarks can be done after encoding as another pass (encode in high
quality). hell watermarks can just be a WINDOW (surface) on the screen. you
don't need options. ass for audio - not too hard to do along with it. just
offer to record an input device - and choose (input can be current mixed output
or a mic ... or both).
You're still not grasping the scope of this. I want you to run this
command right now:

man ffmpeg-all

Just read it for a while. You're delusional if you think you can
feasibly implement all of these features in the compositor. Do you
honestly want your screen capture tool to be able to add a watermark?
How about live streaming, some people add a sort of extra UI to read off
donations and such. The scope of your screen capture tool is increasing
at an alarming rate if you intend to support all of the features
currently possible with ffmpeg. How about instead we make a simple
wayland protocol extension that we can integrate with ffmpeg and OBS and
imagemagick and so on in a single C file.
Post by Carsten Haitzler (The Rasterman)
exactly what you describe is how e works out of the box. no sscripts needed.
requiring people write script to do their screen configuration is just wrong.
taking the position of "well i give up and won't bother and will just make my
users write scripts instead" iss sticking your head in the sand and not solving
the problem. you are now asking everyone ELSE who writes a compositor to
implement a protocol because YOU wont solve a problem that others have solved
in a user friendly manner.
What if I want my laptop display to remain usable? Right now I'm docked
somewhere else and I actually do have this scenario - my laptop is one
of my working displays. How would I configure the difference between
these situations in your tool? What if I'm on a laptop with poorly
supported hardware (I've seen this before) where there's a limit on how
many outputs I can use at once? What if I want to write a script where I
put on a movie and it disables every output but my TV automatically? The
user is losing a lot of power here and there's no way you can satisfy
everyone's needs unless you make it programmable.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Base your desktop's tools on the common protocol, of course. Gnome
settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
seem to work fine configuring your outputs with the same protocol today.
Yes, the protocol is meh and the implementation is a mess, but the
clients of that protocol aren't bad by any stretch of the imagination.
no tools. why do it? it's built in. in order for screen config "magic" to
work set of metadata attached to screens. you can set priority (screens get
numbers from highest to lowest priority at any given time allowing behaviour
like your "primary" screen to migrate to an external one then migrate back when
external monitor is attached etc.) sure we can start having that metadata
separate but then ALTERNATE TOOLS won't be able to configure it thus breaking
the desktop environment not providing metadata and other settings associated
with a display. this breaks functionality for users who then complain about
things not working right AND then the compositor has to now deal with these
"error cases" too because a foreign tool will be messing with its data/setup.
Your example has a pretty straightforward baseline - the "default"
profile. Even so, we can design the protocol to make the custom metadata
options visible to the tools, and the tools can then provide the user
with options to configure that as well.
Post by Carsten Haitzler (The Rasterman)
as above. i have seen screen configuration used and abused over the years where
i just do not want to have a protocol for messing around with it for any
client. give them an inch and they'll take a mile.
Let them take a mile. _I_ want a mile. Here's an old quote that I think
is always relevant:

UNIX was not designed to stop its users from doing stupid things, as
that would also stop them from doing clever things.
Post by Carsten Haitzler (The Rasterman)
and that's perfectly fine - that is your choice. do not force your choice on
other compositors. you can implement all the protocol you want in any way you
want for your wm's tools.
Why do we have to be disjointed? We have a common set of problems and we
should strive for a common set of solutions.
Post by Carsten Haitzler (The Rasterman)
gnome does almost everything with dbus. they love dbus. a lot of gnome is
centred around dbus. they likely will choose dbus to do this. likely. i
personally wouldn't choose to use dbus.
Let's not speak for Gnome. They're copied on this thread, they'll speak
for themselves.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
primary display? What about applications that use the entire output for
the app can simply not request to present on their "presentation" screen... or
the user would mark their primary screen (internal on laptop maybe) AS their
presentation screen - more metadata to be held by compositor.
Then we're back to the very thing you were criticising before - making
the applications implement some sort of switch between using a
"presentation" output and using some other kind of output. It would be a
lot less complicated if the application asked to go full screen and the
compositor said "hey, this app wants to be full screen, which output
would you like to put it on?"
Post by Carsten Haitzler (The Rasterman)
now ALL presentation tools behave the same - you dont have to reconfigure each
one separately and deal with the difference and lack or otherwise of features.
it's done in 1 place - compositor, and then all apps that want to do a
similar thing follow and work "as expected". far better than just ignoring the
issue. you yourself already talked about extra tags/hints/whatever - this is
one of those.
I think I'm getting at something here. Does the workflow I just
described satisfy everyone's needs for this?
Post by Carsten Haitzler (The Rasterman)
because this require clients DEFINING screen layout. wayland was specifically
designed to HIDE THIS. if the compositor displayed a screen wrapped around a
sphere in real life in a room - then it doesn't have rectangles... how will an
app deal with that? what if the compositor is literally a VR world with
surfaces wrapped around spheres and cubes - the point of wayland's design was
to hide this info from clients completely so the compositor decides based on
environment, not each and every client. this was a basic premise/design in
wayland from the get go and it was a good one. letting apps break this
abstraction breaks this design.
In practice the VAST majority of our users are going to be using one or
more rectangular displays. We shouldn't cripple what they can do for the
sake of the niche. We can support both - why do we have to hide
information about the type of outputs in use from the clients? It
doesn't make sense for an app to get fullscreened in a virtual reality
compositor, yet we still support that. Rather than shoehorning every
design to meet the least common denominator, we should be flexible.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
No. Applications want to be full screen or they don't want to be. If
they want to pick a particular output, we can easily let them do so.
i don't know about you.. but fullscreen to enlightenment means you use up ONE
SCREEN. [snip]
I never said that fullscreen means multiple screens. No clue where
that's coming from.
Post by Carsten Haitzler (The Rasterman)
what makes sense is an app hints at the purpose of its window and opens n
windows (surfaces). it can ask for fullscreen for each. the hints would allow
the compositor to choose which screen the window/surface is assigned to.
Hinting doesn't and cannot capture all of the use cases. Just letting
the client say what it wants does.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Gnome calculator doesn't like being tiled: https://sr.ht/Ai5N.png
i think the problem is you are not handling min/max sizing of clients
properly. :) you need to fix sway. gnome calculator is not sizing up its buffer
on surface size. that is a message "i can't be bigger than this - this is my
biggest size. deal with is". you need to deal with it. eg - pad it and make it
sized AT the buffer size :)
This is harmful to tiling window managers in general. The window manager
arranges the windows, not the other way around. You can't have tiling
window management if you can't have the compositor tell the clients what
size to be. There's currently no metadata to tell the compositor that a
surface is strict about its geometry. Most applications handle being
given a size quite well and will rearrange/rerender itself to
compensate. Things like gnome-calcualtor are the exception, not the
rule.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
xdg shell should be handling these already - except dmenu. dmenu is almost a
special desktop component. like a shelf/panel/bar thing.
dmenu isn't the only one, though, that may want to arrange itself in
special ways. Lemonbar and rofi also come to mind.
all of these basically are "desktop components" ala
taskbars/shelves/panels/whatever - i know that for e we don't want to support
such apps. these are built in. i don't know what gnome or kde think but these
go against their design as an integrated desktop environment. YOU need these
because your compositor has no such feature itself. the bigger desktops don't
need it. they MAY support it - may not. i know i don't want to. :)
Users should be free to choose the tools they want. dmenu is much more
flexible and scriptable than anything any of the DEs offer in its place
- you just pipe in a list of things and the user picks one. Don't be
fooled into thinking that whatever your DE does for a given feature is
the mecca of that feature. Like you were saying to make other points -
there are fewer contributors to each DE than you might imagine. DEs are
spread too thin to make the perfect _everything_. But some projects like
dmenu are small and singular in their focus, and maintained by one or
two people who put in a much larger amount of effort than is put in by
DE contributors on the corresponding features of that DE.

Be flexible enough for users to pick the tools they want.
Post by Carsten Haitzler (The Rasterman)
i don't know osu - but i see no reason krita needs to configure a tablet. it
can just deal with input from it. :)
input is very sensitive. having done this for years and watched how games like
to turn off key repeat then leave it off when they crash... or change mouse
accel then you find its changed everywhere and have to "fix it" etc. etc. - i'd
be loathe to do this. give them TOO much config ability anbd it can become a
security issue.
Let's change the tone of the input configuration discussion. I've come
around to your points about providing input configuration in general to
clients, let's not do that. I think the only issue we should worry about
for input at this point is fixing the pointer-constraints protocol to
use our new permissions model.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Why do those things need to be dealt with first? Sway is at a good spot
where I can start thinking about these sorts of things. There are
enough people involved to work on multiple things at once. Plus,
everyone thinks nvidia's design is bad and we're hopefully going to see
something from them that avoids vendor-specific code.
because these imho are far more important. you might be surprised at how few
people are involved.
These features have to get done at some point. Backlog your
implementation of these protocols if you can't work on it now.
Post by Carsten Haitzler (The Rasterman)
not so simple. with more of the ui of an app being moved INTO the border
(titlebar etc.) this is not a simple thing to just turn it off. you then turn
OFF necessary parts of the ui or have to push the problem out to the app to
"fallback".
You misunderstand me. I'm not suggesting that these apps be crippled.
I'm suggesting that, during the negotiation, they _object_ to having the
server draw their decorations. Then other apps that don't care can say
so.
Post by Carsten Haitzler (The Rasterman)
only having CSD solves all that complexity and is more efficient
than SSD when it comes to things like assigning hw layers or avoiding copies of
vast amounts of pixels. i was against CSD to start with too but i see their
major benefits.
I don't want to rehash this old argument here. There's two sides to this
coin. I think everyone fully understands the other position. It's not
hard to reach a compromise on this.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
In Wayland you create a surface, then assign it a role. Extra details
can go in between, or go in the call that gives it a role. Right now
most applications are creating their surface and then making it a shell
surface. The compositor can negotiate based on its own internal state
over whether a given output is tiled or not, or in cases like AwesomeWM,
whether a given workspace is tiled or not. And I don't think the
decision has to be final. If the window is moved to another output or
really if any of the circumstances change, they can renegotiate and the
surface can start drawing its own decorations.
yup. but this signalling/negotiation has to exist. currently it doesnt. :)
We'll make this part of the protocols we're working on here :)
Post by Carsten Haitzler (The Rasterman)
you aren't going to talk me into implementing something that is important for
you and not a priority for e until such a time as i'm satisfied that the other
issues are solved. you are free to do what you want, but standardizing things
takes a looong time and a lot of experimentation, discussion, and repeating
this. we have resources on wayland and nothing you described is a priority for
them. there are far more important things to do that are actual business
requirements and so the people working need to prioritize what is such a
requirement as opposed to what is not. resources are not infinite and free.
Like I said before, put it on your backlog. I'm doing it now, and I want
your input on it. Provide feedback now and implement later if you need
to, but if you don't then the protocols won't meet your needs.
Post by Carsten Haitzler (The Rasterman)
let me complicate it for you. let's say i'm playing a video fullscreen. you now
have to convert argb to yuv then encode when it would have been far more
efficient to get access directly to the yuv buffer before it was even scaled to
screen size... :) so you have just specified a protocol that is by design
inefficient when it could be more efficient.
What, do you expect to tell libavcodec to switch pixel formats
mid-recording? No one is recording their screen all the time. Yeah, you
might hit performance issues. So be it. It may not be ideal but it'll
likely be well within the limits of reason.
Post by Carsten Haitzler (The Rasterman)
yes - but when, how often and via what mechanisms pixels get there is a very
delicate thing.
And yet you still need to convert the entire screen to a frame and feed
it into an encoder, no matter what. Feed the frame to a client instead.
Post by Carsten Haitzler (The Rasterman)
so far we don't exactly have a lot of inter-desktop co-operation happening.
it's pretty much everyone for themselves except for a smallish core protocol.
Which is ridiculous.
Post by Carsten Haitzler (The Rasterman)
do NOT try and solve security sensitive AND performance sensitive AND design
limiting/dictating things first and definitely don't do it without everyone on
the same page.
I'm here to get everyone on the same page. Get on it.

--
Drew DeVault
Drew DeVault
2016-03-28 17:44:42 UTC
Permalink
If you want to add additional stuff on top of a live stream, use
something with a programmable pipeline that can add effects to the
stream coming from the compositor. Why do we need negotiation, or user
interation, or exchange of metadata for this stuff?
The stream isn't coming from the compositor. That's the point. It needs
to be. However, providing programmable access to that stream is a
security concern, so it should be given only to certain privledged
clients.

--
Drew DeVault
Carsten Haitzler (The Rasterman)
2016-03-29 02:31:01 UTC
Permalink
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
should we? is it right to create yet another rsecurity model in userspace
"quickly" just to solve things that dont NEED solving at least at this point.
I don't think that the protocol proposed in other branches of this
thread is complex or short sighted. Can you hop on that branch and
provide feedback?
my take on it is that it's premature and not needed at this point. in fact i
wouldn't implement a protocol at all. *IF* i were to allow special access, i'd
simply require to fork the process directly from compositor and provide a
socketpair fd to this process and THAT fd could have extra capabilities
attached to the wl protocol. i would do nothing else because as a compositor i
cannot be sure what i am executing. i'd hand over the choice of being able to
execute this tool to the user to say ok to and not just blindly execute
anything i like.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
adding watermarks can be done after encoding as another pass (encode in high
quality). hell watermarks can just be a WINDOW (surface) on the screen. you
don't need options. ass for audio - not too hard to do along with it. just
offer to record an input device - and choose (input can be current mixed
output or a mic ... or both).
You're still not grasping the scope of this. I want you to run this
man ffmpeg-all
Just read it for a while. You're delusional if you think you can
feasibly implement all of these features in the compositor. Do you
all a compositor has to do is be able to capture a video stream to a file. you
can ADD watermarking, sepia, and other effects later on in a video editor. next
you'll tell me gimp is incapable of editing image files so we need programmatic
access to a digital cameras ccd to implement effects/watermarking etc. on
photos...
Post by Drew DeVault
honestly want your screen capture tool to be able to add a watermark?
no - this can be done in a video editing tool later on. just record video at
high quality so degradation is not an issue.
Post by Drew DeVault
How about live streaming, some people add a sort of extra UI to read off
donations and such. The scope of your screen capture tool is increasing
at an alarming rate if you intend to support all of the features
no. i actually did not increase the scope. i kept it simple to "compositor can
write a file". everything else can be done in a post-processing task. that file
may include captured audio at the same time from a specific audio input.
Post by Drew DeVault
currently possible with ffmpeg. How about instead we make a simple
wayland protocol extension that we can integrate with ffmpeg and OBS and
imagemagick and so on in a single C file.
i'm repeating myself. there are bigger fish to fry.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
exactly what you describe is how e works out of the box. no sscripts needed.
requiring people write script to do their screen configuration is just
wrong. taking the position of "well i give up and won't bother and will
just make my users write scripts instead" iss sticking your head in the
sand and not solving the problem. you are now asking everyone ELSE who
writes a compositor to implement a protocol because YOU wont solve a
problem that others have solved in a user friendly manner.
What if I want my laptop display to remain usable? Right now I'm docked
eh? ummm that is what happens - unless you close the lid, then internal display
is "disconnected".
Post by Drew DeVault
somewhere else and I actually do have this scenario - my laptop is one
of my working displays. How would I configure the difference between
these situations in your tool? What if I'm on a laptop with poorly
supported hardware (I've seen this before) where there's a limit on how
many outputs I can use at once? What if I want to write a script where I
put on a movie and it disables every output but my TV automatically? The
user is losing a lot of power here and there's no way you can satisfy
everyone's needs unless you make it programmable.
not true. this can be encapsulated without it being programmable. i have yet to
find a laptop that cannot run all its outputs, but the general limitation can
be accounted for - eg via prioritization. if you have 4 outputs and only 3 can
work at a time - then chose the 3 with the highest priority - adjust priority
of screens to have what you want.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Base your desktop's tools on the common protocol, of course. Gnome
settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
seem to work fine configuring your outputs with the same protocol today.
Yes, the protocol is meh and the implementation is a mess, but the
clients of that protocol aren't bad by any stretch of the imagination.
no tools. why do it? it's built in. in order for screen config "magic" to
work set of metadata attached to screens. you can set priority (screens
get numbers from highest to lowest priority at any given time allowing
behaviour like your "primary" screen to migrate to an external one then
migrate back when external monitor is attached etc.) sure we can start
having that metadata separate but then ALTERNATE TOOLS won't be able to
configure it thus breaking the desktop environment not providing metadata
and other settings associated with a display. this breaks functionality for
users who then complain about things not working right AND then the
compositor has to now deal with these "error cases" too because a foreign
tool will be messing with its data/setup.
Your example has a pretty straightforward baseline - the "default"
profile. Even so, we can design the protocol to make the custom metadata
options visible to the tools, and the tools can then provide the user
with options to configure that as well.
a protocol with undefined metadata is not a good protocol. it's now goes blobs
of data that are opaque except to specific implementations., this will mean
that other implementations eventually will do things like strip it out or damage
it as they don't know what it is nor do they care.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
as above. i have seen screen configuration used and abused over the years
where i just do not want to have a protocol for messing around with it for
any client. give them an inch and they'll take a mile.
Let them take a mile. _I_ want a mile. Here's an old quote that I think
UNIX was not designed to stop its users from doing stupid things, as
that would also stop them from doing clever things.
but it isn't the user - it's some game you download that you cannot alter the
code or behaviour of that then messes everything up because its creator only
ever had a single monitor and didn't account for those with 2 or 3.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
and that's perfectly fine - that is your choice. do not force your choice on
other compositors. you can implement all the protocol you want in any way
you want for your wm's tools.
Why do we have to be disjointed? We have a common set of problems and we
should strive for a common set of solutions.
because things like output configuration i do not see as needing a common
protocol. in fact it's desirable to not have one at all so it cannot be abused
or cause trouble.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
gnome does almost everything with dbus. they love dbus. a lot of gnome is
centred around dbus. they likely will choose dbus to do this. likely. i
personally wouldn't choose to use dbus.
Let's not speak for Gnome. They're copied on this thread, they'll speak
for themselves.
my point is that not everyone chooses the same solution as you. not everyone
has the same problem and needs to solve it or WANTS to solve it the same way.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
primary display? What about applications that use the entire output for
the app can simply not request to present on their "presentation" screen...
or the user would mark their primary screen (internal on laptop maybe) AS
their presentation screen - more metadata to be held by compositor.
Then we're back to the very thing you were criticising before - making
the applications implement some sort of switch between using a
"presentation" output and using some other kind of output. It would be a
lot less complicated if the application asked to go full screen and the
compositor said "hey, this app wants to be full screen, which output
would you like to put it on?"
that needs ZERO protocol extending. there already is a fullscreen request in
xdg shell. this is a compositor implementation detail. if all you want to do is
ask the user where to place the fullscreen window. if you want to open multiple
windows and have them on the most appropriate screen by default without asking
the user, then you need a little metadata. asking the app to explicitly define
the output simply means you now have N possible ways this could work depending
on each and every app. leave it to the compositor to decide along with hints
that tell the compositor the likely usage purpose of the window. a user can
always move it somewhere else via the compositor (hotkey, alt+left mouse drag
to somewhere else or some other mechanism).

but we are talking things like output control/configuration - why does a
presentation app need this control? control the actual setup of the output or
even explicitly define exactly what output (by name, id, number, etc.) to go
for? why does an app need to be able to target a specific output
programatically rather than simply give the intent/purpose of the
surface/window?
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
now ALL presentation tools behave the same - you dont have to reconfigure
each one separately and deal with the difference and lack or otherwise of
features. it's done in 1 place - compositor, and then all apps that want to
do a similar thing follow and work "as expected". far better than just
ignoring the issue. you yourself already talked about extra
tags/hints/whatever - this is one of those.
I think I'm getting at something here. Does the workflow I just
described satisfy everyone's needs for this?
Post by Carsten Haitzler (The Rasterman)
because this require clients DEFINING screen layout. wayland was
specifically designed to HIDE THIS. if the compositor displayed a screen
wrapped around a sphere in real life in a room - then it doesn't have
rectangles... how will an app deal with that? what if the compositor is
literally a VR world with surfaces wrapped around spheres and cubes - the
point of wayland's design was to hide this info from clients completely so
the compositor decides based on environment, not each and every client.
this was a basic premise/design in wayland from the get go and it was a
good one. letting apps break this abstraction breaks this design.
In practice the VAST majority of our users are going to be using one or
more rectangular displays. We shouldn't cripple what they can do for the
sake of the niche. We can support both - why do we have to hide
information about the type of outputs in use from the clients? It
doesn't make sense for an app to get fullscreened in a virtual reality
compositor, yet we still support that. Rather than shoehorning every
design to meet the least common denominator, we should be flexible.
they are not crippled. that's the point. in virtual reality fullscreen makes
sense as a "take over thew world", not take over the output to one eye.for
monitors on a desktop it makes sense to take over that monitor but not others.
so it depends on context and the compositors job is to interpret/manage/deal
with that context.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
No. Applications want to be full screen or they don't want to be. If
they want to pick a particular output, we can easily let them do so.
i don't know about you.. but fullscreen to enlightenment means you use up
ONE SCREEN. [snip]
I never said that fullscreen means multiple screens. No clue where
that's coming from.
then why does this presentation tool need to be able to configure outputs - eg
define which screen views which part of their window spanning all outputs? i
see no other purpose of having configuration control of outputs for a
presentation tool.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
what makes sense is an app hints at the purpose of its window and opens n
windows (surfaces). it can ask for fullscreen for each. the hints would
allow the compositor to choose which screen the window/surface is assigned
to.
Hinting doesn't and cannot capture all of the use cases. Just letting
the client say what it wants does.
clients explicitly saying what they want leads to broken scenarios. the game
dev who has never had > 1 screen and thus messes up users multi screen setups
because they never knew of nor cared about this situation. a HINT allows
interpretation to adapt the scenario nicely and make things work "properly".

the "i'd like to be fullscreen" hint from xdg has been a godsend - it doesn't
allow for clients to go "well i want to be at 50,80 and at 1278x968" (though
other bits of x do). apps used to do things like query root window size, create
override-redirect window , grab kbd and mouse and then display ... even though
root window may span many monitors and some parts of the rot window geom may
not be visible as no screen views that because the guy didn't know about randr
and such. worse they would play with xvidtune that only did 1 screen and thus
mess up all your screen config... because a protocol was invented that allows
EXPLICIT control and x HAD to implement explicit control. the fullscreen netwm
hint has drastically improved things as a high level hint allowing the wm to
interpret fullscreen in a way that makes sense given the scenario.

by the same token anything we do in wayland should be done at this higher level
hinting level. anything else is a recipe for disaster. it's not learning the
lessons of the past.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Gnome calculator doesn't like being tiled: https://sr.ht/Ai5N.png
i think the problem is you are not handling min/max sizing of clients
properly. :) you need to fix sway. gnome calculator is not sizing up its
buffer on surface size. that is a message "i can't be bigger than this -
this is my biggest size. deal with is". you need to deal with it. eg - pad
it and make it sized AT the buffer size :)
This is harmful to tiling window managers in general. The window manager
arranges the windows, not the other way around. You can't have tiling
sorry. neither in x11 nor in wayland does a wm/compositor just have the freedom
to resize a window to any size it likes WITHOUT CONSEQUENCES. in x11 min/max
size hints tell the wm the range of sizes a window can be sensibly drawn/laid
out with. in wayland it's communicated by buffer size. if you choose to ignore
this then you get to deal with the consequences as in your screenshot.

i would not just blindly ignore such info. i'd either pad with black/background
and keep to the buffer size or at least scale while retaining aspect ratio (and
pad as needed but likely less).

interestingly now you complain about clients having EXPLICIT control and you
say "oh well no ... this is bad for tiling wm's" ... yet when i explain that
having output configuration control etc. etc. is harmful it's something that
SHOULD be allowed for clients... (and where the output isn't even a client
resource unlike the buffers that they render which is one).
Post by Drew DeVault
window management if you can't have the compositor tell the clients what
size to be. There's currently no metadata to tell the compositor that a
surface is strict about its geometry. Most applications handle being
given a size quite well and will rearrange/rerender itself to
compensate. Things like gnome-calcualtor are the exception, not the
rule.
yes there is - the buffer size of the next frame. your surface size is a
"request" to client for that size. the response will be a new buffer or some
given size (or maybe no new buffer at all). you THEN deal with this new size. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
xdg shell should be handling these already - except dmenu. dmenu is
almost a special desktop component. like a shelf/panel/bar thing.
dmenu isn't the only one, though, that may want to arrange itself in
special ways. Lemonbar and rofi also come to mind.
all of these basically are "desktop components" ala
taskbars/shelves/panels/whatever - i know that for e we don't want to
support such apps. these are built in. i don't know what gnome or kde think
but these go against their design as an integrated desktop environment. YOU
need these because your compositor has no such feature itself. the bigger
desktops don't need it. they MAY support it - may not. i know i don't want
to. :)
Users should be free to choose the tools they want. dmenu is much more
flexible and scriptable than anything any of the DEs offer in its place
that is your wm's design. that is not the design of others. they want something
integrated and don't want external tools.
Post by Drew DeVault
- you just pipe in a list of things and the user picks one. Don't be
fooled into thinking that whatever your DE does for a given feature is
the mecca of that feature. Like you were saying to make other points -
no - but i'm saying that this is not a COMMON feature among all DEs. different
ones will work differently. gnome 3's chosen design these days is to put it
into gnome shell via js extensions, not the gnome 2 way with a separate panel
process (ala dmenu). enlightenment does it internally too and extend
differently. my point is that what you want here is not universal.
Post by Drew DeVault
there are fewer contributors to each DE than you might imagine. DEs are
that is exactly what i said in response to you saying that "we have all the
resources to do all of this" when i said we don't... :/ we don't - resources
are already expended elsewhere.
Post by Drew DeVault
spread too thin to make the perfect _everything_. But some projects like
dmenu are small and singular in their focus, and maintained by one or
two people who put in a much larger amount of effort than is put in by
DE contributors on the corresponding features of that DE.
Be flexible enough for users to pick the tools they want.
a lifetime of doing wm's has taught me that this approach is not the best. you
end up with a limiting and complex protocol to then allow taskbars, pagers and
so on to be in "dmenus" of this world. this is how gnome 1.x and 2.x worked. i
added the support in e long ago. i learned that it was a limiter in adding
features as you had to conform to someone elses idea of what virtual desktops
are etc.

these panels/taskbars/shelves/whatever are best being closely integrated into
the wm.

YOU choose not to integrate. the other major DEs come already integrated with
these. this is not a universal solution everyone should support. you can come
up with your own extension and encourage people to support it in their demnu's
etc. - if another DE wants to support this then they can implement the same
extension.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
i don't know osu - but i see no reason krita needs to configure a tablet. it
can just deal with input from it. :)
input is very sensitive. having done this for years and watched how games
like to turn off key repeat then leave it off when they crash... or change
mouse accel then you find its changed everywhere and have to "fix it" etc.
etc. - i'd be loathe to do this. give them TOO much config ability anbd it
can become a security issue.
Let's change the tone of the input configuration discussion. I've come
around to your points about providing input configuration in general to
clients, let's not do that. I think the only issue we should worry about
for input at this point is fixing the pointer-constraints protocol to
use our new permissions model.
that's very reasonable. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Why do those things need to be dealt with first? Sway is at a good spot
where I can start thinking about these sorts of things. There are
enough people involved to work on multiple things at once. Plus,
everyone thinks nvidia's design is bad and we're hopefully going to see
something from them that avoids vendor-specific code.
because these imho are far more important. you might be surprised at how few
people are involved.
These features have to get done at some point. Backlog your
implementation of these protocols if you can't work on it now.
that's what i'm saying. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
not so simple. with more of the ui of an app being moved INTO the border
(titlebar etc.) this is not a simple thing to just turn it off. you then
turn OFF necessary parts of the ui or have to push the problem out to the
app to "fallback".
You misunderstand me. I'm not suggesting that these apps be crippled.
I'm suggesting that, during the negotiation, they _object_ to having the
server draw their decorations. Then other apps that don't care can say
so.
aaah ok. so compositor adapts. then likely i would express this as a "minimize
your decorations" protocol from compositor to client, client to compositor then
responds similarly like "minimize your decorations" and compositor MAY choose
to not draw a shadow/titlebar etc. (or client responds with "ok" and then
compositor can draw all it likes around the app).
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
only having CSD solves all that complexity and is more efficient
than SSD when it comes to things like assigning hw layers or avoiding
copies of vast amounts of pixels. i was against CSD to start with too but i
see their major benefits.
I don't want to rehash this old argument here. There's two sides to this
coin. I think everyone fully understands the other position. It's not
hard to reach a compromise on this.
it's sad that we have to have this disagreement at all. :) go on. join the dark
side! :) we have cookies!
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
In Wayland you create a surface, then assign it a role. Extra details
can go in between, or go in the call that gives it a role. Right now
most applications are creating their surface and then making it a shell
surface. The compositor can negotiate based on its own internal state
over whether a given output is tiled or not, or in cases like AwesomeWM,
whether a given workspace is tiled or not. And I don't think the
decision has to be final. If the window is moved to another output or
really if any of the circumstances change, they can renegotiate and the
surface can start drawing its own decorations.
yup. but this signalling/negotiation has to exist. currently it doesnt. :)
We'll make this part of the protocols we're working on here :)
this i can agree on. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
you aren't going to talk me into implementing something that is important
for you and not a priority for e until such a time as i'm satisfied that
the other issues are solved. you are free to do what you want, but
standardizing things takes a looong time and a lot of experimentation,
discussion, and repeating this. we have resources on wayland and nothing
you described is a priority for them. there are far more important things
to do that are actual business requirements and so the people working need
to prioritize what is such a requirement as opposed to what is not.
resources are not infinite and free.
Like I said before, put it on your backlog. I'm doing it now, and I want
your input on it. Provide feedback now and implement later if you need
to, but if you don't then the protocols won't meet your needs.
Post by Carsten Haitzler (The Rasterman)
let me complicate it for you. let's say i'm playing a video fullscreen. you
now have to convert argb to yuv then encode when it would have been far more
efficient to get access directly to the yuv buffer before it was even
scaled to screen size... :) so you have just specified a protocol that is
by design inefficient when it could be more efficient.
What, do you expect to tell libavcodec to switch pixel formats
mid-recording? No one is recording their screen all the time. Yeah, you
might hit performance issues. So be it. It may not be ideal but it'll
likely be well within the limits of reason.
you'll appreciate what i'm getting at next time you have to do 4k ... or 8k
video and screencast/capture that. :) and have to do miracast... on a 1.3ghz
arm device :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
yes - but when, how often and via what mechanisms pixels get there is a very
delicate thing.
And yet you still need to convert the entire screen to a frame and feed
it into an encoder, no matter what. Feed the frame to a client instead.
is the screen a single frame or multiple pieced together by scanout hw
layers? :) what is your protcol/interface to the "screen stream". if you have
it be a simple "single buffer" then you are going to soon enough run into
issues. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
so far we don't exactly have a lot of inter-desktop co-operation happening.
it's pretty much everyone for themselves except for a smallish core protocol.
Which is ridiculous.
Post by Carsten Haitzler (The Rasterman)
do NOT try and solve security sensitive AND performance sensitive AND design
limiting/dictating things first and definitely don't do it without everyone
on the same page.
I'm here to get everyone on the same page. Get on it.
let's work on the things we do have in common first. :)
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) ***@rasterman.com
Drew DeVault
2016-03-29 04:01:00 UTC
Permalink
Post by Carsten Haitzler (The Rasterman)
my take on it is that it's premature and not needed at this point. in fact i
wouldn't implement a protocol at all. *IF* i were to allow special access, i'd
simply require to fork the process directly from compositor and provide a
socketpair fd to this process and THAT fd could have extra capabilities
attached to the wl protocol. i would do nothing else because as a compositor i
cannot be sure what i am executing. i'd hand over the choice of being able to
execute this tool to the user to say ok to and not just blindly execute
anything i like.
I don't really understand why forking from the compositor and bringing
along the fds really gives you much of a gain in terms of security. Can
you elaborate on how this changes things? I should also mention that I
don't really see the sort of security goals Wayland has in mind as
attainable until we start doing things like containerizing applications,
in which case we can elimitate entire classes of problems from this
design.
Post by Carsten Haitzler (The Rasterman)
all a compositor has to do is be able to capture a video stream to a file. you
can ADD watermarking, sepia, and other effects later on in a video editor. next
you'll tell me gimp is incapable of editing image files so we need programmatic
access to a digital cameras ccd to implement effects/watermarking etc. on
photos...
I'll remind you again that none of this supports the live streaming
use-case.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
currently possible with ffmpeg. How about instead we make a simple
wayland protocol extension that we can integrate with ffmpeg and OBS and
imagemagick and so on in a single C file.
i'm repeating myself. there are bigger fish to fry.
I'm repeating myself. Fry whatever fish you want and backlog this fish.
Post by Carsten Haitzler (The Rasterman)
eh? ummm that is what happens - unless you close the lid, then internal display
is "disconnected".
I'm snipping out a lot of the output configuration related stuff from
this response. I'm not going to argue very hard for a common output
configuration protocol. I've been trying to change gears on the output
discussion towards a discussion around whether or not the
fullscreen-shell protocol supports our needs and whether or how it needs
to be updated wrt permissions. I'm going to continue to omit large parts
of your response that I think are related to the resistance to output
configuration, let me know if there's something important I'm dropping
by doing so.
Post by Carsten Haitzler (The Rasterman)
a protocol with undefined metadata is not a good protocol. it's now goes blobs
of data that are opaque except to specific implementations., this will mean
that other implementations eventually will do things like strip it out or damage
it as they don't know what it is nor do they care.
It doesn't have to be undefined metadata. It can just be extensions. A
protocol with extensions built in is a good protocol whose designers had
foresight, kind of like the Wayland protocol we're all already making
extensions for.
Post by Carsten Haitzler (The Rasterman)
but it isn't the user - it's some game you download that you cannot alter the
code or behaviour of that then messes everything up because its creator only
ever had a single monitor and didn't account for those with 2 or 3.
But it _is_ the user. Let the user configure what they want, however
they want, and make it so that they can both do this AND deny crappy
games the right to do it as well. This applies to the entire discussion
broadly, not necessarily just to the output configuration bits (which I
retract).
Post by Carsten Haitzler (The Rasterman)
because things like output configuration i do not see as needing a common
protocol. in fact it's desirable to not have one at all so it cannot be abused
or cause trouble.
Troublemaking software is going to continue to make trouble. Further
news at 9. That doesn't really justify making trouble for users as well.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
In practice the VAST majority of our users are going to be using one or
more rectangular displays. We shouldn't cripple what they can do for the
sake of the niche. We can support both - why do we have to hide
information about the type of outputs in use from the clients? It
doesn't make sense for an app to get fullscreened in a virtual reality
compositor, yet we still support that. Rather than shoehorning every
design to meet the least common denominator, we should be flexible.
they are not crippled. that's the point. in virtual reality fullscreen makes
sense as a "take over thew world", not take over the output to one eye.for
monitors on a desktop it makes sense to take over that monitor but not others.
so it depends on context and the compositors job is to interpret/manage/deal
with that context.
I don't really understand what you're getting at here.
Post by Carsten Haitzler (The Rasterman)
sorry. neither in x11 nor in wayland does a wm/compositor just have the freedom
to resize a window to any size it likes WITHOUT CONSEQUENCES. in x11 min/max
size hints tell the wm the range of sizes a window can be sensibly drawn/laid
out with. in wayland it's communicated by buffer size. if you choose to ignore
this then you get to deal with the consequences as in your screenshot.
Here's gnome-calculator running on x with a tiling window manager:

Loading Image...

Here's the wayland screenshot again for comparison:

https://sr.ht/Ai5N.png

Most apps are fine with being told what resolution to be, and they
_need_ to be fine with this for the sake of my sanity. But I understand
that several applications have special concerns that would prevent this
from making sense, and for those it's simply a matter of saying that
they'd prefer to be floating. This is actually one of the things in the
X ecosystem that works perfectly fine and has worked perfectly fine for
a long time.
Post by Carsten Haitzler (The Rasterman)
i would not just blindly ignore such info. i'd either pad with black/background
and keep to the buffer size or at least scale while retaining aspect ratio (and
pad as needed but likely less).
Eww.
Post by Carsten Haitzler (The Rasterman)
interestingly now you complain about clients having EXPLICIT control and you
say "oh well no ... this is bad for tiling wm's" ... yet when i explain that
having output configuration control etc. etc. is harmful it's something that
SHOULD be allowed for clients... (and where the output isn't even a client
resource unlike the buffers that they render which is one).
What I really want is _users_ to have control. I don't like it that
compositors are forcing solutions on them that doesn't allow them to be
in control of how their shit works.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Users should be free to choose the tools they want. dmenu is much more
flexible and scriptable than anything any of the DEs offer in its place
that is your wm's design. that is not the design of others.
they want something integrated...
okay
Post by Carsten Haitzler (The Rasterman)
...and don't want external tools.
Bullshit. Give them something integrated and they'll use it. However,
there's no reason why the integrated solution and the external tools
couldn't both exist. The users don't give a fuck about whether or not
the external tools exist. They are apathetic about it, they don't
actively "not want it", and their experience is in no way worsened by
the availablility of external tools. Those who do want external tools,
however, have a worsened experience if we design ourselves into a black
box that no one can extend.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- you just pipe in a list of things and the user picks one. Don't be
fooled into thinking that whatever your DE does for a given feature is
the mecca of that feature. Like you were saying to make other points -
no - but i'm saying that this is not a COMMON feature among all DEs. different
ones will work differently. gnome 3's chosen design these days is to put it
into gnome shell via js extensions, not the gnome 2 way with a separate panel
process (ala dmenu). enlightenment does it internally too and extend
differently. my point is that what you want here is not universal.
I'm not suggesting anything radical to try and cover all of these use
cases at once. Sway has a protocol that lets a surface indicate it wants
to be docked somewhere, which allows for custom taskbars and things like
dmenu and so on to exist pretty easily, and this protocol is how swaybar
happens to be implemented. This doesn't seem very radical to me, it
doesn't enforce anything on how each of the DEs choose to implement
their this and that.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
there are fewer contributors to each DE than you might imagine. DEs are
that is exactly what i said in response to you saying that "we have all the
resources to do all of this" when i said we don't... :/ we don't - resources
are already expended elsewhere.
We've both used this same argument from each side multiple times, it's
getting kind of old. But I think these statements hold true:

There aren't necessarily enough people to work on the features I'm
proposing right now. I don't think anyone needs to implement this _right
now_. There also aren't ever enough people to give every little feature
of their DE the attention that leads to software that is as high quality
as a similar project with a single focus on that one feature.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Be flexible enough for users to pick the tools they want.
a lifetime of doing wm's has taught me that this approach is not the best. you
end up with a limiting and complex protocol to then allow taskbars, pagers and
so on to be in "dmenus" of this world. this is how gnome 1.x and 2.x worked. i
added the support in e long ago. i learned that it was a limiter in adding
features as you had to conform to someone elses idea of what virtual desktops
are etc.
A lifetime of using and customizing and scripting WMs that are more
composable and configurable than e, gnome, kde, and most of the other
Big Ones has led me to the opposite conclusion. I'm not suggesting we do
these sorts of efforts ad nauseum. I don't think we're heading towards a
situation where we're agreeing on the implementation of virtual
desktops. I'm putting forth a small handful of important, core features
that we are all going to have to support in some way or another to even
qualify as wayland compositors and subvert X's domainance over the
desktop.
Post by Carsten Haitzler (The Rasterman)
these panels/taskbars/shelves/whatever are best being closely integrated into
the wm.
You don't provide any justification for this, you just say it like it's
gospel, and it's not. I will again remind you that not everyone wants to
buy into a desktop environment wholesale. They may want to piece it
together however they see fit and it's their god damn right to. Anything
else is against the spirit of free software.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
These features have to get done at some point. Backlog your
implementation of these protocols if you can't work on it now.
that's what i'm saying. :)
In this case, I'm not seeing how your points about what order things
need to be done in matters. Now is the right time for me to implement
this in Sway. The major problems you're trying to solve are either
non-issues or solved issues on Sway, and it makes sense to do this now.
I'd like to do it in a way that works for everyone.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
You misunderstand me. I'm not suggesting that these apps be crippled.
I'm suggesting that, during the negotiation, they _object_ to having the
server draw their decorations. Then other apps that don't care can say
so.
aaah ok. so compositor adapts. then likely i would express this as a "minimize
your decorations" protocol from compositor to client, client to compositor then
responds similarly like "minimize your decorations" and compositor MAY choose
to not draw a shadow/titlebar etc. (or client responds with "ok" and then
compositor can draw all it likes around the app).
I think Jonas is on the right track here. This sort of information could
go into xdg_*. It might not need an entire protocol to itself.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
I don't want to rehash this old argument here. There's two sides to this
coin. I think everyone fully understands the other position. It's not
hard to reach a compromise on this.
it's sad that we have to have this disagreement at all. :) go on. join the dark
side! :) we have cookies!
Never! I want my GTK apps and my Qt apps to have the same decorations,
dammit :) Too bad I don't have much hope for making my cursor theme
consistent across my entire desktop...
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
What, do you expect to tell libavcodec to switch pixel formats
mid-recording? No one is recording their screen all the time. Yeah, you
might hit performance issues. So be it. It may not be ideal but it'll
likely be well within the limits of reason.
you'll appreciate what i'm getting at next time you have to do 4k ... or 8k
video and screencast/capture that. :) and have to do miracast... on a 1.3ghz
arm device :)
I'll go back to the earlier argument of "we shouldn't cripple the
majority for the sake of the niche". Who on Earth is going to drive an
8K display on a 1.3ghz ARM device anyway :P

--
Drew DeVault
Carsten Haitzler (The Rasterman)
2016-03-29 06:10:10 UTC
Permalink
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
my take on it is that it's premature and not needed at this point. in fact i
wouldn't implement a protocol at all. *IF* i were to allow special access,
i'd simply require to fork the process directly from compositor and provide
a socketpair fd to this process and THAT fd could have extra capabilities
attached to the wl protocol. i would do nothing else because as a
compositor i cannot be sure what i am executing. i'd hand over the choice
of being able to execute this tool to the user to say ok to and not just
blindly execute anything i like.
I don't really understand why forking from the compositor and bringing
along the fds really gives you much of a gain in terms of security. Can
why?

there is no way a process can access the socket with privs (even know the
extra protocol exists) unless it is executed by the compositor. the compositor
can do whatever it deems "necessary" to ensure it executes only what is
allowed. eg - a whitelist of binary paths. i see this as a lesser chance of a
hole.
Post by Drew DeVault
you elaborate on how this changes things? I should also mention that I
don't really see the sort of security goals Wayland has in mind as
attainable until we start doing things like containerizing applications,
in which case we can elimitate entire classes of problems from this
design.
certain os's do this already - tizen does. we use smack labels. this is why i
care so much about application isolation and not having anything exposed to an
app that it doesn't absolutely need. :) so i am coming from the point of view
of "containering is solved - we need to not break that in wayland" :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
all a compositor has to do is be able to capture a video stream to a file.
you can ADD watermarking, sepia, and other effects later on in a video
editor. next you'll tell me gimp is incapable of editing image files so we
need programmatic access to a digital cameras ccd to implement
effects/watermarking etc. on photos...
I'll remind you again that none of this supports the live streaming
use-case.
i know - but for just capturing screencasts, adding watermarks etc. - all you
need is to store a stream - the rest can be post-processed.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
currently possible with ffmpeg. How about instead we make a simple
wayland protocol extension that we can integrate with ffmpeg and OBS and
imagemagick and so on in a single C file.
i'm repeating myself. there are bigger fish to fry.
I'm repeating myself. Fry whatever fish you want and backlog this fish.
Post by Carsten Haitzler (The Rasterman)
eh? ummm that is what happens - unless you close the lid, then internal
display is "disconnected".
I'm snipping out a lot of the output configuration related stuff from
this response. I'm not going to argue very hard for a common output
configuration protocol. I've been trying to change gears on the output
discussion towards a discussion around whether or not the
fullscreen-shell protocol supports our needs and whether or how it needs
to be updated wrt permissions. I'm going to continue to omit large parts
of your response that I think are related to the resistance to output
configuration, let me know if there's something important I'm dropping
by doing so.
why do we need the fullscreen shell? that was intended for environments where
apps are only ever fullscreen from memory. xdg shell has the ability for a
window to go fullscreen (or back to normal) this should do just fine. :) sure -
let's talk about this stuff - fullscreening etc.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
a protocol with undefined metadata is not a good protocol. it's now goes
blobs of data that are opaque except to specific implementations., this
will mean that other implementations eventually will do things like strip
it out or damage it as they don't know what it is nor do they care.
It doesn't have to be undefined metadata. It can just be extensions. A
protocol with extensions built in is a good protocol whose designers had
foresight, kind of like the Wayland protocol we're all already making
extensions for.
yeah - but you are creating objects (screens) with no extended data - or
modifying them. you don't have or lose the data. :) let's talk about the actual
apps surfaces and where they go - not configuration of outputs. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
but it isn't the user - it's some game you download that you cannot alter
the code or behaviour of that then messes everything up because its creator
only ever had a single monitor and didn't account for those with 2 or 3.
But it _is_ the user. Let the user configure what they want, however
they want, and make it so that they can both do this AND deny crappy
games the right to do it as well. This applies to the entire discussion
broadly, not necessarily just to the output configuration bits (which I
retract).
Post by Carsten Haitzler (The Rasterman)
because things like output configuration i do not see as needing a common
protocol. in fact it's desirable to not have one at all so it cannot be
abused or cause trouble.
Troublemaking software is going to continue to make trouble. Further
news at 9. That doesn't really justify making trouble for users as well.
or just have the compositor "work" without needing scripts and users to have to
learn how to write them. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
In practice the VAST majority of our users are going to be using one or
more rectangular displays. We shouldn't cripple what they can do for the
sake of the niche. We can support both - why do we have to hide
information about the type of outputs in use from the clients? It
doesn't make sense for an app to get fullscreened in a virtual reality
compositor, yet we still support that. Rather than shoehorning every
design to meet the least common denominator, we should be flexible.
they are not crippled. that's the point. in virtual reality fullscreen makes
sense as a "take over thew world", not take over the output to one eye.for
monitors on a desktop it makes sense to take over that monitor but not
others. so it depends on context and the compositors job is to
interpret/manage/deal with that context.
I don't really understand what you're getting at here.
apps can still be fullscreen. nothing has been crippled. just what fullscreen
MEANS is defined by context by the compositor.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
sorry. neither in x11 nor in wayland does a wm/compositor just have the
freedom to resize a window to any size it likes WITHOUT CONSEQUENCES. in
x11 min/max size hints tell the wm the range of sizes a window can be
sensibly drawn/laid out with. in wayland it's communicated by buffer size.
if you choose to ignore this then you get to deal with the consequences as
in your screenshot.
https://fuwa.se/f/YIkvDi.png
that'd be the toolkit actually resizing regardless of its min/max hints - the
wayland back end is refusing to do this. the x11 back end is "dealing with it"
even though it doesn't have to. i can point at more software that when you go
beyond max or below min size looks like trash - it may have blank/garbage areas
of the window or fall over in other ways. in x11 you CANNOT hard-control your
window size. the wm can resize it to whatever and ignore your min/max hints.
in wayland the CLIENT controls buffer size and fills buffer with content before
compositor sees it. compositor cant force a buffer size on a client. x and
wayland work differently in the case where the wm decided to just go "screw you
- i'm doing this". you may want to NOT do that and respect the fact the client
has a min and max size and work with it. :)
Post by Drew DeVault
https://sr.ht/Ai5N.png
Most apps are fine with being told what resolution to be, and they
_need_ to be fine with this for the sake of my sanity. But I understand
that several applications have special concerns that would prevent this
but for THEIR sanity, they are not fine with it. :)
Post by Drew DeVault
from making sense, and for those it's simply a matter of saying that
they'd prefer to be floating. This is actually one of the things in the
X ecosystem that works perfectly fine and has worked perfectly fine for
a long time.
no. this has nothing to do with floating. this has to do with minimum and in
this case especially - maximum sizes. it has NOTHING to do with floating. you
are conflating sizing with floating because floating is how YOU HAPPEN to want
to deal with it. you COULD deal with it as i described - pad out the area or
scale retaining aspect ratio - allow user to configure the response. if i had a
small calculator on the left and something that can size up on the right i
would EXPECt a tiling wm to be smart and do:

+---+------------+
| |............|
|:::|............|
|:::|............|
|:::|............|
| |............|
+---+------------+

so keep the left column the max width of all clients and the right side expands
instead. on the left i pad with black/background around the "calculator" there.
that is what i'd expect if a client can't size up. the same for min size
(sizing down) - don't force apps to be smaller than their min size. deal with it
by scrolling or scaling the bitmap or however you like - but deal with it. :)

but don't confuse min and max size with floating. expecting devs to tell you
they want to float is not going to be common as most devs wont target a tiling
wm to make you happy here. YOU should choose to float - eg if window is of a
dialog type, or perhaps if it refuses to adapt to the size given etc. you need
to come up with properties/tags/modes/intents that are common across DEs to
have them be supported commonly. floating will not be common except a SPECIAL
mode for tiling wm's. try something else. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
i would not just blindly ignore such info. i'd either pad with
black/background and keep to the buffer size or at least scale while
retaining aspect ratio (and pad as needed but likely less).
Eww.
Post by Carsten Haitzler (The Rasterman)
interestingly now you complain about clients having EXPLICIT control and you
say "oh well no ... this is bad for tiling wm's" ... yet when i explain that
having output configuration control etc. etc. is harmful it's something that
SHOULD be allowed for clients... (and where the output isn't even a client
resource unlike the buffers that they render which is one).
What I really want is _users_ to have control. I don't like it that
compositors are forcing solutions on them that doesn't allow them to be
in control of how their shit works.
they can patch their compositors if they want. if you are forcing users to
write scripts you are already forcing them to "learn to code" in a simple way.
would it not be best to try and make things work without needing scripts/custom
code per user and have features/modes/logic that "just work" ?
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Users should be free to choose the tools they want. dmenu is much more
flexible and scriptable than anything any of the DEs offer in its place
that is your wm's design. that is not the design of others.
they want something integrated...
okay
Post by Carsten Haitzler (The Rasterman)
...and don't want external tools.
Bullshit. Give them something integrated and they'll use it. However,
i was speaking of the other DE developers - not users. YOUR design does not
want integrated. others WANT integrated designs and DON'T want adhoc
non-integrated components in their desktop environment they are creating.
Post by Drew DeVault
there's no reason why the integrated solution and the external tools
couldn't both exist. The users don't give a fuck about whether or not
the external tools exist. They are apathetic about it, they don't
actively "not want it", and their experience is in no way worsened by
the availablility of external tools. Those who do want external tools,
however, have a worsened experience if we design ourselves into a black
box that no one can extend.
you need to calm down i think.

*I* do not want adhoc panels/taskbars/tools written by separate projects within
my DE because they cause more problems than they solve. been there. done that.
not going back. i learned my lesson on that years ago. for them to work you have
pagers and taskbars in them to be fully functional and unless you ALSO then bind
all this metadata for the pagers, virtual desktops and their content to a
protocol that is also universal, then its rather pointless. this then ties your
desktop to a specific design of how desktops are (eg NxM grids and only ONE of
those in an entire environment. when with enlightenment each screen has an
independent NxM grid PER SCREEN that can be switched separately.

so either i break all those 3rd party pagers or i compromise design and force
everyone into a horrible "1 desktop spans all screens and u have NxM virtual
desktops for all screens combined" which is far worse, so i abandoned
supporting the protocol (netwm).

for good historical reasons i know *I* don't want to repeat this design from
x11 with wayland. just to implement a pager or taskbar is a security hole as
you begin to expose other clients - no more isolation. you expose buffers of
their content. AND you limit your notions of a desktop/screen to those defined
by that protocol. i would not start walking down this path to begin with.

i'm warning you that you are simply repeating past mistakes by trying to go
this way.
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
- you just pipe in a list of things and the user picks one. Don't be
fooled into thinking that whatever your DE does for a given feature is
the mecca of that feature. Like you were saying to make other points -
no - but i'm saying that this is not a COMMON feature among all DEs.
different ones will work differently. gnome 3's chosen design these days is
to put it into gnome shell via js extensions, not the gnome 2 way with a
separate panel process (ala dmenu). enlightenment does it internally too
and extend differently. my point is that what you want here is not
universal.
I'm not suggesting anything radical to try and cover all of these use
cases at once. Sway has a protocol that lets a surface indicate it wants
to be docked somewhere, which allows for custom taskbars and things like
dmenu and so on to exist pretty easily, and this protocol is how swaybar
happens to be implemented. This doesn't seem very radical to me, it
doesn't enforce anything on how each of the DEs choose to implement
their this and that.
then keep your protocol. :) i know i have no interest in supporting it - as
above. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
there are fewer contributors to each DE than you might imagine. DEs are
that is exactly what i said in response to you saying that "we have all the
resources to do all of this" when i said we don't... :/ we don't - resources
are already expended elsewhere.
We've both used this same argument from each side multiple times, it's
There aren't necessarily enough people to work on the features I'm
proposing right now. I don't think anyone needs to implement this _right
now_. There also aren't ever enough people to give every little feature
of their DE the attention that leads to software that is as high quality
as a similar project with a single focus on that one feature.
that is true. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Be flexible enough for users to pick the tools they want.
a lifetime of doing wm's has taught me that this approach is not the best.
you end up with a limiting and complex protocol to then allow taskbars,
pagers and so on to be in "dmenus" of this world. this is how gnome 1.x and
2.x worked. i added the support in e long ago. i learned that it was a
limiter in adding features as you had to conform to someone elses idea of
what virtual desktops are etc.
A lifetime of using and customizing and scripting WMs that are more
composable and configurable than e, gnome, kde, and most of the other
Big Ones has led me to the opposite conclusion. I'm not suggesting we do
these sorts of efforts ad nauseum. I don't think we're heading towards a
situation where we're agreeing on the implementation of virtual
desktops. I'm putting forth a small handful of important, core features
that we are all going to have to support in some way or another to even
qualify as wayland compositors and subvert X's domainance over the
desktop.
i just think that some of the things you want should stay "within your
compositor and its extension protocols". other things i see as genuinely
globally useful. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
these panels/taskbars/shelves/whatever are best being closely integrated
into the wm.
You don't provide any justification for this, you just say it like it's
gospel, and it's not. I will again remind you that not everyone wants to
considering i actually have implemented all of this over the years,
experienced the downsides and have come around to the conclusion that an
integrated environment works best ... i've done the miles. i explained above
how issues with pagers (external ones) create issues in x11 and thus they were
dropped. not to mention security concerns (that were not an issue in x11
because it's insecure by design - insecure meaning you can access any content
of any window at any time, or discover all your application window id's any
time in the window tree whenever you want - no isolation ... etc.).
Post by Drew DeVault
buy into a desktop environment wholesale. They may want to piece it
together however they see fit and it's their god damn right to. Anything
else is against the spirit of free software.
i disagree. i can't take linux and just use some bsd device drvier with it - oh
dear. that's against the spirit free software! i have to port it and
integrate it (as a kernel module). wayland is about making the things that HAVE
to be shared protocol just that. the things that don't absolutely have to be,
we don't. you are able to patch, modify and extend your de/wm, all you like -
most de's provide some way to do this. gnome today uses js. e uses loadable
modules. i am unsure about kde. :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
These features have to get done at some point. Backlog your
implementation of these protocols if you can't work on it now.
that's what i'm saying. :)
In this case, I'm not seeing how your points about what order things
need to be done in matters. Now is the right time for me to implement
this in Sway. The major problems you're trying to solve are either
non-issues or solved issues on Sway, and it makes sense to do this now.
I'd like to do it in a way that works for everyone.
you need to solve clients that have a minx/max size without introducing the
need for a floating property. that is something entirely different. not solved.
what happens when you need to restart sway after some development? where do all
your terminals/editors/ide's, browsers/irc clients go? they vanish and you have
to re-run them?
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
You misunderstand me. I'm not suggesting that these apps be crippled.
I'm suggesting that, during the negotiation, they _object_ to having the
server draw their decorations. Then other apps that don't care can say
so.
aaah ok. so compositor adapts. then likely i would express this as a
"minimize your decorations" protocol from compositor to client, client to
compositor then responds similarly like "minimize your decorations" and
compositor MAY choose to not draw a shadow/titlebar etc. (or client
responds with "ok" and then compositor can draw all it likes around the
app).
I think Jonas is on the right track here. This sort of information could
go into xdg_*. It might not need an entire protocol to itself.
i'd lean on a revision of xdg :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
I don't want to rehash this old argument here. There's two sides to this
coin. I think everyone fully understands the other position. It's not
hard to reach a compromise on this.
it's sad that we have to have this disagreement at all. :) go on. join the
dark side! :) we have cookies!
Never! I want my GTK apps and my Qt apps to have the same decorations,
dammit :) Too bad I don't have much hope for making my cursor theme
consistent across my entire desktop...
but.... COOKIES! COOOOOOOKIES! :)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
What, do you expect to tell libavcodec to switch pixel formats
mid-recording? No one is recording their screen all the time. Yeah, you
might hit performance issues. So be it. It may not be ideal but it'll
likely be well within the limits of reason.
you'll appreciate what i'm getting at next time you have to do 4k ... or 8k
video and screencast/capture that. :) and have to do miracast... on a 1.3ghz
arm device :)
I'll go back to the earlier argument of "we shouldn't cripple the
majority for the sake of the niche". Who on Earth is going to drive an
8K display on a 1.3ghz ARM device anyway :P
... you might be surprised. 4k ones are already out there. ok . not 1.3ghz -
2ghz - but no way you can capture even 4k with the highest end arms unless you
avoid conversion. you keep things in yuv space and drop your bandwidth
requirements hugely. in fact you never leave yuv space and make use of the hw
layers and the video decoder decodes directly into scanout buffers. you MAY be
able to stuff the yuv buffers back into an encoder and re-encode again ... just.
but it'd be better not to decode AND encode by take the mp4/whatever stream
directly and shuffle it down the network pipe. :)

believe it or not TODAY tablets with 4k screens ship. you can buy them. they
are required to support things like miracast (mp4/h264 stream over wifi). it's
reality today. products shipping in the 100,000's and millions. :)
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) ***@rasterman.com
Drew DeVault
2016-03-29 12:11:03 UTC
Permalink
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
I don't really understand why forking from the compositor and bringing
along the fds really gives you much of a gain in terms of security. Can
why?
there is no way a process can access the socket with privs (even know the
extra protocol exists) unless it is executed by the compositor. the compositor
can do whatever it deems "necessary" to ensure it executes only what is
allowed. eg - a whitelist of binary paths. i see this as a lesser chance of a
hole.
I see what you're getting at now. We can get the pid of a wayland
client, though, and from that we can look at /proc/cmdline, from which
we can get the binary path. We can even look at /proc/exe and produce a
checksum of it, so that programs become untrusted as soon as they
change.
Post by Carsten Haitzler (The Rasterman)
i know - but for just capturing screencasts, adding watermarks etc. - all you
need is to store a stream - the rest can be post-processed.
Correct, if you record to a file, you can deal with it in post. But
there are other concerns, like what output format you'd like to use and
what encoding quality you want to use to consider factors like disk
space, cpu usage, etc. And there still is the live streaming use-case,
which we should support and which your solution does not address.
Post by Carsten Haitzler (The Rasterman)
why do we need the fullscreen shell? that was intended for environments where
apps are only ever fullscreen from memory. xdg shell has the ability for a
window to go fullscreen (or back to normal) this should do just fine. :) sure -
let's talk about this stuff - fullscreening etc.
I've been mixing up fullscreen-shell with that one thing in xdg-shell.
My bad.
Post by Carsten Haitzler (The Rasterman)
let's talk about the actual apps surfaces and where they go - not
configuration of outputs. :)
No, I mean, that's what I'm getting at. I don't want to talk about that
because it doesn't make sense outside of e. On Sway, the user is putting
their windows (fullscreen or otherwise) on whatever output they want
themselves. There aren't output roles. Outputs are just outputs and I
intend to keep it that way.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Troublemaking software is going to continue to make trouble. Further
news at 9. That doesn't really justify making trouble for users as well.
or just have the compositor "work" without needing scripts and users to have to
learn how to write them. :)
Never gonna happen, man. There's no way you can foresee and code for
everyone's needs. I'm catching on to this point you're heading towards,
though: e doesn't intend to suit everyone's needs.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
https://sr.ht/Ai5N.png
Most apps are fine with being told what resolution to be, and they
_need_ to be fine with this for the sake of my sanity. But I understand
that several applications have special concerns that would prevent this
but for THEIR sanity, they are not fine with it. :)
Nearly all toolkits are entirely fine with being any size, at least
above some sane minimum. A GUI that cannot deal with being a
user-specified size is a poorly written GUI.
Post by Carsten Haitzler (The Rasterman)
no. this has nothing to do with floating. this has to do with minimum and in
this case especially - maximum sizes. it has NOTHING to do with floating. you
are conflating sizing with floating because floating is how YOU HAPPEN to want
to deal with it.
Fair. Floating is how I would deal with it. But maybe I'm missing
something: where does the min/max size hints come from? All I seem to
know of is the surface geometry request, which isn't a hint so much as
it's something every single app does. If I didn't ignore it, all windows
would be fucky and the tiling layout wouldn't work at all. Is there some
other hint coming from somewhere I'm not aware of?
Post by Carsten Haitzler (The Rasterman)
you COULD deal with it as i described - pad out the area or
scale retaining aspect ratio - allow user to configure the response. if i had a
small calculator on the left and something that can size up on the right i
+---+------------+
| |............|
|:::|............|
|:::|............|
|:::|............|
| |............|
+---+------------+
Eh, this might be fine for a small number of windows, and maybe even is
the right answer for Sway. I'm worried about it happening for most
windows and I don't want to encourage people to make their applications
locked into one aspect ratio and unfriendly to tiling users.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
What I really want is _users_ to have control. I don't like it that
compositors are forcing solutions on them that doesn't allow them to be
in control of how their shit works.
they can patch their compositors if they want. if you are forcing users to
write scripts you are already forcing them to "learn to code" in a simple way.
would it not be best to try and make things work without needing scripts/custom
code per user and have features/modes/logic that "just work" ?
There's a huge difference between the skillset necessary to patch a
Wayland compositor to support scriptable output configuration and to
write a bash script that uses a tool the compositor shipped for this
purpose.
Post by Carsten Haitzler (The Rasterman)
*I* do not want adhoc panels/taskbars/tools written by separate projects within
my DE because they cause more problems than they solve. been there. done that.
not going back. i learned my lesson on that years ago. for them to work you have
pagers and taskbars in them to be fully functional and unless you ALSO then bind
all this metadata for the pagers, virtual desktops and their content to a
protocol that is also universal, then its rather pointless. this then ties your
desktop to a specific design of how desktops are (eg NxM grids and only ONE of
those in an entire environment. when with enlightenment each screen has an
independent NxM grid PER SCREEN that can be switched separately.
Again, the scope of this is not increasing ad hominum. I never brought
virtual desktops and pagers into the mix. There is a small number of
things that are clearly the compositor's responsibility and that small
list is the only things I want to manipulate with a protocol. Handling
screen capture hardly has room for innovation - there are pixels on
screen, they need to be given to ffmpeg et al. This isn't locking you
into some particular user-facing design choice in your DE.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
I'm not suggesting anything radical to try and cover all of these use
cases at once. Sway has a protocol that lets a surface indicate it wants
to be docked somewhere, which allows for custom taskbars and things like
dmenu and so on to exist pretty easily, and this protocol is how swaybar
happens to be implemented. This doesn't seem very radical to me, it
doesn't enforce anything on how each of the DEs choose to implement
their this and that.
then keep your protocol. :) i know i have no interest in supporting it - as
above. :)
Well, so be it.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
We've both used this same argument from each side multiple times, it's
There aren't necessarily enough people to work on the features I'm
proposing right now. I don't think anyone needs to implement this _right
now_. There also aren't ever enough people to give every little feature
of their DE the attention that leads to software that is as high quality
as a similar project with a single focus on that one feature.
that is true. :)
Interesting that this immediately follows up the last paragraph. If you
acknowledge that your implementation of desktop feature #27 can't
possibly be as flexible/configurable/usable/good as some project that's
entirely focused on just making that one feature great, then why would
you refuse to implement the required extensibility for your users to
bring the best tools available into your environment?
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
buy into a desktop environment wholesale. They may want to piece it
together however they see fit and it's their god damn right to. Anything
else is against the spirit of free software.
i disagree. i can't take linux and just use some bsd device drvier with it - oh
dear. that's against the spirit free software! i have to port it and
integrate it (as a kernel module). wayland is about making the things that HAVE
to be shared protocol just that. the things that don't absolutely have to be,
we don't. you are able to patch, modify and extend your de/wm, all you like -
most de's provide some way to do this. gnome today uses js. e uses loadable
modules. i am unsure about kde. :)
Sure, but you can use firefox and vim and urxvt while your friend
prefers termite and emacs and chromium, and your other friend uses gedit
and gnome-terminal and surf.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
In this case, I'm not seeing how your points about what order things
need to be done in matters. Now is the right time for me to implement
this in Sway. The major problems you're trying to solve are either
non-issues or solved issues on Sway, and it makes sense to do this now.
I'd like to do it in a way that works for everyone.
you need to solve clients that have a minx/max size without introducing the
need for a floating property. that is something entirely different. not solved.
You're right, I do have to solve this. But my project and its
contributors have the bandwidth to address this and the things I'm
bringing up at the same time.
Post by Carsten Haitzler (The Rasterman)
what happens when you need to restart sway after some development? where do all
your terminals/editors/ide's, browsers/irc clients go? they vanish and you have
to re-run them?
Most of my users aren't developers working on sway all the time. Sway
has an X backend like Weston, I use that to run nested sways for
development so I'm not restarting Sway all the time. The compositor
crashing without losing all of the clients is a pipe dream imo, I'm not
going to look into it for now.
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
aaah ok. so compositor adapts. then likely i would express this as a
"minimize your decorations" protocol from compositor to client, client to
compositor then responds similarly like "minimize your decorations" and
compositor MAY choose to not draw a shadow/titlebar etc. (or client
responds with "ok" and then compositor can draw all it likes around the
app).
I think Jonas is on the right track here. This sort of information could
go into xdg_*. It might not need an entire protocol to itself.
i'd lean on a revision of xdg :)
I might lean the other way now that I've seen that KDE has developed a
protocol for this. I think that would be a better starting point since
it's proven and already in use. Thoughts?
Post by Carsten Haitzler (The Rasterman)
... you might be surprised. 4k ones are already out there. ok . not 1.3ghz -
2ghz - but no way you can capture even 4k with the highest end arms unless you
avoid conversion. you keep things in yuv space and drop your bandwidth
requirements hugely. in fact you never leave yuv space and make use of the hw
layers and the video decoder decodes directly into scanout buffers. you MAY be
able to stuff the yuv buffers back into an encoder and re-encode again ... just.
but it'd be better not to decode AND encode by take the mp4/whatever stream
directly and shuffle it down the network pipe. :)
believe it or not TODAY tablets with 4k screens ship. you can buy them. they
are required to support things like miracast (mp4/h264 stream over wifi). it's
reality today. products shipping in the 100,000's and millions. :)
Eh, alright. So they'll exist soon. I feel like both strategies can
coexist, in that case. If you want to livestream your tablet, you'll
have a performance hit and it might just be unavoidable. If you just
want to record video, use the compositor's built in thingy. I'm okay
with unavoidable performance concerns in niche situations - most people
aren't going to be livestreaming from their tablet pretty much ever.
Most people aren't even going to be screen capturing on their tablet to
be honest. It goes back to crippling the common case for the sake of the
niche case.

--
Drew DeVault
Daniel Stone
2016-03-29 12:18:11 UTC
Permalink
Hi,
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
or just have the compositor "work" without needing scripts and users to have to
learn how to write them. :)
Never gonna happen, man. There's no way you can foresee and code for
everyone's needs. I'm catching on to this point you're heading towards,
though: e doesn't intend to suit everyone's needs.
If a compositor implementation can never be sufficient to express
peoples' needs, how could an arbitrary protocol be better? Same
complexity problem.

(And, as far as the 'but what if a compositor implementation isn't
good' argument goes - don't use bad compositors.)

Cheers,
Daniel
Pekka Paalanen
2016-03-29 13:44:32 UTC
Permalink
On Tue, 29 Mar 2016 08:11:03 -0400
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
Post by Drew DeVault
I don't really understand why forking from the compositor and bringing
along the fds really gives you much of a gain in terms of security. Can
why?
there is no way a process can access the socket with privs (even know the
extra protocol exists) unless it is executed by the compositor. the compositor
can do whatever it deems "necessary" to ensure it executes only what is
allowed. eg - a whitelist of binary paths. i see this as a lesser chance of a
hole.
I see what you're getting at now. We can get the pid of a wayland
client, though, and from that we can look at /proc/cmdline, from which
we can get the binary path. We can even look at /proc/exe and produce a
checksum of it, so that programs become untrusted as soon as they
change.
That means you have to recognize all interpreters, or you suddenly just
authorized all applications running with /usr/bin/python or such.

The PID -> /proc -> executable thing works only for a limited set of things.

However, forking in the compositor is secure against that. Assuming the
compositor knows what it wants to run, it creates a connection *before*
launching the app, and the app just inherits an already authorized
connection.

The general solution is likely with containers, as you said. That thing
I agree with.


Thanks,
pq
Carsten Haitzler (The Rasterman)
2016-03-30 04:35:12 UTC
Permalink
Post by Drew DeVault
what is allowed. eg - a whitelist of binary paths. i see this as a lesser
chance of a hole.
I see what you're getting at now. We can get the pid of a wayland
client, though, and from that we can look at /proc/cmdline, from which
we can get the binary path. We can even look at /proc/exe and produce a
checksum of it, so that programs become untrusted as soon as they
change.
you can do that... but there are race conditions. a pid can be recycled.
imagine some client just before it exits sends some protocol to request doing
something "restricted". maybe you even check on connect, but let's say this
child exits and you haven't gotten the disconnect on on the fd yet because there
is still data to read in the buffer. you get the pid while the process is
still there, then it happens to exit.. NOW you check /proc/PID ... but in
the mean time the PID was recycled with a new process that is "whitelisted"
so you check this new replacement /proc/PID/exe and find it's ok and ok the
request from the old dying client... BOOM. hole.

it'd be better to use something like smack labels - but this is not used
commonly in linux. you can check the smack label on the connection and auth by
that as smack label then can be in a db of "these guys are ok if they have
smack label "x"" and there is no race here. smack labels are like containers
and also affect all sorts of other access like to files, network etc.

but the generic solution without relying on smack would be to launch yourself -
socketpair + pass fd. :) it has the lowest chance of badness. this works if the
client is a regular native binary (c/c++) or if its a script because the fd
will happily pass on even if its a wrapper shell script that then runs a binary.
Post by Drew DeVault
i know - but for just capturing screencasts, adding watermarks etc. - all
you need is to store a stream - the rest can be post-processed.
Correct, if you record to a file, you can deal with it in post. But
there are other concerns, like what output format you'd like to use and
what encoding quality you want to use to consider factors like disk
space, cpu usage, etc. And there still is the live streaming use-case,
which we should support and which your solution does not address.
given high enough quality any post process can also transcode to another
format/codec/quality level while adding watermarks etc. a compositor able to
stream out (to a file or whatever) video would of course have options for
basics like quality/bitrate etc. - the codec libraries will want this info
anyway...
Post by Drew DeVault
let's talk about the actual apps surfaces and where they go - not
configuration of outputs. :)
No, I mean, that's what I'm getting at. I don't want to talk about that
because it doesn't make sense outside of e. On Sway, the user is putting
their windows (fullscreen or otherwise) on whatever output they want
themselves. There aren't output roles. Outputs are just outputs and I
intend to keep it that way.
enlightenment ALSO puts windows "on the current screen" by default and you can
move them to another screen, desktop etc. as you like. hell it has the ability
to remember screen, desktop, geometry, and all sorts of other state and
re-apply it to the same window when it appears again. i use this often myself
to force apps to do what i want when they keep messing up.. i'm not talking
about manually moving things or the ability for a compositor/wm to override and
enforce its will.

i am talking about situations where you want things to "just work" out of the
box as they might be intended to without forcing the user to go manually say
"hey no - i want this". i'm talking about a situation like
powerpoint/impress/whatever where when i give a presentation on ONE screen i
have a smaller version of the slide, i also have the preview of the next slide,
a count-down timer for the slide talk, etc. and on the "presentation screen" i
get the actual full presentation. I should not have to "manually configure
this". impress/ppts/whatever should be able to open up 2 windows and
appropriately tag them for their purposes and the compositor then KNOWS which
screen they should go onto.

impress etc. also need to know that a presentation screen exists so it knows to
open up a special "presentation window" and a "control window" vs just a
presentation window. these windows are of course fullscreen ones - i think we
don't disagree there.

the same might go for games - imagine a nintento DS setup. game has a control
window (on the bottom screen) and a "game window" on the top. similar to
impress presentation vs control windows. imagine a laptop with 2 screens. one
in the normal place and one where your keyboard would be... similar to the DS.
maybe we can talk flight simulators which may want to span 3 monitors
(left/middle/right), due to different screens able to do different refresh
rates etc. you really likely want to have 3 windows (surfaces) with each
fullscreen on each monitor. how do we advertise to games that such a setup
exists and how would they request to lay out their left/middle/right windows
correctly.

what about when i have a phone plugged into a dock. it has 2 external hdmi "big
screens" and an internal phone screen. the internal should really behave in a
mobile-way where externals would be desktop-like. maybe an app (like
libreoffice) is not usable on a tiny screen. it should be able to say "my
window is only useful in desktop mode" or something. so when i run it - it
turns up on the appropriate screen. when the dialler app that handles phone
calls gets an incoming call.. when it opens its window you likely want it ON
the mobile display, not desktop... etc.

i am just going on to give examples of how window metadata might be used to
have things go to the right place out of the box. if your wm/compositor allows
you to manually override then sure - it can say no and place the window where
it wants. it may HAVE to at times.
Post by Drew DeVault
or just have the compositor "work" without needing scripts and users to
have to learn how to write them. :)
Never gonna happen, man. There's no way you can foresee and code for
everyone's needs. I'm catching on to this point you're heading towards,
though: e doesn't intend to suit everyone's needs.
just improve the compositor then. that's what software development is about.
Post by Drew DeVault
Post by Drew DeVault
https://sr.ht/Ai5N.png
Most apps are fine with being told what resolution to be, and they
_need_ to be fine with this for the sake of my sanity. But I understand
that several applications have special concerns that would prevent this
but for THEIR sanity, they are not fine with it. :)
Nearly all toolkits are entirely fine with being any size, at least
above some sane minimum. A GUI that cannot deal with being a
user-specified size is a poorly written GUI.
it has nothing to do with the toolkit but with the app's window content. a
toolkit may be rendering/arranging it but the app has given you information
that the content is not useful below some size or above some size. if you want
to ignore this - then fine, but don't complain of the consequences and think
the solution is a floating hint. it is not. it's your bug in not respecting
these limitations a client has given you. :) it is your choice. :)
Post by Drew DeVault
no. this has nothing to do with floating. this has to do with minimum and in
this case especially - maximum sizes. it has NOTHING to do with floating.
you are conflating sizing with floating because floating is how YOU HAPPEN
to want to deal with it.
Fair. Floating is how I would deal with it. But maybe I'm missing
something: where does the min/max size hints come from? All I seem to
know of is the surface geometry request, which isn't a hint so much as
it's something every single app does. If I didn't ignore it, all windows
would be fucky and the tiling layout wouldn't work at all. Is there some
other hint coming from somewhere I'm not aware of?
in x11 there are explicit min/max hints . not so in wayland - not that i saw
last time i looked. what is done is they may request surface geom. you may
respond by setting the surface to that geometry or some other. the app now
responds with a BUFFER rendered at NxM pixels. it may NOt match the geom you
set. this is basically the app disagreeing on your choice of geometry and
refusing to provide the geometry you asked for. this is the app giving you a
limit - you went beyond it and this buffer size is what the app can do.

it MAY be useful for apps to provide such hints though in xdg shell. it means a
compositor knows AHEAD of time what these limits are before it hits one. x11
also supported aspect ratio hints too. this was a bit tricky to get right (also
base size and size stepping - eg for terminals). some of this may be good to
bring to wayland, some not.
Post by Drew DeVault
you COULD deal with it as i described - pad out the area or
scale retaining aspect ratio - allow user to configure the response. if i
had a small calculator on the left and something that can size up on the
+---+------------+
| |............|
|:::|............|
|:::|............|
|:::|............|
| |............|
+---+------------+
Eh, this might be fine for a small number of windows, and maybe even is
the right answer for Sway. I'm worried about it happening for most
windows and I don't want to encourage people to make their applications
locked into one aspect ratio and unfriendly to tiling users.
MOST windows will have a minimum size, SOME will have a maximum size. that's
reality of things normally. often non-resizable dialog windows will have min
and max set to the same. i wouldn't worry about this as it is out of your
control- clients will decide. most will be resizable up and down to make you
happy. some will not. if you can deal nicely with "some" then your problems
will be solved.

and floating is another matter entirely. :)
Post by Drew DeVault
they can patch their compositors if they want. if you are forcing users to
write scripts you are already forcing them to "learn to code" in a simple
way. would it not be best to try and make things work without needing
scripts/custom code per user and have features/modes/logic that "just
work" ?
There's a huge difference between the skillset necessary to patch a
Wayland compositor to support scriptable output configuration and to
write a bash script that uses a tool the compositor shipped for this
purpose.
sure but 99% of users can't even manage a script. the 1% left can do scripting.
yes indeed 0.001% could patch the code. but the 99% are still out of luck
unless the compowitor itself does things "nicely" and provides nice little
"checkboxes and sliders" in a gui to set it up (even that is scary for 90% of
people). be aware when i am saying people - i mean general population, not linux
geeks/nerds.
Post by Drew DeVault
*I* do not want adhoc panels/taskbars/tools written by separate projects
within my DE because they cause more problems than they solve. been there.
done that. not going back. i learned my lesson on that years ago. for them
to work you have pagers and taskbars in them to be fully functional and
unless you ALSO then bind all this metadata for the pagers, virtual
desktops and their content to a protocol that is also universal, then its
rather pointless. this then ties your desktop to a specific design of how
desktops are (eg NxM grids and only ONE of those in an entire environment.
when with enlightenment each screen has an independent NxM grid PER SCREEN
that can be switched separately.
Again, the scope of this is not increasing ad hominum. I never brought
virtual desktops and pagers into the mix. There is a small number of
things that are clearly the compositor's responsibility and that small
list is the only things I want to manipulate with a protocol. Handling
screen capture hardly has room for innovation - there are pixels on
screen, they need to be given to ffmpeg et al. This isn't locking you
into some particular user-facing design choice in your DE.
the point of these dmenus/panels is to contain such controls - it happens that
dmenu does not do this but most instances do. the intent of these is to act as
non-integrated parts of a desktop. they function as a desktop component - eg ar
always there from login.
Post by Drew DeVault
Post by Drew DeVault
I'm not suggesting anything radical to try and cover all of these use
cases at once. Sway has a protocol that lets a surface indicate it wants
to be docked somewhere, which allows for custom taskbars and things like
dmenu and so on to exist pretty easily, and this protocol is how swaybar
happens to be implemented. This doesn't seem very radical to me, it
doesn't enforce anything on how each of the DEs choose to implement
their this and that.
then keep your protocol. :) i know i have no interest in supporting it - as
above. :)
Well, so be it.
Post by Drew DeVault
We've both used this same argument from each side multiple times, it's
There aren't necessarily enough people to work on the features I'm
proposing right now. I don't think anyone needs to implement this _right
now_. There also aren't ever enough people to give every little feature
of their DE the attention that leads to software that is as high quality
as a similar project with a single focus on that one feature.
that is true. :)
Interesting that this immediately follows up the last paragraph. If you
acknowledge that your implementation of desktop feature #27 can't
possibly be as flexible/configurable/usable/good as some project that's
entirely focused on just making that one feature great, then why would
you refuse to implement the required extensibility for your users to
bring the best tools available into your environment?
because i have implemented extensibility many times over in the past 20 years.
i've come to the conclusion that they create a poor user experience with
loosely integrated components that either look ugly, don't work like the rest of
the de or do horrible hacks that then create trouble. what does work well is
tight integration. the manpower we have we have i'd RATHER devote to making
things better out of the box and having features than just saying "bah - we
give up and hope someone else will do it". every time i have done this, it has
lead to sub-optimal or poor results. you give up solving a problem and instead
then rely on 3rd party tools that don't look right, or function well, or
integrate or then don't support things YOU want to do later on (eg like the
per-screen profiles in screen output config).

maybe YOU want to do it that way - fine. that's your choice, but most other
DE's are integrated. They work on/provide their own tools and code and logic. :)
Post by Drew DeVault
i disagree. i can't take linux and just use some bsd device drvier with it
- oh dear. that's against the spirit free software! i have to port it and
integrate it (as a kernel module). wayland is about making the things that
HAVE to be shared protocol just that. the things that don't absolutely have
to be, we don't. you are able to patch, modify and extend your de/wm, all
you like - most de's provide some way to do this. gnome today uses js. e
uses loadable modules. i am unsure about kde. :)
Sure, but you can use firefox and vim and urxvt while your friend
prefers termite and emacs and chromium, and your other friend uses gedit
and gnome-terminal and surf.
big difference - "apps" vs "desktop". of course this line is a grey area. i
consider the line at shelves/panels/filemanager/settings for desktop and
system/desktop bg/wallpaper/config tools/virtual keyboards/wm+compositor those
are on the desktop side. browser, terminals, editors are firmly in "apps" land.
it may be that your de of choice provides apps that work with the
look/feel/philosophy/toolkit of your de - but they are separate. that is where
i draw the line.
Post by Drew DeVault
what happens when you need to restart sway after some development? where do
all your terminals/editors/ide's, browsers/irc clients go? they vanish and
you have to re-run them?
Most of my users aren't developers working on sway all the time. Sway
has an X backend like Weston, I use that to run nested sways for
development so I'm not restarting Sway all the time. The compositor
crashing without losing all of the clients is a pipe dream imo, I'm not
going to look into it for now.
then you are relying on x to do development, you can never get rid of x11 -
ever then...

i don';t see it as a pipe dream. all you need is the ability to recognize a
client and its surfaces from a previous connection and have clients reconnect
and provide whatever information is necessary to restore that state (eg an id
of some sort).
Post by Drew DeVault
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
aaah ok. so compositor adapts. then likely i would express this as a
"minimize your decorations" protocol from compositor to client, client
to compositor then responds similarly like "minimize your decorations"
and compositor MAY choose to not draw a shadow/titlebar etc. (or client
responds with "ok" and then compositor can draw all it likes around the
app).
I think Jonas is on the right track here. This sort of information could
go into xdg_*. It might not need an entire protocol to itself.
i'd lean on a revision of xdg :)
I might lean the other way now that I've seen that KDE has developed a
protocol for this. I think that would be a better starting point since
it's proven and already in use. Thoughts?
if you plan on it becoming universal - plan for xdg. if you want to keep it
private or experiment locally- make it a separate protocol.
Post by Drew DeVault
... you might be surprised. 4k ones are already out there. ok . not 1.3ghz -
2ghz - but no way you can capture even 4k with the highest end arms unless
you avoid conversion. you keep things in yuv space and drop your bandwidth
requirements hugely. in fact you never leave yuv space and make use of the
hw layers and the video decoder decodes directly into scanout buffers. you
MAY be able to stuff the yuv buffers back into an encoder and re-encode
again ... just. but it'd be better not to decode AND encode by take the
mp4/whatever stream directly and shuffle it down the network pipe. :)
believe it or not TODAY tablets with 4k screens ship. you can buy them. they
are required to support things like miracast (mp4/h264 stream over wifi).
it's reality today. products shipping in the 100,000's and millions. :)
Eh, alright. So they'll exist soon. I feel like both strategies can
coexist, in that case. If you want to livestream your tablet, you'll
have a performance hit and it might just be unavoidable. If you just
want to record video, use the compositor's built in thingy. I'm okay
with unavoidable performance concerns in niche situations - most people
aren't going to be livestreaming from their tablet pretty much ever.
Most people aren't even going to be screen capturing on their tablet to
be honest. It goes back to crippling the common case for the sake of the
niche case.
it's a performance hit for EVERYONE if you do un-needed transforms (scaling,
colorspace conversion etc.). ;)
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) ***@rasterman.com
Simon McVittie
2016-03-31 11:20:27 UTC
Permalink
Post by Drew DeVault
I see what you're getting at now. We can get the pid of a wayland
client, though, and from that we can look at /proc/cmdline, from which
we can get the binary path.
This line of thinking is a trap: rummaging in /proc/$pid is not suitable
for use as a security mechanism. If a client can queue up a malicious
privileged action (stuff it into the socket's send-buffer), and then
race with the compositor to exec() something that would legitimately be
allowed to take that action before the request is processed, then you lose.

See <https://bugs.freedesktop.org/show_bug.cgi?id=83499> for details of
the equivalent in D-Bus. Mainline dbus on Unix was never vulnerable to
this (because we use credentials-passing to get the uid), but there used
to be an out-of-tree LSM integration patch set (for Maemo) that was.
(That bug was about documenting the attack so that we never accidentally
introduce it.)

If you want to map processes to executable-based privilege domains in a
way that cannot be faked, you will have to use their LSM labels
(SELinux, Smack or other xattr-based labelling, or AppArmor or other
path-based labelling) which are specifically designed to do this. A
Wayland equivalent of D-Bus' GetConnectionCredentials() would probably
be useful.

S
--
Simon McVittie
Collabora Ltd. <http://www.collabora.com/>
Daniel Stone
2016-03-29 10:45:08 UTC
Permalink
Hi,
Post by Drew DeVault
You don't provide any justification for this, you just say it like it's
gospel, and it's not. I will again remind you that not everyone wants to
buy into a desktop environment wholesale. They may want to piece it
together however they see fit and it's their god damn right to. Anything
else is against the spirit of free software.
I only have a couple of things to add, since this thread is so long,
so diverse, and so shouty, that it's long past the point of
usefulness.

Firstly, https://www.redhat.com/archives/fedora-devel-list/2008-January/msg00861.html
is a cliché, but the spirit of free software is empowering people to
make the change they want to see, rather than requiring the entire
world be perfectly isolated and abstracted along inter-module
boundaries, freely mix-and-matchable.

Secondly, you talk about introducing all these concepts and protocols
as avoiding complexity. Nothing could be further from the case. That
X11 emulates this model means that it has Xinerama, XRandR,
XF86VidMode, the ICCCM, and NetWM/EWMH, as well as all the various
core protocols. You're not avoiding complexity, but simultaneously
shifting and avoiding it. You're not avoiding policy to create
mechanism; the structure and design of the mechanism is a policy in
itself.

Thirdly, it's important to take a step back. 'Wayland doesn't support
middle-button primary selections' is a feature gap compared to X11;
'Wayland doesn't have XRandR' is not. Sometimes it seems like you miss
the forest of user-visible behaviour for the trees of creating
protocol.

Fourthly, I think you misunderstand the role of what we do. If you
want to design and deploy a modular framework for Legoing your own
environment together, by all means, please do that. Give it a go, see
what falls out, see if people creating arbitrary external panels and
so find it useful, and then see if you can convince the others to
adopt it. But this isn't really the place for top-down design where we
dictate how all environments based on Wayland shall behave.

I don't really hold out hope for this thread, but would be happy to
pick up separate threads on various topics, e.g. screen
capture/streaming to external apps.

Cheers,
Daniel
Drew DeVault
2016-03-29 12:24:03 UTC
Permalink
Post by Daniel Stone
Firstly, https://www.redhat.com/archives/fedora-devel-list/2008-January/msg00861.html
is a cliché, but the spirit of free software is empowering people to
make the change they want to see, rather than requiring the entire
world be perfectly isolated and abstracted along inter-module
boundaries, freely mix-and-matchable.
I should rephrase: it's against the spirit of Unix. Simple, composable
tools, that Do One Thing And Do It Well, is the Unix way. Our desktop
environments needn't and shouldn't be much different.
Post by Daniel Stone
Secondly, you talk about introducing all these concepts and protocols
as avoiding complexity. Nothing could be further from the case. That
X11 emulates this model means that it has Xinerama, XRandR,
XF86VidMode, the ICCCM, and NetWM/EWMH, as well as all the various
core protocols. You're not avoiding complexity, but simultaneously
shifting and avoiding it. You're not avoiding policy to create
mechanism; the structure and design of the mechanism is a policy in
itself.
I disagree. I think this is just a fundamental difference of opinion.
Post by Daniel Stone
Thirdly, it's important to take a step back. 'Wayland doesn't support
middle-button primary selections' is a feature gap compared to X11;
'Wayland doesn't have XRandR' is not. Sometimes it seems like you miss
the forest of user-visible behaviour for the trees of creating
protocol.
I think you're missing what users are actually using. You'd be surprised
at how many power users are comfortable working with tools like xrandr
and scripting their environments. This is about more than just
xrandr-like support, too. There's definitely a forest of people using
screen capture for live streaming, for instance.
Post by Daniel Stone
Fourthly, I think you misunderstand the role of what we do. If you
want to design and deploy a modular framework for Legoing your own
environment together, by all means, please do that. Give it a go, see
what falls out, see if people creating arbitrary external panels and
so find it useful, and then see if you can convince the others to
adopt it. But this isn't really the place for top-down design where we
dictate how all environments based on Wayland shall behave.
I've already seen this. It's been around for a long time. I don't know
if you live in a "desktop environment bubble", but there's a LOT of this
already in practice in the lightweight WM world. Many, many users, are
using software like i3 and xmonad and herbstluftwm and openbox and so on
with composable desktop tools like dmenu and i3bar and lemonbar and so
on _today_. This isn't some radical experiment in making a composable
desktop. It's already a well proven idea, and it works great. I would
guess that the sum of people who are using a desktop like this
perhaps outnumbers the total users of, say, enlightenment. I'm just
bringing the needs of this group forward.

Some of your email is just griping about the long life of this thread,
and you're right. I think I've got most of what I wanted from this
thread, I'm going to start proposing some protocols in new threads next.

--
Drew DeVault
Daniel Stone
2016-03-29 13:22:34 UTC
Permalink
Hi,
Post by Drew DeVault
Post by Daniel Stone
Firstly, https://www.redhat.com/archives/fedora-devel-list/2008-January/msg00861.html
is a cliché, but the spirit of free software is empowering people to
make the change they want to see, rather than requiring the entire
world be perfectly isolated and abstracted along inter-module
boundaries, freely mix-and-matchable.
I should rephrase: it's against the spirit of Unix. Simple, composable
tools, that Do One Thing And Do It Well, is the Unix way. Our desktop
environments needn't and shouldn't be much different.
And yet the existence and dominant popularity of large integrated
environments (historically beginning with Emacs) suggests that the
pithy summary is either wrong, or no longer applicable. Ditto the
relative successes of Plan 9 and microkernels compared to other OSes.
Post by Drew DeVault
Post by Daniel Stone
Secondly, you talk about introducing all these concepts and protocols
as avoiding complexity. Nothing could be further from the case. That
X11 emulates this model means that it has Xinerama, XRandR,
XF86VidMode, the ICCCM, and NetWM/EWMH, as well as all the various
core protocols. You're not avoiding complexity, but simultaneously
shifting and avoiding it. You're not avoiding policy to create
mechanism; the structure and design of the mechanism is a policy in
itself.
I disagree. I think this is just a fundamental difference of opinion.
I really do not see how you can look at ICCM/EWMH and declare it to be
a victory for simplicity, and ease of implementation.
Post by Drew DeVault
Post by Daniel Stone
Thirdly, it's important to take a step back. 'Wayland doesn't support
middle-button primary selections' is a feature gap compared to X11;
'Wayland doesn't have XRandR' is not. Sometimes it seems like you miss
the forest of user-visible behaviour for the trees of creating
protocol.
I think you're missing what users are actually using. You'd be surprised
at how many power users are comfortable working with tools like xrandr
and scripting their environments. This is about more than just
xrandr-like support, too. There's definitely a forest of people using
screen capture for live streaming, for instance.
Yes, screen capture is vital to have.

Providing some of the functionality (application fullscreening,
including to potentially different sizes/modes than are currently set;
user display control) that RandR does, is also vital. Providing an
exact clone of XRandR ('let's provide one protocol that allows any
arbitrary application to do what it likes'), much less so.

I also posit that anyone suggesting that providing the full XRandR
suite to arbitrary users makes implementation more simple, has never
been on the sharp end of that implementation.
Post by Drew DeVault
Post by Daniel Stone
Fourthly, I think you misunderstand the role of what we do. If you
want to design and deploy a modular framework for Legoing your own
environment together, by all means, please do that. Give it a go, see
what falls out, see if people creating arbitrary external panels and
so find it useful, and then see if you can convince the others to
adopt it. But this isn't really the place for top-down design where we
dictate how all environments based on Wayland shall behave.
I've already seen this. It's been around for a long time. I don't know
if you live in a "desktop environment bubble", but there's a LOT of this
already in practice in the lightweight WM world. Many, many users, are
using software like i3 and xmonad and herbstluftwm and openbox and so on
with composable desktop tools like dmenu and i3bar and lemonbar and so
on _today_.
Yes I know, as a former long-term Awesome/OpenBox/etc etc etc etc etc user.
Post by Drew DeVault
This isn't some radical experiment in making a composable
desktop. It's already a well proven idea, and it works great.
Again, I don't know in what parallel universe ICCCM+EWMH are 'great', but OK.
Post by Drew DeVault
I would
guess that the sum of people who are using a desktop like this
perhaps outnumbers the total users of, say, enlightenment. I'm just
bringing the needs of this group forward.
I would suggest the total number of users of these 'power'
environments allowing full flexibility and arbitrary external control
(but still via entirely standardised protocols), is several orders of
magnitude than the combined total of Unity, GNOME and KDE, but I don't
think this thread really needs any more value judgements.

My point is that there is no solution for this existing _on Wayland_
today, something which I would've thought to be pretty inarguable,
since that's what this entire thread is ostensibly about. I know full
well that this exists on X11, and that there are users of the same,
but again, you are talking about creating the same functionality as a
generic Wayland protocol, so it's pretty obvious that it doesn't exist
today.

What I was trying to get at, before this devolved into angrily trying
to create division based on preference was - well, look at the EWMH
author list here:
https://specifications.freedesktop.org/wm-spec/wm-spec-latest.html#idm140200472428352

How many of those people are core X11 developers?

The EWMH evolved from a group of desktop developers who banded
together around common needs, and in large part standardised the
support they already had for composed environments, and also built on
the existing standard of the ICCCM. In this case, there is no relevant
ICCCM to build on, and you're attempting to reverse the EWMH process:
to build a top-down protocol and enforce it as 'this is how Wayland
works and everyone will love it', rather than building something up
which works for multiple implementations and attempting to share that
a bit more widely. For bonus points, this entire thread has already
been more pointlessly adverserial than the entire EWMH process.

Trying to do this under the general Wayland umbrella won't really fly.
xdg_shell was essentially developed as a separate project, by the
people who were very much involved in desktop development, and you'd
need to do the same for the various ideas of yours which aren't
strictly core Wayland, and build upwards from there.
Post by Drew DeVault
Some of your email is just griping about the long life of this thread,
and you're right. I think I've got most of what I wanted from this
thread, I'm going to start proposing some protocols in new threads next.
\o/

Cheers,
Daniel
Jasper St. Pierre
2016-03-30 07:06:26 UTC
Permalink
Post by Drew DeVault
Post by Daniel Stone
Thirdly, it's important to take a step back. 'Wayland doesn't support
middle-button primary selections' is a feature gap compared to X11;
'Wayland doesn't have XRandR' is not. Sometimes it seems like you miss
the forest of user-visible behaviour for the trees of creating
protocol.
I think you're missing what users are actually using. You'd be surprised
at how many power users are comfortable working with tools like xrandr
and scripting their environments.
I've removed myself from the protocol talk so far, but I have to call
this one out. XRandR might be one of the most unfortunate APIs I have
ever dealt with, on both sides of the equation.

* It deals with "outputs" by "index", implying that outputs are static
and ordered. This is not the case in today's equipment with laptop
lids and docks and tons of ports.

* There's *no* way to specify whether something is a temporary display
configuration or should be saved. I plug and unplug external monitors
on my laptop every day, but I don't want a second output to always
behave the same way. Projectors should be in mirror mode. So already
you have multiple configurations, keyed by EDIDs.

* The authors wanted to make hotplug work even when nothing was poking
XRandR, but this just meant that desktops that do store complex
configuration had to wait until XRandR auto-reconfigured before saying
"no, bad computer" and overwriting any configuration it wanted. Two
mode-sets for the price of one.

* The command-line tool made it easy for users to poke the X server
directly, bypassing the DE entirely, leading to cases where the
Settings panel showed bizarre, inconsistent results because the
intended configuration wasn't updated for the manual changes the user
made.

* In some cases, like touchscreens, you *need* input to be mapped to
screen rotation and orientation. Input mapping was half-bolted onto
XInput and XRandR as an after-thought.

* Games which wanted to change the resolution often did it through
XRandR. These rarely worked if users had a complex configuration that
used rotated outputs, etc., or even just had more than one monitor,
leaving users with broken configurations. If the game crashed, users
were stuck with a permanently small screen.

* Similarly to the above, applications which want to react to
resolution changes (e.g. a window manager which wants to resize
windows, or a desktop that wants to reorder desktop icons) is unaware
if such a change is temporary or permanent. The result is that all
your desktop icons got put in a 640x480 box after you launched a game.

* Not to mention that the only event you get out of XRandR is an
all-encompassing "quick! something changed!!" event, which doesn't
even tell if you if it was simply accepting that the configuration you
just made went through successfully, whether it was an auto-configure
from a hotplug, or whether it was some other program poking the API.

* A partial repeat of the above, XRandR was intended for a low-level
"mechanism, not policy" API, but quickly got policy bolted on
after-the-fact because users weren't running tools which actually
supplied the policy. I am very skeptical of users who try to
lego-brick their way through DEs because "it's all bloat, I don't
really need a window manager, I can just skirt along with raw X11"
(because we committed ourselves to making it half-work) and I don't
want to encourage this behavior in Wayland. Let's do it right and
mandate real policy.

(This also doesn't even touch the incredible unfortunate hacks [0] we
have had to do at Endless to support HDMI TVs that need underscan
support, which work by changing the mode-list at runtime based on a
configurable border... which is specified in the mode's XSkew field,
because we didn't have any better place to put it)

We can talk about independent protocols and APIs for each of these use
cases (with no guarantee that Wayland is the IPC mechanism at hand),
but let's not bolt on a "wl_randr" that doesn't even begin to solve
the basic problems at hand because users run xrandr today and we have
to support that use case

[0] https://github.com/endlessm/xf86-video-intel/commit/391771f1652477863ece6da90b81dddb3ecb148a
Post by Drew DeVault
--
Drew DeVault
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
--
Jasper
Drew DeVault
2016-03-30 23:16:17 UTC
Permalink
Simply because xrandr was/is a poorly implemented mess doesn't mean that
we are going to end up making a poorly implemented mess. We have the
benefit of hindsight. After all, xorg is a poorly implemented mess but
we still made Wayland, didn't we? (Though some could argue that we've
just ended up with a well implemented mess...)

--
Drew DeVault
Daniel Stone
2016-03-31 09:20:17 UTC
Permalink
Hi,
Post by Drew DeVault
Simply because xrandr was/is a poorly implemented mess doesn't mean that
we are going to end up making a poorly implemented mess. We have the
benefit of hindsight. After all, xorg is a poorly implemented mess but
we still made Wayland, didn't we? (Though some could argue that we've
just ended up with a well implemented mess...)
X and Wayland protocols have very different design principles guiding
them. X (often by necessity) exposes as much as possible of its
internal workings to clients, and allows total external manipulation.
That's not the case for Wayland, so what you're proposing is a
significant departure.

Cheers,
Daniel
Pekka Paalanen
2016-03-29 11:24:21 UTC
Permalink
On Tue, 29 Mar 2016 00:01:00 -0400
Post by Drew DeVault
Post by Carsten Haitzler (The Rasterman)
my take on it is that it's premature and not needed at this point. in fact i
wouldn't implement a protocol at all. *IF* i were to allow special access, i'd
simply require to fork the process directly from compositor and provide a
socketpair fd to this process and THAT fd could have extra capabilities
attached to the wl protocol. i would do nothing else because as a compositor i
cannot be sure what i am executing. i'd hand over the choice of being able to
execute this tool to the user to say ok to and not just blindly execute
anything i like.
I don't really understand why forking from the compositor and bringing
along the fds really gives you much of a gain in terms of security. Can
you elaborate on how this changes things? I should also mention that I
don't really see the sort of security goals Wayland has in mind as
attainable until we start doing things like containerizing applications,
in which case we can elimitate entire classes of problems from this
design.
I'm snipping out a lot of the output configuration related stuff from
this response. I'm not going to argue very hard for a common output
configuration protocol. I've been trying to change gears on the output
discussion towards a discussion around whether or not the
fullscreen-shell protocol supports our needs and whether or how it needs
to be updated wrt permissions.
I sense there is a misunderstanding here, that I want to correct.

The fullscreen-shell protocol is completely irrelevant here. It has
been designed to be mutually exclusive to a desktop protocol suite.

The original goal for the fullscreen-shell is to be able to use a
ready-made compositor, e.g. Weston in particular, as a hardware
abstraction layer for a single application. We of course have some demo
programs to use it so we can test it.

That single application would often be a DE compositor, perhaps a small
project which does not want to deal with all the KMS and other APIs but
concentrate on making a good DE at the expense of the slight overhead
that using a middle-man compositor brings.

Now that we have decided that libweston is a good idea, I would assume
this use case may disappear eventually.

There are also no permission issues wrt. to the fullscreen shell
protocol. The compositor exposing the fullscreen shell interface expects
only a single client ever, or works a bit like the VTs in that only a
single client can be active at a time. Ordinarily you set up the
application such, that the parent compositor is launched as part of the
app launch, and nothing else can even connect to the parent compositor.

Fullscreening windows on a desktop has absolutely nothing to do with
the fullscreen shell. Fullscreen shell is not available on compositors
configured for desktop. This is how it was designed and meant to be.


Thanks,
pq
Drew DeVault
2016-03-29 12:16:01 UTC
Permalink
This a mistake on my part. I mixed up the two protocols, I don't intend
to make any changes to fullscreen-shell. Sorry for the confusion.
Giulio Camuffo
2016-03-28 06:08:55 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
On this, see https://lists.freedesktop.org/archives/wayland-devel/2015-November/025734.html
I have not been able to continue on that, but if you want to feel free
to grab that proposal.


Cheers,
Giulio
Post by Drew DeVault
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
--
Drew DeVault
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Drew DeVault
2016-03-28 13:04:48 UTC
Permalink
Post by Giulio Camuffo
On this, see https://lists.freedesktop.org/archives/wayland-devel/2015-November/025734.html
I have not been able to continue on that, but if you want to feel free
to grab that proposal.
I looked through this protocol and it seems like it's a good start. We
should base our work on this.

--
Drew DeVault
Martin Peres
2016-03-28 20:04:04 UTC
Permalink
Post by Drew DeVault
Post by Giulio Camuffo
On this, see https://lists.freedesktop.org/archives/wayland-devel/2015-November/025734.html
I have not been able to continue on that, but if you want to feel free
to grab that proposal.
I looked through this protocol and it seems like it's a good start. We
should base our work on this.
Being able to send accept/deny messages to the clients asynchronously
really will be needed for making good UIs.

We need to be able to revoke rights and add them on the fly.

Martin
Pekka Paalanen
2016-03-29 09:17:55 UTC
Permalink
On Mon, 28 Mar 2016 09:08:55 +0300
Post by Giulio Camuffo
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
On this, see https://lists.freedesktop.org/archives/wayland-devel/2015-November/025734.html
I have not been able to continue on that, but if you want to feel free
to grab that proposal.
Hi,

I may have had negative opinions related to some things on Giulio's
proposal, but I have changed my mind since then. I'd be happy to see it
developed further, understanding that it does not aim to solve the
question of authentication but only communicating the authorization,
for now.


Thanks,
pq
Peter Hutterer
2016-03-28 23:23:13 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
a comment on the last point: input device configuration is either extremely
simple ("I want tapping enabled") or complex ("This device needs feature A
when condition B is met"). There is very little middle ground.

as a result, you either have some generic protocol that won't meet the niche
cases or you have a complex protocol that covers all the niche cases but
ends up being just a shim between the underlying implementation and the
compositor. Such a layer provides very little benefit but restricts what the
compositor can add in the future. It's not a good idea, imo.

Cheers,
Peter
Post by Drew DeVault
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
Martin Graesslin
2016-03-29 08:15:56 UTC
Permalink
Post by Peter Hutterer
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
a comment on the last point: input device configuration is either extremely
simple ("I want tapping enabled") or complex ("This device needs feature A
when condition B is met"). There is very little middle ground.
as a result, you either have some generic protocol that won't meet the niche
cases or you have a complex protocol that covers all the niche cases but
ends up being just a shim between the underlying implementation and the
compositor. Such a layer provides very little benefit but restricts what
the compositor can add in the future. It's not a good idea, imo.
I agree. I think that's something best left to the respective compositor
specific configuration modules.

Cheers
Martin
Jonas Ådahl
2016-03-29 02:30:15 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
I'm just going to put down my own personal thoughts on these. I mostly
agree with Carsten on all of this. In general, my opinion is that it is
completely pointless to add Wayland protocols for things that has
nothing to do with Wayland what so ever; we have other display protocol
agnostic methods for that that fits much better.

As a rule of thumb, whether a feature needs a Wayland protocol or not,
one can consider whether a client needs to reference a client side
object (such as a surface) on the server. If it needs it, we should add
a Wayland protocol; otherwise not. Another way of seeing it would be
"could this be shared between Wayland/X11/Mir/... then don't do it in
any of those".
Post by Drew DeVault
- Screen capture
Why would this ever be a Wayland protocol? If a client needs to capture
its own content it doesn't need to ask the compositor; otherwise it's
the job of the compositor. If there needs to be a complex pipeline setup
that adds sub titles, muxing, sound effects and what not, we should make
use of existing projects that intend to create inter-process video
pipelines (pinos[0] for example).

FWIW, I believe remote desktop/screen sharing support partly falls under
this category as well, with the exception that it may need input event
injection as well (which of course shouldn't be a Wayland protocol).

As a side note, for GNOME, I have been working on a org.gnome prefixed
D-Bus protocol for remote desktop that enables the actual remote desktop
things to be implemented in a separate process by providing pinos
streams, and I believe that at some point it would be good to have a
org.freedesktop.* (or equivalent) protocol doing that in a more desktop
agnostic way. Such a protocol could just as well be read-only, and
passed to something like ffmpeg (maybe can even pipe it from gst-launch
directly to ffmpeg if you so wish) in order to do screen recording.
Post by Drew DeVault
- Output configuration
This has nothing to do with Wayland as well. If there is any need for
various compositors to support third party output configuration, at
least make it display protocol agnostic (D-Bus) so that it the doesn't
have to be one implementation in each layer for each display protocol
when there is actually no point of doing so.
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
Sounds like a job for a xdg_* protocol. However, I think we need to
first settle on a bare minimum protocol, in order to be able to
stabalize anything. This bare minimum protocol needs allows for
extendability making it possible to add things like negotiating how
decorations are drawn etc. The idea is that xdg_shell v6 will allow
this.

Of course we can add xdg_shell extensions already though (i.e. stand
alone protocol extension that extends xdg_shell).
Post by Drew DeVault
- Input device configuration
Same as output configuration. There simply is no valid reason for adding
a Wayland protocol for it.
Post by Drew DeVault
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
I don't think we should start writing Wayland protocols for things that
has nothing to do with Wayland, only because the program where it is
going to be implemented in may already may be doing Wayland things.
There simply is no reason for it.

We should simply use the IPC system that we already have that we already
use for things like this (for example color management, inter-process
video pipelines, geolocation, notifications, music player control, audio
device discovery, accessibility, etc.).


Jonas
Post by Drew DeVault
--
Drew DeVault
_______________________________________________
desktop-devel-list mailing list
https://mail.gnome.org/mailman/listinfo/desktop-devel-list
Drew DeVault
2016-03-29 03:33:15 UTC
Permalink
Post by Jonas Ådahl
I'm just going to put down my own personal thoughts on these. I mostly
agree with Carsten on all of this. In general, my opinion is that it is
completely pointless to add Wayland protocols for things that has
nothing to do with Wayland what so ever; we have other display protocol
agnostic methods for that that fits much better.
I think these features have a lot to do with Wayland, and I still
maintain that protocol extensions make sense as a way of doing it. I
don't want to commit my users to dbus or something similar and I'd
prefer if I didn't have to make something unique to sway. It's probably
going to be protocol extensions for some of this stuff and I think it'd
be very useful for the same flexibility to be offered by other
compositors.
Post by Jonas Ådahl
As a rule of thumb, whether a feature needs a Wayland protocol or not,
one can consider whether a client needs to reference a client side
object (such as a surface) on the server. If it needs it, we should add
a Wayland protocol; otherwise not. Another way of seeing it would be
"could this be shared between Wayland/X11/Mir/... then don't do it in
any of those".
I prefer to think of it as "who has logical ownership over this resource
that they're providing". The compositor has ownership of your output and
input devices and so on, and it should be responsible for making them
available.
Post by Jonas Ådahl
Post by Drew DeVault
- Screen capture
Why would this ever be a Wayland protocol? If a client needs to capture
its own content it doesn't need to ask the compositor; otherwise it's
the job of the compositor. If there needs to be a complex pipeline setup
that adds sub titles, muxing, sound effects and what not, we should make
use of existing projects that intend to create inter-process video
pipelines (pinos[0] for example).
FWIW, I believe remote desktop/screen sharing support partly falls under
this category as well, with the exception that it may need input event
injection as well (which of course shouldn't be a Wayland protocol).
As a side note, for GNOME, I have been working on a org.gnome prefixed
D-Bus protocol for remote desktop that enables the actual remote desktop
things to be implemented in a separate process by providing pinos
streams, and I believe that at some point it would be good to have a
org.freedesktop.* (or equivalent) protocol doing that in a more desktop
agnostic way. Such a protocol could just as well be read-only, and
passed to something like ffmpeg (maybe can even pipe it from gst-launch
directly to ffmpeg if you so wish) in order to do screen recording.
I know that Gnome folks really love their DBus, but I don't think that
it makes sense to use it for this. Not all of the DEs/WMs use dbus and
it would be great if the tools didn't have to know how to talk to it,
but instead had some common way of getting pixels from the compositor.

I haven't heard of Pinos before, but brief searches online make it look
pretty useful for this purpose. I think it can be involved here.
Post by Jonas Ådahl
Post by Drew DeVault
- Output configuration
This has nothing to do with Wayland as well. If there is any need for
various compositors to support third party output configuration, at
least make it display protocol agnostic (D-Bus) so that it the doesn't
have to be one implementation in each layer for each display protocol
when there is actually no point of doing so.
I'm dropping the output configuration protocols that I initially wanted
to make, I've come around. I think we just need to rethink fullscreen
requests to work with the permission model we come up with.
Post by Jonas Ådahl
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
Sounds like a job for a xdg_* protocol. However, I think we need to
first settle on a bare minimum protocol, in order to be able to
stabalize anything. This bare minimum protocol needs allows for
extendability making it possible to add things like negotiating how
decorations are drawn etc. The idea is that xdg_shell v6 will allow
this.
Of course we can add xdg_shell extensions already though (i.e. stand
alone protocol extension that extends xdg_shell).
This sounds reasonable.
Post by Jonas Ådahl
Post by Drew DeVault
- Input device configuration
Same as output configuration. There simply is no valid reason for adding
a Wayland protocol for it.
Same as output configuraiton, I've come around and we should probably
drop this, though again with the constraint that we should tweak things
like pointer-constraints to work with the permissions model.
Post by Jonas Ådahl
I don't think we should start writing Wayland protocols for things that
has nothing to do with Wayland, only because the program where it is
going to be implemented in may already may be doing Wayland things.
There simply is no reason for it.
We should simply use the IPC system that we already have that we already
use for things like this (for example color management, inter-process
video pipelines, geolocation, notifications, music player control, audio
device discovery, accessibility, etc.).
Most of what you mentioned (geolocation, notifications, music control,
audio device discovery) have anything to do with Wayland. Why would they
have to use the same communication system? Things like how output/input
devices are handled, screen capture, and so on are very clearly Wayland
related and I think a Wayland solution for them is entirely acceptable.

--
Drew DeVault
Jonas Ådahl
2016-03-29 04:01:52 UTC
Permalink
Post by Drew DeVault
Post by Jonas Ådahl
I'm just going to put down my own personal thoughts on these. I mostly
agree with Carsten on all of this. In general, my opinion is that it is
completely pointless to add Wayland protocols for things that has
nothing to do with Wayland what so ever; we have other display protocol
agnostic methods for that that fits much better.
I think these features have a lot to do with Wayland, and I still
maintain that protocol extensions make sense as a way of doing it. I
don't want to commit my users to dbus or something similar and I'd
prefer if I didn't have to make something unique to sway. It's probably
going to be protocol extensions for some of this stuff and I think it'd
be very useful for the same flexibility to be offered by other
compositors.
Post by Jonas Ådahl
As a rule of thumb, whether a feature needs a Wayland protocol or not,
one can consider whether a client needs to reference a client side
object (such as a surface) on the server. If it needs it, we should add
a Wayland protocol; otherwise not. Another way of seeing it would be
"could this be shared between Wayland/X11/Mir/... then don't do it in
any of those".
I prefer to think of it as "who has logical ownership over this resource
that they're providing". The compositor has ownership of your output and
input devices and so on, and it should be responsible for making them
available.
I didn't say the display server shouldn't be the one exposing such an
API, I just think it is a bad idea to duplicate every display server
agnostic API for every possible display server protocol.
Post by Drew DeVault
Post by Jonas Ådahl
Post by Drew DeVault
- Screen capture
Why would this ever be a Wayland protocol? If a client needs to capture
its own content it doesn't need to ask the compositor; otherwise it's
the job of the compositor. If there needs to be a complex pipeline setup
that adds sub titles, muxing, sound effects and what not, we should make
use of existing projects that intend to create inter-process video
pipelines (pinos[0] for example).
FWIW, I believe remote desktop/screen sharing support partly falls under
this category as well, with the exception that it may need input event
injection as well (which of course shouldn't be a Wayland protocol).
As a side note, for GNOME, I have been working on a org.gnome prefixed
D-Bus protocol for remote desktop that enables the actual remote desktop
things to be implemented in a separate process by providing pinos
streams, and I believe that at some point it would be good to have a
org.freedesktop.* (or equivalent) protocol doing that in a more desktop
agnostic way. Such a protocol could just as well be read-only, and
passed to something like ffmpeg (maybe can even pipe it from gst-launch
directly to ffmpeg if you so wish) in order to do screen recording.
I know that Gnome folks really love their DBus, but I don't think that
it makes sense to use it for this. Not all of the DEs/WMs use dbus and
it would be great if the tools didn't have to know how to talk to it,
but instead had some common way of getting pixels from the compositor.
So if you have a compositor or a client that wants to support three
display server architectures, it needs to implement all those three
API's separately? Why can't we provide an API ffmpeg etc can use no
matter if the display server happens to be the X server, sway or
Unity-on-Mir?

I don't see the point of not just using D-Bus just because you aren't
using it yet. It's already there, installed on your system, it's already
used by various other parts of the stack, and it will require a lot less
effort by clients and servers if they they want to support more than
just Wayland.
Post by Drew DeVault
I haven't heard of Pinos before, but brief searches online make it look
pretty useful for this purpose. I think it can be involved here.
Pinos communicates via D-Bus, but pixels/frames are of course never
passed directly, but via shared memory handles. What a screen
cast/remote desktop API would do is more or less to start/stop a pinos
stream and optionally inject events, and let the client know what stream
it should use.
Post by Drew DeVault
Post by Jonas Ådahl
I don't think we should start writing Wayland protocols for things that
has nothing to do with Wayland, only because the program where it is
going to be implemented in may already may be doing Wayland things.
There simply is no reason for it.
We should simply use the IPC system that we already have that we already
use for things like this (for example color management, inter-process
video pipelines, geolocation, notifications, music player control, audio
device discovery, accessibility, etc.).
Most of what you mentioned (geolocation, notifications, music control,
audio device discovery) have anything to do with Wayland. Why would they
have to use the same communication system? Things like how output/input
devices are handled, screen capture, and so on are very clearly Wayland
related and I think a Wayland solution for them is entirely acceptable.
Sorry, I don't see how you make the connection between "Wayland" and
"screen capture" other than that it may be implemented in the same
process. Wayland is meant to be used by clients to be able to pass
content to and receive input from the display server. It's is not
intended to be a catch-all IPC replacing D-Bus.


Jonas
Pekka Paalanen
2016-03-29 11:09:30 UTC
Permalink
On Tue, 29 Mar 2016 12:01:52 +0800
Post by Jonas Ådahl
Post by Drew DeVault
Post by Jonas Ådahl
I'm just going to put down my own personal thoughts on these. I mostly
agree with Carsten on all of this. In general, my opinion is that it is
completely pointless to add Wayland protocols for things that has
nothing to do with Wayland what so ever; we have other display protocol
agnostic methods for that that fits much better.
I think these features have a lot to do with Wayland, and I still
maintain that protocol extensions make sense as a way of doing it. I
don't want to commit my users to dbus or something similar and I'd
prefer if I didn't have to make something unique to sway. It's probably
going to be protocol extensions for some of this stuff and I think it'd
be very useful for the same flexibility to be offered by other
compositors.
Post by Jonas Ådahl
As a rule of thumb, whether a feature needs a Wayland protocol or not,
one can consider whether a client needs to reference a client side
object (such as a surface) on the server. If it needs it, we should add
a Wayland protocol; otherwise not. Another way of seeing it would be
"could this be shared between Wayland/X11/Mir/... then don't do it in
any of those".
I prefer to think of it as "who has logical ownership over this resource
that they're providing". The compositor has ownership of your output and
input devices and so on, and it should be responsible for making them
available.
I didn't say the display server shouldn't be the one exposing such an
API, I just think it is a bad idea to duplicate every display server
agnostic API for every possible display server protocol.
Post by Drew DeVault
Post by Jonas Ådahl
Post by Drew DeVault
- Screen capture
Why would this ever be a Wayland protocol? If a client needs to capture
its own content it doesn't need to ask the compositor; otherwise it's
the job of the compositor. If there needs to be a complex pipeline setup
that adds sub titles, muxing, sound effects and what not, we should make
use of existing projects that intend to create inter-process video
pipelines (pinos[0] for example).
FWIW, I believe remote desktop/screen sharing support partly falls under
this category as well, with the exception that it may need input event
injection as well (which of course shouldn't be a Wayland protocol).
As a side note, for GNOME, I have been working on a org.gnome prefixed
D-Bus protocol for remote desktop that enables the actual remote desktop
things to be implemented in a separate process by providing pinos
streams, and I believe that at some point it would be good to have a
org.freedesktop.* (or equivalent) protocol doing that in a more desktop
agnostic way. Such a protocol could just as well be read-only, and
passed to something like ffmpeg (maybe can even pipe it from gst-launch
directly to ffmpeg if you so wish) in order to do screen recording.
I know that Gnome folks really love their DBus, but I don't think that
it makes sense to use it for this. Not all of the DEs/WMs use dbus and
it would be great if the tools didn't have to know how to talk to it,
but instead had some common way of getting pixels from the compositor.
So if you have a compositor or a client that wants to support three
display server architectures, it needs to implement all those three
API's separately? Why can't we provide an API ffmpeg etc can use no
matter if the display server happens to be the X server, sway or
Unity-on-Mir?
I don't see the point of not just using D-Bus just because you aren't
using it yet. It's already there, installed on your system, it's already
used by various other parts of the stack, and it will require a lot less
effort by clients and servers if they they want to support more than
just Wayland.
Post by Drew DeVault
I haven't heard of Pinos before, but brief searches online make it look
pretty useful for this purpose. I think it can be involved here.
Pinos communicates via D-Bus, but pixels/frames are of course never
passed directly, but via shared memory handles. What a screen
cast/remote desktop API would do is more or less to start/stop a pinos
stream and optionally inject events, and let the client know what stream
it should use.
Post by Drew DeVault
Post by Jonas Ådahl
I don't think we should start writing Wayland protocols for things that
has nothing to do with Wayland, only because the program where it is
going to be implemented in may already may be doing Wayland things.
There simply is no reason for it.
We should simply use the IPC system that we already have that we already
use for things like this (for example color management, inter-process
video pipelines, geolocation, notifications, music player control, audio
device discovery, accessibility, etc.).
Most of what you mentioned (geolocation, notifications, music control,
audio device discovery) have anything to do with Wayland. Why would they
have to use the same communication system? Things like how output/input
devices are handled, screen capture, and so on are very clearly Wayland
related and I think a Wayland solution for them is entirely acceptable.
Sorry, I don't see how you make the connection between "Wayland" and
"screen capture" other than that it may be implemented in the same
process. Wayland is meant to be used by clients to be able to pass
content to and receive input from the display server. It's is not
intended to be a catch-all IPC replacing D-Bus.
For the record, I totally agree with Jonas.

Let's not reinvent existing protocols just because you want to use
Wayland IPC, unless using Wayland IPC is actually a fundamental
requirement for the operation.

The fundamental requirement to use Wayland IPC is precisely the need to
reference Wayland protocol objects, e.g. wl_surface, or the need to
identify a Wayland client (a Wayland connection) without a trusted
third party like a container framework.

Also what I bothered to read from the exhausting thread between Drew
and Carsten, I would agree with Carsten in practically every point.


Thanks,
pq
Drew DeVault
2016-03-29 11:41:10 UTC
Permalink
Post by Jonas Ådahl
Post by Drew DeVault
I prefer to think of it as "who has logical ownership over this resource
that they're providing". The compositor has ownership of your output and
input devices and so on, and it should be responsible for making them
available.
I didn't say the display server shouldn't be the one exposing such an
API, I just think it is a bad idea to duplicate every display server
agnostic API for every possible display server protocol.
Do you foresee GNOME on Mir ever happening? We're trying to leave X
behind here. There won't be a Wayland replacement for a while. The
Wayland compositor has ownership over these resources and the Wayland
compositor is the one managing these resources - and it speaks the
Wayland protocol, which is extensible.
Post by Jonas Ådahl
Post by Drew DeVault
I know that Gnome folks really love their DBus, but I don't think that
it makes sense to use it for this. Not all of the DEs/WMs use dbus and
it would be great if the tools didn't have to know how to talk to it,
but instead had some common way of getting pixels from the compositor.
So if you have a compositor or a client that wants to support three
display server architectures, it needs to implement all those three
API's separately? Why can't we provide an API ffmpeg etc can use no
matter if the display server happens to be the X server, sway or
Unity-on-Mir?
See above
Post by Jonas Ådahl
I don't see the point of not just using D-Bus just because you aren't
using it yet. It's already there, installed on your system, it's already
used by various other parts of the stack, and it will require a lot less
effort by clients and servers if they they want to support more than
just Wayland.
Not everyone has dbus on their system and it's not among my goals to
force it on people. I'm not taking a political stance on this and I
don't want it to devolve into a flamewar - I'm just not imposing either
side of the dbus/systemd argument on my users.
Post by Jonas Ådahl
Pinos communicates via D-Bus, but pixels/frames are of course never
passed directly, but via shared memory handles. What a screen
cast/remote desktop API would do is more or less to start/stop a pinos
stream and optionally inject events, and let the client know what stream
it should use.
Hmm. Again going back to "I don't want to make the dbus decision for my
users", I would prefer to find a solution that's less dependent on it,
though I imagine taking inspiration from Pinos is quite reasonable.
Post by Jonas Ådahl
Sorry, I don't see how you make the connection between "Wayland" and
"screen capture" other than that it may be implemented in the same
process. Wayland is meant to be used by clients to be able to pass
content to and receive input from the display server. It's is not
intended to be a catch-all IPC replacing D-Bus.
DBus is not related to Wayland. DBus is not _attached_ to Wayland. DBus
and Wayland are seperate, unrelated protocols and solving Wayland
problems with DBus is silly.

--
Drew DeVault
Jonas Ådahl
2016-03-29 12:22:16 UTC
Permalink
Post by Drew DeVault
Post by Jonas Ådahl
Post by Drew DeVault
I prefer to think of it as "who has logical ownership over this resource
that they're providing". The compositor has ownership of your output and
input devices and so on, and it should be responsible for making them
available.
I didn't say the display server shouldn't be the one exposing such an
API, I just think it is a bad idea to duplicate every display server
agnostic API for every possible display server protocol.
Do you foresee GNOME on Mir ever happening? We're trying to leave X
behind here. There won't be a Wayland replacement for a while. The
Wayland compositor has ownership over these resources and the Wayland
compositor is the one managing these resources - and it speaks the
Wayland protocol, which is extensible.
GNOME's mutter already work as being a compositor for two separate
protocols: X11 and Wayland. Whenever possible, I by far prefer to
deprecate the way replacing it with a display server protocol agnostic
solution, than having duplicate implementation for every such thing.
Post by Drew DeVault
Post by Jonas Ådahl
Post by Drew DeVault
I know that Gnome folks really love their DBus, but I don't think that
it makes sense to use it for this. Not all of the DEs/WMs use dbus and
it would be great if the tools didn't have to know how to talk to it,
but instead had some common way of getting pixels from the compositor.
So if you have a compositor or a client that wants to support three
display server architectures, it needs to implement all those three
API's separately? Why can't we provide an API ffmpeg etc can use no
matter if the display server happens to be the X server, sway or
Unity-on-Mir?
See above
Most if not all clients will for the forseeable future most likely need
to support at least three protocols on Linux: X11, Wayland and Mir. I
don't see any of these going away any time soon, and I don't see any
reason to have three separate interfaces doing exactly the same thing.
Post by Drew DeVault
Post by Jonas Ådahl
I don't see the point of not just using D-Bus just because you aren't
using it yet. It's already there, installed on your system, it's already
used by various other parts of the stack, and it will require a lot less
effort by clients and servers if they they want to support more than
just Wayland.
Not everyone has dbus on their system and it's not among my goals to
force it on people. I'm not taking a political stance on this and I
don't want it to devolve into a flamewar - I'm just not imposing either
side of the dbus/systemd argument on my users.
Post by Jonas Ådahl
Pinos communicates via D-Bus, but pixels/frames are of course never
passed directly, but via shared memory handles. What a screen
cast/remote desktop API would do is more or less to start/stop a pinos
stream and optionally inject events, and let the client know what stream
it should use.
Hmm. Again going back to "I don't want to make the dbus decision for my
users", I would prefer to find a solution that's less dependent on it,
though I imagine taking inspiration from Pinos is quite reasonable.
We are not going to reimplement anything like Pinos via Wayland
protocols, so any client/compositor that want to do anything related to
stream casting (anything that doesn't just make the content end up
directly on the filesystem) will either need to reimplement their own
private solution, or depend on something like Pinos which will itself
depend on D-Bus.
Post by Drew DeVault
Post by Jonas Ådahl
Sorry, I don't see how you make the connection between "Wayland" and
"screen capture" other than that it may be implemented in the same
process. Wayland is meant to be used by clients to be able to pass
content to and receive input from the display server. It's is not
intended to be a catch-all IPC replacing D-Bus.
DBus is not related to Wayland. DBus is not _attached_ to Wayland. DBus
and Wayland are seperate, unrelated protocols and solving Wayland
problems with DBus is silly.
So is screen casting/recording/sharing. It's a feature of a compositor,
not a feature of Wayland. Screen casting in the way you describe (pass
content to some client) will most likely have its frames passed via
D-Bus, so you'd still force your user to use D-Bus anyway.


Jonas
Post by Drew DeVault
--
Drew DeVault
Pekka Paalanen
2016-03-29 13:36:52 UTC
Permalink
On Tue, 29 Mar 2016 07:41:10 -0400
Post by Drew DeVault
Not everyone has dbus on their system and it's not among my goals to
force it on people. I'm not taking a political stance on this and I
don't want it to devolve into a flamewar - I'm just not imposing either
side of the dbus/systemd argument on my users.
If you don't use what others use, then you use something different.

As you don't want to use what other people use, it goes the same way
for other people to not want to use what you use. Whatever the reasons
for either party.

Wayland upstream/community/whatever cannot force neither you nor them
to do against their will.

So your only hope is to compete with technical excellence and
popularity.
Post by Drew DeVault
Post by Jonas Ådahl
Pinos communicates via D-Bus, but pixels/frames are of course never
passed directly, but via shared memory handles. What a screen
cast/remote desktop API would do is more or less to start/stop a pinos
stream and optionally inject events, and let the client know what stream
it should use.
Hmm. Again going back to "I don't want to make the dbus decision for my
users", I would prefer to find a solution that's less dependent on it,
though I imagine taking inspiration from Pinos is quite reasonable.
Up to you, indeed, on what you force down your users' throats, but the
fact is, you will always force something on them. Your users don't have
the freedom of choice to use your compositor without Wayland either.
You chose Wayland, your users chose your software.
Post by Drew DeVault
Post by Jonas Ådahl
Sorry, I don't see how you make the connection between "Wayland" and
"screen capture" other than that it may be implemented in the same
process. Wayland is meant to be used by clients to be able to pass
content to and receive input from the display server. It's is not
intended to be a catch-all IPC replacing D-Bus.
DBus is not related to Wayland. DBus is not _attached_ to Wayland. DBus
and Wayland are seperate, unrelated protocols and solving Wayland
problems with DBus is silly.
Correct. Use each to its best effect, not all problems are nails.

If there already is a DBus based solution that just works, why would
someone write a new solution to replace that? There has to be a benefit
for replacing the old for the people using the old solution. It could
be a benefit for the end users of the old, or for the developers of the
old, but if the only benefit is for "outsiders", it gives no motivation.


Thanks,
pq
Martin Graesslin
2016-03-29 08:20:45 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
we have our kwin-kscreen specific protocol for this. You can find it at:
https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=9ebe342f7939b6dec45e2ebf3ad69e772ec66543&hb=818e320bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Foutput-management.xml

and

https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=747fc264b7e6a40a65a0a04464c2c98036a84f0f&hb=818e320bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Foutputdevice.xml

It's designed for our specific needs in Plasma. If it's useful for others, we
are happy to share and collaborate.

Cheers
Martin
Drew DeVault
2016-03-29 12:12:32 UTC
Permalink
Post by Martin Graesslin
Post by Drew DeVault
- Output configuration
https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=9ebe342f7939b6dec45e2ebf3ad69e772ec66543&hb=818e320bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Foutput-management.xml
and
https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=747fc264b7e6a40a65a0a04464c2c98036a84f0f&hb=818e320bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Foutputdevice.xml
It's designed for our specific needs in Plasma. If it's useful for others, we
are happy to share and collaborate.
It looks like something I could use in Sway. I like it. I'm going to see
how well it integrates with Sway and probably write a command line tool
to interface with it. I think that it would be useful to put this under
the permissions system, though, once that's put together.

--
Drew DeVault
Martin Graesslin
2016-03-30 06:13:17 UTC
Permalink
Post by Drew DeVault
Post by Martin Graesslin
Post by Drew DeVault
- Output configuration
https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=9ebe342f7939b6dec45e2ebf3ad69e772ec66543&hb=818e32
0bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Foutput-management.xml
and
https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=747fc264b7e6a40a65a0a04464c2c98036a84f0f&hb=818e32
0bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Foutputdevice.xml
It's designed for our specific needs in Plasma. If it's useful for others,
we are happy to share and collaborate.
It looks like something I could use in Sway. I like it. I'm going to see
how well it integrates with Sway and probably write a command line tool
to interface with it. I think that it would be useful to put this under
the permissions system, though, once that's put together.
We already have a command line tool for it :-)

It's kscreen-doctor which you can find in libkscreen. I've cc-ed sebas who
wrote this tool (and also the protocol).

And yes it's something for the permission system. We clearly do not want
everybody to be able to change the screen setup.

Cheers
Martin
Martin Graesslin
2016-03-29 08:25:10 UTC
Permalink
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
Concerning own decoration we have implemented https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=8bc106c7c42a40f71dad9a884824a7a9899e7b2f&hb=818e320bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Fserver-decoration.xml

We would be very happy to share this one. It's already in use in Plasma 5.6
and so far we are quite satisfied with it. It's designed with convergence in
mind so that it's possible to easily switch the modes (e.g. decorated on
Desktop, not decorated on phone, no decorations for maximized windows, etc.).

I think especially for compositors like sway that can be very helpful. For Qt
we implemented support in our QPT plugin for Plasma. So if sway wants to use
it I can give you pointers on how to use it in your own QPT plugin (if you
don't have one yet how to create it) and to use it to force QtWayland to not
use the client side decorations.

Cheers
Martin
Drew DeVault
2016-03-29 12:14:25 UTC
Permalink
Post by Martin Graesslin
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
Concerning own decoration we have implemented https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=8bc106c7c42a40f71dad9a884824a7a9899e7b2f&hb=818e320bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Fserver-decoration.xml
Excellent. The protocol looks like it'll do just fine.
Post by Martin Graesslin
I think especially for compositors like sway that can be very helpful. For Qt
we implemented support in our QPT plugin for Plasma. So if sway wants to use
it I can give you pointers on how to use it in your own QPT plugin (if you
don't have one yet how to create it) and to use it to force QtWayland to not
use the client side decorations.
I would love to see something like that. Can we work on a model that
would avoid making users install qt to install Sway? Honestly I'd like
to just set an environment variable to turn off CSD where possible, for
both Qt and GTK. I'm still trying to avoid forcing a toolkit on users.

--
Drew DeVault
Martin Graesslin
2016-03-30 06:26:04 UTC
Permalink
Post by Drew DeVault
Post by Martin Graesslin
Post by Drew DeVault
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
Concerning own decoration we have implemented https://quickgit.kde.org/?
p=kwayland.git&a=blob&h=8bc106c7c42a40f71dad9a884824a7a9899e7b2f&hb=818e32
0bd99867ea9c831edfb68c9671ef7dfc47&f=src
%2Fclient%2Fprotocols%2Fserver-decoration.xml
Excellent. The protocol looks like it'll do just fine.
Post by Martin Graesslin
I think especially for compositors like sway that can be very helpful. For
Qt we implemented support in our QPT plugin for Plasma. So if sway wants
to use it I can give you pointers on how to use it in your own QPT plugin
(if you don't have one yet how to create it) and to use it to force
QtWayland to not use the client side decorations.
I would love to see something like that. Can we work on a model that
would avoid making users install qt to install Sway?
Ah I think there is a small misunderstanding on the QPT plugin. QPT is the Qt
Platform Theme which is a plugin loaded into Qt applications to adjust it to
the platform. So to say a standardized way for LD_PRELOAD.

So you would not have to force a toolkit on your users. It's just that when a
Qt application is used you can provide a plugin to make Qt apps behave better.
Post by Drew DeVault
Honestly I'd like
to just set an environment variable to turn off CSD where possible, for
both Qt and GTK. I'm still trying to avoid forcing a toolkit on users.
For Qt you can try:
export QT_WAYLAND_DISABLE_WINDOWDECORATION=1

it gets rid of Qt's CSD deco and replaces them by nothing. We have that set in
our QPT plugin.

And yeah, in general it would be nice to have the toolkits implement this
protocol. We didn't go for that yet as we had a release schedule mismatch
between Plasma and Qt.

Cheers
Martin
Benoit Gschwind
2016-03-31 15:37:10 UTC
Permalink
Hello Drew,

After reading the thread stream, I think there is two mixed questions in
your email that is missleading. And most reply try to address both in
one reply. I thing Daniel get the point (if I understood him well).

I read the two following questions:

[1] As almost all compositor will need the following features:
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration.

It would be nice if we go around a table to define some shared protocol.
For those interested I will start writing an XML spec. you are welcome
to contribute.

[2] This features are mandatory and should be in the core protocol.


If my reading is correct, I reply to [1]:

I'm in, I would like to avoid implements tools that setup screen layout,
keymaping and screen capture. Thus having a protocol to handle those
case is welcome. From my point of view of WM developer (in opposition of
DE developer)


For [2], I suggest that you starting with not-adopted protocol
specification and if they gather enough approval, you try to push it as
_optionnal_ standard protocol. By optionnal I mean the compositor
developer can choose to implement it or not. by standard I mean if a
developer want implement those feature we strongly encouraged developper
to use them instead of providing a new one.

Best regards.

--
Benoit (blocage) Gschwind
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
--
Drew DeVault
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Peter Hutterer
2016-04-01 03:18:58 UTC
Permalink
Post by Benoit Gschwind
Hello Drew,
After reading the thread stream, I think there is two mixed questions in
your email that is missleading. And most reply try to address both in
one reply. I thing Daniel get the point (if I understood him well).
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration.
It would be nice if we go around a table to define some shared protocol.
For those interested I will start writing an XML spec. you are welcome
to contribute.
I'm mostly talking about the input device configuration here but an XML is
the wrong place to start, imo. As I said above, it won't add anything much
and you still have to do the implementation everywhere. The only meaningful
thing you can do is write a library that compositors *want* to use that
reads the configuration items from some magic place and applies them to the
libinput device.

and for those that must be handled in the compostior (key remappings, for
example) you'll essentially end up writing a libcompositorinput. but now
you're quite close to internal compositor semantics so making this a generic
thing is not going to be trivial.

Cheers,
Peter
Post by Benoit Gschwind
[2] This features are mandatory and should be in the core protocol.
I'm in, I would like to avoid implements tools that setup screen layout,
keymaping and screen capture. Thus having a protocol to handle those
case is welcome. From my point of view of WM developer (in opposition of
DE developer)
For [2], I suggest that you starting with not-adopted protocol
specification and if they gather enough approval, you try to push it as
_optionnal_ standard protocol. By optionnal I mean the compositor
developer can choose to implement it or not. by standard I mean if a
developer want implement those feature we strongly encouraged developper
to use them instead of providing a new one.
Best regards.
--
Benoit (blocage) Gschwind
Post by Drew DeVault
Greetings! I am the maintainer of the Sway Wayland compositor.
http://swaywm.org
It's almost the Year of Wayland on the Desktop(tm), and I have
reached out to each of the projects this message is addressed to (GNOME,
Kwin, and wayland-devel) to collaborate on some shared protocol
extensions for doing a handful of common tasks such as display
configuration and taking screenshots. Life will be much easier for
projects like ffmpeg and imagemagick if they don't have to implement
compositor-specific code for capturing the screen!
I want to start by establishing the requirements for these protocols.
Broadly speaking, I am looking to create protocols for the following
- Screen capture
- Output configuration
- More detailed surface roles (should it be floating, is it a modal,
does it want to draw its own decorations, etc)
- Input device configuration
I think that these are the core protocols necessary for
cross-compositor compatability and to support most existing tools for
X11 like ffmpeg. Considering the security goals of Wayland, it will also
likely be necessary to implement some kind of protocol for requesting
and granting sensitive permissions to clients.
How does this list look? What sorts of concerns do you guys have with
respect to what features each protocol needs to support? Have I missed
any major protocols that we'll have to work on? Once we have a good list
of requirements I'll start writing some XML.
--
Drew DeVault
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
_______________________________________________
desktop-devel-list mailing list
https://mail.gnome.org/mailman/listinfo/desktop-devel-list
Ladislav Igrec
2016-03-31 23:05:15 UTC
Permalink
hello, i have a proposal for this
first time using a mailing list so i hope i do it right-ish


protocol for screenshots:

client -> server: "can i get a screenshot ?"
server -> client: "sure, here it is" / "no"
server would send the screenshot via memfd or write() or something.
something like [type/format][length][data] (bit more futureproof then that, though)

for hotkeys:
client -> server: "can i get all alt+F5 events ?"
server -> client: "sure, i'l send them to you" / "no"

for streaming, i don't know how exactly does it work under wayland.
if a compositor can close the stream, then there is no problem

i suggest the transport layer to be UDS for 3 reasons
1. if the file is not in the predetermined place (example /run/wayland-ext/$COMPOSITOR_PID) then the compositor obviously doesn't support the extensions
(while at it, this doesn't have to be wayland exclusive)
2. the compositor can check the PID of the client
3. the compositor can send fds


rationales:
the list of supported features does not have to be gotten as the compositor saying "no" means "no", for whatever reason.
ofc the "what do you support" should be part of the protocol anyway
(the "no" can also be "NO_PRIVILEDGE" or "IDK_WHAT_YOU_ARE_TALKING_ABOUT")

it being UDS the compositor can get the PID, UID and GID of the client.
the compositor can then verify the client by looking at its /proc/$PID/exe.

this minimizes the whole priviledges thing in the protocol and lets the compositor writers choose how they will implement it.
examples:
a compositor that always does/allows what it can.
a compositor that asks the user then (optionally) remembers the answer
a compositor that asks some priviledges daemon "can this process do this?"

i would suggest that the compositor makes a list in human readable format in a file that only the "wayland_compositor" GID can read or write to.
something like:
$EXE_FROM_PROC $UID $GID PERMISSION1 PERMISSION2 etc..



Ladislav Igrec

"hope-ing he is making sense"
Loading...