Discussion:
[systemd-devel] systemd (user) and (sd-pam) (user) processes in login shell
Niksa Jurinovic
2015-12-07 22:10:36 UTC
Hello,

I am new to the systemd init system and to a freshly installed Fedora 23
Server, and I have a question about the 'systemd (user)' and
'(sd-pam) (user)' processes started for every user's login shell. The
first process is '/usr/lib/systemd/systemd --user' with PPID=1, and the
second is its child process '(sd-pam)'.

What exactly do these processes do, and why does my Oracle 12c
database instance (started by the 'oracle' user) always crash (silently
shut itself down) WITHOUT these processes (or if they are killed)?
When that happens, the database instance is down, and the Oracle alert.log
shows semaphore memory corruption:

ORA-27300: OS system dependent operation:semctl failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwrm1
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
ksmsdes: Error destroying SGA
Instance termination got error 27120 from SGA destruction.

If these processes are active for the 'oracle' user, the Oracle instance
never crashes.

Thank you very much.
Mantas Mikulėnas
2015-12-07 23:36:01 UTC
(Hmm, wonder if Inbox's "Undo send" works or if I ended up spamming the
list...)
Post by Niksa Jurinovic
Hello,
I am new to systemd init system as well as to fresh installed Fedora 23
Server, and I would like to put a question related to 'systemd (user)'
and '(sd-pam) (user)' processes invoked under each and every one user's
login shell. The first process is '/usr/lib/systemd/systemd --user' with
PPID=1 and the second is its child process '(sd-pam)'.
What do these processes exactly do and why does my Oracle 12c
database instance (started by 'oracle' user) always crash (silently
shutdown by itself) WITHOUT these processes (or if they are killed)?
When it happens, the database instance is down, and the oracle alert.log
It might be clearer if you described how exactly the daemon is started and
which cgroup it runs under (according to systemd-cgls). Perhaps you're
starting it directly from the shell, and not via systemctl as intended?

The "systemd --user" process is meant for interactive users (as in, not
system accounts) – it acts as the user's personal service manager. I don't
think lack of that process is the cause here, maybe an effect instead –
killing it is part of logind's cleanup when a user logs out.

(There is one --user instance for every user, shared across multiple login
sessions, so it is run under a separate "PAM session" of its own; sd-pam is
just a helper process for that.)

What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter, which
would mean systemd-logind would clean up various things like IPC on
logout... (see logind.conf)

In fact I'm pretty sure that's the case according to the "Identifier
removed" error.

Accounts meant for daemons should be created with "useradd -r", so that
they get a system UID and systemd can distinguish them from personal
accounts.
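
To illustrate the system/user distinction, here is a minimal shell sketch. It assumes the usual cutoff of 999 mentioned above; the real cutoff is whatever the distribution compiled into systemd, and the function name uid_class is made up for this example.

```shell
#!/bin/sh
# Classify a numeric UID as "system" or "user".
# Assumption: the system range ends at 999 unless a different maximum is
# passed as the second argument; this mirrors the "usually 1-999" rule of
# thumb from the thread, not a value read from systemd itself.
uid_class() {
    uid=$1
    max=${2:-999}
    if [ "$uid" -le "$max" ]; then
        echo system
    else
        echo user
    fi
}

uid_class 54321   # the oracle UID reported elsewhere in this thread; prints: user
uid_class 998     # prints: system
```

For an existing account, "id -u oracle" prints the numeric UID to feed into such a check.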
Post by Niksa Jurinovic
ORA-27300: OS system dependent operation:semctl failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwrm1
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
ksmsdes: Error destroying SGA
Instance termination got error 27120 from SGA destruction.
If these processes are active for 'oracle' user, Oracle instance never
crashes.
Pretty sure these processes being active is a result, not a cause.
--
Mantas Mikulėnas
Kai Krakow
2015-12-22 00:36:16 UTC
On Tue, 8 Dec 2015 01:36:01 +0200
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter,
which would mean systemd-logind would clean up various things like
IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read from
login.defs?

Because I cannot find anything related to it in logind.conf which leads
me to the assumption your reference was about RemoveIPC and friends
only...
--
Regards,
Kai

Replies to list-only preferred.
Mike Gilbert
2015-12-22 02:43:24 UTC
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter,
which would mean systemd-logind would clean up various things like
IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read from
login.defs?
Because I cannot find anything related to it in logind.conf which leads
me to the assumption your reference was about RemoveIPC and friends
only...
I rather doubt the numeric value of the oracle UID has anything to do
with the problem you are having.

With systemd, you really cannot start daemons from an interactive
shell. Rather, you need to define a service unit, and call "systemctl
start" to start long-running daemons.
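
As a rough illustration of what such a service unit could look like for the dbstart script mentioned in this thread, here is a sketch only. The unit name and the /path/to/oracle_home placeholder are hypothetical, dbshut is assumed to be the matching stop script in the same directory, and the oinstall group is taken from the "id" output posted elsewhere in the thread.

```ini
# /etc/systemd/system/oracle-db.service  (hypothetical unit name)
[Unit]
Description=Oracle database instance (illustrative sketch)
After=network.target

[Service]
# dbstart launches the instance and then exits, so treat it as a one-shot
# job whose unit stays "active" afterwards.
Type=oneshot
RemainAfterExit=yes
User=oracle
Group=oinstall
# Placeholder path: replace with the real ORACLE_HOME of the installation.
Environment=ORACLE_HOME=/path/to/oracle_home
ExecStart=/path/to/oracle_home/bin/dbstart
ExecStop=/path/to/oracle_home/bin/dbshut

[Install]
WantedBy=multi-user.target
```

After "systemctl daemon-reload", the database would be started with "systemctl start oracle-db.service". Going by the UID discussion in this thread, a unit like this probably does not by itself exempt a non-system account from logind's IPC cleanup.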
Kai Krakow
2015-12-22 02:54:43 UTC
On Mon, 21 Dec 2015 21:43:24 -0500
Post by Mike Gilbert
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the
latter, which would mean systemd-logind would clean up various
things like IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read
from login.defs?
Because I cannot find anything related to it in logind.conf which
leads me to the assumption your reference was about RemoveIPC and
friends only...
I rather doubt the numeric value of the oracle UID has anything to do
with the problem you are having.
With systemd, you really cannot start daemons from an interactive
shell. Rather, you need to define a service unit, and call "systemctl
start" to start long-running daemons.
I think we are talking about different things here. My question is a
spin-off of the OP.

Mantas actually connected the user and system UID ranges to systemd's
behavior. I just wondered whether this is:

[_] an assumption based on guessing (don't put a cross here)
[_] hard-coded, which I'd personally find surprising
[_] configurable, and I didn't find the knob

But putting two and two together, your answer means (to the OP):

Don't start daemons directly from a shell and exit. Systemd will blast
them away. Defined behavior.

Yes, it won't work.
--
Regards,
Kai

Replies to list-only preferred.
Mike Gilbert
2015-12-22 16:17:00 UTC
Post by Kai Krakow
On Mon, 21 Dec 2015 21:43:24 -0500
Post by Mike Gilbert
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the
latter, which would mean systemd-logind would clean up various
things like IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read
from login.defs?
Because I cannot find anything related to it in logind.conf which
leads me to the assumption your reference was about RemoveIPC and
friends only...
I rather doubt the numeric value of the oracle UID has anything to do
with the problem you are having.
With systemd, you really cannot start daemons from an interactive
shell. Rather, you need to define a service unit, and call "systemctl
start" to start long-running daemons.
I think we are talking different here. My question is a spin-off of the
OP.
Sorry for the mis-reply.
Michael Biebl
2015-12-22 02:56:12 UTC
Post by Mike Gilbert
With systemd, you really cannot start daemons from an interactive
shell.
Well, there is systemd-run ...
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Mantas Mikulėnas
2015-12-22 11:00:12 UTC
Post by Mike Gilbert
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
Post by Mantas Mikulėnas
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter,
which would mean systemd-logind would clean up various things like
IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read from
login.defs?
Because I cannot find anything related to it in logind.conf which leads
me to the assumption your reference was about RemoveIPC and friends
only...
I rather doubt the numeric value of the oracle UID has anything to do
with the problem you are having.
It does, as Oracle uses SysV IPC and logind's behavior depends on UID.
--
Mantas Mikulėnas <***@gmail.com>
Mantas Mikulėnas
2015-12-22 10:59:51 UTC
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
Post by Mantas Mikulėnas
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter,
which would mean systemd-logind would clean up various things like
IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read from
login.defs?
Because I cannot find anything related to it in logind.conf which leads
me to the assumption your reference was about RemoveIPC and friends
only...
It's set at compile (configure) time – either obtained from the compile
host's login.defs or set with --with-system-uid-max=UID.
--
Mantas Mikulėnas <***@gmail.com>
Lennart Poettering
2015-12-23 00:45:31 UTC
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter,
which would mean systemd-logind would clean up various things like
IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read from
login.defs?
We do not read login.defs. It's a compile-time setting (configure
--with-system-uid-max=). The distros choose the right cutoff, not the
admins.

Lennart
--
Lennart Poettering, Red Hat
Reindl Harald
2015-12-23 01:48:19 UTC
Post by Lennart Poettering
Post by Kai Krakow
On Tue, 8 Dec 2015 01:36:01 +0200
Post by Mantas Mikulėnas
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter,
which would mean systemd-logind would clean up various things like
IPC on logout... (see logind.conf)
Is this hard-coded in systemd (uid 0..999 and 1000+) or is it read from
login.defs?
We do not read login.defs
which is a mistake
Post by Lennart Poettering
It's a compile-time setting (configure
--with-system-uid-max=). The distros choose the right cutoff, not the
admins
there are setups much older than systemd, and fedora (as an example)
changed from 500 to 1000 - hence a compile-time setting is wrong by
design when a config file has existed for many years

Mantas Mikulėnas
2015-12-08 05:59:07 UTC
Post by Mantas Mikulėnas
It might be clearer if you described how exactly the daemon is started and
which cgroup it runs under (according to systemd-cgls). Perhaps you're
starting it directly from the shell, and not via systemctl as intended?
# su - oracle -l -c '${ORACLE_HOME}/bin/dbstart'
If Oracle is started this way, the processes 'systemd (oracle)' and
'(sd-pam) (oracle)' DO NOT appear. And that's the problem. It seems that
the Oracle daemon cannot live without these processes, and it dies (shuts
itself down) very soon (after 5-15 minutes of running). The lack of these
processes is the cause of the crash here.
2. By logging in directly as the 'oracle' user on a console (tty). In this
case the processes 'systemd (oracle)' and '(sd-pam) (oracle)' appear
immediately after logging in. The database and listener are then started by
executing 'dbstart' from the console. This way Oracle never crashes, unless
I deliberately kill the two processes as root during the session, in which
case Oracle crashes immediately.
I see. This doesn't kill Oracle by itself; however, it can still cause
various other problems. You really should launch daemons through a systemd
.service; Oracle is no exception.
Post by Mantas Mikulėnas
The "systemd --user" process is meant for interactive users (as in, not
system accounts) – it acts as the user's personal service manager. I don't
think lack of that process is the cause here, maybe an effect instead –
killing it is part of logind's cleanup when a user logs out.
No, the lack of these processes is the cause of the crash, as I already
said above. As long as these processes are running, there is no risk of
Oracle crashing.
Correlation does not imply causation. These processes do nothing relevant
by themselves; their presence only indicates that a systemd-logind _user
session_ exists, which is the cause.
Post by Mantas Mikulėnas
What uid does "oracle" have – is it within the system account range
(usually 1–999) or user account (1000–)? I wonder if it's the latter, which
would mean systemd-logind would clean up various things like IPC on
logout... (see logind.conf)
uid=54321(oracle) gid=54321(oinstall)
groups=54321(oinstall),54322(dba),54323(oper),54324(backupdba),54325(dgdba),54326(kmdba),54327(asmdba)
Ok, so the UID is the problem. (These look suspiciously like made-up
numbers, but I'm guessing they are centrally-managed accounts, maybe NIS or
LDAP.)

So, since "oracle" has a UID ≥ 1000, and since you probably cannot change
that, you should instead *disable RemoveIPC= in /etc/systemd/logind.conf*
to turn off the automatic IPC cleanup.
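
For reference, a sketch of what the suggested change would look like, assuming the stock file layout where the setting lives in the [Login] section:

```ini
# /etc/systemd/logind.conf
[Login]
# Do not remove SysV/POSIX IPC objects when a user's last session ends.
RemoveIPC=no
```

The new value should take effect once systemd-logind has been restarted (for example with "systemctl restart systemd-logind") or after a reboot.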
--
Mantas Mikulėnas <***@gmail.com>