Discussion:
Two problems with result of num_processors (NPROC_CURRENT_OVERRIDABLE)
Oliver Heimlich
2016-11-28 14:34:24 UTC
Permalink
Follow-Up: Savannah bugs #48534 #48535.

Hi,
I have two problems with the function num_processors in libgnu's nproc.c.


FIRST PROBLEM

num_processors (NPROC_CURRENT_OVERRIDABLE) returns the value of
OMP_NUM_THREADS. However, if a thread limit is defined, it should return
the minimum of OMP_NUM_THREADS and OMP_THREAD_LIMIT.


OMP_THREAD_LIMIT=1 OMP_NUM_THREADS=2 ./run-my-program
- What I get: 2
- What I expect: 1
- Reason: The effective number of threads cannot exceed the thread limit.


SECOND PROBLEM

num_processors (NPROC_CURRENT_OVERRIDABLE) parses the environment
variable OMP_NUM_THREADS for an integer. However, according to OpenMP
documentation it may be a list of integers. In that case, the function
ignores the value of OMP_NUM_THREADS. See
https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fNUM_005fTHREADS.html


OMP_NUM_THREADS=2,2,1 ./run-my-program
- What I get: 4
- What I expect: 2 or 1 depending on the current OpenMP nesting level


Best
Oliver
Paul Eggert
2016-11-29 02:34:00 UTC
Permalink
Thanks, that all sounds reasonable. Can you suggest a patch? Preferably, in "git
format-patch" format, with a ChangeLog entry. Thanks.
Pádraig Brady
2017-02-21 04:59:26 UTC
Permalink
Post by Oliver Heimlich
Follow-Up: Savannah bugs #48534 #48535.
Hi,
I have two problems with the function num_processors in libgnu's nproc.c.
FIRST PROBLEM
num_processors (NPROC_CURRENT_OVERRIDABLE) returns the value of
OMP_NUM_THREADS. However, if a thread limit is defined, it should return
the minimum of OMP_NUM_THREADS and OMP_THREAD_LIMIT.
OMP_THREAD_LIMIT=1 OMP_NUM_THREADS=2 ./run-my-program
- What I get: 2
- What I expect: 1
- Reason: The effective number of threads cannot exceed the thread limit.
SECOND PROBLEM
num_processors (NPROC_CURRENT_OVERRIDABLE) parses the environment
variable OMP_NUM_THREADS for an integer. However, according to OpenMP
documentation it may be a list of integers. In that case, the function
ignores the value of OMP_NUM_THREADS. See
https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fNUM_005fTHREADS.html
OMP_NUM_THREADS=2,2,1 ./run-my-program
- What I get: 4
- What I expect: 2 or 1 depending on the current OpenMP nesting level
There is no way to determine the current nesting level,
without actually using OpenMP, so I added support
for calling omp_get_num_threads() if _OPENMP is defined.
I'm not 100% sure it's valid to do that without depending on
the openmp module (which I think it's best not for nproc to do),
so I'm open to removing that portion.

Also I added support for deferring to the first nesting level
in OMP_NUM_THREADS, and for OMP_THREADS_LIMIT like you suggest.

Tesing coreutils' nproc (without _OPENMP) with this change gives:

$ src/nproc
4
$ OMP_THREADS_LIMIT=1 src/nproc
4
$ OMP_NUM_THREADS=2,2,1 src/nproc
2
$ OMP_NUM_THREADS=2bad src/nproc
4
$ OMP_THREADS_LIMIT=1 OMP_NUM_THREADS=2,2,1 src/nproc
1
$ OMP_THREADS_LIMIT=0 OMP_NUM_THREADS=2,2,1 src/nproc
2
$ OMP_THREADS_LIMIT=1bad OMP_NUM_THREADS=2,2,1 src/nproc
2
$ OMP_THREADS_LIMIT=1b OMP_NUM_THREADS=12,2,1 src/nproc
12
$ OMP_THREADS_LIMIT=1 OMP_NUM_THREADS=12 src/nproc
1
$ OMP_THREADS_LIMIT=111 OMP_NUM_THREADS=12 src/nproc
12
$ OMP_NUM_THREADS=12 nproc
12

cheers,
Pádraig
Pádraig Brady
2017-02-26 15:01:59 UTC
Permalink
Post by Pádraig Brady
Post by Oliver Heimlich
Follow-Up: Savannah bugs #48534 #48535.
Hi,
I have two problems with the function num_processors in libgnu's nproc.c.
FIRST PROBLEM
num_processors (NPROC_CURRENT_OVERRIDABLE) returns the value of
OMP_NUM_THREADS. However, if a thread limit is defined, it should return
the minimum of OMP_NUM_THREADS and OMP_THREAD_LIMIT.
OMP_THREAD_LIMIT=1 OMP_NUM_THREADS=2 ./run-my-program
- What I get: 2
- What I expect: 1
- Reason: The effective number of threads cannot exceed the thread limit.
SECOND PROBLEM
num_processors (NPROC_CURRENT_OVERRIDABLE) parses the environment
variable OMP_NUM_THREADS for an integer. However, according to OpenMP
documentation it may be a list of integers. In that case, the function
ignores the value of OMP_NUM_THREADS. See
https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fNUM_005fTHREADS.html
OMP_NUM_THREADS=2,2,1 ./run-my-program
- What I get: 4
- What I expect: 2 or 1 depending on the current OpenMP nesting level
There is no way to determine the current nesting level,
without actually using OpenMP, so I added support
for calling omp_get_num_threads() if _OPENMP is defined.
I'm not 100% sure it's valid to do that without depending on
the openmp module (which I think it's best not for nproc to do),
so I'm open to removing that portion.
Also I added support for deferring to the first nesting level
in OMP_NUM_THREADS, and for OMP_THREADS_LIMIT like you suggest.
I pushed that and the attached follow up.

The result was tested in coreutils with this matrix:
avail=$(nproc)

OMP_THREAD_LIMIT OMP_NUM_THREADS RESULT
----------------------------------------------------
- - $avail
1 - 1
1 0 1
- 0 $avail
- 2,2,1 2
- 2,ignored 2
- 2bad $avail
- -2 $avail
1 2,2,1 1
0 2,2,1 2
1bad 2,2,1 2
1bad $(($avail+1)),2,1 $(($avail+1))
1 $(($avail+1)) 1
$(($avail+2)) $(($avail+1)) $(($avail+1))
$(($avail+1)) $(($avail+2)) $(($avail+1))
- $(($avail+1)) $(($avail+1))

cheers,
Pádraig
Bruno Haible
2017-02-26 17:05:04 UTC
Permalink
Hi Pádraig,

The function 'num_processors' is now a bit large, and in particular contains
the multiplied complexity of
- the number of systems which each require a different approach,
- the query parameter which in one MUST consider omp_env_limit and in the
other case MAY but NEED NOT consider omp_env_limit.

To make this code more maintainable, I would propose to split it into
- a function which contains only the complexity of the various systems,
- a function which contains only the complexity of OMP handling.

Like this:


2017-02-26 Bruno Haible <***@clisp.org>

nproc: Refactor large function.
* lib/nproc.c (num_processors_ignoring_omp): New function, extracted
from num_processors.
(num_processors): In this function, only deal with OMP.

diff --git a/lib/nproc.c b/lib/nproc.c
index db3ca9b..7880e62 100644
--- a/lib/nproc.c
+++ b/lib/nproc.c
@@ -199,65 +199,12 @@ num_processors_via_affinity_mask (void)
return 0;
}

-/* Parse OMP environment variables without dependence on OMP.
- Return 0 for invalid values. */
-static unsigned long int
-parse_omp_threads (char const* threads)
-{
- unsigned long int ret = 0;
-
- if (threads == NULL)
- return ret;
-
- /* The OpenMP spec says that the value assigned to the environment variables
- "may have leading and trailing white space". */
- while (*threads != '\0' && c_isspace (*threads))
- threads++;

- /* Convert it from positive decimal to 'unsigned long'. */
- if (c_isdigit (*threads))
- {
- char *endptr = NULL;
- unsigned long int value = strtoul (threads, &endptr, 10);
-
- if (endptr != NULL)
- {
- while (*endptr != '\0' && c_isspace (*endptr))
- endptr++;
- if (*endptr == '\0')
- return value;
- /* Also accept the first value in a nesting level,
- since we can't determine the nesting level from env vars. */
- else if (*endptr == ',')
- return value;
- }
- }
-
- return ret;
-}
-
-unsigned long int
-num_processors (enum nproc_query query)
+/* Return the total number of processors. Here QUERY must be one of
+ NPROC_ALL, NPROC_CURRENT. The result is guaranteed to be at least 1. */
+static unsigned long int
+num_processors_ignoring_omp (enum nproc_query query)
{
- unsigned long int omp_env_limit = ULONG_MAX;
-
- if (query == NPROC_CURRENT_OVERRIDABLE)
- {
- unsigned long int omp_env_threads;
- /* Honor the OpenMP environment variables, recognized also by all
- programs that are based on OpenMP. */
- omp_env_threads = parse_omp_threads (getenv ("OMP_NUM_THREADS"));
- omp_env_limit = parse_omp_threads (getenv ("OMP_THREAD_LIMIT"));
- if (! omp_env_limit)
- omp_env_limit = ULONG_MAX;
-
- if (omp_env_threads)
- return MIN (omp_env_threads, omp_env_limit);
-
- query = NPROC_CURRENT;
- }
- /* Here query is one of NPROC_ALL, NPROC_CURRENT. */
-
/* On systems with a modern affinity mask system call, we have
sysconf (_SC_NPROCESSORS_CONF)
= sysconf (_SC_NPROCESSORS_ONLN)
@@ -279,7 +226,7 @@ num_processors (enum nproc_query query)
unsigned long nprocs = num_processors_via_affinity_mask ();

if (nprocs > 0)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}

#if defined _SC_NPROCESSORS_ONLN
@@ -287,7 +234,7 @@ num_processors (enum nproc_query query)
Cygwin, Haiku. */
long int nprocs = sysconf (_SC_NPROCESSORS_ONLN);
if (nprocs > 0)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#endif
}
@@ -330,7 +277,7 @@ num_processors (enum nproc_query query)
if (query == NPROC_CURRENT)
{
if (psd.psd_proc_cnt > 0)
- return MIN (psd.psd_proc_cnt, omp_env_limit);
+ return psd.psd_proc_cnt;
}
else
{
@@ -351,7 +298,7 @@ num_processors (enum nproc_query query)
? MP_NAPROCS
: MP_NPROCS);
if (nprocs > 0)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#endif

@@ -367,7 +314,7 @@ num_processors (enum nproc_query query)
if (sysctl (mib, ARRAY_SIZE (mib), &nprocs, &len, NULL, 0) == 0
&& len == sizeof (nprocs)
&& 0 < nprocs)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#endif

@@ -376,9 +323,73 @@ num_processors (enum nproc_query query)
SYSTEM_INFO system_info;
GetSystemInfo (&system_info);
if (0 < system_info.dwNumberOfProcessors)
- return MIN (system_info.dwNumberOfProcessors, omp_env_limit);
+ return system_info.dwNumberOfProcessors;
}
#endif

return 1;
}
+
+/* Parse OMP environment variables without dependence on OMP.
+ Return 0 for invalid values. */
+static unsigned long int
+parse_omp_threads (char const* threads)
+{
+ unsigned long int ret = 0;
+
+ if (threads == NULL)
+ return ret;
+
+ /* The OpenMP spec says that the value assigned to the environment variables
+ "may have leading and trailing white space". */
+ while (*threads != '\0' && c_isspace (*threads))
+ threads++;
+
+ /* Convert it from positive decimal to 'unsigned long'. */
+ if (c_isdigit (*threads))
+ {
+ char *endptr = NULL;
+ unsigned long int value = strtoul (threads, &endptr, 10);
+
+ if (endptr != NULL)
+ {
+ while (*endptr != '\0' && c_isspace (*endptr))
+ endptr++;
+ if (*endptr == '\0')
+ return value;
+ /* Also accept the first value in a nesting level,
+ since we can't determine the nesting level from env vars. */
+ else if (*endptr == ',')
+ return value;
+ }
+ }
+
+ return ret;
+}
+
+unsigned long int
+num_processors (enum nproc_query query)
+{
+ unsigned long int omp_env_limit = ULONG_MAX;
+
+ if (query == NPROC_CURRENT_OVERRIDABLE)
+ {
+ unsigned long int omp_env_threads;
+ /* Honor the OpenMP environment variables, recognized also by all
+ programs that are based on OpenMP. */
+ omp_env_threads = parse_omp_threads (getenv ("OMP_NUM_THREADS"));
+ omp_env_limit = parse_omp_threads (getenv ("OMP_THREAD_LIMIT"));
+ if (! omp_env_limit)
+ omp_env_limit = ULONG_MAX;
+
+ if (omp_env_threads)
+ return MIN (omp_env_threads, omp_env_limit);
+
+ query = NPROC_CURRENT;
+ }
+ /* Here query is one of NPROC_ALL, NPROC_CURRENT. */
+ {
+ unsigned long nprocs = num_processors_ignoring_omp (query);
+ return MIN (nprocs, omp_env_limit);
+ }
+}
Pádraig Brady
2017-02-26 17:42:17 UTC
Permalink
Post by Bruno Haible
Hi Pádraig,
The function 'num_processors' is now a bit large, and in particular contains
the multiplied complexity of
- the number of systems which each require a different approach,
- the query parameter which in one MUST consider omp_env_limit and in the
other case MAY but NEED NOT consider omp_env_limit.
To make this code more maintainable, I would propose to split it into
- a function which contains only the complexity of the various systems,
- a function which contains only the complexity of OMP handling.
nproc: Refactor large function.
* lib/nproc.c (num_processors_ignoring_omp): New function, extracted
from num_processors.
(num_processors): In this function, only deal with OMP.
diff --git a/lib/nproc.c b/lib/nproc.c
index db3ca9b..7880e62 100644
--- a/lib/nproc.c
+++ b/lib/nproc.c
@@ -199,65 +199,12 @@ num_processors_via_affinity_mask (void)
return 0;
}
-/* Parse OMP environment variables without dependence on OMP.
- Return 0 for invalid values. */
-static unsigned long int
-parse_omp_threads (char const* threads)
-{
- unsigned long int ret = 0;
-
- if (threads == NULL)
- return ret;
-
- /* The OpenMP spec says that the value assigned to the environment variables
- "may have leading and trailing white space". */
- while (*threads != '\0' && c_isspace (*threads))
- threads++;
- /* Convert it from positive decimal to 'unsigned long'. */
- if (c_isdigit (*threads))
- {
- char *endptr = NULL;
- unsigned long int value = strtoul (threads, &endptr, 10);
-
- if (endptr != NULL)
- {
- while (*endptr != '\0' && c_isspace (*endptr))
- endptr++;
- if (*endptr == '\0')
- return value;
- /* Also accept the first value in a nesting level,
- since we can't determine the nesting level from env vars. */
- else if (*endptr == ',')
- return value;
- }
- }
-
- return ret;
-}
-
-unsigned long int
-num_processors (enum nproc_query query)
+/* Return the total number of processors. Here QUERY must be one of
+ NPROC_ALL, NPROC_CURRENT. The result is guaranteed to be at least 1. */
+static unsigned long int
+num_processors_ignoring_omp (enum nproc_query query)
{
- unsigned long int omp_env_limit = ULONG_MAX;
-
- if (query == NPROC_CURRENT_OVERRIDABLE)
- {
- unsigned long int omp_env_threads;
- /* Honor the OpenMP environment variables, recognized also by all
- programs that are based on OpenMP. */
- omp_env_threads = parse_omp_threads (getenv ("OMP_NUM_THREADS"));
- omp_env_limit = parse_omp_threads (getenv ("OMP_THREAD_LIMIT"));
- if (! omp_env_limit)
- omp_env_limit = ULONG_MAX;
-
- if (omp_env_threads)
- return MIN (omp_env_threads, omp_env_limit);
-
- query = NPROC_CURRENT;
- }
- /* Here query is one of NPROC_ALL, NPROC_CURRENT. */
-
/* On systems with a modern affinity mask system call, we have
sysconf (_SC_NPROCESSORS_CONF)
= sysconf (_SC_NPROCESSORS_ONLN)
@@ -279,7 +226,7 @@ num_processors (enum nproc_query query)
unsigned long nprocs = num_processors_via_affinity_mask ();
if (nprocs > 0)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#if defined _SC_NPROCESSORS_ONLN
@@ -287,7 +234,7 @@ num_processors (enum nproc_query query)
Cygwin, Haiku. */
long int nprocs = sysconf (_SC_NPROCESSORS_ONLN);
if (nprocs > 0)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#endif
}
@@ -330,7 +277,7 @@ num_processors (enum nproc_query query)
if (query == NPROC_CURRENT)
{
if (psd.psd_proc_cnt > 0)
- return MIN (psd.psd_proc_cnt, omp_env_limit);
+ return psd.psd_proc_cnt;
}
else
{
@@ -351,7 +298,7 @@ num_processors (enum nproc_query query)
? MP_NAPROCS
: MP_NPROCS);
if (nprocs > 0)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#endif
@@ -367,7 +314,7 @@ num_processors (enum nproc_query query)
if (sysctl (mib, ARRAY_SIZE (mib), &nprocs, &len, NULL, 0) == 0
&& len == sizeof (nprocs)
&& 0 < nprocs)
- return MIN (nprocs, omp_env_limit);
+ return nprocs;
}
#endif
@@ -376,9 +323,73 @@ num_processors (enum nproc_query query)
SYSTEM_INFO system_info;
GetSystemInfo (&system_info);
if (0 < system_info.dwNumberOfProcessors)
- return MIN (system_info.dwNumberOfProcessors, omp_env_limit);
+ return system_info.dwNumberOfProcessors;
}
#endif
return 1;
}
+
+/* Parse OMP environment variables without dependence on OMP.
+ Return 0 for invalid values. */
+static unsigned long int
+parse_omp_threads (char const* threads)
+{
+ unsigned long int ret = 0;
+
+ if (threads == NULL)
+ return ret;
+
+ /* The OpenMP spec says that the value assigned to the environment variables
+ "may have leading and trailing white space". */
+ while (*threads != '\0' && c_isspace (*threads))
+ threads++;
+
+ /* Convert it from positive decimal to 'unsigned long'. */
+ if (c_isdigit (*threads))
+ {
+ char *endptr = NULL;
+ unsigned long int value = strtoul (threads, &endptr, 10);
+
+ if (endptr != NULL)
+ {
+ while (*endptr != '\0' && c_isspace (*endptr))
+ endptr++;
+ if (*endptr == '\0')
+ return value;
+ /* Also accept the first value in a nesting level,
+ since we can't determine the nesting level from env vars. */
+ else if (*endptr == ',')
+ return value;
+ }
+ }
+
+ return ret;
+}
+
+unsigned long int
+num_processors (enum nproc_query query)
+{
+ unsigned long int omp_env_limit = ULONG_MAX;
+
+ if (query == NPROC_CURRENT_OVERRIDABLE)
+ {
+ unsigned long int omp_env_threads;
+ /* Honor the OpenMP environment variables, recognized also by all
+ programs that are based on OpenMP. */
+ omp_env_threads = parse_omp_threads (getenv ("OMP_NUM_THREADS"));
+ omp_env_limit = parse_omp_threads (getenv ("OMP_THREAD_LIMIT"));
+ if (! omp_env_limit)
+ omp_env_limit = ULONG_MAX;
+
+ if (omp_env_threads)
+ return MIN (omp_env_threads, omp_env_limit);
+
+ query = NPROC_CURRENT;
+ }
+ /* Here query is one of NPROC_ALL, NPROC_CURRENT. */
+ {
+ unsigned long nprocs = num_processors_ignoring_omp (query);
+ return MIN (nprocs, omp_env_limit);
+ }
+}
Yes much better with that cleanup.
I presume you'll apply.

thanks!
Pádraig
Bruno Haible
2017-02-26 21:33:51 UTC
Permalink
Post by Pádraig Brady
Yes much better with that cleanup.
I presume you'll apply.
OK, pushed.

Loading...