Similar to process descriptors, jail desriptors are allow jail
administration using the file descriptor interface instead of JIDs.
They come from and can be used by jail_set(2) and jail_get(2),
and there are two new system calls, jail_attach_jd(2) and
jail_remove_jd(2).
Reviewed by: bz, brooks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D43696
On most other platforms observed, including OpenBSD, NetBSD, and Linux,
these system calls have long since been converted to only touching the
supplementary groups of the process. This poses both portability and
security concerns in porting software to and from FreeBSD, as this
subtle difference is a landmine waiting to happen. Bugs have been
discovered even in FreeBSD-local sources, since this behavior is
somewhat unintuitive (see, e.g., fix 48fd05999b for chroot(8)).
Now that the egid is tracked outside of cr_groups in our ucred, convert
the syscalls to deal with only supplementary groups. Some remaining
stragglers in base that had baked in assumptions about these syscalls
are fixed in the process to avoid heartburn in conversion.
For relnotes: application developers should audit their use of both
setgroups(2) and getgroups(2) for signs that they had assumed the
previous FreeBSD behavior of using the first element for the egid. Any
calls to setgroups() to clear groups that used a single array of the
now or soon-to-be egid can be converted to setgroups(0, NULL) calls to
clear the supplementary groups entirely on all FreeBSD versions.
Co-authored-by: olce (but bugs are likely mine)
Relnotes: yes (see last paragraph)
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D51648
Include the two new syscalls in the symbol map.
Reviewed by: kib
MFC after: 3 months
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D50315
This new system call allows to set all necessary credentials of
a process in one go: Effective, real and saved UIDs, effective, real and
saved GIDs, supplementary groups and the MAC label. Its advantage over
standard credential-setting system calls (such as setuid(), seteuid(),
etc.) is that it enables MAC modules, such as MAC/do, to restrict the
set of credentials some process may gain in a fine-grained manner.
Traditionally, credential changes rely on setuid binaries that call
multiple credential system calls and in a specific order (setuid() must
be last, so as to remain root for all other credential-setting calls,
which would otherwise fail with insufficient privileges). This
piecewise approach causes the process to transiently hold credentials
that are neither the original nor the final ones. For the kernel to
enforce that only certain transitions of credentials are allowed, either
these possibly non-compliant transient states have to disappear (by
setting all relevant attributes in one go), or the kernel must delay
setting or checking the new credentials. Delaying setting credentials
could be done, e.g., by having some mode where the standard system calls
contribute to building new credentials but without committing them. It
could be started and ended by a special system call. Delaying checking
could mean that, e.g., the kernel only verifies the credentials
transition at the next non-credential-setting system call (we just
mention this possibility for completeness, but are certainly not
endorsing it).
We chose the simpler approach of a new system call, as we don't expect
the set of credentials one can set to change often. It has the
advantages that the traditional system calls' code doesn't have to be
changed and that we can establish a special MAC protocol for it, by
having some cleanup function called just before returning (this is
a requirement for MAC/do), without disturbing the existing ones.
The mac_cred_check_setcred() hook is passed the flags received by
setcred() (including the version) and both the old and new kernel's
'struct ucred' instead of 'struct setcred' as this should simplify
evolving existing hooks as the 'struct setcred' structure evolves. The
mac_cred_setcred_enter() and mac_cred_setcred_exit() hooks are always
called by pairs around potential calls to mac_cred_check_setcred().
They allow MAC modules to allocate/free data they may need in their
mac_cred_check_setcred() hook, as the latter is called under the current
process' lock, rendering sleepable allocations impossible. MAC/do is
going to leverage these in a subsequent commit. A scheme where
mac_cred_check_setcred() could return ERESTART was considered but is
incompatible with proper composition of MAC modules.
While here, add missing includes and declarations for standalone
inclusion of <sys/ucred.h> both from kernel and userspace (for the
latter, it has been working thanks to <bsm/audit.h> already including
<sys/types.h>).
Reviewed by: brooks
Approved by: markj (mentor)
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47618
This is similar to chroot(2), but takes a file descriptor instead
of path. Same syscall exists in NetBSD and Solaris. It is part of a larger
patch to make absolute pathnames usable in Capsicum mode, but should
be useful in other contexts too.
Reviewed By: brooks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41564
Officially since C11 (and in reality FreeBSD since 3.0 with commit
1b46cb523d) errno has been defined to be a macro. Rename the symbol
to __libsys_errno and move it to FBSDprivate_1.0 and confine it entierly
to libsys for use by libthr. Add a FBSD_1.0 compat symbol for existing
binaries that were incorrectly linked to the errno symbol during
libc.so.7's lifetime.
This deliberately breaks linking software that directly links to errno.
Such software is broken and will fail in surprising ways if it becomes
threaded (e.g., if it triggers loading of a pam or nss module that
uses threads.)
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D46780
I put the symbols in the wrong file (should have been
lib/libc/sys/Symbol.map), added a duplicate pdfork entry due to a botch
rebase, and there seems to be a issue with gcc13/binutils not exposing
the symbols so revert the whole thing while I debug.
This reverts commit ee632fb9eb.
List them in the symbol map rather than using the __sym_default to
expose them. This will allow later improvements in the stub
implementations in libc.so.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44113
When moving the implementation, I failed to move the symbol entry.
Reviewed by: kib
Fixes: 84dd0c080b libc: libc/gen/sched_getcpu_gen.c -> libsys/
Differential Revision: https://reviews.freebsd.org/D44112
These provide standard APIs, but are implemented using another system
call (e.g., pipe implemented in terms of pipe2) or are interposed by the
threading library to support cancelation.
After discussion with kib (see D44111), I've concluded that it is
better to keep most public interfaces in libc with as little
as possible in libsys.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44241
Long ago (e129c18a83ef) __sys_sigwait was wrapped to prevent sigwait()
from returning with EINTR. Through a series of changes this wrapper
become __libc_sigwait which was internal to libc and used solely in the
interposing table. To support a move of sigwait back to libc, move this
wrapper into libsys and rename it with an __libsys_ prefix.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44238
It's a thin wrapper on cap_getmode() implemented in libc, not a system
call so the symbol should have been exposed by libc/gen/Symbol.map
alongside the implementation.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44110
Before, the 'errno' itself was defined in libc and was referenced by
libsys, causing undesired dependency.
Reviewed by: brooks, imp
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D43985
Continue to filter the public interface (elf_aux_info()), but entierly
relocate the private interfaces (_elf_aux_info(),
__init_elf_aux_vector(), and __elf_aux_vector) to libsys.
This ensures that rtld updates the correct (only) copy of
__elf_aux_vector. After 968a18975a
updates were confused and __getosreldate was failing, causing
the system to fall back to compat compat12 syscalls in some cases.
Return to explicitly linking libc to libsys and link libthr with libc
and libsys (in that order).
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D43910
We now export all _ and __sys_ prefixed syscalls stubs from libc and
libsys so that libsys can replace them.
Reviewed by: kib, emaste, imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/908
This is part of the interface to the kernel and some syscall wrappers
depend on it so move it there.
Reviewed by: kib, emaste, imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/908
Remove core system call implementations and documentation to lib/libsys
and lib/libsys/<arch> from lib/libc/sys and lib/libc/<arch>/<sys>.
Update paths to allow libc to find them in their new home.
Reviewed by: kib, emaste, imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/908