WikiTwist

Docker Capabilities – Namespaced vs Global

Quick take: Linux capabilities are checked by the host kernel. With Docker’s default namespaces, many actions are contained to the container’s view of the system. Some capabilities, however, still impact the whole host and should be treated as near-root. This guide keeps your container perms tight and your host safe.

Understanding namespaces & capabilities (and why it matters)

Docker isolates processes using Linux namespaces (PID, NET, MNT, IPC, UTS, USER…), while capabilities split the all-powerful root into fine-grained privileges (e.g., CAP_NET_ADMIN, CAP_SYS_ADMIN). Namespaces define where an action applies; capabilities define what the process may do. Together, they allow “least privilege” containers instead of full root.
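To make the "what the process may do" half concrete: the kernel tracks each process's capabilities as bitmasks, exposed as the CapEff, CapPrm and CapBnd lines in /proc/<pid>/status. Here is a minimal sketch that decodes such a mask into names. The bit order follows linux/capability.h; the helper names decode_caps and effective_caps are illustrative, and the sample mask is the value typically reported for a root process under Docker's default capability set (treat it as an example, not a guarantee for your version).

```python
from pathlib import Path
from typing import List

# Capability names in bit order, as numbered in <linux/capability.h>
# (CHOWN is bit 0, CHECKPOINT_RESTORE is bit 40).
CAP_NAMES = """
CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID
SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW
IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT
SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD
LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG
WAKE_ALARM BLOCK_SUSPEND AUDIT_READ PERFMON BPF CHECKPOINT_RESTORE
""".split()


def decode_caps(mask: int) -> List[str]:
    """Return the names of all capability bits set in mask, lowest bit first."""
    names = []
    for bit in range(mask.bit_length()):
        if (mask >> bit) & 1:
            if bit < len(CAP_NAMES):
                names.append("CAP_" + CAP_NAMES[bit])
            else:
                # Newer kernels may define bits this table does not know about.
                names.append(f"CAP_UNKNOWN_{bit}")
    return names


def effective_caps(pid: str = "self") -> List[str]:
    """Read the CapEff line from /proc/<pid>/status (Linux only) and decode it."""
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("CapEff:"):
            return decode_caps(int(line.split()[1], 16))
    raise RuntimeError("CapEff line not found")


if __name__ == "__main__":
    # Example mask only; run effective_caps() inside a container for real data.
    for name in decode_caps(0x00000000A80425FB):
        print(name)
```

Running effective_caps() inside a container before and after tightening its capability list is a quick way to confirm the change actually took effect.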

Important: some capabilities remain effectively global in common Docker setups. For example, the wall clock is shared with the host (Linux time namespaces only offset the monotonic and boot-time clocks), so CAP_SYS_TIME can change the host's time. Treat these like sharp tools.

Security baseline: Drop everything and add back only what you need (--cap-drop ALL then selective --cap-add), and avoid --privileged entirely unless you truly need host-level access.
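As a rough sketch of that baseline, the helper below assembles a docker run command line that always starts from --cap-drop ALL and adds back only the capabilities you name. The function name docker_run_args and the image name are placeholders; the only Docker flags used are --cap-drop and --cap-add from the paragraph above. It builds the argument list without executing anything.

```python
import shlex
from typing import Iterable, List


def docker_run_args(image: str, needed_caps: Iterable[str] = ()) -> List[str]:
    """Build `docker run` arguments: drop every capability, then add back a minimal set."""
    args = ["docker", "run", "--cap-drop", "ALL"]
    # De-duplicate and sort so the resulting command is stable and easy to review.
    for cap in sorted(set(needed_caps)):
        args += ["--cap-add", cap]
    args.append(image)
    return args


if __name__ == "__main__":
    # "your/image:stable" is a placeholder image name.
    print(shlex.join(docker_run_args("your/image:stable", ["NET_BIND_SERVICE"])))
```

Keeping the added capabilities in one explicit list makes each one something a reviewer has to see and approve, which is the point of the drop-all pattern.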

Bonus hardening: Consider User Namespace Remapping so “root” inside the container maps to an unprivileged UID on the host.
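The remapping itself is plain offset arithmetic: Docker takes a subordinate ID range from /etc/subuid and shifts container UIDs into it. The sketch below assumes an entry such as dockremap:100000:65536; the start and length differ from host to host, so those numbers are only an example, and parse_subuid and host_uid are hypothetical helper names.

```python
from typing import Tuple


def parse_subuid(entry: str) -> Tuple[str, int, int]:
    """Split one /etc/subuid line of the form 'name:start:count'."""
    name, start, count = entry.strip().split(":")
    return name, int(start), int(count)


def host_uid(container_uid: int, start: int, count: int) -> int:
    """Map a UID inside the container to the UID the host kernel sees."""
    if not 0 <= container_uid < count:
        raise ValueError(f"UID {container_uid} is outside the mapped range 0..{count - 1}")
    return start + container_uid


if __name__ == "__main__":
    # Example entry only; check /etc/subuid on your own host.
    _, start, count = parse_subuid("dockremap:100000:65536")
    print("container root ->", host_uid(0, start, count))
    print("container 1000 ->", host_uid(1000, start, count))
```

Under this example range, root in the container is UID 100000 on the host, an account with no special rights there.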

Mostly namespaced / safer (with Docker’s default isolation)

| Capability | What it allows | Scope (typical) | Risk | Typical uses |
|---|---|---|---|---|
| CAP_NET_ADMIN | Change IPs, routes, firewall rules, links. | Network namespace | 🟧 Medium | VPN clients, routing, tc/iptables tweaks inside container netns. |
| CAP_NET_RAW | Raw sockets (ping, packet capture). | Network namespace | 🟧 Medium | Ping, tcpdump, IDS/packet tools. |
| CAP_NET_BIND_SERVICE | Bind to ports < 1024. | Network namespace | 🟩 Low | Listening on :80/:443 as non-root. |
| CAP_NET_BROADCAST | Send broadcast/multicast. | Network namespace | 🟧 Medium | Service discovery, custom L2/L3 testing. |
| CAP_CHOWN | Change file owners. | Mount namespace | 🟧 Medium | Installers, packaging steps. |
| CAP_DAC_OVERRIDE | Bypass file R/W/X perms. | Mount namespace | 🟧 Medium | Backup/restore tools, init scripts. |
| CAP_DAC_READ_SEARCH | Bypass read/search perms. | Mount namespace | 🟧 Medium | Indexing, scanning. |
| CAP_FOWNER | Bypass file ownership checks. | Mount namespace | 🟧 Medium | Fixing perms across app files. |
| CAP_FSETID | Preserve/set setuid/setgid bits. | Mount namespace | 🟧 Medium | Packaging special binaries. |
| CAP_MKNOD | Create special device files. | Mount namespace | 🟧 Medium | Init systems, device simulations. |
| CAP_LINUX_IMMUTABLE | Set immutable/append-only flags. | Mount namespace | 🟧 Medium | Hardening files (careful with bind mounts!). |
| CAP_SETUID | Set process UID arbitrarily. | PID namespace | 🟧 Medium | Daemons dropping/raising privileges. |
| CAP_SETGID | Set process GID arbitrarily. | PID namespace | 🟧 Medium | Same as above (groups). |
| CAP_SETPCAP | Grant/remove process capabilities. | PID namespace | 🟧 Medium | Init wrappers, launchers. |
| CAP_SETFCAP | Set file xattrs for capabilities. | Mount namespace | 🟧 Medium | Packaging binaries with caps. |
| CAP_KILL | Signal any process (in same PID ns). | PID namespace | 🟧 Medium | Supervisors, debuggers. |
| CAP_SYS_NICE | Change priorities/scheduling. | PID namespace | 🟧 Medium | Low-latency apps. |
| CAP_SYS_RESOURCE | Override resource limits (ulimits). | PID namespace | 🟧 Medium | DBs, HPC, large RLIMITs. |
| CAP_SYS_TTY_CONFIG | Configure TTYs. | PID/UTS namespaces | 🟧 Medium | Interactive daemons, serial tools. |
| CAP_IPC_LOCK | Lock memory into RAM. | IPC/PID namespaces | 🟧 Medium | Crypto, low-latency apps. |
| CAP_IPC_OWNER | Bypass IPC ownership checks. | IPC namespace | 🟧 Medium | Legacy IPC mgmt. |
| CAP_SYS_CHROOT | Use chroot. | Mount/PID namespaces | 🟧 Medium | Build systems, init tools. |
| CAP_SYS_PTRACE | Trace/debug processes (same PID ns). | PID namespace | 🟧 Medium | Debuggers, profilers. |
| CAP_AUDIT_READ | Read audit logs (where available). | Namespaced/filtered | 🟧 Medium | Security tooling (limited in containers). |
| CAP_CHECKPOINT_RESTORE | CRIU checkpoint/restore operations. | PID/NET/MNT interplay | 🟧 Medium | Live-migrate processes, fast restarts. |

Dangerous / global-effect (host-level impact; avoid unless you truly need them)

| Capability | What it allows | Why dangerous | Risk | Typical uses |
|---|---|---|---|---|
| CAP_SYS_ADMIN | Mounts, many ioctls, namespace ops; huge surface. | Swiss-army knife; many actions leak outside namespaces, often near-root. | 🟥 High | FUSE, loop mounts, advanced storage/network; prefer alternatives. |
| CAP_SYS_MODULE | Load/unload kernel modules. | Direct kernel modification on the host. | 🟥 High | Kernel dev (not for typical containers). |
| CAP_SYS_BOOT | Reboot the machine. | Affects entire host. | 🟥 High | Almost never in containers. |
| CAP_SYS_TIME | Set system clock/timers. | The wall clock isn't namespaced in Docker; changes the host clock. | 🟥 High | Time-sync daemons (not recommended in containers). |
| CAP_SYS_RAWIO | Direct hardware I/O port access. | Bypasses driver abstractions. | 🟥 High | Specialized HW tooling. |
| CAP_SYSLOG | Read kernel logs (dmesg), control klog. | Leaky visibility into host kernel state. | 🟥 High | Kernel troubleshooting (host tools preferred). |
| CAP_AUDIT_CONTROL | Configure audit subsystem. | Global security policy changes. | 🟥 High | Host security mgmt, not containers. |
| CAP_AUDIT_WRITE | Write to audit logs. | Global audit channel abuse possible. | 🟥 High | Rarely needed by applications, although Docker's default set includes it. |
| CAP_MAC_ADMIN | Configure MAC (e.g., SELinux/Smack). | Alters host security policy. | 🟥 High | Security frameworks (host-level). |
| CAP_MAC_OVERRIDE | Bypass MAC restrictions. | Defeats host MAC confinement. | 🟥 High | Almost never safe. |
| CAP_PERFMON | Advanced perf events tracing. | Can observe host kernel/other tasks. | 🟥 High | Low-level perf analysis (host tools better). |
| CAP_BPF | Load/manage (e)BPF programs/maps. | Hooks into host kernel paths; potential escapes/DoS. | 🟥 High | Net observability/firewalls; prefer dedicated agents. |
| CAP_BLOCK_SUSPEND | Block system suspend. | Impacts host power mgmt. | 🟥 High | Power daemons (host-level). |
| CAP_WAKE_ALARM | Schedule RTC wakeups. | Host power/timing side effects. | 🟥 High | Embedded/power mgmt; avoid in containers. |
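One way to put this list to work is a small review-time check that flags any requested capability from the table above. This is only a sketch: the set simply mirrors this article's classification rather than any official Docker list, host_level_caps is a hypothetical helper, and the input is assumed to be a list of names such as a Compose cap_add section or the HostConfig.CapAdd field from docker inspect.

```python
from typing import Iterable, List

# Mirrors the "dangerous / global-effect" table in this article (not an official list).
HOST_LEVEL_CAPS = {
    "SYS_ADMIN", "SYS_MODULE", "SYS_BOOT", "SYS_TIME", "SYS_RAWIO",
    "SYSLOG", "AUDIT_CONTROL", "AUDIT_WRITE", "MAC_ADMIN", "MAC_OVERRIDE",
    "PERFMON", "BPF", "BLOCK_SUSPEND", "WAKE_ALARM",
}


def normalize(cap: str) -> str:
    """Accept 'sys_admin', 'SYS_ADMIN' or 'CAP_SYS_ADMIN' and return 'SYS_ADMIN'."""
    name = cap.strip().upper()
    return name[4:] if name.startswith("CAP_") else name


def host_level_caps(requested: Iterable[str]) -> List[str]:
    """Return the requested capabilities that this article treats as host-level, sorted."""
    names = {normalize(cap) for cap in requested}
    if "ALL" in names:
        # Adding ALL grants every capability, including all of the risky ones.
        return sorted(HOST_LEVEL_CAPS)
    return sorted(names & HOST_LEVEL_CAPS)


if __name__ == "__main__":
    flagged = host_level_caps(["NET_BIND_SERVICE", "CAP_SYS_ADMIN", "sys_time"])
    print("needs review:", flagged)
```

A check like this fits well in CI, where a non-empty result can block a merge until someone justifies the capability.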

Practical tips (production-ready)

Docker Compose example (safe pattern)

services:
  app:
    image: your/image:stable
    # 1) Drop everything, add back only what you need
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
      - NET_ADMIN   # only if you really need it
    # 2) Keep default bridged networking to preserve NET namespace
    # network_mode: bridge
    # 3) Optional: run as non-root user inside the container
    user: "1000:1000"
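    # 4) Seccomp: Docker applies its default profile automatically, so nothing is needed here.
    #    To tighten further, point at a custom profile (the path below is a placeholder):
    # security_opt:
    #   - seccomp:./seccomp-profile.json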

