Sometimes a task will be heavily disk, network or CPU dependent, and will starve other tasks of those resources. It’s important that such tasks won’t steal resources from other tasks.
Ideally, some resource intensive tasks would run only when the resources they depend on are idle. That, or resource intensive tasks would have their resource consumption throttled to favor other tasks.
In this article, we will review CPU and I/O scheduling at the process-level on both Linux and Darwin.
Process scheduling priority can be easily tuned with nice.
Linux processes given a lower niceness value on the interval [-20, 19] via the nice shell interface have priority over those with a larger niceness value. Internally, the kernel tracks priorities on the interval [0, 139], of which only the last 40 (100 through 139) correspond to niceness values and are available to users.
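As a worked example of how niceness lands on the kernel’s internal scale: for normal (non-real-time) tasks, Linux computes the internal priority as 120 plus the niceness, so the user-visible range maps onto 100 through 139. The helper name `nice_to_kernel_prio` below is hypothetical and only illustrates the arithmetic.

```shell
# Hypothetical helper: map a niceness value to the kernel's internal
# priority for a normal (non-real-time) task. Niceness 0 sits at 120.
nice_to_kernel_prio() {
  echo $((120 + $1))
}

# Print the mapping for the two ends of the range and the default.
for n in -20 0 19; do
  echo "nice $n -> kernel priority $(nice_to_kernel_prio "$n")"
done
```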
Here’s how we’d run a process with a lower scheduling precedence:
nice -n 19 our_expensive_cmd
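To confirm that a niceness setting took effect, we can lean on the fact that GNU coreutils `nice`, when run without a command, prints the niceness it inherited. This is a minimal sketch that assumes the GNU (or BusyBox) implementation; other `nice` implementations may require a command argument.

```shell
# With no arguments, GNU `nice` prints the niceness of the current process.
base=$(nice)

# The adjustment is added to the inherited niceness and is capped at 19.
lowest=$(nice -n 19 nice)

# Adjustments stack when `nice` invocations are nested.
stacked=$(nice -n 5 nice -n 5 nice)

echo "shell niceness:            $base"
echo "under 'nice -n 19':        $lowest"
echo "under two 'nice -n 5':     $stacked"
```

Because the adjustment is relative, a command started from an already-niced shell ends up with a higher niceness than the number passed to `-n` alone would suggest.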
On macOS, priorities given to
nice on the command-line, or via the C process management interface, exist on the interval
Darwin, the open source basis for macOS, has additional process priority concepts.
For example, a process with priority
PRIO_DARWIN_BG has these properties:
When a thread or process is in a background state the scheduling priority is set to the lowest value, disk IO is throttled (with behavior similar to using setiopolicy_np(3) to set a throttleable policy), and network IO is throttled for any sockets opened after going into background state. Any previously opened sockets are not affected.
We’ll use Darwin’s process priority concepts in the next section.
On Linux, we can set a task’s CPU affinity with taskset.
If you run a heterogeneous multiprocessing (HMP) setup, like those found in ARM big.LITTLE designs, this means you can pin background tasks to the little processor cores, or pin a CPU-intensive task you’d like to finish quickly to the big processor cores.
If you want to take advantage of a processor’s cache or a multicore processor’s shared cache, pinning can reduce the overhead of migrating a process between processors, such as extra cache misses and cache-coherency traffic.
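Here is a minimal sketch of pinning a command to a single CPU, assuming the util-linux `taskset` utility is installed. Which CPU numbers correspond to big or little cores is machine-specific, so the example simply picks the first CPU the current shell is allowed to use; the variable names are illustrative.

```shell
# CPUs the current shell may run on, for example "0-3" or "0,2".
allowed=$(taskset -cp $$ | sed 's/.*: //')

# Take the first CPU in that list so the example also works when the
# shell is confined to a subset of the machine's CPUs.
first_cpu=${allowed%%[,-]*}

# Start a child pinned to that CPU; the child reports its own affinity.
pinned=$(taskset -c "$first_cpu" sh -c 'taskset -cp $$' | sed 's/.*: //')

echo "allowed CPUs:    $allowed"
echo "child pinned to: $pinned"
```

In real use, the `sh -c '...'` part would be replaced by the command to pin, for example `taskset -c "$first_cpu" our_expensive_cmd`.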
XNU, the kernel on which Darwin is based, exposes a thread affinity API, but there isn’t a convenient taskset-like wrapper around it.
Network and Disk I/O Scheduling
The Linux kernel has a variety of input/output (I/O) queues. I/O is prioritized by each process’s given I/O scheduling class. The classes are idle, best effort and real time.
A process with the I/O scheduling class of idle will be given disk access when the resource is idle.
Best effort classes have priorities on the interval
[0, 7], with lower numbers given a higher priority. Best effort scheduling is done via round-robin.
Real time processes are given first access to the disk, have the same number of priority levels as best effort and can only be scheduled this way by root. Real time scheduled processes can starve other processes of resources. If you don’t know whether you should use real time I/O scheduling, you most likely don’t need it.
We can set I/O priority on Linux via ionice:
# run a command with `idle` disk priority
ionice -c idle our_expensive_cmd
# run a command with the lowest `best effort` disk priority
ionice -c best-effort -n 7 our_expensive_cmd
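To check which I/O class a process ended up with, `ionice -p PID` prints the class and priority of an existing process. The sketch below assumes the util-linux `ionice` and a kernel that permits setting I/O priorities; the variable names are illustrative.

```shell
# Ask a child process to report the I/O class it was started with.
idle_class=$(ionice -c idle sh -c 'ionice -p $$')
be_class=$(ionice -c best-effort -n 7 sh -c 'ionice -p $$')

echo "idle run reports:        $idle_class"
echo "best-effort run reports: $be_class"

# An already-running process can be demoted by PID as well.
sleep 30 &
bg_pid=$!
ionice -c idle -p "$bg_pid"
bg_class=$(ionice -p "$bg_pid")
kill "$bg_pid"

echo "background process now:  $bg_class"
```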
Linux does not have a convenient interface to set network I/O priority. There are many options one can implement to that end, perhaps using kernel network QoS policies or
iptables. I’m sure something involving network namespaces could be hacked together and would be pretty cool. However, someone built a userspace solution with a nice command-line interface, so why not use that?
Since Linux is well-documented, I will not go into its different tiers of I/O queues, but I will expound on the macOS equivalent when we get to them, as I was unable to find much documentation in macOS-land.
Unlike Linux, XNU-derived systems have the concept of network I/O priority classes for processes.
XNU-derived macOS does not have the same scheduling structure as Linux, but does have the ability to prioritize disk I/O.
If we look at the
setiopolicy_np() manpage, we’ll see that macOS has a handful of I/O priority classes:
IOPOL_IMPORTANT processes are given utmost priority.
IOPOL_STANDARD processes may be delayed in favor of IOPOL_IMPORTANT processes.
IOPOL_UTILITY processes are intended to be short-running and are throttled to favor the two priority classes above.
IOPOL_THROTTLE processes are intended to be long-running and are throttled to favor the above classes.
IOPOL_PASSIVE processes are intended for server-type processes for which I/O is the result of requests from client applications. The intent is that client application I/O will not be slowed down by server I/O initiated by the client. Other priority classes will ignore this class, giving the above classes priority over IOPOL_PASSIVE.
Darwin gives us a nice command-line interface around
Let’s take a look at how we’d run a process with the lowest CPU and I/O priority in Linux:
ionice -c idle nice -n 19 our_expensive_cmd
Here’s how we’d do something very similar to the above on macOS:
taskpolicy -b our_expensive_cmd
The -b flag runs our_expensive_cmd with the process priority PRIO_DARWIN_BG, which was touched upon in the previous section.
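The two invocations above can be folded into one helper for scripts that run on both systems. This is only a sketch: `run_low_priority` is a hypothetical function name, and the fallback to plain `nice` when `ionice` is missing or not permitted is an added assumption about what a reasonable default looks like.

```shell
# Hypothetical wrapper: run a command with the lowest CPU and I/O
# priority that the current platform's tools offer.
run_low_priority() {
  case "$(uname -s)" in
    Darwin)
      # Background state: lowest CPU priority plus throttled I/O.
      taskpolicy -b "$@"
      ;;
    *)
      # Probe ionice first so the wrapper still works where it is
      # missing or where setting an I/O class is not permitted.
      if command -v ionice >/dev/null 2>&1 && ionice -c idle true 2>/dev/null; then
        ionice -c idle nice -n 19 "$@"
      else
        nice -n 19 "$@"
      fi
      ;;
  esac
}
```

Usage mirrors the commands above: `run_low_priority our_expensive_cmd`.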
taskpolicy also allows us to set one of the five disk policies that were listed above. For example, if we wanted to run a process with the policy
IOPOL_THROTTLE and scope
IOPOL_SCOPE_PROCESS, we could run:
taskpolicy -d throttle our_expensive_cmd
If we ran taskpolicy with the -g flag, it would do the same as above, but with the scope IOPOL_SCOPE_DARWIN_BG.
Digging around the XNU kernel source, we will find that
IOPOL_SCOPE_DARWIN_BG is given the
TASK_POLICY_DARWIN_BG_IOPOL I/O policy flavor, which is used to decide the background process I/O tier.
We’ll also see that the default I/O policy for
IOPOL_UTILITY is given tier
THROTTLE_LEVEL_TIER2, which is one tier above
Interestingly enough, mobile XNU-derived systems like iOS have a slightly different I/O scheduling scheme than non-mobile systems, based on what I can glean from preprocessor directives.
Here’s a breakdown of tiers:
THROTTLE_LEVEL_TIER0, the default scheduling tier for
THROTTLE_LEVEL_TIER1, default for
THROTTLE_LEVEL_TIER2, default for
THROTTLE_LEVEL_TIER3, default for
If we try to give a process the scope of
IOPOL_SCOPE_DARWIN_BG, but give it any policy but
IOPOL_THROTTLE, we’ll get an error:
alex@mbp:~$ taskpolicy -g passive bash
taskpolicy: setiopolicy_np(...IOPOL_SCOPE_DARWIN_BG...): Invalid argument
At some point I’ll do a write up on XNU QoS latency and throughput tiers, but that’s for another day.
We took a look at the CPU and I/O prioritization options available on Linux and contrasted them with those available on macOS.
While Linux has utilities like
taskset, it lacks a convenient process-level network throttling option. macOS, on the other hand, has network and disk throttling built in and available behind the
taskpolicy command-line utility.
Setting processor affinity is as easy as issuing a shell command on Linux, but on macOS such functionality must be implemented at the application level, per application.