Background

Sometimes a task is heavily disk-, network- or CPU-bound, and will starve other tasks of those resources. Such a task should not be allowed to steal resources from more important work.

Ideally, resource-intensive tasks would run only when the resources they depend on are idle. Failing that, their resource consumption would be throttled in favor of other tasks.

In this article, we will review CPU and I/O scheduling at the process-level on both Linux and Darwin.

Process Scheduling

Process scheduling priority can be easily tuned with nice.

On Linux, niceness values given through the nice shell interface lie on the interval [-20, 19], and a process with a lower niceness has priority over one with a higher niceness. Internally, the kernel uses priorities on the interval [0, 139], of which only the last 40 (100 through 139) are available to users through nice.

Here’s how we’d run a process with a lower scheduling precedence:

nice -n 19 our_expensive_cmd
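As a quick check, the snippet below shows that nice -n adds its argument to the niceness the command would otherwise inherit. It is only a sketch, and it assumes GNU coreutils (or BusyBox), where nice run without a command prints the current niceness; the nice shipped with macOS and the BSDs may not behave this way. The variable names are illustrative.

```shell
# Niceness of the current shell (assumes GNU coreutils `nice`,
# which prints the current niceness when given no command).
base=$(nice)

# `nice -n 5` starts its command with the niceness raised by 5,
# capped at the maximum of 19.
lowered=$(nice -n 5 nice)

echo "shell niceness: $base"
echo "child niceness: $lowered"
```

Because the adjustment is relative, nice -n 19 always lands on the lowest priority regardless of the shell's starting niceness.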

On macOS, priorities given to nice on the command-line, or via the C process management interface, exist on the interval [-20, 20].

Darwin, the open source basis for macOS, has additional process priority concepts.

For example, the setpriority(2) manpage describes a process with priority PRIO_DARWIN_BG as follows:

When a thread or process is in a background state the scheduling priority is set to the lowest value, disk IO is throttled (with behavior similar to using setiopolicy_np(3) to set a throttleable policy), and network IO is throttled for any sockets opened after going into background state. Any previously opened sockets are not affected.

We’ll use Darwin’s process priority concepts in the Network and Disk I/O Scheduling section.

Processor Affinity

On Linux, we can set a task’s CPU affinity with taskset.

If you run a heterogeneous multiprocessing (HMP) setup, like those found in ARM big.LITTLE designs, this means you can pin background tasks to the little processor cores, or a CPU-intensive task you’d like to finish quickly to the big processor cores.

If you want to take advantage of a processor’s cache or a multicore processor’s shared cache, pinning can reduce the overhead of migrating a process between processors, such as cache misses and cache-coherency traffic.

XNU, the kernel on which Darwin is based, exposes a thread affinity API, but there isn’t a convenient taskset-like wrapper around it.
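Here is a minimal sketch of pinning with taskset on Linux. It assumes util-linux's taskset and a mounted /proc; the variable names are made up for illustration. Rather than hard-coding CPU 0, it picks the first CPU the current shell is allowed to use, since containers and cpusets can restrict the available set.

```shell
# Find the first CPU this shell may run on.
# `taskset -cp PID` prints something like "pid 123's current affinity list: 0-7".
first_cpu=$(taskset -cp $$ | sed -e 's/.*: *//' -e 's/[-,].*//')

# Run a command pinned to that single CPU and let it report its own affinity
# as seen by the kernel.
pinned=$(taskset -c "$first_cpu" grep Cpus_allowed_list /proc/self/status)

echo "first allowed CPU: $first_cpu"
echo "$pinned"
```

In practice the grep would be replaced by the command to pin, for example taskset -c "$first_cpu" our_expensive_cmd.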

Network and Disk I/O Scheduling

The Linux kernel has a variety of input/output (I/O) queues. I/O is prioritized by each process’s given I/O scheduling class. The classes are idle, best effort and real time.

A process with the I/O scheduling class of idle will be given disk access only when the resource is otherwise idle.

The best effort class has priorities on the interval [0, 7], with lower numbers given a higher priority. Best effort scheduling is done via round-robin.

The real time class is given first access to the disk, has the same number of priority levels as best effort and can only be used by root. Real time scheduled processes can starve other processes of resources. If you don’t know whether you should use real time I/O scheduling, you most likely don’t need it.

We can set I/O priority on Linux via ionice.

# run a command with `idle` disk priority
ionice -c idle our_expensive_cmd

# run a command with the lowest `best effort` disk priority
ionice -c best-effort -n 7 our_expensive_cmd
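To confirm that a class took effect, ionice can also report the class of a process. The sketch below assumes util-linux's ionice, which prints its own I/O class and priority when run without a command, and a kernel and sandbox that permit the ioprio_set system call. The variable names are illustrative.

```shell
# With no command, util-linux's ionice prints its own I/O class and priority,
# so wrapping it in another ionice shows what a child process would inherit.
idle_report=$(ionice -c idle ionice)
be_report=$(ionice -c best-effort -n 7 ionice)

echo "idle run:        $idle_report"
echo "best-effort run: $be_report"

# An already running process can be re-prioritized by PID; here, this shell.
ionice -c best-effort -n 7 -p $$
ionice -p $$
```

The -p form is handy for demoting a long-running job that was started without ionice.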

Linux does not have a convenient interface to set network I/O priority. There are many options one can implement to that end, perhaps using kernel network QoS policies or iptables. I’m sure something involving network namespaces could be hacked together and would be pretty cool. However, someone built a userspace solution with a nice command-line interface, so why not use that?

Since Linux is well-documented, I will not go into its different tiers of I/O queues, but I will expound on the macOS equivalent when we get to it, as I was unable to find much documentation in macOS-land.

Unlike Linux, XNU-derived systems have the concept of network I/O priority classes for processes.

XNU-derived macOS does not have the same scheduling structure as Linux, but does have the ability to prioritize disk I/O. If we look at the setiopolicy_np() manpage, we’ll see that macOS has a handful of I/O priority classes:

IOPOL_IMPORTANT processes are given utmost priority.
IOPOL_STANDARD processes may be delayed in favor of IOPOL_IMPORTANT.
IOPOL_UTILITY processes are intended to be short-running and are throttled to favor the two priority classes above.
IOPOL_THROTTLE processes are intended to be long-running and are throttled to favor the above classes.
IOPOL_PASSIVE processes are intended for server-type processes for which I/O is the result of requests from client applications. The intent is that client application I/O will not be slowed down by server I/O initiated by the client. Other priority classes will ignore this class, giving the above classes priority over IOPOL_PASSIVE.

Darwin gives us a nice command-line interface around setiopolicy_np() and setpriority(), called taskpolicy.

You can browse its source here, given that it’s part of the Darwin open-source project. Here’s a copy of the relevant taskpolicy manpage, since Apple took their copy down.

Let’s take a look at how we’d run a process with the lowest CPU and I/O priority on Linux:

ionice -c idle nice -n 19 our_expensive_cmd

Here’s how we’d do something very similar to the above on macOS:

taskpolicy -b our_expensive_cmd

The -b flag runs our_expensive_cmd with the process priority PRIO_DARWIN_BG, which was touched upon in the Process Scheduling section.
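The two invocations can be folded into one small wrapper that picks the right tool per platform. This is only a sketch: lowprio is a hypothetical function name, and the fallback branches (plain nice when ionice is missing or the platform is unknown) are my own assumption about sensible behavior, not something either system prescribes.

```shell
# Hypothetical helper: run a command at the lowest CPU and I/O priority
# the current platform offers.
lowprio() {
    case "$(uname -s)" in
        Darwin)
            # PRIO_DARWIN_BG: lowest CPU priority, throttled disk and network I/O.
            taskpolicy -b "$@"
            ;;
        Linux)
            if command -v ionice >/dev/null 2>&1; then
                ionice -c idle nice -n 19 "$@"
            else
                # Without util-linux's ionice, only lower the CPU priority.
                nice -n 19 "$@"
            fi
            ;;
        *)
            # Unknown platform: fall back to POSIX nice alone.
            nice -n 19 "$@"
            ;;
    esac
}
```

It would be used as lowprio our_expensive_cmd, with any arguments passed through unchanged.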

taskpolicy also allows us to set one of the five disk policies that were listed above. For example, if we wanted to run a process with the policy IOPOL_THROTTLE and scope IOPOL_SCOPE_PROCESS, we could run:

taskpolicy -d throttle our_expensive_cmd

If we ran taskpolicy with the -g flag, it would do the same as above, but with the scope IOPOL_SCOPE_DARWIN_BG.

Digging around the XNU kernel source, we will find that IOPOL_SCOPE_DARWIN_BG is given the TASK_POLICY_DARWIN_BG_IOPOL I/O policy flavor, which is used to decide the background process I/O tier.

We’ll also see that the default I/O policy for IOPOL_SCOPE_DARWIN_BG is IOPOL_UTILITY. IOPOL_UTILITY is given tier THROTTLE_LEVEL_TIER2, which is one tier above THROTTLE_LEVEL_TIER3.

Interestingly enough, mobile XNU-derived systems like iOS have a slightly different I/O scheduling scheme than non-mobile systems, based on what I can glean from preprocessor directives.

Here’s a breakdown of tiers:

THROTTLE_LEVEL_TIER0, default for IOPOL_IMPORTANT (also spelled IOPOL_NORMAL, its older name), IOPOL_DEFAULT and IOPOL_PASSIVE
THROTTLE_LEVEL_TIER1, default for IOPOL_STANDARD
THROTTLE_LEVEL_TIER2, default for IOPOL_UTILITY
THROTTLE_LEVEL_TIER3, default for IOPOL_THROTTLE

If we try to give a process the scope of IOPOL_SCOPE_DARWIN_BG, but give it any policy but IOPOL_UTILITY or IOPOL_THROTTLE, we’ll get an error:

alex@mbp:~$ taskpolicy -g passive bash
taskpolicy: setiopolicy_np(...IOPOL_SCOPE_DARWIN_BG...): Invalid argument

At some point I’ll do a write up on XNU QoS latency and throughput tiers, but that’s for another day.

Conclusion

We took a look at what CPU and I/O prioritization options are available on Linux, and contrasted them with those available on macOS.

While Linux has utilities like nice, ionice and taskset, it lacks a convenient process-level network throttling option. macOS, on the other hand, has network and disk throttling built-in and available behind the taskpolicy command-line utility.

Setting processor affinity is as easy as issuing a shell command on Linux, but on macOS such functionality must be implemented at the application level, per application.
