Learning Question

If many programs are running, how can they all make progress on a limited number of CPU cores?

A CPU core executes one instruction stream at a time. (With simultaneous multithreading, each hardware thread acts as a logical core in this sense.)

When more runnable work exists than available cores, the operating system scheduler decides which process or thread runs next.

The first mental model is:

Scheduling is the OS mechanism that shares CPU time among runnable execution contexts.

Running Versus Runnable

A process or thread can be runnable without currently running.

Running means it is actually executing on a CPU core at that moment.

Runnable means it is ready to run if the scheduler gives it CPU time.

Blocked means it cannot continue until some event occurs, such as I/O completion, a timer, a lock release, or child-process termination.

This distinction prevents a common confusion:

A program can be alive and ready but not currently executing.
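
The three states can be sketched as a small state machine. This is a simplified model, not how any particular kernel represents threads; the event names (dispatch, preempt, wait, wake) are hypothetical labels chosen for illustration.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    RUNNABLE = "runnable"
    BLOCKED = "blocked"


# Allowed transitions in this simplified model (event names are illustrative).
TRANSITIONS = {
    (State.RUNNABLE, "dispatch"): State.RUNNING,  # scheduler picks it
    (State.RUNNING, "preempt"): State.RUNNABLE,   # its turn on the CPU ends
    (State.RUNNING, "wait"): State.BLOCKED,       # it starts waiting for an event
    (State.BLOCKED, "wake"): State.RUNNABLE,      # the awaited event occurred
}


def next_state(state, event):
    """Return the new state, or raise if the event is invalid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


state = State.RUNNABLE
for event in ["dispatch", "wait", "wake", "dispatch", "preempt"]:
    state = next_state(state, event)
    print(f"{event:8} -> {state.name}")
```

Note that in this model a woken thread becomes runnable, not running: it still has to be chosen by the scheduler before it executes again.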

Time Slices

Operating systems commonly give runnable work limited turns on the CPU.

A time slice is a period during which one execution context runs before the scheduler may switch to another.

The exact policy varies by OS and scheduler configuration.

The important idea is that long-running CPU-bound work should not permanently prevent other runnable work from running.

Timer interrupts help the kernel regain control and make scheduling decisions.
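
A toy simulation can make the idea of time slices concrete. The sketch below assumes a single core and a plain round-robin policy with a fixed slice length; real schedulers are considerably more elaborate, and the function name and task names are made up for this example.

```python
from collections import deque


def round_robin(work, time_slice):
    """Simulate round-robin scheduling on one core.

    work: dict mapping a task name to the CPU time units it still needs.
    Returns the list of (task, units_run) turns in the order they happened.
    """
    queue = deque(work.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)      # run until the slice ends or the task finishes
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            queue.append((name, remaining))   # unfinished work goes to the back of the line
    return timeline


timeline = round_robin({"long": 5, "short": 1, "medium": 3}, time_slice=2)
for name, ran in timeline:
    print(f"{name} ran for {ran} unit(s)")
```

Even though the "long" task needs the most CPU time, the "short" task gets its turn after only one slice instead of waiting for "long" to finish.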

Context Switching

A context switch saves enough state from the outgoing execution context to resume it later, then restores the previously saved state of the incoming one.

That state may include:

  • instruction pointer
  • stack pointer
  • CPU registers
  • scheduling metadata
  • address-space context when switching between processes

After a context switch, the CPU continues executing the selected process or thread from the point where it was last paused.

From inside a program, this pause is usually invisible except through timing and concurrency effects.
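
Python generators offer a loose user-level analogy for pausing and resuming. When a generator yields, the interpreter keeps its position and local variables so that it can later continue exactly where it stopped. This is only an analogy: the tasks below pause voluntarily, whereas a kernel can preempt a thread at almost any instruction and saves hardware registers rather than Python frames. The function names are invented for this sketch.

```python
def counter(name, limit):
    """A task that pauses itself after each step by yielding."""
    total = 0
    for i in range(1, limit + 1):
        total += i
        # Execution stops here; `i` and `total` are preserved until resumed.
        yield f"{name}: step {i}, total={total}"


def interleave(tasks):
    """Resume each unfinished task in turn until all of them are done."""
    tasks = list(tasks)
    log = []
    while tasks:
        for task in list(tasks):      # iterate over a copy so removal is safe
            try:
                log.append(next(task))  # "switch to" this task for one step
            except StopIteration:
                tasks.remove(task)      # the task finished
    return log


log = interleave([counter("A", 2), counter("B", 3)])
for line in log:
    print(line)
```

Each task's running total is correct even though its steps are interleaved with another task's, which mirrors the point that a properly saved and restored context makes the pause invisible to the paused code.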

Blocking Gives Up the CPU

When a thread waits for I/O, a lock, a timer, or another event, it may block.

Blocked work is not runnable.

The scheduler can run something else.

This is why a program waiting for keyboard input does not need to spin on the CPU continuously.

The kernel records what the thread is waiting for and wakes it when the event occurs.

Scheduling is therefore connected to waiting:

The scheduler chooses among runnable work. Blocking changes what counts as runnable.
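
The following sketch illustrates blocking with Python's standard threading module. One thread waits on a threading.Event while the main thread sleeps; during that time neither thread should need meaningful CPU time. The 0.2 second delay is an arbitrary choice for the example, and exact timings will vary by system.

```python
import threading
import time

event = threading.Event()
woken = []


def waiter():
    event.wait()              # blocks until the event is set; no busy loop
    woken.append("waiter resumed")


thread = threading.Thread(target=waiter)

cpu_start = time.process_time()    # CPU time used by this process so far
wall_start = time.perf_counter()   # elapsed real time

thread.start()
time.sleep(0.2)                    # the main thread blocks on a timer
event.set()                        # the awaited event occurs; the waiter becomes runnable
thread.join()

cpu_used = time.process_time() - cpu_start
wall_used = time.perf_counter() - wall_start
print(f"wall time: {wall_used:.3f}s, CPU time: {cpu_used:.3f}s")
```

The elapsed wall time should be roughly the sleep duration, while the CPU time should be a small fraction of it, because both threads spent the interval blocked rather than executing.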

CPU-Bound and I/O-Bound Work

CPU-bound work spends most of its time executing instructions.

I/O-bound work spends most of its time waiting for external events such as disk, network, terminal input, or another process.

These programs behave differently under scheduling.

A CPU-bound program may use as much CPU time as the scheduler allows.

An I/O-bound program may run briefly, block, wake later, and run again.

Understanding this distinction helps explain why CPU usage, latency, and throughput are not the same thing.
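
One way to observe the difference is to compare elapsed wall-clock time with the CPU time a piece of work actually consumed. The sketch below uses time.perf_counter and time.process_time from the standard library; the helper names are made up, and time.sleep merely stands in for a real I/O wait.

```python
import time


def measure(fn):
    """Run fn and return (wall_seconds, cpu_seconds) for the call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


def cpu_bound():
    # Pure computation: the thread wants the CPU for the whole duration.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def io_like():
    # Stand-in for waiting on disk, network, or input: the thread is blocked.
    time.sleep(0.2)


cpu_wall, cpu_cpu = measure(cpu_bound)
io_wall, io_cpu = measure(io_like)

print(f"cpu_bound: wall={cpu_wall:.3f}s cpu={cpu_cpu:.3f}s")
print(f"io_like:   wall={io_wall:.3f}s cpu={io_cpu:.3f}s")
```

For the CPU-bound function the two numbers are usually close on a lightly loaded machine. For the waiting function, wall time is dominated by the wait and CPU time stays near zero, which is one reason low CPU usage does not by itself imply low latency.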

Priority and Fairness

Schedulers choose what runs next according to a policy.

They may consider priority, fairness, interactivity, CPU affinity, deadlines, or other factors.

This collection does not need the full details of a particular scheduler.

The useful programmer-facing point is:

Being runnable does not mean running immediately, and scheduling policy affects when progress happens.

Core Mental Model

The operating system makes CPU sharing possible by switching between runnable execution contexts and by removing blocked work from the runnable set until it can continue.

When reasoning about a slow or stuck program, ask:

Is it running on the CPU, runnable but waiting for CPU time, or blocked waiting for an event?

Final Summary

Scheduling lets multiple processes and threads share limited CPU cores.

The OS tracks running, runnable, and blocked states, uses context switching to resume execution contexts, and coordinates CPU time with I/O waits, timers, locks, and other runtime events.