Threads and the Thread Model

Open a web browser and watch what it does. One tab is rendering a page — laying out text, painting images — while another tab is quietly downloading a big file in the background. A spinner keeps spinning, a video keeps playing, and the whole time the window stays responsive to your clicks. All of that is happening inside a single program, seemingly at the same time. How?

The answer is threads. A process is a running program with its own private slice of memory. But a process does not have to do just one thing at a time. Inside it we can have many threads of execution — several independent "lines of work" all running within the same program, each following its own path through the code. This one idea is the whole of this page: a thread is a unit of execution within a process.

What exactly is a thread?

Think of a process as a well-stocked workshop: it has the tools (the program code), the shared supply of raw material (the global data and the heap), and the paperwork (open files and network connections). A thread is a single worker in that workshop. Add more workers and you get more done at once — but they are all in the same room, sharing the same tools and the same materials.

Each worker still needs a few things of their own: a personal notepad to track "where am I in my task" and "what am I holding right now". For a thread, that private kit is tiny but essential:

a program counter — which instruction this thread is about to execute;
a set of registers — the CPU values it is working with this instant;
a stack — its own trail of function calls and local variables.

Everything else — the code, the globals, the heap, the open files — belongs to the process and is shared by every thread in it. That sharing is what makes threads powerful, and, as we'll see, what makes them dangerous.

Shared vs. private: the heart of the model

The single most important thing to understand about threads is the split between what they share and what each keeps private. Get this table into your bones:

Shared by ALL threads in the process    | Private to EACH thread
------------------------------------------------------------------------------
Program code (the text segment)         | Program counter (PC)
Global and static variables             | CPU registers
The heap (dynamically allocated memory) | The stack (locals + call chain)
Open files and file descriptors         | Thread state (running / ready / …)
Network sockets                         | Thread-local storage
Process ID and address space            | Thread ID

Read the left column again: threads live in one shared address space. If thread A writes to a global variable or an object on the heap, thread B can see the change, because both are looking at the same bytes of memory. There is no copying, no message passing, and no protective wall between them. Exactly when B is guaranteed to see the write depends on synchronization, but nothing stops it from looking. That is the point, and the peril.
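To make the shared/private split concrete, here is a minimal sketch in Python. The names `shared_log` and `worker` are invented for this illustration. Each thread keeps its own local `total` on its own stack, while both append to one list that lives in the shared heap. The sketch relies on a single `list.append` being safe in CPython; compound updates are another matter.

```python
import threading

shared_log = []  # one list on the heap; every thread references the same object

def worker(name, count):
    # `total` is a local variable: it lives on this thread's own stack,
    # so each thread works with an independent copy.
    total = 0
    for i in range(count):
        total += i
    # Appending mutates the single shared list that all threads can reach.
    shared_log.append((name, total))

threads = [
    threading.Thread(target=worker, args=("A", 4)),
    threading.Thread(target=worker, args=("B", 5)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # once joined, the main thread can safely read the results

print(sorted(shared_log))  # [('A', 6), ('B', 10)]
```

Neither thread handed its result to the main thread explicitly. The main thread simply reads the same list the workers wrote to.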

Picture it as a diagram: one big process box holds the shared segments (code, globals, heap) that every thread reaches into, and each thread carries only its own small stack and program counter alongside. Add threads one at a time and the shared region stays put while private stacks pile up around it.

Threads vs. processes

We already have processes — why invent threads at all? Because giving every line of work its own process is expensive. Processes are deliberately isolated: each gets its own private address space, and the operating system works hard to keep them apart. Threads throw that isolation away on purpose, and win three things:

Concurrency & parallelism. Several threads can make progress together; on a multi-core CPU they can run genuinely at the same time, one per core.
Responsiveness. One thread can block on a slow download while another keeps the user interface alive. The browser tab stays clickable because a different thread is handling it.
Cheap sharing & cheap switching. Threads share memory for free (no copying), and switching between two threads of the same process is far cheaper than switching between two processes — as we'll now count out.

Two processes have separate address spaces; to communicate they must use explicit channels (pipes, sockets, shared-memory regions the OS sets up).
Two threads of the same process share one address space; they communicate simply by reading and writing the same variables.
Creating a thread and switching between threads is much lighter than doing the same with processes — which is why threads are sometimes called "lightweight processes".
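The responsiveness benefit can be sketched in a few lines of Python. This is an illustration under assumptions, not a real browser: `slow_download` is a made-up stand-in that just sleeps, and the loop in the main thread plays the part of the user interface. The two threads communicate through a shared `threading.Event`, with no pipe or socket in sight.

```python
import threading
import time

download_done = threading.Event()  # shared flag, visible to both threads

def slow_download():
    # Stand-in for a blocking network read (hypothetical; no real I/O here).
    time.sleep(0.2)
    download_done.set()

downloader = threading.Thread(target=slow_download)
downloader.start()

ui_ticks = 0
while not download_done.is_set():
    ui_ticks += 1     # the "UI" thread keeps handling events meanwhile
    time.sleep(0.01)
downloader.join()

print(f"UI handled {ui_ticks} ticks while the download was blocked")
```

The exact tick count varies from run to run. What matters is that the main thread kept making progress while the other one was blocked.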

Worked comparison: switching cost

When the CPU stops running one line of work and starts another, it performs a context switch: it saves the state of the old one and loads the state of the new one. The cost of that switch is the crux of why threads are "lightweight". Let's tally up what has to happen in each case.

Switching between two processes means changing which address space is active:

save the old process's registers and program counter;
switch the memory map — load a whole new set of page tables;
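flush the TLB (the CPU's cache of address translations), because the old entries now point to the wrong process's memory (CPUs that tag TLB entries with an address-space ID can avoid the full flush, but the new process still starts with few useful entries);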
the CPU caches are cold for the new process, so early accesses miss and stall.

Switching between two threads of the same process is much less work:

save the old thread's registers and program counter, load the new thread's;
that's essentially it — the address space is unchanged, so the page tables stay, the TLB stays valid, and the caches are still warm with the shared code and data.

The expensive steps (swapping page tables, flushing the TLB, refilling cold caches) are exactly the ones a thread switch largely skips, because both threads already share the same memory map. That is the main reason a thread switch is typically several times cheaper than a process switch, with the exact ratio depending on the hardware and the workload, and why creating a thread is far quicker than fork-ing a whole new process.
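The thread-switch case can be modelled with a toy simulation. Everything here is invented for illustration (`PROGRAM`, `ThreadContext`, `run_slice`), and no real CPU or scheduler works on Python dictionaries. The point is only to show that two threads can share one copy of the code while each keeps a private program counter and registers, so "switching" amounts to picking up a different saved context.

```python
from dataclasses import dataclass, field

# Shared by both toy threads: a single copy of the "code".
PROGRAM = [("add", 1), ("add", 2), ("add", 3)]

@dataclass
class ThreadContext:
    name: str
    pc: int = 0  # private program counter
    registers: dict = field(default_factory=lambda: {"acc": 0})  # private registers

def run_slice(ctx, steps):
    """Run up to `steps` instructions using the state saved in `ctx`."""
    for _ in range(steps):
        if ctx.pc >= len(PROGRAM):
            return
        op, arg = PROGRAM[ctx.pc]
        if op == "add":
            ctx.registers["acc"] += arg
        ctx.pc += 1

def finished(ctx):
    return ctx.pc >= len(PROGRAM)

contexts = [ThreadContext("A"), ThreadContext("B")]
while not all(finished(c) for c in contexts):
    for ctx in contexts:
        # A "thread switch": nothing about the program changes,
        # we just resume from another thread's saved pc and registers.
        run_slice(ctx, steps=2)

for ctx in contexts:
    print(ctx.name, ctx.pc, ctx.registers["acc"])  # A 3 6, then B 3 6
```

Notice what the loop never touches: `PROGRAM` itself. A process switch would be the equivalent of swapping out the program and its data as well, which is where the extra cost comes from.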

Threads are cheap, not free. Each one still needs its own stack (often somewhere between half a megabyte and 8 MB of address space reserved by default, depending on the platform), plus kernel bookkeeping and a slot in the scheduler. Spawn tens of thousands and you burn memory and drown the scheduler in switching overhead: at 1 MiB each, 10,000 threads reserve nearly 10 GiB of address space for stacks alone. This is exactly the pressure that led to thread pools (reuse a fixed set of workers) and, later, lightweight async models and green/virtual threads that pack thousands of logical tasks onto a handful of real OS threads.
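A thread pool is easy to sketch with Python's standard `concurrent.futures` module. The `task` function and the choice of four workers are assumptions made for this example. A thousand small jobs are submitted, but they all run on at most four reusable threads rather than a thousand fresh ones.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def task(n):
    # Report which worker thread ran this task, plus a small result.
    return threading.get_ident(), n * n

# Four workers are created at most; each one is reused for many tasks.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(task, range(1000)))

worker_ids = {ident for ident, _ in results}
squares = [value for _, value in results]
print(f"{len(squares)} tasks ran on {len(worker_ids)} worker thread(s)")
```

The number of workers reported can be anywhere from one to four, since the pool only starts a new thread when it needs one. The cost of creating a thread and reserving its stack is paid a handful of times instead of once per task.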

User-level vs. kernel-level threads

Who actually knows about a thread and schedules it? There are two answers, and real systems blend them.

Kernel-level threads are created and scheduled by the operating system itself. The kernel sees every thread, so it can run different threads of one process on different cores at the same time, and if one thread blocks (say, waiting on disk) the others keep running. The catch: every create, destroy, and switch is a trip into the kernel, which costs a little.
User-level threads are managed by a library inside the process; the kernel sees only the single process and knows nothing of the threads within. Switching is blisteringly fast (no kernel crossing), but because the kernel schedules the process as one unit, a pure user-level implementation can use only one core at a time, and if one thread makes a blocking system call, the kernel blocks the entire process, freezing all the others.

Because each has a weakness the other fixes, systems often use a hybrid (many-to-many) model: many user-level threads are multiplexed onto a smaller pool of kernel threads, aiming for fast switching and true parallelism. Modern OSes lean on kernel threads for the real parallelism and let language runtimes layer user-level scheduling on top.
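Language runtimes that layer their own scheduling on top of kernel threads can be illustrated with Python's `asyncio`. Treat this as a loose analogy: coroutines are cooperative tasks rather than full user-level threads, and `logical_task` is a name invented here. Still, the sketch shows the many-to-one idea, with a thousand logical tasks all scheduled in user space onto a single OS thread.

```python
import asyncio
import threading

async def logical_task(n):
    # Yield to the event loop: a cooperative switch with no kernel scheduling.
    await asyncio.sleep(0)
    return threading.get_ident(), n

async def main():
    return await asyncio.gather(*(logical_task(n) for n in range(1000)))

results = asyncio.run(main())
os_threads = {ident for ident, _ in results}
print(f"{len(results)} logical tasks ran on {len(os_threads)} OS thread(s)")
# 1000 logical tasks ran on 1 OS thread(s)
```

The weakness described above shows up here too. If one of these tasks made a blocking call instead of awaiting, every other task on that thread would stall with it.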

The danger hiding in the sharing

Shared memory is the gift and the curse. Because two threads can touch the same variable at the same instant, their updates can interleave in ways you never intended. Picture two threads both running balance = balance + 100: each reads the old balance, adds 100, and writes back — but if they read before either has written, one of the two deposits is simply lost. This is a race condition, and it is the reason a whole toolkit — locks, mutexes, semaphores, atomics — exists to synchronize access to shared data. That toolkit is a topic all its own; for now, just carry away the warning.
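The lost deposit can be reproduced on demand. In the sketch below (all names are hypothetical), a `threading.Barrier` deliberately forces the unlucky interleaving: both threads read the balance before either writes. Real races are not this polite; they appear only occasionally, which is what makes them hard to find. The second function shows one possible fix, wrapping the read-modify-write in a lock.

```python
import threading

def lost_update_demo():
    """Force the bad interleaving: both threads read before either writes."""
    account = {"balance": 0}
    both_have_read = threading.Barrier(2)

    def deposit():
        old = account["balance"]        # step 1: read the current balance
        both_have_read.wait()           # hold until the other thread has read too
        account["balance"] = old + 100  # step 2: write back a stale sum

    threads = [threading.Thread(target=deposit) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account["balance"]

def locked_demo():
    """Same two deposits, but each read-modify-write holds a lock."""
    account = {"balance": 0}
    lock = threading.Lock()

    def deposit():
        with lock:  # only one thread at a time may run this block
            account["balance"] = account["balance"] + 100

    threads = [threading.Thread(target=deposit) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account["balance"]

print(lost_update_demo())  # 100: one deposit was lost
print(locked_demo())       # 200: both deposits counted
```

Two deposits of 100 went in both times. Only the locked version ends with 200.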

The classic beginner mistake is to imagine that threads have separate memory, like processes do. They do not. Two processes are walled off from each other; two threads of one process share the same heap and the same globals. That shared address space is the entire point of threads — and it is precisely why they are dangerous.

So when you write multithreaded code, assume that any global or heap object can be read and written by another thread at any moment. A thread's stack and its thread-local storage are the only memory it normally has to itself, and even that holds only as long as no pointer or reference to it is handed to another thread. If two threads touch the same shared data and at least one of them writes, you have a potential race, and you must add synchronization to make it safe.