Open a web browser and watch what it does. One tab is rendering a page — laying out text, painting images — while another tab is quietly downloading a big file in the background. A spinner keeps spinning, a video keeps playing, and the whole time the window stays responsive to your clicks. All of that is happening inside a single program, seemingly at the same time. How?
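The answer is threads. A thread is a single flow of execution inside a process, and one process can run many of them at once.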
Think of a process as a well-stocked workshop: it has the tools (the program code), the shared supply of raw material (the global data and the heap), and the paperwork (open files and network connections). A thread is a single worker in that workshop. Add more workers and you get more done at once — but they are all in the same room, sharing the same tools and the same materials.
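Each worker still needs a few things of their own: a personal notepad to track "where am I in my task" and "what am I holding right now". For a thread, that private kit is tiny but essential: a program counter (where am I), a set of CPU registers (what am I holding), and a stack of its own for local variables and function calls.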
Everything else — the code, the globals, the heap, the open files — belongs to the process and is shared by every thread in it. That sharing is what makes threads powerful, and, as we'll see, what makes them dangerous.
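The single most important thing to understand about threads is the split between what they share and what each keeps private. Get this table into your bones:

Shared by every thread in the process | Private to each thread
Code (the program text) | Program counter
Global data | CPU registers
Heap | Stack
Open files and network connections |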
Read the left column again: threads live in one shared address space. If thread A writes to a global variable or an object on the heap, thread B sees the change immediately, because they are literally looking at the same bytes of memory. There is no copying, no message passing, no barrier between them. That is the point — and the peril.
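Here is a minimal sketch of that sharing, using Python's standard `threading` module. The names `settings`, `thread_a`, and `thread_b` are made up for illustration. Thread A writes to a heap object, and thread B, started afterwards, reads the very same object without any copying.

```python
import threading

settings = {"volume": 1}   # one object on the heap, reachable from every thread
seen_by_b = []             # where thread B records what it observed

def thread_a():
    settings["volume"] = 11               # write to the shared object

def thread_b():
    seen_by_b.append(settings["volume"])  # read the same bytes thread A wrote

a = threading.Thread(target=thread_a)
a.start()
a.join()                   # wait for A first, so the demo is deterministic

b = threading.Thread(target=thread_b)
b.start()
b.join()

print(seen_by_b)           # [11]
```

The `join()` before starting thread B is only there to fix the order of events. Remove it and B might read either the old or the new value, which is a first taste of the danger.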
Picture the layout as a diagram: one big process box holds the shared segments (code, globals, heap) that every thread reaches into, and each thread carries only its own little stack and program counter alongside. Add threads one at a time and the shared region stays put while private stacks pile up around it.
We already have processes — why invent threads at all? Because giving every line of work its own process is expensive. Processes are deliberately isolated: each gets its own private address space, and the operating system works hard to keep them apart. Threads throw that isolation away on purpose, and win three things:
When the CPU stops running one line of work and starts another, it performs a context switch: it saves the state of the old one and loads the state of the new one. The cost of that switch is the crux of why threads are "lightweight". Let's tally up what has to happen in each case.
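Switching between two processes means changing which address space is active: the kernel saves the old process's registers and program counter, swaps in the new process's page tables, flushes the TLB, and then resumes the new process with caches that are cold for its memory.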
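Switching between two threads of the same process is much less work: save one thread's registers, program counter, and stack pointer, then load the other's. The address space stays exactly as it was.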
The expensive steps (swapping page tables, flushing the TLB, running on cold caches) are exactly the ones a thread switch skips, because both threads already share the same memory map. That is the whole reason a thread switch can be an order of magnitude cheaper than a process switch, and why creating a thread is far quicker than forking a whole new process.
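The tally can be written down as a toy model. This is not kernel code: the step names and the `switch_steps` helper are invented for illustration, and a real kernel does this work in architecture-specific code. The sketch only shows which steps drop out when both threads belong to the same process.

```python
# Toy model only: steps are labels, not real kernel operations.
PER_THREAD_STEPS = [
    "save old registers, program counter, stack pointer",
    "load new registers, program counter, stack pointer",
]
ADDRESS_SPACE_STEPS = [
    "switch page tables",
    "flush TLB",
    "run with cold caches",
]

def switch_steps(old, new):
    """List the steps for switching from thread `old` to thread `new`.

    Each thread is described by a (process_id, thread_name) pair.
    """
    steps = list(PER_THREAD_STEPS)
    if old[0] != new[0]:            # different process means different address space
        steps += ADDRESS_SPACE_STEPS
    return steps

same_process = switch_steps((1, "A"), (1, "B"))
cross_process = switch_steps((1, "A"), (2, "A"))
print(len(same_process), len(cross_process))  # 2 5
```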
Threads are cheap, not free. Each one still needs its own stack (the default reservation is often a megabyte or more, depending on the platform) plus kernel bookkeeping and a slot in the scheduler. Spawn tens of thousands and you burn memory and drown the scheduler in switching overhead. This is exactly the pressure that led to thread pools (reuse a fixed set of workers) and, later, lightweight async models and green/virtual threads that pack thousands of logical tasks onto a handful of real OS threads.
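A thread pool is easy to sketch with Python's standard `concurrent.futures` module. The `handle` function and the choice of four workers and a thousand tasks are arbitrary assumptions for the demo. The point is that the tasks far outnumber the threads that run them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def handle(task_id):
    # Return the result plus the name of the worker thread that ran this task.
    return task_id * task_id, threading.current_thread().name

# Four reusable workers serve a thousand tasks; no thread is created per task.
with ThreadPoolExecutor(max_workers=4) as pool:
    outcomes = list(pool.map(handle, range(1000)))

squares = [square for square, _ in outcomes]
workers_used = {name for _, name in outcomes}
print(len(squares), "tasks ran on", len(workers_used), "worker thread(s)")
```

How many distinct workers actually show up depends on timing, but it can never exceed `max_workers`.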
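Who actually knows about a thread and schedules it? There are two answers, and real systems blend them. User-level threads are managed by a library or language runtime in user space: switching between them is fast because the kernel is not involved, but the kernel sees only one schedulable entity, so they cannot run in parallel on several cores and one blocking system call can stall them all. Kernel-level threads are created and scheduled by the operating system itself: they can run truly in parallel and block independently, but creating and switching them costs a trip into the kernel.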
Because each has a weakness the other fixes, systems often use a hybrid (many-to-many) model: many user-level threads are multiplexed onto a smaller pool of kernel threads, aiming for fast switching and true parallelism. Modern OSes lean on kernel threads for the real parallelism and let language runtimes layer user-level scheduling on top.
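Python's `asyncio` gives a rough feel for user-level scheduling layered on top of a kernel thread. Treat it as an analogy rather than a true many-to-many implementation: here a thousand logical tasks are multiplexed onto a single OS thread by the event loop, and the names `logical_task` and `main` are made up for the sketch.

```python
import asyncio
import threading

async def logical_task(task_id, thread_ids):
    await asyncio.sleep(0)                  # hand control back to the event loop
    thread_ids.add(threading.get_ident())   # record which OS thread is running us
    return task_id

async def main(n):
    thread_ids = set()
    results = await asyncio.gather(
        *(logical_task(i, thread_ids) for i in range(n))
    )
    return results, thread_ids

results, thread_ids = asyncio.run(main(1000))
print(len(results), "logical tasks ran on", len(thread_ids), "OS thread(s)")
# 1000 logical tasks ran on 1 OS thread(s)
```

Switching between these tasks never enters the kernel scheduler, which is the fast-switching half of the trade. The missing half is parallelism: with one OS thread underneath, only one task runs at any instant.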
Shared memory is the gift and the curse. Because two threads can touch the same variable at the same instant, their updates can interleave in ways you never intended. Picture two threads both running balance = balance + 100: each reads the old balance, adds 100, and writes back. If both read before either has written, one of the two deposits is simply lost. This is a race condition, and it is the reason a whole toolkit (locks, mutexes, semaphores, atomics) exists to synchronize access to shared data. That toolkit is a topic all its own; for now, just carry away the warning.
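The lost deposit can be reproduced on demand. In real code the bad interleaving happens only occasionally, so this sketch forces it with a `threading.Barrier` that makes both threads finish reading before either is allowed to write. The barrier is a demo device, not something the original scenario contains.

```python
import threading

balance = 0
both_have_read = threading.Barrier(2)   # demo device: pins down the bad interleaving

def deposit(amount):
    global balance
    old = balance               # step 1: read the current balance
    both_have_read.wait()       # neither thread writes until both have read
    balance = old + amount      # step 2: write back a result based on a stale read

threads = [threading.Thread(target=deposit, args=(100,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)                  # 100, not 200: one deposit was lost
```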
The classic beginner mistake is to imagine that threads have separate memory, like processes do. They do not. Two processes are walled off from each other; two threads of one process share the same heap and the same globals. That shared address space is the entire point of threads — and it is precisely why they are dangerous.
So when you write multithreaded code, assume that any global or heap object can be read and written by another thread at any moment. The private stack is the only memory a thread truly has to itself. If two threads touch the same shared data and at least one of them writes, you have a potential race — and you must add synchronization to make it safe.
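As one possible way to add that synchronization, here is a minimal sketch using `threading.Lock` from Python's standard library. The names `balance_lock` and `deposit`, and the thread and iteration counts, are assumptions for the demo. The lock turns read, add, and write back into a single step that only one thread can perform at a time.

```python
import threading

balance = 0                         # shared: any thread can read and write it
balance_lock = threading.Lock()     # guards every access to `balance`

def deposit(amount, times):
    global balance
    for _ in range(times):          # loop counter is local, private to this thread
        with balance_lock:          # only one thread at a time gets past this line
            balance = balance + amount

threads = [threading.Thread(target=deposit, args=(100, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)                      # 4000000: no deposit is lost
```

Note what needed protection and what did not: `balance` is shared and is only touched while holding the lock, while `amount`, `times`, and the loop counter live on each thread's own stack and need no protection at all.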