Week 2 - Processes and Process Management

Process

A process is an instance of an executing program. It has some state in memory such as:

  • The code that is being run,
  • The initial data it started with,
  • The heap associated with the application, and
  • The stack associated with the application.

This all lives in the process's address space.

Aside

Heap (OS)

The heap of a process is dynamic memory that is allocated at run time. It is used to store data whose size may vary dramatically depending on what the application is run on, for example when reading data into memory. This memory stays allocated until it is explicitly de-allocated. The heap can therefore carry considerable overhead and require techniques such as garbage collection or custom allocators to manage.

Stack (OS)

The stack of an application is a LIFO (last in, first out) structure of stack frames. Each frame contains a function's parameters, local variables, and return address. A frame is added when a function is called and removed once the function completes. The stack acts as the control flow for a process, determining where to return to once a function has completed. The stack has a fixed maximum size when a process starts, and growing beyond that size causes a stack overflow.
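
As a rough illustration of the push-on-call, pop-on-return behaviour described above, here is a minimal sketch in Python. The names `Frame`, `CallStack`, and `max_frames` are hypothetical and chosen only for this example; a real stack is a region of memory managed by the compiler and OS, not a list of objects.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One stack frame (hypothetical layout for illustration)."""
    function: str
    params: dict
    return_address: int
    local_vars: dict = field(default_factory=dict)


class CallStack:
    """A toy call stack with a fixed maximum number of frames."""

    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.frames = []

    def call(self, function, params, return_address):
        # Calling a function pushes a new frame on top of the stack.
        if len(self.frames) >= self.max_frames:
            raise OverflowError("stack overflow")
        self.frames.append(Frame(function, params, return_address))

    def ret(self):
        # Returning pops the most recent frame (last in, first out)
        # and tells the caller where execution should resume.
        return self.frames.pop().return_address


if __name__ == "__main__":
    stack = CallStack(max_frames=2)
    stack.call("main", {}, return_address=0)
    stack.call("helper", {"x": 1}, return_address=42)
    print(stack.ret())  # the frame pushed last is the first one removed
```

The frame pushed last is always the first to be removed, which is why the stack can tell a process where to return to after each call.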

Address space (OS)

For a process, the address space is the virtual memory associated with the executing program. This makes memory management within the program simple and leaves the handling of physical memory to the OS.

Metaphor

A process is like an order of toys:

  • State of execution:
    • Completed,
    • Waiting, or
    • In progress.
  • Parts and temporary holding area:
    • Pieces used to make the toy, or
    • Containers to put the pieces in.
  • May require special hardware:
    • Sewing machine, or
    • Glue gun.

This is analogous to an OS, where a process has:

  • State of execution:
    • Program counter, or
    • Stack.
  • Parts and temporary holding area:
    • Data, or
    • Register state in memory.
  • May require special hardware:
    • I/O, or
    • Access to sound output.

Process execution state

For the OS to stop and start running processes, it must keep track of what each one is doing. All this information is stored in the PCB:

Process control block (PCB)

A process control block is a data structure that holds the state for a process. This includes, but is not limited to:

  • Process identification (PID),
    • Of both the process and its parent, if one exists.
  • Process state,
  • Program counter,
  • CPU registers,
  • Memory management information,
  • Scheduling information,
  • Accounting information,
    • CPU usage, elapsed time, user/system time.
  • I/O status,
  • Process privileges, and
  • Process metadata.

This block is fully instantiated when a process starts, but it is frequently updated while the process is executing. It is the job of the OS to keep it up to date and correct, as the OS needs it whenever it starts and stops processes.
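
To make the shape of this data structure concrete, here is a minimal sketch of a PCB as a Python dataclass. The field names and types are assumptions made for illustration and cover only part of the list above; real kernels use much larger structures with their own layouts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PCB:
    """A simplified, hypothetical process control block."""
    pid: int
    parent_pid: Optional[int] = None   # None if the process has no parent
    state: str = "new"                 # process state
    program_counter: int = 0           # next instruction to execute
    registers: dict = field(default_factory=dict)   # saved CPU registers
    priority: int = 0                  # scheduling information
    cpu_time_used: float = 0.0         # accounting information
    open_files: list = field(default_factory=list)  # I/O status


if __name__ == "__main__":
    pcb = PCB(pid=2, parent_pid=1)
    # The OS updates the block as the process executes.
    pcb.state = "running"
    pcb.program_counter += 4
    print(pcb)
```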

Switching process

While a given process is running, the CPU holds that process's state from its PCB in the CPU registers. If the CPU were to suspend that process, it would have to write that state back to the PCB in memory and load the new process's PCB into the CPU registers. This is called a context switch.

Context switch (CPU)

A context switch is when the CPU goes from running one process to running a different one. This involves writing the old process's PCB into memory, fetching the new process's PCB from memory, and loading it into the CPU registers.

Context switching is costly for two reasons:

  • Direct costs: These come from physically having to write the PCB from the CPU registers into memory and load the next one back.
  • Indirect costs: The CPU has multiple layers of caches. When switching from one process to another, the data in these caches belongs to the old process and has to be replaced, so the new process's data accesses are temporarily much slower.
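
The save-and-restore step behind the direct cost can be sketched as follows. This is a toy model only: `CPU`, `PCB`, and `context_switch` are hypothetical names, and a real context switch is performed by kernel code on hardware registers.

```python
from dataclasses import dataclass, field


@dataclass
class PCB:
    """Only the fields needed for this sketch."""
    pid: int
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


@dataclass
class CPU:
    """A toy CPU holding the state of whichever process is running."""
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


def context_switch(cpu, old, new):
    # Save the running process's state from the CPU into its PCB.
    old.program_counter = cpu.program_counter
    old.registers = dict(cpu.registers)
    # Load the next process's saved state from its PCB onto the CPU.
    cpu.program_counter = new.program_counter
    cpu.registers = dict(new.registers)


if __name__ == "__main__":
    a = PCB(pid=1)
    b = PCB(pid=2, program_counter=100, registers={"r0": 7})
    cpu = CPU(program_counter=40, registers={"r0": 3})
    context_switch(cpu, old=a, new=b)
    print(a, cpu)
```

The indirect cost is not modelled here, since it comes from cache contents rather than from anything the PCB stores.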

Process life cycle

During its lifetime a process goes through several states.

  • New: Once the user asks for a process to start, a PCB is created and the process is admitted by the OS.
  • Ready: The process has something to do but is not being run on the CPU yet.
  • Waiting: If the process has to wait on some event from the network or I/O, it is moved into a waiting state until that event finishes.
  • Running: The process has been context switched onto the CPU and its PCB loaded into the CPU registers.
  • Terminated: Once a process has exited or hit an error, it moves to the terminated state to be cleaned up.
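
The states above can be read as a small state machine. The sketch below assumes the conventional set of transitions (for example, a waiting process returns to ready rather than straight to running); the `ALLOWED` table and the `transition` helper are hypothetical names for this example.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


# Assumed transition table for the five-state model.
ALLOWED = {
    State.NEW: {State.READY},                    # admitted
    State.READY: {State.RUNNING},                # dispatched by the scheduler
    State.RUNNING: {State.READY,                 # preempted
                    State.WAITING,               # blocked on I/O or an event
                    State.TERMINATED},           # exited or hit an error
    State.WAITING: {State.READY},                # event completed
    State.TERMINATED: set(),
}


def transition(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


if __name__ == "__main__":
    state = State.NEW
    for nxt in (State.READY, State.RUNNING, State.WAITING, State.READY):
        state = transition(state, nxt)
    print(state)
```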

Creation

When you start the computer, the OS starts a number of processes that have privileged access. These in turn create the applications that you run on your computer. There are two system calls used to create a new process:

  • Fork: This creates a copy of the current process (with its own PID), including the program counter.
  • Exec: This replaces the program a process is running with a new program, resetting the process's state to that of the new program.

The normal flow for a process to start another one is to call fork followed by exec.
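
A minimal sketch of the fork-then-exec flow, using Python's `os` wrappers around the underlying system calls. It assumes a Unix-like system (`os.fork` is not available on Windows); the helper name `spawn` and the exit code 127 for a failed exec are choices made for this example.

```python
import os
import sys


def spawn(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: an almost exact copy of the parent, resuming at this point.
        try:
            # Replace the copied program with the new one. On success
            # this call never returns.
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; leave without running
            # the parent's cleanup handlers.
            os._exit(127)
    # Parent: wait for the child to finish and collect its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = spawn([sys.executable, "-c", "print('hello from the child')"])
    print("child exited with", code)
```

Splitting creation into two calls lets the child adjust its own environment (for example, file descriptors) between the fork and the exec.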

CPU Scheduler

This is the part of the OS that determines which of the ready processes will be dispatched to the CPU next and how long it should run for. This is done via three operations:

  • Preempt: Interrupt and save the current context.
  • Schedule: Run the scheduler to choose the next process.
  • Dispatch: Dispatch a process and switch to its context.

An efficient OS wants to spend as much time as possible running the processes the user wants to run and as little time as possible on the three operations above.

There are two important decisions to make when designing a scheduler:

  • How long should processes run for?
  • What metrics should be used to choose the next process to run?
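
As one possible answer to these two questions, the sketch below simulates a round-robin policy: every process runs for at most a fixed time slice, and the next process is simply the one at the front of the ready queue. Round-robin is used here only as an illustration and is not named in the notes above; `round_robin`, `burst_times`, and `time_slice` are hypothetical names.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Return the list of (pid, time_ran) slices in dispatch order.

    burst_times maps each pid to the total CPU time it still needs.
    """
    ready = deque(burst_times.items())
    timeline = []
    while ready:
        # Schedule and dispatch: take the process at the front of the queue.
        pid, remaining = ready.popleft()
        ran = min(time_slice, remaining)
        timeline.append((pid, ran))
        if remaining > ran:
            # Preempt: the slice expired, so the process rejoins the queue.
            ready.append((pid, remaining - ran))
    return timeline


if __name__ == "__main__":
    print(round_robin({"A": 3, "B": 1}, time_slice=2))
    # [('A', 2), ('B', 1), ('A', 1)]
```

A shorter time slice makes the system more responsive but spends a larger share of time on preempting, scheduling, and dispatching.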

I/O scheduling

When a process is stopped by an I/O operation, the operation is handled by the device driver associated with that I/O task. The process enters the waiting state until the device driver interrupts the CPU to let it know the operation has completed, at which point the process can move back to the ready state. There are other ways this waiting state can end, for example a timeout.

Inter-process communication

As modern applications get more complex, they are being structured as multiple processes communicating with one another. However, the OS is purposely structured to isolate different applications from one another. Therefore they need to communicate with each other using IPC.

Inter-process communication (IPC)

Inter-process communication is the method or API by which different processes can communicate with one another. There are four main methods to communicate messages between two processes.

  1. Message-passing IPC: The OS offers an API to pass messages between processes. The OS puts each message on a message bus, which delivers it to the other process. This has the advantage that it is managed by the OS and is safe, though it has the disadvantage of going through the OS, which incurs a lot of overhead.
  2. Shared memory IPC: This lets two processes share some physical memory, which is mapped into both of their virtual address spaces. This means the OS is out of the way, but the two processes must agree on how to use that shared memory with one another, sometimes having to re-implement code that the OS would otherwise provide.
  3. Higher level semantics: Such as shared files or Remote Procedure Calls (RPC).
  4. Synchronization: Methods by which two processes can synchronize so as not to adversely affect one another's operation. Examples are mutexes or semaphores.
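
As a small example of the message-passing style, the sketch below sends bytes from a child process to its parent through a pipe, a channel that the OS kernel manages on behalf of both processes. It assumes a Unix-like system, and the helper name `child_sends` is hypothetical.

```python
import os


def child_sends(message):
    """Fork a child that writes message into a pipe; return what the parent reads."""
    read_fd, write_fd = os.pipe()  # the OS creates and owns the channel
    pid = os.fork()
    if pid == 0:
        # Child: only writes, so it closes its copy of the read end.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)
    # Parent: only reads, so it closes its copy of the write end.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # empty read: every write end has been closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks)


if __name__ == "__main__":
    print(child_sends(b"hello from the child"))
```

Every byte passes through the kernel on its way from one process to the other, which is the overhead that shared memory IPC avoids.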