04 - Computer Hardware Review

Outline
- Why Review Hardware?
- Hardware Model
- Processor
- Memory
- I/O Devices
- Buses

Announcements
- Reading: MOS 1.3
- Make sure to join the Discord!
- PA 1 is due Friday. Start today if you haven't started yet!

Why Review Hardware?

The OS is fundamentally tied to hardware.
- Must manage hardware resources for user
- Must provide system calls for interaction

We saw from OS history that the entire purpose of an OS is to deal directly with hardware so programmers don't have to.

Hardware Model

Simplified model of a PC:
A computer is composed of a CPU, memory, and I/O devices, all connected by a system bus.
Modern computers are more complicated, but this is a decent enough model for us.

[Figure, based on MOS Figure 1-6. Labels: CPU, MMU, Memory, Video Controller, Keyboard Controller, USB Controller, HDD Controller, Bus, On-board, I/O Devices. MMU = Memory Management Unit]

Processor

The Central Processing Unit (CPU) performs all computations.
- Fetches instructions from memory, executes instructions, and updates memory
- Each CPU has an Instruction Set Architecture (ISA) that defines the machine code format. ISAs are not portable.

[Figure: A simplified single-stage RISC-V processor datapath, P&H Figure 4.17]
Patterson, D. A., & Hennessy, J. L. (2018). Computer Organization and Design RISC-V Edition: The Hardware/Software Interface. Morgan Kaufmann. ISBN: 978-0128122754

Registers

CPU context (state) is held in registers.
- General registers for data
- Program counter for next instruction address
- Stack pointer for top stack address
- Status registers for execution information

A register is a small, very fast unit of memory used to store data during active computation.

The CPU may need to switch between executing multiple different programs.
This is called a context switch.
During a context switch, all the important registers need to be saved to and restored from kernel memory so programs execute correctly.

Pipelined CPUs can process more than one instruction at a time, in stages, for better performance.
Superscalar CPUs improve performance further with multiple execution units, which may execute instructions out of order.
Multithreading/hyperthreading processors incorporate hardware to support highly performant context switching.
Multicore processors have multiple independent CPU cores in a single chip, allowing multiprocessing. (see MOS Figure 1-8)

[Figure: Simplified 3-stage pipeline, based on MOS Figure 1-7a. Labels: Fetch Unit, Decode Unit, Execute Unit]
[Figure: Superscalar CPU, from MOS Figure 1-7b. Labels: Fetch Unit (x2), Decode Unit (x2), Holding Buffer, Execute Unit (x3)]

Memory

There are multiple different types of memory in a computer, used for different purposes.
They are often arranged in a hierarchy organized by size, performance, and cost.

Level                Size             Throughput    Latency
Registers            ~16 kB           ~256 GB/s     ~250 ps
L1 (Cache)           ~32 kB           ~64 GB/s      ~1 ns
L2 (Cache)           ~1 MB            ~18 GB/s      ~3 ns
L3 (Cache)           ~16-32 MB        ~5 GB/s       ~10 ns
RAM/Main (Primary)   ~4-64 GB         ~60 GB/s      ~100 ns
SSD (Secondary)      ~256 GB - 2 TB   ~2 GB/s       ~200 us
HDD (Secondary)      ~1-18 TB         ~500 MB/s     ~4 ms

Toward the bottom: lower cost + larger size. Toward the top: higher perf.
Values based on Wikipedia: Memory Hierarchy

Caches

A cache temporarily holds data fetched from slower memory in faster memory.
- Hardware cache (e.g. L1, L2, L3) - managed by hardware, out of scope
- Software cache (e.g. page cache, file cache)

Caches work because of locality:
- temporal locality: if you used an address, you are likely to use it again soon.
- spatial locality: if you used an address, you are likely to use addresses close by.

Software Interface:
- get() returns hit if data present, miss otherwise
- set() stores data in cache

Questions:
- When to put new items in cache?
- Which cache slot to put the new item in?
- Which item to evict when cache full?
- Where to put evicted items in memory?

Related: cache coherency

Main Memory (RAM)

Main memory is the addressable memory used by processes for just about everything.
It is usually volatile (loses data on power loss).
Details out of scope.

[Figures: DRAM Cell, SRAM Cell]
DRAM is used for main memory. SRAM is used for cache.

Secondary Memory

Secondary memory is non-volatile (persistent).
There are many types:
- ROM (Read-only memory)
- EEPROM (Electrically Erasable PROM)
- Hard Disk
- Flash/SSD

Hard Disk Drive (HDD)

Hard Disk Drives are electromechanical, magnetic storage devices.
- Very cheap, massive storage
- Long-term data reliability
- Rather slow

[Figures: HDD Components, HDD Diagram, HDD Mechanism]

Solid State Drive (SSD/Flash)

Solid State (Flash) Drives are non-volatile electrical storage devices.
- Quite fast
- No moving parts
- Data degradation

Floating Gate Transistors store bits via quantum tunneling.

[Figure: SSD Architecture. Labels: SSD Controller, Host Interface, Embedded Processor, Buffer Manager, Flash Controller (x3), Flash Memory (x6), RAM Buffer]
[Figure: Flash Cell]

I/O Devices

I/O devices typically have a hardware controller with internal registers for configuration.
Depending on the architecture, the processor can access these registers in one of 2 ways:
1) Port-mapped I/O (port address space)
2) Memory-mapped I/O (shared address space)

Most modern architectures now opt for memory-mapped I/O (MMIO).

I/O operations take a long time.
There are 3 ways to check when they're done:
1) Busy-waiting / Polling
   Keep checking the device status register.
2) Interrupts
   Have the device send a signal to the processor, which the processor manages with a context switch.
3) Direct Memory Access (DMA)
   Special hardware allows devices to directly write to addressable memory, triggering an interrupt when the transfer is complete.

Interrupts

Interrupts are important enough that they bear elaboration here.

Interrupt hardware model (MOS Figure 1-11a):
1) The program calls into a device driver, which issues a command to the device controller.
2) When the device controller finishes, it signals the interrupt controller using dedicated bus lines.
3) When the interrupt controller is ready, it sets a pin on the CPU to signal a pending interrupt.
4) The interrupt controller provides the device id of the pending interrupt so the CPU can process it.

Interrupt software model (MOS Figure 1-11b):
1) CPU hardware automatically checks for a pending interrupt if interrupts are enabled.
2) CPU hardware jumps directly to the kernel-mode interrupt handler function based on the provided device id.
3) The handler saves CPU context, handles the interrupt, then restores the CPU context and returns to the interrupted process.

Things to note:
- Kernel is always in memory
- Kernel code is only executed on-demand via interrupt or system call
- Most CPU time is spent in user-mode programs

Buses

It has been a long time since there has only been a single system bus in a computer.
- Why?
- A single bus limits data throughput

Now there are all sorts of buses everywhere, with many protocols.

[Figure: Intel layout 2007, MOS Figure 1-12]

Bus architectures:

> Parallel Bus
- Each word sent all at once
- Faster
- Larger footprint

> Serial Bus
- Each word sent one bit at a time
- Slower
- Smaller footprint

[Figure: Parallel vs. Serial Interface]

Many topologies. Here are just 2:

> Shared (Bus)
- Devices all connected to the same wires
- Requires a bridge controller to manage

> Point-to-Point (P2P)
- Devices connected to CPU separately
- Requires switcher to manage connections

[Figure: PCI vs PCIe Topologies]

Examples:
- PCI (legacy) - parallel, shared
- PCIe - serial, P2P
- USB - serial, star
- SATA - serial, P2P

Some assets from Vecteezy.com
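
The cache Software Interface described in these notes (get() returns a hit if the data is present and a miss otherwise; set() stores data in the cache) could be sketched roughly as follows. This is only an illustration: the class name `SoftwareCache`, the `MISS` sentinel, and the `capacity` parameter are hypothetical, and least-recently-used (LRU) eviction is an assumed answer to "which item to evict when cache full?", not a policy the notes prescribe.

```python
from collections import OrderedDict

MISS = object()  # hypothetical sentinel meaning "not in the cache"


class SoftwareCache:
    """Tiny fixed-capacity cache; LRU eviction is an assumed policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        """Return the cached value on a hit, or MISS otherwise."""
        if key not in self.slots:
            return MISS
        self.slots.move_to_end(key)  # mark as most recently used
        return self.slots[key]

    def set(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        if key in self.slots:
            self.slots.move_to_end(key)
        elif len(self.slots) >= self.capacity:
            self.slots.popitem(last=False)  # evict the LRU entry
        self.slots[key] = value


cache = SoftwareCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")                 # hit; "a" becomes most recently used
cache.set("c", 3)              # cache is full, so "b" is evicted
print(cache.get("b") is MISS)  # True
print(cache.get("a"))          # 1
```

LRU leans on temporal locality: an item used recently is assumed likely to be used again soon, so the item unused for longest is the first to go.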
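
The interrupt software model in these notes (the CPU checks for a pending interrupt if interrupts are enabled, jumps to a handler chosen by device id, and the CPU context is saved before and restored after handling) can be mimicked with a toy simulation. Everything here is a made-up stand-in rather than real hardware behavior: the `CPU` class, the register names and values, the `HANDLERS` table, and device id 1 are all hypothetical, and for brevity the save and restore happen in the dispatch function rather than inside each handler.

```python
class CPU:
    """Toy CPU holding a few registers (assumed layout, for illustration)."""

    def __init__(self):
        self.registers = {"pc": 0x1000, "sp": 0x7FF0, "status": 0, "r0": 42}
        self.interrupts_enabled = True
        self.pending_device_id = None  # set by the simulated interrupt controller


log = []  # records which handlers ran, for demonstration only


def keyboard_handler(cpu):
    """Hypothetical handler body; it freely clobbers registers."""
    cpu.registers["r0"] = 0
    cpu.registers["pc"] = 0xFFFF0000
    log.append("keyboard")


# Hypothetical table mapping device ids to handler functions.
HANDLERS = {1: keyboard_handler}


def check_for_interrupt(cpu):
    """Run between instructions; returns True if an interrupt was handled."""
    if not cpu.interrupts_enabled or cpu.pending_device_id is None:
        return False
    device_id = cpu.pending_device_id
    cpu.pending_device_id = None
    saved_context = dict(cpu.registers)  # save context to "kernel memory"
    HANDLERS[device_id](cpu)             # jump to the handler for this device
    cpu.registers = saved_context        # restore context before returning
    return True


cpu = CPU()
before = dict(cpu.registers)
cpu.pending_device_id = 1                # interrupt controller signals the CPU
print(check_for_interrupt(cpu))          # True
print(cpu.registers == before)           # True
```

Because the registers are restored exactly, the interrupted program resumes as if nothing had happened, which is the point of saving the context.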