Statistics 506, Fall 2016

Basic computer architecture for data-oriented computing


The goal of this page is to define some basic terms that are important when discussing computing systems.

Key hardware and software components

A server is a single computer, almost always connected to a network.

A network is a collection of computers that can communicate with each other.

A cluster is a collection of computers, usually colocated and connected by a high speed data transfer network.

When speaking of a cluster of computers working together, a server is often called a node. In most clusters, the nodes are highly homogeneous in terms of their hardware and operating system specifications, which simplifies maintenance and interoperability.

An operating system is the software that manages the hardware, and facilitates running application software. Examples of operating systems include Unix, Linux, Windows, and MacOS.

The CPU (central processing unit) is a hardware component that executes the instructions (logical, arithmetic, etc.) of computer programs.

A core is an independent processing unit within a multi-core CPU. An ordinary core can process only one instruction at a time. A modern core can process on the order of 10^10 instructions per second (IPS), and a multi-core CPU as a whole may exceed 10^11 IPS. See here for a table of IPS values.

Processes

A computer program is a collection of logical instructions that can be executed on a CPU (or more specifically on one core of a CPU). An instance of a computer program while being executed is called a process.

Multitasking refers to the ability of all modern cores and operating systems to execute multiple processes at the same time. Since a single core can only execute one instruction at a time, standard multitasking is carried out by rapidly switching between processes on a single core. On a multicore machine, processes can be assigned to separate cores and run in parallel.
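
As a concrete illustration, here is a minimal Python sketch (the worker function, pool size, and inputs are arbitrary choices) using the standard multiprocessing module to distribute work across several worker processes, which the operating system may schedule on separate cores:

    # Run a toy function in several worker processes; on a multicore
    # machine, the operating system can place these on different cores.
    import multiprocessing as mp
    import os

    def describe(x):
        # Each call may run in a different process (and hence possibly
        # a different core); os.getpid() identifies the worker.
        return (x, x ** 2, os.getpid())

    if __name__ == "__main__":
        with mp.Pool(processes=4) as pool:
            results = pool.map(describe, range(8))
        for x, xsq, pid in results:
            print("x=%d, x^2=%d, computed in process %d" % (x, xsq, pid))

Printing the process id of each result shows how the work was divided among the workers.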

Processes running on a given machine can only communicate with each other using specific, restricted interfaces. This is called interprocess communication.

A single process can spawn multiple threads. Threads behave much like independent processes and are scheduled by the operating system in a similar way, but all threads within a process share that process's memory, which gives them far more direct ways to communicate than separate processes have.
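
The sketch below (a minimal example; the counter, thread count, and loop sizes are arbitrary) shows threads communicating through shared memory: all four threads update the same variable, with a lock keeping the updates orderly. Note that in CPython the "global interpreter lock" limits how much Python thread code truly runs in parallel, but the shared-memory communication is the point here:

    import threading

    counter = 0
    lock = threading.Lock()

    def add_many(n):
        global counter
        for _ in range(n):
            # The lock ensures only one thread updates the shared
            # counter at a time.
            with lock:
                counter += 1

    threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All four threads modified the same variable in shared memory.
    print(counter)  # 40000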

Storage

There are several forms of data storage used on modern computers. Some of the most important forms of storage are summarized next. In general, the most important attributes of a storage system are the capacity and the transfer speed.

  • Primary storage is volatile memory that requires a continuous power supply to maintain its information content. There are three main types of primary storage:

    • Registers are very small memory units built into the core that are used to hold the values on which the processor is currently operating.

    • Processor cache is very fast memory with relatively small capacity. The processor cache is typically organized into several levels (usually three), with level 1 (or L1) cache being the fastest and smallest. In a multi-core CPU, each core has its own L1 cache, while the higher cache levels are often shared between the cores. A recent Intel CPU is the Haswell processor, see here for its cache sizes.

    • Main memory is the relatively slow primary memory that is physically separate from the CPU. A modern server typically has 24-96 GB of main memory, but 1 terabyte or more is possible. See here for memory specifications of the U-M Flux nodes.

  • Secondary storage, or auxiliary storage, is the non-volatile component of a computer’s memory, meaning that the information is retained even when the computer has no power. There are two main forms of secondary storage.

    • The standard form of secondary storage is a hard disk drive (HDD). Current HDDs have capacities of up to 10 terabytes (1 TB = 10^12 bytes), although 1-4 TB is more common. HDD transfer rates are typically around 100 MB/second or slower. One issue affecting transfer performance on an HDD is fragmentation, meaning that files are split into parts and stored at different locations on the disk. On an HDD, fragmented files are much slower to read than contiguous files.

    • An alternative form of secondary storage is a solid state drive (SSD), which is based on flash memory. SSDs with capacities up to 2 TB are available but are very expensive; 512 GB units (half a terabyte) are more common. An SSD can typically transfer data at 0.5-1 GB/second. SSDs are not affected by fragmentation.

  • In addition to data stored in primary and secondary storage, it is often necessary to access data stored on remote servers via the network, for example retrieving data from a remote database or using a network file system (NFS). In general, transfer rates for remotely stored data will be much slower than rates for primary or secondary storage on the local machine. Current Ethernet transfer rates are physically capable of reaching around 100 Gbit/second, but the actual rate depends on the network capacity and traffic between the two machines, and is generally much slower than the theoretical limit.
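
The gap in transfer speed between the levels of this hierarchy can be seen with a rough timing experiment like the Python sketch below. The file name and size are arbitrary, and the results depend heavily on the machine and on operating system caching; in particular, a freshly written file may already sit in the page cache, so measuring a true cold read from disk may require flushing the cache first:

    import os
    import time

    fname = "scratch.bin"          # hypothetical scratch file
    nbytes = 200 * 1024 * 1024     # 200 MB of random data

    with open(fname, "wb") as f:
        f.write(os.urandom(nbytes))

    def timed_read(path):
        # Read the whole file and return the observed rate in MB/second.
        t0 = time.time()
        with open(path, "rb") as f:
            data = f.read()
        return len(data) / (time.time() - t0) / 1e6

    # A repeated read is typically served from the OS page cache in main
    # memory, so it runs at memory speed rather than disk speed.
    print("first read:  %.0f MB/s" % timed_read(fname))
    print("second read: %.0f MB/s" % timed_read(fname))

    os.remove(fname)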

Software/hardware interaction

A lot of modern software provides an “abstraction layer” that insulates most users from the hardware architecture. For example, compilers and run-times attempt to optimize which data are stored in the various levels of cache on a given processor. Some languages and libraries are able to take advantage of multiple cores without much action on the part of the user.
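
As a simple illustration of this abstraction, the sketch below (array size chosen arbitrarily) computes the same sum with an explicit Python loop and with a vectorized numpy call. The numpy version hands the work to compiled code, and numpy's linear algebra routines may even use several cores via an underlying BLAS library, all without the user managing cache or cores directly:

    import time
    import numpy as np

    n = 10 ** 7
    x = np.random.normal(size=n)

    # Explicit Python loop: every iteration is interpreted.
    t0 = time.time()
    total = 0.0
    for v in x:
        total += v
    t_loop = time.time() - t0

    # Vectorized call: the loop runs in compiled code inside numpy.
    t0 = time.time()
    total_vec = x.sum()
    t_vec = time.time() - t0

    print("loop: %.2f seconds, vectorized: %.4f seconds" % (t_loop, t_vec))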

Numeric data

Numeric data are one of several types of data that are commonly processed intensively on computers (text and memory addresses are two examples of non-numeric data that may also be intensively processed). Since numeric data are so prominent, hardware and operating system support have evolved to facilitate very high speed processing of numeric data.

Numbers on a computer are broadly categorized into two types: integer types and floating point types.

  • Integer data are usually stored directly in memory in binary (base 2) form. For signed types, one bit effectively represents the sign (positive or negative), with the remaining bits holding the magnitude of the value (in practice, modern hardware uses the two’s complement scheme for signed integers). For unsigned types, all bits capture the magnitude of the value. The range of an unsigned value is from 0 to 2^m - 1, where m is the number of bits. For example, a 1-byte unsigned value can hold integers from 0 to 255, and a 2-byte unsigned value can hold integers from 0 to 65535. Modern computers commonly work with 1, 2, 4, and 8 byte signed and unsigned integer values, so there are usually 8 distinct integer types. For multi-byte values, there are two systems for ordering the bytes: little endian (or LSB) puts the least significant byte at the lowest memory address; big endian (or MSB) puts the most significant byte at the lowest memory address. Most modern hardware is little endian. These properties are illustrated in the sketch following this list.

  • Floating point data are stored in an exponential form (−1)^s × c × b^q, where s is the sign, c is the significand, b is the base, and q is the exponent. The base is fixed by the floating point format (on modern systems it is either 2 or 10), while s, c, and q are stored as bit fields for each value. Only finitely many distinct values can be represented in this form, and the relative distance between two consecutive representable values is roughly constant (i.e. if x and y are consecutive representable values, (y-x)/x is roughly constant). In addition, special bit patterns are used to represent positive and negative infinity, and NaN (not a number).
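
The following sketch uses Python and numpy to illustrate these points: unsigned integer ranges of the form 0 to 2^m - 1, byte ordering, and the roughly constant relative spacing of floating point values (the particular types and values shown are arbitrary examples):

    import sys
    import numpy as np

    # Unsigned integer ranges: 0 to 2^m - 1 for m = 8, 16, 32, 64 bits.
    for dt in (np.uint8, np.uint16, np.uint32, np.uint64):
        info = np.iinfo(dt)
        print(dt.__name__, info.min, info.max)

    # Byte order: the same 2-byte value stored least significant byte
    # first (little endian) versus most significant byte first (big endian).
    print(sys.byteorder)                # byte order of this machine
    print((258).to_bytes(2, "little"))  # b'\x02\x01'
    print((258).to_bytes(2, "big"))     # b'\x01\x02'

    # np.spacing(x) is the gap from x to the next representable value;
    # the relative gap spacing(x)/x is roughly constant across magnitudes
    # (about 2^-52 for 64-bit floats).
    for x in (1.0, 1000.0, 1e12):
        print(x, np.spacing(x), np.spacing(x) / x)

    # Special bit patterns: infinities and NaN (note NaN != NaN).
    print(np.inf, -np.inf, np.nan, np.nan == np.nan)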

Future prospects

There is currently a great deal of research going on to define the cutting edge and next generation of computing hardware and systems. Recent advances include distributed processing tools like Hadoop, and networked databases that manage data storage and retrieval. However, especially in the area of concurrent or parallel computing, the current generation of software and programming tools does not always make it easy for developers and researchers to optimize performance on modern computing systems.

One hardware tool that is widely available, but not widely utilized, is the graphical processing unit, or GPU. These are processing units with a very large number of cores (hundreds or even thousands) that are individually simpler than ordinary CPU cores, but that can be used to carry out certain types of calculations on a massively parallel scale.

Resources

Access times for various data transfer operations.