4.0 Input-Output Organization

 

The Cache Bus: Higher-level architectures, such as those used by the Pentium Pro and Pentium II, employ a dedicated bus for accessing the system cache. This is sometimes called a backside bus. Conventional processors using fifth-generation motherboards and chipsets have the cache connected to the standard memory bus.

The Memory Bus: This is a second-level system bus that connects the memory subsystem to the chipset and the processor. In some systems the processor and memory buses are basically the same thing.

The Local I/O Bus: This is a high-speed input/output bus used for connecting performance-critical peripherals to the memory, chipset, and processor. For example, video cards, disk storage devices, and high-speed network interfaces generally use a bus of this sort. The two most common local I/O buses are the VESA Local Bus (VLB) and the Peripheral Component Interconnect (PCI) bus.

The Standard I/O Bus: Connecting to the above three buses is the standard I/O bus, used for slower peripherals (mice, modems, regular sound cards, low-speed networking) and also for compatibility with older devices. On almost all modern PCs this is the Industry Standard Architecture (ISA) bus.

ISA (Industry Standard Architecture) is a standard bus architecture that is associated with the IBM AT motherboard. It allows 16 bits at a time to flow between the motherboard circuitry and an expansion slot card and its associated device(s).

MCA (Micro Channel Architecture) was developed by IBM for its line of PS/2 desktop computers. Micro Channel Architecture is an interface between a computer (or multiple computers) and its expansion cards and their associated devices. MCA was a distinct break from previous bus architectures such as ISA. The pin connections in MCA are smaller than those of other bus interfaces; for this and other reasons, MCA does not support other bus architectures. Although MCA offered a number of improvements over other bus architectures, its proprietary, nonstandard aspects did not encourage other manufacturers to adopt it. It has influenced other bus designs, and it is still in use in PS/2s and in some minicomputer systems.

EISA (Extended Industry Standard Architecture) is a standard bus architecture that extends the ISA standard to a 32-bit interface. It was developed in part as an open alternative to the proprietary Micro Channel Architecture (MCA) that IBM introduced in its PS/2 computers. EISA data transfer can reach a peak of 33 megabytes per second.

VESA (Video Electronics Standards Association Local Bus) Local Bus is a standard interface between your computer and its expansion slots that provides faster data flow between the devices controlled by the expansion cards and your computer's microprocessor. A "local bus" is a physical path on which data flows at almost the speed of the microprocessor, increasing total system performance. VESA Local Bus is particularly effective in systems with advanced video cards and supports 32-bit data flow at 50 MHz. A VESA Local Bus is implemented by adding a supplemental slot and card that aligns with and augments an ISA expansion card.

PCI (Peripheral Component Interconnect) is an interconnection system between a microprocessor and attached devices in which expansion slots are spaced closely for high-speed operation. Using PCI, a computer can support both new PCI cards and ISA expansion cards, currently the most common kind of expansion card. Designed by Intel, the original PCI was similar to the VESA Local Bus. However, PCI 2.0 is no longer a local bus and is designed to be independent of microprocessor design. PCI is designed to be synchronized with the clock speed of the microprocessor, in the range of 20 to 33 MHz. PCI is now installed on most new desktop computers, not only those based on Intel's Pentium processor but also those based on the PowerPC.

PCI transmits 32 bits at a time in a 124-pin connection (the extra pins are for power supply and grounding) and 64 bits in a 188-pin connection in an expanded implementation. PCI uses all active paths to transmit both address and data signals, sending the address on one clock cycle and data on the next. Burst data can be sent starting with an address on the first cycle and a sequence of data transmissions on a certain number of successive cycles.
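As a rough illustration of what this address/data multiplexing means for throughput, the sketch below (an assumption-laden calculation, not taken from any PCI specification) computes the effective transfer rate of a 32-bit PCI bus at an assumed 33 MHz clock, where each burst costs one address phase plus some number of data phases:

```c
#include <stdio.h>

/* Hypothetical illustration: with address and data sharing the same lines,
 * a burst of N data phases costs one address phase plus N data phases.
 * Effective throughput therefore approaches the bus peak as bursts grow. */
int main(void)
{
    const double clock_hz        = 33e6;  /* assumed 33 MHz PCI clock */
    const double bytes_per_phase = 4;     /* 32-bit PCI */

    for (int burst = 1; burst <= 16; burst *= 2) {
        double cycles     = 1 + burst;              /* address + data phases */
        double bytes      = burst * bytes_per_phase;
        double mb_per_sec = bytes / cycles * clock_hz / 1e6;
        printf("burst of %2d data phases: %6.1f MB/s\n", burst, mb_per_sec);
    }
    return 0;
}
```

Longer bursts approach the theoretical 132 MB/s peak because the one-cycle address overhead is spread over more data phases.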

 

Hard Disk Interfaces

ST-506/412 is an old standard interface for connecting hard disk drives to PCs. ST-506 is sometimes referred to as MFM, the most prevalent encoding scheme used on ST-506 disk drives; ST-506 also supports the RLL encoding format. MFM (modified frequency modulation) is an encoding scheme used by PC floppy disk drives and older hard drives. RLL (run length limited) produces faster data access speeds and can increase a disk's storage capacity by up to 50 percent; RLL is used on most newer hard drives.

ESDI (Enhanced Small Device Interface) is an interface standard developed by a consortium of the leading personal computer manufacturers for connecting disk drives to PCs. ESDI is two to three times faster than the older ST-506 standard. To use an ESDI drive, your computer must have an ESDI controller.

Introduced in the early 80s, ESDI is already obsolete.

IDE (Integrated Drive Electronics) is a standard electronic interface used between a computer motherboard's data paths or bus and the computer's disk storage devices. The IDE interface is based on the IBM PC ISA 16-bit bus standard, but it is also used in computers that use other bus standards. IDE gets its name because the disk drive controller is built into the logic board in the disk drive. IDE was adopted as a standard by ANSI in November 1990. The ANSI name for IDE is Advanced Technology Attachment (ATA). The IDE (ATA) standard is one of several related standards maintained by the T13 Committee.

SCSI (Small Computer System Interface) is a set of evolving ANSI standard electronic interfaces that allow personal computers to communicate with peripheral hardware such as disk drives, tape drives, CD-ROM drives, printers, and scanners faster and more flexibly than previous interfaces. Popularized by Apple Computer and still used in the Macintosh, the present set of SCSI interfaces are parallel interfaces. SCSI ports are built into most personal computers today and are supported by all major operating systems.

In addition to faster data rates, SCSI is more flexible than earlier parallel data transfer interfaces. The latest SCSI standard, Ultra-2 SCSI for a 16-bit bus, can transfer data at up to 80 megabytes per second (MBps). SCSI allows up to 7 or 15 devices (depending on the bus width) to be connected to a single SCSI port in daisy-chain fashion. This allows one circuit board or card to accommodate all the peripherals, rather than having a separate card for each device, making it an ideal interface for use with portable and notebook computers. A single host adapter, in the form of a PC Card, can serve as a SCSI interface for a laptop, freeing up the parallel and serial ports for use with an external modem and printer while allowing other devices to be used in addition.

Although not all devices support all levels of SCSI, the evolving SCSI standards are generally backward compatible. That is, if you attach an older device to a newer computer with support for a later standard, the older device will work at the older, slower data rate. The original SCSI, now known as SCSI-1, evolved into SCSI-2, known as "plain SCSI," as it became widely supported. SCSI-3 consists of a set of primary commands and additional specialized command sets to meet the needs of specific device types. The collection of SCSI-3 command sets is used not only for the SCSI-3 parallel interface but for additional parallel and serial protocols, including Fibre Channel, Serial Bus Protocol (used with the IEEE 1394 FireWire physical protocol), and the Serial Storage Protocol (SSP).

The latest SCSI standard, Ultra-2, uses a 40 MHz clock rate to reach maximum data transfer rates of up to 80 MBps. It provides a longer possible cabling distance (up to 12 meters) by using Low Voltage Differential (LVD) signaling. Earlier forms of SCSI use a single wire that ends in a terminator with a ground; Ultra-2 SCSI sends the signal over two wires, with the data represented as the difference in voltage between the two wires. This allows support for longer cables. A low voltage differential also reduces power requirements and manufacturing costs.

 

Ports

Serial means one event at a time. It is usually contrasted with parallel, meaning more than one event happening at a time. In data transmission, the techniques of time division and space division are used, where time separates the transmission of individual bits of information sent serially and space (on multiple lines or paths) can be used to have multiple bits sent in parallel. In the context of computer hardware and data transmission, serial connection, operation, and media usually indicate a simpler, slower operation, and parallel indicates a faster operation. This indication doesn't always hold, since a serial medium (for example, fiber optic cable) can be much faster than a medium that carries multiple signals in parallel.

On your PC, the printer is usually attached through a parallel interface and cable so that it will print faster. Your keyboard and mouse are one-way devices that only require a serial interface and line. Inside your computer, much of its circuitry supports bits being moved around in parallel. The computer modem uses one of your PC's serial connections or COM ports; a conventional telephone connection is generally thought of as a serial line, since its usual transmission protocol is serial. Serial communication between your PC and the modem and other serial devices adheres to the RS-232C standard. Conventional computers and their programs operate in a serial manner, with the computer reading a program and performing its instructions one after the other. However, some of today's computers have multiple processors and can perform instructions in parallel.


USB (Universal Serial Bus) is a "plug-and-play" interface between a computer and add-on devices (such as audio players, joysticks, keyboards, telephones, scanners, and printers). With USB, a new device can be added to your computer without having to add an adapter card or even having to turn the computer off. The USB peripheral bus standard was developed by Compaq, IBM, DEC, Intel, Microsoft, NEC, and Northern Telecom and the technology is available without charge for all computer and device vendors.

USB supports a data speed of 12 megabits per second. This speed will accommodate a wide range of devices, including MPEG-2 video devices, data gloves, and digitizers. It is anticipated that USB will easily accommodate plug-in telephones that use ISDN and digital PBXs. Since October, 1996, the Windows operating systems have been equipped with USB drivers or special software designed to work with specific I/O device types. USB is integrated into Windows 98. As of mid-1998, most new computers and peripheral devices were equipped with USB.

FireWire is Apple Computer's version of a new standard, IEEE 1394 High Performance Serial Bus, for connecting devices to your personal computer. FireWire provides a single plug-and-socket connection on which up to 63 devices can be attached with data transfer speeds up to 400 Mbps (megabits per second). The standard describes a serial bus or pathway between one or more peripheral devices and your computer's microprocessor. In the next few years, you can expect to see many peripheral devices coming equipped to meet this new standard. FireWire and other IEEE 1394 implementations provide a number of benefits, described below.

In time, IEEE 1394 implementations are expected to replace and consolidate today's serial and parallel interfaces, including Centronics parallel, RS-232C, and SCSI. The first products to be introduced with FireWire include digital cameras, digital video disks (DVD), digital video tapes, digital camcorders, and music systems. Because IEEE 1394 is a peer-to-peer interface, one camcorder can dub to another without being plugged into a computer. With a computer equipped with the socket and bus capability, any device (for example, a video camera) can be plugged in while the computer is running.

There are two levels of interface in IEEE 1394, one for the backplane bus within the computer and another for the point-to-point interface between device and computer on the serial cable. A simple bridge connects the two environments. The backplane bus supports 12.5, 25, or 50 megabits per second data transfer. The cable interface supports 100, 200, or 400 megabits per second. Each of these interfaces can handle any of the possible data rates and change from one to another as needed.

The serial bus functions as though devices were in slots within the computer sharing a common memory space. A 64-bit device address allows a great deal of flexibility in configuring devices in chains and trees from a single socket. IEEE 1394 provides two types of data transfer: asynchronous and isochronous. Asynchronous is for traditional load-and-store applications where data transfer can be initiated and an application interrupted as a given length of data arrives in a buffer. Isochronous data transfer ensures that data flows at a pre-set rate so that an application can handle it in a timed way. For multimedia applications, this kind of data transfer reduces the need for buffering and helps ensure a continuous presentation for the viewer. The 1394 standard requires that a device be within 4.5 meters of the bus socket. Up to 16 devices can be connected in a single chain, each with the 4.5 meter maximum (before signal attenuation begins to occur) so theoretically you could have a device as far away as 72 meters from the computer.

 

The New Processor Bus

Intel Pentium 4 Bus Interface Unit

The first new feature seen by code or data as it enters the Pentium 4 is the new system bus. The well-known 'FSB' of the Pentium III is clocked at 133 MHz and able to transfer 64 bits of data per clock, offering a data bandwidth of 8 bytes * 133 million/s = 1,066 MB/s. The Pentium 4's system bus is clocked at only 100 MHz and is also 64 bits wide, but it is 'quad-pumped', using the same principle as AGP 4x. Thus it can transfer 8 bytes * 100 million/s * 4 = 3,200 MB/s. This is obviously a tremendous improvement that even leaves AMD's recently 'upgraded' EV6 bus quite far behind. The bus of the most recent Athlons is clocked at 133 MHz, 64 bits wide, and 'double-pumped', offering 8 bytes * 133 million/s * 2 = 2,133 MB/s.
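The arithmetic behind these figures is simple enough to check directly. The short C sketch below just reproduces the width × clock × transfers-per-clock calculation from the paragraph above (the 133.3 MHz value is an assumption standing in for the nominal "133 MHz" clock):

```c
#include <stdio.h>

/* Reproduces the bandwidth arithmetic from the text:
 * bandwidth (MB/s) = bus width (bytes) * clock (MHz) * transfers per clock. */
static double bus_mb_per_s(double width_bytes, double clock_mhz, double per_clock)
{
    return width_bytes * clock_mhz * per_clock;
}

int main(void)
{
    printf("Pentium III FSB : %5.0f MB/s\n", bus_mb_per_s(8, 133.3, 1)); /* ~1,066 */
    printf("Pentium 4 bus   : %5.0f MB/s\n", bus_mb_per_s(8, 100.0, 4)); /*  3,200 */
    printf("Athlon EV6 bus  : %5.0f MB/s\n", bus_mb_per_s(8, 133.3, 2)); /* ~2,133 */
    return 0;
}
```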

The new bus of the Pentium 4 enables it to exchange data with the rest of the system faster than any other x86 processor, thus removing one important bottleneck that the Pentium III suffered from. However, the fastest processor bus doesn't help much unless the system's main memory can deliver data at a matching pace. Intel's new 850 chipset for the Pentium 4, which currently represents the only chipset for this new CPU, uses two Rambus channels and therefore the expensive and unpopular RDRAM. However, these two RDRAM channels are able to deliver the same data bandwidth as the Pentium 4's new bus (3,200 MB/s), making them a perfect match, at least on paper. This combination enables Pentium 4 systems to have the highest data transfer rates between processor, system, and main memory, which is a clear benefit. At the same time, system cost is driven up by the high price of RDRAM plus the fact that a Pentium 4 system always requires two or even four RDRAM RIMMs of the same size and specification; one, three, or mixed RIMMs are not an option.

 

4.2 INTERRUPT

 

An interrupt is a signal from a device attached to a computer or from a program within the computer that causes the main program that operates the computer (the operating system) to stop and figure out what to do next. Almost all personal (or larger) computers today are interrupt-driven - that is, they start down the list of computer instructions in one program (perhaps an application such as a word processor) and keep running the instructions until either (A) they can't go any further or (B) an interrupt signal is sensed. After the interrupt signal is sensed, the computer either resumes running the program it was running or begins running another program.

Basically, a single computer can perform only one computer instruction at a time. But, because it can be interrupted, it can take turns running different programs or sets of instructions. This is known as multitasking. It allows the user to do a number of different things at the same time. The computer simply takes turns managing the programs that the user effectively starts. Of course, the computer operates at speeds that make it seem as though all of the user's tasks are being performed at the same time. (The computer's operating system is good at using little pauses in operations and user think time to work on other programs.)

An operating system usually has some code that is called an interrupt handler. The interrupt handler prioritizes the interrupts and saves them in a queue if more than one is waiting to be handled. The operating system has another little program, sometimes called a scheduler, that figures out which program to give control to next.

In general, there are hardware interrupts and software interrupts. A hardware interrupt occurs, for example, when an I/O operation is completed such as reading some data into the computer from a tape drive. A software interrupt occurs when an application program terminates or requests certain services from the operating system. In a personal computer, a hardware interrupt request (IRQ) has a value associated with it that associates it with a particular device.

An I/O bus is similar to the bus between the CPU, mainboard control logic, and memory. Both types of bus structure have address wires, data wires, and a similar set of housekeeping wires. Both bus structures must determine if an operation refers to memory or an I/O address. Both must distinguish between 8-bit, 16-bit, and 32-bit operations. Both must be able to introduce "Wait States" to slow down the CPU when a device needs more time to complete an operation.

The most important difference between the CPU-memory local bus and the I/O bus is the presence of Interrupt Request (IRQ) wires. The I/O bus has 15 separate IRQ wires. The CPU has only one interrupt pin. The chip set on the mainboard has to provide a translation between the two.

Without interrupts, the CPU must start an operation to a device and then spin in a loop asking, "Is it done yet? Is it done yet? Is it done yet?" After a few hundred thousand tests, the device will signal that the operation is complete. Interrupts allow the CPU (particularly on a more advanced operating system like Windows 95, OS/2, or NT) to do some other work until the operation is complete.
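The following sketch illustrates the "Is it done yet?" polling loop described above. The status register is simulated here so the example is self-contained; on real hardware the read would be an IN instruction on the device's status port:

```c
#include <stdio.h>
#include <stdint.h>

/* A minimal sketch of busy-wait polling. read_status() stands in for an IN
 * instruction on the device's status port; it is simulated here. */
#define BUSY_BIT 0x80

static int remaining_polls = 100000;     /* pretend the device stays busy this long */

static uint8_t read_status(void)         /* simulated status port */
{
    return (--remaining_polls > 0) ? BUSY_BIT : 0;
}

int main(void)
{
    long polls = 0;

    /* Without interrupts the CPU spins here, doing no useful work,
     * until the device clears its busy bit. */
    while (read_status() & BUSY_BIT)
        polls++;

    printf("device ready after %ld wasted polls\n", polls);
    return 0;
}
```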

When a device generates an interrupt, the CPU hardware stops running an ordinary program and jumps to an interrupt handling routine in the device driver, which determines what the device is signaling.

Any device on the I/O bus can request an interrupt by placing a signal on one of the 15 IRQ wires. If more than one IRQ signal is received at the same time, the chip set on the mainboard has to select the one with highest priority to process first. The CPU is interrupted (by sending a signal on its one wire) and the chip set then transfers the identity of the IRQ level to be processed.

Each IRQ wire goes to every slot in the I/O bus. An adapter card is configured (physically with switches or logically with a utility) to use a specific IRQ value. The I/O bus on the first PC assumed that each device would have its own IRQ line, so the circuit to drive an interrupt request was made very simple. Unfortunately, when two adapter cards are incorrectly configured with the same IRQ value, and both try to generate IRQs at the same time, the result is an electrical short on the I/O bus. Usually there is no damage, but the effect can be to burn out either card or to trash the mainboard.

Later bus architectures (MCA, EISA, PCI) use safer circuitry that allows two devices to share the same interrupt. When an interrupt is shared, the system responds to an interrupt by calling the device driver for each device associated with that IRQ. The drivers poll their respective adapter cards to determine if there is any pending activity which requires a response. There is a slight loss of efficiency when interrupts are shared, but not enough to cause worry.

 

The CPU uses the OUT instruction to send data or commands to an I/O device, and it uses the IN instruction to read data or status from the device. These instructions cause the address of the I/O device to be placed on the bus, and they flag one of the housekeeping wires in the bus to indicate that this is an I/O address and not a memory address.

Each device is configured to respond to a range of addresses (the "ports"). Generally, a device will respond to a range of eight addresses. For example, the COM1 port responds to addresses 03F8 to 03FF. When the device sees an I/O address on the bus that matches a value in its range, it responds.

A device uses each address to process a different type of command or generate a different type of status. COM1, for example, uses the first address of 03F8 to handle all the data. The remaining addresses are used to configure the line (speed, parity), to control the phone (hang up, begin), to check modem status, and to perform other housekeeping.
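As a concrete (and hedged) illustration of port-mapped I/O, the sketch below uses the Linux inb()/outb() wrappers on an x86 machine to wait on COM1's line status register and then write one byte to its data register. It assumes the standard 8250/16550 register layout (data at the base address, line status at base + 5), must run as root, and bypasses the normal serial driver, so treat it strictly as an illustration:

```c
#include <stdio.h>
#include <sys/io.h>          /* inb(), outb(), ioperm() - glibc, x86 Linux only */

#define COM1_BASE 0x3F8
#define COM1_LSR  (COM1_BASE + 5)   /* line status register */
#define LSR_THRE  0x20              /* transmit holding register empty */

int main(void)
{
    if (ioperm(COM1_BASE, 8, 1) != 0) {   /* request access to the 8 ports */
        perror("ioperm (are you root?)");
        return 1;
    }

    while (!(inb(COM1_LSR) & LSR_THRE))   /* wait until the UART can take a byte */
        ;
    outb('A', COM1_BASE);                 /* write one byte to the data port */

    printf("sent one byte through COM1\n");
    return 0;
}
```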

Unfortunately, it is still possible to plug a card built in 1984 into a modern PC. The I/O bus cannot know how fast a device is able to operate, so it handles the worst case and slows everything down to match the speeds used ten years ago. The chip set on the main board generates a long stream of Wait State signals, forcing the CPU to wait for 250 nanoseconds. The device itself can respond to request more Wait States to give itself even longer to respond. All this time, the CPU is stopped dead waiting for the IN or OUT instruction to end.

What is the Hardware Interrupt and DMA structure of the IBM PC?

 

The processor is a highly-tuned machine that is designed to do one thing at a time. However, we use our computers in a way that requires the processor to at least appear to do many things at once. If you've ever used a multitasking operating system like Windows 95, you've done this; you may have been editing a document while downloading information on your modem and listening to a CD simultaneously. The processor is able to do this by sharing its time among the various programs it is running and the different devices that need its attention. It only appears that the processor is doing many things at once because of the high speed at which it is able to switch between tasks. Most of the different parts of the PC need to send information to and from the processor, and they expect to be able to get the processor's attention when they need to do this. The processor has to balance the information transfers it gets from various parts of the machine and make sure they are handled in an organized fashion. There are two basic ways that the processor could do this: it could repeatedly poll each device, asking whether it needs attention, or it could let each device signal the processor when it needs service. The PC uses the second approach, and these signals are called interrupts.

In addition to the well-known hardware interrupts that we discuss in this section, there are also software interrupts. These are used by various software programs in response to different events that occur as the operating system and applications run. In essence, these represent the processor interrupting itself. This is part of how the processor is able to do many things at once. The other thing that software interrupts do is allow one program to access another one (usually an application or DOS accessing the BIOS) without having to know where it resides in memory.

Device interrupts are fed to the processor using a special piece of hardware called an interrupt controller. The standard for this device is the Intel 8259 interrupt controller, and it has been since the earliest PCs. As with most of these dedicated controllers, in modern motherboards the 8259 is incorporated into a larger chip as part of the chipset. The interrupt controller has 8 input lines that take requests from one of 8 different devices. The controller then passes the request on to the processor, telling it which device issued the request (which interrupt number triggered the request, from 0 to 7).

The original PC and XT had one of these controllers, and hence supported interrupts 0 to 7 only. Starting with the IBM AT, a second interrupt controller was added to the system to expand it; this was part of the expansion of the ISA system bus from 8 to 16 bits. In order to ensure compatibility, the designers of the AT didn't want to change the single interrupt line going to the processor. So what they did instead was to cascade the two interrupt controllers together. The first interrupt controller still has 8 inputs and a single output going to the processor. The second one has the same design, but it takes 8 new inputs and its output feeds into input line 2 of the first controller. If any of the inputs on the second controller become active, the output from that controller triggers interrupt #2 on the first controller, which then signals the processor.

Devices designed to use IRQ2 as a primary setting are rare in today's systems, since IRQ2 has been out of use for over 10 years. In most cases IRQ2 is just considered "unusable", while IRQ9 is a regular, usable interrupt line. However, some modems, for example, still offer the use of IRQ2 as a way to get around the fact that COM3 and COM4 share interrupts with COM1 and COM2 by default. You may need to do this if you have a lot of devices contending for the low-numbered IRQs.
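A small sketch of the cascade arrangement just described may make the IRQ numbering clearer. The mapping below is an illustration, not production code:

```c
#include <stdio.h>

/* IRQs 0-7 go to the first (master) 8259, IRQs 8-15 go to the second (slave)
 * 8259, and the slave's output feeds input 2 of the master. */
static void describe_irq(int irq)
{
    if (irq >= 8)
        printf("IRQ %2d -> input %d of the slave 8259 -> cascades into input 2 of the master\n",
               irq, irq - 8);
    else if (irq == 2)
        printf("IRQ  2 -> reserved as the cascade input (rerouted to IRQ 9 on AT-class machines)\n");
    else
        printf("IRQ %2d -> input %d of the master 8259 -> CPU interrupt pin\n", irq, irq);
}

int main(void)
{
    for (int irq = 0; irq < 16; irq++)
        describe_irq(irq);
    return 0;
}
```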

The devices that use interrupts trigger them by signaling over lines provided on the ISA system bus. Most of the interrupts are provided to the system bus for use by devices; however, some of them are only used internally by the system, and therefore they are not given wires on the system bus. These are interrupts 0, 1, 2, 8 and 13, and are never available to expansion cards. As explained in this section on the ISA bus, the original bus was only 8 bits wide and had a single connector for expansion cards. The bus was expanded to 16 bits and a second connector slot added next to the first one; you can see this if you look at your motherboard, since all modern PCs use 16-bit slots. The addition of this extra connector coincided with the addition of the second interrupt controller, and the lines for these extra IRQs were placed on this second slot. This means that in order to access any of these IRQs--10, 11, 12, 14 and 15--the card must have both connectors. While almost no motherboards today have 8-bit-only bus slots, there are still many expansion cards that only use one ISA connector. The most common example is an internal modem. These cards can only use IRQs 3, 4, 5, 6 and 7 (and 6 is almost always not available since it is used by the floppy disk controller). They can also use IRQ 9 indirectly if they have the ability to use IRQ2, since 9 is wired to where 2 used to be.

The PC processes device interrupts according to their priority level. This is a function of which interrupt line they use to enter the interrupt controller. For this reason, the priority levels are directly tied to the interrupt number: IRQs 0 and 1 have the highest priority, followed by IRQs 8 through 15 (which enter through the cascaded input 2 of the first controller), and finally IRQs 3 through 7.

All of the regular interrupts that we normally use and refer to by number are called maskable interrupts. The processor is able to mask, or temporarily ignore, any interrupt if it needs to, in order to finish something else that it is doing. In addition, however, the PC has a non-maskable interrupt (NMI) that can be used for serious conditions that demand the processor's immediate attention. The NMI cannot be ignored by the system unless it is shut off specifically. When an NMI signal is received, the processor immediately drops whatever it was doing and attends to it. As you can imagine, this could cause havoc if used improperly. In fact, the NMI signal is normally used only for critical problem situations, such as serious hardware errors. The most common use of NMI is to signal a parity error from the memory subsystem. This error must be dealt with immediately to prevent possible data corruption.

In general, interrupts are single-device resources. Because of the way the system bus is designed, it is not feasible for more than one device to use an interrupt at one time, because this can confuse the processor and cause it to respond to the wrong device at the wrong time. If you attempt to use two devices with the same IRQ, an IRQ conflict will result. This is one of the types of resource conflicts. It is possible to share an IRQ among more than one device, but only under limited conditions. In essence, if you have two devices that you seldom use, and that you never use simultaneously, you may be able to have them share an IRQ. However, this is not the preferred method since it is much more prone to problems than just giving each device its own interrupt line. One of the most common problems regarding shared IRQs is the use of the third and fourth serial (COM) ports, COM3 and COM4. By default, COM3 uses the same interrupt as COM1 (IRQ4), and COM4 uses the same interrupt as COM2 (IRQ3). If you have a mouse on COM1 and set up your modem as COM3--a very common setup--guess what happens the first time you try to go online. You can share COM ports on the same interrupt, but you have to be very careful not to use both devices at once; in general this arrangement is not preferred. Many modems will let you change the IRQ they use to IRQ5 or IRQ2, for example, to avoid this problem. Other common areas where interrupt conflicts occur are IRQ5, IRQ7 and IRQ12.

 Direct memory access (DMA) channels are system pathways used by many devices to transfer information directly to and from memory. DMA channels are not nearly as "famous" as IRQs as system resources go. This is mostly for a good reason: there are fewer of them and they are used by many fewer devices, and hence they usually cause fewer problems with system setup. However, conflicts on DMA channels can cause very strange system problems and can be very difficult to diagnose. DMAs are used most commonly today by floppy disk drives, tape drives and sound cards.

As you know, the processor is the "brain" of the machine, and in many ways it can also be likened to the conductor of an orchestra. In early machines the processor really did almost everything. In addition to running programs it was also responsible for transferring data to and from peripherals. Unfortunately, having the processor perform these transfers is very inefficient, because it then is unable to do anything else. The invention of DMA enabled the devices to cut out the "middle man", allowing the processor to do other work and the peripherals to transfer data themselves, leading to increased performance. Special channels were created, along with circuitry to control them, that allowed the transfer of information without the processor controlling every aspect of the transfer. This circuitry is normally part of the system chipset on the motherboard. Note that DMA channels are only on the ISA bus (and EISA and VLB, since they are derivatives of it). PCI devices do not use standard DMA channels at all.

Standard DMA is sometimes called "third party" DMA. This refers to the fact that the system DMA controller is actually doing the transfer (the first two parties are the sender and receiver of the transfer). There is also a type of DMA called "first party" DMA. In this situation, the peripheral doing the transfer actually takes control of the system bus to perform the transfer. This is also called bus mastering.

Bus mastering provides much better performance than regular DMA because modern devices have much smarter and faster DMA circuitry built into them than exists in the old standard ISA DMA controller. Newer DMA modes are now available, such as Ultra DMA (UDMA/33), that provide for very high transfer rates.

While the use of DMA provided a significant improvement over processor-controlled data transfers, it too eventually reached a point where its performance became a limiting factor. DMA on the ISA bus has been stuck at the same performance level for over 10 years. For old 10 MB XT hard disks, DMA was a top performer. For a modern 8 GB hard disk, transferring multiple megabytes per second, DMA is insufficient. On newer machines, disks are controlled using either programmed I/O (PIO) or first-party DMA (bus mastering) on the PCI bus, and not using the standard ISA DMA that is used for devices like sound cards. This type of DMA does not rely on the slow ISA DMA controllers, and allows these high-performance devices the bandwidth they need. In fact, many of the devices that used to use DMA on the ISA bus use bus mastering over the PCI bus for faster performance. This includes newer high-end SCSI cards, and even network and video cards.

 

 

Video Adapters

Today, most systems are sold with a display adapter that connects to a PCI or VESA "local bus", provides Windows accelerator functions, and supports SVGA resolutions. The "local bus" means that the CPU can send data to the card at high speed. The "accelerator" means that the display adapter can draw lines and boxes and can move windows and scroll text itself. Resolution and number of colors are determined by the amount of video memory, and refresh rate is determined by the quality of the components. All these items need to be explained in detail.

Display adapters are characterized by their resolution, color depth (number of colors), refresh rate, accelerator functions, and the bus used to connect them to the rest of the system.


Resolution refers to the number of dots on the screen. It is expressed as a pair of numbers that give the number of dots on a line (horizontal) and the number of lines (vertical). Four resolutions are in common use today: 640x480, 800x600, 1024x768, and 1280x1024.

A computer display is essentially a high resolution TV set. It generates colors by combining amounts of Red, Green, and Blue (an "RGB" connection). In current use, these colors are controlled by three wires in the display cable. Each has a variable amount of voltage represented by a number from 0 to 255. This produces a theoretical 16 million possible colors. Complete control of color ("Truecolor") may be needed for displaying photographs, but ordinary applications get along with far fewer.


The Color Depth (number of colors) is determined by the number of bits assigned to hold color value.

The display adapter stores a value (4 to 24 bits) in memory for every dot on the screen. The amount of storage needed is determined by multiplying the number of dots (resolution) by the memory required for each dot.

The original VGA display had a resolution of 640x480 and supported 4 bit color. This required only 256K of memory.

An SVGA adapter with 512K can generally support resolution up to 800x600 and 8 bit (1 byte per dot) color.

An SVGA adapter with 1 megabyte of video memory can support 1024x768 resolution at 8 bits, or 800x600 resolution at 16 bits. In some systems, it can also display 1280x1024 in 4 bit color.

Additional memory is required for greater resolution or more color depth. However, not all systems support more video memory.
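The figures above follow directly from multiplying the number of dots by the bits stored per dot. A quick sketch of the arithmetic:

```c
#include <stdio.h>

/* Reproduces the memory requirements quoted above:
 * memory = horizontal dots * vertical dots * bits per dot / 8. */
static void vram(int w, int h, int bits)
{
    long bytes = (long)w * h * bits / 8;
    printf("%4dx%-4d at %2d-bit color: %7ld bytes (%4ld KB)\n",
           w, h, bits, bytes, bytes / 1024);
}

int main(void)
{
    vram( 640,  480,  4);   /* original VGA: fits in 256K */
    vram( 800,  600,  8);   /* fits in 512K               */
    vram(1024,  768,  8);   /* needs 1 MB                 */
    vram( 800,  600, 16);   /* needs 1 MB                 */
    vram(1280, 1024,  4);   /* fits in 1 MB               */
    return 0;
}
```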


The Refresh Rate determines how rapidly the display repaints the dots on the screen. The original VGA displays ran at 60Hz, but some people complained that this produced a flicker. International standards now require a rate of 70Hz. A "multisynch" monitor can adapt to refresh rates in a range, typically 60-75Hz. A utility program is typically provided on diskette to set the refresh rate on the display adapter for various resolutions.

However, a multisynch monitor generally needs adjustment when first connected to a new adapter card or run at a new resolution or rate. IBM avoids this problem by precisely matching adapters and displays within its product families. This means, however, that an older IBM display may not work on a non-IBM system. This became more trouble than it was worth, so today IBM generally produces only multisynch monitors like all the other vendors.

Combinations of high refresh rate, high resolution, and maximum color depth may overtax the chip that converts numbers in the display adapter memory into voltage levels in the wires of the display cable. This chip is called a RAMDAC. Faster RAMDAC chips are available, but they are expensive. Any attempt to run the RAMDAC too fast (to use a higher resolution or refresh rate than the chip supports) can damage the display card. Consult the manual provided with the display adapter to determine the allowable combinations.

A display must keep up with the signal sent by the card. If the signal is beyond the capability of the display, the display can be damaged. To avoid this problem, operating systems will often require the user to identify the make and model of the display before allowing higher resolutions and refresh rates to be selected. If the model is not one listed in the table of supported displays, the manual that accompanied the display may provide a list of supported modes.


An accelerator chip on the video card can draw lines and boxes, fill in background color, scroll text, and manage the mouse pointer. These functions significantly improve the performance of Windows and OS/2. Before accelerators, a video adapter simply mapped the display memory to an area of the PC memory. The PC program would calculate the location of the line, and then would change the color dot by dot (byte by byte) in this area of memory.

With an accelerator, the CPU only has to send the video adapter a command to draw a line (and the starting point, ending point, width, and color of the line). The CPU is not required to calculate the bits in the line, and the amount of data that has to flow from the CPU through the I/O bus to the adapter card is greatly reduced.
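The contrast between the two approaches can be sketched in code. In the unaccelerated case the CPU writes every dot of a filled rectangle into display memory; with an accelerator it only hands over a short command describing the rectangle. The framebuffer layout and the command structure below are hypothetical, since real accelerator command formats are card-specific:

```c
#include <stdio.h>
#include <stdint.h>

#define WIDTH  640
#define HEIGHT 480

static uint8_t framebuffer[WIDTH * HEIGHT];   /* 8-bit color, one byte per dot */

/* Unaccelerated: the CPU writes every dot itself. */
static long fill_rect_cpu(int x, int y, int w, int h, uint8_t color)
{
    long writes = 0;
    for (int row = y; row < y + h; row++)
        for (int col = x; col < x + w; col++) {
            framebuffer[row * WIDTH + col] = color;
            writes++;
        }
    return writes;
}

/* Accelerated: the CPU only describes the rectangle; the video chip fills it. */
struct fill_rect_command { int x, y, w, h; uint8_t color; };

int main(void)
{
    long dot_writes = fill_rect_cpu(10, 10, 300, 200, 0x0F);
    struct fill_rect_command cmd = { 10, 10, 300, 200, 0x0F };

    printf("dot-by-dot fill: %ld writes; accelerator command: %zu bytes\n",
           dot_writes, sizeof cmd);
    return 0;
}
```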

An application program sends a sequence of requests to Windows. Each request creates a window, button, box, or menu, or writes some text. Some commands can be simply passed on to the accelerator chip for execution. A display adapter requires a Windows Driver routine. The Driver knows which commands the chip can handle, and which have to be turned into bits (or lines) by the CPU. Every video card has a Windows 3.1 driver. However, it is a good idea to make sure that the card also has a Windows 95, Windows NT, OS/2, and maybe a Linux X Windows driver.

The first generation of accelerator cards ran on the ISA bus, which limited data transfer to 16 bits at relatively slow cycle times. Then VESA Local Bus cards came along that could transfer data in 32-bit chunks at a faster clock rate. Most Pentium machines today have a PCI video interface that transfers data at 64 bits and a very fast clock speed. Video performance is so good across the board that it is better to select a middle-of-the-road adapter that every operating system supports rather than a very fast adapter with limited software.

In current use, the most popular accelerator cards are based on the "S3" family of chips, which combine good performance and low cost. Low end 486 machines with a VESA Local bus may have the S3 805 chip. Pentium machines would have the S3 864 chip. There are many other popular chip sets, but check the list of supported cards for each operating system to make sure that drivers are available.


An accelerator card reduces the amount of data that must be transferred between the PC and the display adapter. Nevertheless, display performance may be limited by the transfer rate. Originally, display cards plugged into the standard ISA bus. Since that bus is limited to transferring two bytes of data at a time, and is clocked at only 8 MHz, this became a bottleneck. Today, most ISA bus machines provide one or two Local Bus slots. A Local Bus video card can transfer four bytes of data at a time, and it can operate at the 25 or 33 MHz clock rate of the CPU and memory instead of the slower clock rate of the I/O bus.

For Pentium and PowerPC machines, an even faster alternative is provided by the PCI bus. The PCI interface allows 8 bytes of data to be transferred in one operation.

In many cases the desktop PC models come with a built-in Local Bus video adapter based on one of the S3 chips. This should be perfectly adequate for typical serious use under Windows or OS/2. Computers that are used as a disk or database server are often locked in a closet or run unattended. It may be tempting to save money by using a low cost video adapter on a server. However, if there is any monitor displaying performance data or status on the screen, then the choice of display adapter can affect performance. When any program writes to the screen (even when the display is powered off) the CPU must wait until the data or commands have been transferred to the display adapter. By using a low speed adapter (such as an old VGA), the CPU is forced to wait longer. The apparent CPU utilization goes up, and the performance of the server may degrade. The CPU is being used up by the wait states required to move a large block of data two bytes at a time across the old ISA bus. Therefore, while it may make sense to save money by using an inexpensive display screen on a server, it is still a good idea to get a high speed adapter card.

Over the last year there have been no new developments in video technology. Most vendors are using better RAMDAC chips and now support higher refresh rates. However, video speed is so high that improvements would not be noticed, and the display screen technology has not changed enough to make large screens any more affordable.

 

The hard disk has one or more metal platters coated top and bottom with a magnetic material similar to the coating on a VCR magnetic tape. In a VCR the tape moves past a fixed recording and sensing device (the "head"). With a disk, the head is connected to an "arm" which is moved in and out along a radius of the disk circle. To read or write information, the computer or disk controller must figure out where the data is on the disk. The arm is moved the correct distance, and then it waits until the location on the disk where the data is located rotates around to the point where it passes under the head and can be transferred. The surface of the disk is preformatted into units that hold 512 bytes of data (the "sectors").

In the first generation of PCs, the electronics to move and position the arm and to control the recording or sensing were placed on a separate controller card. Advances in chip technology allow this function to be done by logic on the disk drive itself, which can be more easily tuned at the factory to the special features of each type of device. Today there are two technologies, IDE and SCSI.

IDE (Standard on Desktop PCs)

IDE (Integrated Drive Electronics) is the least expensive current disk technology. IDE support is usually built into the mainboard, though it is also possible to get an interface card for the ISA bus for around $30. An IDE disk is connected to the mainboard or interface card through a flat "ribbon" cable. Rather than invent a new interface, the signals in the IDE cable simply duplicate the activity on the ISA bus itself.

After November 1994, vendors started to ship systems with Enhanced IDE (EIDE). Classic IDE supported two hard disks of 528 megabytes or less. EIDE allows four devices, including a mixture of disks, tapes, and CD-ROMs, and the hard disks can be larger.

An IDE interface cable has two plugs and can be attached to two devices. The first device acts as the master, and the second device acts as a slave. This interface is busy if either device is processing a request, so activity on one device blocks access to the other. It will generally be necessary when adding a new disk to a system to set a switch or connector on the disk to indicate if it is to function as master or slave.

When the EIDE standard was designed, compatibility with all the existing IDE devices was needed, so the rules on the cable were not changed. An EIDE interface chip can support four devices, but it has two interface cables, each connecting two devices. The EIDE chip looks and acts like two IDE chips. An old IDE disk can be connected to a new EIDE connector.

However, a new large EIDE disk cannot always be connected to an old PC. The original IBM programming interface limited the disk space to 528 megabytes (not a big problem when hard disks had 10 or 20 megs). Today there are 1 gig disks advertised for little more than $200. However, an old IDE disk interface chip may not support data beyond the first 528 megs. You can buy a new interface card for $40, but even then the BIOS on old systems will not support I/O to partitions that extend beyond 528 megs. You may need to load a new operating system (Windows 95, OS/2, or Windows NT) and the partitions containing the operating system files may have to reside completely within the first 528 megs of the disk.
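The 528 megabyte figure is not arbitrary: it is what remains when the BIOS disk interface limits on cylinders, heads, and sectors are combined with the different limits imposed by the IDE/ATA register set. The short calculation below uses the commonly cited limits (stated here as an assumption) to show where the number comes from:

```c
#include <stdio.h>

/* The BIOS INT 13h interface and the IDE/ATA register set each limit
 * cylinders, heads, and sectors differently; a drive addressed through
 * both is restricted to the smaller of each limit. */
int main(void)
{
    /* BIOS INT 13h limits */     /* ATA limits */
    long bios_cyl  = 1024,  ata_cyl  = 65536;
    long bios_head = 256,   ata_head = 16;
    long bios_sect = 63,    ata_sect = 256;

    long cyl  = bios_cyl  < ata_cyl  ? bios_cyl  : ata_cyl;    /* 1024 */
    long head = bios_head < ata_head ? bios_head : ata_head;   /*   16 */
    long sect = bios_sect < ata_sect ? bios_sect : ata_sect;   /*   63 */

    long long bytes = (long long)cyl * head * sect * 512;
    printf("addressable capacity: %lld bytes (about %.0f MB)\n",
           bytes, bytes / 1e6);
    return 0;
}
```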

Computers built in the last year should come with Enhanced IDE (EIDE). The extensions overcome limits in the original IDE design: the two-device maximum, the 528 megabyte capacity ceiling, and the restriction to hard disks only (no tape or CD-ROM support).

Since EIDE simulates two separate IDE interface chips, there is an optimization that many customers do not fully appreciate. Newer operating systems (OS/2, Windows NT, and even Windows 95 to some extent) permit more than one I/O request to be running at a time. When a program wants to read something from a disk, the request is given to the disk interface and another program is allowed to run while the first program waits for data. However, the IDE interface allows only one of the two disks connected to the same cable to be active at a time, and any request to use the second disk will be blocked while data is being read from the first disk. An EIDE interface duplicates this IDE restriction, but since the EIDE chip looks like two IDE devices, a request can be made through the second interface while the first interface is busy.

If you run plain old DOS and Windows 3.x, it doesn't matter. Those systems will wait for any operation to complete before running any other program. However, if you are running a new system, and if you purchase a second IDE hard disk, then there is a performance advantage to putting the second drive on the second interface cable (managed by the second simulated IDE "device") rather than connecting it to the same flat disk interface to which the first disk is connected. On separate cables, the two disks can be active at the same time.

However, if you have two hard disks and an EIDE CD-ROM, then it is best to put the two disks on the same cable and isolate the CD-ROM on the second cable. A CD-ROM is much slower than a hard disk, and it will be busy longer. If it is on the same cable with a hard disk, it will block access to that disk when any request is made. Unless it is used very infrequently, the best performance will probably be provided by isolating the slow CD-ROM on its own cable.

SCSI (For Servers and Power Users)

SCSI provides a standard interface for all types of computers. The IDE disk and the ISA bus are peculiar to IBM-compatible, Intel-based PCs. SCSI, however, is used by Macintosh computers, RISC workstations, minicomputers, and even some mainframes. SCSI has always supported a mixture of disks, tapes, and CD-ROM drives. While EIDE disks may go up to one gigabyte, SCSI disks are available with 4 to 9 gigabytes of storage.

SCSI is a bus. In the SCSI architecture, the PC (or more precisely, the SCSI adapter card in the PC) is just one device on the bus. Each device is a "peer" of the other devices. In theory, a tape drive could send commands to the PC. In practice, the tape drive isn't smart enough and the PC doesn't respond to commands anyway.

In the Classic SCSI bus, there are 25 signals, each represented by a pair of wires (50 wires altogether). Nine of the wires hold the eight bits plus parity of a byte of data. The other wires carry control functions. Classic SCSI can transfer data at up to 5 megabytes per second. The Fast SCSI option of the SCSI-2 standard allows 10 megabytes per second on the same cable. To run faster, a Fast Wide SCSI interface is defined, but it requires more than the usual 50-wire cable.

An IDE disk must be mounted inside the computer. There is no provision for the IDE ribbon cable to run to external devices. SCSI devices can also be internal. They are connected to each other and to the adapter card using a flat ribbon cable with 50 wires (OK) or a round bundled cable with 25 twisted pairs of wires (Better).

However, SCSI devices can also be external to the computer. They can be mounted in individual boxes, or can be mounted together in larger tower enclosures. The adapter card is connected to external SCSI devices with a round cable containing 25 twisted pairs of wires. Several different styles of SCSI plug are in common use.

EIDE or SCSI?

EIDE comes standard with any modern computer. The interface is built into the mainboard and requires no slots. SCSI requires an adapter card that may cost an additional $200.

IDE disks are also cheaper. A one gig EIDE disk is advertised for $220, while smaller SCSI disks start at $100 more. In most cases, the EIDE and SCSI disks are physically the same, operate at the same speed, and differ only in their electronics.

SCSI may be able to transfer data somewhat faster on the cable, but with a disk the limiting factor is the speed at which it rotates, not the electronic transfer speed.

EIDE is not an option for systems that don't start out as PCs. Mac systems, RISC workstations, and minicomputers use SCSI. SCSI also provides for the external connection of devices ("Zip" removable high-capacity drives, writable CD-ROM units, backup tape units).

SCSI is worth the extra cost in a Server. EIDE supports two separate I/O operations to two disks (on the two different interface cables). SCSI allows all of the disk devices to be active simultaneously. Of course, only one device can be transferring data on the SCSI cable at any given time. However, a disk spends most of its time moving the arm to the right location and waiting for the data to rotate around to the point where it can be read or written. A SCSI controller can have all of its disks moving into position while one disk is actively transferring data.

It goes without saying that a good disk interface on a modern server will connect to the PCI bus. The ISA bus is unreasonably slow (by modern terms), the Micro Channel is expensive, and EISA is now obsolete. However, there are both IDE and SCSI adapters that interface to the PCI bus.

An EIDE adapter will always be dumb and cheap. A SCSI adapter can be smart enough to Busmaster. As a Busmaster, the SCSI card can transfer data to or from buffers in memory directly. This frees up the CPU to do other things. To get the full benefit, the computer must be running an operating system (Windows NT, OS/2, Netware, or Unix) that can take advantage of the full capability of the card.

Currently, the most popular high performance PCI bus SCSI controller card is made by Adaptec. It goes by a variety of names (Adaptec 2940, AIC7870). This card is on the leading edge of technology, and there are updated drivers for it for Windows 95 (ftp them from Microsoft on the Internet), Windows NT (in Service Pack 2 for Version 3.51), OS/2, and Linux. Those who prefer a more sedate life should probably wait a few months for things to stabilize.

 

The immense success of notebooks gave the development of flat panel displays an unsurpassed boost. Obviously, this also had an effect on normal desktop flat panel displays. Flat panel displays (often referred to as TFTs or LC displays) have been a hot topic of discussion, above all in Europe and Japan. This presence in the media is rather surprising, since the sales volume of such devices for 1998 was nowhere near that of CRTs; the market leader for CRTs sold more units in just one week. On the other hand, there is an enormous demand for TFTs, and one that currently cannot be met. The supply situation for TFTs is therefore tense in a way that has become rare in the PC market and that would normally be resolved quickly. Things are different in the TFT market, and the explanations offered for the sparse supply include the low availability of glass, the restricted production volumes of the vendors, and the unwillingness of manufacturers to lose money by investing in what they see as a high-risk business. The fact remains that the majority of TFTs are used in a business environment, especially where desktop space really is critical, or where noise levels, heat dissipation, and health factors play an important role.

 

Modern display technologies are currently classified as either cathode ray tube monitors (CRTs) or flat panel displays. Tube devices are large and take up a lot of space; flat panel displays - i.e. devices without a tube - are, as the name states, flat and space-saving. The flat panel display category itself encompasses a number of very different technologies such as LCDs (Liquid Crystal Displays), plasma displays, LEDs (Light Emitting Diodes), and various other devices. Within these technologies, one can distinguish between flat panel displays that emit light and those that use a back light that passes through them.

We will discuss those flat panel displays that - from the current point of view - seem the most promising: so-called TFT-LCDs. These devices belong to the group of displays that use a back light passing through them. STN and DSTN (passive matrix LCDs) are also used, but nowadays only in very low-priced notebooks.

Figure 1: Overview of the different flat panel display technologies. Active matrix LCDs have prevailed on the market.

TFT stands for 'Thin Film Transistor' and describes the control elements that actively control the individual pixels. For this reason, one speaks of so-called 'active matrix TFTs'. How are images produced? The basic principle is quite simple: a panel with many pixels is used, whereby each pixel can emit any color. To this purpose, a back light is used, which is normally comprised of a number of fluorescent tubes. In order to light a single pixel, all that needs to be done is for a small 'door' or 'shutter' to open to let the light pass through. The technology that makes this possible is of course more complicated and involved than this simple explanation.

LCD (Liquid Crystal Display) stands for monitors that are based on liquid crystals. Liquid crystals can change their molecular structure and therefore allow varying levels of light to pass through them (or they can block the light). Two polarizer filters, color filters, and two alignment layers determine exactly how much light is allowed to pass and which colors are created. The layers are positioned between the two glass panels. A specific voltage is applied to the alignment layer, creating an electric field - which then aligns the liquid crystals. Each dot on the screen (pixel) therefore requires three components, one each for red, green, and blue - just as for the tubes within cathode ray tube devices.

The most common devices are Twisted Nematic TFTs. The following sections explain the way in which such TFTs work. A number of different technologies obviously exist.

Figure 2a: How a Standard TFT (Twisted Nematic) Display works

When no voltage is applied, the molecule structures are in their natural state and twisted by 90 degrees. The light emitted by the back light can then pass through the structure.

Figure 2b: How a Standard TFT (twisted nematic) works

If a voltage is applied, i.e. an electric field is created, the liquid crystals are twisted so that they are vertically aligned. The polarized light is then absorbed by the second polarizer. Light can therefore not leave the TFT display at this location.

 

Architecture of a TFT Pixel

The color filters for red, green and blue are integrated on to the glass substrate next to each other. Each pixel (dot) is comprised of three of these color cells or sub-pixel elements. This means that with a resolution of 1280 x 1024 pixels, exactly 3840 x 1024 transistors and pixel elements exist. The dot or pixel pitch for a 15.1 inch TFT (1024 x 768 pixels) is about 0.0118 inch (or 0.30 mm) and for an 18.1 inch TFT (1280 x 1024 pixels) it's about 0.011 inch (or 0.28 mm).

Figure 4: Pixels of a TFT. The left upper corner of a cell incorporates a Thin Film Transistor. Color filters allow the cells to change their RGB basic colors.

The pixels are decisive and the smaller their spacing, the higher the maximum possible resolution. However, TFTs are also subject to physical limitations due to the maximum display area. With a diagonal of 15 inch (or about 38 cm) and a dot pitch of 0.0117 inch (0.297 mm), it makes little sense to have a resolution of 1280 x 1024. Part 4 of this report covers the relationship between dot pitch and diagonal dimensions in more detail.
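The dot pitch figures quoted above can be checked with a little geometry: the pitch is the viewable diagonal divided by the number of pixels along that diagonal. The sketch below assumes square pixels and that the quoted size is the full viewable diagonal:

```c
#include <stdio.h>
#include <math.h>

/* pitch = diagonal length / number of pixels along the diagonal */
static void dot_pitch(double diagonal_inches, int w, int h)
{
    double diagonal_pixels = sqrt((double)w * w + (double)h * h);
    double pitch_mm = diagonal_inches * 25.4 / diagonal_pixels;
    printf("%4.1f\" panel at %dx%d: pitch %.3f mm (%.4f inch)\n",
           diagonal_inches, w, h, pitch_mm, pitch_mm / 25.4);
}

int main(void)
{
    dot_pitch(15.1, 1024, 768);    /* about 0.30 mm */
    dot_pitch(18.1, 1280, 1024);   /* about 0.28 mm */
    return 0;
}
```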

Pixels are in a fixed location and therefore define the resolution of a TFT without any geometrical problems. In other words: the maximum number of pixels corresponds to the maximum resolution. But what about lower resolutions? What happens if you have to switch to a lower resolution, as is often necessary for games, video playback and other applications? In this case it is important that the electronics scale the 'smaller' image up to the maximum size of the display panel. If the circuitry can't handle this task efficiently, the result will be distorted and not exactly ergonomic. From a technical point of view, this is not as easy to handle as for CRTs.

Why? In the case of CRTs, the electron beam can be adapted to the new resolution by simply changing the deflecting voltages. Besides, it basically doesn't matter if the beam happens to hit a point between two pixels occasionally. This is quite a different matter in the case of TFTs: due to the active control of every individual pixel, complex scaling electronics are required to recalculate the data for lower resolutions. With whole-number scaling factors (e.g. a factor of 2 when scaling 800 x 600 up to 1600 x 1200) it's fairly simple: the height and width of each pixel are doubled, and the displayed image is shown correctly. Things become harder when scaling from 800 x 600 to 1024 x 768. The scaling factor is then 1.28, i.e. not a whole number (integer). It's no longer possible to uniquely assign data to a single pixel in every case. The electronics therefore have to decide whether to activate one pixel or two. Mathematical rounding-off errors then lead to unpleasant effects when displaying text (see figure below). State-of-the-art electronic components can reduce this effect using a trick (see Advanced Scaling): if data can't be uniquely assigned to a pixel, then the pixel's display intensity is reduced, which softens the visual impression.
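The rounding problem can be made concrete with a few lines of code. The sketch below maps 800 source columns onto 1024 panel columns using simple nearest-neighbor truncation (an assumption; real panel electronics use more sophisticated filtering) and counts how many source columns end up displayed twice:

```c
#include <stdio.h>

/* Scaling 800 source columns onto 1024 panel columns (factor 1.28):
 * some source columns must drive one panel column, others two. */
int main(void)
{
    const int src_width = 800, dst_width = 1024;
    int uses[800] = {0};

    for (int dst = 0; dst < dst_width; dst++) {
        int src = dst * src_width / dst_width;   /* nearest source column (truncating) */
        uses[src]++;
    }

    int doubled = 0;
    for (int src = 0; src < src_width; src++)
        if (uses[src] == 2)
            doubled++;

    printf("%d of %d source columns are shown twice, the rest once\n",
           doubled, src_width);
    return 0;
}
```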

Figure 5: Scaling using the character "m". Scaling factors with fractional numbers often cause visual distortion.