Thursday, December 4, 2014

Memory Devices: Introduction, Memory Organization, Memory Hierarchy, System Level Memory Organization, Memory Device Organization, Memory Device Types, Read-Only Memory, Random Access Memory (RAM), Special Memory Structures, Interfacing Memory Devices, Accessing DRAMs, Refreshing the DRAM, and Error Detection and Correction

Memory Devices

Introduction

Memory is an essential part of any computation system. It is used to store both the computation instructions and the data. Logically, memory can be viewed as a collection of sequential locations, each with a unique address as its label and capable of storing information. Accessing memory is accomplished by supplying the address of the desired data to the device.

Memory devices can be categorized according to their functionality and fall into two major categories, read-only-memory (ROM), and write-and-read memory or random-access memory (RAM). There is also another subcategory of ROM: mostly read but sometimes write memory or flash ROM memory. Within the RAM category there are two types of memory devices differentiated by storage characteristics, static RAM (SRAM) and dynamic RAM (DRAM) respectively. DRAM devices need to be refreshed periodically to prevent the corruption of their contents due to charge leakage. SRAM devices, on the other hand, do not need to be refreshed.

Both SRAM and DRAM are volatile memory devices, which means that their contents are lost if the power supply is removed. Nonvolatile memory, the opposite of volatile memory, retains its contents even when the supply power is turned off. All current ROM devices, including mostly-read, sometimes-write devices, are nonvolatile memories.

Except for a very few special memories, these devices are all interfaced in a similar way. When an address is presented to a memory device, and sometimes after a control signal is strobed, the information stored at the specified address is retrieved after a certain delay. This process is called a memory read. This delay, defined as the time taken from address valid to data ready, is called the memory read access time. Similarly, data can be stored into the memory device by performing a memory write. When writing, data and an address are presented to the memory device with the activation of a write control signal. Other control signals are also used in interfacing. For example, most memory devices in packaged chip format have a chip select (or chip enable) pin. The particular memory device becomes active only when this pin is asserted. Once an address is supplied to the chip, internal address decoding logic is used to pinpoint the particular content for output. Because of the nature of the circuit structure used in implementing the decoding logic, a memory device usually needs to recover before a subsequent read or write can be performed. The minimum time between subsequent address issues is therefore called the cycle time. Cycle time is usually twice as long as the access time. There are other timing requirements for memory devices. These timing parameters play a very important role in interfacing the memory devices with computation processors. In many situations, a memory device’s timing parameters greatly affect the performance of the computation system.

Some special memory structures do not follow the general accessing scheme of using an address. Two of the most frequently used are content addressable memory (CAM) and first-in-first-out (FIFO) memory. Another type of memory device, which accepts multiple addresses and produces several results at different ports, is called multiport memory. There is also a type of memory that can be written in parallel but is read serially. It is referred to as video RAM or VDRAM, since it is used primarily in graphics display applications. We will discuss these in more detail later.

2. Memory Organization

There are several aspects to memory organization. We will take a top down approach in discussing them.

Memory Hierarchy

The speed of memory devices has been lagging behind the speed of processors. As processors become faster and more capable, larger memory spaces are required to keep up with the ever-increasing complexity of the software written for these machines. Figure 8.1(a) illustrates the well-known Moore’s law, depicting the exponential growth in central processing unit (CPU) and memory capacity. Although CPU speed continues to grow with the advancement of technology and design technique (in particular pipelining), increasing memory size means that more time is needed to decode wider and wider addresses and to sense the information stored in the ever-shrinking storage element. The speed gap between CPU and memory devices continues to grow wider. Figure 8.1(b) illustrates this phenomenon.

The strategy used to remedy this problem is called memory hierarchy. Memory hierarchy works because of the locality property of memory references, which arises from sequentially fetched program instructions and the clustering of related data. A hierarchical memory system has many levels. A small amount of very fast memory is usually allocated and brought right next to the central processing unit to help match up the speed of the CPU and memory. As the distance between the CPU and memory becomes greater, the performance requirement for the memory is relaxed. At the same time, the size of the memory grows larger to accommodate the overall memory size requirement. Some of the levels in the hierarchy are registers, cache, main memory, and disk. Figure 8.2 illustrates the general memory hierarchy employed in a traditional system.

When a memory reference is made, the processor accesses the memory at the top of the hierarchy. If the desired data is in that higher level, a hit is encountered and the information is obtained quickly. Otherwise a miss is encountered, and the requested information must be brought up from a lower level in the hierarchy. Usually memory space is divided into blocks so that it can be transferred between levels in groups. At the cache level a chunk is called a cache block or a cache line. At the main memory level a chunk is referred to as a memory page. A miss in the cache is called a cache miss and a miss in the main memory is called a page fault. When a miss occurs, the whole block of memory containing the requested information is brought in from the lower level. If the current level is full when a miss occurs, some existing blocks or pages must be removed, and sometimes written back to a lower level, to allow the new one(s) to be brought in. There are several different replacement algorithms. One of the most commonly used is the least recently used (LRU) replacement algorithm.
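The hit, miss, and LRU replacement behavior described above can be illustrated with a small sketch. This is only a toy model under stated assumptions: the class name `LRUCache`, the fully associative organization, and the `fetch_from_lower_level` callback are all hypothetical and are not taken from the text; real caches implement (or approximate) LRU in hardware.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache holding at most `capacity` blocks (assumed model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block number -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def access(self, block, fetch_from_lower_level):
        if block in self.blocks:
            # Hit: the block is already at this level; mark it most recently used.
            self.hits += 1
            self.blocks.move_to_end(block)
            return self.blocks[block]
        # Miss: the block must be brought up from the lower level.
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            # This level is full: evict the least recently used block.
            self.blocks.popitem(last=False)
        data = fetch_from_lower_level(block)
        self.blocks[block] = data
        return data


cache = LRUCache(capacity=2)
for block in [0, 1, 0, 2, 1]:
    cache.access(block, fetch_from_lower_level=lambda b: f"data{b}")
print(cache.hits, cache.misses, list(cache.blocks))
```

In this reference stream, the access to block 2 evicts block 1 rather than block 0, because block 0 was touched more recently.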

In modern computing systems, there may be several sublevels of cache within the hierarchy of cache. The general principle of memory hierarchy is that the farther away from the CPU a level is, the larger its size, the slower its speed, and the cheaper its price per memory unit. Because the memory space addressable by the CPU is normally larger than necessary for a particular software program at a given time, disks are used to provide an economical supplement to main memory. This technique is called virtual memory. Besides disks there are tapes, optical drives, and other backup devices, which we normally call backup storage. They are used mainly to store information that is no longer in use, to protect against main memory and disk failures, or to transfer data between machines.

System Level Memory Organization

On the system level, we must organize the memory to accomplish different tasks and to satisfy the needs of the program. In a computation system, addresses are supplied by the CPU to access the data or instruction. With a given address width a fixed number of memory locations may be accessed. This is referred to as the memory address space. Some processing systems have the ability to access another separate space called input/output (I/O) address space. Others use part of the memory space for I/O purposes. This style of performing I/O functions is called memory-mapped I/O.

The memory address space defines the maximum size of the directly addressable memory that a computation system can access using memory type instructions. For example, a processor with a 16-b address width can access up to 64 K different locations (memory entries), whereas a 32-b address width can access up to 4 G different locations. However, sometimes we can use indirection to increase the address space. The method used by 80X86 processors provides an excellent example of how this is done. The address used by the user or programmer in specifying a data item stored in the memory system is called a logical address. The address space accessed by the logical address is named the logical address space. However, this logical address may not necessarily be used directly to index the physical memory. We call the memory space accessed by the physical address the physical address space. When the logical space is larger than the physical space, a memory hierarchy is required to accommodate the difference in space sizes by storing the excess in a lower level of the hierarchy. In most current computing systems, a hard disk is used as this lower level. This is termed a virtual memory system. The mapping of logical address to physical address can be either linear or nonlinear. The actual address calculation to accomplish the mapping is done by the CPU and the memory management unit (MMU).

Thus far, we have not specified the exact size of a memory entry. A commonly used memory entry size is one byte. For historical reasons, memory is organized in bytes. A byte is usually the smallest unit of information transferred with each memory access. Wider memory entries are becoming more popular as the CPU continues to grow in speed and complexity. Many modern systems have a data width wider than a byte. A common size in current desktops is a double word (32 b). As a result, memory in bytes is organized in sections of multiple bytes. However, because of the need for backward compatibility, these wide-datapath systems are also organized to be byte addressable. The maximum width of the memory transfer is usually called the memory word length, and the size of the memory in bytes is called the memory capacity. Since there are different memory device sizes, the memory system can be populated with different sized memory devices.
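As a quick check of the address-space figures quoted above, the following sketch computes the number of locations reachable with a given address width, and shows one way a byte address could be split on a byte-addressable system with 32-b words. The function names and the choice of the low two bits as the byte position within a word are illustrative assumptions, not details given in the text.

```python
def addressable_locations(address_width_bits):
    """Number of distinct locations reachable with the given address width."""
    return 2 ** address_width_bits


def split_byte_address(byte_address, bytes_per_word=4):
    """Split a byte address into (word index, byte position within the word).

    Assumes words are aligned and bytes_per_word is 4 for a 32-b double word.
    """
    return byte_address // bytes_per_word, byte_address % bytes_per_word


K = 2 ** 10
G = 2 ** 30

print(addressable_locations(16) // K, "K locations with a 16-b address")
print(addressable_locations(32) // G, "G locations with a 32-b address")
print(split_byte_address(0x1006))
```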

Memory Device Organization

Physically, within a memory device, cells are arranged in a two-dimensional array, with each cell capable of storing 1 b of information. This matrix of cells is accessed by specifying the desired row and column addresses. The individual row enable line is generated using an address decoder, while the column is selected through a multiplexer. There is usually a sense amplifier between the column bit line and the multiplexer input to detect the content of the memory cell it is accessing. Figure 8.3 illustrates this general memory cell array with r bits of row address and c bits of column address. With a total of r + c address bits, this memory structure contains 2^(r+c) bits. As the size of the memory array increases, the row enable lines, as well as the column bit lines, become longer. To reduce the capacitive load of a long row enable line, the row decoders, sense amplifiers, and column multiplexers are often placed in the middle of divided matrices of cells, as illustrated in Fig. 8.4. By designing the multiplexer differently we are able to construct memory with different output widths, for example, ×1, ×8, ×16, etc. In fact, memory designers go to great lengths to design the column multiplexers so that most of the fabrication masks may be shared among memory devices that have the same capacity but different configurations.
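The row/column addressing described above can be sketched as follows. This is a behavioral model only: the names `split_address` and `CellArray` are hypothetical, and the assumption that the upper r bits select the row while the lower c bits select the column is one common convention that the text does not specify.

```python
def split_address(address, row_bits, col_bits):
    """Split an (r + c)-bit address into (row, column).

    Assumption: the upper row_bits feed the row decoder and the lower
    col_bits drive the column multiplexer.
    """
    if not 0 <= address < 2 ** (row_bits + col_bits):
        raise ValueError("address out of range")
    row = address >> col_bits
    col = address & ((1 << col_bits) - 1)
    return row, col


class CellArray:
    """A 2^r by 2^c matrix of 1-b cells, i.e. 2^(r+c) bits in total."""

    def __init__(self, row_bits, col_bits):
        self.row_bits = row_bits
        self.col_bits = col_bits
        self.cells = [[0] * (2 ** col_bits) for _ in range(2 ** row_bits)]

    def write_bit(self, address, bit):
        row, col = split_address(address, self.row_bits, self.col_bits)
        self.cells[row][col] = bit & 1

    def read_bit(self, address):
        row, col = split_address(address, self.row_bits, self.col_bits)
        return self.cells[row][col]


array = CellArray(row_bits=3, col_bits=2)
array.write_bit(0b10110, 1)
print(split_address(0b10110, 3, 2), array.read_bit(0b10110))
```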

3. Memory Device Types

As mentioned before, according to the functionality and characteristics of memory, we may divide memory devices into two major categories: ROM and RAM. We will describe these different types of devices in the following sections.

Read-Only Memory

In many systems, it is desirable to have the system level software (for example, the basic input/output system [BIOS]) stored in a read-only format, because these types of programs are seldom changed. Many embedded systems also use read-only memory to store their software routines, because these programs are generally never changed during their lifetime. Information stored in the read-only memory is permanent. It is retained even if the power supply is turned off. The memory can be read out reliably by a simple current sensing circuit without worrying about destroying the stored data. Figure 8.5 shows the general structure of a read-only memory (ROM). The effective switch position at the intersection of the word line and bit line determines the stored value. This switch can be implemented using different technologies, resulting in different types of ROMs.

Masked Read-Only Memory (ROM)

The most basic type of read-only memory is called masked ROM, or simply ROM. It is programmed at manufacturing time using fabrication processing masks. ROM can be produced using different technologies: bipolar, complementary metal oxide semiconductor (CMOS), n-channel metal oxide semiconductor (nMOS), p-channel metal oxide semiconductor (pMOS), etc. Once these devices are programmed, there is no means to change their contents. Moreover, the programming process is performed at the factory.

Programmable Read-Only Memory (PROM)

Some read-only memory is one-time programmable, but it is programmable by the user at the user’s own site. This is called programmable read-only memory (PROM). It is also often referred to as write once memory (WOM). PROMs are based mostly on bipolar technology, since this technology supports it very well. Each single transistor in a cell has a fuse connected to its emitter. This transistor and fuse make up the memory cell. When a fuse is blown, no connection can be established when the cell is selected using the row line, and a zero is thereby stored. Otherwise, with the fuse intact, a logic one is represented. The programming is done through a programmer called a PROM programmer or PROM burner. Figure 8.6 illustrates the structure of a bipolar PROM cell and its cross section when fabricated.

Erasable Programmable Read-Only Memory (EPROM)

It is sometimes inconvenient to program the ROM only once. Thus, the erasable PROM, called EPROM, was designed. The programming of a cell is achieved by avalanche injection of high-energy electrons from the substrate through the oxide. This is accomplished by applying a high drain voltage, causing the electrons to gain enough energy to jump over the 3.2-eV barrier between the substrate and silicon dioxide, thus collecting charge at the floating gate. Once the applied voltage is removed, this charge is trapped on the floating gate. Erasing is done using an ultraviolet (UV) light eraser. Incoming UV light increases the energy of electrons trapped on the floating gate. Once the energy is increased above the 3.2-eV barrier, the electrons leave the floating gate and move toward the substrate and the select gate. Therefore, these EPROM chips all have windows on their packages through which erasing UV light can reach inside to erase the content of the cells. The erase time is usually measured in minutes.

The presence of a charge on the floating gate causes the MOS transistor to have a high threshold voltage. Thus, even with a positive select gate voltage applied at the second level of polysilicon, the MOS transistor remains turned off. The absence of a charge on the floating gate causes the MOS transistor to have a lower threshold voltage. When the gate is selected, the transistor turns on and gives the opposite data bit. Figure 8.7 illustrates the cross section of an EPROM cell with a floating gate.

EPROM technologies that migrate toward smaller geometries make floating-gate discharge (erase) via UV light exposure increasingly difficult. One problem is that the width of metal bit lines cannot be reduced proportionally with advancing process technologies. EPROM metal width requirements limit bit-line spacing, thus reducing the number of high-energy photons that reach charged cells. Therefore, EPROM products built on submicron technologies will face longer and longer UV exposure times.

Electrical Erasable Read-Only Memory (EEPROM)

Reprogrammability is a very desirable property. However, it is very inconvenient to use a separate light-source eraser for altering the contents of the memory. Furthermore, even a few minutes of erase time is intolerable. For this reason, an erasable PROM called electrical erasable PROM (EEPROM) was designed. EEPROM permits new applications in which erasing is done without removing the device from the system it resides in. There are a few basic technologies used in the processing of EEPROMs, or electrical reprogrammable ROMs. All of them use the Fowler-Nordheim tunneling effect to some extent. In this tunneling effect, cold electrons jump through the energy barrier at a silicon/silicon-dioxide interface and into the oxide conduction band. This can happen only when the oxide thickness is 100 Å or less, depending on the technology. The tunneling effect is reversible, allowing the reprogrammable ROMs to be used over and over again. One of the first electrical erasable PROMs is the electrical alterable ROM (EAROM), which is based on metal-nitride-oxide-silicon (MNOS) technology. The other is EEPROM, which is based on the silicon floating gate technology used in fabricating EPROMs. The floating gate type of EEPROM is favored because of its reliability and density.

The major difference between EPROM and EEPROM is in the way they discharge the charge stored in the floating gate. An EEPROM must discharge floating gates electrically, whereas an EPROM device uses a UV light source: electrons absorb photons from the UV radiation and gain enough energy to jump the silicon/silicon-dioxide energy barrier in the reverse direction as they return to the substrate. The solution for the EEPROM is to pass low-energy electrons through the thin oxide by means of a high field (on the order of 10^7 V/cm). This is known as Fowler-Nordheim tunneling, in which electrons can pass a short distance through the forbidden gap of the insulator and enter the conduction band when the applied field is high enough. There are three common types of flash EEPROM cells. One uses the erase gate (three levels of polysilicon); the second and third use the source and drain, respectively, to erase. Figures 8.8(a)–8.8(d) illustrate the cross sections of different EEPROMs. To realize a small EEPROM memory cell, the NAND structure was proposed in 1987. In this structure, cells are arranged in series. By using different patterns, it is possible to detect whether an individual cell is programmed or not.

From the user’s point of view, EEPROMs differ from RAM only in their write time and in the number of writes allowed before failure occurs. Early EEPROMs were hard to use because they had no latches for data and address to hold values during the long write operations. They also required a programming voltage higher than the operating voltage. Newer EEPROMs use charge pumps to generate the high programming voltage on the chip, so the user does not need to provide a separate programming voltage.

FIGURE 8.8 (a) Triple poly EEPROM cell layout and structure, (b) flotox EEPROM cell structure (source programming), (c) EEPROM with drain programming, (d) another source programming EEPROM.

Flash-EEPROM

This type of erasable PROM lacks the circuitry to erase individual locations. When you erase them, they are erased completely. By doing so, many transistors may be saved, and larger memory capacities are possible. Note that sometimes you do not need to erase before writing. You can also write to an erased, but unwritten location, which results in an average write time comparable to an EEPROM. Another important thing to know is that writing zeros into a location charges each of the flash EEPROM’s memory cells to the same electrical potential so that subsequent erasure drains an equal amount of free charge (electrons) from each cell. Failure to equalize the charge in each cell prior to erasure can result in the overerasure of some cells by dislodging bound electrons in the floating gate and driving them out. When a floating gate is depleted in this way, the corresponding transistor can never be turned off again, thus destroying the flash EEPROM.
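The erase-everything, program-by-writing-zeros behavior described above can be modeled with a short sketch. It rests on assumptions the text does not state: that an erased location reads as all ones (0xFF), that locations are one byte wide, and that programming can only move bits from 1 to 0. The class name `FlashBlock` is hypothetical, and the model ignores the charge-equalization step needed before erasure.

```python
class FlashBlock:
    """Behavioral model of a bulk-erasable flash block (assumed conventions)."""

    ERASED = 0xFF  # assumption: an erased byte reads as all ones

    def __init__(self, size):
        self.cells = [self.ERASED] * size

    def erase(self):
        # There is no circuitry to erase one location: the whole block is erased.
        self.cells = [self.ERASED] * len(self.cells)

    def write(self, address, value):
        # Programming charges cells (1 -> 0). A bit that is already 0 cannot
        # be returned to 1 without erasing the whole block.
        if value & ~self.cells[address] & 0xFF:
            raise ValueError("location must be erased before this write")
        self.cells[address] = value


block = FlashBlock(size=4)
block.write(0, 0xA5)      # writing to an erased, unwritten location needs no erase
try:
    block.write(0, 0xFF)  # would require turning 0 bits back into 1 bits
    needs_erase = False
except ValueError:
    needs_erase = True
block.erase()
print(needs_erase, block.cells)
```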

FIGURE 8.9 Different SRAM cells: (a) six-transistor SRAM cell with depletion transistor load, (b) four-transistor SRAM cell with polyresistor load, (c) CMOS six-transistor SRAM cell, (d) five-transistor SRAM cell.

Random Access Memory (RAM)

RAM stands for random-access memory. It is really read-write memory, because ROM is also random access in the sense that, given a random address, the corresponding entry is read. RAM can be categorized by content duration. A static RAM’s contents are always retained, as long as power is applied. A DRAM, on the other hand, needs to be refreshed every few milliseconds. Most RAMs by themselves are volatile, which means that without the power supply their content will be lost. All of the ROMs mentioned in the previous section are nonvolatile. RAM can be made nonvolatile by using a backup battery.

Static Random Access Memory (SRAM)

Figure 8.9 shows various SRAM memory cells (6T, 5T, 4T). The six-transistor (6T) cell is the most commonly used SRAM cell. The cross-coupled inverters in an SRAM cell retain the information indefinitely, as long as the power supply is on, since one of the pull-up transistors supplies current to compensate for the leakage current. During a read, the bit and bitbar lines are precharged while the word enable line is held low. Depending on the content of the cell, one of the lines is discharged slightly when the word enable line is strobed, causing its precharged voltage to drop. This difference in voltage between the bit and bitbar lines is sensed by the sense amplifier, which produces the read result. During a write, one of the bit/bitbar lines is discharged, and by strobing the word enable line the desired data is forced into the cell before the word line is deasserted.

Figure 8.10 gives the circuit of a complete SRAM design with only one column and one row shown. One of the key design decisions for an SRAM cell is the size of the transistors used in the memory cell. We first need to determine the criteria used in sizing transistors in a CMOS six-transistor SRAM cell. Because of symmetry, there are three transistor sizes to choose in a 6T CMOS SRAM cell. They are the pMOS pull-up size, the nMOS pull-down size, and the nMOS access transistor (also called the pass-transistor gate) size. There are two identical copies of each transistor, giving a total of six. Since the sole purpose of the pMOS pull-up is to supply enough current to overcome junction leakage current, we should decide this size first. This is also the reason why some SRAMs completely remove the two pMOS transistors and replace them with two 10-GΩ polysilicon resistors, giving the 4T cell shown in Fig. 8.9. Since one of the goals is to make the cell layout as small as possible, the pMOS pull-up is chosen to be minimum in both its length and width. The length of the pMOS pull-up is increased only when there is room available (i.e., if doing so does not increase the cell layout area). Increasing the length of the pMOS pull-up transistors increases the capacitance on the cross-coupled inverters’ output nodes. This helps in protecting against certain soft errors. It also makes the cell slightly easier to write.

The next step is to choose the nMOS access transistor size. This is a rather complicated process. To begin, we need to determine the length of this transistor. It is difficult to choose because, on one hand, we want it also to be minimum in order to reduce the cell layout size. On the other hand, a column of n SRAM bits (rows) has n access transistors connected to the bit or bitbar line. If each of the cells leaks just a small amount of current, the leakage is multiplied by n. Thus, the bit or bitbar line, which one might think should be sitting at the bit-line pull-up (or precharge) voltage, is actually pulled down by this leakage, and its high level is lower than the intended voltage. When this happens, the voltage difference between the bit and bitbar lines, which is seen by the sense amplifier during a read, is smaller than expected, perhaps catastrophically so. Thus, if the transistors used are not particularly leaky or n is small, a minimum sized nMOS is sufficient. Otherwise, a larger sized transistor should be used.

Besides leakage, there are three other factors that may affect the sizes of the two nMOS transistors: (1) cell stability, (2) speed, and (3) layout area. The first factor, cell stability, is a DC phenomenon. It is a measure of the cell’s ability to retain its data when reading and to change its data when writing. A read is done by creating a voltage difference between the bit and bitbar lines (which are normally precharged) for the sense amplifier to differentiate. A write is done by pulling one of the bit or bitbar lines down completely. Thus, one must choose sizes that satisfy cell stability while achieving the maximum read and write speed and maintaining the minimum layout area.

Much work has been done in writing computer-aided design (CAD) tools that automatically size transistors for arrays of SRAM cells and then do the polygon layout. Generally, these are known as SRAM macrogenerators or RAM compilers. The generated SRAM blocks are used as drop-ins in many application specific integrated circuits (ASICs). Standard SRAM chips are also available in many different organizations. Common ones are arranged in widths of 4 b, bytes, and double words (32 b).

There is also a special type of SRAM cell used in computers to implement registers. These are called multiple port memories. In general, the contents can be read by many different requests at the same time. Figure 8.11 shows a dual-read-port, single-write-port SRAM cell. When laying out SRAM cells, adjacent cells usually are mirrored to allow sharing of supply or ground lines. Figure 8.12 illustrates the layout of four adjacent SRAM cells using generic digital process design rules. This block of four cells can be repeated in a two-dimensional array format to form the memory core.

Dynamic Random Access Memory (DRAM)

The main disadvantage of SRAM is its size, since it takes six transistors (or at least four transistors and two resistors) to construct a single memory cell. Thus, the DRAM is used to improve the capacity. There are different DRAM cell designs: the four-transistor DRAM cell, the three-transistor DRAM cell, and the one-transistor DRAM cell. Figure 8.13 shows the corresponding circuits for these cells. Data writing is accomplished in a three-transistor cell by keeping the RD line low (see Fig. 8.13(b)) while strobing the WR line with the desired data kept on the bus. If a 1 is to be stored, the gate of T2 is charged, turning on T2. This charge remains on the gate of T2 for a while before the leakage current discharges it to a point where it can no longer turn on T2. While the charge is still there, a read can be performed by precharging the bus and strobing the RD line. If a 1 is stored, then both T2 and T3 are on during a read, causing the charge on the bus to be discharged. The lowering of voltage can be picked up by the sense amplifier. If a zero is stored, then there is no direct path from the bus to ground, and thus the charge on the bus remains.

To further reduce the area of a memory cell, a single-transistor cell is often used. Figure 8.13(c) shows the single-transistor cell with a capacitor. Usually, two columns of cells are mirror images of each other to reduce the layout area. The sense amplifier is shared, as shown in Fig. 8.14. In this one-transistor DRAM cell, a capacitor is used to store the charge, which determines the content of the memory. The amount of charge in the capacitor also determines the overall performance of the memory. A continuing goal is to downscale the physical area of this capacitor to achieve higher and higher density. Usually, as one reduces the area of the capacitor, the capacitance also decreases. One approach is to increase the surface area of the storage electrode without increasing the layout area by employing stacked capacitor structures, such as finned or cylindrical structures. Certain techniques can be used to build a cylindrical capacitor structure with hemispherical grains. Figure 8.15 illustrates the cross section of a one-transistor DRAM cell with the cylindrical capacitor structure. Since the capacitor is charged through the pass transistor acting as a source follower, it can be charged at most to one threshold voltage drop below the supply voltage. This reduces the total charge stored and affects performance, noise margin, and density. Frequently, to avoid this problem, the word lines are driven above the supply voltage when the data are written. Figure 8.16 shows a typical layout of one-transistor DRAM cells.

Writing is done by putting either a 0 or a 1 (the desired data to store) on the read/write line. Then the row select line is strobed. A zero or one is stored in the capacitor as charge. A read is performed by precharging the read/write line and then strobing the row select. If a zero is stored, charge sharing causes the voltage on the read/write line to decrease. Otherwise, the voltage remains. A sense amplifier is placed at the end to detect whether or not there is a voltage change.

DRAM differs from SRAM in another aspect. As the density of DRAM increases, the amount of charge stored in a cell decreases, and the cell becomes more subject to noise. One type of noise is caused by radiation in the form of alpha particles. These particles are helium nuclei that are present in the environment naturally or are emitted from the package that houses the DRAM die. If an alpha particle hits a storage cell, it may change the state of the memory. Since alpha particles can be reduced, but not eliminated, some DRAMs employ error detection and correction techniques to increase their reliability.

Another difference between DRAMs and SRAMs is in the number of address pins needed for a given size of RAM. SRAM chips require all address bits to be given at the same time. DRAMs, however, use time-multiplexed address lines. Only half of the address bits are given at a given time; they are divided into row and column addresses. An extra control signal is thus required. This is the reason why DRAM chips have two address strobe signals: row address strobe (RAS) and column address strobe (CAS).
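The charge-sharing read of the one-transistor cell described above can be worked through numerically. The sketch below applies conservation of charge between the cell capacitor and the precharged read/write (bit) line. All component values are illustrative assumptions chosen for round numbers (a 30 fF cell, a 270 fF bit line, a 3.0 V precharge level, and a stored 1 written at the full precharge level by a boosted word line); none of them come from the text.

```python
def bit_line_voltage_after_read(c_cell_ff, c_line_ff, v_cell, v_precharge):
    """Voltage on the bit line after the cell capacitor is connected to it.

    Charge is conserved: (C_line + C_cell) * V_final
                         = C_line * V_precharge + C_cell * V_cell.
    Capacitances are in femtofarads; only their ratio matters here.
    """
    total_charge = c_line_ff * v_precharge + c_cell_ff * v_cell
    return total_charge / (c_line_ff + c_cell_ff)


C_CELL_FF = 30.0    # assumed storage capacitor
C_LINE_FF = 270.0   # assumed bit-line capacitance
V_PRECHARGE = 3.0   # assumed precharge (and supply) voltage

v_after_zero = bit_line_voltage_after_read(C_CELL_FF, C_LINE_FF, 0.0, V_PRECHARGE)
v_after_one = bit_line_voltage_after_read(C_CELL_FF, C_LINE_FF, V_PRECHARGE, V_PRECHARGE)

print("stored 0: drop of", round(V_PRECHARGE - v_after_zero, 3), "V")
print("stored 1: drop of", round(V_PRECHARGE - v_after_one, 3), "V")
```

With these assumed values the bit line moves by only a fraction of the supply voltage when a zero is read, which is why a sense amplifier is needed and why shrinking the cell capacitor makes the cell more sensitive to noise.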

Special Memory Structures

The trend in memory devices is toward larger, faster and better-performance products. There is a complementary trend toward the development of special purpose memory devices. Several types of special purpose RAMs are offered for particular applications such as content addressable memory for cache memory, line buffers (FIFOs) for office automation machines, frame buffers for TV and broadcast equipment, and graphics buffers for computers.

Content Addressable Memory (CAM)

A special type of memory called content addressable memory (CAM), or associative memory, is used in many applications such as cache memory and associative processors. A CAM stores a data item consisting of a tag and a value. Instead of giving an address, a data pattern is given to the tag section of the CAM. This data pattern is matched against the content of the tag section. If an item in the tag section of the CAM matches the supplied data pattern, the CAM outputs the value associated with the matched tag. Figure 8.17 illustrates the basic structure of a CAM. CAM cells must be both readable and writable, just like RAM cells. Figure 8.18 shows a circuit diagram for a basic CAM cell with a match output signal. This output signal may be used as an input to logic that determines the outcome of the matching process.
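The tag-matching behavior described above can be sketched in software. This is a behavioral model only: the class and method names are hypothetical, the hardware compares all tags in parallel whereas this sketch loops over them, and returning the first match when several tags match is an assumed policy.

```python
class ContentAddressableMemory:
    """Toy CAM: each entry holds a tag and an associated value."""

    def __init__(self, entries):
        self.tags = [None] * entries    # None marks an empty entry
        self.values = [None] * entries

    def write(self, index, tag, value):
        # CAM cells are writable like RAM cells; the entry is chosen by index.
        self.tags[index] = tag
        self.values[index] = value

    def match(self, pattern):
        # One match line per entry, as produced by the CAM cells' match outputs.
        match_lines = [tag is not None and tag == pattern for tag in self.tags]
        for hit, value in zip(match_lines, self.values):
            if hit:
                return value  # assumed policy: first matching entry wins
        return None  # no tag matched the supplied pattern


cam = ContentAddressableMemory(entries=4)
cam.write(0, 0x1A, "value for tag 0x1A")
cam.write(1, 0x2B, "value for tag 0x2B")
print(cam.match(0x2B), cam.match(0x3C))
```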

First-In-First-Out/Queue (FIFO/Queue)

A FIFO/queue holds data that are waiting to be consumed. It serves as the buffering region between two systems that may produce and consume data at different rates. A FIFO can be implemented using shift registers or RAMs with pointers.
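As a rough illustration of the RAM-with-pointers approach, the sketch below keeps a read pointer and a write pointer that wrap around a fixed-size array. The class name `RingFifo` and its interface are hypothetical and chosen only for this example.

```python
class RingFifo:
    """FIFO built on a fixed-size RAM array with read and write pointers."""

    def __init__(self, depth):
        self.ram = [None] * depth   # storage array standing in for the RAM
        self.depth = depth
        self.read_ptr = 0           # next location to read
        self.write_ptr = 0          # next location to write
        self.count = 0              # number of items currently stored

    def is_empty(self):
        return self.count == 0

    def is_full(self):
        return self.count == self.depth

    def push(self, item):
        """Producer side: store an item and advance the write pointer."""
        if self.is_full():
            raise OverflowError("FIFO full")
        self.ram[self.write_ptr] = item
        self.write_ptr = (self.write_ptr + 1) % self.depth
        self.count += 1

    def pop(self):
        """Consumer side: return the oldest item and advance the read pointer."""
        if self.is_empty():
            raise IndexError("FIFO empty")
        item = self.ram[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % self.depth
        self.count -= 1
        return item


fifo = RingFifo(depth=4)
for word in ("a", "b", "c"):
    fifo.push(word)
print(fifo.pop())  # a (the first item written is the first one read)
```

The producer and consumer touch different pointers, which is what lets the two sides run at different rates as long as the buffer neither overflows nor underflows.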

Video RAM: Frame Buffer

Computer graphics applications are growing rapidly. The most successful display technology is termed raster scanning. In a raster scanning display system, an image is constructed from a series of horizontal lines. Each of these lines is made up of connected pixels of the picture image. Each pixel is represented by bits that control its intensity. Usually there are three planes, one for each of the primary colors: red, green, and blue. These three planes of bit maps are called the frame buffer or image memory. Frame buffer architecture greatly affects the performance of a raster scanning graphics system. Since the frame buffer needs to be read out serially to display the image line by line, a special type of DRAM called video memory is used. Usually these memories are dual ported, with a parallel random access port for writing and a serial port for reading.
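To give a feel for the storage involved, the short calculation below estimates the size of a three-plane frame buffer. The resolution and bit depth are assumed values picked for illustration; they do not come from the text, and the function name is hypothetical.

```python
def frame_buffer_bytes(width, height, bits_per_plane, planes=3):
    """Storage needed for `planes` bit maps (one per primary color)."""
    total_bits = width * height * bits_per_plane * planes
    return total_bits // 8


# Assumed example: 640 x 480 pixels, 8 bits of intensity per color plane.
print(frame_buffer_bytes(640, 480, 8))  # 921600
```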

4. Interfacing Memory Devices

Besides the capacity and type of a device, other characteristics of memory devices include speed and the method of access. Memory access time was mentioned in the Introduction. It is defined as the time between the address becoming available to the device and the data becoming available at the pins for access. Sometimes the access time is measured from a particular control signal. For example, the time from the read control line being ready to the data being ready is called the read command access time. The memory cycle time is the minimum time between two consecutive accesses. The memory write command time is measured from the write control being ready to the data being stored in the memory. The memory latency time is the interval between the CPU issuing an address and the data becoming available for processing. The memory bandwidth is the maximum amount of data that can be transferred in a given time. Access is done with address lines, read/write control lines, and data lines. SRAMs and ROMs are accessed similarly during a read. Figure 8.19 shows the timing diagram of two SRAM read cycles. In both methods the read cycle time is defined as the time period between consecutive read addresses. In the first method, the SRAM acts as an asynchronous circuit. Given an address, the output of the SRAM changes and becomes valid after a certain delay, which is the read access time. The second method uses two control signals, chip select and output enable, to initiate the read process. The main difference is in the data output valid time. With the second method the data output is valid only after the output enable signal is asserted, which allows several devices to be connected to the data bus.

Writing SRAM and electrically reprogrammable ROM is slightly different. Since there are many different programmable ROMs and their writing processes depend on the technology used, we will not discuss the writing of ROMs here. Figure 8.20 shows the timing diagram for writing typical SRAM chips. Figure 8.20(a) shows the write cycle using the write enable signal as the control signal, whereas Fig. 8.20(b) shows the write cycle using the chip enable signals. Accessing DRAM is very different from accessing SRAM and ROMs. We will discuss the different access modes of DRAMs in the following section.
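The relationship between cycle time and bandwidth defined earlier in this section can be made concrete with a small calculation. The figures used here (a 16-bit data bus and a 100 ns cycle time) are assumptions chosen for illustration only, and the function name is hypothetical.

```python
def peak_bandwidth_bytes_per_s(bus_width_bits, cycle_time_ns):
    """Peak transfer rate if one bus-wide word moves every memory cycle."""
    bytes_per_access = bus_width_bits // 8
    accesses_per_second = 1_000_000_000 / cycle_time_ns
    return bytes_per_access * accesses_per_second


# Assumed device: 16-bit data bus, 100 ns cycle time.
bw = peak_bandwidth_bytes_per_s(16, 100)
print(f"{bw / 1e6:.1f} MB/s")  # 20.0 MB/s
```

Note that the cycle time, not the access time, sets this limit, because a new access cannot begin until the device has recovered from the previous one.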

Accessing DRAMs

DRAM is very different from SRAM in that its row and column addresses are time multiplexed. This is done to reduce the number of pins on the chip package. Because of time multiplexing, there are two address strobe lines for the DRAM address, RAS and CAS. There are many ways to access the DRAM. We list the five most common ways.

Normal Read/Write

When reading, a row address is given first, followed by the row address strobe signal RAS. RAS is used to latch the row address on chip. After RAS, a column address is given, followed by the column address strobe CAS. After a certain delay (the read access time), valid data appear on the data lines. A memory write is done similarly to a memory read, with only the read/write control signal reversed. There are three cycles available for writing a DRAM: the early write, read-modify-write, and late write cycles. Figure 8.21 shows only the early write cycle of a DRAM chip. The other write cycles can be found in most DRAM databooks.
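The multiplexed read sequence described above can be sketched as follows. This is a simplified model under stated assumptions: a 20-bit address split evenly into 10 row bits and 10 column bits, with the upper half used as the row address (the choice of which half is the row is a design decision, not something the text fixes). The function names are hypothetical.

```python
def split_address(address, row_bits=10, col_bits=10):
    """Split a full address into (row, column) for multiplexed address pins.

    Assumption: the high-order bits form the row address.
    """
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits
    col = address & ((1 << col_bits) - 1)
    return row, col


def normal_read_steps(address, row_bits=10, col_bits=10):
    """List the bus events of one normal read cycle, in order."""
    row, col = split_address(address, row_bits, col_bits)
    return [
        ("drive address pins", row),   # row address first
        ("assert RAS", None),          # latches the row address on chip
        ("drive address pins", col),   # same pins now carry the column
        ("assert CAS", None),          # latches the column address
        ("wait access time, sample data", None),
        ("release RAS and CAS", None),
    ]


print(split_address(0x12345))  # (72, 837)
```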

Fast Page Mode (FPM) or Page Mode

In page mode (or fast page mode), a read is done by lowering RAS when the row address is ready. The column address and CAS are then supplied repeatedly, whenever a new column address is ready, without cycling the RAS line. In this way a whole row of the two-dimensional array (matrix) can be accessed with only one RAS and the same row address. Figure 8.22 illustrates the read timing cycle of a page mode DRAM chip.
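A minimal sketch of the page mode sequence, together with a count of the strobes it saves relative to normal reads, is given below. The event labels and function names are hypothetical, and timing details such as precharge are deliberately left out.

```python
def page_mode_read_steps(row, columns):
    """Bus events for reading several columns of one row in page mode."""
    steps = [("row address", row), ("RAS low", None)]
    for col in columns:
        # RAS stays low; only the column address and CAS are cycled.
        steps.append(("column address", col))
        steps.append(("CAS low", None))
        steps.append(("read data", None))
        steps.append(("CAS high", None))
    steps.append(("RAS high", None))
    return steps


def strobe_counts(num_reads):
    """Strobes needed to read `num_reads` locations that share one row."""
    return {
        "normal": {"RAS": num_reads, "CAS": num_reads},
        "page": {"RAS": 1, "CAS": num_reads},
    }


print(strobe_counts(8)["page"])  # {'RAS': 1, 'CAS': 8}
```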

Static Column

Static column is almost the same as page mode, except that the CAS signal is not cycled when a new column address is given; hence the name static column.

Extended Data Output (EDO) Mode

In page mode, CAS must stay low until valid data reach the output. Once the CAS assertion is removed, the data are disabled and the output pins go to open circuit. With EDO DRAM, an extra latch following the sense amplifier allows the CAS line to return to high much sooner, permitting the memory to start precharging earlier in preparation for the next access. Moreover, the data are not disabled after CAS goes high. With burst EDO DRAM, not only does the CAS line return to high, it can also be toggled to step through the sequence in burst counter mode, providing even faster data transfer between the memory and the host. Figure 8.23 shows a read cycle of an EDO page mode DRAM chip. EDO mode is also called hyper page mode (HPM) by some manufacturers.

Nibble Mode

In nibble mode, after one CAS with a given column address, three more accesses are performed automatically without another column address being given (the address is assumed to be incremented from the given address).

Refreshing the DRAM

Row Address Strobe- (RAS-) Only Refresh

This type of refresh is done row by row. As a row is selected by providing the row address and strobing RAS, all memory cells in the row are refreshed in parallel. It takes as many cycles as there are rows in the memory to refresh the entire device. For example, a 1M × 1 DRAM built with 1024 rows and 1024 columns takes 1024 cycles to refresh. To reduce the number of refresh cycles, memory arrays are sometimes arranged to have fewer rows and more columns. The address, however, is still multiplexed as two evenly divided words (in the case of a 1M × 1 DRAM the address word width is 10 bits each for rows and columns). The higher order bits of the address lines are used internally as column address lines, and they are ignored during the refresh cycle. No CAS signal is necessary to perform the RAS-only refresh. Since the DRAM output buffer is enabled only when CAS is asserted, the data bus is not affected during the RAS-only refresh cycles.
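The cycle counts in this paragraph follow directly from the array shape, as the short calculation below shows. The 512-row by 2048-column arrangement is an assumed example of an array with fewer rows and more columns; the text does not give specific dimensions for that case, and the function name is hypothetical.

```python
def ras_only_refresh_cycles(total_cells, columns_per_row):
    """One RAS-only cycle refreshes a whole row, so cycles equal rows."""
    return total_cells // columns_per_row


ONE_MEG = 1 << 20  # cells in a 1M x 1 DRAM

print(ras_only_refresh_cycles(ONE_MEG, 1024))  # 1024 (square 1024 x 1024 array)
print(ras_only_refresh_cycles(ONE_MEG, 2048))  # 512 (assumed 512 x 2048 array)
```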

Hidden Refresh

During a normal read cycle, RAS and CAS are strobed after the respective row and column addresses are supplied. Instead of restoring the CAS signal to high after the read, RAS may be asserted several more times, each time with a corresponding refresh row address. This style of refresh is called hidden refresh. Again, since CAS is strobed and not restored, the output data are not affected by the refresh cycles. The number of refresh cycles that can be performed is limited by the maximum time for which the CAS signal may be held asserted.

Column Address Strobe (CAS) Before RAS Refresh (Self-Refresh)
To simplify and speed up the refresh process, an on-chip refresh counter may be used to generate the refresh address for the array. In such a case, a separate control pin would be needed to signal the DRAM to initiate the refresh cycles. However, since in normal operation RAS is always asserted before CAS for reads and writes, the opposite condition can be used to signal the start of a refresh cycle. Thus, in modern self-refresh DRAMs, asserting the control signal CAS before RAS signals the start of refresh cycles. We call this CAS-before-RAS refresh, and it is the most commonly used refresh mode in 1-Mb DRAMs. One discrepancy needs to be noted. In this refresh cycle the WE pin is a don’t care for the 1-Mb chips. The 4-Mb chip, however, specifies the CAS-before-RAS refresh mode with the WE pin held at a high voltage. A CAS-before-RAS cycle with WE low puts the 4-Mb chip into the JEDEC-specified test mode (WCBR). In contrast, the 1-Mb test mode is entered by applying a high to the test pin.
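The way the strobe order selects between a normal access and a refresh can be sketched as a small state model. This is an illustration only: the class `DramRefreshControl` and its methods are hypothetical, and the WE pin and test mode discussed above are ignored.

```python
class DramRefreshControl:
    """Toy model of CAS-before-RAS detection with an on-chip refresh counter."""

    def __init__(self, num_rows):
        self.num_rows = num_rows
        self.refresh_counter = 0   # on-chip counter supplying refresh rows
        self.cas_low = False       # current state of the CAS pin

    def cas_fall(self):
        self.cas_low = True

    def cas_rise(self):
        self.cas_low = False

    def ras_fall(self, address_pins):
        """Decide what a falling RAS edge means from the state of CAS."""
        if self.cas_low:
            # CAS was asserted first: refresh the row chosen by the counter
            # and ignore whatever is on the address pins.
            row = self.refresh_counter
            self.refresh_counter = (self.refresh_counter + 1) % self.num_rows
            return ("refresh", row)
        # Normal access: RAS arrives first and latches the row address.
        return ("latch_row", address_pins)


ctl = DramRefreshControl(num_rows=4)
print(ctl.ras_fall(3))   # ('latch_row', 3)
ctl.cas_fall()
print(ctl.ras_fall(3))   # ('refresh', 0)
print(ctl.ras_fall(3))   # ('refresh', 1)
```

Because the counter lives on the chip, the memory controller no longer has to generate refresh row addresses itself.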

All three of the refresh cycles mentioned can be implemented on the device in two ways. One method is distributed refresh; the second is a wait and burst method. Devices using the first method refresh the rows at a regular rate, using the CBR refresh counter to turn on rows one at a time. In this type of system, the DRAM can be accessed whenever it is not being refreshed, and the access can begin as soon as the self-refresh is done. The first CBR pulse should occur within the time of the external refresh rate prior to active use of the DRAM to ensure maximum data integrity, and must be executed within three external refresh rate periods. Since CBR refresh is commonly implemented as the standard refresh, this ability to access the DRAM right after exiting the self-refresh is a desirable advantage over the second method. The second method uses an internal burst refresh scheme. Instead of turning on rows at regular intervals, a sensing circuit detects the voltage of the storage cells to see if they need to be refreshed. The refresh is done with a series of refresh cycles, one after another, until all rows are completed. During the refresh no other access to the DRAM is allowed.

5. Error Detection and Correction

Most DRAMs require a read parity bit for two reasons. First, alpha particle strikes disturb cells by ionizing radiation, resulting in lost data. Second, when a DRAM is read, the cell’s storage mechanism capacitively shares its charge with the bit-line through an enable (select) transistor. This creates a small voltage differential to be sensed during the read access. This small voltage difference can be influenced by the voltages of other nearby bit-lines and by other noise. For even more reliable memory, an error correction code may be used. One error correction method is the Hamming code, which is capable of correcting any 1-bit error.
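As an illustration of single-bit correction, the sketch below implements the small Hamming(7,4) code, which protects 4 data bits with 3 check bits. The word size is chosen for brevity and is an assumption of this example; memory systems typically apply the same idea to wider words. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode [d1, d2, d3, d4] into a 7-bit codeword (positions 1..7).

    Check bits sit at positions 1, 2, and 4; data bits at 3, 5, 6, and 7.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # even parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # even parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # even parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome is the flipped position."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1   # flip the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], syndrome


word = hamming74_encode([1, 0, 1, 1])
print(word)                      # [0, 1, 1, 0, 0, 1, 1]
word[4] ^= 1                     # simulate a single-bit upset at position 5
print(hamming74_correct(word))   # ([1, 0, 1, 1], 5)
```

This code corrects any single-bit error in the 7-bit word; two simultaneous errors would be miscorrected, which is why an extra overall parity bit is often added when double-error detection is needed.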

Defining Terms

Dynamic random access memory (DRAM): This memory is dynamic because it needs to be refreshed periodically. It is random access because it can be read and written.

Memory access time: The time between a valid address supplied to a memory device and data becoming ready at output of the device.

Memory cycle time: The time between subsequent address issues to a memory device.

Memory hierarchy: The organization of memory in levels to make the speed of memory comparable to that of the processor.

Memory read: The process of retrieving information from memory.

Memory write: The process of storing information into memory.

ROM: Acronym for read-only memory.

Static random-access memory (SRAM): This memory is static because it does not need to be refreshed. It is random access because it can be read and written.

References

Alexandridis, N. 1993. Design of Microprocessor-Based Systems. Prentice-Hall, Englewood Cliffs, NJ.

Chang, S.S.L. 1980. Multiple-read single-write memory and its applications. IEEE Trans. Comp. C-29(8).

Chou, N.J. et al. 1972. Effects of insulator thickness fluctuations on MNOS charge storage characteristics. IEEE Trans. Elec. Dev. ED-19:198.

Denning, P.J. 1968. The working set model for program behavior. CACM 11(5).

Flannigan, S. and Chappell, B. 1986. J. Solid St. Cir.

Fukuma, M. et al. 1993. Memory LSI reliability. Proc. IEEE 81(5). May.

Hamming, R.W. 1950. Error detecting and error correcting codes. Bell Syst. J. 29 (April).

Katz, R.H. et al. 1989. Disk system architectures for high performance computing. Proc. IEEE 77(12).

Lundstrom, K.I. and Svensson, C.M. 1972. Properties of MNOS structures. IEEE Trans. Elec. Dev. ED-19:826.

Masuoka, F. et al. 1984. A new flash EEPROM cell using triple poly-silicon technology. IEEE Tech. Dig. IEDM:464–467.

Micro. Micro DRAM Databook.

Mukherjee, S. et al. 1985. A single transistor EEPROM cell and its implementation in a 512 K CMOS EEPROM. IEEE Tech. Dig. IEDM:616–619.

NEC. n.d. NEC Memory Product Databook.

Pohm, A.V. and Agrawal, O.P. 1983. High-Speed Memory Systems. Reston Pub., Reston, VA.

Prince, B. and Due-Gundersen, G. 1983. Semiconductor Memories. Wiley, New York.

Ross, E.C. and Wallmark, J.T. 1969. Theory of the switching behavior of MIS memory transistors. RCA Rev. 30:366.

Samachisa, G. et al. 1987. A 128 K flash EEPROM using double poly-silicon technology. IEEE International Solid State Circuits Conference, 76–77.

Sayers et al. 1991. Principles of Microprocessors. CRC Press, Boca Raton, FL.

Scheibe, A. and Schulte, H. 1977. Technology of a new n-channel one-transistor EAROM cell called SIMOS. IEEE Trans. Elec. Dev. ED-24(5).

Aritome, S. et al. 1993. Reliability issues of flash memory cells. Proc. IEEE 81(5).

Shoji, M. 1989. CMOS Digital Circuit Technology. Prentice-Hall, Englewood Cliffs, NJ.

Slater, M. 1989. Design of Microprocessor-based Systems.

Further Information

More information on basic issues concerning memory organization and memory hierarchy can be found in Pohm and Agrawal (1983). Prince and Due-Gundersen (1983) provides a good background on the different memory devices. Newer memory technology can be found in memory device databooks such as Mukherjee et al. (1985) and the NEC databook. The IEEE Journal of Solid-State Circuits publishes an annual special issue on the International Solid-State Circuits Conference. This conference reports the current state-of-the-art developments in most memory devices such as DRAM, SRAM, EEPROM, and flash ROM. Issues related to memory technology can be found in the IEEE Transactions on Electron Devices. Both journals have an annual index, which is published at the end of each year (December issue).