The memory mix challenge
The mix of memory products (DRAM, SRAM and Flash), combined with a mix of test methodologies (I/O compression, FPC, RPC and BIST), presents a complex test profile. To meet the challenges facing memory manufacturers today, testers at wafer sort require a broad range of capabilities, writes Sam Wong, senior applications engineer at Agilent Technologies.
The manufacturing test floor hums with activity - a range of memory devices is being tested on a variety of testers. Different combinations of DRAM, SRAM, Flash, embedded and stacked memories may all require test on a single manufacturing floor. More than ever, memory manufacturers are facing product mix issues between various technologies.
In the past, a test floor was populated by a single type of memory device, such as DRAM or Flash, and a tester was chosen with this device in mind. But now, the memory market is in transition. No longer are manufacturers dedicated to one memory type, and no longer do they have the luxury of using focused test platforms that are tailored to a specific target device. To remain competitive, they have no alternative but to diversify their product mix.
Memory manufacturers face another challenge - variety within a single product type. As device densities and pin counts have increased, manufacturers have created alternate test methodologies that enable a device to be tested with a reduced set of pins at different steps of the test process. However, since these methodologies are new and must be considered in the design of the product, they may not be uniformly applied throughout the product family. Newer devices may be designed with the test methodologies while older ones may not. Additionally, different test methodologies may be used at different times during the device's lifecycle. The result is a mix of high and low pin-count devices, both of which need to be tested cost-effectively, thereby adding to the complexity of the product mix challenge. Wafer sort poses particular problems since the test needs of memory products are more diverse there than at final test.
Consequently, on the test floor of today and of the future, the question is how to accommodate at wafer sort not only different memory types but also different test methodologies. The answer is not multiple test platforms, one dedicated to each device type, because that introduces too much complexity into many of the manufacturing processes: operator training, maintenance and support, test development, and product support. Also, a tester dedicated to a specific product may have a short life if that product itself has a short lifetime. Instead, a flexible system that can provide high throughput and parallelism, while dynamically changing to accommodate different pin and parallelism requirements, is one way of meeting the memory mix test challenges.
Market
The semiconductor memory market has experienced great fluctuation over the past several years. The PC market, once the primary driver for memory devices, is being augmented by communications, consumer and computing applications which are expanding from stationary to wireless, mobile use. Flash memory, well suited for mobile devices, is experiencing an explosive period of growth (Figure 1).
Recognising the growth of the non-volatile memory market, DRAM manufacturers are crossing over into Flash memory. In the past, the different memory manufacturers were clearly defined by memory type: Intel was the leader in Flash and Samsung the leader in DRAM. Now Samsung, as well as Micron, Hynix and Infineon, all DRAM suppliers, are expanding into Flash memory, or rapidly expanding their existing Flash memory lines. Pure NOR Flash suppliers, such as STMicro and Spansion (an AMD – Fujitsu joint venture), are adding NAND Flash to their product mix. These crossover manufacturers, as well as sub-contract manufacturers, need a test system that can accommodate a varied mix.
Requirements
At wafer sort, DRAM devices typically have fewer than 60 pins and are tested at frequencies up to 200 MHz. For DRAM, a stimulus is applied to the memory cell, the memory cell is read and compared to expected data, and fail flags are set accordingly. DRAM test requires an algorithmic pattern generator (APG) rich with features to enable the writing of the complex patterns typically found in DRAM test. Because DRAM is a commodity product that is vulnerable to market fluctuations, minimising cost of test is critical. One way DRAM manufacturers try to lower the cost of test is through high parallelism. Another way is to reduce the number of pins required for test by using compression techniques.
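The write, read and compare sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy array model, the checkerboard pattern and the names `checkerboard`, `run_pattern_test` and `stuck_at` are invented for the example and do not represent any tester's API.

```python
def checkerboard(rows, cols, invert=False):
    """Expected data: alternating 0/1 cells (an assumed example pattern)."""
    return [[(r + c + int(invert)) % 2 for c in range(cols)] for r in range(rows)]


def run_pattern_test(rows, cols, stuck_at=None):
    """Write a pattern, read it back, and flag every mismatching cell.

    stuck_at is a hypothetical fault model: {(row, col): forced_value}.
    """
    stuck_at = stuck_at or {}
    fails = set()
    for invert in (False, True):  # exercise both polarities of every cell
        expected = checkerboard(rows, cols, invert)
        # "Write": the array stores the pattern, except where a cell is stuck.
        array = [[stuck_at.get((r, c), expected[r][c]) for c in range(cols)]
                 for r in range(rows)]
        # "Read and compare": set a fail flag for each mismatch.
        for r in range(rows):
            for c in range(cols):
                if array[r][c] != expected[r][c]:
                    fails.add((r, c))
    return fails


fails = run_pattern_test(4, 4, stuck_at={(1, 2): 0})
print(sorted(fails))
```

Because the pattern is generated from the row and column address rather than stored vector by vector, the same few lines scale to any array size, which is the role an APG plays in hardware.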
SRAM devices target different applications than DRAM and performance requirements may also differ. Despite these differences, many of the same test techniques used for DRAM are also applicable to SRAM.
Flash testing differs from DRAM testing in significant ways. For Flash test, the exact value of the stimulus to apply to the memory cell is not known in advance and must be determined through an iterative process. It is uncertain how long a test step will take to complete once it has begun. Thus the test time is not the same for each cell, unlike for DRAM. Flash, particularly NOR Flash, requires an APG that provides independent sequencing on a per device under test (DUT) basis. Like DRAM manufacturers, Flash manufacturers try to reduce the cost of test through high parallelism and different test methodologies at wafer sort.
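The iterative nature of Flash test can be illustrated with a simple program-and-verify loop. The sketch below is a simplified model, not a real programming algorithm: the function `program_cell`, the pulse limit and the randomly chosen per-DUT pulse counts are all assumptions made for the example.

```python
import random


def program_cell(pulses_needed, max_pulses=20):
    """Apply program pulses until the cell verifies.

    Returns the number of pulses used, or None if the cell never verifies
    within max_pulses. pulses_needed stands in for the cell's unknown
    physical behaviour (a hypothetical model).
    """
    for pulse in range(1, max_pulses + 1):
        # A verify read follows each pulse; stop as soon as it passes.
        if pulse >= pulses_needed:
            return pulse
    return None


rng = random.Random(0)  # fixed seed so the example is repeatable
needed = [rng.randint(3, 12) for _ in range(8)]  # one value per DUT
used = [program_cell(n) for n in needed]

for dut, pulses in enumerate(used):
    print(f"DUT {dut}: verified after {pulses} pulses")
```

Each DUT leaves the loop at a different iteration, which is why a single sequencer shared by all DUTs cannot keep every site busy.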
Emerging methodologies at wafer sort
As the cost of test continues to rise, different test methodologies that reduce the number of pins required to test a device are being implemented. These methodologies share the benefits of smaller pin counts, reduced tester resources, and higher throughput but with an impact on device design.
Some DRAM manufacturers use I/O compression techniques to reduce I/O pins from a typical 8 or 16 channels down to 4. The advantage of the compression scheme is that fewer tester channels are needed. The disadvantage is that a longer, more complicated test program is required.
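One way to picture I/O compression is as a merging of per-bit compare results into fewer external pass/fail signals. The sketch below assumes a scheme in which every four internal data lines are compared against expected data and reported on one external pin; the grouping, the function `compress_compare` and the injected failure are illustrative assumptions, and real devices may implement compression differently.

```python
def compress_compare(read_bits, expected_bits, group_size=4):
    """Merge per-bit compare results into one pass flag per external pin."""
    if len(read_bits) != len(expected_bits):
        raise ValueError("read and expected data must be the same width")
    flags = []
    for start in range(0, len(read_bits), group_size):
        end = start + group_size
        # The group passes only if every bit in it matches expected data.
        flags.append(read_bits[start:end] == expected_bits[start:end])
    return flags


expected = [1, 0, 1, 0] * 4  # 16 internal data lines
read = list(expected)
read[9] = 1                  # hypothetical single-bit failure on line 9
flags = compress_compare(read, expected)
print(flags)                 # one flag per external pin: 16 lines on 4 pins
```

In this simplified model the tester observes four signals instead of sixteen, and sees which group failed rather than which individual line.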
Flash devices, both NOR and NAND, use full pin count (FPC) and reduced pin count (RPC) test methodologies. With an FPC strategy, all device pins (generally fewer than 52) are accessible, providing maximum flexibility for test generation. Bitmapping and many redundancy schemes are available, with redundancy analysis being done by the automated test equipment (ATE). The full pin count test methodology provides the most extensive, flexible coverage since all user mode functions and data sheet parameters can be tested. However, this methodology has the highest capital cost. This test strategy is required for known good die (KGD) and final test.
RPC test methodologies are very similar to FPC, but applied to fewer pins. The DUT interface is reduced to about 16 pins or fewer, depending on the bus width of the multiplexed address and data pins. Some built-in self-test (BIST) capability may be needed to handle algorithmic operations such as automatic address and data generation. RPC results in higher throughput, but some tests, such as at-speed tests, cannot be covered 100%.
BIST test methodologies are more ‘minimised’ than the previous techniques and can be applied to DRAM, SRAM or Flash. BIST is more appropriate once a part is mature, since access to the part can be very limited. The DUT access is through a serial interface and requires 8 or fewer pins, providing complete control by the BIST controller, including all program/erase algorithms. The BIST controller executes all trimming and all redundancy analysis with a simple column or block redundancy procedure. The ATE is only required to perform DC tests, provide a voltage/current reference, sequence through BIST functions and process the pass/fail information. The obvious advantage of BIST is a reduced need for high-performance ATE, but this comes at the cost of a die size penalty.
At this point, there is no one ‘right’ test methodology. Since some of these techniques are emerging, time is required to develop and deploy each methodology. Even once the methodologies are mature, they will correspond to different times in a device's lifecycle. Full pin count methodologies will be used on first silicon and for characterisation, whereas reduced schemes, such as I/O compression, RPC or BIST, will be used once the part is mature. Because of the unique test requirements at different points along the device lifecycle, there will always be a mixture of test methodologies on the test floor.
The variety of test methodologies (I/O compression, FPC, RPC and BIST), combined with the mixture of memory devices, is creating a complex test profile. In this environment, an ideal test system should be able to switch from one memory technology to another, while maintaining high throughput, high parallelism and the ability to adjust to different pin counts and parallelisms.
High throughput
In a memory mix environment, it is important to have an ATE architecture that can maximise throughput for all the memory types. Two typical architectures used to test memory devices are shared-resource and tester-per-site.
With shared-resource architectures, it is necessary for each DUT to wait for serial resources when needed. Contrast that to a tester-per-site architecture in which each device under test receives its own dedicated set of test resources: its own APG, buffer memory, error catch RAM, vector memory and power supplies. No critical resources are shared. This distinction is important for serial tests that require unique data for each DUT, such as electrical repair and parametric tests, as well as Flash programming.
The architecture type has a direct influence on test time. With the shared-resource architecture, the serial tests on one site have to be completed before the tester can move on to the next site, since there is only one set of resources available for the whole system. As a result, the non-operating DUTs remain idle until the resources are available for them to continue. In this case, the total serial test time is the sum of all the DUTs' individual test times.
With a tester-per-site architecture, each DUT can be tested independently in parallel since there is one complete set of test resources behind each site. The DUTs do not have to wait for resources and the total test time is only as long as the slowest DUT (Figure 2). The tester-per-site architecture results in a much higher throughput for devices that are asynchronous in nature such as Flash memory.
DRAM and SRAM are typically tested with shared-resource ATE architectures. This stems from the fact that the devices have a predictable, deterministic behaviour and sequencer control can be the same for all DUTs. Sharing the controller, buffer memory, and APG resources reduces tester cost.
On the other hand, testing NOR Flash requires more than just the functionality of DRAM and SRAM test. In particular, NOR Flash is difficult to test without allowing for independent sequencing within the APG, on a per-DUT basis. Without this per-DUT sequencing, which requires an independent APG per DUT, test times increase by 15-25% for each additional DUT tested (with the same resources) due to serialisation of shared tester resources. This additional test time required for each additional DUT is known as test time overhead (TTO). For this reason, NOR Flash has historically been tested on an ATE tester-per-site architecture. This architecture also provides a controller per DUT so that the test programs, and not just the pattern-generated code, can run independently. Independent computational power per DUT allows the test flow and test patterns to be modified dynamically per DUT as pass/fail information and array behaviour, such as program or erase time, are accumulated. Program development is also greatly simplified since no attention needs to be paid to managing multiple DUTs.
NAND Flash has the same basic test requirements as NOR Flash, DRAM, and SRAM. However, in the case of NAND Flash, the choice between tester-per-site and shared-resource architectures is not so clear. The decision depends on the test methodology, test modes and other business factors. Issues such as product mix (NAND/NOR/DRAM, NAND/DRAM or NAND/NOR) often influence the decision.
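To see what a 15-25% test time overhead means in practice, the sketch below assumes the overhead accumulates linearly with each added DUT, which is one plausible reading of the figures quoted above rather than a stated formula. The function name and the 100-second single-DUT test time are assumptions.

```python
def shared_test_time(single_dut_time, n_duts, tto):
    """Test time for n_duts sharing one APG, assuming linear overhead.

    tto is the fractional test time overhead per additional DUT
    (0.15 to 0.25 for the range quoted in the text).
    """
    return single_dut_time * (1 + tto * (n_duts - 1))


single = 100.0  # assumed single-DUT test time in seconds
for tto in (0.15, 0.25):
    total = shared_test_time(single, 4, tto)
    print(f"TTO {tto:.0%}: 4 DUTs take {total:.0f} s, "
          f"or {total / 4:.2f} s per DUT")

# With an independent APG per DUT the four DUTs finish together in about
# the single-DUT time, so the per-DUT figure is single / 4.
print(f"tester-per-site: {single / 4:.2f} s per DUT")
```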
Consequently, since the tester-per-site architecture results in a much higher throughput for asynchronous devices, such as NOR Flash and certain NAND Flash memory, it is necessary to have this architecture when testing a mix of memory products that includes Flash. The most flexible tester-per-site architectures offer greater than 128 sites per system with at least 32 I/O channels per site. This translates into the ability to test more than 256 DUTs with more than 4000 I/O pins per test system.
Parallelism
Different memory types require high parallelism which, in turn, is being driven by expanded die size and large-area wafer probing. At wafer sort, dual test heads with high pin counts achieve high parallelism for DRAM, up to x128. For Flash memories, a tester-per-site architecture paired with high pin counts is necessary to provide similar parallelism rates. However, as tester pin counts increase, significant connection-count problems arise between the ATE pin electronics and the probe cards or load boards. This is an industry-wide problem that manufacturers are facing.
Part of the problem is that so many pins are needed in the ATE interface. Even though a user may only access I/O channels, drive channels, and utility channels, the tester interface needs more than twice as many pins, since each signal pin is often accompanied by one or two grounds to maintain the required 50 ohm impedance. For example, a highly parallel system of over 128 sites provides more than 4000 channels to the user. However, the DUT interface for that same system is composed of more than 11,000 pins.
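The pin arithmetic can be checked quickly. The sketch below takes 128 sites with 32 I/O channels each as a baseline and brackets the interface size with one and two grounds per signal; it is a rough estimate, since the real mix of drive, utility and ground pins is not broken down here, and the function name is hypothetical.

```python
def interface_pins(signal_channels, grounds_per_signal):
    """Total interface pins when each signal carries its own grounds."""
    return signal_channels * (1 + grounds_per_signal)


signal_channels = 128 * 32  # sites x I/O channels per site
low = interface_pins(signal_channels, 1)   # one ground per signal
high = interface_pins(signal_channels, 2)  # two grounds per signal

print(f"user channels: {signal_channels}")
print(f"interface pins: between {low} and {high}")
```

The quoted figure of more than 11,000 pins falls inside this bracket, which is consistent with most signals being accompanied by more than one ground.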
With the traditional pogo pin interface, the force required to seat an 11,000-pin interface to a probe card would be around 700 pounds. This high seating force would result in planarity and consistency problems, casting doubt on the reliability of the pogo pin solution. As the tester interface exceeds 11,000 pins, it is imperative that a reliable DUT interface be provided.
A DUT interface that replaces pogo pins with mass-terminated, high bandwidth connectors is more reliable and more repeatable (Figure 3). This connection scheme can handle the 11,000-pin interface with a much lower probe card seating force: 60 pounds instead of 700. Because the connectors have a much smaller pitch than pogo pins, this scheme can support more pins while maintaining a similar or increased probe card keep-out area. Also, this interface, with its machined components and mechanical self-alignment, provides superior reliability and repeatability.
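Dividing the seating forces by the pin count gives a feel for the difference per contact. The sketch below assumes the force is spread evenly over all 11,000 pins, which is a simplification made only for the sake of the estimate.

```python
PINS = 11_000
POGO_FORCE_LB = 700
CONNECTOR_FORCE_LB = 60
OZ_PER_LB = 16

# Assumes an even distribution of force across every pin.
pogo_oz_per_pin = POGO_FORCE_LB * OZ_PER_LB / PINS
connector_oz_per_pin = CONNECTOR_FORCE_LB * OZ_PER_LB / PINS
reduction = POGO_FORCE_LB / CONNECTOR_FORCE_LB

print(f"pogo pins:  {pogo_oz_per_pin:.2f} oz per pin")
print(f"connectors: {connector_oz_per_pin:.2f} oz per pin")
print(f"seating force reduced by a factor of {reduction:.1f}")
```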
Thus, as the volumes of memory devices continuously increase, higher parallelism is absolutely necessary to reduce the cost of test. Higher parallelism translates into a higher number of tester pins that will stress the traditional pogo pin interface to the limits of its capability. To be able to accommodate high parallelism, an ideal tester should have a different type of interface, such as an interface that uses mass-terminated, high bandwidth connectors that can provide the reliability, repeatability and performance that higher parallelism requires.
Flexible configuration
The combination of memory products (DRAM, SRAM and Flash) and their associated test methodologies (I/O compression, FPC, RPC or BIST) results in a variety of different parallelisms and pin counts. Wafer sort parallelisms can range anywhere from x128 to x36 and pin counts from 60 to 4. In this situation, configurability of tester resources is very important. Making every pin an I/O pin, rather than splitting pins into dedicated drive-only and I/O pins, helps accommodate a wider variety of device types. In addition, when using a tester-per-site architecture, the ability to adjust the number of pins associated with each APG helps optimise parallelism without wasting tester channels.
Agilent Technologies has developed a flexible APG that can dynamically adjust to different parallelism needs. With an APG connected to each DUT, the test system can handle 144 DUTs having 32 pins each, 72 DUTs having 64 pins each, or 36 DUTs having 128 pins each. When the APG resources are shared among two or more DUTs, parallelism can be increased by a factor of two or more.
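The three configurations quoted above all use the same pool of tester channels, divided differently. The sketch below models only that division, with one APG per DUT; the function name is hypothetical and the model says nothing about how the real system allocates pins when APGs are shared.

```python
TOTAL_CHANNELS = 144 * 32  # channel pool implied by the first configuration


def duts_in_parallel(pins_per_dut, total_channels=TOTAL_CHANNELS):
    """Number of DUTs that fit when each DUT gets pins_per_dut channels."""
    return total_channels // pins_per_dut


configs = {pins: duts_in_parallel(pins) for pins in (32, 64, 128)}
for pins, duts in configs.items():
    print(f"{pins:>3} pins per DUT -> {duts} DUTs, "
          f"{pins * duts} channels in use")
```

Every configuration uses the full pool, which is the point of an adjustable pins-per-APG ratio: parallelism changes with the device, but no channels sit idle.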
Since memory devices present such a varied combination of pin counts and parallelisms, it is imperative that a tester have the ability to adapt to this variation. When NOR Flash and certain NAND Flash memory are part of the product mix, a flexible configuration, designed into a tester-per-site architecture, that enables the APG to adjust to different DUT pin requirements is essential for optimising parallelism.
Fig.1: Memory units shipped. Source: Semico Research (January 2004)
Fig.2: Throughput comparison
Fig.3: Mass-terminated high bandwidth connectors. Courtesy of Agilent Technologies (June 2004)