Achieving Rapid Yield Improvement
Richard Kittler, Michael McIntyre, Christopher A. Bode, Thomas J. Sonderman, Steve Reeves and Steve Zika, Advanced Micro Devices, Austin, Texas -- Semiconductor International, 7/1/2004
The demands on semiconductor manufacturers have never been greater than they are today. Companies are continually challenged to meet market demand by delivering increasingly sophisticated products in high volume, on time and more cost-effectively than ever before. Customers' bottom-line needs often require manufacturers to adopt entirely new approaches to yield management that not only deliver higher yields but also achieve maturity levels more quickly than was attainable even a few years ago.
This new yield management, in turn, depends on new approaches to engineering analysis that are both broader and deeper than before to help semiconductor companies make well-informed decisions to more rapidly achieve mature yields. Automated manufacturing technology puts the fruits of that analysis to work by enabling continuous yield learning that not only communicates changes to the manufacturing line, but also enables a real-time determination of whether those changes are having the desired effect. This makes it possible to fix root causes universally from that point forward.
Far from being only drawing board concepts, these new solutions are in use now by forward-thinking manufacturers. This article describes the industry-leading engineering analysis and Automated Precision Manufacturing (APM) solution deployed at AMD, and the results the solution is achieving for AMD and its customers.
Components of rapid yield learning
Rapid yield learning is the practice of finding and fixing yield-limiting mechanisms in the manufacturing process to realize the yield potential of the product and process for a given toolset. This activity is driven by the need to achieve certain volumes and drive down the cost per part to maximize profitability. Although the yield learning curves for a process are usually sketched as smooth, they are in fact a series of step functions overlaid with noise and yield excursions (Fig. 1).
The step functions upward represent intentional process improvements, while the downswings in the curve represent excursions caused by unintended tool or process failures. The overall rate of ascent is determined by the rate at which excursions can be responded to and fixed, as well as the efficiency of offline process improvement efforts. Each of these functions requires expert engineers with the support of specialized forms of data and analyses to achieve rapid progress.
There are two types of yield learning activity: One occurs when a new product is introduced on an existing process; the other occurs when both the product and the process are new, such as the transition to a new technology node. When only the product is new, ramping its yield involves finding and fixing design sensitivities, since the process is usually fixed by the momentum of other products already in production. When both the product and process are new, the situation is complicated by the need to measure and understand the combined effects of design sensitivities to process, process targeting, and tool-related process capability issues. By breaking yield learning down into its different components, both the expertise and associated software systems can be tailored to efficiently support each requirement of the yield learning process.
We describe how each of the capabilities needed to support rapid yield learning has evolved in parallel with manufacturing capabilities, from stability to controllability and, eventually, predictability.
Establishing stability
During the period from the late 1980s through the mid-1990s, IC manufacturing was converting from factory monitoring via paper-based trend charting to more rigorous adoption and use of statistical process control (SPC). The SPC charts began with paper, but moved rapidly to electronic forms and linkage with the online work-in-process (WIP) tracking system to provide alerts and, optionally, toggle tool-down states when severe excursions were detected. SPC rule violations triggered manual intervention by technicians and engineers, who followed a troubleshooting process described in an out-of-control action plan to determine the cause of the excursion and implement corrective action. Great attention was paid to ensuring that the charts associated with critical process steps were maintained in control and that process capability indices met specific objectives.
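To make the SPC concepts above concrete, the following is a minimal Python sketch of a control chart check and a process capability index. It assumes the simplest rule (a point outside mean ± 3 sigma limits) and uses hypothetical critical-dimension readings; the function names, spec limits and data are illustrative assumptions, not a description of the systems used at AMD.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Return (center, lcl, ucl) as mean +/- 3 sigma of in-control baseline data."""
    center = mean(baseline)
    sigma = stdev(baseline)
    return center, center - 3 * sigma, center + 3 * sigma

def cpk(values, lsl, usl):
    """Capability index: distance from the mean to the nearest spec limit, in units of 3 sigma."""
    mu, sigma = mean(values), stdev(values)
    return min(usl - mu, mu - lsl) / (3 * sigma)

def out_of_control(points, lcl, ucl):
    """Indices of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]

# Hypothetical critical-dimension readings in nm (assumed data).
baseline = [100.2, 99.8, 100.1, 99.9, 100.0, 100.3, 99.7, 100.0]
center, lcl, ucl = control_limits(baseline)

new_points = [100.1, 99.9, 101.5, 100.0]
violations = out_of_control(new_points, lcl, ucl)  # third reading exceeds the upper limit
capability = cpk(baseline, lsl=99.0, usl=101.0)    # spec limits are assumed values
```

In a production setting a violation such as the third reading here would be the trigger for the out-of-control action plan; real SPC systems typically apply additional run rules beyond this single test.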
In parallel with the development of these online systems, initial versions of offline yield management applications were developed to improve the rate of response to yield excursions and find systematic sources of yield loss in the manufacturing stage. These systems introduced such capabilities as automated commonality analyses, rogue tool analyses and wafer positional analyses (WPA). Data to support the analyses were provided through a family of databases fed by extracts of inline metrology data from the WIP tracking system and datalogs captured from e-test and wafer sort. An integrated data access menu provided transparent access to the data across databases and multiple plant locations. Batch routines ran continuously to comb through tool history, wafer positional data, chamber-to-chamber effects, etc., to look for systematic sources of variation in yield.
Even defect excursions, when systematic in nature [1], could be successfully traced to root cause through such methods. Finding the root cause of each such source of variability provided an opportunity to improve the process and resulted in additional inline metrology to ensure that future excursions were caught as soon as possible. The batch systems also created and maintained shared data sets of critical measurement and tool history information on volume products so ad hoc analyses could be run with minimal overhead and reference the same data used by the batch monitors. This was especially beneficial when certain data-cleansing tasks were required before analysis could proceed.
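The commonality analyses mentioned above can be illustrated with a small Python sketch. It assumes a hypothetical lot history of the form {lot: {step: tool}} and a set of lots already flagged as low yielding, and it scores each step/tool pair by how much more often bad lots passed through it than good lots. The scoring rule, names and data are assumptions chosen for illustration; production systems would normally add a statistical significance test.

```python
from collections import defaultdict

def commonality(lot_history, bad_lots):
    """Rank (step, tool) pairs by the excess share of bad lots that ran on them.

    lot_history: {lot_id: {step: tool}}
    bad_lots:    set of lot ids flagged as low yielding
    Returns [(score, step, tool), ...] with the most suspicious pair first.
    """
    counts = defaultdict(lambda: [0, 0])  # (step, tool) -> [bad count, good count]
    n_bad = n_good = 0
    for lot, steps in lot_history.items():
        is_bad = lot in bad_lots
        n_bad += is_bad
        n_good += not is_bad
        for step, tool in steps.items():
            counts[(step, tool)][0 if is_bad else 1] += 1
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one bad lot and one good lot")
    ranked = [(bad / n_bad - good / n_good, step, tool)
              for (step, tool), (bad, good) in counts.items()]
    return sorted(ranked, reverse=True)

# Hypothetical tool history for six lots at two process steps.
lot_history = {
    "L1": {"etch": "ETCH_A", "litho": "SCAN_1"},
    "L2": {"etch": "ETCH_B", "litho": "SCAN_1"},
    "L3": {"etch": "ETCH_B", "litho": "SCAN_2"},
    "L4": {"etch": "ETCH_A", "litho": "SCAN_2"},
    "L5": {"etch": "ETCH_B", "litho": "SCAN_1"},
    "L6": {"etch": "ETCH_A", "litho": "SCAN_1"},
}
bad_lots = {"L2", "L3", "L5"}

ranked = commonality(lot_history, bad_lots)
top_score, top_step, top_tool = ranked[0]  # every bad lot ran on ETCH_B
```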
Such software systems evolved rapidly to find tool and process-related problems, but were less effective at isolating design- or test-related issues. To trace device issues at final test to either fab or design, a more comprehensive database was needed — one that would add not only coverage of final test data, but also a means to track and use the mapping between fab and final test lot numbers (a.k.a. lot genealogy). The lack of programmatic access to lot traceability data had stifled final test correlation studies on all but the most critical of problems. The effort to add support for these additional data types and unify existing database designs spawned AMD's multiyear SAPPHiRE database project.
Evolving to controllability
In the late 1990s, market demand for device performance and quicker time to market drove the need for more rapid ramp-up for new products and tighter control of the process. Among the most important new capabilities in achieving tighter process control was the move from a passive SPC system to active process tuning in the form of run-to-run control. Through advanced process control (APC), we were able to correct for tool drift, more rapidly rematch tools after maintenance events, and compensate for deviations from target at prior steps. With tighter distributions we could move the process target closer to the spec limits and deliver a higher yield to desired performance bins. Models were developed and used to relate process targets at multiple steps to final device performance and periodically adjust target set points.
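Run-to-run control is often introduced through an exponentially weighted moving average (EWMA) controller, and the Python sketch below follows that textbook form. It assumes a linear process model (output = gain × recipe + offset) with a known gain and an offset that shifts after a maintenance event; the class name, parameters and numbers are hypothetical and are not drawn from AMD's APC implementation.

```python
class EwmaController:
    """Textbook EWMA run-to-run controller for output = gain * recipe + offset."""

    def __init__(self, target, gain, offset_estimate=0.0, weight=0.3):
        self.target = target
        self.gain = gain
        self.offset = offset_estimate  # running estimate of the tool offset
        self.weight = weight           # EWMA weight given to the newest run

    def recipe(self):
        """Recipe setting expected to hit the target given the current offset estimate."""
        return (self.target - self.offset) / self.gain

    def update(self, recipe, measured):
        """Blend the offset implied by the latest measurement into the estimate."""
        observed = measured - self.gain * recipe
        self.offset = self.weight * observed + (1 - self.weight) * self.offset

def simulate(runs=20):
    """Simulated tool whose offset steps from 5.0 to 8.0 at run 10 (assumed values)."""
    ctrl = EwmaController(target=100.0, gain=2.0)
    errors = []
    for run in range(runs):
        true_offset = 5.0 if run < 10 else 8.0
        u = ctrl.recipe()
        y = 2.0 * u + true_offset      # simulated metrology result
        errors.append(y - ctrl.target)
        ctrl.update(u, y)
    return errors

errors = simulate()
```

In this simulation the deviation from target shrinks by a factor of (1 − weight) on each run, jumps when the offset shifts at run 10, and then decays again, which is the rematching behavior described above.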
At critical process steps, the use of APC made it less likely that excursions would occur on the controlled parameters. In addition, it was realized that better control over the health of the tools themselves could assist in catching problems before they manifested in process excursions. These tool health monitors made use of equipment performance tracking and fault detection and classification (FDC) software systems. The FDC systems monitored trace data from onboard tool sensors in an effort to trap and flag deviations from normal processing behavior. Today, both APC and FDC are managed under the umbrella of a coherent strategy for factory automation called automated precision manufacturing (APM) [2].
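One simple way to flag deviations in sensor trace data is to compare each run against an envelope built from known-good runs. The Python sketch below assumes traces that are already time-aligned and of equal length, and it uses a per-sample mean ± k sigma band with a minimum width; the function names, thresholds and pressure values are hypothetical, and commercial FDC systems use considerably richer models.

```python
from statistics import mean, stdev

def build_envelope(good_traces, k=3.0, floor=0.5):
    """Per-sample (low, high) band from known-good traces.

    `floor` is an assumed minimum sigma so that very quiet samples
    do not produce an unrealistically narrow band.
    """
    envelope = []
    for samples in zip(*good_traces):
        mu = mean(samples)
        sigma = max(stdev(samples), floor)
        envelope.append((mu - k * sigma, mu + k * sigma))
    return envelope

def fault_samples(trace, envelope):
    """Indices of samples in `trace` that fall outside the envelope."""
    return [i for i, (x, (lo, hi)) in enumerate(zip(trace, envelope))
            if not lo <= x <= hi]

# Hypothetical chamber-pressure traces, five samples per run.
good = [
    [10.0, 50.0, 50.2, 49.9, 10.1],
    [10.2, 49.8, 50.0, 50.1, 9.9],
    [9.9, 50.1, 49.9, 50.0, 10.0],
]
envelope = build_envelope(good)

suspect = [10.0, 50.0, 46.0, 50.0, 10.0]  # pressure dip in the middle of the step
faults = fault_samples(suspect, envelope)
```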
Yield learning tools were advanced during this time through full implementation of the SAPPHiRE database at manufacturing sites worldwide. Using a single comprehensive scheme replicated across multiple physical servers and unified by a single end-user view, yield and product engineers could now transparently integrate data from front to back in the process. Lot history and genealogy tables enabled joining of data across the lot-numbering schemes used at different manufacturing facilities. Integrated tool event history tables also enabled study of the relationships between tool events and device performance on the wafers being processed.
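The role of a genealogy table in joining data across lot-numbering schemes can be sketched in a few lines of Python. The example assumes a hypothetical child-to-parent mapping that records splits, merges and renumbering between fab, sort, assembly and final test, and it walks that mapping back to the originating fab lots. The lot names and table layout are invented for illustration and do not reflect the SAPPHiRE schema.

```python
def fab_ancestors(lot, parents):
    """Return the set of root lots (those with no recorded parent) reachable from `lot`.

    parents: {child_lot: [parent_lot, ...]}
    """
    roots, seen, stack = set(), set(), [lot]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        upstream = parents.get(current, [])
        if upstream:
            stack.extend(upstream)
        else:
            roots.add(current)  # nothing upstream: treat as an originating fab lot
    return roots

# Hypothetical genealogy: a final test lot built from two assembly lots,
# which in turn draw on two sort lots and two fab lots.
parents = {
    "FT900": ["ASM77", "ASM78"],
    "ASM77": ["SORT12"],
    "ASM78": ["SORT12", "SORT13"],
    "SORT12": ["FAB001"],
    "SORT13": ["FAB002"],
}

sources = fab_ancestors("FT900", parents)
```

Once the originating fab lots are known, their tool history and inline metrology can be joined to the final test results for the same material.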
In addition to SAPPHiRE, database views provided access to lot- and wafer-level defect data from the defect management systems as if it were part of the same integrated environment. Highly tuned analysis engines were now able to mine a larger set of data in response to excursions and root out systematic yield limiters for faster yield ramps. For example, excursions in speed distributions at final test could now be traced to tool or chamber performance problems in the fab through commonality analysis or other methods only now possible with the complete lot histories and genealogy across manufacturing sites that SAPPHiRE provided. SAPPHiRE has become the system of reference for all data associated with yield correlation studies, and continues to be enhanced with new forms of data driven by advancements in yield ramp and excursion control methodologies.
Among the more recent additions to data and analysis capabilities to support rapid yield learning are defect and bitmap integration, web-based trend charting, lot analysis summaries, multi-lot wafer positional analyses, unit-level traceability, local data marts, and integration of reticle and design data. A brief summary of each of these follows.
Defect and bitmap integration
AMD uses enhanced third-party defect management software (DMS) to consolidate defect metrology data by fab for both in-fab monitors and offline correlation studies. Intelligent defect sampling strategies and correlation to bitmap and wafer sort failures enable a form of "virtual strip back" that pinpoints the defects responsible for failures and accelerates finding the root cause of defect excursions that impact yield of both memory and logic parts. In addition, a form of bitmap-like analysis is used to determine the defect sensitivities of scan chains on logic parts (Fig. 2).
Web-based trend charting
Trend charts are now routinely posted to the internal web for critical parameters at e-test, wafer sort and final test. Some of these charts enable drill-down to data on specific lots and wafers within those lots. Having charts posted on the internal web has provided the broadest access possible to the "front line" data used in problem identification. It has also provided access to this information at all levels in the manufacturing area.
Lot analysis summaries
A system called Lot Dossier gathers critical data on each lot as it progresses through fab and test, analyzes the data and posts the results to a web page. This page becomes the reference site for the lot, containing exception reports, inline metrology data summaries, wafer positional analyses, wafer maps, etc.
Multi-lot wafer positional analyses
Although it has been more than 10 years since wafer positional analysis was introduced as Wafer Sleuth by Hewlett-Packard [3] and then disseminated through SEMATECH, new methods for using the data continue to be developed. Among these is how to take advantage of data across multiple lots. Some of the methods AMD has developed include extracting signals associated with a library of patterns for each lot and correlating the strength of those signals to tool history. Such an approach has enabled a transition from chasing one-off lot phenomena to detection of multi-lot systematic issues that can be traced to a root cause, whether in response to an excursion or as part of the background yield improvement effort. Another important aspect is the frequency of wafer-order randomization. Today's complex process flows can involve up to 60 randomizations, which makes it possible to identify more quickly the range of process steps associated with yield signals caused by order of processing and chamber effects.
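The processing-order side of wafer positional analysis can be illustrated with a short Python sketch. It assumes that the wafer order at each step is recorded after randomization, computes the correlation between processing position and wafer yield for each step, and averages that correlation across lots so that a consistent multi-lot signal stands out. The step names, yields and the use of a plain Pearson correlation are assumptions for illustration, not the pattern-library method described above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def order_correlations(step_orders, wafer_yield):
    """Correlation of yield with processing position at each step for one lot."""
    return {step: pearson(range(len(order)), [wafer_yield[w] for w in order])
            for step, order in step_orders.items()}

def multi_lot_average(lots):
    """Average the per-step correlations over several lots."""
    per_lot = [order_correlations(orders, yields) for orders, yields in lots]
    return {step: sum(c[step] for c in per_lot) / len(per_lot) for step in per_lot[0]}

# Two hypothetical lots; yield (%) falls with processing order at the "cvd" step only.
lot_a = ({"cvd": ["W1", "W2", "W4", "W6", "W5", "W3"],
          "etch": ["W3", "W1", "W5", "W2", "W6", "W4"]},
         {"W1": 90, "W2": 88, "W3": 70, "W4": 85, "W5": 75, "W6": 80})
lot_b = ({"cvd": ["W3", "W5", "W1", "W6", "W2", "W4"],
          "etch": ["W1", "W2", "W3", "W4", "W5", "W6"]},
         {"W1": 84, "W2": 76, "W3": 91, "W4": 71, "W5": 87, "W6": 79})

averages = multi_lot_average([lot_a, lot_b])
suspect_step = max(averages, key=lambda s: abs(averages[s]))
```

Because the wafer order differs at every step, only the step whose order actually drives the yield signal shows a strong correlation in both lots.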
Unit-level traceability
Lot-level traceability through SAPPHiRE facilitated fab-to-final-test correlations, but the noise levels on these studies were high because of the variability of units within a lot. Recent processor product designs allow unique die identifiers to be programmed at wafer sort based on a combination of lot number, wafer scribe and X-Y probe location. These identifiers are then reread at all subsequent test operations in the back end. This enables correlation of each unit's final test results to wafer sort and electrical test. The improved signal-to-noise ratio now allows detailed models to be developed that relate final device performance to wafer sort, and hence e-test and inline fab metrology data.
Such models not only help explain excursions, but also allow forward prediction of binning and fine-tuning of wafer sort performance targets based on desired final test bin distributions. With such predictive models, wafers and lots can now be prioritized for assembly and test based on the expected bin distributions from the models. This has led to a reduced cost of delivering the right mix of product and performance to AMD's customers. Additional benefits of unit traceability are realized with the ability to unambiguously trace customer returns to an assembly or fab lot and thereby realize further learning and minimize additional jeopardy.
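A unit identifier built from lot number, wafer and X-Y location can be sketched in Python as follows. The colon-separated string format, the field names and the sample values are assumptions made for this example; the actual encoding programmed into the die is not described in this article. The sketch also shows how such an identifier lets wafer sort and final test results be paired unit by unit.

```python
from typing import NamedTuple

class DieId(NamedTuple):
    lot: str    # fab lot number
    wafer: int  # wafer number taken from the scribe
    x: int      # probe X location
    y: int      # probe Y location

def encode(die):
    """Serialize a DieId to a string key (format is an assumption for illustration)."""
    return f"{die.lot}:{die.wafer}:{die.x}:{die.y}"

def decode(text):
    """Recover the DieId from its string key."""
    lot, wafer, x, y = text.split(":")
    return DieId(lot, int(wafer), int(x), int(y))

def join_units(sort_data, final_test):
    """Pair each unit's wafer sort value with its final test result via the die ID."""
    return {uid: (sort_data[uid], final_test[uid])
            for uid in final_test if uid in sort_data}

# Hypothetical data: a sort speed metric per die and a final test bin per packaged unit.
uid = encode(DieId("FAB001", 7, 12, -3))
sort_data = {uid: 2.41, encode(DieId("FAB001", 7, 13, -3)): 2.18}
final_test = {uid: "BIN1"}

joined = join_units(sort_data, final_test)
origin = decode(uid)  # traces a unit back to its lot, wafer and die location
```

The same lookup in reverse is what allows a customer return to be traced to a specific fab lot, wafer and die position.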
Local data marts
The advent of SAPPHiRE provided a large and very complete repository of the data needed to support rapid yield learning. However, it became clear that individual engineering groups often needed to create and maintain subsets of the main repository to support routine analysis tasks, especially at the unit level. This became more acute when unit traceability was implemented and large amounts of data were needed quickly to build and test new predictive models of device performance. Since the volume of such data was too great to be efficiently stored in the proprietary data formats used by various analysis tools, controlled data subsets, called data marts, were created. They were designed to facilitate efficient storage and retrieval of large volumes of unit-level data with local flexibility to add model parameters and create roll-ups by lot, wafer, wafer zone, etc.
Reticle and design attribute data
Reticle and design attribute data is being loaded for correlation to manufacturing critical dimensions and defect sensitivities in an effort to better tune designs to a given process and fab tool set. It is anticipated that this area of focus will be increasingly important as optical lithography limits are pushed further.
The total package
The methods we have described for minimizing yield excursions and maximizing ramp rate through systematic detection and elimination of yield loss mechanisms have produced dramatic results. An example of this is the yield ramp on AMD's Opteron 64-bit processor. Although this part was the first volume product on a 130 nm process technology in AMD's Dresden fab, its yield ramp was ~66% faster than the previous generation (Fig. 3).
Fig. 3. AMD Opteron processor yield trend vs. wafer count was steeper than that for prior processor introductions, despite its being a new product on a new 130 nm process.
Moving toward predictability
Unit traceability has presaged the move into an era of predictability. As models are improved that relate wafer sort parametrics with final device performance, this will eventually lead to trusted models for similar relationships between fab metrology data, e-test, wafer sort and final test. When this occurs, the automation systems controlling the fab, especially the APC and integrated planning (i.e., dispatching) systems under the umbrella of AMD's APM program, will be able to route individual wafers based on tool capabilities and availabilities to maximize each wafer's revenue at the lowest cost.
Among the more important capabilities needed to advance rapid yield learning as we go forward will be the following:
- Improved models for predicting device performance and yield based on design attributes. These models will enable fine-tuning of designs, resolution enhancement techniques and processes to minimize systematic device failures due to marginality of the design-process combination.
- Improved models that relate FDC faults and other excursions of tool health monitors to yield and downstream device characteristics, enabling better predictive models for tool maintenance.
- Improved add-on sensor technology (e.g., motes) to extend factory FDC and software systems to process greater volumes of data.
- Improved systems for summarizing data about what measurements are being made, what they mean, and how they are collected and stored (a.k.a. metadata).
- Expansion of unit traceability to all products, with methods for selective storage and use to support model building.
- Better use of special signature analysis, including pre-defined zonal analyses of wafer maps and methods of extracting and using pattern information from traces and maps.
- Comprehensive use of both intra- and inter-die universal coordinates for overlay and analysis of disparate data types.
- Wafer traceability within fab, including wafer history, tool and chamber sequences within tools.
- Ability to store and analyze sub-die-level data to correlate process results to performance within a die (e.g., across-die linewidth variation impact on performance).
- Improved sampling methodologies that both maintain statistical representation across all facets of manufacturing and provide targeted sampling for greater detail, when called for by subtle trends in normal manufacturing.
Some of these challenges are not new [4], but all are important. Although none is specific to 300 mm manufacturing, the cost of such factories and the wafers within them will require many of these new methods to achieve profitability through fast yield ramps, low wafer cost and predictable device performance.
Conclusion
The yield management solutions in place today are already delivering unprecedented yield levels and achieving maturity faster than ever. The solutions being planned promise to deliver even higher quality at lower costs. Faster time-to-problem resolution and broader access to the needed data by an expanded engineering community are also part of the future in achieving rapid yield improvements.
Author Information
Richard Kittler is a Fellow in AMD's Technology Development Group. He has a Ph.D. in solid-state physics from UC Berkeley.
Michael McIntyre is a program manager in the Yield Management Systems Group within APM. He has a B.S. in chemical engineering from Worcester Polytechnic Institute, and an M.B.A. from the University of Texas at Dallas.
Christopher Bode is a member of the Technology Integration Group within APM. He has a B.S. in chemical engineering from the University of Illinois at Urbana-Champaign, and an M.S.E. and Ph.D. in chemical engineering from the University of Texas.
Thomas Sonderman is the director of APM. He has a B.S. in chemical engineering from the University of Missouri and an M.S. in electrical engineering from National Technological University.
Steve Reeves is a senior member of technical staff in the Product Development Engineering Group of AMD's Microprocessor Division. He has a B.S. and M.S. in statistics from the University of Missouri.
Steve Zika is manager of product technology implementation for the Computational Product Group. He has a B.S. and M.S. in materials science engineering from Stanford University.
References