Integrated Metrology and Wafer-Level Control
Kevin Lensing and Broc Stirton, AMD Automated Precision Manufacturing, Sunnyvale, Calif. -- Semiconductor International, 6/1/2006
Since the 0.25 µm days of semiconductor technology, there has been a tremendous amount of industry speculation about how, when, where and why to deploy integrated metrology (IM) into very large-scale integration (VLSI) manufacturing. Entire tracks of major conferences are consistently devoted to developments on the IM front, but (with a few exceptions) the industry nonetheless seems stuck in a cycle of research, technology development, feasibility studies, pilot projects, and forward-looking business analyses. Chipmakers and metrology vendors have produced scores of publications pronouncing the age of IM to be at hand, and trade publications have devoted significant editorial space to extolling the virtues of IM as a key enabler for various technology nodes. Notably absent, however, are significant public statements from chipmakers that IM is being deployed in lieu of standalone metrology for any large block of unit processes at their latest multibillion-dollar fabrication facility. With all the industry attention, engineering effort, and apparent momentum surrounding this technology, why does large-scale adoption remain elusive?
Industry conventional wisdom has tied the emergence of IM to the necessity for wafer-level process control. It is undeniable that shrinking process windows have already necessitated the development of control systems that account for variation below the lot level. To enable the wafer-level data resolution necessary for more granular control, the assumption has been made that standalone sampling plans of 1-4 wafers per lot must be replaced by systems that can measure every wafer without accruing a large cycle time penalty. Enter IM. The consensus connection between IM and wafer-level advanced process control (APC) was summarized nicely by Alexander Braun in July 2005: "True APC requires integrated metrology, particularly with the introduction of new materials and architectures, and increasingly smaller process windows, especially beyond the 65 nm node."1
Our manufacturing strategy is called automated precision manufacturing (APM). Two of the integrated pillars of APM are APC and advanced measurement technology. The alignment of these two groups under the APM umbrella has allowed us to continually evaluate the relationship between emerging metrology and APC technologies. In this article, we will leverage the unique perspective of APM to add some clarity to the IM and wafer-level control (WLC) discussion. We will start with a simple explanation of how analysis of variance should drive control strategies. We will then explore the state of IM, both as a control solution and as an enabler for factory efficiency, yield enhancement and cost savings. Finally, we will detail our approach to the WLC problem and, in doing so, provide an interesting alternative to the prevailing conventional wisdom.
Variance and wafer control
In the interest of clarity, it will help to define a few relevant terms. WLC refers to the use of individual process settings for each wafer in a lot or batch, defined at the beginning of the run. In other words, no new information obtained during the run is used to update the process settings; all process parameters are downloaded to the process tool at the start of the run (Fig. 1). This is similar to the operating scenario for lot-level control, except that wafers can have individual settings. Wafer-to-wafer (WtW) control simply adds the capability for process setting updates within the run. In this case, the process settings for each wafer may be defined or updated at any time before the wafer is processed. The WtW automation scenario can include the feedback of new information during the run or an IM feed-forward scenario. New information obtained during the run can come from IM data from preceding wafers in the same lot or standalone metrology data from other lots. Such will be the conventions in this article.
In typical analysis of variance in semiconductor processing, it is helpful to break down the components of process variation into several pieces: lot-to-lot, wafer-to-wafer, within-wafer and within-flash (or reticle field). As independent, linearly related quantities, these individual variance components sum together to give the total variance:2
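$$\sigma^2_{\mathrm{tot}} = \sigma^2_{\mathrm{ltl}} + \sigma^2_{\mathrm{wtw}} + \sigma^2_{\mathrm{wiw}} + \sigma^2_{\mathrm{wif}} + \sigma^2_{\mathrm{meas}}$$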
The techniques of WLC and WtW control are concerned primarily with addressing the WtW component. In general, to actively control any form of variation, two requirements must be met. First, the output variable of interest must be observed. This typically consists of wafer-state metrology. Second, a control knob must exist. To achieve WLC, this must include per-wafer process setting capability. The additional variance caused by the measurement system ($\sigma^2_{\mathrm{meas}}$) is also explicitly included here. This term is traditionally addressed by improving the precision of the metrology tool and filtering raw data for measurement error.
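As a rough illustration of the variance sum, the sketch below adds independent components in quadrature and reports each component's share of the total. The sigma values are hypothetical placeholders chosen only for illustration; they are not data from any factory.

```python
import math

# Hypothetical 1-sigma contributions in nm; placeholder values, not measured data.
components = {
    "lot-to-lot": 1.2,
    "wafer-to-wafer": 0.9,
    "within-wafer": 1.5,
    "within-field": 0.8,
    "measurement": 0.4,
}

def total_sigma(sigmas):
    """Independent components add as variances, so sigmas add in quadrature."""
    return math.sqrt(sum(s ** 2 for s in sigmas.values()))

def variance_shares(sigmas):
    """Fraction of the total variance attributable to each component."""
    total_var = sum(s ** 2 for s in sigmas.values())
    return {name: s ** 2 / total_var for name, s in sigmas.items()}

sigma_tot = total_sigma(components)
shares = variance_shares(components)
for name, share in shares.items():
    print(f"{name:15s} {share:6.1%} of total variance")
print(f"total sigma = {sigma_tot:.2f} nm")
```

Because the terms add as variances rather than as sigmas, removing one component lowers the total sigma by less than that component's own sigma, which is why the largest terms are usually the first targets for control.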
Each of the above components consists of systematic and random (noise) contributions. It has been shown that active feedback control of random variation is futile,3 so the focus is feed-forward control of random variation and feed-forward/feedback control of systematic variation. When using feedback, there is a danger of interpreting an upstream systematic variation as a drift. Take an IM-equipped etcher with feedback control. If four consecutive wafers are measured at final etched CD (FICD) and found to show a trend, this may induce a control move to recenter the FICDs. However, if the trend is caused by differences in the bottom antireflective coating (BARC) thickness caused by deposition chamber biases, a systematic bias is being misinterpreted as a trend. A control move based on this erroneous interpretation of the data will actually induce more variation. Hence, proper engineering diligence to identify sources of systematic variation is a prerequisite for any WLC implementation. While the ratio of systematic to random variation is very process-dependent, it has been our experience that the most potential for improvement lies in uncovering and compensating for the systematic effects. They must be "uncovered" because they are not always conspicuous. The overlaid contributions of subsequent process steps superimpose to obscure the individual effects and give the appearance of random variation. In Figure 2, for example, what looks like a random trend in the cumulative error by slot is actually the combination of systematic contributions from five polish arms, three bake plates and four etch chambers. The importance of good engineering data analysis in discovering these systematic effects cannot be overstated.
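The superposition effect can be reproduced with a small synthetic experiment. The sketch below is an illustration only: the slot-to-unit routing, the offset values and the backfitting routine are all assumptions, not the analysis method used for Figure 2. It sums three sets of purely systematic offsets per slot, then separates them again by alternately fitting per-unit means.

```python
import random

ARMS, PLATES, CHAMBERS = 5, 3, 4  # sub-unit counts from the example in the text
SIZES = (ARMS, PLATES, CHAMBERS)

def unit_path(slot):
    """Hypothetical routing: each slot maps to one arm, plate and chamber."""
    return slot % ARMS, slot % PLATES, slot % CHAMBERS

def simulate(rng, n_slots=25):
    """Sum three systematic offsets per slot (no random noise is added)."""
    offsets = [[rng.uniform(-1, 1) for _ in range(n)] for n in SIZES]
    return [sum(offsets[f][u] for f, u in enumerate(unit_path(s)))
            for s in range(n_slots)]

def backfit(errors, n_iter=200):
    """Alternately fit each factor's per-unit mean to the current residual."""
    paths = [unit_path(s) for s in range(len(errors))]
    fit = [[0.0] * n for n in SIZES]
    for _ in range(n_iter):
        for f, n in enumerate(SIZES):
            for u in range(n):
                members = [s for s, p in enumerate(paths) if p[f] == u]
                # Residual with every factor except f removed.
                partial = [errors[s] - sum(fit[g][paths[s][g]]
                                           for g in range(3) if g != f)
                           for s in members]
                fit[f][u] = sum(partial) / len(partial)
    residuals = [errors[s] - sum(fit[g][paths[s][g]] for g in range(3))
                 for s in range(len(errors))]
    return fit, residuals

rng = random.Random(0)
errors = simulate(rng)
fit, residuals = backfit(errors)
raw_spread = max(errors) - min(errors)
left_over = max(abs(r) for r in residuals)
print(f"raw spread by slot: {raw_spread:.3f}, unexplained after fit: {left_over:.2e}")
```

Plotted by slot, `errors` shows no obvious pattern, yet the fit accounts for essentially all of it. Note that only the differences between units within a factor are identifiable; a constant can always be shifted from one factor to another without changing the sum.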
Figure 2. A seemingly random distribution of error by slot is actually the cumulative effect of three systematic sub-unit contributions.
The systematic portion of the process variation can consist of such effects as chamber bias or tool drift. By drift, we refer to any wafer-sequence effect in the output parameter caused by changes in the processing conditions in the process tool. If the drift is fast enough that its intralot effect is significant or has an idle-time dependency (e.g., a first-wafer effect), then WLC may be required to address it. For slower drifts, APC can compensate for most of the effect with simple lot-to-lot adjustments.
IM drivers
The industry has focused largely on IM as a solution to complicated control problems. Referring to the definition supplied above, IM is clearly the best metrology vehicle to deliver WtW control. In a feed-forward WtW control scenario, each wafer is sent to the IM unit and measured prior to processing. The recipe is then modified for that wafer to account for incoming variation, and the procedure is repeated for all (or some subset) of the wafers in the lot being processed. When used in this way, assuming equally accurate metrology data, the IM unit is adding no additional control value over WLC using standalone metrology. The big advantage for IM over WLC for the feed-forward case is the improved efficiency of measuring wafers on an integrated unit while they are queuing for process. No cycle time or standalone capacity is wasted on a separate processing step for pre-metrology.
For feedback WtW control, on the other hand, IM is uniquely qualified. To receive feedback in time to correct for trends within a lot, timing and logistics make it impossible to move wafers to a standalone metrology unit, measure them, and deliver results to a control system in time to make an adjustment for the same lot. But even for IM, this is not easy. First of all, the IM unit must be fast enough to measure wafers and produce results in time for an adjustment. For complex measurements taking place on high-volume equipment (like a lithography cell), it is often impossible for the IM unit to keep up with the process tool while measuring every wafer and a plurality of sites. One solution is to institute some kind of IM sampling plan (e.g., measure every second wafer), but this limits the effectiveness of the WtW control system. Separating an actionable trend from process noise often requires several data points from adjacent wafers, and when wafers are skipped, the controller is likely to choose the conservative option and make no change. If additional measurements add clarity to the trend, the controller will act, but how many wafers in the current lot remain unprocessed to benefit from the so-called wafer-level adjustment? Metrology suppliers are acutely aware of this throughput problem, and significant engineering effort is being devoted to keeping up with the process tools. But there is a price to be paid, either literally in terms of IM pricing or figuratively in lost sensitivity, attached to the higher IM throughput. If the IM unit is also being used for feed-forward WtW control at the same process step, the capacity constraint on the metrology unit is even worse. When these IM logistical scenarios force a series of data compromises, the WtW control benefit quickly gets lost in the noise.
Even if wafer control is not implemented, IM should nonetheless provide benefit for lot-level APC systems. Having the metrology on the process tool means zero feedback delay; updated states are ready for the next lot being processed. This eliminates the lag between state change and metrology detection, ensuring that there are no intervening lots in jeopardy of being processed with the prechange settings while waiting for a post-change lot to be sampled. Average delay using standalone metrology can be anywhere from one to five lots, depending on the processing sequence. For a common APC application using exponentially weighted moving average (EWMA) state estimation and a medium gain, total variance could be cut in half by reducing the metrology delay from three to zero lots.
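The effect of metrology delay on an EWMA controller can be explored with a short simulation. The sketch below is a minimal illustration under stated assumptions: the plant model (a random-walk drift plus white noise), the noise levels, the gain of 0.3 and the function name `run_ewma` are all hypothetical. It is not intended to reproduce the figure quoted above, and the ratio it prints depends entirely on those assumptions.

```python
import random
import statistics

def run_ewma(delay, gain=0.3, n_lots=5000, seed=1):
    """Return the output variance of an EWMA-controlled process.

    Assumed plant (hypothetical): output = disturbance - estimate, target 0.
    The disturbance is a random walk (tool drift) plus white noise.
    `delay` is the number of lots processed before a lot's metrology arrives.
    """
    rng = random.Random(seed)
    drift = 0.0
    estimate = 0.0          # EWMA estimate of the disturbance state
    pending = []            # observations still waiting on metrology
    outputs = []
    for _ in range(n_lots):
        drift += rng.gauss(0.0, 0.5)
        disturbance = drift + rng.gauss(0.0, 0.5)
        outputs.append(disturbance - estimate)   # controller subtracts its estimate
        pending.append(disturbance)
        if len(pending) > delay:
            observed = pending.pop(0)            # result from `delay` lots ago
            estimate = gain * observed + (1.0 - gain) * estimate
    return statistics.pvariance(outputs)

var_no_delay = run_ewma(delay=0)
var_three_lots = run_ewma(delay=3)
print(f"variance, zero-lot delay:  {var_no_delay:.3f}")
print(f"variance, three-lot delay: {var_three_lots:.3f}")
```

Both runs use the same seed, so they see the same disturbance sequence and differ only in how stale the controller's information is. Changing the drift-to-noise ratio or the gain shows how strongly the delay penalty depends on the process being controlled.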
Control requirements are not the only drivers for IM. The volume of data produced by IM can also be used to improve line and sort yields. Even in the absence of APC, if an IM unit can produce data on every wafer that is monitored in real-time by statistical process control (SPC) or fault detection systems, excursions may be detected and the equipment halted with a minimum amount of scrap jeopardy. Similarly, the IM data could add clarity to root-cause analysis of subtle equipment or process-driven yield signatures. Both of these cases are theoretically legitimate, but it is difficult to calculate an effective return on investment (ROI) based on yield improvement because the net effect is a function of the particular line and sort yield performance for a given factory. In their sales pitches to chipmakers, IM vendors often trot out numbers like "2% scrap reduction" or "1% sort yield improvement," but these numbers are guesstimates at best, and they are largely ignored by fab customers. In the case where management is aware of a specific scrap problem around an equipment set, or if parametric wafer-level yield loss is particularly vexing, IM may provide some help. But keep in mind that the capital investment for IM is only one part of the cost of the total yield solution. Inline fault and SPC systems must be in place to receive and act at the wafer level to limit scrap. Yield analysis tools must exist to mine the extensive wafer-level parametric data and find actionable signatures. Without these systems and analysis tools, IM results become part of the ever-growing graveyard of factory data.
Apart from a process-window constraint that requires WtW control, maybe the most compelling arguments for IM are related to factory efficiency. The demanding process requirements that are driving controller development are also expanding the overall inline metrology burden. Nothing is more likely to cause a fab manager to boil over than a production bottleneck caused by a metrology tool, but as each technology node adds new requirements to an often static metrology capacity, a loss of efficiency is inevitable. With pervasive fabwide APC, it is no longer an option to simply keep the line moving by allowing a large block of product to skip metrology. No metrology means no APC, which halts the process tools. So while measurements burden the factory with metrology cycle time, sparse or delayed data burdens the factory with APC-induced interruptions.
To the extent that IM implementation actually provides relief for the standalone capacity, factory inefficiencies related to metrology and APC can be minimized. Once IM is fully implemented and qualified for a single process, sampling at the associated standalone metrology step can be dramatically reduced or even eliminated. The result will be improved cycle time for the product and additional available standalone capacity for other needs. The on-board metrology will also eliminate the two largest sources of APC-induced factory disruptions — controller jeopardy and pilot wafers. We already discussed above how eliminating metrology lag improves control, but it also reduces factory disruptions by eliminating jeopardy constraints. A jeopardy constraint occurs when the number of lots processed since the last controller update reaches a user-defined limit. Jeopardy inhibits the process tool until a lot is measured, but if every lot is measured using IM while it is being processed, jeopardy is always zero.
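The jeopardy mechanism described here amounts to a counter with a limit. The sketch below is a simplified illustration; the class name `JeopardyMonitor` and its methods are hypothetical and do not correspond to any particular APC framework.

```python
class JeopardyMonitor:
    """Tracks lots run since the last controller update (hypothetical helper)."""

    def __init__(self, limit):
        self.limit = limit          # user-defined jeopardy limit, in lots
        self.lots_since_update = 0

    def lot_processed(self):
        self.lots_since_update += 1

    def metrology_received(self):
        # A measurement updates the controller state and clears jeopardy.
        self.lots_since_update = 0

    @property
    def tool_inhibited(self):
        return self.lots_since_update >= self.limit


# Standalone metrology: results arrive only after several lots have run.
standalone = JeopardyMonitor(limit=3)
for _ in range(3):
    standalone.lot_processed()
print(standalone.tool_inhibited)   # True: limit reached, tool waits for data

# Integrated metrology: every lot is measured as it is processed.
integrated = JeopardyMonitor(limit=3)
for _ in range(3):
    integrated.lot_processed()
    integrated.metrology_received()
print(integrated.tool_inhibited)   # False: the counter never accumulates
```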
APC initialization occurs whenever the state of the controlled variable is unknown. Common examples are after tool maintenance, when a new product is introduced or when data has expired. During initialization, a small number of wafers are often split from the lead lot, processed, and measured to get an accurate state estimate before committing all 25 wafers. While the pilot lot is being processed and measured, the remaining wafers (and often the process tool) are inhibited. If the measurements were gathered during the process flow, the pilot wafer process would no longer be necessary. The APC controller could estimate the state from IM measurements on the lead wafers and make the adjustment for the rest of the lot with minimal delay.
Finally, there is the question of capitalized cost — probably the most controversial of all potential IM drivers. Vendors have claimed that IM is roughly half the price of standalone metrology.1 But how are they measuring? Are they simply comparing one tool to one tool, like a single-etch IM scatterometer vs. a single CD-SEM? Or are they comparing the overall "cost to control" a set of equipment, since a single CD-SEM will service multiple etch platforms? How about a cost-of-ownership (CoO) comparison based on the relationship between the capital cost and throughput? What method seems fair? The tool-to-tool comparison seems the least reasonable. A fab is not at all likely to choose between a single IM and a single standalone tool to fill a control need. The "cost to control" methodology calls for IM and standalone costs to be normalized according to the number of individual process tools that are controlled by a single system, but this methodology does not account for the additional wafer metrology and associated factory value in terms of process control and yield that comes with measuring every wafer in every lot when process tools contain IM. So that leads us to the per-wafer CoO approach. This puts everything on a cost-per-wafer basis, but in doing so, discounts the flexibility of the standalone unit in terms of lot, wafer and site sampling. It also falsely assumes that the relationship between sampling and value is infinitely proportional, and we know that this is not true. The factory receives diminishing returns from additional wafer measurements. There is a saturation point after which little additional information is gained, which is the entire basis for metrology sampling plans.
No matter what method is used to compare the cost of IM to standalone, the underlying assumption remains that IM will substitute for, not add to, the standalone burden. Given the current risk level associated with most IM deployments, this is not a valid assumption. Right now, the majority of IM cases have not been proven manufacturable enough to justify buying IM in lieu of standalone metrology. Chipmakers are bringing IM into the fab for evaluation after they have acquired enough standalone capacity to satisfy their control needs. The success of an IM project may impact buying patterns for future factories or technology nodes, but it does nothing for present production. In a cruel twist of fate, the IM project often decreases the overall production metrology capacity by requiring standalone measurements to validate the IM data. Until the risk level associated with IM makes it a replacement instead of an additional cost, there is no such thing as a fair comparison.
The current state of IM
It has been accurately shown in several publications that CMP is the area where IM has been most widely adopted.1,4,5 Why is this so? First of all, the metrology solution is a much lower technical risk than IM for other process modules. For CMP, the same optical film thickness metrology unit that is common for standalone metrology can be adapted and integrated with the polisher. The IM unit may be a slight variation in terms of the optical design or metrology supplier, but the engineering principles are similar to those that are already well known for controlling polish processes. So while there may be complications with the metrology integration, the inherent risk in the measurement technology itself is minimal. The relative simplicity of the metrology technology makes the solution more manufacturable and cost-effective. Next, there is a clear case that has allowed the OEM suppliers to develop a "drop-in" solution that combines the polisher, IM unit, and APC controller in one capital package. The standard implementation allows for feed-forward WtW APC using the on-board film thickness unit, thereby correcting for incoming film thickness variation by modifying the polish time for each wafer. There is no need for a comprehensive APC infrastructure to implement this WtW solution. If no APC controllers exist immediately up or downstream, the CMP WtW control becomes a clearinghouse for correcting aggregate variation in the unit process.
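The feed-forward correction described for CMP reduces to simple arithmetic per wafer. The sketch below assumes a constant removal rate and uses made-up thickness values; the function name `polish_time_s` and all numbers are illustrative assumptions, not parameters from any vendor's controller.

```python
def polish_time_s(incoming_nm, target_nm, removal_rate_nm_per_s):
    """Time needed to polish from the measured incoming thickness to target."""
    if removal_rate_nm_per_s <= 0:
        raise ValueError("removal rate must be positive")
    to_remove = incoming_nm - target_nm
    return max(to_remove, 0.0) / removal_rate_nm_per_s  # never a negative time

# Hypothetical pre-polish thicknesses from the on-board unit, one per wafer (nm).
incoming = [812.0, 798.5, 805.2, 821.7]
TARGET_NM = 500.0
RATE_NM_PER_S = 4.0   # assumed constant removal rate

times = [polish_time_s(t, TARGET_NM, RATE_NM_PER_S) for t in incoming]
for slot, (thickness, seconds) in enumerate(zip(incoming, times), start=1):
    print(f"slot {slot}: incoming {thickness:.1f} nm -> polish {seconds:.1f} s")
```

A production controller would also have to track removal-rate drift as the pad wears, which is where feedback from post-polish measurements would come in.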
The patterning modules present a different IM story. With pattern-limited yield loss dominating current and future technology nodes, the promise of rich, high-volume data streams flowing from every lithography cell and etcher is quite appealing. With scatterometry technology gaining traction in key areas as a standalone solution, the table seems set for an optical CD-IM revolution. Indeed, these business factors have led to a tremendous amount of IM progress in litho and etch over the past five years, but there is much more to be done before CD-IM reaches mainstream production implementation.
To illustrate the difficulty of implementing IM in litho and etch, one needs only to compare the situation to CMP. The IM technology, scatterometry, is just now emerging from infancy. Chipmakers do not have the luxury of dropping a well understood and long-implemented standalone technology into the IM space. They are instead compounding a difficult integration problem with a difficult technology problem. It would be better if scatterometry could continue to mature as a standalone technology for the next 10 years before migrating into process tool configurations, but the need statement for CD-IM will not wait. Complicating matters is the fact that broadband scatterometry can be deployed using several different hardware configurations. The various hardware types (Fig. 3) entail a series of compromises between cost, throughput, size and sensitivity.6 The optimal choice of hardware will likely be different for standalone, integrated litho and integrated etch cases. Is it worth it to pick the "best" hardware for each application and accumulate a complex web of scatterometry tools, likely from multiple vendors, to develop and maintain? Or is there a single supplier and/or hardware compromise that can meet every requirement? This is just one of many difficult questions that chipmakers must answer.
Figure 3. Various hardware types entail a series of compromises between cost, throughput, size and sensitivity.
Like the optical film thickness metrology used in CMP, scatterometry is a model-based technique, but the modeling problem is far more complex. Scatterometry extends the modeled space from one to two dimensions, and in addition to the traditional film thickness solutions, scatterometry also solves for the geometric shape parameters describing the diffraction grating. As a result, each new application requires a significant development effort to build a model, and model sensitivity to the parameters of interest is not guaranteed. The quality of the metrology result is generally a function of the diffraction grating layout and composition of the film stack. For standalone applications, it makes sense to deploy scatterometry, try out a myriad of process layers, and pick the ones that present the best results for the highest-impact processes.
For IM, the usefulness of the metrology is tied fundamentally to the set of processes running on the host tool. If it is found that scatterometry is not feasible for a new layer or technology slated for that equipment, the IM unit is stranded. This is a real possibility as new materials being used at 65 nm and beyond strain the ability of scatterometry to produce robust metrology. Especially troubling is litho IM, where advanced antireflective coatings (ARCs) seek to eliminate the very reflectance that scatterometry relies on for sensitivity to CD and resist profile variation.6 Effectively managing the development and deployment of scatterometry amid all this uncertainty is no small task.
The potential upside of scatterometry has led most chipmakers to pursue the technology despite the high engineering overhead and modeling uncertainty. When scatterometry proves a good fit for an application, the stream of fast, rich and precise CD and profile metrology is unmatched. Scatterometry is alone in its ability to provide sidewall information for vertical or reentrant resist profiles at throughputs suitable for high-volume manufacturing (Fig. 4), making it a key enabler for lithography pattern control at the 65 nm node and beyond.
WLC at AMD
Given that patterning IM is not a good cost/risk proposition, we developed an alternate strategy for WLC. As previously noted, the first requirement in controlling variation is an observation strategy. The collection of adequate inline wafer-state data is a cornerstone of good process control. The trade-off between collecting sufficient data (whether for SPC or APC needs) and minimizing metrology requirements is a familiar manufacturing concern. Our goal has been to maximize the amount of useful information available to the APC applications. While the quantity of data is certainly part of this, the decision of what to measure (i.e., which lots, wafers and sites) was recognized to be of critical importance. In other words, data is useful as long as it furthers the manufacturing objectives. Complicating this is the fact that what is useful can change over time. For example, if we have not recently measured a wafer from chamber A of a particular etcher, the data we will obtain from sampling such a wafer is much more valuable than it would be if we had sampled chamber A only minutes ago. By leveraging advanced sampling and scheduling capabilities, we are able to ensure that available metrology capacity is used to capture data with time-sensitive value. Static sampling rates are replaced with a dynamic system capable of balancing multiple engineer-defined rules. The system works by minimizing the penalty for unsatisfied rules with constraints on metrology capacity.7 Such optimization ensures that metrology tools are always being used to provide the maximum amount of information about the process to our control systems, and allows multiple sampling goals to be met. In this sense, we collect information instead of data. Over time, all systematic contributions to wafer variation can be observed and characterized, with zero net increase in the volume of metrology data generated.
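To make the idea of time-sensitive sampling value concrete, the sketch below picks wafers to measure by how stale each chamber's data has become, within a fixed metrology budget. It is a simple greedy heuristic written for illustration; it is not the optimization described in reference 7, and the names `Candidate`, `staleness_penalty` and `choose_samples`, the penalty form and all numbers are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    wafer_id: str
    chamber: str
    measure_cost: float   # metrology time needed, arbitrary units

def staleness_penalty(chamber, last_sampled, now, max_age):
    """Penalty grows once a chamber has gone unsampled longer than max_age."""
    age = now - last_sampled.get(chamber, float("-inf"))
    return max(0.0, age - max_age)

def choose_samples(candidates, last_sampled, now, capacity, max_age=60.0):
    """Greedy pick: highest penalty relieved per unit of metrology time first."""
    last = dict(last_sampled)      # work on a copy; sampling resets staleness
    remaining = list(candidates)
    chosen = []
    while remaining:
        scored = [(staleness_penalty(c.chamber, last, now, max_age) / c.measure_cost, c)
                  for c in remaining if c.measure_cost <= capacity]
        scored = [(score, c) for score, c in scored if score > 0]
        if not scored:
            break
        _, best = max(scored, key=lambda pair: pair[0])
        chosen.append(best)
        capacity -= best.measure_cost
        last[best.chamber] = now   # that chamber's rule is now satisfied
        remaining.remove(best)
    return chosen

wafers = [Candidate("W01", "A", 1.0), Candidate("W02", "A", 1.0),
          Candidate("W03", "B", 1.0), Candidate("W04", "C", 1.0)]
last_sampled = {"A": 0.0, "B": 170.0, "C": 100.0}   # minutes, hypothetical
picked = choose_samples(wafers, last_sampled, now=180.0, capacity=2.0)
print([c.wafer_id for c in picked])   # ['W01', 'W04']
```

Once a wafer from chamber A is chosen, a second chamber A wafer adds little, so the remaining capacity goes to chamber C. This mirrors the point above that the value of a measurement depends on what has been sampled recently.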
Once the variation has been observed, the job of the control algorithm is to monitor the systematic equipment trends, decouple all sources of variation, and correct for the biases with feed-forward WLC. So instead of using IM to measure every wafer for WtW feed-forward control, we use standalone metrology and dynamic sampling to measure just enough to characterize systematic sources of variance and correct them using WLC.
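A feed-forward WLC calculation of this kind can be sketched as a lookup of characterized biases followed by a per-wafer setting computed before the run starts. Everything specific below is an assumption made for illustration: the bias table, the linear trim-time knob, the sub-unit names and the function `wafer_settings` are hypothetical and do not describe the production controller.

```python
# Hypothetical characterized CD biases (nm) per upstream sub-unit, as might be
# estimated from sampled standalone metrology. Values are placeholders.
BIAS_NM = {
    ("bake_plate", "P1"): +0.6,
    ("bake_plate", "P2"): -0.4,
    ("bake_plate", "P3"): -0.2,
    ("dep_chamber", "A"): +0.9,
    ("dep_chamber", "B"): -0.9,
}
CD_PER_TRIM_SECOND_NM = -0.5   # assumed linear knob: 1 s more trim removes 0.5 nm

def wafer_settings(lot_history, base_trim_s):
    """Build every wafer's trim time before the run starts (WLC, not WtW)."""
    settings = {}
    for wafer_id, path in lot_history.items():
        expected_bias = sum(BIAS_NM.get(step, 0.0) for step in path)
        # Choose a trim offset that cancels the expected bias.
        settings[wafer_id] = base_trim_s - expected_bias / CD_PER_TRIM_SECOND_NM
    return settings

# Upstream processing context for each wafer, known before the lot arrives.
lot_history = {
    "W01": [("bake_plate", "P1"), ("dep_chamber", "A")],
    "W02": [("bake_plate", "P2"), ("dep_chamber", "B")],
    "W03": [("bake_plate", "P3"), ("dep_chamber", "A")],
}
recipe = wafer_settings(lot_history, base_trim_s=20.0)
for wafer_id, trim in recipe.items():
    print(f"{wafer_id}: trim {trim:.1f} s")
```

No wafer in the lot needs to be measured at this step: the settings depend only on each wafer's processing history and on biases learned from earlier sampled lots.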
Summary
For industry leaders in semiconductor manufacturing, the age of wafer-level APC is not just coming, it is here. Widespread lot-level APC brought wafer-level variation to the forefront early on, and WLC development began at the 130 nm technology node. There are multiple insertion points of WLC at 90 nm; for 65 nm production, it will be pervasive. While the industry pushes for immature IM solutions to provide the data necessary for wafer control, we came to the conclusion that a parallel strategy would be required. A close examination of the variance showed that processes rarely exhibit within-lot instability, and most of the wafer-level variation is actually a combination of systematic contributions from upstream equipment. It is not incoming noise that is eating up narrow process windows; it is biases between individual chambers on a cluster tool, bake plates and coat cups on a litho cell, and individual zones within a furnace. With smarter wafer sampling, it is not necessary to measure every wafer to implement feed-forward WLC — one only needs to sample enough to characterize the sources of variation. This has proven to be a good strategy. With the combination of standalone metrology and dynamic sampling, we are able to produce WLC that approaches the performance of WtW feed-forward control with IM. This strategy has enabled us to take a more measured approach to IM evaluation and deployment. We have no IM in production for 90 nm, and very little for the 65 nm node.
None of this means that we are not interested in IM. As mentioned above, there are other good reasons for IM not related to wafer-level APC. IM is especially appealing as a way to align with lean manufacturing principles, where product flows with minimal uncertainty related to metrology sampling. For future technology nodes, it is entirely possible that feedback WtW control will prove necessary, thereby requiring more IM. But the technology has to be ready, and right now there are too many questions surrounding IM, especially in the patterning modules, to justify integrated metrology in place of standalone metrology. As long as IM is an additional instead of a replacement cost, our system of WLC using standalone metrology affords us the luxury of waiting.
Author Information
Kevin Lensing is a member of the technical staff in the APM development group at AMD. He received a B.S. in chemistry from the University of Dallas and an M.S. in chemical engineering from the University of Texas at Austin.
Broc Stirton is a member of the technical staff in AMD's APM development group. He earned a B.S. in physics from the University of Dallas.