Estimating the Impact of Defects on Yield from In-Line Defect Measurement Data
Stuart Riley, Fabcentric., Sunnyvale, Calif. -- Semiconductor International, 12/1/1999
Defect measurement methodologies typically involve defect detection and categorization of defect types through microscope review. Defect counts are used to track excursions and set priorities for defect reduction efforts, but defect counts alone are not sufficient for excursion control and defect prioritization because they do not adequately explain how specific defect types affect the yield of the product. Controlling excursions and setting defect priorities is more effective when the impact of defects on yield is understood. To bridge the gap between defect count data and yield impact estimation, one needs to adopt a methodology that is credible, provides consistent results and requires few assumptions in order to apply it. This article reviews the key steps that are necessary in applying in-line defect measurement data to estimate the impact of defects on yield. These steps include defect detection, defect review sampling, defect classification, and the choice of the applied yield model for yield estimation.
Yield Prediction and Defect-Limited Yields
Yield-loss can be caused by mechanisms other than those detectable by defect inspection equipment. Variables affecting inspection tool performance such as focus, contrast (the ability to separate the defect image from the background image), and light level all place inherent limits on what can be detected. Other yield-limiting problems such as missing implants or oxide pinholes are usually invisible to microscope-based inspection techniques. Improper placement of the inspection steps, such as after a deposition where the defects are hidden by a film, can also limit the effectiveness of the inspection to detect yield-limiting defects. It is left to electrical test to detect these problems. No matter how careful one is in establishing the proper sampling, classification and statistical treatment of the defect inspection data, the estimation of yield impact based on such data may not correspond to the final yield of the product. It is unlikely that an "accurate yield prediction" can be achieved based on in-line inspection data. Instead, it is more likely the inspection data will only reflect a subset of the overall causes of yield loss for any given lot.
"Defect-Limited Yield" defines the upper limit of yield achievable due to specific defect types or groups of types. If all defects other than the type or group under consideration are removed from the wafer, the yield will be no better than the limited yield for that type or group. Removal of these defects may not result in a corresponding improvement in yield due to multiple-failed die, or defects other than the ones under consideration, such as undetected yield-loss mechanisms. The concepts of defect-limited yield and multiple-failed die can be described as a simple Venn diagram as shown in Figure 1. The large outer circle represents all die on the wafer (or groups of wafers). The middle circle represents all die failing due to detected and undetected defects. The circles to the left and right represent all die containing type A and type B defects, respectively. Of course, these defects have to be detected to be included in these circles. The areas where the type A and type B defect circles overlap the yield loss circle corresponds to failing die containing type A and type B defects.
The overlapping type A and type B circles correspond to die containing both defect types. If we remove all of the type A defects as shown in Figure 2, the corresponding yield recovery may not be realized due to undetected defects continuing to depress the yield on those same die. This phenomenon, called "Multiple Failed Die", limits our ability to fully describe the effects of yield loss and is a fundamental limitation of any yield-estimation methodology based on in-line defect detection. Therefore, any yield estimate based solely on in-line defect inspection data may not be able to accurately predict the final yield for a lot.
Kill ratio estimations
Kill ratios define the probability a defect will cause a fail on the die circuit. Obviously, this is important to know as the yield impact (and thus the priorities) of defects cannot be determined without it. In this section, we will review two popular approaches to determining kill ratios, then describe a method that estimates kill ratios of defects based on their apparent effect on the die circuit as defined through classification.
Kill ratios based on defect size
The "Critical Area" is a measure of the sensitivity of a device circuit to random defects, and is related to the ground rules of the device (or different areas within the device) and the size of defects falling on the device [1, 2, 3]. For a given device ground rule, the yield due to random defects is lowered as defect sizes grow larger. Likewise, the yield is lowered for a given defect size if the ground rules of the device become smaller and denser.
This relationship of defect size to yield is driving many defect management groups to adopt methodologies based on assigning kill ratios according to the defect size data from inspection equipment. This methodology is based on the assumption that defect sizes based on inspection equipment measurements are accurate or at least consistent run to run. However, the ability for an inspection tool to properly define a defect’s size depends on how much of the defect is detected. If a defect is partially detected by the inspection tool, it will be sized smaller than it actually is. Partial detection of defects is tightly coupled to tool sensitivity settings. Unless one is certain the inspection tool is optimally adjusted to consistently and accurately size defects, it is not realistic to assume the defect size measurements are always correct.
The relationship between a defect’s size and its effect on the product changes depending on which structures the defect falls on. Even if the inspection equipment can accurately size defects, one cannot realistically assume defects of a particular size will have the same effect on different areas of a device, nor will they have the same effect on different product types with different ground rules. It is not realistic to expect a "one-size-fits-all" methodology such as this to work for logic devices with widely varying ground rule regions, in manufacturing lines producing many kinds of products.
| Fig. 2 |
Kill ratios based on test correlations
There is a general belief that kill ratios can only be determined by performing correlations of in-line defect data to test data. This belief is based on the assumption that the accuracy of electrical test data somehow makes the correlations to in-line defect data for kill ratio extraction accurate. Even though the electrical test itself is an accurate measurement, this methodology still relies on the inaccurate measurement and interpretation of in-line defect inspection to determine the kill ratios.
Consider the diagram in Figure 1. At first examination, it appears that the kill ratios for the type A and B defects can be determined by dividing the overlapping areas of the defect-type circles and the yield loss circle by the overall areas for each type. This approach is equivalent to performing a test correlation of in-line defect data. Upon closer examination, we see there really is no definitive way to determine if type A and B defects are killers just because they happen to exist on the failing die. We only know that type A and B defects are on failing die, but there is no certainty of a causal relationship. There could be other mechanisms, totally unrelated to the type A or B defects, killing these die. The only thing we know with certainty is that type A and B defects are non-killers if they fall on die that pass electrical test.
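The correlation arithmetic described above can be sketched in a few lines of Python. The die numbers and set contents are invented for illustration, and `correlation_kill_ratio` is a hypothetical helper name, not part of any defect management package.

```python
def correlation_kill_ratio(dies_with_defect, failing_dies):
    """Naive kill ratio: share of die carrying a defect type that also fail test."""
    dies_with_defect = set(dies_with_defect)
    if not dies_with_defect:
        raise ValueError("no die carry this defect type")
    return len(dies_with_defect & set(failing_dies)) / len(dies_with_defect)

# Hypothetical wafer; die are identified by number (values invented).
failing = {1, 2, 3, 4, 5, 6}   # die failing electrical test
type_a = {1, 2, 3, 10}         # die containing type A defects
type_b = {3, 4, 11, 12}        # die containing type B defects

ratio_a = correlation_kill_ratio(type_a, failing)
ratio_b = correlation_kill_ratio(type_b, failing)

# Failing die that carry both types: the correlation cannot say which
# defect (if either) caused the fail.
ambiguous = type_a & type_b & failing

print(ratio_a, ratio_b, sorted(ambiguous))
```

The ratios only measure co-occurrence. Die 5 and 6 in this made-up example fail without any detected defect, which is exactly the kind of unrelated mechanism that could also be responsible for the fails on die 1 through 4.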
Applying the correlation methodology to determine kill ratios of specific defect types (required for setting defect priorities) is limited by how well the defects are grouped in their classification codes. The accuracy of the kill ratios for each defect category can be no better than the accuracy of detection and classification.
The correlation methodology requires time to collect the wafer data after test to conduct the correlations. In-line defect measurement is most effective when it is placed near the sources of potentially harmful defects, enabling decisions to be made real time, decreasing the chances of exposing more product to yield-limiting defects. By the time the electrical test correlations are performed on a set of wafers, long removed from processing, the defect problems in the line may have changed. Throughout this period, wafers currently in process can be exposed to potentially harmful defects that are not found on the correlation wafers. It is not realistic to assume the priorities of the defect types in the line will remain fixed over the time period needed for this analysis. As stated previously, kill ratios will change depending on the ground rules of particular devices. Kill ratios must be determined immediately with the data at hand and must be applicable to multiple products.
Determining kill ratios through defect classification
Limited yields for detected defects need to be estimated on a real-time basis, across multiple product types. In-line monitoring methodologies must provide real-time estimations of kill ratios. One way to meet these requirements is by noting the effect of a defect on the product during classification, then assigning an estimated probability that the defect will cause a fail based on that effect.
Critical area is a measurement of the degree to which a defect will cause fails on a die, and is related to the circuit density. This implies the critical area is related to classification groupings. To see this, let Ac be the critical area of the die, and Di the average defect density per die for the ith defect type. The term AcDi defines the average number of faults per die for the ith defect type [4]. This term can be expressed in terms of the kill ratio, ki, and the average defect count per die, di, for the ith defect type [5]:

$$A_c D_i = k_i d_i$$

This term is used in the yield models in the next section. Kill ratios can be different for each defect type. Therefore, the kill ratio is defined by how the defect is categorized. This has a significant implication for in-line inspection: if the defect classifications are grouped by how they affect product (where possible), some of the critical area (kill ratio) information regarding a particular defect group can be captured. In addition, the kill ratio will change for a given defect depending on where the defect falls on the device.
To illustrate how defect classification can be used to determine the kill-ratios, consider the two arrays, A and B, in Figures 3 and 4. The two arrays are exactly the same in length and width. The numbers and distributions of circles and squares (treated as two visually distinct defect groups) are the same in each array. Therefore, the defect density of each array is the same. If defect density were the only way to determine defect excursions or priorities, the arrays would be equivalent. However, the line-space pitch of array B is smaller than array A, so we should expect the effect of the defects on the patterns (kill ratios) for each array to be different. In other words, we expect the critical area of array B to be greater than the critical area of array A.
The defect groups, circles and squares, are equivalent to groups based on visual distinctiveness and/or possible source. The groups are sub-categorized by their apparent effect on the pattern of the arrays as seen in Table 1:
- A "short" is a defect touching two or more lines across a gap.
- An "extension" is a defect partially extending out into a gap from a line.
- An "on-the-line" is a defect totally on a line (not extended into the gap).
- A "between-the-line" is a defect totally in the gap (not touching any line).
Each defect sub-category has an assigned kill ratio based on an estimate of the defect’s relative probability of causing a fail. The effect of some defects on the product is obvious, as in the case of the shorts. Engineering judgment is required for the defects that are not so obvious. The engineering judgment is based on the knowledge of the particular device being measured and collective agreement between the pertinent engineering groups on the probability specific defect types will cause a fail. Collective agreement is a good way to avoid disagreements among the different engineering groups when the data is summarized and reported.
After classification, the count of each defect type (based on its apparent effect on the product) is multiplied by its associated kill ratio to determine the "corrected counts" listed in the "correction for kill ratio" column in Table 1.
For each defect group, the corrected counts are added together, then divided by the total number of uncorrected counts, to determine the overall kill ratio for the group,

$$k_{group} = \frac{\sum_{i=1}^{n} k_i \, Class_i}{\sum_{i=1}^{n} Class_i}$$

where kgroup is the kill ratio for a particular defect group consisting of n defect types, ki is the kill ratio for a specific defect type, and Classi is the number of defects classified for a specific defect type. Groups can be defined in any way that makes sense for a particular analysis. Defect types can be grouped according to source, level, product type, and so on. When defect types are regrouped, the associated kill ratio for each group changes according to this equation.
| Fig. 4 |
The kill ratios for the defect groups in each array are shown in Table 2. If the average of the kill ratios of defects in array B is divided by the average of array A, we find that array B is about 1.75x more sensitive to the same defects affecting array A. This is consistent with our expectations that the critical area of array B is greater than the critical area of array A. This technique is similar to techniques used to extract the critical area of a die circuit layout 00.
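The group kill ratios can be recomputed from the Table 1 counts with a short Python sketch. The dictionary layout and the function name `group_kill_ratio` are illustrative choices, not taken from the article; the counts and estimated kill ratios are those listed in Table 1.

```python
# Estimated kill ratio per apparent effect, as listed in Table 1.
KILL_RATIO = {"short": 1.0, "extension": 0.5, "on_line": 0.0, "between_lines": 0.0}

def group_kill_ratio(counts, kill_ratio=KILL_RATIO):
    """Sum of kill-ratio-corrected counts divided by the total classified count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("group has no classified defects")
    corrected = sum(kill_ratio[effect] * n for effect, n in counts.items())
    return corrected / total

# Classified counts from Table 1, keyed by (array, defect group).
table1 = {
    ("A", "circles"): {"short": 0, "extension": 18, "on_line": 0, "between_lines": 2},
    ("B", "circles"): {"short": 11, "extension": 9, "on_line": 0, "between_lines": 0},
    ("A", "squares"): {"short": 0, "extension": 16, "on_line": 1, "between_lines": 3},
    ("B", "squares"): {"short": 8, "extension": 12, "on_line": 0, "between_lines": 0},
}

ratios = {key: group_kill_ratio(counts) for key, counts in table1.items()}

# Average kill ratio per array, then the relative sensitivity of B versus A.
mean_a = (ratios[("A", "circles")] + ratios[("A", "squares")]) / 2
mean_b = (ratios[("B", "circles")] + ratios[("B", "squares")]) / 2
sensitivity = mean_b / mean_a

for key, value in ratios.items():
    print(key, round(value, 3))
print("B/A sensitivity:", round(sensitivity, 2))
```

With these inputs the group ratios come out as 0.45 and 0.40 for array A and 0.775 and 0.70 for array B, and the B/A ratio is roughly 1.74, in line with the "about 1.75x" quoted above.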
This classification methodology has two major benefits:
1. On-the-fly, real-time calculation of kill ratios for quick estimations of excursions and defect priorities based on defect-limited yield. The kill ratios for each defect group change as the relative number of shorts, extensions, etc. changes within the groups. The group kill ratios can be recalculated with the group kill ratio equation given above.
2. Dynamic calculation of kill ratios according to ground-rule and product differences in multi-product manufacturing lines. One classification scheme works across product types and levels.
The accuracy of the kill ratios for each defect category can be no better than the accuracy of classification. However, if the consistency of classification is maintained, excursions and priorities can be resolved from the background data, because they are set according to relative differences in the data.
Automatic Defect Classification (ADC) is gaining in popularity at many semiconductor sites. ADC may not be able to distinguish the effect of the defects according to this methodology. ADC may only be able to distinguish the broader categories (such as the circles and squares in the previous example). Consequently, the kill ratios for ADC groups may not be determined on a real-time basis, nor will they be entirely applicable across multiple product types. There are ways to utilize this classification methodology to determine kill ratios with an ADC system, but that is the subject of a future article.
Table 1

| Classification Type | Classification Effect | Estimated Kill Ratio | Array A Count | Array A Correction for Kill Ratio | Array B Count | Array B Correction for Kill Ratio |
|---|---|---|---|---|---|---|
| Circles | Shorts | 1 | 0 | 0 | 11 | 11 |
| Circles | Extensions | 0.5 | 18 | 9 | 9 | 4.5 |
| Circles | On Line | 0 | 0 | 0 | 0 | 0 |
| Circles | Between Lines | 0 | 2 | 0 | 0 | 0 |
| Circles | Total Circular Group | -- | 20 | 9 | 20 | 15.5 |
| Squares | Shorts | 1 | 0 | 0 | 8 | 8 |
| Squares | Extensions | 0.5 | 16 | 8 | 12 | 6 |
| Squares | On Line | 0 | 1 | 0 | 0 | 0 |
| Squares | Between Lines | 0 | 3 | 0 | 0 | 0 |
| Squares | Total Square Group | -- | 20 | 8 | 20 | 14 |

Table 2

| Groups | Array A Total Class | Array A Total Killers | Array A Kill Ratio Ka | Array B Total Class | Array B Total Killers | Array B Kill Ratio Ka |
|---|---|---|---|---|---|---|
| Circles | 20 | 9 | 0.45 | 20 | 15.5 | 0.78 |
| Squares | 20 | 8 | 0.40 | 20 | 14 | 0.70 |

Yield estimation calculation
Now that we have discussed the effect defect detection and classification have in estimating the defect-limited yield, we can consider methods of calculating the yield. A suitable method of calculation should:
1. Use data from direct measurement of in-line data, such as defect count, defective die, and so on.
2. Not be dependent on too many assumptions. When assumptions are used, they should be realistic and credible.
We can start by calculating the yield of a wafer, given that all the detected defects will cause failures in the die they fall in,

$$Y = \frac{\#\text{GoodDie}}{\#\text{DieScanned}}$$

where #GoodDie is the number of die without detected defects and #DieScanned is the total number of die scanned by the inspection tool. Of course, not all detected defects will cause failures, so a correction needs to be applied to the number of die with defects (#BadDie) to determine the probability of recovering good die from the group of bad die. This correction should be a function of:
1. Number of defects, per defect type, on each bad die, di. The more defects there are in a die, the lower the yield will tend to be.
2. Design layout sensitivities to defects as defined by the probability of failure, or kill ratio for each defect type, ki. As the design rules get tighter for a given set of defects, the yield will tend to be lower. The correction should be scalable to accommodate differences in device critical areas.
With this correction, the yield equation takes the form (for the ith defect type):

$$Y_i = \frac{\#\text{GoodDie} + \#\text{BadDie} \cdot P(k_i, d_i)}{\#\text{DieScanned}}$$

where P(ki, di) is applied to #BadDie to determine the probable number of die that can be recovered, or yielded, as good die from the "Bad Die" group. Let’s consider the following two well-known yield models as possible candidates for P(ki, di): the generalized negative binomial and Poisson distribution models.
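As a minimal sketch of this correction in Python, assuming the corrected yield is (#GoodDie + #BadDie × P) / #DieScanned and that every scanned die is either good or bad; the function name and argument names are hypothetical.

```python
def defect_limited_yield(good_die, bad_die, p_recover):
    """Yield after crediting back the bad die expected to survive.

    good_die  : die with no detected defects
    bad_die   : die with at least one detected defect
    p_recover : P(k_i, d_i), probability that a bad die is not killed
    Assumes #DieScanned = good_die + bad_die.
    """
    if not 0.0 <= p_recover <= 1.0:
        raise ValueError("p_recover must be a probability")
    die_scanned = good_die + bad_die
    if die_scanned == 0:
        raise ValueError("no die scanned")
    return (good_die + bad_die * p_recover) / die_scanned

# Bounding cases for a hypothetical wafer with 18 clean and 82 defective die.
worst = defect_limited_yield(18, 82, 0.0)   # every detected defect kills
best = defect_limited_yield(18, 82, 1.0)    # no detected defect kills
half = defect_limited_yield(18, 82, 0.5)
print(worst, half, best)
```

The two bounding cases show why the choice of P(ki, di) matters: with P = 0 the estimate collapses to the uncorrected #GoodDie / #DieScanned, and with P = 1 the detected defects have no effect at all.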
Negative binomial model
The general form of the negative binomial model is:

$$Y_i = \left(1 + \frac{A_c D_i}{\alpha}\right)^{-\alpha}$$

where Ac is the critical area of the die, and Di is the average defect density per die for the ith defect type. Substituting kidi for AcDi, the model can be expressed in terms of measurable data from in-line inspections. The model then takes the form,

$$P(k_i, d_i) = \left(1 + \frac{k_i d_i}{\alpha}\right)^{-\alpha}$$

where the average number of defects per defect type is determined by,

$$d_i = \frac{\#\text{Defs} \cdot Pct_i}{\#\text{DefDie}}$$

where #Defs is the total number of defects detected, #DefDie is the number of die with defects, and Pcti is the percent of the ith defect type (determined through classification). The term α is used to adjust the model to accommodate different distributions of defects. If the ith defect type is tightly clustered, α needs to be small. If the ith type is randomly distributed, α is large and the negative binomial model approximates a Poisson model. The term α for the ith defect type is not easy to determine. Therefore, one is forced to assume a value for α to use in the model, an assumption that can introduce errors in the estimation.
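The effect of the clustering term can be seen numerically. The Python sketch below assumes the negative binomial correction has the form (1 + k·d/α)^(−α) and compares it with exp(−k·d); the fault count of 0.5 and the α values are arbitrary illustration values.

```python
import math

def negative_binomial_p(k_i, d_i, alpha):
    """Survival probability (1 + k_i*d_i/alpha) ** -alpha for clustering term alpha."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return (1.0 + k_i * d_i / alpha) ** (-alpha)

def poisson_p(k_i, d_i):
    """Survival probability exp(-k_i*d_i) for randomly distributed defects."""
    return math.exp(-k_i * d_i)

k_i, d_i = 1.0, 0.5   # hypothetical kill ratio and average defects per die

for alpha in (0.5, 2.0, 10.0, 1000.0):
    print(f"alpha={alpha:>7}: {negative_binomial_p(k_i, d_i, alpha):.4f}")
print(f"Poisson      : {poisson_p(k_i, d_i):.4f}")
```

Small α (tight clustering) gives a higher survival probability for the same average defect count, because the defects pile up on fewer die. As α grows the value falls toward the Poisson result, which is why an assumed α can shift the estimate noticeably.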
Poisson distribution model
The Poisson distribution model is simpler to apply than the negative binomial model, but it only works well for randomly distributed defects. With the introduction of declustering algorithms in defect management software, we can separate clustered defects from randomly distributed defects on the wafermap. Using only the randomly distributed defects, we can use Poisson statistics as a substitute for P(ki, di) in the corrected yield equation. The Poisson model has the form,

$$P(k_i, d_i) = e^{-k_i d_i}$$

where the average number of defects for the ith type is

$$d_i = \frac{\#\text{RandDefs} \cdot Pct_i}{\#\text{RandDie}}$$

Here, the treatment of the average number of defects per die for the Poisson model deviates from the negative binomial model. #RandDefs is the total number of random defects, and #RandDie is the number of die with defects as determined through a declustering process. We will assume all defective die contain some random defects, so #GoodDie and #BadDie in the corrected yield equation will not be treated differently.
The application of this model assumes the declustering process works well and is consistent. If the declustering program identifies too many random defects as clusters, the average number of defects per die would be too low, resulting in a higher estimate of yield. On the other hand, if the declustering program fails to identify clusters, the average number of defects per die would be too high, resulting in a lower estimate of yield. This is a limitation of this particular model.
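The sensitivity to declustering can be illustrated with a small Python sketch. It assumes the Poisson correction exp(−k·d) with d = #RandDefs × Pct / #RandDie, applied to the bad die only, and #DieScanned = #GoodDie + #BadDie. The baseline counts loosely follow the simulation later in this article; the alternative random-defect counts of 100 and 200 are invented.

```python
import math

def poisson_dly(good_die, bad_die, rand_defs, rand_die, pct_i, k_i=1.0):
    """Defect-limited yield for one defect type under the Poisson model."""
    d_i = rand_defs * pct_i / rand_die        # average defects of this type per defective die
    p_recover = math.exp(-k_i * d_i)          # chance a defective die survives this type
    return (good_die + bad_die * p_recover) / (good_die + bad_die)

# 18 clean die, 82 defective die, a type making up 23% of classified defects.
baseline = poisson_dly(18, 82, rand_defs=133, rand_die=82, pct_i=0.23)

# Too many random defects labeled as clusters: fewer random defects remain.
over_declustered = poisson_dly(18, 82, rand_defs=100, rand_die=82, pct_i=0.23)

# Clusters missed: clustered defects are counted as random.
under_declustered = poisson_dly(18, 82, rand_defs=200, rand_die=82, pct_i=0.23)

print(round(over_declustered, 3), round(baseline, 3), round(under_declustered, 3))
```

The ordering of the three results matches the argument above: removing too many defects as clusters raises the estimate, and missing clusters lowers it.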
Combining defects
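The models can be expressed in terms of combined groups of defects to determine the overall limited yield for the groups. Expressed in terms of defect groups, the Poisson distribution model would look like,

$$P(k_{group}, d_{group}) = e^{-k_{group} d_{group}}$$

where the average number of defects for the group would be,

$$d_{group} = \frac{\#\text{RandDefs}}{\#\text{RandDie}} \sum_{i=1}^{n} Pct_i$$

The kill ratio for the group is determined by using the group kill ratio equation given earlier (the sum of the corrected counts divided by the total number of classified defects in the group).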
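A short Python check of the grouping arithmetic follows. It is a sketch under two assumptions that the text implies but does not spell out here: the group's average defect count is the sum of its members' averages, and the group kill ratio is the count-weighted average of the member kill ratios. The kill ratios and classified counts are invented.

```python
import math

def poisson_p(k, d):
    """Poisson survival probability exp(-k*d)."""
    return math.exp(-k * d)

def group_terms(kill_ratios, class_counts, n_classified, rand_defs, rand_die):
    """Return (k_group, d_group) for a group of defect types."""
    group_count = sum(class_counts)
    # Count-weighted kill ratio of the group.
    k_group = sum(k * c for k, c in zip(kill_ratios, class_counts)) / group_count
    # Average defects per defective die attributed to the whole group.
    d_group = (rand_defs / rand_die) * (group_count / n_classified)
    return k_group, d_group

# Hypothetical group of two defect types out of 100 classified defects.
kill_ratios = [1.0, 0.5]
class_counts = [23, 7]
n_classified, rand_defs, rand_die = 100, 133, 82

k_group, d_group = group_terms(kill_ratios, class_counts, n_classified, rand_defs, rand_die)
p_group = poisson_p(k_group, d_group)

# Same quantity built from the individual types.
p_product = math.prod(
    poisson_p(k, (rand_defs / rand_die) * (c / n_classified))
    for k, c in zip(kill_ratios, class_counts)
)
print(round(p_group, 4), round(p_product, 4))
```

Under these assumptions the grouped Poisson term equals the product of the individual Poisson terms, because the weighted kill ratio times the summed defect average reduces to the sum of the individual ki·di products.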
Defect review sampling
The intended outcome of defect measurements is a defect-limited yield estimate. Yield is a measure of the number of good die, and defect-limited yield is a measure of the number of die not killed for specific defects. Therefore, it stands to reason that the defect review sampling must ensure:
1. As many defective die (die with defects) as possible are selected for defect sampling.
2. Each defective die has an equal probability of selection from the pool of defective die.
3. Each defect has an equal probability of selection from the pool of defects within each defective die.
4. A maximum number of selected defects per die should be set to minimize the effects of over sampling from defect clusters.
This "Die-Based Sampling" technique was introduced as part of a defect-limited yield methodology at IBM in the early 1990s [6].
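The four sampling rules above can be sketched in Python as follows. This is one possible reading of die-based sampling, not the original IBM implementation; the function name, the `(die_id, defect_id)` record layout, and the example wafer are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def die_based_sample(defects, n_die, max_per_die, seed=None):
    """Pick defects for review by sampling die first, then defects within each die.

    defects     : iterable of (die_id, defect_id) pairs
    n_die       : number of defective die to review (capped at the number available)
    max_per_die : cap on defects reviewed per die, to limit cluster over-sampling
    """
    rng = random.Random(seed)
    by_die = defaultdict(list)
    for die_id, defect_id in defects:
        by_die[die_id].append(defect_id)

    die_ids = sorted(by_die)
    # Every defective die has the same chance of selection.
    chosen_die = rng.sample(die_ids, min(n_die, len(die_ids)))

    sample = []
    for die_id in chosen_die:
        pool = by_die[die_id]
        # Every defect on the die has the same chance, up to the per-die cap.
        for defect_id in rng.sample(pool, min(max_per_die, len(pool))):
            sample.append((die_id, defect_id))
    return sample

# Hypothetical wafer: die 0 holds a 50-defect cluster, die 1 to 5 hold one defect each.
defects = [(0, i) for i in range(50)] + [(die, 0) for die in range(1, 6)]
sample = die_based_sample(defects, n_die=6, max_per_die=4, seed=0)
from_cluster = sum(1 for die_id, _ in sample if die_id == 0)
print(len(sample), from_cluster)
```

A purely random draw of nine defects from this wafer would come almost entirely from the cluster on die 0. With the per-die cap, the cluster contributes at most four reviewed defects and every other defective die is still represented.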
We can see how the yield estimate is affected by defect sampling techniques by using a simple simulation. We can create random distributions of "dots" on a grid map (for simplicity we’ll call this a distribution of defects on a wafer map), apply different sampling techniques, then compare the sampled outcome with the parent population. Because the intended result is a yield estimate, we compare the defect-limited yield of each defect distribution in the parent population to the estimated defect-limited yield of the sampled population. This technique was used as early as 1988 utilizing a fault-simulation program developed by C. Stapper at IBM [7, 8, 9].

Table 3

| Defect Type | Defect Count | % of Type | Defect Density (area/die=1) | % of Die With Defect (100 die) | DLY (% of die without defect) |
|---|---|---|---|---|---|
| A (random) | 100 | 20% | 1.0 | 67% | 33% |
| B (clustered) | 300 | 60% | 3.0 | 9% | 91% |
| C (random) | 50 | 10% | 0.5 | 40% | 60% |
| D (clustered) | 50 | 10% | 0.5 | 4% | 96% |
| Total | 500 | 100% | 5.0 | 82% | 18% |

Figure 5 shows the original distribution of defects on a simulated map of 100 die. There are four defect types: two random and two clustered. Table 3 shows the characterization of the original defect distribution. With the clustered defects removed, there are 133 random defects. Spread over the 82 die that contain defects, this gives an average of 1.62 defects per defective die.
Figure 6 shows the result of randomly selecting 100 defects across the wafer. The defects have been preferentially sampled in the clusters. This skews the yield estimations as seen in Table 4. (In this analysis, all defect types are assumed to be killers.) All of the numbers in parentheses refer to the numbers in Table 3. The normalized defect density for the ith defect type (Di-Norm) is determined by

$$D_{i\text{-}Norm} = \frac{\#\text{Defs}}{\text{AreaInsp}} \cdot \frac{d_i}{\#\text{Class}}$$

where #Defs is the number of defects detected, AreaInsp is the total area inspected, di is the number of defects for the ith defect type, and #Class is the number of defects classified.
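The normalization is simple arithmetic, sketched here in Python. It assumes the formula is the overall defect density multiplied by the type's share of the classified sample; the function name is hypothetical, and the inputs are the 500 detected defects, 100 die of unit area, and the selected counts from the random sample of 100 defects.

```python
def normalized_defect_density(n_defects, area_inspected, type_count, n_classified):
    """Overall defect density scaled by the type's share of classified defects."""
    if area_inspected <= 0 or n_classified <= 0:
        raise ValueError("area and classified count must be positive")
    return (n_defects / area_inspected) * (type_count / n_classified)

# Selected counts per type in the random sample of 100 classified defects.
selected = {"A": 23, "B": 59, "C": 7, "D": 11}

densities = {
    name: normalized_defect_density(500, 100, count, 100)
    for name, count in selected.items()
}
for name, value in densities.items():
    print(name, round(value, 2))
print("total:", round(sum(densities.values()), 2))
```

Types A, B, and C come out as 1.15, 2.95, and 0.35, and the four normalized densities add up to the overall density of 5 defects per die area.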

Table 4

| Defect Type | Selected Defect Count | % of Type | Normalized Defect Density (area/die=1) | % of Die With Defect | Neg Bin DLY (α=2) | Poisson DLY |
|---|---|---|---|---|---|---|
| A (random) | 23 | 23% (20%) | 1.15 (1.0) | 81% (33%) | 47% (33%) | 74% (33%) |
| B (clustered) | 59 | 59% (60%) | 2.95 (3.0) | 52% (91%) | 29% (91%) | 50% (91%) |
| C (random) | 7 | 7% (10%) | 0.35 (0.5) | 94% (60%) | 74% (60%) | 91% (60%) |
| D (clustered) | 11 | 11% (10%) | 0.5 (0.5) | 91% (96%) | 64% (96%) | 87% (95%) |
| Total | 100 | 100% | 5 | 18% | 6% | 29% |

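The DLY columns of Table 4 can be approximated with a short Python check. This is a sketch under stated assumptions, not necessarily the calculation used to build the table: it takes 18 defect-free and 82 defective die from Table 3, 500 detected and 133 random defects, a kill ratio of 1 for every type, α = 2, and it treats the Total row as the product of the per-type yields.

```python
import math

GOOD_DIE, BAD_DIE = 18, 82        # Table 3: 18% of the 100 die carry no defect
N_DEFECTS, N_RANDOM = 500, 133    # all detected defects / random defects after declustering
SELECTED = {"A": 23, "B": 59, "C": 7, "D": 11}   # Table 4 selected counts
N_CLASSIFIED = sum(SELECTED.values())

def corrected_yield(p_recover):
    """(#GoodDie + #BadDie * P) / #DieScanned, with #DieScanned = good + bad."""
    return (GOOD_DIE + BAD_DIE * p_recover) / (GOOD_DIE + BAD_DIE)

def neg_bin_dly(pct, k=1.0, alpha=2.0):
    d = N_DEFECTS * pct / BAD_DIE            # defects of this type per defective die
    return corrected_yield((1.0 + k * d / alpha) ** (-alpha))

def poisson_dly(pct, k=1.0):
    d = N_RANDOM * pct / BAD_DIE             # random defects of this type per defective die
    return corrected_yield(math.exp(-k * d))

neg_bin = {t: neg_bin_dly(c / N_CLASSIFIED) for t, c in SELECTED.items()}
poisson = {t: poisson_dly(c / N_CLASSIFIED) for t, c in SELECTED.items()}
neg_bin_total = math.prod(neg_bin.values())
poisson_total = math.prod(poisson.values())

for t in SELECTED:
    print(f"{t}: neg bin {neg_bin[t]:.1%}  Poisson {poisson[t]:.1%}")
print(f"Total: neg bin {neg_bin_total:.1%}  Poisson {poisson_total:.1%}")
```

Under these assumptions each computed value lands within about one percentage point of the corresponding entry in Table 4, which suggests the table follows the same structure even if the rounding or intermediate values differ slightly.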
If we sample by randomly selecting defective die, then randomly selecting m ≤ 4 defects per die as in Figure 7, we can minimize the effects of the clusters and get a better estimation of yield as seen in Table 5.
By applying this simulation technique, we can see that while the estimated yields are not entirely accurate (inherent limitations to defect measurement will limit accuracy, but consistency can be achieved), the sampling method shown in Figure 7 and summarized in Table 5 results in the correct relative ranking of the defects. Readers are encouraged to perform sampling simulations like this to ensure that any sampling methodology they choose to utilize is consistent with the intended outcome.
Summary
The defect-limited yield of defects must be determined near the
source of the defects, so that defect priorities can be set and potential yield
problems can be corrected immediately. The limitation of the defect detection
equipment and methods used to sample and classify defects places limits on the
ability to "accurately predict" the final yield of product. By sampling defects
in a manner that would cover many defective die, we can get an estimate of the
number of die affected by specific defects. By applying a classification
methodology that includes apparent effect on the product, we can apply estimated
kill ratios to the defect codes. This information may then be used to calculate
the limited yield for defects using a practical, methodical model. The limited
yields of the defects can then be used to estimate excursion conditions and
defect priorities.

Table 5

| Defect Type | Selected Defect Count | % of Type | Normalized Defect Density (area/die=1) | % of Die With Defect | Neg Bin DLY (α=2) | Poisson DLY |
|---|---|---|---|---|---|---|
| A (random) | 52 | 52% (20%) | 2.6 (1.0) | 57% (33%) | 30% (33%) | 53% (33%) |
| B (clustered) | 12 | 12% (60%) | 0.6 (3.0) | 90% (91%) | 62% (91%) | 86% (91%) |
| C (random) | 31 | 31% (10%) | 1.55 (0.5) | 75% (60%) | 39% (60%) | 68% (60%) |
| D (clustered) | 5 | 5% (10%) | 0.25 (0.5) | 96% (96%) | 80% (96%) | 94% (96%) |
| Total | 100 | 100% | 5 | 18% | 5% | 29% |

References:
1. Linda S. Milor, "Yield Modeling Based on In-Line Scanner Defect Sizing and a Circuit’s Critical Area", IEEE Transactions on Semiconductor Manufacturing, Vol. 12, No. 1, pp. 26-35, February 1999.
2. C. Stapper, "Modeling of Integrated Circuit Defect Sensitivities", IBM Journal of Research and Development, Vol. 27, No. 6, pp. 95-102, May 1983.
3. C. Stapper, "Modeling of Defects in Integrated Circuit Photolithographic Patterns", IBM Journal of Research and Development, Vol. 28, No. 4, pp. 461-475, July 1984.
4. C. Stapper, F. Armstrong, and K. Saji, "Integrated Circuit Yield Statistics", Proceedings of the IEEE, Vol. 71, No. 4, pp. 453-468, April 1983.
5. ibid
6. S. L. Riley, "Optical Inspection of Wafers Using Large-Area Defect Detection and Sampling", The IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems, pp. 12-21, November 1992.
7. C. Stapper, "Statistics Associated With Spatial Fault Simulation Used for Evaluating Integrated Circuit Yield Enhancement", IEEE Transactions on Computer-Aided Design, Vol. 10, No. 3, pp. 399-406, March 1991.
8. C. Stapper, "Simulation of Spatial Fault Distributions for Integrated Circuit Yield Estimations", IEEE Transactions on Computer-Aided Design, Vol. 8, No. 12, pp. 1314-1318, December 1989.
9. C. Stapper, "Fault-Simulation Programs for Integrated Circuit Yield Estimations", IBM Journal of Research and Development, Vol. 33, No. 6, pp. 647-652, December 1989.