A Simplified Approach to Die-Based Yield Analysis
There is a production-worthy method for determining the impact of yield shifts. It applies to multiple product and die sizes.
Stuart Riley, Texas Instruments Inc., Dallas, Texas -- Semiconductor International, 8/1/2007
Many fabs do not use a die-based approach to monitoring yield impact, considering it too complicated to accommodate multiple die layouts and mixed defect distributions. Instead, fab facilities use a defect density metric to monitor shifts of defects over time. But defect density is difficult to translate to defect-limited yield, because it does not provide information about the number of defective die or the number of defects in the die.
Alternatively, a die-based approach is the best way to monitor the yield impact of observable defects. In other words, engineers want to track defect-limited yield, which is modulated by the number of defective die, the number of defects within the die and the probability that a defect will result in yield loss (kill ratio). However, this strategy has been avoided by many fabs because it has been considered too complicated to apply in situations where devices with multiple die layouts on the wafer are processed in the line. We will show how a die-based approach can be applied to multiple devices on wafers with mixed distributions.
Yield impact of defects
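If we assumed all defects were killers (kill ratio = 1), all defective die would fail. The defect-limited yield would simply be the percent of good die (die without defects):
$$Y = \frac{\text{good die}}{\text{total die}} = \frac{\text{total die} - \text{defective die}}{\text{total die}}$$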
But we know that not all defects have the same probability of causing a failure. We must apply kill ratios to defects to get a realistic estimate of yield loss. Once we apply the kill ratios, we must also consider the number of defects per die, because the probability that a die fails depends on how many defects it contains. For example, if we assume defects have a kill ratio >0 and <1, a die with one defect will have a better chance of yielding than a die with 100 defects.
Die with defects may be “recovered” if there is some possibility that they may yield for a given number of defects. A probability density function can be applied to the defective die to find the number of recovered die, that is, defective die whose defects do not cause a failure [1, 2]:
$$\text{recovered die} = \text{defective die} \times P(k_i d)$$
The term “d” is the average number of defects per defective die:
$$d = \frac{\text{total defects}}{\text{defective die}}$$
The term “ki” is the kill ratio for a specific defect type or size bin. The overall kill ratio, or the kill ratio of a specific defect type, groups of types or size bins, can be found by weighted averaging [1, 2]:
$$K = \frac{1}{N} \sum_i n_i k_i$$
where “N” is the total number of defects classified. This term could also apply to the total population, if the fab is using size bins. The term “ni” is the number of defects classified for a specific defect type, and “ki” is the kill ratio for that type. Note that this term also includes the fraction of defects per type. If size bins are used, substitute size bins for the types.
As an example of how to find the overall kill ratio, consider three defect types:
- Defect type “A”: ki=0.5 and ni=40
- Defect type “B”: ki=1 and ni=10
- Defect type “C”: ki=0 and ni=50
The overall kill ratio for the combined group of defects is:
$$K = \frac{(0.5)(40) + (1)(10) + (0)(50)}{40 + 10 + 50} = \frac{30}{100} = 0.3$$
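The same weighted average can be checked with a few lines of Python. This is only a sketch: the function name and the dictionary layout are assumptions made for illustration, and the numbers are the three defect types from the example above.

```python
def overall_kill_ratio(defect_types):
    """Weighted-average kill ratio: sum(n_i * k_i) / N.

    defect_types maps a type (or size-bin) label to a (k_i, n_i) pair,
    where k_i is the kill ratio and n_i the number of classified defects.
    """
    total_defects = sum(n for _, n in defect_types.values())
    if total_defects == 0:
        raise ValueError("no classified defects")
    weighted_killers = sum(k * n for k, n in defect_types.values())
    return weighted_killers / total_defects


# Defect types A, B and C from the example in the text.
example = {"A": (0.5, 40), "B": (1.0, 10), "C": (0.0, 50)}
print(overall_kill_ratio(example))  # 0.3
```

If size bins are used instead of classifications, the same function applies with one entry per bin.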
Applying the probability density function to the percent of good die, the yield impact for a specific defect type, groups of types or size bins can be expressed as:
A suitable probability density function should use readily available inspection data, and it should take the number of defects per die and the kill ratios into account. The Poisson distribution function works best for this purpose:
$$P(k_i d) = e^{-k_i d}$$
where the term “ki × d” is the average number of killer defects per die [3].
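As a minimal sketch of how the Poisson term might be used with inspection data, the function below assumes that the defect-limited yield is the defect-free die plus the recovered die, divided by the total die. That combined form, the function name and the parameter names are assumptions for illustration rather than the article's exact expression.

```python
import math


def poisson_limited_yield(total_die, defective_die, total_defects, kill_ratio):
    """Estimate defect-limited yield for one wafer and one defect group.

    Assumed form: (good die + defective die * exp(-K * d)) / total die,
    where d is the average number of defects per defective die.
    """
    if defective_die == 0:
        return 1.0
    d = total_defects / defective_die            # defects per defective die
    recovered = defective_die * math.exp(-kill_ratio * d)
    good = total_die - defective_die             # die with no defects
    return (good + recovered) / total_die


# Hypothetical wafer: 500 die, 50 of them defective, 100 defects in total.
for k in (0.1, 0.3, 1.0):
    print(k, round(poisson_limited_yield(500, 50, 100, k), 4))
```

Under this assumed form, a kill ratio of 1 still leaves a small recovered fraction, exp(-d), so the estimate approaches the percent of good die as the number of defects per die grows rather than matching it exactly.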
Although the Poisson distribution function is simple to use, it does not work well with mixed distributions [3]. If a wafer has a mix of random and clustered defects, the defects in the clustered die tend to make the average number of defects per die too large. This will make the yield estimation lower than it should be. For this reason, the Poisson distribution is not widely used in yield modeling. But alternative probability density functions are too complicated to use for our purposes. For example, while the negative binomial model handles mixed distributions well [3]:
$$P(k_i d) = \left(1 + \frac{k_i d}{\alpha}\right)^{-\alpha}$$
it requires that the user know the cluster coefficient (α) to estimate the defect-limited yields. We cannot easily derive this number from inspection data.
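To see why the cluster coefficient matters, the sketch below compares the two survival terms for a fixed average number of killer defects per die. The α values and the value 0.6 are arbitrary illustrations, not fitted data, and the function names are hypothetical.

```python
import math


def poisson_survival(killer_defects_per_die):
    """Probability that a die survives under the Poisson model."""
    return math.exp(-killer_defects_per_die)


def negative_binomial_survival(killer_defects_per_die, alpha):
    """Probability that a die survives under the negative binomial model."""
    return (1.0 + killer_defects_per_die / alpha) ** (-alpha)


lam = 0.6  # average number of killer defects per die (illustrative)
for alpha in (0.5, 2.0, 1000.0):
    print("alpha", alpha, round(negative_binomial_survival(lam, alpha), 4))
print("poisson", round(poisson_survival(lam), 4))
```

Small α (strong clustering) gives a noticeably higher survival probability than Poisson, while large α converges to the Poisson result, so the estimate depends heavily on a parameter that inspection data does not supply directly.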
Alternatively, if we isolate die with clusters (die with significantly more defects compared with the population of defective die) from the other defective die, we can treat the clustered die as die with many random defects. The Poisson distribution function can then be applied to each group of die independently, which allows us to use it with mixed distributions. We can identify clustered die by using a technique called die-based clustering, which uses the number of defects per die to find outlier die.
Die-based clustering
Typical defect clustering algorithms attempt to group defects belonging to regions of higher density without regard to die borders. This complicates the die-based yield impact approach because clustered defects can cross die boundaries. So, we identify defect clusters as die that contain significantly more defects — or outliers — when compared with the overall population of defective die.
The outlier die are defined as defective die with defect counts exceeding M × median, having at least N defects and at least Y defects more than the median.
We then adjust these numbers to optimize the clustering. For example, if we assume M = 3 and N = Y = 10, we get the defect clustering shown in Figure 1. Once the outlier die are properly identified, we can separate the random and clustered die to apply the Poisson distribution function to each group separately. Now all we need to do is apply the proper expression that can join the distributions to determine the impact of defects on yield.
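The outlier rule can be sketched in a few lines of Python. The function name, the dictionary keyed by die coordinates and the sample counts are assumptions for illustration; the three thresholds follow the M, N and Y definitions given above.

```python
from statistics import median


def find_clustered_die(defects_per_die, m=3, n=10, y=10):
    """Return the set of die whose defect count is an outlier.

    defects_per_die maps a die identifier to its defect count.
    A die is clustered if its count exceeds m * median, is at least n,
    and is at least y above the median of the defective die.
    """
    counts = {die: c for die, c in defects_per_die.items() if c > 0}
    if not counts:
        return set()
    med = median(counts.values())
    return {
        die
        for die, c in counts.items()
        if c > m * med and c >= n and c - med >= y
    }


# Hypothetical wafer: die keyed by (column, row), median count is 3.
counts = {(0, 0): 1, (0, 1): 2, (1, 0): 1, (1, 1): 3,
          (2, 2): 25, (3, 1): 12, (3, 2): 7}
print(find_clustered_die(counts))  # {(2, 2)}
```

In this example the die with 12 defects passes the first two tests but is only 9 above the median, so it stays in the random group. This shows how the Y threshold guards against flagging die when the median is small.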
Yield impact estimations of mixed distributions
Using die-based clustering, we can partition die into multiple cluster groups using as few groups as possible. Here we recommend just two groups: random and clustered. If we partition the die into too many groups, it could needlessly complicate the estimations. Keeping this simple may result in slightly incorrect estimations, but these are the compromises that must be made when dealing with this type of data. That is why this approach should be treated as a way to determine shifts in yield impact caused by defects, and not be used for yield predictions.
With the die now partitioned into clustered and random groups, the yield impact for mixed distributions can be expressed as:
where the number of good (recovered) random die is:
and the number of good clustered die is:
which splits the terms in the exponent into clustered and random parts. If you treat the defect types and their associated kill ratios differently between the random and clustered parts, those differences can be handled by Equation 10. Each clustered die can contain some level of random defects, so a correction for the random part of the clustered die can be applied. But if we assume the defect types are the same for random and clustered die, this expression reduces to:
and Kr=Kc.
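As a rough illustration of the two-group estimate, the sketch below assumes that the Poisson term is applied to each group using that group's own average number of defects per die, and that the overall yield is the defect-free die plus the surviving random and clustered die, divided by the total die. The exact form, the function names and the sample counts are assumptions made for this sketch.

```python
import math


def group_survivors(defect_counts, kill_ratio):
    """Expected number of die in one group that survive their defects."""
    if not defect_counts:
        return 0.0
    avg_defects = sum(defect_counts) / len(defect_counts)
    return len(defect_counts) * math.exp(-kill_ratio * avg_defects)


def mixed_limited_yield(total_die, random_counts, clustered_counts,
                        k_random, k_clustered=None):
    """Two-group yield estimate; k_clustered defaults to k_random (Kr = Kc)."""
    if k_clustered is None:
        k_clustered = k_random
    defect_free = total_die - len(random_counts) - len(clustered_counts)
    good_random = group_survivors(random_counts, k_random)
    good_clustered = group_survivors(clustered_counts, k_clustered)
    return (defect_free + good_random + good_clustered) / total_die


# Hypothetical wafer: 100 die, four random defective die, one clustered die.
split = mixed_limited_yield(100, [1, 2, 1, 3], [25], k_random=0.3)
pooled = mixed_limited_yield(100, [1, 2, 1, 3, 25], [], k_random=0.3)
print(round(split, 4), round(pooled, 4))
```

Pooling all five defective die into one group inflates the average defects per die and gives a lower estimate than the split calculation, which is the bias that separating the clustered die is meant to remove.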
To find the yield impact of defects as if there were no clusters on the wafers, we can use the following expression for random defects:
where the average number of random defects per die is:
This time, we use the total defective die because we assume the clustered die have a random part. By comparing the yield impact for random distributions with the overall yield estimate, the difference provides an estimate of the impact of clusters on the wafer.
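A sketch of the random-only estimate follows. It assumes that the average number of random defects per die is taken from the non-clustered defective die and then applied to every defective die, clustered die included. That reading, and all names below, are assumptions for illustration.

```python
import math


def random_only_yield(total_die, random_counts, n_clustered_die, k_random):
    """Yield estimate as if the clustered die carried only random defects.

    random_counts holds the defect counts of the non-clustered defective die.
    If there are no such die, the random part is taken as zero defects.
    """
    defective = len(random_counts) + n_clustered_die
    if random_counts:
        d_random = sum(random_counts) / len(random_counts)
    else:
        d_random = 0.0
    survivors = defective * math.exp(-k_random * d_random)
    return (total_die - defective + survivors) / total_die


def cluster_impact(random_only, overall):
    """Yield attributed to clusters: random-only estimate minus overall."""
    return random_only - overall


# Hypothetical wafer: 100 die, four random defective die, one clustered die.
y_random = random_only_yield(100, [1, 2, 1, 3], 1, k_random=0.3)
y_overall = 0.9737  # overall two-group estimate, assumed known for this wafer
print(round(y_random, 4), round(cluster_impact(y_random, y_overall), 4))
```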
The limited-yield approach (Fig. 2) shows the difference between maps with different numbers of defects per die. All wafers have the same die layout, using the normalization method described later. The defect-limited yields for several wafers are plotted for a range of kill ratios, from Kr=0.1 (upper bound) to Kr=1.0 (lower bound). In practice, most wafers will have kill ratios between these values, so their estimated defect-limited yields will fall between the upper and lower bounds, as plotted on the chart.
Wafers 1 and 2 have almost the same limited yield with Kr=1 (percent good die), but a much different limited yield with Kr=0.1, because Wafer 2 has significantly more defects per die than Wafer 1. With this approach, we can see the differences in the defect-limited yields between wafers with different numbers of defects per die, even if the percentages of good die are virtually the same.
Wafer 3 has a moderate level of defects. If the mix of defect types is the same between Wafers 1 and 3, the relative estimated yield difference between the two wafers is maintained. The yield estimation for Wafer 3 will be higher than the estimation for Wafer 1 (Y1<Y3). If the mix of defect types is such that the overall kill ratio for Wafer 3 is higher than that of Wafer 1, the yield estimation for Wafer 3 could be lower than the estimation for Wafer 1 (Y1>Y3).
It is this combination of defective die, the average number of defects in the die, and the associated kill ratios per defect type that permits better separation of potential yield-limiting defect issues, as compared with the defect-density approach.
Tie-in with inspection
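An easy way to determine the cumulative effect of defects on yield is to multiply the yields per inspection point. For M inspection points, we have [3]:
$$Y_{\text{total}} = \prod_{j=1}^{M} Y_j$$
where Y_j is the defect-limited yield estimated at inspection point j.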
If specific wafers skip inspection steps, you can interpolate the estimated yield for that step using recent inspection data for other wafers at that inspection point around that same timeframe.
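The cumulative estimate, including a stand-in for skipped inspections, might be sketched as follows. A simple mean of recent wafers at the skipped step is used here in place of the interpolation described above. That choice, the step names and the function name are assumptions for illustration.

```python
import math
from statistics import mean


def cumulative_yield(wafer_yields, recent_yields_by_step):
    """Multiply per-inspection-point yields for one wafer.

    wafer_yields maps a step name to the wafer's estimated yield at that
    step, or None if the wafer skipped the inspection. Skipped steps are
    filled with the mean of recent yields from other wafers at that step.
    """
    per_step = []
    for step, y in wafer_yields.items():
        if y is None:
            recent = recent_yields_by_step.get(step)
            if not recent:
                raise ValueError(f"no recent data to estimate step {step!r}")
            y = mean(recent)
        per_step.append(y)
    return math.prod(per_step)


# Hypothetical wafer that skipped the second of three inspection points.
wafer = {"inspect_1": 0.98, "inspect_2": None, "inspect_3": 0.95}
recent = {"inspect_2": [0.96, 0.98]}
print(round(cumulative_yield(wafer, recent), 4))
```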
An alternative but more complicated method for determining the cumulative effect of defects is to create composites of the wafers using all the inspection data to that point. The overall kill ratio up to that point would be determined using the combination of layers and defect types. As stated before, the added layer of complication may not be necessary, because this approach is an estimation of shifts in yield over time, and is not meant to be used for yield prediction.
This approach can be applied to any set of defect data. If defect size bins are used instead of classifications, the expressions will still work. For size bins, just partition the defect data by “i” size bins, then substitute for the terms in Equation 4. If classifications are used, we will need to apply review sampling. How the defects are sampled will affect the outcome of our estimations. This approach works best if the defect review sampling is die-based.
Die-based review sampling
This die-based approach works best using a Pareto of sampled defect types. Wafer-based random sampling of defects, without regard to die, can bias the Pareto to the clusters, which can lead to incorrect estimations about which defects affect the most die.
With die-based sampling, each defective die is randomly selected, and the defects within the selected die are randomly selected [4]. Each defective die, as well as the defects within the die, has an equal chance of being selected. But if sampling by defect size is important, you can randomly select a die and then a defect by size. In either case, random selection of the die is essential.
This sampling method can be adjusted to ensure a sufficient number of clustered die are sampled by checking to make sure a minimum number of clustered die have been selected (Fig. 3). The maximum number of defects per wafer and the maximum number of defects per die for the initial round of defect sampling can be adjusted. The minimum number of clustered die can be set for a secondary check to ensure a minimum number of clustered die have been sampled.
Figure 3. Die-based sampling with some oversampling to ensure a minimum number of clustered die are sampled.
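The two-stage sampling described above might look like the following sketch. The function name, the parameter defaults and the data layout (a dictionary of defect identifiers per die) are assumptions for illustration, and the secondary check is allowed to exceed the per-wafer maximum as a form of oversampling.

```python
import random


def die_based_sample(defects_by_die, clustered_die, max_per_wafer=50,
                     max_per_die=3, min_clustered_die=2, seed=None):
    """Pick defects for review by sampling die first, then defects per die."""
    rng = random.Random(seed)
    defective = [die for die, defects in defects_by_die.items() if defects]
    rng.shuffle(defective)                      # random die order

    def pick(die, limit):
        defects = defects_by_die[die]
        return rng.sample(defects, min(limit, len(defects)))

    # Initial round: walk the shuffled die until the wafer budget is used.
    sample, taken = {}, 0
    for die in defective:
        if taken >= max_per_wafer:
            break
        sample[die] = pick(die, min(max_per_die, max_per_wafer - taken))
        taken += len(sample[die])

    # Secondary check: add clustered die until the minimum is reached.
    have = sum(1 for die in sample if die in clustered_die)
    for die in defective:
        if have >= min_clustered_die:
            break
        if die in clustered_die and die not in sample:
            sample[die] = pick(die, max_per_die)
            have += 1
    return sample


# Hypothetical wafer: 20 random die with 2 defects, 2 clustered die with 30.
defects_by_die = {(i, 0): [f"d{i}-a", f"d{i}-b"] for i in range(20)}
defects_by_die[(0, 1)] = [f"c0-{j}" for j in range(30)]
defects_by_die[(1, 1)] = [f"c1-{j}" for j in range(30)]
clustered = {(0, 1), (1, 1)}
picked = die_based_sample(defects_by_die, clustered,
                          max_per_wafer=6, max_per_die=2, seed=1)
print(sorted(picked))
```

Because die are drawn before defects, a die with 30 defects contributes no more review samples than a die with two, which keeps the Pareto from being dominated by clusters.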
Die normalization for multiple devices
Most fabs run many devices that have different numbers of die on the wafers (die layout), making the die-based approach less desirable than the defect-density approach. To accommodate the differences in die layout, each device would have to be tracked separately. If we normalize the die layout (Fig. 4), we can simplify the analysis so that the defect data can be tracked together for many devices. We can simply reconfigure die sizes to normalize the die layout over multiple devices. The normalization would be applied after the data has been collected in the fab. This permits analysis by an individual device or in combination with other devices.
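One way the normalization might be carried out, assuming defect positions are available in wafer coordinates, is to re-bin the defects onto a common virtual die grid after collection. The grid pitch, the coordinate convention (origin at the wafer center, units of millimeters) and the function name are all assumptions for this sketch.

```python
import math
from collections import Counter


def normalize_to_virtual_die(defects_xy, die_width, die_height):
    """Count defects per virtual die on a common grid.

    defects_xy is an iterable of (x, y) wafer coordinates. Each defect is
    assigned to the virtual die whose (column, row) cell contains it.
    """
    counts = Counter()
    for x, y in defects_xy:
        column = math.floor(x / die_width)
        row = math.floor(y / die_height)
        counts[(column, row)] += 1
    return counts


# Hypothetical defects re-binned onto a 10 mm x 10 mm virtual die.
defects = [(1.0, 1.0), (2.5, 3.0), (-0.5, 4.0), (12.0, 1.0)]
print(dict(normalize_to_virtual_die(defects, 10.0, 10.0)))
```

The resulting per-die counts have the same shape regardless of the device's real die size, so they can feed the same clustering and yield estimates for every device. The total number of virtual die on the wafer would still have to be defined once for the common layout.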
Conclusions
Because a die is the unit product for semiconductor fabs, defect management strategies should include a die-based approach to monitoring the defect-limited yield impact of observable defects. Many of the technical issues involved with this approach can be overcome by defining clustered die or using die normalization. Poisson statistics can be used to estimate the limited yield if clustered die are treated as localized regions with significantly more random defects, compared with the rest of the wafer.
Die-based yield analysis allows the fab to track the yield impact of defects using a combination of defective die, the average number of defects in the die, and the associated kill ratios per defect type. It also permits the fab to determine the cumulative impact of defects as wafers move through the line. Yield-related issues can be tracked and prioritized with actions directed more effectively than is possible with the typical defect-density approach.
Author Information
Stuart Riley is a member of the Group Technical Staff at Texas Instruments in Dallas. He has been in the semiconductor industry since 1981, working in defect management since 1987. Over the years, he has written several papers on defect-limited yield estimation methodologies.
References