Yield Analysis Based on Fault Probability and Kill Ratio
David Hu and Milo Koretsky, Oregon State University, Corvallis, Ore. Manu Rehani and David Abercrombie, LSI Logic, Gresham, Ore. -- Semiconductor International, 11/1/2001
To accurately predict fab yield, values for certain critical parameters must be established. Typically, yield is divided into a systematic component (YS) and a random component (YR). Systematic yield loss results from process variables not meeting specifications, such as photomask misalignment and under-etching, and from design faults, such as a design not meeting minimum spacing rules. Because the main focus of the present project is establishing the components of the YR, no attempt has been made to analyze YS any more than to establish its value for a particular technology or device.
Fault probability
Random defects are caused by random events, such as particle deposition, that occur during the fabrication process and cause the formation of features on the die not intended in the design layout. If random defects occur in a critical region, or critical area, of the die, they will cause opens, shorts or some other type of fatal defect, and cause the chip to fail.1 Each type of random defect has a probability of causing a fault associated with it. This fault probability (FP) is simply the ratio of the critical area associated with that particular defect type to the total area of the chip. FP is a function of parameters associated with both the defect itself and the layout of the die: the size and type of the defect, and the circuit's geometry.1
One way to predict yield is to establish the values of the FPs associated with each defect type. We examine and compare two ways of establishing FPs — by analyzing existing bin and defect data and inferring the FPs through models of the defect distributions; and by simulating the defect distribution using Monte Carlo techniques and determining the FPs from the number of failed circuits on the layout.
Modeling defect-limited yield
In actual manufacturing, there is usually more than one defect type. Assuming defect types occur independently, the overall yield is:
$$Y = Y_S \prod_i Y_{DL_i} \qquad (1)$$
where YDLi, i=1,2,3,... are the defect-limited yields for defect types i.2 For example, if no other defects are present other than defect type 1, and the systematic yield is 1, then Y=YDL1; i.e., the resulting yield would be equal to the defect-limited yield for defect type 1, and is limited by the yield loss due to this defect.
To calculate YDLi, we must assume a distribution model for the defect. All yield equations result from different assumptions of the distribution of the defect density. Using a simple binomial probability model, the probability of finding n fatal defects in a unit area (e.g., a chip area) of a region of constant defect density D is given by:
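$$P(n) = \frac{\lambda^n e^{-\lambda}}{n!} \qquad (2)$$
where
$$\lambda = A_{chip} \cdot D \cdot FP \qquad (3)$$
and $A_{chip}$ is the chip area.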
Because the FP may be defined as the portion of defects that are fatal, λ represents the average number of fatal defects per chip.3 Equation 2 is the simplest distribution to assume, known as the Poisson distribution. Defining the YR for a particular type of defect, YDLi, as the probability of zero fatal defects, n=0, Equation 2 gives,
$$Y_{DL_i} = P(0) = e^{-\lambda} \qquad (4)$$
As the die area is a known given, the determination of the FP and the defect density, D, is the primary task in estimating the YR.
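As a rough illustration of Equations 1 through 4, the sketch below computes a Poisson defect-limited yield for each defect type and multiplies the results with the systematic yield. It assumes that λ is the product of die area, defect density and FP; the function names and all numeric inputs are hypothetical and are not taken from the fab data.

```python
import math


def poisson_limited_yield(die_area_cm2, defect_density_per_cm2, fault_probability):
    """Probability of zero fatal defects on a die under a Poisson model."""
    # Average number of fatal defects per die (assumed form of Equation 3).
    lam = die_area_cm2 * defect_density_per_cm2 * fault_probability
    return math.exp(-lam)


def overall_yield(systematic_yield, limited_yields):
    """Systematic yield times the product of the defect-limited yields."""
    return systematic_yield * math.prod(limited_yields)


# Hypothetical inputs, chosen only for illustration.
limited = [
    poisson_limited_yield(1.0, 0.5, 0.02),
    poisson_limited_yield(1.0, 0.2, 0.10),
]
print(overall_yield(0.95, limited))
```

Because the defect types are assumed independent, adding another defect type only adds one more factor to the product.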
In the literature, the term kill ratio (KR) is sometimes used interchangeably with the term fault probability, although in this paper it is given a different definition from FP. Like FPs, KRs can be used to estimate the defect-limited yield. We estimated both factors.
Calculations were based on three months of defect and bin data. In this sample set, more than 100,000 random defects were detected by optical inspection tools placed after certain process steps, such as etching or deposition, on the fabrication line. Defects are classified by the process step immediately after which they were detected and by their size, which is binned. The size bins are categorized from Sz1 to Sz10, with Sz1 representing all defect sizes from 0 to <1 µm, Sz2 from 1 to <2 µm, etc., up to Sz10, representing those defect sizes 9 µm and greater. After classifying all the defects based on the above method, the FPs and KRs were calculated.
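The classification just described can be sketched in a few lines. The helper names below are hypothetical, and the label format (step name followed by the size bin) is an assumption made for illustration.

```python
import math


def size_bin(defect_size_um):
    """Map a defect size in micrometers to a size-bin label Sz1..Sz10."""
    if defect_size_um < 0:
        raise ValueError("defect size must be non-negative")
    # Sz1 covers 0 to <1 um, Sz2 covers 1 to <2 um, ..., Sz10 covers 9 um and up.
    index = min(int(math.floor(defect_size_um)) + 1, 10)
    return f"Sz{index}"


def defect_type(step_name, defect_size_um):
    """Combine the inspection step and the size bin into one defect-type label."""
    return f"{step_name}{size_bin(defect_size_um)}"


print(defect_type("StepA", 0.4))
print(defect_type("StepA", 12.0))
```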
FPs from bin, defect data
We used two methods to estimate the FPs from the bin and defect data. One was based on isolating the die with only one defect, and counting the total number of die with a particular defect and the number of failed die for the same particular defect.4 For example, if N die with only one defect type, StepASizeX, are counted, and NF of them have failing bin numbers, the FP would be:
$$FP_{StepASizeX} = \frac{N_F}{N} \qquad (5)$$
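A minimal sketch of this counting procedure follows. The record layout (a list of defect labels per die plus a pass/fail flag) is an assumption made for illustration and is not the format of the fab database.

```python
from collections import Counter


def fault_probabilities(dies):
    """Estimate FP = N_F / N per defect type from die carrying exactly one defect.

    dies: iterable of (defect_labels, passed) tuples, one label per defect.
    """
    n_total = Counter()
    n_failed = Counter()
    for defect_labels, passed in dies:
        if len(defect_labels) != 1:
            continue  # skip clean die and die with more than one defect
        label = defect_labels[0]
        n_total[label] += 1
        if not passed:
            n_failed[label] += 1
    return {label: n_failed[label] / n_total[label] for label in n_total}


# Hypothetical die records.
sample = [
    (["StepASz1"], True),
    (["StepASz1"], False),
    (["StepASz1"], True),
    (["StepASz1"], True),
    (["StepASz1", "StepBSz2"], False),
    ([], True),
]
print(fault_probabilities(sample))
```

Here one of the four single-defect die fails, so the estimate for StepASz1 is 0.25; the die with two defects is ignored.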
KRs were calculated with the same bin data and defect data as those used to calculate the FPs. A KR can be defined as the ratio of the increased chance, or probability, of a die failing due to a particular defect type A present on it to the probability that the die will not fail if that particular defect A is not present.5
$$KR_A = \frac{P\{R \mid A\} - P\{R \mid A^C\}}{P\{G \mid A^C\}} \qquad (6)$$
where R represents the event of a die being rejected, or failing, G is the event that the die is good, or passes electrical testing, A is the event that defect type A is present on the die, and $A^C$ is the event that defect type A is not present on the die. If one were to define the FPA in the same terms used to define KRA, we would have:
$$FP_A = P\{R \mid A_{onlyone}\} \qquad (7)$$
where R is the event of a die being rejected, or failing, and $A_{onlyone}$ is the event that a die has only one defect, of type A. In other words, FPA can be defined as the probability of a die failing when only one defect, of type A, is present. It is straightforward to show that FP and KR are not, in fact, the same, and estimate different probabilities. For example, if the failure rate is zero when defect type A is not present, $P\{R \mid A^C\}=0$ and $P\{G \mid A^C\}=1$, Equation 6 becomes,
$$KR_A = P\{R \mid A\} \qquad (8)$$
Comparing Equations 7 and 8, it is clear that the KR, in fact, will overestimate the FP, because the probability of failure for a die with at least one defect must be greater than that for a die with just one defect. However, if the defect density of defect type A is low, most of the die that have any defects on them would have only one defect of type A, especially if the distribution of defects on the wafer is completely random, i.e., if it has a binomial or Poisson distribution. In that case, Equation 8 would reduce to Equation 7. Therefore, if the defect density is not too high, the KR for a particular defect type may offer a good approximation to the FP.
In addition to serving as upper limits to the FPs, KRs are of value because they can be used in estimating defect-limited yield, from which the FP may be inferred. Based on Equation 6, one can show that KR can be estimated by the following:5
$$KR_A = 1 - \frac{T_{GA} / T_A}{(T_G - T_{GA}) / (T - T_A)} \qquad (9)$$
where TGA is the total number of good die with defect type A, TA is the total number of die with defect type A, TG is the total number of good die inspected, and T is the total number of die inspected for defect type A. It can be shown from basic probability theory that the limited yield for defect type A can be computed as follows:5
$$Y_{DL_A} = 1 - KR_A \, P(A) \qquad (10)$$
where P(A) is the probability that a die will have at least one defect of type A on it. From this expression, it can be further shown that:5
$$Y_{DL_A} = 1 - KR_A \, \frac{T_A}{T} \qquad (11)$$
If we know the spatial probability distribution function of defects on a wafer, pA, we can estimate TA as,
$$T_A = T \left[ 1 - p_A(0) \right] \qquad (12)$$
where pA (0) is the probability a die will not have any defects of type A on it. Since the negative binomial equation best describes the actual distribution of random defects on a wafer in the fab setting,3 we should use it to calculate pA (0). Substituting n=0 into the negative binomial equation,
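$$p_A(n) = \frac{\Gamma(n + \alpha)}{n! \, \Gamma(\alpha)} \, \frac{(DD_A / \alpha)^n}{(1 + DD_A / \alpha)^{n + \alpha}} \qquad (13)$$
where $\alpha$ is the cluster factor that determines how clustered the defects are, we have,
$$p_A(0) = \left( 1 + \frac{DD_A}{\alpha} \right)^{-\alpha} \qquad (14)$$
where $DD_A$ is the average number of defects of type A per die. Substituting Equation 14 into Equation 12 and rearranging, we have,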
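$$\frac{T_A}{T} = 1 - \left( 1 + \frac{DD_A}{\alpha} \right)^{-\alpha} \qquad (15)$$
Equation 15 is a nonlinear equation for the cluster factor $\alpha$ that can easily be solved by numerical methods, such as a bisection search or the Newton-Raphson method. Thus, we can solve for $\alpha$ once we know T, TA and the average number of defects per die, $DD_A$, for each defect of type A. With the aid of the negative binomial equation, we calculate the defect-limited yield as
$$Y_{DL_A} = \left( 1 + \frac{DD_A \cdot FP_A}{\alpha} \right)^{-\alpha} \qquad (16)$$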
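The bisection search mentioned above might look like the following sketch. It assumes the negative binomial form in which the fraction of die with at least one type-A defect equals 1 - (1 + DD/α)^(-α), and it bisects on a logarithmic scale because the cluster factor can span several orders of magnitude. The search bounds and the input counts are hypothetical.

```python
import math


def prob_no_defect(defects_per_die, alpha):
    """Negative binomial probability that a die carries zero defects."""
    return math.exp(-alpha * math.log1p(defects_per_die / alpha))


def solve_cluster_factor(t, t_a, defects_per_die, lo=1e-6, hi=1e6):
    """Bisection in log space for the alpha that reproduces t_a / t."""
    target = t_a / t

    def residual(alpha):
        return (1.0 - prob_no_defect(defects_per_die, alpha)) - target

    # The residual increases with alpha, so a root requires a sign change.
    if residual(lo) > 0.0 or residual(hi) < 0.0:
        raise ValueError("no cluster factor in [lo, hi] matches t_a / t")
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)


def negative_binomial_limited_yield(defects_per_die, fault_probability, alpha):
    """Defect-limited yield under the negative binomial model."""
    return math.exp(-alpha * math.log1p(defects_per_die * fault_probability / alpha))


# Hypothetical inputs: 10,000 die inspected, 3,500 with at least one type-A
# defect, and 0.6 type-A defects per die on average.
alpha = solve_cluster_factor(10_000, 3_500, 0.6)
print(alpha, negative_binomial_limited_yield(0.6, 0.02, alpha))
```

A solution exists only when the observed fraction of defective die is below the Poisson limit 1 - exp(-DD); the function raises an error otherwise.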
Because the derivation of Equation 16 is based on the assumption that the defects are distributed completely randomly, it would be incorrect to equate Equations 11 and 16 when the defects follow a clustered distribution. However, under certain conditions, this is a good approximation, even if the distribution is clustered. To see this, let us examine how the equations for the limited yield under a completely random distribution, i.e., Equation 4, and a distribution that may be clustered, i.e., Equation 16, behave under these situations. Figure 1 compares the defect-limited yields (LYs) predicted by both Equations 4 and 16 under varying defect densities (DDs, in defects per die) and cluster factors (α). We see that the agreement between these two equations lessens as the cluster factor decreases, i.e., as the distribution becomes more clustered, or as the DD increases. However, at a low FP, we see that at a higher DD and low cluster factor, the agreement is still quite good. Looking at Figure 1, for example, we see that for the LY for a clustered distribution with α=0.1 and a DD of more than one defect per die, the agreement with the LY for a completely random distribution with the same DD is better than 99% when the FP is 0.02.
On the other hand, at high FPs, this approximation quickly breaks down as the DD is increased, as seen in Figure 2, where we observe two sets of LYs, one corresponding to FP set at 0.02 (the same set used in Figure 1) and one to FP set at 0.33. The LYs set for FP=0.02 are superimposed on the top line in this figure, while the LYs set for FP=0.33 are clearly separated at the lower α values, dramatically illustrating the dependence of the approximation of Equations 4 and 16 on the FP value.
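The comparison between the completely random and clustered models can be reproduced numerically. The sketch below assumes a Poisson yield of exp(-DD·FP) and a negative binomial yield of (1 + DD·FP/α)^(-α), and it takes "agreement" to mean the ratio of the two yields, which is an interpretation rather than a definition given in the text.

```python
import math


def poisson_yield(defects_per_die, fault_probability):
    """Limited yield for a completely random defect distribution."""
    return math.exp(-defects_per_die * fault_probability)


def clustered_yield(defects_per_die, fault_probability, alpha):
    """Limited yield for a negative binomial (clustered) distribution."""
    return (1.0 + defects_per_die * fault_probability / alpha) ** (-alpha)


def agreement(defects_per_die, fault_probability, alpha):
    """Ratio of the random-model yield to the clustered-model yield."""
    return poisson_yield(defects_per_die, fault_probability) / clustered_yield(
        defects_per_die, fault_probability, alpha
    )


for fp in (0.02, 0.33):
    for dd in (0.5, 1.0, 2.0):
        print(f"FP={fp:.2f} DD={dd:.1f} alpha=0.1 agreement={agreement(dd, fp, 0.1):.4f}")
```

At FP=0.02 the ratio stays above 0.99 even at two defects per die with α=0.1, while at FP=0.33 it falls well below that at one defect per die.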
Figure 1. Comparison of limited yield (LY) vs. defects per die for a completely random distribution and five different cluster distribution factors (α), at a fault probability (FP) set at 0.02. At low FP, LY is well approximated, almost regardless of cluster factor.
With the above illustration, we see that at lower FPs, DDs per defect type below one defect per die, and cluster factors (α) above 0.1, which are conditions typically seen in the fab, the LYs predicted by Equations 4 and 16 are comparable. Thus, under the conditions just given, the LY predicted by both Equations 4 and 16 is equivalent to that predicted by Equation 11:
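$$\left( 1 + \frac{DD_A \cdot FP_A}{\alpha} \right)^{-\alpha} = 1 - KR_A \, \frac{T_A}{T} \qquad (17a)$$
or, rearranging,
$$FP_A = \frac{\alpha}{DD_A} \left[ \left( 1 - KR_A \, \frac{T_A}{T} \right)^{-1/\alpha} - 1 \right] \qquad (17b)$$
where $DD_A$ is the average number of type-A defects per die and $\alpha$ is the cluster factor.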
Equation 17 provides another means of estimating FP from the bin and defect data.
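A sketch of this second route to the FP is shown below. It assumes that Equation 17 equates the negative binomial limited yield, (1 + DD·FP/α)^(-α), with the kill-ratio estimate 1 - KR·TA/T, and then inverts for FP; if the Poisson form were intended instead, the inversion would be FP = -ln(LY)/DD. All inputs are hypothetical.

```python
def limited_yield_from_kill_ratio(kill_ratio, t_a, t):
    """Limited yield implied by the kill ratio and the fraction of die with defect A."""
    return 1.0 - kill_ratio * t_a / t


def fp_from_limited_yield(limited_yield, defects_per_die, alpha):
    """Invert LY = (1 + DD * FP / alpha) ** (-alpha) for FP."""
    return (alpha / defects_per_die) * (limited_yield ** (-1.0 / alpha) - 1.0)


# Hypothetical inputs: KR = 0.15, 2,000 of 10,000 die carry defect A,
# 0.25 type-A defects per die on average, cluster factor 0.5.
ly = limited_yield_from_kill_ratio(0.15, 2_000, 10_000)
fp = fp_from_limited_yield(ly, defects_per_die=0.25, alpha=0.5)
print(ly, fp)
```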
Figure 2. Comparison of two sets of limited yield (LY) vs. defects per die, one corresponding to FP set at 0.02 and one to FP set at 0.33. Each set has a completely random defect distribution and five distributions set at different cluster factors (α). The LYs set for FP=0.02 are superimposed on the top line in this figure.
Finally, FP values were estimated based on computer simulation runs on the die layout for each layer. To facilitate comparison with our FP values estimated from the bin and defect data, the FPs from the simulation had to be averaged over the same size bins. To obtain average FPs over a defect size range from the simulation results, we can start with the general expression for the expectation of a function of a random variable:6
$$E[g(x)] = \int_{-\infty}^{\infty} g(x) \, f(x) \, dx \qquad (18)$$
where f(x) is the probability density function of the random variable x. If we know FP as a function of size, FP(x), where x is size, then the average FP over all possible defect sizes would be, per Equation 18:
$$\overline{FP} = \int_0^{\infty} FP(x) \, h(x) \, dx \qquad (19)$$
where h(x) is the probability density function of the defect sizes, also known simply as the defect size distribution. The exact functional form of h(x) can be determined from defect monitors.7 However, assuming a linear increase in h(x) up to a certain size, xo, and a 1/x^3 decrease above this size has been found adequate for most defect size distributions, with xo much smaller than the minimum dimension of the device in most cases.7 Once xo is chosen, h(x) is determined by recognizing that the probability density function must satisfy the following relationship:
$$\int_0^{\infty} h(x) \, dx = 1 \qquad (20)$$
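Assuming that8
$$h(x) = a x, \quad 0 \le x \le x_o \qquad (21a)$$
and
$$h(x) = \frac{b}{x^3}, \quad x > x_o \qquad (21b)$$
where $a$ and $b$ are constants to be determined,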
we have, by substitution of Equations 21 into Equation 20,
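$$\int_0^{x_o} a x \, dx + \int_{x_o}^{\infty} \frac{b}{x^3} \, dx = \frac{a x_o^2}{2} + \frac{b}{2 x_o^2} = 1 \qquad (22)$$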
Since h(x) must be continuous,
$$a x_o = \frac{b}{x_o^3} \qquad (23)$$
Solving Equations 22 and 23, we have,
$$a = \frac{1}{x_o^2} \qquad (24a)$$
and
$$b = x_o^2 \qquad (24b)$$
Once we determine FP(x) and h(x), we can estimate the average FP over all defect sizes for any given defect type or layer by numerically integrating Equation 19. To determine the average simulated FP over a specific defect size range, xmin to xmax, where xmin is less than xo, we need only replace 0 with xmin and ∞ with xmax in Equations 19 through 22, and then solve Equations 22 and 23 to get,
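$$a = \frac{2 x_{max}^2}{2 x_o^2 x_{max}^2 - x_{min}^2 x_{max}^2 - x_o^4} \qquad (25a)$$
and
$$b = \frac{2 x_o^4 x_{max}^2}{2 x_o^2 x_{max}^2 - x_{min}^2 x_{max}^2 - x_o^4} \qquad (25b)$$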
For example, substituting xo=0.5, xmin=0, and xmax=1 into Equation 25a and Equation 25b, substituting the results into Equations 21a and 21b to obtain h(x), and then numerically integrating,
$$\overline{FP} = \int_{x_{min}}^{x_{max}} FP(x) \, h(x) \, dx \qquad (26)$$
we can obtain the average FP over the defect size range of 0 to 1 µm.
For xmin greater than xo, we need only use Equation 21b together with Equation 20 (after replacing 0 with xmin and ∞ with xmax) and solve for b, to get
$$b = \frac{2 x_{min}^2 x_{max}^2}{x_{max}^2 - x_{min}^2} \qquad (27)$$
Substituting xmin=1, and xmax=2 into Equation 27, then using Equation 21b and Equation 26, we can obtain the average FP over the defect size range of 1 to 2 µm, for example.
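The bin-averaging procedure can be sketched as follows. The code assumes the piecewise size distribution described above (linear up to xo, proportional to 1/x^3 beyond it), normalized over the bin and continuous at xo, and it uses a simple midpoint rule for the numerical integration. The function `toy_fp` is a hypothetical stand-in for the simulated FP(x), which is not given in the text.

```python
import math


def size_distribution(x_o, x_min=0.0, x_max=math.inf):
    """Return h(x), normalized on [x_min, x_max]: a*x up to x_o, b/x**3 above."""
    inv_xmax_sq = 0.0 if math.isinf(x_max) else 1.0 / x_max**2
    if x_min < x_o:
        if x_max <= x_o:
            raise ValueError("this sketch expects x_max > x_o when x_min < x_o")
        # Normalization plus continuity at x_o fixes both constants.
        a = 1.0 / (x_o**2 - x_min**2 / 2.0 - x_o**4 * inv_xmax_sq / 2.0)
        b = a * x_o**4
    else:
        # Only the 1/x**3 branch lies inside the bin.
        b = 2.0 / (1.0 / x_min**2 - inv_xmax_sq)
        a = b / x_o**4

    def h(x):
        if x < x_min or x > x_max:
            return 0.0
        return a * x if x <= x_o else b / x**3

    return h


def average_fp(fp_of_size, h, x_min, x_max, steps=10_000):
    """Midpoint-rule estimate of the integral of FP(x) * h(x) over a finite bin."""
    dx = (x_max - x_min) / steps
    total = 0.0
    for i in range(steps):
        x = x_min + (i + 0.5) * dx
        total += fp_of_size(x) * h(x)
    return total * dx


def toy_fp(x):
    """Hypothetical FP(x); a real curve would come from layout simulation."""
    return min(1.0, 0.05 * x**2)


h_sz1 = size_distribution(x_o=0.5, x_min=0.0, x_max=1.0)  # 0 to 1 um bin
h_sz2 = size_distribution(x_o=0.5, x_min=1.0, x_max=2.0)  # 1 to 2 um bin
print(average_fp(toy_fp, h_sz1, 0.0, 1.0))
print(average_fp(toy_fp, h_sz2, 1.0, 2.0))
```

Normalizing h(x) within each bin means the average is conditional on the defect falling in that bin, which is what makes it comparable with FPs estimated per size bin from the defect data.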
Comparing the FPs and KRs estimated from our bin and defect data with the FPs estimated from simulation additionally requires determining the process steps at which defects responsible for causing the faults being simulated may occur.
Once these process steps are determined, the inspection step at which these defects may be detected is established. For example, the bin and defect map KRs and FPs associated with the TN1T layer are estimated based on the defects detected right after Ti-nitride 1 deposition. The simulation "contact" is based on the layout showing the location of contacts. Because the TN1T step is the process step occurring right after the contacts are etched and before tungsten is deposited into the contacts, it seems a reasonable assumption that any defects detected here are those that can cause faults in the contact. That is, the defects detected at TN1T are assumed to represent all possible defects that could potentially cause a contact to be blocked or short-circuited, so that the FPs calculated based on the defects detected at this step should be equivalent to the FPs estimated by simulation on the contact layout. Similarly, the simulation "via" is based on the layout showing the via between one metal and another. The defects detected at TN2T, for example, are assumed to represent all possible defects that could potentially cause the via between metal 1 and metal 2 to be blocked or shorted.
Results
We examined a sample of our simulation results averaged for specific size bins for the BP74 device along with the FPs and their confidence intervals from the bin and defect map data for the TiN layers (Tables 1 and 2). We estimated FPs using Equation 5, which is the estimate of the probability of success in the binomial distribution. The tables also give the assumed fault mechanisms.
From Table 2 we see that the agreement between KRs and the simulated FPs is better than that between the estimated FPs and the simulated FPs shown in Table 1, though it is harder to tell for the larger bin sizes (Sz 5 and 6), because of the large confidence intervals caused by the small sample sizes at the larger-size bins. In light of this problem, we chose not to use size bins in analyzing subsequent data (see Part 2 of this article for more details).
The results from Table 1 show that, for the larger-sized bins, the FPs between the simulated and defect map data agree, but only because the confidence intervals were so broad, making the comparison meaningless. But at lower-sized bins, where the confidence intervals are much narrower, there is a large discrepancy. For example, at TN1TSz01, the FP estimated from bin data is between ~0.2 and 0.3, while the simulation FP at ContactSz01 is <0.01. In general, as can be readily seen from Table 1, the FPs estimated from the bin and defect map data using Equation 5 are uniformly greater than the FPs estimated by computer simulation on die layout.
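The effect of sample size on the confidence interval of a binomial FP estimate can be illustrated with a short sketch. The text does not say which interval was used for the tables; the Wilson score interval below is an assumption chosen for illustration, and the counts are hypothetical.

```python
import math


def wilson_interval(failed, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = failed / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half


# Same point estimate (0.25), very different sample sizes.
print(wilson_interval(1, 4))      # small sample, as in the larger size bins
print(wilson_interval(100, 400))  # large sample, as in the smaller size bins
```

With only a handful of die in a bin, the interval spans most of the range from 0 to 1, so nearly any simulated FP would fall inside it.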
This is probably due to hidden defects that can contribute to increasing the estimated FP over the true FP. In other words, the defects detected at any of the inspection steps may not represent all the defects actually present on the die at the particular inspection step.
There are three potential reasons that defects may go undetected on any given wafer:
- They are covered by a previous deposition (smaller defects go more easily undetected than larger ones).
- They occurred on a layer that was never inspected for defects.
- They are so similar to the surrounding layout in texture and topography that the inspection tool cannot distinguish them from the background.
In particular, the first reason seems most responsible for contributing to the presence of defects that are missed. The second reason can be understood very simply — by realizing that not all the possible process steps at which defects may occur are examined for the wafers whose bin data we are using to make our FP estimations. Thus, although a particular die may show only one defect present on the final defect map, it may in fact contain defects from other steps that were not examined.
In Part 2 of this article, we will show that there are only two reliable means of estimating fault probabilities — by means of simulation of probable defect mechanisms on the layout for that layer, and by the use of the defect-limited yield equation. We will also show that the FPs estimated from the defect-limited yield equation are meaningful and can be readily defined. We arrive at a way of predicting yield for a given layer, even if there is more than one defect mechanism or defect type on that layer.
Author Information
David T. Hu worked as an intern at LSI Logic from July 2000 to February 2001. His project involved predicting chip yield from in-line bin and defect data. He is currently an M.S. candidate in the chemical engineering department at Oregon State University, and has a B.S. in chemistry from the University of Portland.
Phone: 1-541-752-8334
E-mail: huth@che.orst.edu
Milo D. Koretsky is an associate professor of chemical engineering at Oregon State University. His research interests are in thin-film materials processing. His teaching interests include integration of microelectronics unit operations into the core chemical engineering curriculum. He also serves as the chemical engineering advisor to the MECOP internship program. He received B.S. and M.S. degrees from the University of California at San Diego, and a Ph.D. from the University of California at Berkeley, all in chemical engineering.
Phone: 1-541-737-4591
E-mail: koretsm@engr.orst.edu
Manu Rehani is the section manager of yield data systems at LSI Logic and is responsible for implementing and sustaining the yield analysis and reporting infrastructure, as well as for future growth. He has been with the company since 1997, holding various defect detection, analysis and production management positions. He has an M.S. in chemical engineering from Oregon State University.
David Abercrombie is the yield engineering manager at LSI Logic. In addition to driving in-line defect and end-of-line wafer sort yield improvement, his team also researches and develops software systems for data warehousing, automation and engineering productivity. He has a BSEE from Clemson University and an MSEE from North Carolina State University.