Quick to find fault

Monday 1st August 2005

Real-time fault detection and classification is one of the holy grails of semiconductor manufacturing. John Scanlan, Kevin O'Leary, Marcus Carbery, Francisco Martinez and Paul Scullin of Straatum report.

Rapid detection and identification of process and tool faults is essential for maintaining productivity and product yields in semiconductor fabs. Historically, fabs have relied on either end-of-line quality checks or regular process quality checks using ex-situ metrology and/or short-loop test structures to detect faults. Automated fault detection and classification (FDC) attempts to catch and prevent excursions by monitoring the process tool in real time.

To control any process, high-quality data is required. For example, plasma etch processes depend on chamber pressure, plasma power, wall and substrate temperatures, gas phase and surface chemistry, and chamber geometry, with many other second-order contributions.

To demonstrate the viability of an FDC scheme, let's look at an RF process-state sensing system [1] with plasma etching as our application platform.

The process-state sensor measures RF harmonic components in the RF power delivery line of the plasma chamber. A data stream with full temporal resolution and full multidimensional sensor resolution, in this case the Fourier spectrum (figure 2) with multiple (n) harmonics, contains detailed information on the process state.

However, such a large bandwidth data stream means some data compression is required. Temporal compression is achieved by data averaging over a single process step. For FDC process control schemes, the absolute accuracy of the measurement is not critical; what is important is the resolution to process-state changes and the repeatability of the measurement under nonchanging conditions.
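
The temporal compression described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `compress_by_step`, the step names, and the channel names `V1` and `I1` are all hypothetical, and the readings are invented for the example.

```python
from statistics import mean

def compress_by_step(samples):
    """Average each sensor channel over every process step.

    `samples` is a list of (step_name, {channel: value}) readings in time order.
    Returns {step_name: {channel: mean value over that step}}.
    """
    grouped = {}
    for step, reading in samples:
        channels = grouped.setdefault(step, {})
        for name, value in reading.items():
            channels.setdefault(name, []).append(value)
    return {
        step: {name: mean(values) for name, values in channels.items()}
        for step, channels in grouped.items()
    }

# Hypothetical readings: two channels sampled during two steps of one wafer.
samples = [
    ("main_etch", {"V1": 10.0, "I1": 2.0}),
    ("main_etch", {"V1": 12.0, "I1": 4.0}),
    ("over_etch", {"V1": 8.0, "I1": 1.0}),
]
summary = compress_by_step(samples)
```

Each wafer is then described by one averaged value per channel per step, which trades absolute temporal detail for a repeatable summary of the process state.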

Statistical methods
Semiconductor manufacturing plants historically have used statistical process control (SPC) for tool monitoring. For real-time FDC, we monitor tool-state variables and/or additional process-state variables from in situ sensors.

The first problem encountered is data overload. Even with temporal data compression, there is still a large data stream of potential information. With the addition of a process-state sensor, the operator now has 20-30 variables to monitor per process, per chamber. This quickly becomes unmanageable and prohibitive to real-time FDC.

The second problem is that these approaches assume the underlying process is stationary - that is, the mean and variance of the data do not vary with time, and the observations are identically, independently, and normally distributed [2].

A possible answer to the data overload problem is the use of multivariate statistical techniques and data treatment methods [3]. For example, from the tool-state and in situ process-state sensor data set, one can construct a multivariate Hotelling T2 statistic, following a principal components analysis (PCA) transformation of the data set.

The T2 statistic represents a multivariate extension of the Student's t statistic, and calculates the distance of any given process data point from statistically normal operation, defined by a set of sensor data collected over normal operating conditions.
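
For reference, one common way to write the T2 statistic after a PCA transformation is given here. The notation is an assumption introduced for this note and does not appear in the original: x is the vector of sensor readings for one wafer, x-bar is the baseline mean, p_i is the i-th principal component direction of the baseline data, lambda_i is the baseline variance along that direction, and k is the number of components retained.

```latex
T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i},
\qquad
t_i = \mathbf{p}_i^{\top} \left( \mathbf{x} - \bar{\mathbf{x}} \right)
```

When all components are retained, this is the same quantity as the covariance-weighted squared distance from the baseline mean.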

Normal process is a cluster of data points in n-dimensional sensor space, and T2 is the distance from the centre of this data cluster, weighted relative to the normal sensor variance. The advantage of this method is the capture of all tool-state and/or process-state sensor data into a single statistic that includes a model of individual parameter means, variance, and the covariance of the entire parameter set. Fault detection is based on the observation of data points with high T2 scores, representing deviation from the model.
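
A minimal sketch of this distance calculation is shown below, under simplifying assumptions that are not in the original: only two sensor variables are used, the PCA step is skipped (which gives the same T2 as retaining every principal component), and the baseline values are invented. The function names `fit_baseline` and `t_squared` are hypothetical.

```python
from statistics import mean

def fit_baseline(points):
    """Estimate the centre and sample covariance of 2-D baseline data."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def t_squared(point, centre, cov):
    """Squared distance from the centre, weighted by the inverse covariance."""
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical baseline: the two sensor variables are positively correlated.
baseline = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (2.5, 2.5)]
centre, cov = fit_baseline(baseline)

t2_along = t_squared((4.0, 4.0), centre, cov)    # follows the normal correlation
t2_against = t_squared((4.0, 1.0), centre, cov)  # breaks the normal correlation
```

Both test points lie at the same Euclidean distance from the centre, but the one that violates the baseline correlation receives the larger T2, which is what including covariance in the model buys.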

As with univariate statistical approaches, multivariate methods also assume a normal distribution. In addition, certain pragmatic issues make these statistical techniques difficult to implement, such as the need for a very large baseline sample and a requirement to build tool-by-tool models. Both of these issues are incompatible with FDC system requirements, particularly in terms of scalability.

Knowledge-based methods
Another approach to real-time FDC uses knowledge-based techniques (figure 3). Unlike model-based methods, which rely on passive data collection to build a model of normal and abnormal behaviour, knowledge-based methods employ direct system teaching.

One approach is to construct a multidimensional sensor-space fingerprint of the process and compare it in real time to a set of known fault fingerprints. A "fingerprint" is a set of sensor data that defines a particular process-state - thus a fault fingerprint means a set of sensor data defining a fault process-state. A fault is detected only if a match is determined. This is the first step in reducing false positives.

The question of sensitivity to real fault conditions then becomes a function of the underlying sensor data and the user's resolution requirements. In general, the underlying sensor should be sensitive to very small changes in process conditions, allowing users to set alarm or warning limits based on the process window. Additionally, sensor data must have sufficient dimensions to permit a multitude of different fingerprints to be defined for the respective multitude of fault conditions.

An illustration of this knowledge-based FDC scheme is shown in figure 3. For this illustration, only two sensor dimensions are used. As each wafer is processed, the FDC algorithm compares the current process-state fingerprint to the set of known fault fingerprints. A similar fingerprint also represents the baseline process condition. While a model-based approach compares the present state to the baseline state to determine any differences, the knowledge-based approach also checks to see if any differences match a known fault state. It is far easier to view and compare fingerprints than to view all multi-dimensional and multivariate data for each process-state or fingerprint.
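
The comparison against known fault fingerprints might look like the sketch below for a two-dimensional case. It assumes, purely for illustration, that fingerprints are points in normalised sensor units and that a match means falling within a fixed Euclidean distance of a stored fingerprint. The function `match_fault`, the fault names, and all numbers are hypothetical.

```python
from math import dist

def match_fault(fingerprint, fault_library, tolerance):
    """Return the name of the closest known fault within `tolerance`, else None."""
    best_name, best_distance = None, tolerance
    for name, fault_fingerprint in fault_library.items():
        d = dist(fingerprint, fault_fingerprint)
        if d <= best_distance:
            best_name, best_distance = name, d
    return best_name

# Hypothetical library of two fault fingerprints in normalised sensor units.
fault_library = {
    "pressure_drift": (1.0, 0.0),
    "power_drift": (0.0, 1.0),
}

known = match_fault((0.9, 0.1), fault_library, tolerance=0.3)
unknown = match_fault((0.6, 0.6), fault_library, tolerance=0.3)
```

A fingerprint that differs from baseline but matches nothing in the library returns `None`, so no fault is declared for it under this scheme.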

Thereafter, if the sensor outputs change and the changes match those expected from the set of learned response curves, then the fault root cause is immediately classifiable. In this method, a fault fingerprint is classified before a fault is encountered, ensuring a robust method of detecting such faults.

When a new fault appears, the multitude of tool sensors will report a state change. On first occurrence, no matching fingerprints will exist in the fault library, and the fault cannot be classified. Fingerprints of new faults can be added when the fault is confirmed independently, for example, by metrology. If this fault reappears in the future, it can be instantly classified. This method allows for continuous learning and expansion of the fault library.
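
The learning loop can be sketched as a small library object. This is an illustration under assumptions, not the system described by the authors: the class `FaultLibrary`, the distance-based matching rule, the fault label, and the fingerprint values are all hypothetical.

```python
from math import dist

class FaultLibrary:
    """Stores confirmed fault fingerprints and classifies new ones by proximity."""

    def __init__(self, tolerance):
        self.tolerance = tolerance
        self.fingerprints = {}

    def classify(self, fingerprint):
        """Return the nearest stored fault within tolerance, or None if unknown."""
        matches = [
            (dist(fingerprint, stored), name)
            for name, stored in self.fingerprints.items()
        ]
        matches = [m for m in matches if m[0] <= self.tolerance]
        return min(matches)[1] if matches else None

    def learn(self, name, fingerprint):
        """Add a fingerprint once the fault is confirmed independently."""
        self.fingerprints[name] = tuple(fingerprint)

library = FaultLibrary(tolerance=0.3)

first_seen = library.classify((0.9, 0.1))       # nothing learned yet
library.learn("confirmed_fault_A", (0.9, 0.1))  # e.g. after metrology confirms it
second_seen = library.classify((0.85, 0.15))    # similar excursion on a later wafer
```

The first occurrence is left unclassified, and the same excursion is recognised once its fingerprint has been added.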

Integration of knowledge and model methods
Although a model-based approach is sensitive to any change not captured in the model of normal tool behaviour, the model cannot capture every process state that is not a fault condition (unless it is infinitely large). Such states may be flagged as false positives. Meanwhile, the knowledge-based approach compares the current state not to the normal model but to a set of known abnormal states. Unless the fault state is already known, real faults not previously captured may be missed.

This illustrates the fundamental challenge with real-time FDC: at least when relying on in situ sensors, it may be impossible to discriminate between fault states and normal states. If this is the case, a compromise may be the preferred solution. One key need, then, is the ability to compare states without having to compare the full temporal and multivariate data sets.

Figure 4 shows an illustration of a method aimed at meeting these requirements. For the purposes of illustration, the in situ sensor data is shown only as a three-dimensional data set. A multivariate model, shown as a volume in the 3-D sensor space, captures all baseline fingerprint data. Deviation from the model appears as a large T2. This is the model-based approach and represents the magnitude of the possible fault relative to the baseline model.

The knowledge-based approach is added by constructing the projection of this sensor-space fingerprint onto each of the known fault fingerprint vectors. The classifier then returns a figure of merit for the fingerprint match, for example a correlation coefficient in the multidimensional sensor space. Projection onto all the fault directions permits construction of a Pareto chart of fingerprint matches, allowing determination of fault root cause. Any process state fingerprint data can thus be compared to the baseline data and to the known fault signatures.
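
The projection and ranking step can be sketched as follows. The figure of merit used here is cosine similarity between the deviation from baseline and each fault direction, which is one possible choice of correlation-like measure and is an assumption of this sketch. The function names, fault names, and values are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_faults(fingerprint, baseline, fault_vectors):
    """Score the deviation from baseline against each fault direction, best first."""
    deviation = [f - b for f, b in zip(fingerprint, baseline)]
    scores = {name: cosine(deviation, vec) for name, vec in fault_vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical three-dimensional sensor space with three learned fault directions.
baseline = (10.0, 5.0, 2.0)
fault_vectors = {
    "pressure_drift": (1.0, 0.0, 0.0),
    "power_drift": (0.0, 1.0, 0.0),
    "gas_flow_drift": (0.0, 0.0, 1.0),
}

ranking = rank_faults((10.1, 5.0, 2.9), baseline, fault_vectors)
```

The sorted list plays the role of the Pareto chart: the fault direction at the top is the most likely root cause, and the remaining entries show how distinct the match is.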

Application example
To illustrate an integrated model- and knowledge-based method, we acquired sensor data from an RF impedance sensor mounted between the RF match and electrode. There are two teaching steps required prior to implementing a system for real-time control using the RF sensor to fingerprint the chamber.

The first step is the collection of sensor data for the nominal process to establish a baseline. This involves fingerprinting a number of wafers processed under no-fault conditions to identify the distribution of a healthy process (figure 5).

The second step is fault fingerprint training. This can be initiated by simulating fault conditions such as process set-point drift or hardware failures. The effort involved in this training has to be weighed against the possible upside. For example, if hardware faults, such as those that may be encountered following a preventive maintenance event, are desired components of the fault library, then this learning can be time consuming.

After training is completed, the FDC system can be implemented in real-time control. Any excursion outside the variance captured in the baseline model appears as a high T2. Deviations in T2 indicate that something statistically unusual has occurred. This is a necessary but not sufficient condition to confirm that a fault has occurred.
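
One way to combine the two checks is sketched below. The decision rule, the function name `assess_wafer`, the limits, and the scores are assumptions made for illustration. The sketch treats a high T2 as necessary for a fault call and a fingerprint match as the confirming evidence.

```python
def assess_wafer(t2, t2_limit, match_scores, match_limit):
    """Combine the model-based and knowledge-based checks for one wafer.

    Returns ("normal", None), ("fault", fault_name) or ("unclassified", None).
    """
    if t2 <= t2_limit:
        return ("normal", None)  # within the baseline model
    if match_scores:
        name, score = max(match_scores.items(), key=lambda item: item[1])
        if score >= match_limit:
            return ("fault", name)  # unusual and matches a known fault
    return ("unclassified", None)   # unusual, but no known fault matches

# Hypothetical limits and per-fault match scores.
quiet = assess_wafer(3.0, 12.0, {"power_drift": 0.2}, 0.9)
fault = assess_wafer(40.0, 12.0, {"power_drift": 0.97, "pressure_drift": 0.1}, 0.9)
odd = assess_wafer(40.0, 12.0, {"power_drift": 0.3, "pressure_drift": 0.1}, 0.9)
```

An "unclassified" result corresponds to a statistically unusual wafer with no matching fingerprint, which may be a new fault or a harmless excursion and would need independent confirmation.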

There are a number of large T2 values that are known not to correspond to any real tool fault (figure 6).

It is also possible to include patterns of behaviour other than variations in process inputs, such as human error in performing preventive maintenance (for example, installing the wrong consumable part in an etch chamber).

Conclusion
Successful real-time FDC schemes have several key requirements, including appropriate in situ process-state sensors, effective data reduction techniques, and robust control algorithms. The overall package also needs to be cost effective and provide operational benefits. The mutually complementary pros and cons of knowledge- and model-based methods suggest that a combined approach provides users with a powerful tool to make real-time FDC feasible in volume manufacturing environments.

Fig. 1. Fault detection and classification configuration

Fig. 2. Fourier components of the RF waveform show fundamental plus four harmonics of RF current and voltage and phase

Fig. 3. FDC based on pattern recognition illustrated for a two-dimensional in situ sensor data stream. A fault library contains a set of predetermined sensor space fault patterns. A sensor space pattern on each production wafer is determined and compared to the fault library patterns

References
[1] F Martinez et al, SPIE Advanced Process Control and Automation, March, 2005.
[2] DC Montgomery, Introduction to Statistical Quality Control, 2nd Ed., John Wiley & Sons, 1991.
[3] NB Gallagher et al, IFAC ADCHEM '97, pp. 78-83, Banff, Canada, 1997.
