Role Of Process-induced Wafer Geometry Changes In Advanced Semiconductor Manufacturing
How can high-resolution wafer geometry measurements, combined with mechanics models, be used to quantify distortions and stresses introduced by processing and assess their impact on subsequent process steps? Kevin T. Turner (Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania) and Jaydeep Sinha (Surfscan-ADE Division, KLA-Tencor Corporation) discuss.
Wafer geometry is a broad term that describes measurements of the shape, flatness, and roughness of a wafer. Quantities such as bow, warp, site flatness, nanotopography, and roughness (Figure 1) are all measurements of wafer geometry and play a role in the performance of semiconductor manufacturing processes at different points in the process flow.
Figure 1: Range of wafer geometry features commonly encountered in semiconductor processing. Different types of wafer geometry features are classified based on spatial wavelength (horizontal axis). Typical heights of these features are indicated on the vertical axis.
These wafer geometry features cover a range of spatial wavelengths, from micrometers to hundreds of millimeters, with amplitudes ranging from angstroms (e.g., roughness) to hundreds of micrometers (e.g., wafer shape). Traditionally, most SEMI standards related to wafer geometry have focused on the starting, or bare, wafer geometry. However, significant changes in wafer geometry may occur during processing of the wafer, and these changes are often much larger than the geometry variations of the bare wafer. These process-induced wafer geometry (PIWG) changes can be monitored throughout the process flow (Figure 2) using metrology tools that measure the out-of-plane distortion (OPD) of the wafer. Measurements of PIWG changes can be used to identify problematic processes, such as film deposition processes that introduce excessive residual stress or thermal processes that introduce slip lines or plastic deformation. Furthermore, recent work has demonstrated that PIWG measurements can also be used in a feed-forward manner to set corrections in subsequent fabrication steps, such as lithographic patterning.
Figure 2: Examples of changes in wafer geometry that occur as a result of wafer processing. The top two rows show wafer shape while the bottom rows show geometry maps that have been filtered to remove long wavelength (>10 mm) variations.
The goals of this paper are to introduce the range of PIWG features that can be measured using modern wafer geometry metrology tools and to demonstrate how such measurements can be used in conjunction with mechanics-based models to assess the impact of PIWG on subsequent process steps.
Process-induced wafer geometry changes
The control and monitoring of PIWG is critical to achieving high yield in a broad range of advanced semiconductor manufacturing processes. This includes the performance of lithography [3, 4], CMP [5, 6], and wafer bonding processes [7-9]. PIWG changes often occur as a result of thermal processing or due to the deposition or etching of residually stressed thin films [10, 11].
Distortions introduced by stressed films are perhaps the best-known example of PIWG. From the early work of Stoney to the extensive research on thin-film mechanics over the past several decades, it is well known that the addition or removal of a film from a wafer can change the curvature of the wafer, and wafer geometry measurements have been widely used to determine stresses in deposited films.
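As a minimal sketch of how a measured curvature change is converted to a film stress, the Python snippet below applies Stoney's equation, σ_f = E h_w² Δκ / (6 h_f), where E is the biaxial modulus of the wafer, h_w the wafer thickness, h_f the film thickness, and Δκ the curvature change. The function names, the spherical-cap conversion from bow to curvature, and all numerical values are illustrative assumptions rather than values taken from this article.

```python
import math

# Illustrative constants (assumptions, not from the article).
E_BIAXIAL_SI = 180e9      # Pa, approximate biaxial modulus of (100) silicon
WAFER_THICKNESS = 775e-6  # m, nominal 300 mm wafer thickness
WAFER_DIAMETER = 0.300    # m


def curvature_from_bow(bow, diameter):
    """Curvature (1/m) of a spherical cap with centre-to-edge sag `bow` (m)."""
    return 8.0 * bow / diameter**2


def stoney_film_stress(delta_curvature, wafer_thickness, film_thickness,
                       biaxial_modulus):
    """Film stress (Pa) from a curvature change (1/m) via Stoney's equation.

    Valid for a film that is thin compared with the wafer and for small,
    uniform curvature changes.
    """
    return (biaxial_modulus * wafer_thickness**2 * delta_curvature
            / (6.0 * film_thickness))


# Hypothetical example: a 1 um film that changes the wafer bow by 50 um.
delta_kappa = curvature_from_bow(50e-6, WAFER_DIAMETER)
stress = stoney_film_stress(delta_kappa, WAFER_THICKNESS, 1e-6, E_BIAXIAL_SI)

if __name__ == "__main__":
    print(f"curvature change: {delta_kappa:.3e} 1/m")
    print(f"film stress:      {stress / 1e6:.1f} MPa")
```

Stoney's equation assumes a uniform stress and a uniform curvature change, so it captures only the longest-wavelength component of PIWG.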
PIWG measurements, however, can provide much more information than simply film stress. Figure 2 shows examples of PIWG measurements that reveal changes in shape as well as features with higher spatial frequencies due to processing. The relevance of a particular PIWG feature depends on both the mechanism by which the PIWG change was introduced (e.g., stress, slip line) and the characteristics of the processes performed after the process step that induced the PIWG. Lithography processes, for example, depend on the flatness of the patterned surface of the wafer when it is chucked, and PIWG can lead to incomplete chucking that degrades focus and narrows the process window. As discussed in section 3, shorter wavelength (<5 mm) PIWG features are substantially more difficult to chuck than longer wavelength features, so short wavelength features are the primary concern when assessing the impact of PIWG on defocus in lithography processes.
On the other hand, the in-plane distortion (IPD) of the patterned surface of a wafer that occurs during processing is crucial in determining overlay alignment between multiple layers in lithography. As discussed in section 4, PIWG with intermediate to long wavelengths (i.e., millimeter to hundreds of millimeters) can be quantitatively related to IPD and used to predict the process-induced component of overlay errors.
These two cases are presented in this paper as examples of a much broader set of challenges in semiconductor manufacturing that can be addressed through a proper combination of PIWG measurements and physics-based process models.
Wafer geometry effects in chucking during lithography processes
Wafers are clamped on vacuum chucks in lithography tools in order to remove variations in the shape of the wafer and reduce height variation of the wafer surface over the exposure field. Chucking can remove OPD of the wafer (i.e., variations of the mid-plane of the wafer), but cannot, in general, compensate for thickness variations.
For wafers that fall within the SEMI specifications, the ~80 kPa of clamping pressure provided by typical vacuum chucks is sufficient to completely chuck the wafer and remove OPD. However, PIWG features with a range of spatial wavelengths and amplitudes can be imparted by processing (Figure 2). In order for the chuck to be able to flatten the wafer, the clamping pressure provided by the chuck must be sufficient to elastically deform the wafer into complete contact with the chuck.
Longer wavelength geometry features are accommodated through bending deformation, while shorter wavelength features must be accommodated through bulk deformation of the wafer. Analytical and computational mechanics models have been used to quantify the range of wafer geometry features that can be chucked completely as a function of spatial wavelength. Figure 3 shows the results of a simple model that estimates the amplitude of a PIWG feature that can be chucked with 80 kPa of pressure as a function of spatial wavelength. At longer wavelengths, large shape variations (>100 µm) can be easily accommodated. At shorter wavelengths (<5 mm), the maximum amplitude that can be chucked is 20 nm or less. This indicates that it is critical to monitor and control shorter wavelength PIWG features to ensure that poor chucking performance does not narrow the lithography process window. Figure 4 further illustrates the importance of shorter wavelength features in chucking.
Figure 3: (a) Schematic of simple model used to understand effect of PIWG on wafer chucking. (b) Amplitude of wafer geometry feature that can be chucked as a function of spatial wavelength for vacuum chuck with a clamping pressure of 80 kPa. Figure reproduced from .
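The strong wavelength dependence can be illustrated with a rough plate-bending estimate. The sketch below is not the model behind Figure 3; it assumes a one-dimensional sinusoidal shape w = A cos(2πx/λ) on a thin elastic plate, for which the pressure needed to flatten the plate is D (2π/λ)⁴ A, with D the flexural rigidity. Setting that equal to the clamping pressure gives a maximum chuckable amplitude. The silicon properties (E = 130 GPa, ν = 0.28), the 775 µm thickness, and the function names are assumptions, and the estimate ignores the bulk deformation that takes over as the wavelength approaches the wafer thickness.

```python
import math


def flexural_rigidity(youngs_modulus, poisson_ratio, thickness):
    """Plate flexural rigidity D = E h^3 / (12 (1 - nu^2)), in N*m."""
    return youngs_modulus * thickness**3 / (12.0 * (1.0 - poisson_ratio**2))


def max_chuckable_amplitude(wavelength, pressure=80e3, youngs_modulus=130e9,
                            poisson_ratio=0.28, thickness=775e-6):
    """Largest sinusoidal amplitude (m) that `pressure` (Pa) can flatten.

    Bending-only estimate: flattening w = A cos(k x) requires a peak
    distributed load of D k^4 A, which cannot exceed the clamping pressure.
    """
    rigidity = flexural_rigidity(youngs_modulus, poisson_ratio, thickness)
    wavenumber = 2.0 * math.pi / wavelength
    return pressure / (rigidity * wavenumber**4)


if __name__ == "__main__":
    for wavelength_mm in (2, 5, 10, 20, 50, 100):
        amplitude = max_chuckable_amplitude(wavelength_mm * 1e-3)
        print(f"{wavelength_mm:4d} mm -> {amplitude * 1e9:12.2f} nm")
```

Because the amplitude scales with the fourth power of the wavelength, halving the wavelength reduces the chuckable amplitude by a factor of 16, which is why short wavelength features dominate chucking performance.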
Figure 4 shows the overall wafer geometry measurements (left column) of two different process wafers. From these maps alone it is not apparent that the wafers contain short wavelength features.
Figure 4: Two examples of wafers with PIWG across a range of spatial wavelengths. (a), (b) and (c) show the wafer shape, local curvature map of a section of the wafer, and the predicted wafer-chuck gap for the first wafer. (d), (e) and (f) show the wafer shape, local curvature map of a section of the wafer, and the predicted wafer-chuck gap for the second wafer. Note the strong correlation between local curvature and wafer-chuck gap for both wafers. Figure adapted from .
The shorter wavelength features can be identified by filtering to remove the long wavelengths or by calculating the local curvature from the wafer geometry data. The latter is done here, and curvature maps for the two wafers are shown in the middle column of Figure 4. A finite element simulation of chucking these wafers was performed, and the residual gap at the wafer-chuck interface is shown in the right column of Figure 4. There is a clear correlation between the areas of high curvature and the residual wafer-chuck gap, again demonstrating that it is crucial to control higher-order PIWG features.
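A local curvature map of the kind described above can be approximated from gridded height data with finite differences. The sketch below is one simple way to do this, assuming a square grid with uniform pixel size and small slopes, so that the mean curvature is half the Laplacian of the height. The function name and the synthetic test map (a gentle bowl plus a narrow 20 nm bump) are illustrative assumptions, not data from this article.

```python
import numpy as np


def local_curvature_map(height, pixel_size):
    """Approximate mean curvature (1/m) of a gridded height map (m).

    Uses the small-slope approximation: mean curvature = 0.5 * Laplacian.
    Axis 0 of `height` is y and axis 1 is x.
    """
    dz_dy, dz_dx = np.gradient(height, pixel_size, edge_order=2)
    d2z_dy2 = np.gradient(dz_dy, pixel_size, axis=0, edge_order=2)
    d2z_dx2 = np.gradient(dz_dx, pixel_size, axis=1, edge_order=2)
    return 0.5 * (d2z_dx2 + d2z_dy2)


# Synthetic 20 mm x 20 mm section sampled at 0.1 mm (assumed values).
pixel = 1e-4
axis = np.linspace(-10e-3, 10e-3, 201)
X, Y = np.meshgrid(axis, axis)

bowl = 0.5 * 1e-3 * (X**2 + Y**2)  # long wavelength shape, 1e-3 1/m
bump = 20e-9 * np.exp(-((X - 3e-3) ** 2 + (Y + 2e-3) ** 2) / (2 * (1e-3) ** 2))
height = bowl + bump

curvature = local_curvature_map(height, pixel)
peak_row, peak_col = np.unravel_index(np.argmax(np.abs(curvature)),
                                      curvature.shape)

if __name__ == "__main__":
    print(f"peak |curvature| at x = {axis[peak_col] * 1e3:.1f} mm, "
          f"y = {axis[peak_row] * 1e3:.1f} mm")
```

In this synthetic map the bump is only 20 nm tall and is invisible against the bowl in the raw height data, yet it dominates the curvature map, which mirrors the point made by Figure 4.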
Figure 5: Schematic of out-of-plane and in-plane distortion due to the deposition of a residually stressed film and the effect of distortion on lithography processes. Figure adapted from .
Relationship between PIWG and overlay
Overlay budgets in patterning processes continue to shrink as a result of decreasing feature size and the use of multiple patterning steps. Overlay budgets are well below 10 nm in many processes today, and meeting such strict requirements is challenging. As a result, there is a significant effort to minimize all sources of overlay error, including process-induced components. Process-induced overlay errors can be caused by deposition or etching of residually stressed films that elastically distort the wafer, or by thermal processing steps that introduce plastic deformation.
Such types of distortions on wafers can be identified via wafer geometry measurements prior to the lithography step and used as feedback to alter processes to reduce the distortion, or fed forward to the lithography tool to realize improved correction strategies. In order to quantitatively relate PIWG measurements to overlay in lithographic processes, models that relate the OPD of the wafer to the IPD on the pattern surface of the wafer are required. We and others [4, 14, 15] have developed such models and these concepts have also recently been experimentally validated [2, 3].
Figure 5 illustrates an example set of process steps consisting of two patterning steps with the deposition of a residually stressed film in between. In step N, a set of features is patterned on the wafer. The deposition of the film introduces both OPD and IPD. The OPD, which can be measured in a wafer geometry metrology tool, results from bending of the wafer due to the film stress. The IPD on the pattern surface of the wafer is due to both bending of the wafer and axial deformation of the wafer along the mid-plane. In the second lithography step, the wafer is chucked flat.
This chucking process removes the OPD and the portion of IPD on the pattern surface that was due to bending of the wafer. It does not, however, remove the IPD due to the axial deformation along the mid-plane. If a film with a uniform stress is deposited on a wafer, the IPD on the pattern surface, Δu(r), after chucking is given as
$$\Delta u(r) = -\frac{\sigma_R h_f}{E\, h_w}\, r$$
where E and h_w are the biaxial modulus and thickness of the wafer, σ_R and h_f are the residual stress and thickness of the film, and r is the radial coordinate. In exposure tools, corrections are applied to compensate for such distortions of the wafer. The deformation field given in the above equation is linear in r and can therefore be fully compensated through a magnification correction, which is standard in even the simplest linear correction schemes available in lithography tools. Thus, for residually stressed films to cause overlay errors, more complex deformation fields must be introduced via the processing.
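As a numerical illustration of the uniform-stress case, the sketch below assumes the membrane force-balance form Δu(r) = −σ_R h_f r / (E h_w) for a wafer held flat, with the sign chosen so that a tensile film contracts the wafer. That form, the function names, and the numerical values (a 100 nm film at 500 MPa on a 775 µm wafer with a 180 GPa biaxial modulus) are assumptions made for illustration only.

```python
def uniform_film_strain(film_stress, film_thickness, wafer_thickness,
                        biaxial_modulus):
    """Mid-plane strain of a flat-chucked wafer under a uniform film stress."""
    return -film_stress * film_thickness / (biaxial_modulus * wafer_thickness)


def chucked_ipd(radius, film_stress, film_thickness, wafer_thickness,
                biaxial_modulus):
    """Radial in-plane displacement (m) at `radius` (m) after chucking."""
    strain = uniform_film_strain(film_stress, film_thickness,
                                 wafer_thickness, biaxial_modulus)
    return strain * radius


# Hypothetical film and wafer parameters.
FILM = dict(film_stress=500e6, film_thickness=100e-9,
            wafer_thickness=775e-6, biaxial_modulus=180e9)

magnification_ppm = uniform_film_strain(**FILM) * 1e6
u_edge = chucked_ipd(0.150, **FILM)

# A magnification correction subtracts strain * r, leaving no residual.
residual_edge = u_edge - uniform_film_strain(**FILM) * 0.150

if __name__ == "__main__":
    print(f"magnification: {magnification_ppm:.3f} ppm")
    print(f"IPD at r = 150 mm: {u_edge * 1e9:.1f} nm")
    print(f"residual after magnification correction: {residual_edge:.1e} m")
```

With these assumed values the uncorrected edge displacement is tens of nanometers, far larger than a sub-10 nm overlay budget, yet it vanishes entirely once the magnification term is removed.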
Non-uniform film stress distributions (i.e., stresses that vary with location on the wafer) can be induced as a result of non-uniform deposition processes or via variations in pattern density across the field or wafer. Finite element modeling results have demonstrated that subtle variations (e.g., 10%) in residual stress across the wafer can lead to non-correctable IPD.
In order to quantitatively link PIWG measurements to overlay, a mechanics model that considers non-uniform residual stress fields is required. A simple analytical model that assumes a non-uniform residual stress distribution shows that the IPD after application of linear corrections is linearly proportional to the corrected slope of the OPD change induced by the residually stressed film (i.e., the residual after applying corrections to a map of the slope change of the wafer shape due to processing).
The model, which is derived under a number of assumptions, suggests that the overlay error is equal to the product of the shape-slope-residual and 1/6 of the wafer thickness. This basic relationship between overlay and slope was verified via finite element simulations in which the deformations of wafers with films with non-uniform stress were simulated.
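The slope-based relationship can be sketched in a few lines of Python. The example below computes the slope of a shape-change map, removes a least-squares plane from each slope component as a stand-in for linear corrections, and scales the residual by one sixth of the wafer thickness. Fitting each component independently, working on a square grid with no wafer-edge mask, the sign convention, the function names, and the synthetic maps are all simplifying assumptions; real exposure-tool correction models are more elaborate.

```python
import numpy as np


def remove_linear_terms(values, x, y):
    """Subtract the least-squares plane a + b*x + c*y from a 2-D map."""
    design = np.column_stack([np.ones(x.size), x.ravel(), y.ravel()])
    coeffs, *_ = np.linalg.lstsq(design, values.ravel(), rcond=None)
    return values - (design @ coeffs).reshape(values.shape)


def predicted_ipd(shape_change, pixel_size, wafer_thickness, x, y):
    """Estimate non-correctable IPD (m) as (h / 6) * shape-slope-residual."""
    slope_y, slope_x = np.gradient(shape_change, pixel_size, edge_order=2)
    scale = wafer_thickness / 6.0
    return (scale * remove_linear_terms(slope_x, x, y),
            scale * remove_linear_terms(slope_y, x, y))


# Synthetic 300 mm x 300 mm maps on a 1 mm grid (assumed values).
pixel = 1e-3
axis = np.linspace(-0.15, 0.15, 301)
X, Y = np.meshgrid(axis, axis)
THICKNESS = 775e-6

uniform_bow = 0.5 * 1e-3 * (X**2 + Y**2)  # uniform curvature change
non_uniform = uniform_bow + 1e-3 * X**3   # adds a varying curvature

ipd_uniform_x, ipd_uniform_y = predicted_ipd(uniform_bow, pixel, THICKNESS, X, Y)
ipd_varying_x, ipd_varying_y = predicted_ipd(non_uniform, pixel, THICKNESS, X, Y)

if __name__ == "__main__":
    print(f"uniform case, max |IPD|: {np.abs(ipd_uniform_x).max():.2e} m")
    print(f"varying case, max |IPD|: {np.abs(ipd_varying_x).max() * 1e9:.2f} nm")
```

The uniform-curvature map produces a slope that is linear in position, so its residual is zero to numerical precision, while the map with spatially varying curvature leaves a residual of a few nanometers that linear corrections cannot remove.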
Figure 6 shows maps of the simulated overlay and shape-slope-residual obtained from the finite element simulations as well as correlation plots demonstrating the relationship between these two quantities. This simple relationship between slope and overlay is useful in a wide range of practical cases, but can break down if the stress distribution is highly non-axisymmetric as the deformations of the wafer become more complex. In these more complicated cases, computational mechanics models can be used to quantitatively link PIWG and overlay.
Figure 6: Results of a finite element simulation of a wafer with a film with a non-uniform radial stress distribution. In the top row, the predicted overlay (left) and shape-slope-residual, which is termed PWG-IPD, (right) are shown. The lower plots show the correlation between overlay and PWG-IPD. Figure reproduced from .
Recent work done as a collaboration between IBM and KLA-Tencor has experimentally demonstrated the ability to predict overlay from wafer geometry measurements. Using specially developed engineering stress monitor (ESM) wafers that had prescribed stress non-uniformity, a strong correlation between the local shape-slope-residual and the non-correctable component of IPD on the pattern surface of the wafer was demonstrated. In subsequent work, the use of PIWG measurements in a feed-forward scheme to reduce overlay error was demonstrated.
For example, for a wafer with a radial stress variation, the x-overlay was reduced from 21 nm to 5.5 nm and the y-overlay was reduced from 17.5 nm to 7.5 nm by using PIWG data to set corrections in a feed-forward manner.
This paper has highlighted examples of the impact and use of PIWG measurements in advanced semiconductor manufacturing. As error budgets shrink, tighter control of process-induced distortions and wafer geometry is essential. Measurements of PIWG throughout the process flow provide quantitative information about the distortions that are introduced in the wafer. With appropriate analyses, which often involve mechanics-based models that capture the wafer deformation or process physics, these PIWG measurements can be used to provide feedback for process refinement or fed forward to subsequent processes to apply suitable corrections.