Sneak Preview 2: Outliers, Metric Transformation, and ES Distribution
Posted: May 31, 2012 | Author: A. R. Hafdahl | Filed under: Sneak Preview | Tags: assumption violation, between-studies variance component, conditional variance, correlation, effect size, fixed effect, heterogeneity, interval estimation, meta-analysis, outlier, random effect, substantive application
My previous three posts on fitting models to effect sizes (ESs), Parts 5a, 5b, and 5c, were the core of my seven-part overview of meta-analysis. With only two posts remaining in the overview, I’ll pause again to describe three more methodological issues I plan to discuss: potential outliers, transforming ES metrics, and the distribution of ES parameters. As in my first sneak preview, which covered degraded ESs and tricky conditional variances (CVs), I’ll keep these “teaser” descriptions fairly short, mainly to pique your interest; each issue deserves at least one dedicated post with more detail.
Potential Outliers
In Part 4 of my meta-analysis overview I mentioned checking for outlier ESs as one aspect of data exploration. Detecting and managing outliers is often challenging even in primary studies, and these tasks are complicated further in meta-analyses. One working definition of an outlier is that it’s a case generated by a different process than most others in a sample. Here’s how this applies to meta-analysis:
- Case: This is usually a primary study or other entity for which we have an ES estimate and associated values (e.g., CV, ES features).
- Different process: This might involve aspects of the study’s conduct or reporting or something about how we extracted, recorded, or manipulated its information; it could be due to errors or deliberate differences in procedures.
An outlier’s value(s) needn’t be extreme (e.g., due to smaller variance), and a case with an extreme value might not be an outlier (e.g., due to randomness). Because outliers represent a different phenomenon than the rest of a sample, meta-analyzing them together with non-outliers can distort results.
Several issues associated with outliers are common to primary studies and meta-analyses. Broadly speaking, most approaches for dealing with outliers entail either excluding them from analyses or using robust or resistant analyses that are relatively immune to them. Excluding outliers requires first identifying them, which involves deciding whether each case is an outlier. This error-prone classification is often based on inspecting plots for aberrant values on certain variables; using statistical measures of each case’s departure from others, usually relative to an assumed model (e.g., residuals, leave-1-out estimates); or checking for errors or other indicators of a distinct process. In contrast, robust and resistant methods typically entail reducing extreme values’ influence on results, such as by using special weights, objective functions, or distributions (e.g., heavy-tailed, mixture).
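To make the “statistical measures” idea concrete, here is a minimal Python sketch of leave-1-out standardized residuals under a common-effect model with inverse-variance weights. The function name, the five ES estimates, and their CVs are hypothetical values chosen only for illustration; this is one simple diagnostic among many, not a recommended outlier rule.

```python
import math

def leave_one_out_residuals(y, v):
    """Standardized residual of each ES estimate relative to the
    inverse-variance weighted mean of all OTHER estimates."""
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    sum_wy = sum(wi * yi for wi, yi in zip(w, y))
    residuals = []
    for yi, vi, wi in zip(y, v, w):
        w_rest = sum_w - wi                    # total weight without study i
        mu_rest = (sum_wy - wi * yi) / w_rest  # leave-1-out mean estimate
        # Variance of (y_i - mu_rest): study i's CV plus the variance of mu_rest
        se = math.sqrt(vi + 1.0 / w_rest)
        residuals.append((yi - mu_rest) / se)
    return residuals

# Hypothetical ES estimates and conditional variances (illustration only)
y = [0.20, 0.25, 0.18, 0.22, 0.90]
v = [0.010, 0.020, 0.015, 0.010, 0.010]

z = leave_one_out_residuals(y, v)
most_aberrant = max(range(len(z)), key=lambda i: abs(z[i]))
```

Because each residual is scaled by the study’s own CV, the same raw departure counts for less in an imprecise study than in a precise one. In these made-up data the last study stands out clearly, but a large residual alone doesn’t establish that a distinct process generated it.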
Certain attributes of meta-analytic data complicate the two broad approaches noted above. These complications are due mainly to differences among studies. I described conventional models and methods for such data in Parts 5a, 5b, and 5c of my overview of meta-analysis. Even under a simple homogeneous fixed-effects (SHoFE) model we should recognize differentially precise ES estimates (i.e., heterogeneous CVs) when excluding outliers or using robust or resistant methods. When the SHoFE model holds exactly, for example, a sample of ES estimates (yi) with varying CVs (vi) will be leptokurtic, and we should judge a given study’s departure from others relative to its CV (e.g., when μ = 1.0, yi > 1.3 is much less likely if vi = 0.1² than if vi = 0.3²).
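The parenthetical example can be checked with a few lines of Python. This sketch assumes normally distributed ES estimates and reads the two CVs as 0.1² and 0.3² (i.e., standard errors of 0.1 and 0.3); the function name is mine.

```python
import math

def upper_tail(y, mu, v):
    """P(Y > y) when Y ~ N(mu, v), with v the conditional variance."""
    z = (y - mu) / math.sqrt(v)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Assumed CVs: 0.1**2 (precise study) and 0.3**2 (imprecise study)
p_precise = upper_tail(1.3, 1.0, 0.1 ** 2)    # about 0.001
p_imprecise = upper_tail(1.3, 1.0, 0.3 ** 2)  # about 0.16
```

Under these assumptions the same estimate of 1.3 is more than 100 times as likely in the less precise study, which is why a fixed cutoff on the raw ES scale is a poor outlier screen.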
Variation in ES parameters among studies induces more complications, such as distinguishing outliers from other sources of between-studies heterogeneity that require further assumptions (e.g., distribution of random effects, form of covariate effects). Handling outliers becomes even more difficult with increasingly complex meta-analytic data, models, or methods, such as with multivariate or hierarchically dependent ESs or in the presence of publication bias.
Other interesting matters warrant mention. For instance, subject-level outliers in a primary study can influence meta-analyses, and authors have proposed ES estimates that are less influenced by such outliers. Also, techniques have been developed to handle outliers when combining p values instead of using models and methods for ESs. Finally, whether and how we handle outliers can influence meta-analytic results considerably, and documenting these procedures and reporting them transparently can help avoid misleading findings or conclusions.
Transforming ES Metrics
We might wish to re-express certain meta-analytic results in a different ES metric, such as to facilitate interpretation by stakeholders or comparison with previous findings. For instance, as suggested in Parts 1 and 5a of my meta-analysis overview, suppose we meta-analyze ESs that conform reasonably well to assumptions of our model and procedures, such as Fisher z-transformed correlations; we might want to report estimates of correlation parameters’ mean and variance in the more familiar Pearson-r metric.
Considering univariate ESs for simplicity, let’s call the metric of ESs used in analyses the analysis metric and denote its ES parameter for Study i by Θi (a random variable) or θi (a known value or realization of Θi). Now suppose we’re interested in a different ES metric that can be expressed as a function g of the analysis metric, where g can take numerous forms but is usually nonlinear, monotonic, and smooth. Let’s call this new ES metric the interpretation metric and denote its ES parameter by Γi = g(Θi) or γi = g(θi). For example, if θi is a Fisher z-transform, then its hyperbolic tangent γi = tanh(θi) is the corresponding Pearson-r correlation.
When ES parameters are homogeneous, re-expressing meta-analytic results in the interpretation metric is relatively easy: Under the SHoFE model the interpretation metric’s common ES (μγ) is just g applied to the analysis metric’s common ES (μθ), so it’s sensible to use the usual estimate μ̂θ to estimate μγ as μ̂γ = g(μ̂θ). Similarly, it’s sensible to construct a confidence interval (CI) for μγ by applying g to a CI for μθ.
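For the Fisher-z and Pearson-r pair, the homogeneous case amounts to pushing the point estimate and both CI endpoints through tanh. The estimate and standard error below are hypothetical numbers chosen for illustration, and the normal-theory 95% CI is an assumption of this sketch.

```python
import math

mu_theta_hat = 0.50  # hypothetical common-ES estimate, Fisher-z metric
se_theta = 0.05      # hypothetical standard error of that estimate
z_crit = 1.959964    # 97.5th percentile of the standard normal

# 95% CI in the analysis (Fisher-z) metric
ci_theta = (mu_theta_hat - z_crit * se_theta,
            mu_theta_hat + z_crit * se_theta)

# Re-express in the interpretation (Pearson-r) metric with g = tanh
mu_gamma_hat = math.tanh(mu_theta_hat)
ci_gamma = tuple(math.tanh(bound) for bound in ci_theta)
```

Because tanh is monotonic increasing, the transformed endpoints keep their order and the interval keeps its nominal coverage; it’s no longer symmetric about the point estimate, however.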
Re-expressing results for heterogeneous ES parameters is harder. Ignoring study-level covariates for now, let’s consider the simple random-effects (SRE) model. Suppose ES parameters in the analysis metric have a particular distribution with mean μΘ and variance τΘ². Switching to the interpretation metric, which amounts to applying g to all ES parameters (e.g., for all studies), yields new ES parameters that have a different distribution with mean μΓ and variance τΓ². Except in rare special situations (e.g., linear g), hyperparameters in the interpretation metric are not simple functions of their analysis-metric counterparts. For instance, usually we can’t estimate μΓ by simply applying g to an estimate of μΘ, because
μΓ = E[g(Θ)] ≠ g[E(Θ)] = g(μΘ) .
The middle inequality is the critical impediment: The mean of a function of a random variable isn’t in general that function of the random variable’s mean. To illustrate this with Fisher z-transforms and Pearson-r correlations as the analysis and interpretation metrics, respectively, suppose μΘ = 1.0: Applying g to this directly yields tanh(1.0) ≈ .762, which is larger than μΓ when Θ ~ N(1.0, τΘ²); for example, μΓ ≈ .758 if τΘ = 0.1, and μΓ ≈ .734 if τΘ = 0.3.
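These values can be reproduced by numerically integrating tanh against a normal density. The sketch below uses a plain trapezoid rule over ±8 standard deviations; the function name, integration range, and step count are my choices, not part of any established procedure.

```python
import math

def mean_of_tanh(mu, tau, half_width=8.0, n_steps=4000):
    """Approximate E[tanh(Theta)] for Theta ~ N(mu, tau^2) by the
    trapezoid rule on [mu - half_width*tau, mu + half_width*tau]."""
    lo = mu - half_width * tau
    step = 2.0 * half_width * tau / n_steps
    norm_const = tau * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(n_steps + 1):
        theta = lo + k * step
        density = math.exp(-0.5 * ((theta - mu) / tau) ** 2) / norm_const
        weight = 0.5 if k in (0, n_steps) else 1.0  # trapezoid end weights
        total += weight * density * math.tanh(theta)
    return total * step

naive = math.tanh(1.0)                # g applied to the mean: about .762
mu_gamma_01 = mean_of_tanh(1.0, 0.1)  # about .758
mu_gamma_03 = mean_of_tanh(1.0, 0.3)  # about .734
```

The gap between g(μΘ) and μΓ grows with τΘ because tanh is concave for positive arguments, so spreading Θ around 1.0 pulls the mean correlation down.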
Authors have developed techniques to re-express meta-analytic results for Θ as their counterparts for Γ. One relatively simple point-estimation strategy for μΓ or τΓ² entails integrating appropriate quantities over an estimated distribution for Θ, and another entails approximating g by a low-order polynomial (e.g., truncated Taylor series). One proposed inference strategy entails transforming endpoints of a CI for μΘ to obtain a CI for μΓ; others involve techniques such as the delta method and bootstrapping.
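As a rough sketch of the polynomial strategy for g = tanh, a second-order Taylor expansion about μΘ gives μΓ ≈ g(μΘ) + ½ g″(μΘ) τΘ², and a first-order expansion gives τΓ² ≈ [g′(μΘ)]² τΘ². The function name is hypothetical, and in practice estimates would replace the true hyperparameters used here.

```python
import math

def tanh_taylor_hyperparameters(mu_theta, tau_theta):
    """Taylor-series approximations to the mean and variance of
    tanh(Theta) when Theta has mean mu_theta and SD tau_theta."""
    r = math.tanh(mu_theta)
    g1 = 1.0 - r * r    # first derivative of tanh at mu_theta
    g2 = -2.0 * r * g1  # second derivative of tanh at mu_theta
    mu_gamma = r + 0.5 * g2 * tau_theta ** 2  # second-order mean
    tau2_gamma = (g1 * tau_theta) ** 2        # first-order variance
    return mu_gamma, tau2_gamma

mu_gamma_approx, tau2_gamma_approx = tanh_taylor_hyperparameters(1.0, 0.3)
```

With μΘ = 1.0 and τΘ = 0.3 the approximate mean is about .733, close to the .734 quoted earlier for a normal Θ. The approximation needs only the first two moments of Θ, but it can degrade when τΘ is large or g is sharply curved near μΘ.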
Proposed re-expression procedures haven’t been well-studied in realistic circumstances, and some of them require non-trivial computations. Nevertheless, they’re promising ways to translate meta-analytic findings to better inform practice, policy, or substantive theory. I’m especially interested in extensions of these procedures for models with study-level covariates or multivariate ESs; for instance, we could meta-analyze (possibly incomplete) Fisher-z correlation matrices and transform the results to estimate and make inferences about the mean and variance of indirect-effect parameters (e.g., in mediation analysis) or the mean and covariance matrix of path-model parameters.
Distribution of ES Parameters
Conventional random-effects models assume ES parameters arise from a distribution. Although meta-analysts typically focus on this distribution’s mean—possibly as a function of covariates—and (residual) variance, we might want to know more about it for at least two reasons. First, some random-effects procedures require assuming this distribution is from a specific family (e.g., normal), and violating that assumption might distort certain results. Second, substantive questions might concern this distribution’s density function, general shape (e.g., uni- vs. bimodal), or particular attributes, such as the following:
- higher-order moments: skewness, kurtosis, etc.
- tail or central proportions: probability that ES is (not) positive, probability that ES is (not) negligibly small, etc.
- quantiles or boundaries of other ES sets: 3rd quartile, 10th percentile, “middle” 95% (e.g., equal tail, highest density region), etc.
We might also be interested in pairs or larger sets of attributes, such as skewness and kurtosis as a paired measure of (non)normality. Furthermore, our questions about the ES-parameter distribution might pertain to a different ES metric—as in the previous section—and could be extended to multivariate ESs or models with study-level covariates.
Research on ES-parameter distributions in meta-analysis seems to be rather sparse. Some authors have investigated consequences of violating assumptions about this distribution, and a few have proposed methods for relaxing distributional assumptions or estimating the distribution. For these latter problems, Bayesian approaches to meta-analysis might be particularly well-suited.
To illustrate a relatively simple density-estimation procedure in a situation where normality seems dubious, I’ll use Becker’s (2009) data on anxiety and sport performance.
Becker, B. J. (2009). Model-based meta-analysis. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 377-395). New York: Russell Sage Foundation.
Below are the 10 sample sizes and Pearson-r correlations between measures of cognitive anxiety and sport performance from her Table 20.1, sorted by correlation and ignoring the distinction between team versus individual sports.
- n: 142, 51, 14, 30, 70, 45, 128, 100, 16, 37
- r: -.55, -.52, -.39, -.27, -.01, .10, .14, .23, .44, .53
These correlations exhibit substantial between-studies heterogeneity, and we can estimate the distribution of correlation parameters using a relatively simple mixture strategy. Here’s the basic idea: Use an EM algorithm to estimate the mean and variance of each study’s posterior distribution of Fisher-z parameters, then estimate the distribution of Fisher-z or Pearson-r parameters as an equally weighted mixture of these posteriors. Figure 1 shows the estimated mixture density (i.e., p.d.f.) for Fisher-z parameters, with a superimposed normal density that has the same mean (-0.04) and variance (0.36²), and Figure 2 shows the Pearson-r counterparts.

Figure 1. Density (p.d.f.) for Fisher-z parameters estimated from Becker’s (2009) 10 correlations using mixture (black) or normal (red).

Figure 2. Density (p.d.f.) for Pearson-r parameters estimated from Becker’s (2009) 10 correlations using mixture (black) or normal (red).
Although sampling error in these estimates should be considered, there’s a clear indication of bimodality. We could, in principle, use this mixture density to estimate and make inferences about various attributes of these correlation parameters’ distribution. For example, the percentage of negative correlations is 47.9% based on the mixture estimate and 54.5% based on its normal counterpart.
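The mixture strategy described above can be sketched in Python as follows. This is one plausible reading of the procedure, not necessarily the exact implementation behind the figures. It assumes a normal-normal model (Fisher-z estimate zi ~ N(θi, vi) with vi = 1/(ni − 3), and θi ~ N(μ, τ²)), a standard EM iteration for μ and τ², and normal posteriors; all function and variable names are mine.

```python
import math

# Sample sizes and correlations listed above (Becker, 2009, Table 20.1)
n = [142, 51, 14, 30, 70, 45, 128, 100, 16, 37]
r = [-.55, -.52, -.39, -.27, -.01, .10, .14, .23, .44, .53]

z = [math.atanh(ri) for ri in r]      # Fisher-z estimates
v = [1.0 / (ni - 3) for ni in n]      # assumed CVs of Fisher-z estimates
k = len(z)

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, sd):
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))

# EM for the normal-normal model, started at the unweighted moments
mu = sum(z) / k
tau2 = sum((zi - mu) ** 2 for zi in z) / k
for _ in range(2000):
    # E-step: each study's posterior mean and variance given (mu, tau2)
    shrink = [tau2 / (tau2 + vi) for vi in v]
    post_mean = [s * zi + (1.0 - s) * mu for s, zi in zip(shrink, z)]
    post_var = [s * vi for s, vi in zip(shrink, v)]
    # M-step: update hyperparameters from the posterior moments
    mu = sum(post_mean) / k
    tau2 = sum(pv + (pm - mu) ** 2 for pm, pv in zip(post_mean, post_var)) / k

def mixture_pdf_z(theta):
    """Equally weighted mixture of the study posteriors (Fisher-z metric)."""
    return sum(normal_pdf(theta, pm, math.sqrt(pv))
               for pm, pv in zip(post_mean, post_var)) / k

def mixture_pdf_r(rho):
    """Same density in the Pearson-r metric via change of variables."""
    return mixture_pdf_z(math.atanh(rho)) / (1.0 - rho * rho)

# Proportion of negative correlation parameters under each estimate
p_neg_mixture = sum(normal_cdf(0.0, pm, math.sqrt(pv))
                    for pm, pv in zip(post_mean, post_var)) / k
p_neg_normal = normal_cdf(0.0, mu, math.sqrt(tau2))
```

Under these assumptions the EM estimates come out near μ = -0.04 and τ = 0.36, and the two proportions are close to the 47.9% and 54.5% reported above. At convergence the mixture has the same mean and variance as the fitted normal, so the two estimates differ only in shape.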
With that I’ll end this second sneak preview. For each of the three topics I’ve addressed above, I’ve only scratched the surface. I hope to write one or more subsequent posts on each of these. I’d welcome comments about which topic is most interesting or about particular issues to address.