Overview of Meta-Analysis, Part 5b (of 7): Primary Meta-Analyses (cont.)

Posted: April 30, 2012 | Author: A. R. Hafdahl | Filed under: Overview of Meta-Analysis | Tags: between-studies variance component, conditional variance, fixed effect, heterogeneity, interval estimation, math notation, meta-analysis, meta-regression, random effect, significance testing, standardized mean difference

This is the second of three posts in Part 5 of my overview of meta-analysis. In Part 5a I described six conventional models for meta-analysis, each of which combines within-study and between-studies models. In this second post I first comment on nested models then describe estimation and inference for two models without covariates—procedures for fitting these models to effect-size (ES) estimates and quantifying uncertainty about their focal (hyper)parameters. In the third post, Part 5c, I’ll do the same for two models with covariates and also comment on extensions and variants of these models and procedures.

Nested Models

As a precursor to estimation and inference, it’s useful to note certain relations among the six models I presented. To that end, below I list them in combined linear-model form. Where relevant we assume E_i ~ N(0, σ_i²) with known conditional variance (CV) σ_i², E(U_i) = 0, Var(U_i) = τ², and E_i and U_i are independent. (For some procedures we further assume U_i ~ N.)

SHoFE: Y_i = μ + E_i
SHeFE: Y_i = μ + η_i + E_i
SRE: Y_i = μ + U_i + E_i
MHoFE: Y_i = x_iβ + E_i
MHeFE: Y_i = x_iβ + η_i + E_i
MRE: Y_i = x_iβ + U_i + E_i

Some meta-analytic procedures involve comparing nested pairs of these models, at least implicitly. For present purposes, let’s consider Model A nested within Model B if constraining (hyper)parameters in Model B to specific values yields (a model equivalent to) Model A.

For example, the SHoFE model is nested within all five others: We can arrive at it by constraining quantities in the SHeFE model (η_i = 0), the SRE model (U_i = 0 or τ² = 0), the MHoFE model (x_i = 1 and β = μ), and so on. Similarly, the SHeFE and SRE models are nested within the MHeFE and MRE models, respectively, by the constraints x_i = 1 and β = μ, and the MHoFE model is nested within both the MHeFE model (η_i = 0) and the MRE model (U_i = 0 or τ² = 0).^F1 Furthermore, any model that permits covariates can include different sets of covariates, and two versions of such a model with nested sets of covariates (e.g., 1 set is a subset of the other) are nested models.

Comparing nested models essentially involves assessing a tradeoff between adequacy and parsimony, given that more complex models tend to fit a data set better than simpler models: It’s sensible to prefer a simpler (more complex) model whose gain in parsimony (adequacy) is large relative to its loss in adequacy (parsimony).^F2 This principle can be used to assess whether a mean effect size (ES) is plausibly some specific value (e.g., μ = 0, μ = 1/2), whether ES parameters are plausibly homogeneous (e.g., η_i = 0, τ² = 0), and whether one or more covariates’ associations with ES parameters are plausibly 0 or other specific values (e.g., for non-intercept elements of β). I’ll mention such comparisons occasionally when describing meta-analysis procedures.

The two heterogeneous fixed-effects models, SHeFE and MHeFE, seem to be used rarely. For this reason and to conserve resources, I won’t discuss them further in this overview. Interested readers might refer to the following articles about such models and associated procedures:

Bonett, D. G. (2008). Meta-analytic interval estimation for bivariate correlations. Psychological Methods, 13, 173-181. doi:10.1037/a0012868

Bonett, D. G. (2009). Meta-analytic interval estimation for standardized and unstandardized mean differences. Psychological Methods, 14, 225–238. doi:10.1037/a0016619

Bonett, D. G. (2010). Varying coefficient meta-analytic methods for alpha reliability. Psychological Methods, 15, 368–385. doi:10.1037/a0020142

Overton, R. C. (1998). A comparison of fixed-effects and mixed (random-effects) models for meta-analysis tests of moderator variable effects. Psychological Methods, 3, 354-379. doi:10.1037/1082-989X.3.3.354

Estimation and Inference: Models without Covariates

In this section I describe common procedures for estimating and making inferences about (hyper)parameters in two of the above models without covariates: SHoFE and SRE. Each subsection below focuses on one of these. Common meta-analytic procedures for these models and those with covariates share several attributes, such as using weighted least-squares (WLS) estimators for fixed effects (e.g., μ or β), with weights based on CVs.

Aside about notation: I haven’t yet figured out how to typeset non-trivial mathematical expressions in blog posts (time to learn LaTeX!), so for now I’ll denote estimates of (hyper)parameters with a caret prefix (e.g., ^μ is an estimate of μ) and denote summation over k studies using ∑_i, where i runs from 1 to k (e.g., ∑_iy_i = y₁ + y₂ + … + y_k). (end of aside)

SHoFE. The main statistical tasks under this model are to estimate and make inferences about μ, the common ES parameter. A widely used procedure for accomplishing these tasks is a simple WLS method that yields a point estimate of μ and this estimate’s variance. This point estimate is just a precision-weighted mean of the ES estimates; the optimal weights—which minimize the estimator’s sampling variance or maximize its precision—are reciprocals of CVs, 1 / σ_i², but as described in this overview’s Part 2 we often estimate these weights based on CV estimates, w_i = 1 / v_i. In terms of estimated weights, the WLS point estimate is

^μ_F = ∑_iw_iy_i / ∑_iw_i ,

and its variance is

Var(^μ_F) = 1 / ∑_iw_i .

This point estimate and variance are typically used to make standard-normal inferences, such as a confidence interval (CI) for or test of μ. Specifically, we could construct a 100(1 − α)% equal-tail CI for μ as

^μ_F ± z_αSE(^μ_F) ,

where z_α = −Φ⁻¹(α/2) and Φ⁻¹ is the standard-normal quantile function (e.g., for a 95% CI z_.05 = 1.960), and SE(^μ_F) = √[Var(^μ_F)] is ^μ_F’s standard error (SE). Likewise, to test the null hypothesis H₀: μ = μ₀, where μ₀ is an a priori value, we could refer the statistic

z_F = (^μ_F − μ₀) / SE(^μ_F)

to a standard-normal reference distribution to obtain a p value.^F3
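As a minimal sketch of the WLS computations above, the following Python function takes ES estimates (y_i) and CV estimates (v_i) and returns ^μ_F, its SE, an equal-tail CI, and z_F with a two-tailed p value. The function name fit_shofe, its arguments, and the three-study toy data are illustrative assumptions, not part of any package or of the analyses discussed in this overview.

```python
from math import sqrt
from statistics import NormalDist


def fit_shofe(y, v, mu0=0.0, alpha=0.05):
    """Inverse-variance WLS analysis under the SHoFE model (illustrative sketch)."""
    w = [1.0 / vi for vi in v]                       # estimated weights w_i = 1 / v_i
    sum_w = sum(w)
    mu_f = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    se = sqrt(1.0 / sum_w)                           # SE = sqrt(1 / sum of weights)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2)   # about 1.960 when alpha = .05
    z_f = (mu_f - mu0) / se                          # test statistic for H0: mu = mu0
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z_f)))     # two-tailed p value
    ci = (mu_f - z_crit * se, mu_f + z_crit * se)
    return {"mu": mu_f, "se": se, "ci": ci, "z": z_f, "p": p}


# Hypothetical toy data: three ES estimates and their CV estimates.
result = fit_shofe(y=[0.2, 0.5, 0.8], v=[0.04, 0.01, 0.04])
print(result)
```

With these toy values the weights are 25, 100, and 25, so the middle study dominates and the point estimate is 0.5. The sketch treats the weights as known, so it inherits the caveats about standard-normal inferences with estimated weights.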

Because Var(^μ_F) is not known when w_i is an estimate, standard-normal inferences might not perform as advertised (e.g., CI coverage rate below nominal, inflated Type I error rate for tests). Other potential problems include non-normality of ES estimators—especially with small samples of subjects—and non-independence of ESs. Strategies to address these problems are beyond the present scope but could entail updating weights iteratively when σ_i² depends on θ_i (= μ), using alternative ES estimators that are more normal or whose CVs are more nearly known, or eliminating or combining dependent ESs or modeling their dependence.

Another statistical task is to decide whether the model is adequate or less appropriate than another model. This falls under the general statistical problem of model selection, which is challenging in many contexts. One or more aspects of the SHoFE model could be inappropriate for our data, but perhaps the most commonly assessed aspect is between-studies homogeneity of ES parameters (i.e., θ_i = μ for all i). A popular way to assess this assumption is to test H₀: θ_i = μ, which is often done using the following heterogeneity statistic:

Q = ∑_iw_i(y_i − ^μ_F)² .

If H₀ is true (and other assumptions underlying the SHoFE model are satisfied), this weighted sum of squares follows a χ²(k − 1) distribution. Essentially, this “specification” test evaluates whether our collection of ES estimates varies more than we’d expect based on their CVs; a statistically significant upper-tail test suggests there’s excess variation due to between-studies heterogeneity of ES parameters. It’s an omnibus test designed to detect any departure from homogeneity, so it’s not tailored to a specific pattern of heterogeneity (e.g., different ES parameters for 2 subsets of studies).
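The Q statistic and its upper-tail p value can be sketched in Python with only the standard library. The helper chi2_sf below is a hypothetical name; it evaluates the χ² upper-tail probability through the power series for the regularized lower incomplete gamma function, which is adequate for moderate Q but loses accuracy once the p value falls near machine precision. The toy data are made up for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for X ~ chi-squared(df), via a power series."""
    if x <= 0:
        return 1.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 100000):
        term *= t / (a + n)          # next term of the incomplete-gamma series
        total += term
        if term < 1e-16 * total:     # stop once further terms are negligible
            break
    lower = total * math.exp(a * math.log(t) - t - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def q_test(y, v):
    """Heterogeneity statistic Q, its degrees of freedom, and upper-tail p value."""
    w = [1.0 / vi for vi in v]
    mu_f = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mu_f) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical toy data: three ES estimates and their CV estimates.
q, df, p = q_test([0.2, 0.5, 0.8], [0.04, 0.01, 0.04])
print(q, df, p)
```

For the toy data Q = 4.5 on 2 degrees of freedom, so the homogeneity test would not reject at α = .05.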

This homogeneity test is a topic of controversy. Meta-analysts often misuse it to guide or defend data-analysis choices. Its performance depends on several features of the data, such as how well our ES estimators and data-collection process conform to the SHoFE model. Rejecting homogeneity doesn’t guarantee there’s some type of heterogeneity (e.g., it might be a Type I error), provide a measure of any such heterogeneity’s real-world importance, or tell us which of countless alternative models is appropriate. Likewise, failing to reject homogeneity doesn’t rule out definitively some type of heterogeneity (e.g., it might be a Type II error) or preclude detecting a specific pattern of heterogeneity (e.g., a covariate effect). Other proposed ways to assess homogeneity, such as descriptive measures of the magnitude of heterogeneity or its influence on certain results (e.g., H², I²), are beyond the present scope.

Example—Workplace Exercise: Let’s illustrate SHoFE analyses using Conn, Hafdahl, Cooper, Brown, and Lusk’s (2009) quantitative review of workplace exercise interventions, described in Part 1 of this overview. Corresponding to each of their (well, our) SRE results in Tables 2 and 3, for three types of standardized mean difference (SMD) on 11 outcome variables, they also conducted SHoFE analyses. In particular, for fitness they analyzed k = 35 two-group posttest SMDs after excluding one outlier.^F4 These estimates and their (estimated) CVs—based on shrinkage estimates of θ_i that I won’t discuss here—yield the following sums needed for SHoFE analyses:

∑_iw_i = 321.7

∑_iw_iy_i = 183.4

∑_iw_iy_i² = 172.1

These in turn yield the WLS point estimate of μ

^μ_F = 183.4 / 321.7 = 0.570

and its variance

Var(^μ_F) = 1 / 321.7 = 0.0558² .

This estimate of the common two-group posttest SMD on fitness represents a treatment mean just over ½ standard deviation (SD) above the control mean, and it’s about 10 times larger than its SE. Using these quantities for standard-normal inferences, we obtain the 95% CI

0.570 ± 1.960(0.0558) = (0.461, 0.679) .

A two-tailed test of the nil null hypothesis H₀: μ = 0 at α₂ = .05 yields the test statistic

z_F = 0.570 / 0.0558 = 10.22 ,

whose p value is 0 to many decimal places. This CI and test reflect only within-study sampling error over hypothetical meta-analyses (due to random sampling of participants), thereby supporting conditional inferences that extend only to studies like Conn et al.’s 35. The test indicates that the common SMD is (statistically) significantly positive, and the CI suggests more specifically that we can be 95% confident—in the somewhat awkward frequentist sense—that this common SMD is between 0.46 and 0.68.

To assess homogeneity we can compute the heterogeneity statistic

Q(34) = 172.1 − (183.4² / 321.7) = 67.6 ,

for which p = .000529. This indicates significant heterogeneity, which suggests these data might violate the SHoFE model’s homogeneity assumption.
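The arithmetic in this example can be retraced from the three reported sums alone. The sketch below does so in Python; variable names are illustrative. Because the sums are rounded to one decimal place, the recomputed Q comes out near 67.5 rather than the reported 67.6, a gap that is presumably due to rounding in the published sums (an assumption on my part, since the unrounded values aren't given here).

```python
from math import sqrt
from statistics import NormalDist

# Sums reported above for the k = 35 fitness SMDs (rounded to 1 decimal place).
k = 35
sum_w, sum_wy, sum_wy2 = 321.7, 183.4, 172.1

mu_f = sum_wy / sum_w                      # WLS point estimate of mu
se_f = sqrt(1.0 / sum_w)                   # its standard error
z_crit = NormalDist().inv_cdf(0.975)       # critical value for a 95% CI
ci = (mu_f - z_crit * se_f, mu_f + z_crit * se_f)
z_f = mu_f / se_f                          # test of H0: mu = 0

# Computational form of Q: sum(w y^2) - (sum(w y))^2 / sum(w)
q = sum_wy2 - sum_wy ** 2 / sum_w

print(f"mu_F = {mu_f:.3f}, SE = {se_f:.4f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), z_F = {z_f:.2f}")
print(f"Q({k - 1}) = {q:.1f}")
```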

SRE. This model’s two hyperparameters, μ and τ², are usefully viewed as the mean and variance (i.e., BSVC) of a (hyper)distribution of ES parameters. We can estimate and make inferences about both of these. Many meta-analysts who use this model focus solely on μ, but some are also interested in τ² or other features of the ES-parameter distribution. Perhaps the most widely used meta-analytic technique for this model is a two-step procedure that entails first obtaining a weighted method-of-moments (WMoM) estimate of τ²; adding this to each study’s CV estimate yields estimated unconditional variances, whose reciprocals are weights in a WLS estimate of μ. Specifically, we first use the SHoFE model’s weights (w_i) and heterogeneity statistic (Q) to estimate τ² as

^τ_S² = max{0, [Q − (k − 1)] / c_S} ,

where taking the maximum avoids negative estimates, and

c_S = ∑_iw_i − (∑_iw_i² / ∑_iw_i) .

For insight into this BSVC estimator, consider the “balanced” case where every study’s CV estimate is v: Because all weights are equal (i.e., w_i = w = 1 / v for all i), ^μ_F is just the simple mean of ES estimates, Q is w times the unweighted sum of squared deviations from this mean, c_S reduces to w(k − 1), and ^τ_S² is either 0 or, when positive, s_y² − v, where s_y² is the usual unbiased variance estimate applied to the ES estimates. When ^τ_S² > 0, re-arranging this yields

s_y² = ^τ_S² + v ,

which represents a decomposition of the ES estimates’ total variance into between-studies and within-study variances (i.e., due to sampling of studies and subjects). Even in the more general situation with unequal v_i, the above BSVC estimate is still essentially the excess variance in ES estimates beyond that due to within-study variance.
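To check the balanced-case reasoning numerically, here is a small Python sketch of the WMoM estimator defined above, applied to made-up ES estimates that share a common CV estimate. The function name dl_tau2 and the data are hypothetical; the point is only that the estimate coincides with s_y² − v when that difference is positive.

```python
from statistics import variance


def dl_tau2(y, v):
    """Weighted method-of-moments estimate of tau^2, truncated at 0."""
    k = len(y)
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    mu_f = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - mu_f) ** 2 for wi, yi in zip(w, y))
    c_s = sum_w - sum(wi ** 2 for wi in w) / sum_w
    return max(0.0, (q - (k - 1)) / c_s)


# Hypothetical balanced data: five ES estimates, each with CV estimate v = 0.05.
y = [0.1, 0.4, 0.5, 0.9, 1.1]
v_common = 0.05

tau2 = dl_tau2(y, [v_common] * len(y))
excess = variance(y) - v_common    # s_y^2 - v, using the unbiased sample variance

print(tau2, excess)
```

Here s_y² = 0.16, so both quantities equal 0.11 up to floating-point error. With unequal v_i the two no longer match exactly, but the estimate still plays the same "excess variance" role.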

At any rate, we next use the BSVC estimate to estimate each study’s unconditional weight as w_Si = 1 / (^τ_S² + v_i). (The somewhat clumsy notation w_Si distinguishes this weight from its counterparts from the SHoFE, MHoFE, and MRE models.) Provided that ^τ_S² > 0, these unconditional weights (w_Si) will be smaller (reflecting lower precision) and more similar to one another than their conditional counterparts (w_i). Now, to estimate μ we simply apply WLS with these new weights:

^μ_R = ∑_iw_Siy_i / ∑_iw_Si .^F5
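The effect of the BSVC estimate on the weights is easy to see numerically. The sketch below takes a value of τ² as given (rather than estimating it) and reports ^μ_R along with the ratio of the largest to smallest unconditional weight. The function name sre_mean and the toy data are illustrative assumptions.

```python
def sre_mean(y, v, tau2):
    """WLS mean with unconditional weights w_Si = 1 / (tau2 + v_i)."""
    w_s = [1.0 / (tau2 + vi) for vi in v]
    mu_r = sum(wi * yi for wi, yi in zip(w_s, y)) / sum(w_s)
    return mu_r, w_s


# Hypothetical toy data with unequal CV estimates.
y = [0.2, 0.5, 0.8]
v = [0.04, 0.01, 0.09]
unweighted_mean = sum(y) / len(y)

results = []
for tau2 in (0.0, 0.05, 0.5, 50.0):
    mu_r, w_s = sre_mean(y, v, tau2)
    ratio = max(w_s) / min(w_s)            # how unequal the weights are
    results.append((tau2, mu_r, ratio))
    print(f"tau2 = {tau2:5.2f}  mu_R = {mu_r:.4f}  max/min weight = {ratio:.2f}")
```

With τ² = 0 the weights are the SHoFE weights; as τ² grows, the weight ratio shrinks toward 1 and the weighted mean moves toward the unweighted mean of the ES estimates.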

As ^τ_S² increases, ^μ_R approaches the ES estimates’ unweighted mean. The mean estimator’s variance,

Var(^μ_R) = 1 / ∑_iw_Si ,

increases with larger ^τ_S²; this is evident in the balanced case (i.e., v_i = v for all i), where Var(^μ_R) = (^τ_S² + v) / k. It’s conventional to use ^μ_R and its variance for standard-normal inferences about μ, such as a CI or test. These procedures face additional limitations besides those for their counterparts under the SHoFE model: Because ^τ_S² and hence Var(^μ_R) are subject to sampling error, standard-normal techniques may perform poorly, especially with few studies (i.e., small k). Moreover, if the CV depends on θ_i it’s unclear what substitute for θ_i in v_i would optimize estimation or inference (e.g., estimate of μ? shrinkage estimate of θ_i?). To overcome some of these limitations, other estimators for τ² have been proposed, as have other methods of inference for μ; they’re beyond this overview’s scope, but the following articles and chapter address several of them:

DerSimonian, R., & Kacker, R. (2007). Random-effects model for meta-analysis of clinical trials: An update. Contemporary Clinical Trials, 28, 105-114. doi:10.1016/j.cct.2006.04.004

Raudenbush, S. W. (2009). Analyzing effect sizes: Random-effects models. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 295-315). New York: Russell Sage Foundation.

Sidik, K., & Jonkman, J. N. (2007). A comparison of heterogeneity variance estimators in combining results of studies. Statistics in Medicine, 26, 1964-1981. doi:10.1002/sim.2688

Viechtbauer, W. (2005). Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational & Behavioral Statistics, 30, 261-293. doi:10.3102/10769986030003261

As for inference about τ², we’ve already met the most common procedure (and its limitations): The Q test of the SHoFE model’s homogeneity assumption also tests H₀: τ² = 0. (Readers familiar with random-effects ANOVA may recognize a parallel with the classical 1-way ANOVA for k independent samples’ means, where the test is identical for fixed and random factors.) It’s also possible to test H₀: τ² = τ₀² with a non-zero a priori value for τ₀², but this is rarely done and won’t be addressed here. Constructing a CI for τ² is more common but still fairly rare; the following article provides computational details and may help interested readers find related work (e.g., in citing articles):

Viechtbauer, W. (2007). Confidence intervals for the amount of heterogeneity in meta-analysis. Statistics in Medicine, 26, 37-52. doi:10.1002/sim.2514

Finally, I’ll simply mention other potentially interesting features of the ES-parameter distribution without considering estimation or inference methods. In some situations we might wish to find the proportion of ES parameters below, above, or between selected values (e.g., positive, negligibly small), which involves the cumulative distribution function (CDF). Likewise, finding values that demarcate specific proportions of ES parameters, such as quartiles or percentiles, involves the quantile function (i.e., inverse CDF). For instance, we might wish to express between-studies heterogeneity as an interval or more general set of values in which most ES parameters fall, such as a 95% prediction interval, credibility interval (in validity-generalization parlance), or highest density region. These proportions and quantiles depend on the distribution’s shape, which we might estimate from our data instead of assuming normality. If we permit non-normal ES parameters, we might also be interested in higher-order moments such as skewness or kurtosis.
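As a rough illustration of the CDF and quantile-function ideas only, the sketch below assumes normally distributed ES parameters and plugs in the point estimates from the workplace exercise example in this post (mean 0.579, between-studies SD 0.330) as if they were known. That is a strong simplification: it ignores uncertainty in both estimates, so the interval is a descriptive range, not a proper prediction interval, and the normality assumption is exactly what the paragraph above says we might prefer to relax.

```python
from statistics import NormalDist

# Plug-in values from the workplace exercise example; treated as known (an assumption).
mu_hat, tau_hat = 0.579, 0.330
es_dist = NormalDist(mu=mu_hat, sigma=tau_hat)

# CDF: estimated proportion of ES parameters that are positive.
prop_positive = 1.0 - es_dist.cdf(0.0)

# Quantile function: range holding the central 95% of ES parameters, plus quartiles.
lower, upper = es_dist.inv_cdf(0.025), es_dist.inv_cdf(0.975)
quartiles = [es_dist.inv_cdf(p) for p in (0.25, 0.50, 0.75)]

print(f"proportion positive = {prop_positive:.3f}")
print(f"central 95% range = ({lower:.3f}, {upper:.3f})")
print("quartiles =", [round(q, 3) for q in quartiles])
```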

Example—Workplace Exercise: Let’s illustrate a SRE analysis using again Conn et al.’s (2009) 35 two-group posttest SMDs on fitness. We’ll need ∑_iw_i = 321.7 from the SHoFE analyses as well as ∑_iw_i² = 4063.9. To estimate the BSVC we first compute

c_S = 321.7 − (4063.9 / 321.7) = 309.1 ,

which in turn yields

^τ_S² = [67.6 − (35 − 1)] / 309.1 = 0.330² .

Adding this estimate to each study’s CV estimate and computing unconditional weights (w_Si) yields the following sums:

∑_iw_Si = 149.2

∑_iw_Siy_i = 86.4

These in turn yield the point estimate of μ

^μ_R = 86.4 / 149.2 = 0.579

and its variance

Var(^μ_R) = 1 / 149.2 = 0.0819² .

This estimate of the mean two-group posttest SMD on fitness is only slightly larger than its SHoFE counterpart (for the common SMD). This SRE estimate’s variance is more than twice the SHoFE estimate’s, however, reflecting the substantial BSVC. Using these quantities for standard-normal inferences about μ, we obtain the 95% CI

0.579 ± 1.960(0.0819) = (0.419, 0.740) .

A two-tailed test of the nil null hypothesis H₀: μ = 0 at α₂ = .05 yields the test statistic

z_R = 0.579 / 0.0819 = 7.08 ,

whose p value is 0 to many decimal places. This CI and test reflect both within-study and between-studies sampling error over hypothetical meta-analyses, supporting unconditional inferences that extend to a universe of studies from which our 35 were sampled. The price paid for these broader inferences, relative to their SHoFE counterparts, is a less precise estimate of μ, as reflected in the wider CI and smaller test statistic. The test indicates that the mean SMD is significantly positive, and the CI suggests we can be 95% confident that this mean SMD is between 0.42 and 0.74.
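The SRE example can likewise be retraced from the reported sums. In the Python sketch below the sums of unconditional weights are taken as reported, because recomputing them would require each study’s CV estimate, which isn’t listed here. Variable names are illustrative. With sums rounded to one decimal place, the recomputed test statistic is about 7.07 rather than the reported 7.08, presumably a rounding difference.

```python
from math import sqrt
from statistics import NormalDist

# Reported quantities for the k = 35 fitness SMDs.
k, q = 35, 67.6
sum_w, sum_w2 = 321.7, 4063.9          # sums of SHoFE weights and squared weights
sum_ws, sum_wsy = 149.2, 86.4          # reported sums using unconditional weights

# Step 1: weighted method-of-moments estimate of tau^2.
c_s = sum_w - sum_w2 / sum_w
tau2 = max(0.0, (q - (k - 1)) / c_s)

# Step 2: WLS estimate of mu with unconditional weights, and its inferences.
mu_r = sum_wsy / sum_ws
se_r = sqrt(1.0 / sum_ws)
z_crit = NormalDist().inv_cdf(0.975)
ci = (mu_r - z_crit * se_r, mu_r + z_crit * se_r)
z_r = mu_r / se_r

print(f"c_S = {c_s:.1f}, tau^2 = {tau2:.4f} (tau = {sqrt(tau2):.3f})")
print(f"mu_R = {mu_r:.3f}, SE = {se_r:.4f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), z_R = {z_r:.2f}")
```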

That’s all I’ll say about meta-analytic models without covariates. Stay tuned for Part 5c, in which I’ll describe and demonstrate versions of the above procedures that handle covariates; I’ll also mention some extensions and other variants of these models and procedures.

Footnotes

1. I doubt HeFE models are nested within RE models, but I’m unsure; this is rarely (if ever) discussed. Clearly they’d be equivalent if U_i = η_i, but this constraint isn’t expressed in terms of (hyper)parameters.

2. Comparing non-nested models is trickier but possible.

3. To relate this test to similar tests of fixed effects in more complex models, note that squaring z_F yields a statistic distributed approximately as χ²(1) (i.e., chi-squared with 1 degree of freedom) under H₀. We can write this as a weighted sum of squares comparing two models—one in which μ is estimated freely and another in which it’s constrained to μ₀:

Q_μF = ∑_iw_i(^μ_F − μ₀)² = (^μ_F − μ₀)² / (1 / ∑_iw_i) = [(^μ_F − μ₀) / SE(^μ_F)]² = z_F² .

4. Their analyses accounted for dependence among 4 multiple-treatment pairs and 1 multiple-treatment triplet; for simplicity I’ll instead treat the 35 SMD estimates as independent, which decreases Var(^μ_F) and Q somewhat.

5. Although ^μ_R is a different estimator than the SHoFE model’s ^μ_F, and these estimate different quantities in different models, they sometimes take the same value: when Q ≤ k − 1 and, hence, ^τ_S² = 0 so that w_Si = w_i.
