Overview of Meta-Analysis, Part 5c (of 7): Primary Meta-Analyses (cont.)

Posted: May 13, 2012 | Author: A. R. Hafdahl | Filed under: Overview of Meta-Analysis | Tags: Bayesian analysis, between-studies variance component, dependence, fixed effect, heterogeneity, interval estimation, meta-analysis, meta-regression, model comparison, moderator, multivariate effect size, random effect, significance testing, standardized mean difference

This is the last of three posts in Part 5 of my overview of meta-analysis. In Part 5a I described six conventional meta-analytic models for effect-size (ES) estimates, and in Part 5b I described estimation and inference for two of those models without covariates. In this post I’ll extend the methods of Part 5b to two models with covariates and comment on extensions and other variants of these models and procedures, to hint at the wide variety of situations that arise in meta-analysis. In Parts 6 and 7 of the overview, I’ll address follow-up procedures and ways to report results, respectively.

Estimation and Inference: Models with Covariates

In each subsection below I describe common procedures for estimating and making inferences about (hyper)parameters in two conventional meta-analytic models with one or more study-level covariates: MHoFE and MRE. Each of these models extends its simpler no-covariate counterpart—SHoFE or SRE—by incorporating fixed covariate effects. These effects, contained in the coefficient vector β, are often central to meta-analyses that address ES moderators.

MHoFE. Under this meta-regression model, estimation and inference focus mainly on β, the vector of regression coefficients. The widely used weighted least-squares (WLS) procedures for these tasks just generalize their counterparts for the SHoFE model. We can express these compactly by using matrices to collect the studies’ covariates, weights, and ES estimates:

X = [x₁^T x₂^T … x_k^T]^T is a k × (q+1) matrix formed by stacking x_i vertically.
W = diag(w₁, w₂, …, w_k) is a k × k diagonal matrix with (estimated) weights w_i = 1/ v_i on the diagonal (and 0 elsewhere).
y = [y₁ y₂ … y_k]^T is a k-element column vector of ES estimates.

Now we can estimate β as

^β_F = (X^TWX)^-1X^TWy ,

and this estimator’s covariance matrix is

Cov(^β_F) = (X^TWX)^-1 .^F1

To be clear, the (partial regression) coefficients in ^β_F are not standardized, and Cov(^β_F) contains each coefficient’s sampling variance on the diagonal and each pair’s sampling covariance off the diagonal. We can use ^β_F and Cov(^β_F) for inference about β. For instance, denoting individual coefficients as β_j, j = 0, 1, 2, …, q, we can apply standard-normal procedures to ^β_jF and SE(^β_jF) = √Var(^β_jF) to test H₀: β_j = β_j0 or construct a confidence interval (CI) for β_j just as we did for μ under the SHoFE model.
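As a minimal sketch of the WLS formulas above, the following NumPy code computes ^β_F and Cov(^β_F) for a small data set. The five SMDs, their conditional variances, and the function name wls_fixed are hypothetical and serve only as an illustration.

```python
import numpy as np

def wls_fixed(y, v, X):
    """WLS estimate of beta and its covariance matrix under the MHoFE model."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    w = 1.0 / np.asarray(v, dtype=float)   # w_i = 1 / v_i, treated as known
    XtWX = X.T @ (w[:, None] * X)          # X^T W X without forming the k x k W
    XtWy = X.T @ (w * y)                   # X^T W y
    cov = np.linalg.inv(XtWX)              # Cov(beta_F) = (X^T W X)^-1
    beta = cov @ XtWy                      # beta_F
    return beta, cov

# Hypothetical data: 5 studies, one dummy-coded dichotomous covariate
y = np.array([0.30, 0.55, 0.42, 0.95, 1.10])
v = np.array([0.04, 0.09, 0.05, 0.08, 0.10])
X = np.column_stack([np.ones(5), [0, 0, 0, 1, 1]])

beta_F, cov_F = wls_fixed(y, v, X)
se_F = np.sqrt(np.diag(cov_F))             # SE of each coefficient
```

Dividing each element of beta_F by the matching element of se_F gives the z statistics for testing H₀: β_j = 0.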

More generally, standard-normal inference procedures extend readily to two or more of β’s elements or linear combinations thereof. For example, we might wish to test or construct a multivariate confidence region for all non-intercept coefficients, a subset of them (e.g., testing a block of covariates while partialling another), or linear combinations of them (e.g., one or more contrasts among a categorical moderator’s levels). In brief, the general procedures for m linear combinations of β’s elements use an m × (q + 1) matrix L, each row of which contains one linear combination’s q + 1 coefficients. Denoting the vector of linear combinations as γ = Lβ, we can use L to obtain the estimate

^γ_F = L^β_F

and its covariance matrix

Cov(^γ_F) = LCov(^β_F)L^T .

We can in turn use these to test γ (e.g., H₀: γ = γ₀) or its elements or construct confidence regions for these quantities using multivariate normal-theory procedures that involve χ² distributions. All of these inference methods are prone to problems similar to those of their counterparts for the SHoFE model, with additional complications such as what to substitute for θ_i when the conditional variance (CV) depends on it (e.g., y_i, x_i^β_F, ^μ_F).
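The linear-combination formulas can be sketched as follows. The coefficient estimates, covariance matrix, and function name are hypothetical. The quadratic form (^γ − γ₀)^T Cov(^γ)^-1 (^γ − γ₀), referred to a χ² distribution with m degrees of freedom, is one common normal-theory test statistic; I'm assuming it here, not claiming it's the only option.

```python
import numpy as np

def linear_combinations(beta, cov_beta, L, gamma0=None):
    """Estimate gamma = L beta, its covariance, and a Wald-type statistic."""
    beta = np.asarray(beta, dtype=float)
    cov_beta = np.asarray(cov_beta, dtype=float)
    L = np.atleast_2d(np.asarray(L, dtype=float))    # m x (q + 1)
    gamma = L @ beta                                 # gamma_hat = L beta_hat
    cov_gamma = L @ cov_beta @ L.T                   # L Cov(beta_hat) L^T
    if gamma0 is None:
        gamma0 = np.zeros(L.shape[0])                # default null: gamma = 0
    d = gamma - np.asarray(gamma0, dtype=float)
    stat = float(d @ np.linalg.solve(cov_gamma, d))  # refer to chi-square(m)
    return gamma, cov_gamma, stat, L.shape[0]

# Hypothetical estimates: intercept plus two dummy-coded covariates
beta_hat = np.array([0.40, 0.25, -0.10])
cov_hat = np.array([[0.010, -0.004, -0.003],
                    [-0.004, 0.020, 0.002],
                    [-0.003, 0.002, 0.015]])

# Joint test of both non-intercept coefficients (m = 2)
L_block = [[0, 1, 0],
           [0, 0, 1]]
gamma_hat, cov_gamma, stat, df = linear_combinations(beta_hat, cov_hat, L_block)
```

With a single row in L, the statistic reduces to the square of the familiar z statistic for that linear combination.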

As for assessing the MHoFE model’s assumption of residual homogeneity, we can use a generalization of the SHoFE model’s Q statistic. Namely, this assumption that each study’s ES parameter is its covariate-predicted value—so there’s no excess or residual variation of ES parameters—can be written as the null hypothesis H₀: θ_i = x_iβ. We can test this using the residual (or error) heterogeneity statistic

Q_E = (y − X^β_F)^TW(y − X^β_F) = ∑_iw_i(y_i − x_i^β_F)² ,

which is distributed approximately as χ²[k − (q + 1)] under H₀. A significant upper-tail test indicates there’s more heterogeneity among ES parameters than expected due to the covariate(s) and CVs. This test is subject to similar limitations as its SHoFE counterpart.
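Here is a hedged sketch of the Q_E computation and its upper-tail test. The six SMDs and variances are hypothetical, as is the function name; the p-value comes from SciPy's chi2.sf, which is my choice of tool, not something prescribed by the procedure.

```python
import numpy as np
from scipy.stats import chi2

def residual_heterogeneity(y, v, X):
    """Return Q_E, its degrees of freedom, and the upper-tail p-value."""
    w = 1.0 / v                                   # fixed-effects weights
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))   # WLS estimate beta_F
    resid = y - X @ beta                          # y_i - x_i beta_F
    q_e = float(np.sum(w * resid**2))             # weighted sum of squares
    df = X.shape[0] - X.shape[1]                  # k - (q + 1)
    return q_e, df, float(chi2.sf(q_e, df))

# Hypothetical data: 6 studies, one dummy-coded covariate
y = np.array([0.10, 0.85, 0.30, 1.40, 0.60, 1.05])
v = np.array([0.04, 0.09, 0.05, 0.08, 0.10, 0.06])
X = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])

q_e, df, p_upper = residual_heterogeneity(y, v, X)
```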

The form of Q_E suggests a weighted sum of squared deviations between predicted values from two nested models: one with k parameters (θ_i for each study) and another with q + 1 parameters. Indeed, Q_E is a special case of a more general statistic for comparing Models A and B when A is nested within B:

Q_A,B = ∑_iw_i(x_iB^β_FB − x_iA^β_FA)² ,

where x_iA and x_iB are Study i’s covariate values for each model and ^β_FA and ^β_FB are each model’s estimated coefficients. Under the null hypothesis that Models A and B are equivalent (i.e., constraints on B to create A reflect reality), Q_A,B is distributed approximately as χ²(q_B − q_A), where q_A and q_B are each model’s number of independent parameters. A significant upper-tail test indicates that Model B’s gain in adequacy more than offsets its loss in parsimony. Furthermore, if w_i is the same for both models, then we can compute this model-comparison statistic as

Q_A,B = Q_EA − Q_EB ,

the difference between the two models’ residual heterogeneity statistics. For example, since the SHoFE model is nested within the MHoFE model, we can test the latter’s q non-intercept coefficients (in β) by comparing these models—with 1 and q + 1 parameters—using Q − Q_E, which is distributed approximately as χ²(q) when all non-intercept coefficients are 0.
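To make the equivalence of the two forms of Q_A,B concrete, this sketch fits an intercept-only model (A) and a one-covariate model (B) to the same hypothetical data with the same weights, then computes the statistic both ways. All data values and names are assumptions for illustration.

```python
import numpy as np

def fit_wls(y, w, X):
    """Return WLS predicted values and the residual heterogeneity statistic."""
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    pred = X @ beta
    return pred, float(np.sum(w * (y - pred)**2))

# Hypothetical data: 6 studies
y = np.array([0.10, 0.85, 0.30, 1.40, 0.60, 1.05])
v = np.array([0.04, 0.09, 0.05, 0.08, 0.10, 0.06])
w = 1.0 / v                                                # same w_i for both models

X_A = np.ones((6, 1))                                      # Model A: intercept only
X_B = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])    # Model B: adds a covariate

pred_A, q_EA = fit_wls(y, w, X_A)                          # q_EA is the SHoFE Q
pred_B, q_EB = fit_wls(y, w, X_B)                          # q_EB is Q_E

q_AB_direct = float(np.sum(w * (pred_B - pred_A)**2))      # weighted squared deviations
q_AB_diff = q_EA - q_EB                                    # difference of residual Qs
df_AB = X_B.shape[1] - X_A.shape[1]                        # q_B - q_A
```

The two versions agree up to floating-point error, and the result would be referred to a χ² distribution with df_AB degrees of freedom.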

Finally, several issues related to these procedures for the MHoFE model deserve mention:

Most of the above procedures can be implemented as weighted versions of least-squares methods in popular statistical software packages, though care must be taken to ensure that inferential results are handled correctly (i.e., treating w_i as known).
Some procedures used routinely in ordinary least-squares regression for primary-study data are less useful or harder to justify under this MHoFE model. For instance, meta-analytic methodologists rarely discuss standardizing ^β_F, and R²-type indices are complicated by the model’s known, heterogeneous CVs.
For certain models in which all covariates represent coded values for categorical moderators (i.e., ANOVA analogues), some of the above formulas can be expressed in terms of weighted cell means, main and interaction effects, etc. Some authors report results for such models in terms of a decomposition of total heterogeneity into that due to one or more effects (e.g., main, interaction) and within-cell variation.
Comparing models usually requires estimating both models from the same data. This is often complicated by missing data: Because adding more covariates tends to reduce the number of studies with complete data, some studies with all covariates for a simpler model might not have all covariates for a more complex model.
Meta-regression analyses often entail multiple inference, such as several tests or CIs based on a given model (e.g., for 2 or more elements of β or γ) or inferences based on multiple models for the same ES estimates. In these situations, modifying procedures to avoid inflated Type I error rates or overconfidence might be advisable (e.g., adjusted tests, simultaneous CIs).

Example—Workplace Exercise: Conn et al. (2009) conducted meta-regression analyses to investigate potential moderators of two-group posttest standardized mean differences (SMDs) on four outcome variables: physical activity, fitness, lipids, and anthropometric measures. Although they (well, we) reported results for only separate mixed-effects (i.e., MRE) analyses of several dichotomous covariates and one three-level categorical covariate, they also conducted fixed-effects (i.e., MHoFE) versions of these analyses as well as fixed- and mixed-effects analyses of various pairs and larger sets of selected dichotomies. They considered these numerous analyses largely exploratory, aimed at generating hypotheses to be examined in future primary studies.

To illustrate an MHoFE analysis, let’s consider Conn et al.’s (2009) fixed-effects model for all k = 35 (non-outlier) fitness SMDs and the dichotomy Paid During Intervention (PDI), which indicates whether employees were paid during their time participating in the intervention.^F2 They dummy coded PDI such that x_i = [1 1] if the study reported that employees were paid during the intervention (PDIy, k_y = 8 SMDs) and x_i = [1 0] otherwise (PDIn, k_n = 27 SMDs). Computational details aside, here are the basic results WLS yields:

intercept and its variance: ^β_0F = 0.466, Var(^β_0F) = 0.0625² .

slope and its variance: ^β_1F = 0.512, Var(^β_1F) = 0.138² .

intercept-slope correlation: Corr(^β_0F, ^β_1F) = -0.452 .

Their main interest was in the slope. Denoting the common SMDs for PDIy and PDIn as μ_y and μ_n, respectively, we see that their dummy-coding scheme implies that β₁ = μ_y − μ_n (because μ_n = β₀ and μ_y = β₀ + β₁), so ^β_1F estimates the difference between common SMDs. Using the above quantities for standard-normal inferences about β₁, we obtain the 95% CI

0.512 ± 1.960(0.138) = (0.241, 0.783) .

A two-tailed test of the nil null hypothesis H₀: β₁ = 0 at α₂ = .05 yields the test statistic

z_F = 0.512 / 0.138 = 3.70 ,

for which p₂ = .000216. (Some authors would report this as a heterogeneity statistic for the [non-intercept] model, between-groups, or regression source, Q_MF(1) = z_F² = 3.70² = 13.7.) This fixed-effects CI and test support conditional inferences that extend only to studies like the 35 we’ve included. The test indicates that the common SMD is significantly higher for studies reporting that employees were paid during the intervention, and the CI suggests we can be 95% confident that this PDIy “advantage” is between 0.24 and 0.78.
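The CI and test above can be reproduced approximately from the rounded slope and standard error. This sketch uses only Python's standard library; the function name is hypothetical. Because the inputs are rounded to three decimals, the results differ slightly in the last digit from the values reported above, which presumably come from unrounded quantities.

```python
from math import erfc, sqrt

def normal_inference(est, se, null=0.0, z_crit=1.959964):
    """Standard-normal test and 95% CI for one coefficient."""
    z = (est - null) / se
    p_two = erfc(abs(z) / sqrt(2.0))          # two-tailed p from the normal tail
    ci = (est - z_crit * se, est + z_crit * se)
    return z, p_two, ci

# Rounded MHoFE slope and SE from the example above
z_F, p_F, ci_F = normal_inference(0.512, 0.138)
q_MF = z_F**2                                 # 1-df model heterogeneity statistic
```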

We could use the WLS results to estimate μ_y and μ_n, as Conn et al. (2009) did, and make inferences about these common SMDs, either separately or simultaneously as a pair. We could also estimate and make inferences about other linear combinations of β₀ and β₁ (e.g., an unweighted or weighted mean of μ_y and μ_n), terms in countless more complex models that include other covariates and joint effects (e.g., interaction effects), and so on. I’ll defer those, perhaps for later posts.
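As one illustration of such linear combinations, the sketch below rebuilds Cov(^β_F) from the rounded standard errors and correlation reported above and applies L to estimate μ_n and μ_y. The results are my own back-of-the-envelope values from rounded inputs, not figures taken from Conn et al. (2009).

```python
import numpy as np

# Rounded MHoFE results reported above
b = np.array([0.466, 0.512])             # intercept, slope
se = np.array([0.0625, 0.138])           # their standard errors
r = -0.452                               # intercept-slope correlation

# Rebuild the covariance matrix from SEs and the correlation
cov_01 = r * se[0] * se[1]
cov_b = np.array([[se[0]**2, cov_01],
                  [cov_01, se[1]**2]])

# Rows of L give mu_n = beta_0 and mu_y = beta_0 + beta_1
L = np.array([[1.0, 0.0],
              [1.0, 1.0]])

mu = L @ b                               # [mu_n, mu_y]
cov_mu = L @ cov_b @ L.T                 # covariance of the two estimates
se_mu = np.sqrt(np.diag(cov_mu))
```

The off-diagonal element of cov_mu comes out essentially zero, which is what we'd expect for two common SMDs estimated from non-overlapping subsets of studies.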

Finally, this analysis yields the residual (or within-group, or error) heterogeneity statistic Q_E(33) = 53.9, p = .0122, which indicates significant heterogeneity beyond PDI. Note that

Q_MF + Q_E = 13.7 + 53.9 = 67.6 = Q ,

where Q is from the SHoFE model in Part 5b. On a related note, we could obtain most of the MHoFE results by fitting the SHoFE model separately to each of the PDIy and PDIn subsets; in particular, we could compute Q_E as the sum of the two resulting Q statistics, say Q_y and Q_n. Hence, we’d have the following decomposition of total heterogeneity into between-groups/model and two within-group/error sources:

Q = Q_MF + Q_y + Q_n .
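This decomposition is easy to check numerically. The sketch below fits the SHoFE model to all studies and to each subgroup of a small hypothetical data set; the data, group labels, and function name are assumptions for illustration only.

```python
import numpy as np

def shofe(y, v):
    """Common-effect estimate, total weight, and Q for one set of studies."""
    w = 1.0 / v
    mu = np.sum(w * y) / np.sum(w)
    return mu, np.sum(w), float(np.sum(w * (y - mu)**2))

# Hypothetical data: 6 studies, the last 3 coded as "paid"
y = np.array([0.10, 0.85, 0.30, 1.40, 0.60, 1.05])
v = np.array([0.04, 0.09, 0.05, 0.08, 0.10, 0.06])
paid = np.array([False, False, False, True, True, True])

mu_all, w_all, q_total = shofe(y, v)
mu_y, w_y, q_y = shofe(y[paid], v[paid])
mu_n, w_n, q_n = shofe(y[~paid], v[~paid])

# Between-groups heterogeneity: weighted squared deviations of group estimates
q_between = w_y * (mu_y - mu_all)**2 + w_n * (mu_n - mu_all)**2
```

Here q_between plays the role of Q_MF, and it also equals the squared z statistic for the difference mu_y − mu_n.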

MRE. Because this model generalizes each of the SRE and MHoFE models, we’ve already done most of the heavy lifting to understand the former’s meta-regression procedures. Simply put, estimation of and inference for β depend on τ², the residual between-studies variance component (BSVC) that quantifies residual heterogeneity beyond the covariate(s). Here I describe a relatively simple two-step procedure that involves a weighted method-of-moments (WMoM) estimator for τ² and WLS methods for β. Specifically, we first use the fixed-effects weights (w_i) and MHoFE residual heterogeneity statistic (Q_E) to estimate τ² as

^τ_M² = max{0, [Q_E − (k − q − 1)] / c_M} ,

where

c_M = ∑_iw_i − tr[(X^TWX)^-1X^TW²X] ,

and tr() denotes the argument matrix’s trace (i.e., sum of diagonal elements). We next use the BSVC estimate to estimate each study’s unconditional weight as w_Mi = 1 / (^τ_M² + v_i), which we in turn use with WLS to estimate β and obtain this estimate’s covariance matrix:

^β_R = (X^TW_MX)^-1X^TW_My ,

where W_M = diag(w_M1, w_M2, …, w_Mk), and

Cov(^β_R) = (X^TW_MX)^-1 .

(The notation w_Mi and W_M distinguishes these quantities from their counterparts for the SHoFE, SRE, and MHoFE models.) As with the SRE model, it’s conventional to use ^β_R and its covariance matrix for standard-normal or multivariate normal-theory inferences about β, such as confidence regions or tests like those described for the MHoFE model, including linear combinations. These procedures face limitations similar to those of their counterparts under the SRE model; Raudenbush (2009) described alternative methods that overcome some of these limitations:

Raudenbush, S. W. (2009). Analyzing effect sizes: Random-effects models. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 295-315). New York: Russell Sage Foundation.
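The two-step WMoM/WLS procedure described above can be sketched in a few lines of NumPy. The data and the function name mre_two_step are hypothetical; the code follows the formulas for ^τ_M², c_M, ^β_R, and Cov(^β_R) as given in this post.

```python
import numpy as np

def mre_two_step(y, v, X):
    """Two-step MRE fit: WMoM estimate of tau^2, then WLS for beta."""
    k, p = X.shape                                   # p = q + 1
    w = 1.0 / v                                      # fixed-effects weights
    XtWX_inv = np.linalg.inv(X.T @ (w[:, None] * X))
    beta_F = XtWX_inv @ (X.T @ (w * y))
    q_e = np.sum(w * (y - X @ beta_F)**2)            # MHoFE residual Q_E
    XtW2X = X.T @ ((w**2)[:, None] * X)              # X^T W^2 X
    c_m = np.sum(w) - np.trace(XtWX_inv @ XtW2X)
    tau2 = max(0.0, (q_e - (k - p)) / c_m)           # truncated at zero
    w_m = 1.0 / (tau2 + v)                           # unconditional weights
    cov_R = np.linalg.inv(X.T @ (w_m[:, None] * X))  # Cov(beta_R)
    beta_R = cov_R @ (X.T @ (w_m * y))
    return tau2, beta_R, cov_R

# Hypothetical data: 6 studies, one dummy-coded covariate
y = np.array([0.10, 0.85, 0.30, 1.40, 0.60, 1.05])
v = np.array([0.04, 0.09, 0.05, 0.08, 0.10, 0.06])
X = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])

tau2_M, beta_R, cov_R = mre_two_step(y, v, X)
se_R = np.sqrt(np.diag(cov_R))
```

With an intercept-only X, c_M reduces to ∑_iw_i − ∑_iw_i²/∑_iw_i, so the same function also returns the method-of-moments BSVC estimate for a model without covariates.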

As you might suspect, using Q_E to test residual homogeneity under the MHoFE model also tests the MRE model’s residual BSVC (i.e., H₀: τ² = 0). Although comparing nested MRE models is complicated by τ², which differs between models with different covariates, we can make inferences about subvectors of β by using inference for linear combinations of β (e.g., with point estimate ^γ_R = L^β_R). Some authors suggest comparing two MRE models (or an MRE and SRE model) informally using the following proportion-of-variance measure based on BSVC estimates for Models A and B:

(^τ_MA² − ^τ_MB²) / ^τ_MA² .

This ratio may be negative, however, when Model B’s additional covariates account for too little additional heterogeneity to offset its larger number of parameters, so that ^τ_MB² exceeds ^τ_MA².

Example—Workplace Exercise: To illustrate an MRE analysis, let’s use the mixed-/random-effects analysis of PDI from Conn et al.’s (2009) Table 4 (with the same dummy-coding scheme as above) and compare its results to those from the SRE and MHoFE analyses.^F2 To estimate the residual BSVC we’ll need c_M = 294.0 (computational details omitted), which yields

^τ_M² = [53.9 − (35 − 1 − 1)] / 294.0 = 0.267² .
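As a quick plug-in check of this arithmetic, the following snippet applies the WMoM formula to the rounded values Q_E = 53.9, k = 35, q = 1, and c_M = 294.0. The function name is hypothetical.

```python
from math import sqrt

def wmom_tau2(q_e, k, q, c_m):
    """WMoM estimate of the residual BSVC, truncated at zero."""
    return max(0.0, (q_e - (k - q - 1)) / c_m)

tau2_M = wmom_tau2(q_e=53.9, k=35, q=1, c_m=294.0)
tau_M = sqrt(tau2_M)    # on the SD scale, for comparison with 0.267
```

The max() matters whenever Q_E falls below its degrees of freedom, in which case the estimate is set to 0.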

At this point we can roughly compare the MRE and SRE models by comparing their BSVCs:

(^τ_S² − ^τ_M²) / ^τ_S² = (0.330² − 0.267²) / 0.330² = 0.345 ,

which suggests that PDI accounts for about 35% of the between-studies variance in SMDs. Adding the estimated residual BSVC to each study’s CV estimate and computing unconditional weights (w_Mi) yields the following WLS results for β in the MRE model:

intercept and its variance: ^β_0R = 0.490, Var(^β_0R) = 0.0831² .

slope and its variance: ^β_1R = 0.433, Var(^β_1R) = 0.187² .

intercept-slope correlation: Corr(^β_0R, ^β_1R) = -0.445 .

Under this model ^β_1R estimates the difference in mean SMD between PDIy and PDIn studies. This difference under the MRE model (0.433) is notably smaller than its MHoFE counterpart (0.512), and the former’s variance (0.187²) is more than 80% larger than the latter’s (0.138²). Using the MRE quantities for standard-normal inferences about β₁, we obtain the 95% CI

0.433 ± 1.960(0.187) = (0.067, 0.799) .

A two-tailed test of H₀: β₁ = 0 at α₂ = .05 yields the test statistic

z_R = 0.433 / 0.187 = 2.32 ,

for which p₂ = .0203. (This matches the [non-intercept] model heterogeneity statistic Conn et al. reported, Q_MR(1) = z_R² = 2.32² = 5.4.) This mixed-effects CI and test support unconditional inferences that extend to a universe of studies from which our 35 were sampled. The test indicates that the mean SMD is significantly higher for studies reporting that employees were paid during the intervention, and the CI suggests we can be 95% confident that this PDIy advantage is between 0.07 and 0.80. This CI is wider than the MHoFE model’s, which reflects the lower precision that results from incorporating between-studies heterogeneity into our inference.
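To put the two analyses side by side, here is a small sketch that recomputes the slope's z statistic, two-tailed p-value, and 95% CI under each model from the rounded estimates and standard errors given above. The helper name and dictionary layout are hypothetical, and the rounded inputs mean the last digits may differ slightly from the reported values.

```python
from math import erfc, sqrt

Z_CRIT = 1.959964   # two-sided 95% standard-normal critical value

def summarize(est, se):
    """z statistic, two-tailed p, and 95% CI for one coefficient."""
    z = est / se
    return {
        "z": z,
        "p_two": erfc(abs(z) / sqrt(2.0)),
        "ci": (est - Z_CRIT * se, est + Z_CRIT * se),
        "ci_width": 2 * Z_CRIT * se,
    }

fixed = summarize(0.512, 0.138)    # MHoFE slope and SE
mixed = summarize(0.433, 0.187)    # MRE slope and SE

variance_ratio = (0.187 / 0.138)**2                   # MRE variance over MHoFE variance
width_ratio = mixed["ci_width"] / fixed["ci_width"]   # relative CI width
```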

Extensions and Other Variants

I suspect that a majority of meta-analyses conducted to date have used meta-analysis models and procedures described in Parts 5a, 5b, and 5c. Countless other techniques exist, however, that differ in subtle or substantial ways from those I’ve presented. In this final section I’ll comment briefly on several different approaches that either extend those I’ve described or depart from them in notable ways.

Model extensions. The models I’ve described can be viewed as special cases of more general models. Two such extensions involve multiple ES estimates from each study or other focal unit of analysis (e.g., report). As suggested in Part 1 of this overview, one form of multiple ESs involves a multivariate ES, which is a vector of distinct ESs. Extending the models I’ve described to multivariate ESs essentially entails replacing within-study and between-studies variances with covariance matrices, accommodating incomplete ESs from some studies (i.e., missing elements), and structuring the design matrix to accommodate (possibly different) covariates for each element. For instance, multivariate ESs arise in meta-analytic approaches for diagnostic test accuracy (e.g., sensitivity and specificity), mixed-treatment comparisons (e.g., direct and indirect evidence), and explanatory models (e.g., path or factor models).

Multiple ESs may also occur when a study contributes two or more estimates of essentially the same ES parameter, such as from independent samples or the same sample on different measures or occasions. This nesting or clustering of ESs induces a more complicated structure some authors call “hierarchical dependence,” whereby a study’s ES parameters might be less (or more) heterogeneous among themselves than ES parameters from different studies. We can accommodate this by extending the two-level models I’ve described to include an intermediate level between studies and ES estimates, which might also specify within-study covariates to account for variation among a study’s ES parameters.

Other types of extensions have also been proposed, such as models that incorporate certain types of bias (e.g., publication bias, inadequate randomization or allocation concealment), individual participant/patient data (IPD), or heterogeneity due to unobserved groups of studies (e.g., finite mixtures). These are extensions insofar as models I’ve described can be expressed as special cases, such as when there’s no bias, no IPD, or only one group of studies. Estimation and inference procedures for such models and those involving multiple ESs are beyond this overview’s scope.

Alternative procedures. Estimation and inference for many of the models I described and their extensions mentioned above can be handled using different procedures. For example, methods developed for linear mixed models—also called multilevel or hierarchical linear models in some contexts—can be adapted for many meta-analytic models; this requires care in handling the latter’s special error structure (e.g., known heterogeneous conditional variances). Along similar lines, readers familiar with connections between mixed models and structural equation models (SEMs) may not be surprised that clever adaptations of SEM software can be used for many meta-analytic models.

A more substantial departure from procedures I’ve described involves Bayesian approaches, which are becoming more popular and are especially useful for complex models. Consider a Bayesian approach for the SHoFE model: We could express our belief about plausible values of μ as a prior distribution, and Bayesian techniques could be used to combine this prior with our ES estimates and CVs to obtain a posterior distribution for μ; from this posterior, which represents our prior belief updated by the data, we could obtain a point estimate of or inferences about μ. We can use similar strategies for more complex models by specifying priors for all (hyper)parameters. Bayesian methods typically require special software, which is becoming more widespread and accessible for non-statisticians.
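For the SHoFE case just described, the simplest possible version has a closed form. The sketch below assumes a normal prior for μ and treats each v_i as known, in which case the posterior is also normal; the prior settings, data, and function name are hypothetical. More complex models generally lack such closed forms, which is where the special software comes in.

```python
import numpy as np

def normal_posterior(y, v, prior_mean, prior_var):
    """Posterior mean and variance of mu under a normal prior (known v_i)."""
    w = 1.0 / v                                   # precision of each ES estimate
    post_prec = 1.0 / prior_var + np.sum(w)       # precisions add
    post_mean = (prior_mean / prior_var + np.sum(w * y)) / post_prec
    return post_mean, 1.0 / post_prec

# Hypothetical data and a hypothetical N(0, 1) prior for mu
y = np.array([0.30, 0.55, 0.42, 0.95, 1.10])
v = np.array([0.04, 0.09, 0.05, 0.08, 0.10])

post_mean, post_var = normal_posterior(y, v, prior_mean=0.0, prior_var=1.0)
half_width = 1.959964 * np.sqrt(post_var)
credible_95 = (post_mean - half_width, post_mean + half_width)
```

As the prior variance grows, the posterior mean and variance approach ^μ_F and Var(^μ_F) from the WLS analysis.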

Special data types. Meta-analytic models and methods have been proposed for numerous special types of data that may not conform well to the conventional models I’ve described in Part 5. Below I offer brief remarks on several of these data types.

Validity generalization: Methods exist to adjust ESs for various so-called artifacts, such as unreliability and range restriction. Developed originally for correlations from studies of predictive validity in personnel selection, these methods have been extended to regression slopes, mean differences, and other types of ESs.
Reliability generalization: Procedures have been proposed to meta-analyze various measures of reliability (in a psychometrics context), such as test-retest correlations or internal-consistency coefficients.
Significance levels: Numerous techniques have been proposed to summarize p values from several studies as one combined test of the composite null hypothesis that every study’s null hypothesis is true. Historically popular, these procedures neglect ESs and are now used mainly in special applications (e.g., microarrays).
Vote counts: When some studies provide for an ES only the estimate’s direction or its directional significance test’s binary result (e.g., significantly positive or not), this crude information can be used to estimate (hyper)parameters in certain meta-analytic models.
Categorical outcome: For ESs used with binary or other categorical outcomes (e.g., proportions, counts), models that respect these discrete variables (e.g., binomial, Poisson) may perform better than those based on normal approximations.
Single-subject designs: Methods have been proposed for studies that include only one or a few subjects measured on several occasions, usually under two conditions experienced in phases.
Longitudinal: When several subjects are measured on multiple occasions, meta-analytic methods for combining such studies typically incorporate information about dependence between repeated measurements.
Neuroimaging: Meta-analytic techniques for images of brain structure or function, such as fMRI maps, are complicated by the nature of the data: (relative) activation levels from many locations in a three-dimensional space.
Genetics and genomics: Studies of genetic linkage, genetic association, gene expression, or other phenomena involving genes present challenges for meta-analysis, such as many results from each unit (e.g., in genome-wide studies or from microarrays) and joint effects that involve sets of genes (e.g., pathways).
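To give a flavor of the significance-level techniques in the list above, here is a sketch of one well-known example, Fisher's method, which I've picked purely for illustration. It refers −2∑ln(p_i) to a χ² distribution with 2k degrees of freedom under the composite null hypothesis, assuming independent tests. The function name is hypothetical, and the code uses the closed-form χ² upper tail for even degrees of freedom so that it needs only the standard library.

```python
from math import exp, log

def fisher_combined_p(p_values):
    """Fisher's combined test: statistic and upper-tail p-value."""
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)   # X / 2, where X = -2 * sum(ln p)
    # Upper tail of chi-square(2k): exp(-X/2) * sum_{j<k} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return 2.0 * half_x, exp(-half_x) * total

# Hypothetical one-tailed p values from 4 independent studies
stat, p_combined = fisher_combined_p([0.08, 0.20, 0.03, 0.11])
```

Each p value must be strictly greater than 0, and the result says nothing about the size of any effect, which is the limitation noted in the list.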

With that, I’ll end this third and final post of Part 5. As with previous parts in this overview of meta-analysis, this tour of meta-analytic models and procedures emphasized key ideas but omitted several complications meta-analysts encounter with real data. Some of these complications will be addressed in Part 6, but others are beyond this overview’s scope. I hope to address some of the latter in future posts.

Footnotes

1. For the SHoFE model x_i = [1] and β = μ, so the MHoFE formulas simplify markedly: Because X^TWX = ∑_iw_i we have Cov(^β_F) = Var(^μ_F), and because X^TWy = ∑_iw_iy_i we have ^β_F = ^μ_F.
2. Their moderator analyses did not account for dependence among 4 multiple-treatment pairs and 1 multiple-treatment triplet; for simplicity I’ll follow that practice here.
