Dr. Damian Sendler Psychology’s Generalized Linear Mixed Models
Last updated on April 21, 2022

An advantage of generalized linear mixed models (GLMMs) is that they can estimate both fixed and random effects for a dependent variable that is not normally distributed, whether ordinal, count, or categorical. Additionally, GLMMs can model autocorrelation when the dependent variable is measured repeatedly. The goal of this study was to determine how often GLMMs are used in psychology and to summarize how they are reported in published articles, focusing primarily on frequentist models. We searched the Web of Science for articles published between 2014 and 2018 that used GLMMs in psychology. From this period, 316 empirical articles were selected for trend analysis. We then conducted a systematic review of 118 GLMM analyses from 80 empirical articles indexed in Journal Citation Reports in 2018 in order to evaluate the quality of the reports. The use of GLMMs grew over time, and 86.4% of the articles were published in quartile 1 or 2 journals. Despite their increasing popularity in psychology, however, the majority of relevant information about the GLMMs was omitted from reports. Reporting quality needs to improve in line with current GLMM recommendations.

The complexity of data analysis methods varies. Analysis of variance (ANOVA) and regression analysis, both based on the general linear model, are two of the most commonly used techniques in the health and social sciences. However, assumptions such as normal distribution, homogeneity of variance, and independence of errors are not always met in real-world settings when testing quantitative variables. More advanced analysis is provided by the generalized linear model (GLM) and the linear mixed model (LMM). As the name implies, the GLM is a generalized form of linear regression. The LMM can incorporate random effects factors when the assumptions of independence and constant variance are violated. The generalized linear mixed model (GLMM) combines the two: it incorporates random effects and extends the LMM to other types of response variables. Whether fixed effects alone or random effects as well are needed, the choice of model depends on the metric of the response variable and its distribution. As a result, each of the aforementioned models serves a distinct purpose and is best suited to a specific type of data. This article examines the use of GLMMs in psychology. Readers interested in a more in-depth discussion of GLMMs should consult Dean and Nielsen (2007) and Stroup (2013).

The skewness and kurtosis values obtained in health, social science, and educational research frequently show clear departures from the normal distribution (Micceri, 1989; Lei and Lomax, 2005; Bauer and Sterba, 2011; Blanca et al., 2013; Arnau et al., 2014; Bono et al., 2020). Non-normality in the distributions of real psychological data was documented by Micceri (1989), Blanca et al. (2013), and other researchers. Analyzing 440 achievement and psychometric measures, Micceri (1989) identified several classes of non-normality. Blanca et al. (2013) examined 693 distributions and found that the vast majority were non-normal. Recent systematic reviews of empirical studies in health, education, and social science have likewise found a high percentage of non-normal distributions, the most common being gamma, negative binomial, multinomial, binary, and lognormal, in that order.

Data in psychology frequently follow distributions other than the normal, as these studies show. Although ANOVA has been shown to be tolerant of non-normality (Kanji, 1976; Khan and Rayner, 2003; Schmider et al., 2010; Ferreira et al., 2012; Blanca et al., 2017), it is not suitable for multinomial or ordinal data, nor is it optimal with count data (Aiken et al., 2015). Applied researchers should therefore choose a statistical technique that suits their data rather than attempting to use classical approaches at all costs (e.g., by transforming data to achieve normality or using non-parametric analyses). In psychology, nested sampling and repeated measures are common study designs that imply non-independence of observations. GLMMs offer applied psychologists a flexible tool for analyzing such data, but their complexity has kept them from being widely used. ANOVA is, in fact, the most commonly used analytic technique in psychological research, according to a number of reviews (Edgington, 1964, 1974; Reis and Stiller, 1992; Schinka et al., 1997; Kieffer et al., 2001). More recently, Skidmore and Thompson (2010) and Counsell and Harlow (2017), among others, found that ANOVA, correlations, and regression are the most commonly employed techniques in psychology. In a review of 663 empirical studies in various areas of psychology published in prestigious journals in 2017, Blanca et al. (2018) found ANOVA to be the most commonly used data analysis procedure after regression.

Thiele and Markussen (2012) define GLMs as regression models that allow researchers to model a wide range of dependent variables through linear combinations of one or more predictor variables (fixed effects). A link function transforms the expected values of the response variable onto the scale of the linear predictor, so the distribution and link function must be chosen together. An appropriate distribution and link function for the available data must be found before modeling can begin (Garson, 2013). Count data call for a Poisson distribution, or a negative binomial distribution when the variance is greater than the mean (overdispersion); proportions and binary outcomes follow a binomial distribution, typically paired with a logit link function. Each distribution has a natural link function, but less commonly used alternatives may be better suited to the data in some cases (Thiele and Markussen, 2012). Modeling binomial data with a probit link is one option; modeling large-mean count data with an identity link is another, and the identity link can also be used with the negative binomial distribution. Other distributions and link functions may be available depending on the software package. To get the best fit and parameter interpretation, Thiele and Markussen (2012) suggest fitting models with a variety of links to the data.
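To make the distribution–link pairing concrete, here is a minimal Python sketch (the function names are ours, not from any particular package) of the logit link used with binomial data and the log link used with Poisson counts; each link maps the mean of the response onto the unbounded scale of the linear predictor, and its inverse maps predictions back.

```python
import math

def logit(p):
    """Logit link for binomial data: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1 - p))

def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1 / (1 + math.exp(-eta))

def log_link(mu):
    """Log link for Poisson counts: maps a positive mean to the real line."""
    return math.log(mu)

def inv_log(eta):
    """Inverse log link: maps a linear predictor back to a positive mean."""
    return math.exp(eta)

# A linear predictor of 0 corresponds to a probability of 0.5 under the logit link
print(inv_logit(0.0))                     # 0.5
print(round(inv_log(log_link(3.0)), 6))   # 3.0
```

The same pattern holds for any distribution–link pair: fitting happens on the linear-predictor scale, and the inverse link returns predictions to the natural scale of the response.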

GLMMs also include random effects, whose levels can change when the experiment is repeated. Subjects in a drug study, classrooms in an education study, or time points in repeated measurements are all examples of random effects. With random effects, the intraclass correlation (ICC) can be calculated in multilevel modeling. The ICC indicates how much of the variance in outcomes is explained by the grouping structure (Hox, 2002). It can also be thought of as the expected correlation between any two randomly selected members of the same group (Hox, 2002). Heck et al. (2010, 2012) define the ICC as the proportion of the total variance in the outcome that is due to differences between units at higher levels.
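The ICC can be computed directly from the variance components; a minimal sketch with made-up variance values follows (note that for a logistic GLMM, the level-1 residual variance on the latent scale is conventionally fixed at π²/3).

```python
import math

def icc(var_between, var_within):
    """Proportion of total variance attributable to the grouping structure."""
    return var_between / (var_between + var_within)

# Linear mixed model example: between-group variance 0.5, residual variance 1.5
print(icc(0.5, 1.5))  # 0.25

# Logistic GLMM: latent residual variance is pi^2 / 3, roughly 3.29
print(round(icc(1.0, math.pi ** 2 / 3), 3))
```

An ICC near zero suggests the grouping structure explains little outcome variance; larger values indicate that ignoring the grouping (and its random effect) would be a mistake.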

Another issue to keep in mind with GLMMs is overdispersion, which occurs when the data have greater variance than the statistical model predicts or expects, generating incorrect standard errors (Bell and Grunwald, 2011). Real-world count data often have a variance well above the mean. Milanzi et al. (2012) found that ignoring overdispersion significantly inflates Type I error rates, meaning the likelihood of detecting a spurious effect rises. In psychological studies, overdispersion can therefore lead to incorrect conclusions. In the field of addictive behavior, for example, count outcomes heavily skewed by many zero observations (e.g., total number of drinks, number of drinking problems, or days of drinking) are combined with repeated assessments (e.g., longitudinal follow-up after an intervention). Atkins et al. (2013) tackled the issue of overdispersion by using hierarchical or multilevel models to analyze longitudinal substance use data. Statistics based on the assumption of normally distributed residuals fail miserably when applied to such data. Moreover, data with many zeros cannot be adequately modeled by the standard probability distributions of generalized linear models. This type of data is much better suited to GLMM count regression models such as the overdispersed Poisson, negative binomial, zero-inflated, or Tobit models (Atkins and Gallop, 2007; Coxe et al., 2009; Hilbe, 2011; Thiele and Markussen, 2012; Aiken et al., 2015).
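One common diagnostic for this is the Pearson dispersion statistic, which should be near 1 when the Poisson assumption (variance equal to the mean) holds. The sketch below uses invented counts and a deliberately naive intercept-only fit to show how zero-heavy, skewed counts push the statistic well above 1.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Sum of squared Pearson residuals divided by residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)

counts = [0, 0, 0, 1, 0, 2, 9, 0, 0, 14]   # many zeros plus a few large counts
fitted = [2.6] * len(counts)               # intercept-only fit: the sample mean
phi = pearson_dispersion(counts, fitted, n_params=1)
print(round(phi, 2))  # a value far above 1: strong evidence of overdispersion
```

A dispersion this far above 1 would argue for a negative binomial or zero-inflated model rather than a plain Poisson.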

A study by Milanzi et al. (2012) found that while GLMMs can account for the heterogeneity caused by correlated measurements, additional sources of heterogeneity may affect statistical inferences if ignored. Due to repeated measurements and additional overdispersion, longitudinal Poisson data may have a high degree of heterogeneity.

Information on the estimation method should also be included. In its absence, it is difficult to judge not only the model's suitability, but also its reliability, validity, and accuracy. The choice of estimation method is influenced by the dependent variable and the random effects to be included in the model (Bolker et al., 2009). Many factors must be considered when selecting among the available options. When the standard deviations of the random effects are large, the penalized quasi-likelihood (PQL) method produces biased parameter estimates, especially with binary data (Bolker et al., 2009). For Poisson and binomial variables, PQL estimates can be biased if the mean counts within groups are less than five, or if the mean number of either successes or failures is less than five (Thiele and Markussen, 2012). Similarly, the Laplace approximation yields less biased estimates with a large number of clusters, and greater bias when there are fewer clusters in the data set (McNeish, 2016).

Finally, the information criterion used to evaluate or compare models and identify the best model fit must be considered. This is where the Akaike information criterion (AIC) comes in. A closely related alternative is the Bayesian information criterion (BIC), also known as the Schwarz criterion; compared to the AIC, the BIC prefers simpler models (Keselman et al., 1998). Variants of the AIC are further options (the AICc corrects the AIC for small sample sizes; the QAIC adjusts it for overdispersed data). Different indices may perform equally well, so the choice is left to the researcher's discretion, and authors may report several fit indices in the same study.
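For concreteness, these criteria can be computed directly from a model's maximized log-likelihood; the log-likelihood, parameter count, and sample size in this sketch are invented purely for illustration.

```python
import math

def aic(logL, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * logL

def bic(logL, k, n):
    """Bayesian (Schwarz) criterion: k ln n - 2 ln L; penalizes parameters more than AIC once n > 7."""
    return k * math.log(n) - 2 * logL

def aicc(logL, k, n):
    """Small-sample corrected AIC; converges to the AIC as n grows."""
    return aic(logL, k) + (2 * k * (k + 1)) / (n - k - 1)

logL, k, n = -250.0, 5, 60
print(aic(logL, k))                 # 510.0
print(round(bic(logL, k, n), 2))    # larger: BIC's heavier penalty favors simpler models
print(round(aicc(logL, k, n), 2))   # slightly larger than AIC: the n = 60 correction
```

Whichever criterion is chosen, lower values indicate better fit among the candidate models, and the criterion used should be stated in the report.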

GLMMs have become more popular as a result of statistical software. Even though GLMM theory and concepts have been around since the early 1990s, the inclusion of PROC GLIMMIX in SAS has made these techniques more widely available and usable in the behavioral and social sciences (Charnigo et al., 2011). R, STATA, and SPSS are some of the more popular software packages that include GLMM fitting procedures. Despite the fact that statistical software is readily available, these models remain complicated. Furthermore, GLMM computation algorithms may fail to converge due to a complex random and fixed effects structure.

Psychologists are starting to use generalized linear mixed models (GLMMs) more frequently, but they are still far less common than in other fields such as ecology (Bolker et al., 2009; Johnson et al., 2015; Kain et al., 2015), psychophysics (Moscatelli et al., 2012), biology (Thiele and Markussen, 2012), and medicine (Cnaan et al., 1998; Platt et al.). Casals et al. (2014) conducted a systematic review of the use and reporting quality of GLMMs in clinical medicine and found that while these models became more popular between 2000 and 2012, the reporting quality was poor. No systematic review of this type has yet been conducted in psychology, and we believe such a study would aid the proper application of GLMMs in the field.

This investigation had two main goals. The first was to examine how frequently GLMMs are used in studies published in high-profile psychology journals, and the trend in their use over time. Second, because it is important to show how these models can be applied in psychology, and why, we conducted a systematic review with an emphasis on frequentist models. Report quality was important to us, so we examined whether published psychology articles contained all the information necessary to evaluate GLMMs.


Generalized linear mixed models combine generalized linear models (which handle non-normal data by using link functions and fitting distributions from the exponential family, such as the binomial, multinomial, Poisson, gamma, lognormal, or exponential) with linear mixed models (which incorporate random effects). GLMMs are therefore well suited to the analysis of data with distributions other than normal, whether continuous or discrete. Fitting them requires explicitly specifying the distribution, link function, and random effects structure. Parameters can be estimated and significance tested using a variety of methods, whose suitability depends on the data.

GLMMs can be extremely useful to psychologists because they allow the analysis of categorical responses as well as counts or proportions, and their random effects allow findings to be generalized. Despite these advantages, however, GLMMs rarely appear in research published by prominent psychology journals (Blanca et al., 2018). The purpose of this study was thus to examine the use of these models in psychology between 2014 and 2018. Through a review of empirical studies published in 2018 that used GLMMs, we learned how these models are used and reported in psychology, identified where reports fall short, and assessed the overall quality of psychological reports that use GLMMs. We hope to encourage psychologists to use these statistical analyses more frequently, and correctly.

The number of articles in various fields of psychology that used GLMMs increased between 2014 and 2018, although at a slower rate than that observed by Casals et al. (2014) in clinical medicine between 2000 and 2012. A similar pattern occurred with LMMs, which were used in medicine before being gradually introduced into psychology (Bono et al., 2008). As researchers in psychology become more aware of the advantages of GLMMs, we hope these models will also come into wide use.

For the years 2014–2018, we examined 198 JCR-indexed journals for articles related to psychology. PLOS ONE was the journal with the most articles involving GLMMs. The concentration of these models in first- or second-quartile journals suggests that the use of these more advanced analytical models is linked to publication success in journals with a higher impact factor. Substance Abuse, Psychiatry, and Multidisciplinary Sciences were the most common JCR categories to which the articles corresponded. The United States published the largest number of articles using GLMMs, followed by the United Kingdom. The majority of first authors were from the United States, followed by Germany, Australia, the United Kingdom, the Netherlands, and Canada (all at much lower frequencies).

Although gamma and negative binomial distributions are widely used in health, education, and social sciences (Bono et al., 2017), our results show that they are still underutilized. Consistent with Casals et al. (2014) in clinical medicine, we found that the response variable's distribution was most commonly binomial. Even though our total number of GLMM analyses was greater than the number of articles examined by Casals et al. (2014), the percentages reported in the two studies are consistent for the type of distribution and the other variables examined.

We found that more than half of the GLMM analyses we examined failed to report the distribution of the response variable or the link function. These two elements are naturally paired, but different link functions may be appropriate for a given distribution (Garson, 2013). Furthermore, when there is overdispersion, it is common practice to use a different distribution than the one that would otherwise best fit the data; overdispersion is common, for example, in Poisson data sets.

To ensure the validity of GLMM estimates, it is important to report the estimation method used in a study. We found an estimation method reported in only 17.7% of analyses, which is similar to the percentage found by Casals et al. (2014) in clinical medicine. When the estimation method is not reported, the authors likely used the software's default (e.g., pseudo-likelihood is the default method in PROC GLIMMIX in SAS); however, this should be specified.

Maximum likelihood was the most commonly used method for parameter estimation in the psychology articles. When comparing models with fixed and random effects, maximum likelihood can underestimate the standard deviations of random effects (Bolker et al., 2009). Although the goal of those studies was to compare models, our review found that maximum likelihood was mostly used with small samples (fewer than 100). For GLMMs with no more than two or three random effects, Gauss–Hermite quadrature can provide more accurate estimates of fixed effects and variance components (Pan and Thompson, 2003; Bolker et al., 2009). The two GLMM analyses that used it each had a single random effect, so it was an appropriate method. In addition, we found three studies that used restricted maximum likelihood (REML), which is most appropriate when the dependent variable can be modeled with a normal distribution (Thiele and Markussen, 2012); of these three analyses, one nonetheless used a binomial distribution, while the other two did not.
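The underestimation noted above has a simple analogue that can be sketched in a few lines. This is an illustration of the bias, not a GLMM fit: for a plain normal sample, the ML variance estimate divides by n, while an REML-style estimate divides by n − 1 to account for having estimated the mean, removing the downward bias.

```python
import random

random.seed(42)

def ml_variance(xs):
    """ML estimate: divides by n, biased downward for small samples."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def reml_variance(xs):
    """REML-style estimate: divides by n - 1, unbiased."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Average both estimators over many small samples from N(0, 1); true variance = 1
n, reps = 10, 5000
ml_avg = reml_avg = 0.0
for _ in range(reps):
    xs = [random.gauss(0, 1) for _ in range(n)]
    ml_avg += ml_variance(xs) / reps
    reml_avg += reml_variance(xs) / reps
print(round(ml_avg, 2))    # noticeably below 1 with n = 10
print(round(reml_avg, 2))  # close to 1
```

With only 10 observations per sample, the ML average sits around 10% below the true variance, which mirrors why ML shrinks random-effect standard deviations in small samples.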


In clinical medicine, PQL was found to be the most commonly used estimation method, presumably because it is fast to compute (Breslow and Clayton, 1993). In the psychology analyses we reviewed, only a small percentage reported the estimation method, and none used PQL.

The percentage of studies reporting the goodness-of-fit method was lower in psychology than in the clinical medicine review of Casals et al. (2014) (69.5% vs. 84.3%). The AIC was the most commonly used criterion in both our study and theirs. Reporting this information is critical for determining whether the method employed is the most appropriate. Among the GLMM analyses we examined, seven used the AIC with samples of fewer than 100, where the AICc would have been more appropriate. The goodness-of-fit method was not specified in 15 analyses involving small samples (fewer than 100), so it is unclear whether the most appropriate criterion was used. The AICc was used with small samples in only two of the eight GLMM analyses examined (149 subjects).

Very little information was provided for the fixed effects test, consistent with the findings of Casals et al. (2014). The likelihood ratio test of fixed effects with maximum likelihood estimation is not recommended for small and moderate sample sizes (Pinheiro and Bates, 2000; Bolker et al., 2009; Cheng et al., 2010). A likelihood ratio test was used in six of the GLMM analyses we examined, but none of them specified the estimation method. Moreover, three of these six analyses had a sample size of fewer than 100 participants.
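The likelihood ratio test itself is straightforward: twice the gain in maximized log-likelihood from adding the fixed effect is compared against a chi-square reference. The log-likelihoods in this sketch are invented; 3.84 is the familiar .05 critical value of chi-square with one degree of freedom.

```python
def likelihood_ratio_stat(logL_reduced, logL_full):
    """LRT statistic for nested models: 2 * (lnL_full - lnL_reduced)."""
    return 2 * (logL_full - logL_reduced)

logL_without_effect = -210.3   # model omitting the fixed effect
logL_with_effect = -205.1      # model including it (never worse than the reduced model)
lr = likelihood_ratio_stat(logL_without_effect, logL_with_effect)
CHI2_CRIT_DF1 = 3.84           # chi-square .05 critical value, 1 degree of freedom
print(round(lr, 2), lr > CHI2_CRIT_DF1)  # 10.4 True
```

The chi-square reference is asymptotic, which is precisely why the test is discouraged at the small sample sizes observed in our review.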

We were unable to quantify the statistical modeling strategies (forward selection, backward elimination, best subset, and stepwise procedures) because, as a rule, they were not described. Statistical modeling was used in only half of the GLMM analyses; almost all of the remaining studies used a full model that included all available predictor variables, even non-significant ones. Statistical modeling can be beneficial when there are many predictor variables; full models provide unbiased estimates but retain predictors of no consequence. Whether statistical modeling is used for inference depends on the study's goals. Stepwise procedures for inference have been criticized because the order in which parameters are entered or removed can affect the selection result, the parameter estimates can be biased, and the multiple tests involved inflate Type I errors (Burnham and Anderson, 2002). Best subset modeling with GLMMs can easily become computationally expensive when there are multiple fixed or random effects, which is why Thiele and Markussen (2012) recommend building models by backward selection.

Most of the GLMM studies reviewed did not report the random effects test, which is consistent with findings from clinical medicine (Casals et al., 2014). For random effect inferences, the likelihood ratio test is useful, but only for nested models. This necessitates that the hierarchical structure be made clear. A single random effect encompassing participant variation was the focus of the majority of the analyses examined here.

Repeated measures data are common in psychology, and GLMMs can model their autocorrelation using structured covariance matrices, such as the autoregressive structure. There was very little discussion of the covariance matrix structure in any of the analyses examined here, so it is not clear whether psychologists prefer structured or unstructured covariance matrices in their research. Note also that some statistical software does not allow the researcher to define the covariance structure when fitting GLMMs (Thiele and Markussen, 2012).
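As an illustration, the first-order autoregressive (AR(1)) structure sets the correlation between measurements at times i and j to ρ^|i−j|, so correlation decays as measurements grow further apart in time; a small sketch:

```python
def ar1_matrix(n_times, rho):
    """Build an AR(1) correlation matrix: entry (i, j) equals rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]

# Four repeated measurements with lag-1 correlation 0.6
R = ar1_matrix(4, 0.6)
for row in R:
    print([round(v, 3) for v in row])
# Adjacent time points correlate at 0.6; points three steps apart at 0.6 ** 3
```

Because the whole matrix is governed by a single parameter ρ, AR(1) is far more parsimonious than an unstructured covariance matrix, which estimates every pairwise covariance separately.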

Although ignoring overdispersion can cause problems, 90.7% of GLMM analyses did not specify whether it was assessed; Casals et al. (2014) found the same in medicine. Overdispersion is, however, only a concern for certain types of distribution.

It is also important to mention the statistical software used, as this can affect the estimation methods available and the time the calculations take. The most commonly used software suites in these studies were SAS, where GLMMs are fitted with PROC GLIMMIX, and R, where lme4 was a popular package for fitting GLMMs. SPSS, STATA, and HLM were used less frequently. Even with statistical software, GLMMs are difficult to implement, even for statisticians, and models commonly fail to converge, particularly when there are many random and fixed effects.

Our primary goal in this article was to examine whether the information provided in psychology studies using generalized linear mixed models (GLMMs) is sufficient for researchers experienced with these models to evaluate the methodological quality of the studies and the validity of the results. We found that report quality in psychology was similar to that observed in clinical medicine by Casals et al. (2014). Although GLMMs are well established in the statistical literature, the psychological literature frequently employs models with only fixed effects, and a lack of familiarity with complex models in applied psychology leads to a tendency to report statistical significance without considering other important aspects.

The purpose of this article was to examine how studies using GLMMs report key information, rather than to look for anomalies in their application. Even so, we found that the models were not always used correctly, which leads us to suspect that the analyses may contain a high number of model misspecifications. Scientists should therefore proceed with caution and a thorough knowledge of GLMMs when performing these types of analyses.

Following our review of the literature, we believe that GLMM results in psychology journals should be presented according to a set of minimum standardized guidelines. The minimum information to report in any statistical analysis includes: the measurement scale and distribution of the dependent variable, the link function, estimation method, goodness-of-fit method, fixed effects test, statistical modeling strategy, random effects test, variance estimates of the random effects, and the covariance structure for repeated measures. The specification of random effects in a GLMM analysis is of particular interest; Stroup et al. (2018) provide examples and recommendations on how to configure the random effects structure. The quality of GLMM analysis in psychology could also be improved by reporting power calculations; for these, readers can refer to Stroup (2013, 2014) and Stroup et al. (2018).

There is a need for better reporting when GLMMs are used in psychology, which suggests a lack of understanding of what information from GLMM analyses should be presented. In our systematic review, most of the important information about the GLMMs was not stated in most of the included articles, and none of the articles reviewed contained all of the relevant information. This lack of reporting on key aspects (e.g., estimation method, link function, goodness-of-fit method, and overdispersion evaluation) makes it difficult to evaluate the GLMM approaches used.
