Understanding GMMAT Model Fit and AIC
Introduction to Generalized Linear Mixed Models (GLMMs) with the GMMAT Package
GMMAT (Generalized linear Mixed Model Association Tests) is an R package for fitting generalized linear mixed models (GLMMs) in which random effects capture sample structure such as genetic relatedness, typically supplied as a genetic relatedness matrix (GRM). In this article, we will explore how to fit GLMMs with the GMMAT package and how to approximate fit statistics such as AIC, which the package does not report directly.
Fitting a Model with GMMAT
To begin, we need to load the required package and data. The GMMAT package ships an example object, example, which contains the phenotype data frame pheno and the genetic relatedness matrix GRM. We will use it to demonstrate how to fit a model with the glmmkin() function.
# Load necessary libraries
library(GMMAT)
data(example)
attach(example)
# Fit a model with age, sex, and genetic relatedness matrix
model0 <- glmmkin(disease ~ age + sex, data = pheno, kins = GRM, id = "id",
family = binomial(link = "logit"))
# Print the estimated variance components (dispersion and genetic variance)
model0$theta
# Print the fixed-effect coefficient estimates
model0$coefficients
# Print the covariance matrix of the fixed-effect estimates
model0$cov
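Since glmmkin() does not print p-values, one option is to build Wald tests from a coefficient vector and its covariance matrix, the same two pieces stored in model0$coefficients and model0$cov. The following is a minimal sketch: the helper name wald_table and the numbers fed to it are made up for illustration and are not output from the example data.

```r
# Hypothetical helper: Wald z-tests from estimates and their covariance matrix.
wald_table <- function(estimates, cov_matrix) {
  se <- sqrt(diag(cov_matrix))   # standard errors from the diagonal
  z  <- estimates / se           # Wald z-statistics
  p  <- 2 * pnorm(-abs(z))       # two-sided p-values from the normal distribution
  data.frame(estimate = estimates, std_error = se, z = z, p_value = p)
}

# Illustrative inputs only (not results from the GMMAT example data).
est <- c(intercept = 0.50, age = 0.02, sex = -0.40)
V   <- diag(c(0.25, 0.0001, 0.04))
wald <- wald_table(est, V)
print(wald)
```

With a fitted model, the call would be wald_table(model0$coefficients, model0$cov), assuming those components hold the fixed-effect estimates and their covariance matrix. The normal approximation is itself an assumption that is more reasonable in large samples.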
Fitting a Restricted Model
To compare models, we need to fit a restricted model. In this case, we will fit a model with only age as the predictor variable.
# Fit a model with only age as the predictor variable
model1 <- glmmkin(disease ~ age, data = pheno, kins = GRM, id = "id",
family = binomial(link = "logit"))
Extracting Fit Statistics from GMMAT Models
For many model classes, summary() reports fit statistics. For models fitted with glmmkin(), however, it does not provide AIC, R2, or p-values.
One way to approximate AIC manually is the least-squares formula:
AIC = nrow(data) * log(sum(residuals^2) / nrow(data)) + 2 * number_of_coefficients
However, this formula assumes Gaussian errors and matches the likelihood-based AIC only up to an additive constant. For a binary outcome such as disease it is at best a rough heuristic, and its penalty term counts only the fixed-effect coefficients, not the variance components.
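To see where the least-squares formula comes from, the sketch below checks it against R's built-in AIC() on an ordinary linear model. The mtcars dataset and the chosen predictors are used only for illustration and have nothing to do with GMMAT.

```r
# Ordinary Gaussian linear model on a built-in dataset (not GMMAT data).
fit <- lm(mpg ~ wt + hp, data = mtcars)
n   <- nrow(mtcars)
k   <- length(coef(fit))                  # number of regression coefficients
rss <- sum(residuals(fit)^2)              # residual sum of squares

aic_shortcut <- n * log(rss / n) + 2 * k  # least-squares formula
aic_exact    <- AIC(fit)                  # full Gaussian likelihood; also counts sigma

# The two differ by a constant that depends only on n, not on the model.
offset <- n * (log(2 * pi) + 1) + 2
print(c(shortcut = aic_shortcut, exact = aic_exact, offset = offset))
```

Because the offset is the same for every Gaussian model fitted to the same data, the shortcut ranks such models exactly as AIC() does. That argument relies on the Gaussian likelihood, which is why it does not carry over to a binomial model.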
Using anova() for Model Comparison
The anova() function is the usual way to compare two nested models in R. However, the GMMAT package does not provide an anova() method for glmmkin fits.
Instead, we can compare the models through approximate AIC values computed by hand:
# Observed binary outcome (assumes glmmkin dropped no rows for missing data)
y <- pheno$disease
# Bernoulli log-likelihood at each model's fitted probabilities.
# glmmkin fits by penalized quasi-likelihood, so treat this as a rough proxy.
log_lik_model0 <- sum(y * log(model0$fitted.values) + (1 - y) * log(1 - model0$fitted.values))
log_lik_model1 <- sum(y * log(model1$fitted.values) + (1 - y) * log(1 - model1$fitted.values))
# Approximate AIC: -2 * log-likelihood + 2 * number of fixed-effect coefficients
AIC_model0 <- -2 * log_lik_model0 + 2 * length(model0$coefficients)
AIC_model1 <- -2 * log_lik_model1 + 2 * length(model1$coefficients)
# Print the AIC values of the two models
cat("AIC Model 0:", AIC_model0, "\n")
cat("AIC Model 1:", AIC_model1, "\n")
# Compare the AIC values of the two models
if (AIC_model0 < AIC_model1) {
  cat("Model 0 has a lower AIC value.\n")
} else if (AIC_model0 > AIC_model1) {
  cat("Model 1 has a lower AIC value.\n")
} else {
  cat("Both models have the same AIC value.\n")
}
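This approach gives only a rough estimate of each model's AIC. Because both models are fitted to the same data with the same random-effects structure, the two values can be compared with each other, but they should not be read as exact AIC values.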
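One way to sanity-check a hand-computed AIC for a binary outcome is to try it on an ordinary logistic regression, where R's built-in AIC() is available as a reference. The sketch below uses AIC = -2 * log-likelihood + 2 * (number of coefficients) with the Bernoulli log-likelihood. The helper name bernoulli_aic and the simulated data are assumptions made for illustration, not part of GMMAT.

```r
# Hypothetical helper: AIC from a 0/1 outcome and fitted probabilities.
bernoulli_aic <- function(y, p, n_params, eps = 1e-12) {
  p <- pmin(pmax(p, eps), 1 - eps)  # guard against log(0)
  log_lik <- sum(y * log(p) + (1 - y) * log(1 - p))
  -2 * log_lik + 2 * n_params
}

# Simulated data (illustrative only, not the GMMAT example).
set.seed(1)
n   <- 200
age <- rnorm(n, mean = 50, sd = 10)
sex <- rbinom(n, size = 1, prob = 0.5)
y   <- rbinom(n, size = 1, prob = plogis(-2 + 0.03 * age + 0.5 * sex))

# Plain logistic regressions without random effects.
fit_full <- glm(y ~ age + sex, family = binomial(link = "logit"))
fit_age  <- glm(y ~ age, family = binomial(link = "logit"))

aic_full <- bernoulli_aic(y, fitted(fit_full), length(coef(fit_full)))
aic_age  <- bernoulli_aic(y, fitted(fit_age), length(coef(fit_age)))
print(c(manual_full = aic_full, builtin_full = AIC(fit_full),
        manual_age = aic_age, builtin_age = AIC(fit_age)))
```

For a glmmkin fit there is no built-in reference value, so agreement here confirms only the arithmetic, not that the quantity is a valid AIC for a mixed model.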
Conclusion
In this article, we explored how to fit GLMMs using the GMMAT package in R and how to approximate AIC values, which the package does not report directly. We noted that anova() is not available for these fits and provided an example of calculating AIC values manually.
By following these steps, you should be able to fit GLMMs with the GMMAT package and make a rough comparison of nested models using approximate AIC values.
Example Use Cases
- Mental Health Research: The GMMAT package can be used to model mental health outcomes, such as depression or anxiety, in relation to genetic variables.
- Genetic Epidemiology: The package can also be used to investigate the relationship between genetic variants and disease susceptibility.
- Psychiatric Genomics: The GMMAT package can be employed to analyze psychiatric phenotypes and identify associated genetic markers.
Future Work
- Implementing anova() for GMMAT Models: Developing an anova() method specifically for glmmkin fits would greatly improve the usability of the GMMAT package.
- Extension to Other GLMMs: Checking how well the package handles other families, such as Poisson or Gamma, and extending support where it is lacking, would increase its applicability in various fields.
Last modified on 2024-08-22