---
title: "ANOFA vs HiLogLinear"
bibliography: "../inst/REFERENCES.bib"
csl: "../inst/apa-6th.csl"
output:
  rmarkdown::html_vignette
description: >
  This vignette describes the differences between ANOFA and
  hierarchical log-linear models.
vignette: >
  %\VignetteIndexEntry{ANOFA vs HiLogLinear}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r, echo = FALSE, message = FALSE, results = 'hide', warning = FALSE}
cat("this is hidden; general initializations.\n")
library(ANOFA)
```

# ANOFA vs. hierarchical log-linear modeling

Some have noticed similarities between ANOFA and hierarchical log-linear models.
Indeed, the starting point of the two techniques is the same: the use of a G statistic,
which computes the log-likelihood ratio of two competing models, one being a restricted
version of the other [@lc23b].

@s15 provided an excellent overview of frequency analyses. He ends by citing @d83:
"It is not difficult to argue that log-linear models will eventually supersede the use of
Pearson's chi-square in the future because of their similarity to analysis of variance
(ANOVA) procedures and their extension to higher order tables."

However, a major schism happened in 1940, and from that moment on, hierarchical log-linear
models took an unexpected direction, away from ANOFA.

# What happened in 1940?

@ds40 published an article related to the U.S. census. In it, they noted, while working with
a classification table having more than two factors, that the expected cell frequencies
computed from products of maximum likelihood estimators (MLE) did not sum to the number
of observations. As it turns out, this happens only when there are three or more
classification dimensions.

As a solution, they proposed to generate the expected marginals using an iterative
algorithm (SPSS calls it the _iterative proportional-fitting algorithm_).
@f70a described this algorithm in more detail, showing that it always converges,
and does so in just a few iterations, which makes it a very convenient algorithm.
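
The iterative procedure can be illustrated with a minimal sketch. It assumes a small 2 x 2 x 2 table of made-up counts and a model in which all three two-way margins are reproduced; the helper `ipf()`, its arguments `tol` and `maxit`, and the data are hypothetical and written only for this illustration. The result is compared with base R's `loglin()`, which also fits by iterative proportional fitting.

```{r}
# Hypothetical 2 x 2 x 2 table of counts (illustrative data, not from any study)
obs <- array(c(10, 20, 30, 40, 25, 15, 35, 5), dim = c(2, 2, 2))

# Margins to be reproduced: the three two-way tables (an assumption of this sketch)
margins <- list(c(1, 2), c(1, 3), c(2, 3))

# ipf() is a hypothetical helper, not part of ANOFA or of base R
ipf <- function(obs, margins, tol = 1e-8, maxit = 100) {
  fit <- array(1, dim = dim(obs))          # start from a flat table
  for (it in seq_len(maxit)) {
    old <- fit
    for (m in margins) {
      # rescale the fit so that its margin m equals the observed margin m
      ratio <- apply(obs, m, sum) / apply(fit, m, sum)
      fit   <- sweep(fit, m, ratio, "*")
    }
    # stop once a full cycle changes the fit by less than tol
    if (max(abs(fit - old)) < tol) break
  }
  fit
}

fit <- ipf(obs, margins)

# The fitted cells add up to the number of observations
c(fitted = sum(fit), observed = sum(obs))

# Largest discrepancy with the fit returned by loglin()
ref <- loglin(obs, margins, fit = TRUE, eps = 1e-8, iter = 100, print = FALSE)$fit
max(abs(fit - ref))
```

Each pass rescales the fitted table one margin at a time, which is why the fitted cells end up summing to the observed total.
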
@f70b also claimed that the marginals so estimated were suitable for log-linear models.
Previous work showed that estimates obtained in that way maximize the
likelihood of a model with fixed marginal totals (a _product-multinomial model_,
which can be pictured as a multinomial with sub-multinomial layers) [@f07, p. 168],
which is not the adequate model when sampling is totally free across a multidimensional
classification.

This _iterative proportional-fitting algorithm_ became the norm and is implemented
in most software performing log-linear model fitting. Fienberg was an influential
advocate of this algorithm (see his 1980 book, re-edited in 2007; @f07, chapter 3).

# What are the pros and cons of using the _iterative proportional-fitting algorithm_?

### Pros:

* When the predicted cell frequencies are added, they sum to the observed total;

* The G statistic is never negative.

### Cons:

* The expected marginal frequencies computed with this algorithm are *not* maximum likelihood estimators (MLE);

* Consequently, the @w38 theorem, which says that asymptotically the G statistic (a likelihood ratio of MLEs) follows a chi-square distribution, is no longer applicable to hierarchical log-linear models;

* Also, the @w76 correction to the chi-square distribution for small samples is likewise no longer valid in hierarchical log-linear models;

* @np33 showed that tests based on the likelihood ratio of MLEs result in the most powerful statistical tests of hypothesis, a guarantee that no longer holds when the estimates are not MLEs;

* The G statistics are no longer additive: they no longer sum to $G_{\rm{total}}$;

* Consequently, it is no longer possible to decompose the total G statistic into main effects and interactions, into simple effects, or into orthogonal contrasts.

In our opinion, the disadvantages of using the iterative algorithm *by far* exceed the advantages it offers.
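
The additivity at stake can be checked numerically in the simplest case. The sketch below assumes a two-factor table, where the expected frequencies have closed forms and no iterative fitting is involved, and it assumes equiprobable cells as the null hypothesis for the total and for the main effects. The counts and the helper `Gstat()` are hypothetical, introduced only for this illustration.

```{r}
# Hypothetical 2 x 3 table of counts (illustrative data only)
n  <- matrix(c(12, 8, 20, 15, 5, 30), nrow = 2)
N  <- sum(n)
nA <- nrow(n)   # number of levels of factor A (rows)
nB <- ncol(n)   # number of levels of factor B (columns)

# G statistic: twice the sum of observed * log(observed / expected)
Gstat <- function(observed, expected) 2 * sum(observed * log(observed / expected))

rowTot <- rowSums(n)
colTot <- colSums(n)

G_total <- Gstat(n, N / (nA * nB))               # all cells equiprobable
G_A     <- Gstat(rowTot, N / nA)                 # rows equiprobable
G_B     <- Gstat(colTot, N / nB)                 # columns equiprobable
G_AxB   <- Gstat(n, outer(rowTot, colTot) / N)   # independence of A and B

# The three components add up to the total
c(total = G_total, sum_of_parts = G_A + G_B + G_AxB)
```

Under these assumptions the two values printed are equal up to rounding, which is the property that allows a total G to be split into main effects and an interaction.
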
Wilks' and Williams' theorems are the foundations that make ANOFA a sound technique,
and Neyman and Pearson's theorem implies that ANOFA is statistically the most powerful test.

# References