7.0 Decomposition of Variability: Between and Within Institution Components
This section discusses the finding that almost all of the
variability in publication and citation counts and in resources used for
research occurs between institutions, rather than across years within
institutions. As a result, models based on the dataset are more sensitive to
differences between institutions (e.g., the relationship of average resources
to average publications across years) than to how changes in resources within
institutions are related to changes in publication counts for those institutions.
Most of the variability in the citation and publication counts,
when aggregated to the institution-year level, occurs between institutions.
Regressing the first principal component on indicator variables for
institution results in an r-square of 0.918. (If institution is entered as a
random effect rather than a fixed effect, the r-square is 0.910.) Thus, to a
substantial extent, modeling using the publication trends database will be
differentiating publication and citation levels between institutions, rather
than modeling changes that occur over time within institutions. Adding a
linear year term as an independent variable increases r-square to 0.934.
Adding an interaction between year and institution (which allows for different
linear publication trends by institution) increases r-square to 0.943.
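The sequence of nested fixed-effect models described above can be sketched as follows. This is a minimal illustration, not the analysis code behind the reported figures: the panel is synthetic, the sizes and noise levels are arbitrary assumptions, and the names `r_squared`, `inst`, `year`, and `y` are hypothetical. The outcome `y` stands in for the first principal component.

```python
import numpy as np

def r_squared(X, y):
    """R-square from an ordinary least squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Synthetic institution-year panel (assumed values, not the report's data).
rng = np.random.default_rng(0)
n_inst, n_years = 50, 10
inst = np.repeat(np.arange(n_inst), n_years)           # institution index per row
year = np.tile(np.arange(n_years, dtype=float), n_inst)
level = rng.normal(0.0, 3.0, n_inst)                    # institution-specific level
slope = rng.normal(0.1, 0.05, n_inst)                   # institution-specific trend
y = level[inst] + slope[inst] * year + rng.normal(0.0, 0.5, inst.size)

# One indicator column per institution; together they span the intercept.
D = (inst[:, None] == np.arange(n_inst)).astype(float)

r2_inst = r_squared(D, y)                                 # institution only
r2_year = r_squared(np.column_stack([D, year]), y)        # plus a common linear year term
r2_inter = r_squared(np.column_stack([D, D * year[:, None]]), y)  # plus year-by-institution slopes

print(f"institution only:        {r2_inst:.3f}")
print(f"+ linear year:           {r2_year:.3f}")
print(f"+ year x institution:    {r2_inter:.3f}")
```

Because each model nests the previous one, the r-square values cannot decrease from one step to the next. The size of each increment indicates how much additional variability is attributable to a common trend and to institution-specific trends.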
Most of the variability in the independent variables also occurs
between institutions. Institution accounts for the following proportions of
the variance in prominent independent variables: total academic R&D (97.6%),
number of S&E Ph.D. recipients (98.3%), number of postdoctoral researchers
(97.5%), number of faculty (97.1%), and number of non-faculty doctoral research
staff (92.6%).
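One common way to obtain a proportion of this kind is the ratio of the between-institution sum of squares to the total sum of squares. For a single grouping factor, this ratio equals the r-square from regressing the variable on institution indicators. The sketch below assumes that definition. The report does not state the exact computation used, and the function name `between_share` and the toy values are hypothetical.

```python
import numpy as np

def between_share(values, groups):
    """Share of total variance in `values` that lies between the groups."""
    values = np.asarray(values, dtype=float)
    _, codes = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(codes)
    group_means = np.bincount(codes, weights=values) / counts
    grand_mean = values.mean()
    ss_total = np.sum((values - grand_mean) ** 2)
    ss_between = np.sum(counts * (group_means - grand_mean) ** 2)
    return ss_between / ss_total

# Toy example: two institutions observed in two years each (made-up numbers).
share = between_share([0.0, 2.0, 4.0, 6.0], ["A", "A", "B", "B"])
print(share)  # 0.8
```

In the toy example, the total sum of squares is 20 and the between-institution sum of squares is 16, so 80% of the variance lies between institutions and the remaining 20% reflects year-to-year variation within them.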