nsf.gov - NCSES U.S. Academic Scientific Publishing - US National Science Foundation (NSF)
National Science Foundation National Center for Science and Engineering Statistics
U.S. Academic Scientific Publishing

11.0 Key Factors Associated with Publication Counts at the Field Group Level

 

In this section we develop models for the five field groups: biology, life and agricultural sciences (bio-life-ag); computer sciences; engineering, math and physical sciences (eng-math-physical sciences); medical sciences; and social sciences and psychology (soc-sciences).

Sections 11.1 and 11.2 discuss the models that explain fractional count publications and whole count publications, respectively, in the expanding journal set. We initially performed separate regressions for each field group. The explanatory variables turned out to be very similar (both across field groups and across the two publication measures), and the same three variables that were used to explain institution-level variability were able to explain most of the variability for individual field groups.

In section 11.3, we model institution-level publications by modeling each field group separately and summing the expected publications to the institution level. The model's fit at the institution level shows only a modest improvement. This suggests that institutional variation in the proportion of publications from various field groups (i.e., disciplinary concentration) does not especially help in explaining institution-level publication variability.

In section 11.4, we examine the ratio of observed to expected publications over time by field group. Increases in resources used per publication were greatest for medical sciences and bio-life-ag (approximately 33%) and intermediate for soc-sciences (approximately 23%), with fairly constant trends during the entire time period from 1990 to 2001. Resources used per publication in the eng-math-physical sciences field group increased about 15% from 1990 to 1997 and then reversed itself, resulting in a net increase of about 9% from 1990 to 2001. Resources used per publication in computer sciences followed an erratic pathway, resulting in an increase of approximately 13% over this time period.

11.1 Analyses of Fractional Count Publications in the Expanding Journal Set

We performed regression modeling for each field group using publications as measured by fractional counts in the expanding journal set as the dependent variable. The results (excluding variables with an incremental r-square smaller than 0.01) and the cumulative r-square values are as follows:

  • Biology, life and agricultural sciences (bio-life-ag)[45]: Postdoctorates supported by federal research grants by field (0.766), S&E doctoral recipients by field (0.866), S&E postdoctorates by field (0.900), and other funding sources of academic R&D expenditures by institution (0.914). (Other funding sources of academic R&D expenditures includes R&D funded by foundation grants and excludes R&D funded with federal, industry, state, local or institution funds.)  All variables yielded an r-square of 0.937.
  • Computer sciences[46]: S&E doctoral recipients by field (0.634), total academic R&D expenditures by field (0.707), and basic research expenditures by institution (0.738). All variables yielded an r-square of 0.774.
  • Engineering, math and physical sciences (eng-math-physical sciences)[47]: S&E doctoral recipients by field (0.888) and S&E postdoctorates by field (0.931). All variables yielded an r-square of 0.953.
  • Medical sciences[48]:  Total academic R&D expenditures by field (0.771), S&E postdoctorates with M.D.s supported by federal research grants (0.858), S&E doctoral recipients by field (0.880), and Carnegie R-1 classification (0.891). Carnegie Medical classification almost qualified, with an incremental r-square of 0.009. All variables yielded an r-square of 0.932.
  • Social sciences and psychology (soc-sciences)[49]: S&E doctoral recipients by field (0.790), basic research by institution (0.842), S&E postdoctorates by field (0.863), and full-time assistant professors by institution (0.874). All variables yielded an r-square of 0.913.
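The cumulative r-square values listed above are consistent with a forward selection procedure in which a variable is kept only if it raises r-square by at least 0.01. The source does not name the exact procedure, so the following is only a minimal sketch under that assumption. The function names, the synthetic data, and the variable labels are all hypothetical and are not taken from the study.

```python
import numpy as np

def r_squared(X, y):
    """R-square of an ordinary least squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(resid ** 2) / ss_tot

def forward_select(X, y, names, min_gain=0.01):
    """Greedy forward selection (assumed procedure).

    At each step, add the variable that gives the highest cumulative
    r-square; stop once the best incremental gain is below min_gain.
    """
    chosen, steps, current = [], [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: r_squared(X[:, chosen + [j]], y) for j in remaining}
        best = max(scores, key=scores.get)
        if scores[best] - current < min_gain:
            break
        chosen.append(best)
        remaining.remove(best)
        current = scores[best]
        steps.append((names[best], round(current, 3)))
    return steps

# Synthetic illustration only: two informative columns, two pure-noise columns.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
names = ["doctoral_recipients", "postdocs", "rd_expenditures", "noise_var"]

steps = forward_select(X, y, names)
for name, cumulative in steps:
    print(f"{name}: cumulative r-square {cumulative}")
```

Each printed line pairs a variable with the cumulative r-square after it enters, which mirrors how the bullets above report their values.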

We compared the explanatory ability of the models listed above with a model including only total academic R&D expenditures, the number of S&E postdoctorates, and the number of S&E Ph.D. recipients in the five field groups. The r-squares for this simplified model[50] are as follows: 1) bio-life-ag (0.894), 2) computer sciences (0.686), 3) eng-math-physical sciences (0.937), 4) medical sciences (0.857), and 5) soc-sciences (0.840). This simple three-variable model was thus able to account for most (i.e., at least 89%) of the explanatory power of the field group specific models described above.
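The "at least 89%" figure can be checked by dividing each simplified-model r-square by the corresponding r-square with all variables included. This sketch assumes that the all-variables r-squares are the intended comparison, which is a reading of the text and not something the source states explicitly. The numbers are copied from the fractional count results above.

```python
# r-squares with all variables, from the field group bullets above
all_variables = {
    "bio-life-ag": 0.937,
    "computer sciences": 0.774,
    "eng-math-physical sciences": 0.953,
    "medical sciences": 0.932,
    "soc-sciences": 0.913,
}

# r-squares for the simplified three-variable model
three_variable = {
    "bio-life-ag": 0.894,
    "computer sciences": 0.686,
    "eng-math-physical sciences": 0.937,
    "medical sciences": 0.857,
    "soc-sciences": 0.840,
}

# Share of explanatory power retained by the simplified model
share = {field: three_variable[field] / all_variables[field] for field in all_variables}
lowest = min(share, key=share.get)

for field, value in share.items():
    print(f"{field}: {100 * value:.1f}%")
print(f"lowest share: {lowest}")
```

Under this reading, the lowest share belongs to computer sciences and rounds to 89%.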

The coefficients for the three-variable regression model appear in table 5. S&E Ph.D. recipients appear to be more important in the bio-life-ag and medical sciences fields than in the other fields. Total academic R&D expenditures appear generally to be more important and postdocs less important in the medical sciences field than in the other fields. An interesting aspect of this table is that the coefficient for total academic R&D expenditures is larger in the institution-level regression than in all but one of the field groups, and the coefficient for S&E Ph.D. recipients is smaller in the institution-level regression than for most of the field groups.


11.2 Analyses of Whole Count Publications in the Expanding Journal Set

We performed regression modeling for each field group using publications as measured by whole counts in the expanding journal set as the dependent variable. For two field groups, eng-math-physical sciences and medical sciences, we obtained similar r-squares using the same explanatory variables as in the fractional count models. We also obtained similar r-squares in the other three field groups, but the type and number of explanatory variables differed between the fractional and whole count publication models.

  • Bio-life-ag[51]: S&E postdoctorates without an M.D. degree by field (0.804), S&E doctoral recipients by field (0.893), and S&E postdoctorates with an M.D. degree by field (0.909). All variables yielded an r-square of 0.938.
  • Computer sciences[52]: S&E doctoral recipients by field (0.642), total academic R&D expenditures by field (0.706), and basic research expenditures by institution (0.744). All variables yielded an r-square of 0.787.
  • Eng-math-physical sciences[53]: S&E doctoral recipients by field (0.868), and S&E postdoctorates by field (0.930).  All variables yielded an r-square of 0.949.
  • Medical sciences[54]: Total academic R&D expenditures by field (0.732), S&E postdoctorates with M.D.s supported by federal research grants (0.871), S&E doctoral recipients by field (0.890), and Carnegie R-1 classification (0.900). Carnegie Medical classification almost qualified, with an incremental r-square of 0.008. All variables yielded an r-square of 0.938.
  • Soc-sciences[55]: S&E doctoral recipients by field (0.781), basic research expenditures by institution (0.842), and S&E postdoctorates by field (0.863). All variables yielded an r-square of 0.913.

We compared the explanatory ability of the models listed above with a model including only total academic R&D expenditures, the number of S&E postdoctorates, and the number of S&E Ph.D. recipients in the five field groups.[56] The r-squares for this simplified model are as follows: 1) bio-life-ag (0.902), 2) computer sciences (0.688), 3) eng-math-physical sciences (0.937), 4) medical sciences (0.869), and 5) soc-sciences (0.835). This simple three-variable model is thus able to account for most (i.e., at least 87%) of the explanatory power of the field group specific models described above.


11.3 Improving Model's Fit Using Field-Specific Publication Estimates

Estimation of publications as measured by fractional and whole counts in the expanding journal set at the institution level can be slightly improved by combining field group specific estimates. We performed regressions for each individual field group and then combined the estimates to obtain expected publications at the institution level. For publications as measured by fractional counts in the expanding journal set, the r-square for the institution-level regression was 91.8%, corresponding to a root mean square error (RMSE) of 175. The corresponding r-square and RMSE for the aggregated field group estimates were 93.8% and 153, respectively. Thus, RMSE was reduced by 12%. For publications as measured by whole counts in the expanding journal set, the r-square for the institution-level regression was 93.4%, corresponding to an RMSE of 244. The corresponding r-square and RMSE for the aggregated field group estimates were 94.5% and 223, respectively. Thus, RMSE was reduced by 9%.
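The RMSE reductions quoted above are relative changes, and they can be approximately reproduced from the rounded RMSE values in the text. The helper name `pct_reduction` is hypothetical. Because the inputs here are already rounded, the results may differ slightly from percentages computed on the unrounded values.

```python
def pct_reduction(before, after):
    """Relative reduction, in percent, when moving from `before` to `after`."""
    return 100.0 * (before - after) / before

# Rounded RMSE values as reported in the text
fractional = pct_reduction(175, 153)  # institution-level vs. aggregated field groups
whole = pct_reduction(244, 223)

print(f"fractional counts: RMSE reduced by {fractional:.1f}%")
print(f"whole counts: RMSE reduced by {whole:.1f}%")
```

Both results land close to the reported 12% and 9%.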


11.4 Ratio of Observed to Expected Publications Over Time

Figure 31 is a plot of the ratio of observed to expected publications as measured by whole counts in the expanding journal set by field group (normalized so that 1988 equals 1.0). Values for 2000 were interpolated from expected publications for 1999 and 2001. Three of the field groups (bio-life-ag, medical sciences, and soc-sciences) show a decreasing trend until about 1999, after which the curves are essentially flat. That is, these fields had increasing resource needs per whole count article over time, but their resource requirements have recently stabilized. The trends for bio-life-ag and medical sciences are particularly similar. The other two field groups (computer sciences and eng-math-physical sciences) do not demonstrate any particular trend until approximately 1998 or 1999, when they begin to increase, with the increase being particularly noteworthy for eng-math-physical sciences. That is, the resources required per whole count article remained essentially flat for computer sciences and eng-math-physical sciences until the most recent years, when they began to decrease.

Figure 32 is a plot of the ratio of observed to expected publications as measured by fractional counts in the expanding journal set by field group (normalized so that 1988 equals 1.0). Values for 2000 were interpolated from expected publications for 1999 and 2001. All of the field groups show a decreasing trend until about 1997. The trend is fairly steep, and is partially accounted for by increased resource requirements per whole count and partially by greater collaboration with other institutions. After 1997, three field groups (bio-life-ag, medical sciences, and soc-sciences) show a continuing decrease. Since the resource requirements per whole count stabilized during this time period, the continuing decrease in fractional counts must be attributable to continuing increases in collaboration. Two field groups (computer sciences and eng-math-physical sciences) show a flattening of the curve in the range of about 1996 to 1998, and increases in the ratio from 1998 to 2001. The most recent years were associated with dramatic decreases in resource needs per whole count publication, and the corresponding increase that we would expect in the ratio of observed to expected publications as measured by fractional counts appears to be attenuated for these two fields by continuing increases in collaboration.
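The two preparation steps behind both figures, interpolating the 2000 value of expected publications and normalizing the ratio so that 1988 equals 1.0, can be sketched as follows. This assumes simple linear interpolation (the midpoint of 1999 and 2001), which the text implies but does not spell out. The function names and all publication counts are hypothetical and chosen only for illustration.

```python
def fill_missing_year(expected, year=2000):
    """Fill a missing year with the mean of its two neighboring years."""
    filled = dict(expected)
    if year not in filled:
        filled[year] = (filled[year - 1] + filled[year + 1]) / 2.0
    return filled

def normalized_ratio(observed, expected, base_year=1988):
    """Observed/expected ratio by year, rescaled so the base year equals 1.0."""
    base = observed[base_year] / expected[base_year]
    return {year: (observed[year] / expected[year]) / base for year in sorted(observed)}

# Hypothetical counts for one field group (not data from the study)
observed = {1988: 1000.0, 1999: 1100.0, 2000: 1120.0, 2001: 1150.0}
expected = {1988: 1000.0, 1999: 1300.0, 2001: 1400.0}  # no estimate for 2000

filled = fill_missing_year(expected)
ratios = normalized_ratio(observed, filled)

for year, ratio in ratios.items():
    print(year, round(ratio, 3))
```

A ratio below 1.0 in a later year means fewer publications were observed than the base-year relationship would predict, which the text interprets as higher resource use per publication.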





Footnotes

[45] Regression output is displayed in Exhibit J-1.

[46] Regression output is displayed in Exhibit J-2.

[47] Regression output is displayed in Exhibit J-3.

[48] Regression output is displayed in Exhibit J-4.

[49] Regression output is displayed in Exhibit J-5.

[50] Regression output is displayed in Exhibit J-6.

[51] Regression output is displayed in Exhibit J-7.

[52] Regression output is displayed in Exhibit J-8.

[53] Regression output is displayed in Exhibit J-9.

[54] Regression output is displayed in Exhibit J-10.

[55] Regression output is displayed in Exhibit J-11.

[56] Regression output is displayed in Exhibit J-12.


 
U.S. Academic Scientific Publishing
Working Paper | SRS 11-201 | November 2010