Citations Per Dollar as a Measure of Productivity
NIH grants reflect research investments that we hope will lead to
advancement of fundamental knowledge and/or application of that
knowledge to efforts to improve health and well-being. In February, we
published a blog on the publication impact of NIH-funded research.
We were gratified to hear your many thoughtful comments and questions.
Some of you suggested that we should focus not only on output (e.g.,
highly cited papers) but also on cost – or, as one of you put it,
“citations per dollar.” Indeed, my colleagues and I have previously
taken a preliminary look
at this question in the world of cardiovascular research. Today I’d
like to share our exploration of citations per dollar using a sample of
R01 grants across NIH’s research portfolio. What we found has an
interesting policy implication for maximizing NIH’s return on investment
in research.
To think about impact per dollar across the NIH research portfolio,
let’s look at a sample of 37,909 NIH R01 grants that were first funded
between 2000 and 2010. When thinking about citations per dollar, one
important consideration is whether we are largely looking at human or
non-human studies.
Table 1 shows some of the characteristics of these grants according
to whether or not they included human subjects. Continuous variables
are shown in the table as lower quartile / median / upper quartile. The total award
amount includes direct and indirect costs across all awarded years and
is shown in constant 2015 dollars (adjusted for inflation using the BRDPI). “Prior NIH funding” refers to the total NIH funding the PI received before this specific award.
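As a rough illustration of how a summary like Table 1 could be assembled, here is a minimal pandas sketch. The file name and columns (human_subjects, total_award_2015) are hypothetical placeholders, not the actual NIH dataset.

```python
import pandas as pd

# Hypothetical grant-level file; column names are illustrative placeholders.
grants = pd.read_csv("r01_grants_2000_2010.csv")

def quartile_summary(series: pd.Series) -> str:
    """Report a continuous variable as lower quartile / median / upper quartile."""
    q1, med, q3 = series.quantile([0.25, 0.50, 0.75])
    return f"{q1:,.0f} / {med:,.0f} / {q3:,.0f}"

# Total award (direct + indirect costs, constant 2015 dollars) by human-subjects status.
print(grants.groupby("human_subjects")["total_award_2015"].apply(quartile_summary))
```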
As might be expected, grants supporting research on human subjects
were more expensive and more likely to involve multiple PIs. Human
studies were also less likely to be renewed at least once.
Next, let’s look at publishing and citation outcomes for the same
group of grants, broken out by whether or not the study involved human
subjects. Similar to what I showed in my prior blog,
I show a “normalized citation impact,” a citation measure that
accounts for varying citation behavior across different scientific
disciplines, but now divided by total dollars spent. We’ll do this
using box and violin plots
to show the distribution of normalized citation impact per million
dollars according to whether or not the grant included human subjects.
The shaded area shows the distribution of NIH-supported papers
ranging from the most highly cited (100th percentile) to the least cited. Note
that the Y-axis is displayed on a logarithmic scale. This is an
important point – scientific productivity follows a highly skewed, heavy-tailed
log-normal distribution, not a simple normal distribution like human
height. The log-normal distribution of grant productivity is evident,
though with “tails” of grants that yielded minimal productivity. The
log-normal distribution also reflects that there is a small – but not
very small – number of grants with extraordinarily high productivity
(e.g. those that produced the equivalent of 10 or more highly cited
papers). We also see that, by this measure, grants that focus on human
studies – in aggregate – have less normalized citation impact per dollar
than other grants.
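For readers who want to see the mechanics, here is a minimal sketch of how a figure like this could be drawn with seaborn. The column names are again hypothetical placeholders, and this is an illustrative recipe rather than the code behind the actual figure.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

grants = pd.read_csv("r01_grants_2000_2010.csv")  # hypothetical columns as above

# Normalized citation impact per million dollars of total costs.
grants["impact_per_million"] = (
    grants["normalized_citation_impact"] / (grants["total_award_2015"] / 1e6)
)

# Violin (distribution) with an inner box plot, split by human-subjects status.
ax = sns.violinplot(data=grants, x="human_subjects", y="impact_per_million", inner="box")
ax.set_yscale("log")  # essential: the distribution is heavy-tailed
ax.set_xlabel("Human subjects research")
ax.set_ylabel("Normalized citation impact per $1M")
plt.show()
```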
Another approach to describing the association of citation impact
with budget is to produce a “production plot,” in which we examine how
changes in inputs (in this case dollars) are associated with changes in
output (in this case, citation impact). Figure 2 below shows such a
production plot in which both axes (total award on the X-axis and
citation impact on the Y-axis) are logarithmically scaled. This kind of
plot allows us to ask the question, “does a 10% increase in input
(here, total grant award funding) predict a 10% increase in output
(citations, normalized as described earlier)?” If there is a 1:1
relationship between the input and the output, and a 10% increase in
funding yields a 10% increase in citations, we’d expect a plot with a
slope of exactly 1. The trendlines/curves are based on loess smoothers,
with shaded areas representing 95% confidence intervals. We see that
the association between the logarithm of grant citation impact and the
logarithm of grant total costs is nearly linear. We also see that over
95% of the projects have lifetime total costs between $1 million and
$10 million, and in this range the association is linear with a slope of
less than 1 (the dotted reference line, by contrast, has a slope of
exactly 1). Not only is this pattern consistent
with prior literature, it illustrates an important point: research productivity follows, to some extent, a “power law,” meaning that productivity scales as a power of funding; because the exponent here is less than 1, doubling a grant’s funding predicts less than double the citation impact.
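As a back-of-the-envelope version of this analysis, the log-log slope (the “elasticity” of citation impact with respect to funding) could be estimated with an ordinary least squares fit, as sketched below. This substitutes a simple linear fit for the loess smoother in Figure 2, and the column names remain hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

grants = pd.read_csv("r01_grants_2000_2010.csv")  # hypothetical columns as above
grants = grants[
    (grants["normalized_citation_impact"] > 0) & (grants["total_award_2015"] > 0)
]

# Regress log(citation impact) on log(total award); the slope is the elasticity.
X = sm.add_constant(np.log(grants["total_award_2015"]))
y = np.log(grants["normalized_citation_impact"])
fit = sm.OLS(y, X).fit()

# A slope below 1 means a 10% increase in funding predicts a less-than-10%
# increase in citation impact, i.e. diminishing marginal returns.
print(f"Estimated log-log slope: {fit.params.iloc[1]:.2f}")
```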
There are important policy implications of the power law as it applies to research. In cases in which power laws apply, extreme observations are not as infrequent as one might think.
In other words, extreme events may be uncommon, but they are not rare.
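To make “uncommon but not rare” concrete, here is a small illustrative simulation comparing a heavy-tailed (log-normal) outcome distribution with a normal distribution matched on mean and spread; the parameters are purely made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Illustrative parameters only: a heavy-tailed log-normal versus a normal
# distribution matched on mean and standard deviation.
heavy_tailed = rng.lognormal(mean=0.0, sigma=1.0, size=n)
normal = rng.normal(loc=heavy_tailed.mean(), scale=heavy_tailed.std(), size=n)

# Call an outcome "extreme" if it exceeds ten times the median outcome.
threshold = 10 * np.median(heavy_tailed)
print("P(extreme outcome), heavy-tailed:", np.mean(heavy_tailed > threshold))
print("P(extreme outcome), normal:      ", np.mean(normal > threshold))
```

With these illustrative parameters, the heavy-tailed model produces such extreme outcomes roughly two orders of magnitude more often than the matched normal model.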
Extreme events in biomedical science certainly happen – from the
discoveries of evolution and the genetic code to the development of
vaccines that have saved millions (if not billions) of lives, to the findings of the transformative Women’s Health Initiative Trial, to the more recent developments in targeted treatments and immunotherapy for cancer.
Because extreme events happen more often than we might think, the best
way to maximize the chance of such extreme transformative discoveries
is, as some thought leaders
have argued, to do all we can to fund as many scientists as possible.
We cannot predict where or when the next great discovery will happen,
but we can predict that if we fund more scientists or more projects we
increase our ability to maximize the number of great discoveries as a
function of the dollars we invest.