Primary Questions

1. Does national R&D spending correlate with patent output?

Before any transformation is applied, an initial scatter plot of raw R&D spending (% of GDP) against total patent count shows the shape of the relationship.

R&D spending (% of GDP) vs. raw patent count across all country-year observations. The distribution is heavily right-skewed, with most countries clustered near the origin and a small number of high-investment nations producing disproportionately large patent volumes.

The Pearson correlation on raw values yields a moderate positive association:

\[r = 0.355, \quad p < 0.001\]

However, this understates the true strength of the relationship. The Spearman rank correlation — robust to outliers and distributional skew — is substantially higher:

\[\rho = 0.808, \quad p < 0.001\]

The large gap between the two statistics (Δ = 0.453) indicates that the relationship is monotonic but non-linear: as R&D spending increases, patent output increases at an accelerating rather than a constant rate. Patent counts are highly right-skewed, with most countries producing relatively few patents while a handful of high-investment nations produce orders of magnitude more.
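The gap between the two coefficients can be reproduced on synthetic data. The sketch below is illustrative only: the data are invented (a noise-free exponential curve), and the helper names `pearson`, `ranks` and `spearman` are hypothetical rather than taken from the study's pipeline. It uses only the standard library.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Synthetic data: monotonic but strongly convex (not the study's data)
rd = [0.1 * k for k in range(1, 41)]
patents = [math.exp(2.0 * v) for v in rd]

r_raw = pearson(rd, patents)   # well below 1: the curve is not a straight line
rho = spearman(rd, patents)    # 1 up to rounding: the ordering is perfectly preserved
print(f"Pearson r = {r_raw:.3f}, Spearman rho = {rho:.3f}")
```

A perfectly monotonic curve gives a Spearman coefficient of 1 while Pearson stays well below it, which is the same qualitative pattern as the gap reported above.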

Applying a log-transformation to patent counts (\(\log(\text{patents} + 1)\)) substantially improves the linear fit. The regression plots below compare the raw and log-transformed scales side by side.

Regression plots of R&D spending vs. patent output. Left: raw scale (Pearson r = 0.355). Right: log-transformed patent count (Pearson r = 0.791). The log scale linearises the relationship, which is consistent with an approximately exponential pattern.

\[r_{\log} = 0.791, \quad p < 0.001\]

A given increase in R&D spending (% of GDP) is associated with a multiplicative increase in patent output rather than an additive one.

OLS regression with log(patents) as the outcome and both rd_gdp and log(population) as predictors yielded the following coefficient estimates:

| Predictor | Coefficient |
|---|---|
| Intercept | −7.76 |
| R&D spending (% GDP) | +2.00 |
| log(Population) | +0.61 |

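Holding population constant, each additional percentage point of GDP directed toward R&D is associated with an increase of 2.00 in log patent output. If the logarithm is natural (as the ln notation used later in this report suggests), this corresponds to multiplying expected patent output by e² ≈ 7.4, not to a doubling. The residual plot below shows no obvious systematic bias in the fitted values.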
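To make the coefficient concrete, the fitted model can be back-transformed to the patent-count scale. The sketch below plugs in the coefficients from the table and assumes the logarithm is natural; the function name `predicted_patents` and the population figure are hypothetical, and the `+ 1` offset used in the earlier log transform is ignored for simplicity.

```python
import math

# Coefficients copied from the table above
INTERCEPT = -7.76
B_RD = 2.00        # per percentage point of GDP spent on R&D
B_LOG_POP = 0.61   # per unit of log(population)

def predicted_patents(rd_gdp, population):
    """Back-transform the log-scale prediction (natural log assumed)."""
    log_y = INTERCEPT + B_RD * rd_gdp + B_LOG_POP * math.log(population)
    return math.exp(log_y)

pop = 50_000_000  # hypothetical country size
ratio_rd = predicted_patents(2.0, pop) / predicted_patents(1.0, pop)
ratio_pop = predicted_patents(1.0, 2 * pop) / predicted_patents(1.0, pop)

print(f"+1 point of R&D/GDP multiplies predicted patents by {ratio_rd:.2f}")
print(f"doubling population multiplies predicted patents by {ratio_pop:.2f}")
```

Under the natural-log assumption the R&D ratio equals e^2.00, roughly 7.4, regardless of the population chosen, and doubling population scales the prediction by 2^0.61, roughly 1.5.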

Residuals vs. fitted values from the OLS model (outcome: log patent count; predictors: R&D % GDP and log population). Residuals are centred near zero (mean = −0.0000, SD = 1.30) with no obvious structure, indicating a reasonable model fit.

Summary: R&D spending is a strong positive predictor of patent output. The relationship is monotonic and non-linear — consistent with an exponential scaling pattern — and is highly statistically significant across all tested specifications.


2. Does patent output correlate with GDP growth?

The raw-scale and log-scale scatter plots are shown side by side below.

GDP growth (annual %) as a function of patent count. Left: raw patent count (Pearson r = −0.055, Spearman ρ = −0.248). Right: log-transformed patent count (Pearson r = −0.162, Spearman ρ = −0.248). The data appear to follow a skewed, non-linear downward trend in which the rank-order relationship is stronger than the linear association.

The near-zero Pearson correlation (r = −0.055) indicates almost no linear relationship, while the Spearman correlation (ρ = −0.248, p < 0.001) reveals a weak but persistent monotonic downward trend. Because it works on ranks, the Spearman coefficient captures a directional association that the Pearson coefficient misses on the heavily skewed raw scale.

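After log-transforming patent counts, the Pearson coefficient strengthens and moves toward the Spearman value, which is unchanged because ranks are invariant under a monotonic transformation: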

\[r_{\log} = -0.162, \quad \rho_{\log} = -0.248, \quad p < 0.001 \text{ (both)}\]
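The Spearman coefficient is identical on the raw and log scales because log(x + 1) preserves the ordering of the observations, whereas Pearson depends on the actual spacing of the values. A minimal sketch on invented numbers (the seven data points and the helper names are hypothetical, and the simple ranking assumes no ties):

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def simple_ranks(values):
    """Rank positions (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for pos, idx in enumerate(order, start=1):
        r[idx] = pos
    return r

# Invented, heavily skewed patent counts and growth rates
patents = [3, 12, 40, 150, 900, 5200, 48000]
growth = [6.1, 4.8, 5.5, 3.0, 3.9, 2.2, 1.7]
log_patents = [math.log1p(p) for p in patents]

rho_raw = pearson(simple_ranks(patents), simple_ranks(growth))
rho_log = pearson(simple_ranks(log_patents), simple_ranks(growth))
r_raw = pearson(patents, growth)
r_log = pearson(log_patents, growth)

print(f"Spearman: raw {rho_raw:.3f}, log {rho_log:.3f}")
print(f"Pearson:  raw {r_raw:.3f}, log {r_log:.3f}")
```

The two Spearman values coincide exactly, while the Pearson value changes once the skew is compressed.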

Countries with higher patent output do not systematically experience higher GDP growth. Two candidate explanations were considered, reverse causality and an averaging artefact across country groups, and both are tested in the lag analysis that follows.


3. Is there a lag effect?

Given that the economic payoff from research and innovation would realistically take years to materialise, a multi-year lag analysis was conducted for both directions of the relationship.

R&D spending → Patent output (lags 0–5 years):

| Lag (years) | Spearman ρ | p-value | n |
|---|---|---|---|
| 0 | 0.808 | < 0.001 | 1,801 |
| 1 | 0.806 | < 0.001 | 1,672 |
| 2 | 0.806 | < 0.001 | 1,556 |
| 3 | 0.806 | < 0.001 | 1,449 |
| 4 | 0.805 | < 0.001 | 1,347 |
| 5 | 0.804 | < 0.001 | 1,250 |

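The R&D → patent association is virtually unchanged across all lag lengths (ρ between 0.804 and 0.808), so no multi-year delay is detectable at annual resolution. A flat profile of this kind is also what year-to-year persistence in national R&D intensity would produce, so it does not by itself show that patent output responds immediately.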
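A lagged correlation of this kind pairs the predictor observed in year t − k with the outcome observed in year t, within the same country. The sketch below shows one way to build those pairs; the panel layout, the function name `lagged_pairs` and the toy values are assumptions for illustration, not the study's code.

```python
def lagged_pairs(panel, x_key, y_key, lag):
    """Pair x from year (t - lag) with y from year t, country by country.

    `panel` maps (country, year) -> dict of variables. Observations with no
    matching earlier year are dropped, so n shrinks as the lag grows.
    """
    xs, ys = [], []
    for (country, year), row in panel.items():
        earlier = panel.get((country, year - lag))
        if earlier is None:
            continue
        xs.append(earlier[x_key])
        ys.append(row[y_key])
    return xs, ys

# Toy panel with made-up values
panel = {
    ("AAA", 2000): {"rd_gdp": 1.0, "patents": 100},
    ("AAA", 2001): {"rd_gdp": 1.1, "patents": 120},
    ("AAA", 2002): {"rd_gdp": 1.3, "patents": 150},
    ("BBB", 2001): {"rd_gdp": 2.5, "patents": 900},
    ("BBB", 2002): {"rd_gdp": 2.6, "patents": 950},
}

x0, y0 = lagged_pairs(panel, "rd_gdp", "patents", lag=0)
x1, y1 = lagged_pairs(panel, "rd_gdp", "patents", lag=1)
x2, y2 = lagged_pairs(panel, "rd_gdp", "patents", lag=2)
print(len(x0), len(x1), len(x2))
```

The paired lists can then be passed to any rank-correlation routine. Looking up the earlier year by key, rather than shifting rows by position, avoids pairing observations across a gap in a country's series.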

Patent output → GDP growth (lags 1–12 years):

The Spearman correlation profile below shows how the patent → GDP growth association evolves across lag lengths.

Spearman ρ between lagged log(patent count) and GDP growth, for lags 1 through 12 years. The correlation remains negative throughout, weakening gradually from ρ = −0.247 at lag 1 to ρ ≈ −0.081 at lag 9 before levelling off. No positive reversal is observed at any lag length tested.

| Lag (years) | Spearman ρ | p-value |
|---|---|---|
| 1 | −0.247 | < 0.001 |
| 2 | −0.228 | < 0.001 |
| 3 | −0.211 | < 0.001 |
| 4 | −0.176 | < 0.001 |
| 5 | −0.159 | < 0.001 |
| 7 | −0.128 | < 0.001 |
| 8 | −0.105 | 0.001 |
| 9 | −0.081 | 0.016 |
| 10 | −0.092 | 0.009 |
| 11 | −0.091 | 0.015 |
| 12 | −0.093 | 0.018 |

Testing Hypothesis 1 (reverse causality): Lagging GDP growth by one year and correlating it with current patent output yielded ρ = −0.259 (p < 0.001), which is also negative. This gives no support to the idea that higher GDP growth drives higher patent activity, although a bivariate correlation cannot exclude reverse causality outright.

Testing Hypothesis 2 (income-group averaging): Countries were split by R&D intensity into “High R&D” (rd_gdp ≥ 1.0%) and “Low R&D” groups. Both sub-groups showed negative correlations (ρ = −0.130 and −0.163 respectively), so the negative pooled correlation is unlikely to be an artefact of averaging across dissimilar groups.

Summary: The most plausible interpretation is a short-term resource cost: patenting is expensive and R&D-intensive, generating a drag on measured GDP growth in the near term. Any long-term growth payoff, if it exists, appears too diffuse or conditional to emerge in a simple bivariate correlation at annual resolution.


Secondary Questions

4. Which countries convert R&D investment into patents most efficiently?

Efficiency metric:

\[E_1 = \frac{\text{patent\_count}}{\text{rd\_gdp} \times \text{inventor\_count}} \times 10^6\]

This formulation normalises patent output by both the scale of R&D investment and the size of the inventor workforce, providing a measure of research productivity per unit of combined resource input. Minimum thresholds were applied to avoid anomalous scores: at least 500 patents, 2,000 inventors, 1.0% R&D/GDP, and 5 million population — reducing the dataset to 324 observations across 20 countries.
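The metric and the threshold filter can be sketched as follows. The formula and cut-offs are those stated above; the row layout, function names and example values are hypothetical.

```python
def e1(patent_count, rd_gdp, inventor_count):
    """E1 = patent_count / (rd_gdp * inventor_count) * 1e6."""
    return patent_count / (rd_gdp * inventor_count) * 1e6

def passes_thresholds(row):
    """Minimum thresholds quoted in the text."""
    return (row["patent_count"] >= 500
            and row["inventor_count"] >= 2_000
            and row["rd_gdp"] >= 1.0
            and row["population"] >= 5_000_000)

# Made-up country-year observations
rows = [
    {"patent_count": 1_000, "inventor_count": 4_000,
     "rd_gdp": 2.0, "population": 60_000_000},
    {"patent_count": 300, "inventor_count": 2_500,   # too few patents
     "rd_gdp": 1.5, "population": 10_000_000},
]

scores = [e1(r["patent_count"], r["rd_gdp"], r["inventor_count"])
          for r in rows if passes_thresholds(r)]
print(scores)
```

Filtering before computing the ratio matters because very small denominators (few inventors or very low R&D intensity) would otherwise produce extreme scores.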

Top 10 countries by mean R&D → patent efficiency:

| Country | Mean E₁ | Total Patents | Mean R&D (% GDP) |
|---|---|---|---|
| Italy (ITA) | 600,573 | 68,816 | 1.24% |
| Spain (ESP) | 432,166 | 12,205 | 1.29% |
| China (CHN) | 406,881 | 244,159 | 1.96% |
| United Kingdom (GBR) | 402,119 | 156,642 | 1.95% |
| Australia (AUS) | 400,311 | 17,759 | 2.04% |
| Canada (CAN) | 399,072 | 158,514 | 1.84% |
| Netherlands (NLD) | 376,873 | 64,011 | 1.95% |
| Singapore (SGP) | 365,001 | 6,239 | 1.94% |
| France (FRA) | 303,343 | 148,026 | 2.17% |
| Belgium (BEL) | 276,552 | 19,515 | 2.85% |

5. Which countries convert patent output into GDP growth most efficiently?

Efficiency metric:

\[E_2 = \frac{\text{gdp\_growth}}{\text{patent\_count} / \text{population}} \times 10^6\]

Top 10 countries by mean patent → GDP growth efficiency:

| Country | Mean E₂ | Total Patents | Mean R&D (% GDP) |
|---|---|---|---|
| China (CHN) | 2.01 × 10¹² | 244,159 | 1.96% |
| Spain (ESP) | 7.45 × 10¹⁰ | 12,205 | 1.29% |
| South Korea (KOR) | 2.68 × 10¹⁰ | 319,187 | 3.51% |
| Australia (AUS) | 2.15 × 10¹⁰ | 17,759 | 2.04% |
| United Kingdom (GBR) | 1.65 × 10¹⁰ | 156,642 | 1.95% |
| France (FRA) | 1.60 × 10¹⁰ | 148,026 | 2.17% |
| Canada (CAN) | 1.20 × 10¹⁰ | 158,514 | 1.84% |
| Netherlands (NLD) | 1.16 × 10¹⁰ | 64,011 | 1.95% |
| Israel (ISR) | 1.15 × 10¹⁰ | 63,317 | 4.56% |
| Italy (ITA) | 1.11 × 10¹⁰ | 68,816 | 1.24% |

China’s dominant position in E₂ reflects its high GDP growth rates during the study period combined with its large population, which keeps the per-capita patent denominator low even as absolute patent volume grows.
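The population effect follows directly from the formula: with growth and patent count held fixed, E₂ is proportional to population. A small sketch with invented inputs (the function name `e2` and all values are hypothetical):

```python
def e2(gdp_growth, patent_count, population):
    """E2 = gdp_growth / (patent_count / population) * 1e6."""
    patents_per_capita = patent_count / population
    return gdp_growth / patents_per_capita * 1e6

# Same growth rate and patent count, different population sizes (made up)
large = e2(gdp_growth=8.0, patent_count=10_000, population=1_000_000_000)
small = e2(gdp_growth=8.0, patent_count=10_000, population=50_000_000)
ratio = large / small

print(f"large country: {large:.3g}, small country: {small:.3g}, ratio: {ratio:.1f}")
```

A country with 20 times the population scores 20 times higher on E₂ for the same growth and patent volume, so rankings on this metric partly reflect country size.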


6. Is innovation productivity saturating in high-income countries?

Data workaround applied here. Because GDP per capita was not extracted in the original pipeline, countries could not be classified by income level from the data alone. Income group labels were sourced externally from World Bank and IMF reference lists and applied as a static lookup. This classification has not been independently verified within the scope of this study.

The chart below tracks mean E₁ (R&D → patents) and E₂ (patents → GDP growth) over time for High Income and Emerging economy groups.

Mean R&D → patent efficiency (E₁, left) and patent → GDP growth efficiency (E₂, right) by year and income group (2000–2022). Both groups show declining E₁ over time. E₂ declines significantly for Emerging economies but shows no statistically significant trend for High Income countries.

Linear regression of mean E₁ over time by group:

| Group | Slope (E₁ per year) | Fit statistic (presumably R²) | p-value |
|---|---|---|---|
| High Income | −4,374 | 0.907 | < 0.001 |
| Emerging | −20,644 | 0.777 | < 0.001 |

Both groups show a statistically significant decline in R&D → patent efficiency over time, consistent with a diminishing returns hypothesis. The steeper decline in emerging economies may reflect rapid scaling of R&D programmes outpacing the growth of productive inventor capacity.
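A per-group slope of this kind comes from an ordinary least-squares line through the yearly means. The sketch below uses invented, exactly linear values; `linear_trend` is a hypothetical helper, and the significance test behind the reported p-values is not reproduced here.

```python
def linear_trend(years, values):
    """Least-squares slope and intercept of values regressed on years."""
    n = len(years)
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept

# Made-up yearly means of E1, falling by 4,000 per year
years = list(range(2000, 2006))
mean_e1 = [500_000 - 4_000 * (t - 2000) for t in years]

slope, intercept = linear_trend(years, mean_e1)
print(f"slope = {slope:.1f} E1 units per year")
```

The slope is expressed in E₁ units per calendar year, so the two groups' slopes are only directly comparable if their E₁ levels are on a similar scale.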

The scatter below shows where each group sits in R&D spend vs. patent output space, revealing the structural differences in innovation volume between the two groups.

R&D spending (% of GDP) vs. patent count, coloured by income group. High Income countries span a wider range of R&D intensities and consistently produce higher absolute patent volumes. Emerging economies are concentrated at lower R&D spending levels with greater variance in patent output.

7. Are emerging economies more innovation-efficient per dollar spent?

To normalise E₁ by R&D spending intensity, a per-unit efficiency metric was computed:

\[E_{1,\text{norm}} = \frac{E_1}{\text{rd\_gdp}}\]

Mean normalised efficiency by group:

| Group | Mean E₁ / R&D unit |
|---|---|
| High Income | 155,885 |
| Emerging | 226,521 |

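On average, the normalised efficiency of emerging economies is approximately 45% higher than that of high-income countries (226,521 vs. 155,885). Because E₁ already divides by rd_gdp, the normalised metric scales with the inverse square of R&D intensity, and it is expressed per percentage point of GDP rather than per dollar. The box plots below compare distributions of both the normalised and raw efficiency metrics across groups.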
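The 45% figure and the structure of the normalised metric can be checked with a few lines. The group means are the ones in the table; the function name `e1_norm` and the example inputs are hypothetical.

```python
def e1_norm(patent_count, rd_gdp, inventor_count):
    """E1 / rd_gdp, i.e. patents / (rd_gdp**2 * inventors) * 1e6."""
    e1 = patent_count / (rd_gdp * inventor_count) * 1e6
    return e1 / rd_gdp

# Group means from the table above
high_income = 155_885
emerging = 226_521
relative_gap = emerging / high_income - 1

print(f"emerging vs. high income: {relative_gap:+.1%}")
print(e1_norm(patent_count=1_000, rd_gdp=2.0, inventor_count=4_000))
```

Since rd_gdp enters the denominator twice, a country with half the R&D intensity and the same patents and inventors scores four times higher, which is worth keeping in mind when comparing groups with different spending levels.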

Left: Distribution of E₁ normalised by R&D spending (E₁ / rd_gdp) for High Income vs. Emerging economies. Emerging economies show a higher median and wider spread. Right: Raw E₁ for reference. The higher per-unit efficiency of emerging economies is consistent across the distribution, not driven by outliers alone.

This is consistent with theoretical expectations: high-income economies direct R&D toward complex, incremental frontier innovation where the marginal cost per patent is higher, while emerging economies may capture more available patent space through technology adoption, adaptation, and applied engineering at lower marginal cost per unit of investment.


8. Does patent volume plateau beyond certain R&D spending thresholds?

R&D spending was binned into deciles and mean patent output was calculated within each bin, separately for High Income and Emerging countries, to test for a saturation effect.

Mean patent count by R&D spending decile (bin midpoint), for High Income and Emerging economy groups. Both curves follow a broadly concave trajectory — rising steeply at lower R&D spending levels and flattening at higher spending — consistent with diminishing marginal returns to R&D investment.
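Decile binning of this kind can be sketched as below. The data are synthetic (a noise-free logarithmic curve), `decile_means` is a hypothetical helper, and it reports the mean R&D value in each bin rather than the bin midpoint used in the figure.

```python
import math

def decile_means(x, y):
    """Sort by x, split into 10 near-equal-sized bins, average x and y per bin."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    out = []
    for d in range(10):
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        if not chunk:          # fewer than 10 observations
            continue
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        out.append((mean_x, mean_y))
    return out

# Synthetic concave relationship (not the study's data)
rd = [0.05 * k for k in range(1, 101)]
patents = [1000 * math.log(v) for v in rd]

bins = decile_means(rd, patents)
gains = [b[1] - a[1] for a, b in zip(bins, bins[1:])]
print([round(g) for g in gains])
```

On a concave curve the gain from one decile to the next shrinks steadily, which is the signature the decile plot is used to look for.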

A logarithmic curve was then fitted directly to High Income country observations to quantify the saturation shape:

\[\hat{y} = a \cdot \ln(x) + b\]

Logarithmic curve fit (crimson line) to High Income country observations of R&D spending vs. patent count. The log model captures the concave shape of the relationship: large patent gains at lower R&D spending levels, with returns flattening progressively as spending rises toward and beyond ~3% of GDP.

The log curve fit is consistent with diminishing returns: early increases in R&D spending yield proportionally large gains in patent output, but the marginal return falls as spending rises. The effect is clearest in High Income countries, which span a wider R&D spending range where the flattening of the curve is directly observable.
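Because the model is linear in a and b once x is replaced by ln(x), it can be fitted by ordinary least squares. The sketch below assumes that approach (the source does not state which fitting routine was used) and runs on synthetic, noise-free data; `fit_log_curve` is a hypothetical helper.

```python
import math

def fit_log_curve(x, y):
    """Least-squares fit of y = a * ln(x) + b; requires all x > 0."""
    lx = [math.log(v) for v in x]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_y = sum(y) / n
    sxy = sum((u - mean_lx) * (v - mean_y) for u, v in zip(lx, y))
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    a = sxy / sxx
    b = mean_y - a * mean_lx
    return a, b

# Synthetic observations generated from a = 2000, b = 5000
rd = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
patents = [2000 * math.log(v) + 5000 for v in rd]

a, b = fit_log_curve(rd, patents)
marginal_at_1 = a / 1.0   # dy/dx = a / x at 1% of GDP
marginal_at_3 = a / 3.0   # one third of that at 3% of GDP
print(f"a = {a:.1f}, b = {b:.1f}")
```

The marginal return a / x shrinks as spending rises but never reaches zero, so this functional form describes diminishing returns rather than a hard ceiling on patent output.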


Concluding Remarks

Across the eight analyses, the most robust finding is the strong positive association between R&D spending and patent output (Spearman ρ ≈ 0.81), which is stable across lag windows, highly statistically significant, and consistent with a log-linear functional form. This relationship holds across income groups and is not an artefact of country size or population.

The patent → GDP growth relationship is weaker, consistently negative in bivariate tests, and does not resolve to a positive signal at any lag length tested up to twelve years. This most likely reflects the limitations of a bivariate framework that cannot account for institutional quality, human capital, sector composition, and other confounders that would be necessary for a credible causal claim.

The secondary findings — declining innovation efficiency over time, higher per-dollar efficiency in emerging economies, and logarithmic saturation in the R&D → patent curve — are coherent and mutually reinforcing. They suggest that the global innovation system is characterised by diminishing marginal returns, and that the productivity advantage of emerging economies, while real, is narrowing.