Capability Gap Radar — Methodology Note

How the winners’-recipe target and the achievement fraction are computed for the Atlas radar panel

Author

Net Zero Industrial Policy Lab · Johns Hopkins SAIS

Published

July 8, 2026

One method, end to end. This note documents the corrected capability-gap method as used across the ML-audit suite (ml_replication.qmd, target_profile.qmd, capability_gap.qmd): the need axis is the winners’-recipe target share (the competitive-class signed SHAP), and the reached axis is the achievement fraction (what the country has ÷ what the technology needs). The earlier count-driven mean|z| need and trade-RCA “has” axis are not used here. The live Atlas radar panel is being migrated to these quantities; the companion capability_gap.qmd already renders them for the six focal countries.

1 What the Radar Shows

The Capability Gap Radar reads a country × technology as two surfaces over five product categories — Chemicals · Electronics · Industrial Materials · Machinery · Metals:

Surface	Meaning	Source
Frontier (dashed ring)	the winners’ recipe — how much a competitive producer of this technology leans on each cluster (`target_weight`, §2). Fixed per technology, identical for every country.	`shap_gap_by_country_cluster.csv`
Achievement (green fill)	how far this country has come toward that frontier in each cluster — the achievement fraction \(a_c\) (§3), 0 at the centre, 1 at the frontier.	`shap_gap_by_country_cluster.csv`

The frontier is the demand surface (what the technology needs); the green fill is the supply surface (what the country holds relative to that need). Where green reaches the frontier on a cluster the technology genuinely needs, the country has a strength to leverage; where it collapses toward the centre on such a cluster, that is a capability gap. The magnitude of each cluster’s need — whether it is core, supporting or peripheral for this technology — sets which gaps matter, and is what the classification grid (§4) crosses against achievement.

3 What the Country Has — the Achievement Fraction

3.1 From the country’s own signed SHAP to a fraction of the frontier

Where the need is the winners’ recipe, achievement asks the country-specific counterpart: how much of each cluster’s winning contribution has this country actually built? For country \(i\), cluster \(c\), the achievement fraction divides the country’s own recent (3-year) competitive-class signed SHAP in the cluster by the winners’-recipe target weight for that cluster:

\[ a_{i,c}^{\text{tech}} = \frac{\max\!\big(\phi^{\,i}_{c},\,0\big)}{t\text{-weight}_c^{\text{tech}}} \]

so \(a_{i,c} = 1\) means the country sits exactly at the winners’ frontier for that cluster, \(a_{i,c} > 1\) is a surplus, and \(a_{i,c} \to 0\) means little of the required capability is in place. This is the green fill of the radar. It carries all the country-specific signal — the same technology’s frontier is identical for everyone, but achievement is unique to each country. It is the achieved column of shap_gap_by_country_cluster.csv (country_signed ÷ target_weight).

Numerator vs denominator. The achievement fraction is literally what the country has ÷ what the technology needs. The denominator is the §2 need (winners’ recipe); the numerator is the country’s own competitive-class contribution in the same cluster, in the same SHAP units — which is what lets §4 cross need against achievement without a unit mismatch.
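The fraction can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's code: the function name `achievement_fraction` is hypothetical, the input values are made up, and the guard against a non-positive target weight is an assumption (the grid only classifies clusters with a positive need).

```python
def achievement_fraction(country_signed: float, target_weight: float) -> float:
    """Return max(country_signed, 0) / target_weight.

    country_signed: the country's recent competitive-class signed SHAP in a cluster.
    target_weight:  the winners'-recipe weight for the same cluster, in the same SHAP units.
    """
    if target_weight <= 0:
        # Assumption: a non-positive need leaves no frontier to measure against.
        raise ValueError("target_weight must be positive")
    return max(country_signed, 0.0) / target_weight


# Illustrative values only (hypothetical, not read from the pipeline output).
at_half = achievement_fraction(0.04, 0.08)    # halfway to the frontier
surplus = achievement_fraction(0.06, 0.02)    # beyond the frontier (a > 1)
clipped = achievement_fraction(-0.01, 0.08)   # negative contribution clips to 0
print(at_half, surplus, clipped)
```

The clip in the numerator means a negative country contribution reads as "nothing built yet" rather than as a negative achievement.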

3.2 Achievement for India · Solar

India · Solar — achievement fraction by cluster (what India has ÷ what Solar needs).
Cluster	Need share %	Country signed SHAP	Achievement fraction	At frontier?
Chemicals	23.3	0.0647	0.81	No
Electronics	61.6	0.0357	0.17	No
Industrial Materials	0.5	0.0001	0.04	No
Machinery	6.1	0.0544	2.59	Yes
Metals	8.4	0.0290	1.01	Yes

India is at or beyond the frontier in Machinery and Metals but only ≈17% of the way there in Electronics — the cluster Solar most needs (§2). That mismatch is exactly what the classification grid turns into a priority ordering.
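As a consistency check on the table, the achievement fraction can be inverted to recover the target weight each row implies, and those weights can be turned back into need shares. The sketch below assumes that the need share is each cluster's target weight as a percentage of the five-cluster total; that reading is an assumption here, and because the table values are rounded the match can only be approximate.

```python
# India · Solar rows from the table above:
# (need share %, country signed SHAP, achievement fraction).
rows = {
    "Chemicals":            (23.3, 0.0647, 0.81),
    "Electronics":          (61.6, 0.0357, 0.17),
    "Industrial Materials": (0.5,  0.0001, 0.04),
    "Machinery":            (6.1,  0.0544, 2.59),
    "Metals":               (8.4,  0.0290, 1.01),
}

# Invert a = signed / target_weight to recover the implied target weight per cluster.
implied_weight = {c: signed / a for c, (_, signed, a) in rows.items()}

# Assumption: the need share is each weight as a percentage of the five-cluster total.
total = sum(implied_weight.values())
implied_share = {c: 100.0 * w / total for c, w in implied_weight.items()}

for cluster, (published, _, _) in rows.items():
    print(f"{cluster:<21} published {published:5.1f}%  implied {implied_share[cluster]:5.1f}%")
```

With the rounded table values, every implied share lands within half a percentage point of the published one, which is what one would expect if the two columns share the same denominator.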

4 Classification Rules: The WP-4 3×3 Grid

The grid crosses two axes, each cut at two thresholds: how much the technology needs a cluster (§2) against how much of that need the country has met (§3). Because both axes derive from the same SHAP units, the crossing can be read straight off the gap radar of the companion capability_gap.qmd rather than from a separate table.

Need is the winners’-recipe target share \(t_c\) (§2): the fraction of the technology’s competitive-class signed SHAP that falls in cluster \(c\) — magnitude-preserving and direction-checked. Measured this way, the core cluster is the one competitive producers genuinely rely on (e.g. Solar’s is Electronics, not the cluster with the most HS codes).
Reached is the achievement fraction \(a_c\) (§3) — the country’s own recent signed SHAP in the cluster ÷ that target weight, i.e. what the country has ÷ what the technology needs. \(a_c = 1\) means the country sits exactly at the winners’ frontier for that cluster; \(a_c > 1\) is a surplus.

On the two axes. Because \(a_c\) carries the need in its denominator and the need is also the row axis, the grid is a clean reparameterisation of (need × has): there is no degeneracy, but both axes now derive from the model. The circularity concern this could raise is answered independently in the Empirical Validation section (§7): the SHAP signal is validated against trade the model never saw, and the target is a lagged, separate final-product RCA, so a country’s need profile is not a restatement of its current exports.

4.1 Threshold definitions

Need level \(t_c\) — what the technology requires (winners’-recipe share of the cluster):

Level	Threshold	Meaning
Core	\(t_c \geq 30\%\)	A cluster competitive producers genuinely lean on — every tech has one or two, running 35–76%
Supporting	\(10\% \leq t_c < 30\%\)	A real but secondary requirement
Peripheral	\(t_c < 10\%\)	The technology barely draws on this cluster

Reached level \(a_c\) — what the country holds, relative to that need:

Level	Threshold	Meaning
At frontier	\(a_c \geq 1\)	Country already holds the winners’ level of this capability (\(\gg 1\) = surplus)
Partial	\(0.5 \leq a_c < 1\)	Meaningfully building, still below the frontier
Behind	\(a_c < 0.5\)	Little or none of the required capability yet

4.2 The 3×3 classification grid

	At frontier \(a_c \geq 1\)	Partial \(0.5 \leq a_c < 1\)	Behind \(a_c < 0.5\)
Core need \(t_c \geq 30\%\)	🟢 leverage	🟠 build_up	🔴 critical_gap
Supporting \(10\%\text{–}30\%\)	🟡 mature_strength	🟠 build_up	🟥 gap
Peripheral \(t_c < 10\%\)	🔵 bonus_strength	⚪ not_priority	⚪ not_priority
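The thresholds and the grid translate directly into a lookup. The sketch below is one way to encode them, assuming the need share is passed in percent and the achievement fraction as a plain ratio; the function and constant names are hypothetical, not identifiers from the pipeline.

```python
def need_level(need_share_pct: float) -> str:
    """Bucket the winners'-recipe share t_c (in percent) per §4.1."""
    if need_share_pct >= 30.0:
        return "core"
    if need_share_pct >= 10.0:
        return "supporting"
    return "peripheral"


def reached_level(achieved: float) -> str:
    """Bucket the achievement fraction a_c per §4.1."""
    if achieved >= 1.0:
        return "at_frontier"
    if achieved >= 0.5:
        return "partial"
    return "behind"


# The 3×3 grid of §4.2, keyed by (need level, reached level).
GRID = {
    ("core", "at_frontier"): "leverage",
    ("core", "partial"): "build_up",
    ("core", "behind"): "critical_gap",
    ("supporting", "at_frontier"): "mature_strength",
    ("supporting", "partial"): "build_up",
    ("supporting", "behind"): "gap",
    ("peripheral", "at_frontier"): "bonus_strength",
    ("peripheral", "partial"): "not_priority",
    ("peripheral", "behind"): "not_priority",
}


def classify(need_share_pct: float, achieved: float) -> str:
    return GRID[(need_level(need_share_pct), reached_level(achieved))]


print(classify(61.6, 0.17))  # India · Solar, Electronics (values from §3.2)
```

Applied to the India · Solar values in §3.2, Electronics (61.6%, 0.17) falls in the core and behind buckets and so maps to critical_gap.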

4.3 Policy reading per cell

Classification	Policy orientation
leverage	The country already sits at the winners’ frontier on a cluster this technology genuinely leans on. Anchor FDI here; use as the supply-chain nucleus. A large surplus (\(a_c \gg 1\)) is spare capacity to redeploy.
mature_strength	At-frontier capability in a supporting cluster. Useful but not alone decisive — maintain and deepen.
bonus_strength	At-frontier (often surplus) capability in a cluster this technology barely needs. Likely more decisive for a different technology — don’t anchor this tech’s strategy on it.
build_up	Partway to the frontier on a cluster that matters (core or supporting). Closing the remaining distance has high expected returns; expect an import surge in the relevant capital goods before the export signal follows.
critical_gap	The technology’s top driver is missing — behind the frontier on a core cluster. The hardest problem: upstream sector-building, not just incentives. “Core” here means the cluster winners actually rely on (by magnitude), not the one with the most product codes.
gap	A supporting driver is missing. Important to address but not a show-stopper on its own.
not_priority	Neither the technology nor the country prioritises this cluster. Low salience for strategy.

Why there is no “avoid” cell. Because the need is direction-checked (see §4 of target_profile), every retained cluster is one to build toward — the model finds no anti-targets among them. A genuine “don’t specialise here” signal would require a negative competitive-class contribution, which for these 11 technologies never clears the selection bar; when one exists it is reported separately, not classified in this grid. So the grid’s ambition is deliberately bounded: it ranks what to build and in what order, not what to shun.

“It would be applying a false level of precision to the model and misinterpret what it is telling us. We can have confidence in the simpler, but more elegant finding that the knowledge and capabilities required to succeed in industrial development differs across technology areas.” — WP-4, p.10

The framework operates at the cluster level by design. Individual HS6 codes that surface as top SHAP features are signals of what kinds of capabilities matter, not a prescription to build specific products.

5 Worked example: India · Solar

Read directly from the gap profile (analysis/ml/shap_gap_by_country_cluster.csv): need is Solar’s winners’-recipe target share per cluster, reached is India’s recent (3-year) achievement fraction, and the rules above are applied.

India · Solar — classification: winners’-recipe need × achievement fraction.
Cluster	Need %	Need	Achieved	Reached	Classification
Electronics	61.6	Core	17%	Behind	critical_gap
Chemicals	23.3	Supporting	81%	Partial	build_up
Metals	8.4	Peripheral	101%	At frontier	bonus_strength
Machinery	6.1	Peripheral	259%	At frontier	bonus_strength
Industrial Materials	0.5	Peripheral	4%	Behind	not_priority

Reading the result. Solar’s core need is Electronics (≈62% of the winners’ recipe), and India, strong in machinery and metals but thin in solar electronics, is far behind there (\(a \approx 0.17\)): Electronics is the critical_gap. Chemicals is a supporting need that India has largely met (build_up, 81% reached), while India’s Machinery and Metals strength reads as bonus_strength: real capability, but in clusters this technology barely needs. The lesson is that magnitude and sign matter: a cluster’s importance is how much competitive producers rely on it, not how many HS codes it spans, so the “hardest problem” of Indian solar policy is electronics sector-building, not chemicals. capability_gap.qmd carries this same reading across all six focal countries.

6 How the Gap Drives Policy Reasoning

The diagnostic power comes from reading both signals together — need and reached — because neither alone is sufficient:

Reached alone (what the country holds) tells you a country’s strengths, but not whether they matter for the technology in question. India sits at or beyond the frontier in Solar’s Machinery and Metals clusters — but for Solar those are peripheral needs, so “India has machinery” is a far weaker anchor than it first looks.
Need alone (what the technology requires) tells you the winners’ recipe, but not whether a given country can deliver it. Solar’s core need is Electronics — but a country already at the electronics frontier faces a completely different task from one, like India, that is barely a fifth of the way there.

Reading the gap on a single radar axis (one cluster at a time):

Core need + Behind       →  critical_gap   The country lacks what matters most.
                                           Policy: sector-building, upstream strategy —
                                           not fixable by incentives alone.

Core need + At frontier  →  leverage       The country has what matters most.
                                           Policy: anchor FDI, attract supply-chain firms,
                                           use as a nucleus for downstream development.

Peripheral + At frontier →  bonus_strength The country is strong in a cluster this
                                           technology barely needs. Policy: pivot — that
                                           capability may be decisive for a DIFFERENT tech
                                           (India's metals strength matters more for
                                           Magnets/Wind than for Solar).

Peripheral + Behind      →  not_priority   Neither the tech nor the country prioritises
                                           this. Ignore for this country×tech pair.

Reading the whole radar. The dashed frontier pentagon is the demand surface — the winners’ recipe, fixed per technology and identical for every country. The green fill is the supply surface — that country’s achievement, unique to it. Where the green reaches the frontier on a core axis, the country has what the technology most needs (leverage); where it collapses toward the centre on a core axis, that is the critical gap.

A country whose green fill lies mostly inside the frontier faces broad capability gaps and needs long-run industrial policy. A country that reaches or overshoots the frontier on several axes but falls short on the single core one faces a more concentrated, addressable problem — India · Solar is exactly this: full on Machinery and Metals, nearly empty on the core Electronics axis.

7 Empirical Validation: SHAP Predicts Trade Intensity

A natural question is whether the model’s SHAP importance scores are circular — do they simply reflect existing trade patterns? The scatterplot report (qmd/scatterplot_report.qmd) tests this directly using OLS: log10(trade % GDP) ~ SHAP mean |z| across products for 2023.

Key findings:

Finding	Result	Interpretation
E1 — Overall	SHAP slope on exports is statistically significant for most technologies	Products the RF values are also traded more intensively in practice — model is not circular
E2 — By function	Product Components show the strongest positive slope (***)	ML-trade alignment is tightest at the component level of the supply chain
E2 — Final Products	Negative slope (significant)	Final product HS codes all get uniformly high SHAP; actual exports are concentrated in 1–2 dominant exporters (China for PV, Denmark for Wind), so among other top exporters the correlation inverts
I1 — Imports	SHAP predicts import intensity more strongly than exports	High-SHAP products are often capital goods — countries import them to build production capacity before developing export competitiveness
I2 — Process Equipment	Strongest import slope	Confirms that machinery for manufacturing is acquired first, exported second
E5 — Time	Gap between high-SHAP and low-SHAP export intensity widening 2018–2024	The model’s importance scores map onto a real and growing divergence in trade patterns

Why this matters for the radar. The scatterplot findings confirm that the SHAP signal behind the need is not a tautology of trade data — it is an independent signal from the Random Forest that happens to correlate with, and in some dimensions precede, actual trade specialisation. A country with a critical_gap on a core cluster is not just defined as lacking by the model: countries that closed such gaps historically went on to trade more intensively in those exact products.

The asymmetry between imports and exports is particularly important for policy: build_up countries should expect to first see an import surge in high-SHAP capital goods (Process Equipment) as they invest in production capacity, before the corresponding export signal materialises. A rising import slope in the right product categories is therefore an early positive indicator, not a deficit signal.

8 What the Radar Does and Does Not Capture

8.1 Specificity: what varies by tech, by country, and by both

The two surfaces have deliberately different scopes:

Surface	Varies by	Source	Country-specific?
Need — winners’-recipe target (\(t_c\))	Tech only	`shap_gap_by_country_cluster.csv` (`target_weight`)	✗ (the frontier is shared)
Achievement fraction (\(a_c\))	Country × Tech	`shap_gap_by_country_cluster.csv` (`achieved`)	✓

The need is a shared demand surface, by construction the same for every country, since a “standard profile countries evolve toward” must be common. All the country-specific signal lives in the achievement fraction, whose numerator is the country’s own competitive-class signed SHAP. So unlike a global average, the green fill already reflects India’s actual position on the model’s decision trees for Solar, and no separate country-specific SHAP export is required. What the radar deliberately does not resolve is the year-by-year path: it reads a recent 3-year window, not a full trajectory (that is the timeline panel’s job).

8.2 Category definitions: HS chapters

The SHAP categories (the five radar clusters, plus Other, which the radar does not plot) are a property of the HS code, not of the technology. They follow standard HS chapter groupings and were verified to be perfectly consistent across techs: zero cases of the same HS code assigned to different categories for different technologies.

SHAP Category	HS Chapters	Examples
Chemicals	28–39	Inorganic/organic chemicals, plastics, polymers, paints, photographic materials
Electronics	85	Electrical machinery, semiconductors, instruments
Machinery	84, 86–96	Mechanical appliances, optical/measuring instruments, vehicles
Metals	26, 71–83	Ores, iron/steel, copper, aluminium, precious metals
Industrial Materials	25, 27, 44, 68–70	Minerals, petroleum, wood, ceramics, glass
Other	1–24	Agricultural commodities, food (mainly biofuel feedstocks)

Two columns (shap_category, shap_category_source) have been added to data/green_dict/green_dictionary.csv, the master HS code reference file, for all 605 entries. 373 rows carry a pc_features source (directly confirmed by the ML model’s above-threshold SHAP values); the remaining 232 are chapter_inferred.
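The chapter-inferred assignment can be sketched as a range lookup on the first two digits of the HS code. The ranges are copied from the table above; the function name `category_for_hs` is hypothetical, HS codes are assumed to be zero-padded strings, and the pc_features path (model-confirmed assignments) is not modelled.

```python
from typing import Optional

# Chapter ranges copied from the table above; each (first, last) pair is inclusive.
CHAPTER_RANGES = {
    "Chemicals": [(28, 39)],
    "Electronics": [(85, 85)],
    "Machinery": [(84, 84), (86, 96)],
    "Metals": [(26, 26), (71, 83)],
    "Industrial Materials": [(25, 25), (27, 27), (44, 44), (68, 70)],
    "Other": [(1, 24)],
}


def category_for_hs(hs_code: str) -> Optional[str]:
    """Map a zero-padded HS code (2 or more digits) to its chapter-inferred category."""
    chapter = int(hs_code[:2])
    for category, ranges in CHAPTER_RANGES.items():
        if any(lo <= chapter <= hi for lo, hi in ranges):
            return category
    return None  # chapter not listed in the table (for example 40 to 43)


print(category_for_hs("854140"), category_for_hs("250510"), category_for_hs("400110"))
```

Returning `None` for unlisted chapters keeps the sketch faithful to the table rather than guessing a category for chapters it does not cover.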

8.3 Magnitude, not product count

A cluster’s need is set by how much competitive producers rely on it, not by how many HS codes sit in it. Because the target share (§2) sums the competitive-class signed SHAP rather than counting features, a category resting on a single highly diagnostic input — Industrial Materials is often just one raw material, e.g. quartz sand for Solar — is neither inflated nor dismissed, and a category spanning many minor codes is not flattered. This is the substantive difference from a count-based reading, and it is what moves cores like Solar’s from a many-coded cluster to Electronics (§5).

9 Critical Limitation: Macro Features Excluded from the Radar

The model uses two types of features: product-level RCA features and macro indicators. For Solar, ranked by shap_mean_z, the split is as follows.

The six macro indicators — FDI inflows, trade openness, tariff measures, industry GDP share, total GDP, and manufacturing share — collectively account for 16% of total SHAP weight for Solar, yet they are excluded from the radar because they do not map to a product category.

The winners’-recipe target (§2) is therefore built over the HS-based clusters only, and the five target shares sum to 100% of the HS-product portion of the winners’ recipe — not of total model importance. If the macro block were shown as a sixth axis, the five product axes would together shrink by roughly 16%.
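The rescaling can be worked through with Solar's need shares from the India · Solar tables. This is an illustrative sketch only: it treats the 16% macro share stated above as directly comparable to the signed-SHAP need shares, which is an assumption, and the sixth axis is hypothetical since the radar does not draw it.

```python
# Solar need shares over the five HS-based clusters (from the India · Solar tables), in %.
product_share = {
    "Electronics": 61.6,
    "Chemicals": 23.3,
    "Metals": 8.4,
    "Machinery": 6.1,
    "Industrial Materials": 0.5,
}
MACRO_SHARE = 0.16  # macro block's share of total SHAP weight for Solar, as stated above

# Rescale each product share so that products plus the macro block cover the total.
share_of_total = {c: s * (1.0 - MACRO_SHARE) for c, s in product_share.items()}
share_of_total["Macro (hypothetical sixth axis)"] = 100.0 * MACRO_SHARE

for axis, pct in share_of_total.items():
    print(f"{axis:<32} {pct:5.1f}%")
```

Under that assumption Electronics would fall from 61.6% of the product portion to about 52% of total importance, and the ordering of the five product clusters is unchanged.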

Warning

Implication for interpretation. The radar shows capability gaps in tradeable product categories. It does not capture the role of FDI attraction, trade openness, or manufacturing depth, which the model also weights heavily (16% of total SHAP weight for Solar). A country could close all five product gaps and still score poorly if it lacks the macroeconomic conditions the model relies on. The radar is a necessary but not sufficient diagnostic.

10 Relationship to the ML Pipeline

For fuller context see:

ML Pipeline Explainer — end-to-end walkthrough of the Random Forest, SHAP computation, and PC scores for a non-technical audience.
ML Pipeline Review — critical review of SHAP feature importance, RCA data coverage, and the SHAP ↔︎ trade correlation.

10.1 Data lineage

NZIPL Python ML pipeline
  features: product-level RCA + 6 macro indicators
  target:   RCA > 1 per product (binary)
  train RF → SHAP per (country, year, feature)         ← signed contributions
        └─► raw_signed/                                (regen_shap.py)

target_profile_prep.py
  reconstruct RCA>1 "competitive" class per country-year (CICE trade)
  ├─ winners' recipe: mean signed SHAP over RCA>1 → target_weight  (need, §2)
  └─ country's recent signed SHAP ÷ target_weight  → achieved      (achievement, §3)
        └─► analysis/ml/shap_gap_by_country_cluster.csv
              ← the frontier (need) + green fill (achievement) of the radar
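The two branches of target_profile_prep.py in the diagram can be sketched on toy data. Everything below is hypothetical: the records, the country labels, and the function name are invented for illustration, the year dimension and the 3-year window are ignored, and the real script's internals are not shown in this note.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical signed-SHAP records: (country, cluster, signed_shap, rca_gt_1).
records = [
    ("A", "Electronics", 0.30, True),
    ("B", "Electronics", 0.10, True),
    ("C", "Electronics", 0.05, False),
    ("A", "Metals", 0.02, True),
    ("B", "Metals", 0.04, True),
    ("C", "Metals", 0.06, False),
]

# Winners' recipe: mean signed SHAP per cluster over competitive (RCA > 1) rows only.
winners = defaultdict(list)
for _, cluster, shap, competitive in records:
    if competitive:
        winners[cluster].append(shap)
target_weight = {c: mean(v) for c, v in winners.items()}


def achieved(country: str) -> dict:
    """Country's mean signed SHAP per cluster, clipped at 0, divided by target_weight."""
    own = defaultdict(list)
    for ctry, cluster, shap, _ in records:
        if ctry == country:
            own[cluster].append(shap)
    return {c: max(mean(v), 0.0) / target_weight[c] for c, v in own.items()}


print(target_weight)
print(achieved("C"))
```

In this toy case country C is a quarter of the way to the frontier in Electronics and at twice the frontier in Metals, even though it is not itself in the competitive class: the frontier is set by the winners, and every country is measured against it.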

Note on country specificity. The need (target_weight) is a shared frontier — identical for India and Germany on a given tech, by design. All country-specific signal is in achieved, whose numerator is the country’s own competitive-class signed SHAP — so the green fill already differs across countries without any separate per-country SHAP export. See target_profile.qmd for the full construction and ml_replication.qmd for the audit that motivated it.