# Replication Package: Governance Topology Thesis Audit

**Date:** February 2026
**Contents:** 5 audit phases, 3 data files, 2 synthesis documents, canonical parameters, pre-generated results

---

## 1. Overview

This package contains a five-phase empirical audit of the Governance Topology thesis. Each phase is implemented as a standalone Python script (Phase 5 is split into two scripts, 5a and 5b) that reads from a flat CSV file and produces a Markdown results file. The audit tests the thesis's core quantitative claims against the data and against external benchmarks.

| Phase | Script | Output | Focus |
|-------|--------|--------|-------|
| 1 | `phase1-foundation-audit.py` | `phase1-audit-results.md` | Crosswalk validation, Event Horizon, Stage 5, velocity CIs, holdout accuracy |
| 2 | `phase2-model-hardening.py` | `phase2-model-hardening-results.md` | Shock estimation, Markov test, yield regression, AIC/BIC model comparison, mean reversion |
| 3 | `phase3-us-case-hardening.py` | `phase3-us-case-hardening-results.md` | US cross-validation vs 7 indices, institutional resilience, matched comparison, elections, reserve currency |
| 4 | `phase4-missing-evidence.py` | `phase4-missing-evidence-results.md` | Recalibration framework, Monte Carlo sensitivity, out-of-sample backtesting, counter-arguments CA1-CA7 |
| 5a | `phase5-recalibrated-monte-carlo.py` | `phase5-recalibrated-mc-results.md` | Recalibrated MC engine: data-driven sigma, AR(1) mean reversion, recalibration table, scenario distributions |
| 5b | `phase5-gdp-covariate.py` | `phase5-gdp-covariate-results.md` | GDP per capita as covariate: yield regressions, Przeworski threshold, Lipset hypothesis, GDP-augmented MC |

---

## 2. Prerequisites

- **Python 3.7+** (tested with Python 3.7.3)
- **No third-party packages required.** All scripts use only the Python standard library (`csv`, `math`, `random`, `statistics`, `collections`, plus `os` for path handling in Phase 5). No `pip install` step is needed.
- **Operating system:** Any (macOS, Linux, Windows). No OS-specific dependencies.

---

## 3. Data Files

| File | Format | Description |
|------|--------|-------------|
| `political-topology-flat.csv` | CSV | Master dataset. 1,656 observations, 91 countries, 1800-2025. |
| `political-topology-data.xlsx` | Excel (6 sheets) | Master dataset in multi-sheet Excel format. |
| `human_capabilities_index.xlsx` | Excel | Human Capabilities Index dataset (15 indicators). |

### CSV column definitions (`political-topology-flat.csv`)

| Column | Description |
|--------|-------------|
| `country` | Country name |
| `iso3` | ISO 3166-1 alpha-3 code (may be blank for some entries) |
| `region` | Geographic region |
| `year` | Observation year |
| `liberty` | Liberty score (0-100 scale) |
| `tyranny` | Tyranny score |
| `chaos` | Chaos score |
| `status` | Freedom House status classification |
| `event_horizon_below` | Whether the country is below the Event Horizon threshold (YES/NO) |
| `data_source_period` | Period label for the data source |

The four Phase 1-4 scripts read exclusively from `political-topology-flat.csv`. The two Phase 5 scripts look for their CSV input under `../data/` (see Section 4). The Excel files are provided for reference but are not required for replication.
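
Before running the audit, it can be useful to confirm that the CSV has the columns listed above. The following is a minimal sketch, not part of the package: the helper name `summarize_csv` and the in-memory sample rows are hypothetical and exist only to show the call shape.

```python
import csv
import io

# Column names as documented in the table above.
EXPECTED_COLUMNS = [
    "country", "iso3", "region", "year", "liberty",
    "tyranny", "chaos", "status", "event_horizon_below",
    "data_source_period",
]


def summarize_csv(handle):
    """Return (row_count, country_count, missing_columns) for an open CSV handle."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    rows = list(reader)
    countries = {row["country"] for row in rows if row.get("country")}
    return len(rows), len(countries), missing


# Tiny in-memory sample with made-up values (not taken from the dataset).
sample = io.StringIO(
    "country,iso3,region,year,liberty,tyranny,chaos,status,"
    "event_horizon_below,data_source_period\n"
    "Exampleland,,Nowhere,2000,70,20,10,F,NO,sample\n"
    "Exampleland,,Nowhere,2001,68,22,10,F,NO,sample\n"
)
print(summarize_csv(sample))
```

To check the real file, pass a handle opened with `open(path, newline="")` instead of the sample. Per the table in this section, the result should report 1,656 rows, 91 countries, and no missing columns.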

---

## 4. How to Run

### Important: Fix hardcoded paths first

The four Phase 1-4 scripts contain hardcoded absolute paths pointing to the original author's filesystem (the Phase 5 scripts use relative paths and need no edits):

```python
DATA_PATH = "/Users/nickgogerty/Downloads/Political topology/political-topology-flat.csv"
OUTPUT_PATH = "/Users/nickgogerty/Downloads/Political topology/phase1-audit-results.md"
```

**Before running, you must update these paths.** In each Phase 1-4 script, change `DATA_PATH` and `OUTPUT_PATH` at the top of the file to match your local directory. For example:

```python
DATA_PATH = "/your/path/to/political-topology-flat.csv"
OUTPUT_PATH = "/your/path/to/phase1-audit-results.md"
```
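
As an alternative to absolute paths, the two constants can be derived from the script's own location, similar in spirit to what the Phase 5 scripts do with `os.path.dirname(__file__)`. This is a sketch under the assumption that the CSV sits in the same directory as the script; the helper `package_paths` and the `replication` directory name are hypothetical.

```python
import os


def package_paths(script_file, data_name, output_name):
    """Build data/output paths in the same directory as the given script file."""
    base_dir = os.path.dirname(os.path.abspath(script_file))
    return (
        os.path.join(base_dir, data_name),
        os.path.join(base_dir, output_name),
    )


# Inside a phase script one would pass __file__ as the first argument.
# A literal (hypothetical) path is used here so the snippet also runs
# when pasted into an interpreter.
DATA_PATH, OUTPUT_PATH = package_paths(
    os.path.join("replication", "phase1-foundation-audit.py"),
    "political-topology-flat.csv",
    "phase1-audit-results.md",
)
print(DATA_PATH)
print(OUTPUT_PATH)
```

With this pattern the script no longer depends on the directory it is launched from, at the cost of assuming a fixed layout next to the script.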

### Running the scripts

Run each phase in order from the package directory:

```bash
python3 phase1-foundation-audit.py
python3 phase2-model-hardening.py
python3 phase3-us-case-hardening.py
python3 phase4-missing-evidence.py
python3 phase5-recalibrated-monte-carlo.py
python3 phase5-gdp-covariate.py
```

Each script writes its output to the corresponding `-results.md` file. The scripts are independent of each other -- any phase can be run in isolation, and the order shown above is only a convention.

**Note:** Phase 5 scripts use relative paths (via `os.path.dirname(__file__)`) and do not require path editing. They expect the standard directory layout with `../data/` for CSV files and `../../artifacts/` for output.
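
The six commands above can also be driven from a small Python runner. The sketch below is not part of the package; the function names `build_commands` and `run_all` are hypothetical. It relies only on `subprocess.run` with `check=True`, which raises an error and stops at the first script that exits with a non-zero status.

```python
import subprocess
import sys

# Script names as listed in the commands above.
SCRIPTS = [
    "phase1-foundation-audit.py",
    "phase2-model-hardening.py",
    "phase3-us-case-hardening.py",
    "phase4-missing-evidence.py",
    "phase5-recalibrated-monte-carlo.py",
    "phase5-gdp-covariate.py",
]


def build_commands(scripts, python=sys.executable):
    """Return one argv list per script, using the current interpreter."""
    return [[python, script] for script in scripts]


def run_all(scripts):
    """Run each script in turn; stop at the first non-zero exit code."""
    for argv in build_commands(scripts):
        print("Running", argv[1])
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the plan only. Call run_all(SCRIPTS) from the package
    # directory to actually execute the phases.
    for argv in build_commands(SCRIPTS):
        print(" ".join(argv))
```

Using `sys.executable` keeps every phase on the same interpreter that launched the runner, which avoids accidentally mixing Python versions.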

### Expected runtime

All scripts should complete in under 30 seconds each. Phase 4 and Phase 5a (Monte Carlo simulations) may take slightly longer due to simulation loops but should still finish within a minute.

---

## 5. Expected Outputs

After running all scripts, you should have:

```
phase1-audit-results.md                (~25 KB)
phase2-model-hardening-results.md      (~19 KB)
phase3-us-case-hardening-results.md    (~27 KB)
phase4-missing-evidence-results.md     (~28 KB)
phase5-recalibrated-mc-results.md      (~15 KB)  [output to artifacts/]
phase5-gdp-covariate-results.md        (~8 KB)   [output to artifacts/]
```

Pre-generated result files are included in this package for comparison. Phases 1, 2, and 4 use `random` for bootstrap and Monte Carlo procedures without a fixed seed, so exact numerical values may differ slightly between runs, but the qualitative conclusions should be identical. The Phase 5 scripts call `random.seed(42)`, so their output should be reproducible from run to run.
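
To illustrate why a fixed seed removes run-to-run variation, here is a minimal sketch of a seeded percentile bootstrap for a rate. It is an assumption that the audit's bootstrap resembles this; the function `bootstrap_rate_ci` and the outcome list are hypothetical and not taken from the scripts or the dataset.

```python
import random


def bootstrap_rate_ci(outcomes, n_resamples=2000, seed=None):
    """Percentile bootstrap 95% CI for the mean of a list of 0/1 outcomes."""
    rng = random.Random(seed)  # private generator; leaves global state alone
    n = len(outcomes)
    rates = []
    for _ in range(n_resamples):
        # Resample n outcomes with replacement and record the rate.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        rates.append(sum(resample) / n)
    rates.sort()
    lower = rates[int(0.025 * n_resamples)]
    upper = rates[int(0.975 * n_resamples) - 1]
    return lower, upper


# Made-up outcomes (3 successes in 100 trials), for illustration only.
outcomes = [1] * 3 + [0] * 97
first = bootstrap_rate_ci(outcomes, seed=42)
second = bootstrap_rate_ci(outcomes, seed=42)
print(first == second)  # same seed, same interval
```

Calling the function with `seed=None` draws from system entropy, so the interval endpoints will shift slightly between runs. That is the behavior to expect from any phase that does not set a seed.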

---

## 6. Key Findings Summary

### Phase 1 -- Foundation

- The Event Horizon threshold has been revised from L=60 to L≈52-55, where three independent methods converge. The recovery rate is 3.0% (95% CI: 0.7-6.0%). The reversal rate at L=60 is approximately 80%, not the originally claimed 12%.
- The US liberty score of L=48 is an author estimate, not an official Freedom House value.
- Stage 5 exhibits a structural break: +38% before 2006 versus -23.3% after. The global break falls around 2000 (F=21.2); the stage-specific break is at 2006.

> ⚠️ **METHODOLOGY NOTE:** The PTI score of L≈48 reflects the author's real-time institutional assessment incorporating executive action pace through early 2026. Published indices score the US higher: Freedom House 83/100 (2024 report), V-Dem LDI ≈0.65–0.72 (scaled: ~65–72). The divergence reflects the PTI's faster update cycle, weighting toward institutional constraint erosion, and incorporation of events post-dating published index coverage. All claims should be evaluated under both the author's PTI and established indices.

### Phase 2 -- Model Hardening

- All thesis sigma (shock probability) values were stipulated without empirical basis and are wrong by a factor of 2-7 compared to data-derived estimates.
- The Markov assumption is rejected: transition probabilities are path-dependent.
- A simple AR(1) model beats all stage-based models with a delta-AIC greater than 300.
- The mean reversion parameter k is approximately 0, meaning the claimed attractor dynamics are not present in the data.
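
The AR(1) finding above refers to a first-order autoregressive model. As a rough illustration of how such a model can be fitted with the standard library alone, the sketch below estimates `x[t+1] = a + b * x[t]` by ordinary least squares on a single series. The function name `fit_ar1` and the synthetic series are assumptions for illustration; the Phase 2 script's actual estimation code (for example, how it pools countries or computes AIC) may differ.

```python
def fit_ar1(series):
    """OLS fit of x[t+1] = a + b * x[t]; returns (a, b, residual_std)."""
    x = series[:-1]  # predictor: value at time t
    y = series[1:]   # response: value at time t + 1
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Two fitted parameters, so divide by n - 2.
    resid_std = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return a, b, resid_std


# Synthetic, noise-free series generated from x[t+1] = 8 + 0.9 * x[t].
series = [48.0]
for _ in range(30):
    series.append(8.0 + 0.9 * series[-1])

a, b, resid_std = fit_ar1(series)
print(round(a, 3), round(b, 3))
print(round(a / (1.0 - b), 3))  # implied long-run level when |b| < 1
```

On this synthetic input the fit should recover values close to `a = 8` and `b = 0.9`. For a fitted `b` with absolute value below 1, the ratio `a / (1 - b)` is the level the series drifts toward.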

### Phase 3 -- US Case

- US L=48 is not credible. The mean across 7 external indices (V-Dem, Polity5, EIU, FH, Bertelsmann, IDEA, WGI) is 76.6 on a normalized 0-100 scale. The credible range is 57-84.
- The reserve currency effect explains most of the yield gap that the thesis attributes to liberty scores.

### Phase 4 -- Missing Evidence

- V-Dem reclassified the US as an "electoral autocracy" in September 2025, which partially supports the thesis's directional concern but at a much milder level than L=48 implies.
- Monte Carlo simulation with data-driven parameters gives P(tyranny) of approximately 0%; however, the post-2006 estimate is P(L<50|15yr)=69%. The original thesis claim of 62% was built on stipulated σ values.
- Of 7 counter-arguments examined, 3 are assessed as strong (CA1: Institutional Resilience, CA5: Reserve Currency, CA7: Democratic Recovery).

### Phase 5a -- Recalibrated Monte Carlo

- The AR(1) model with data-driven sigma produces a median 15-year trajectory from L=48 to L=65 (converging toward L*=80.9), confirming mean reversion as the dominant dynamic.
- P(L<25 by 2040) is approximately 0% with data-driven sigma. The original P(tyranny)=62% claim is retracted.
- Recalibration table spans starting values L=48 through L=84 across 5/10/15-year horizons.
- Data-driven sigma values (0.45-4.45) are 2-7x lower than thesis stipulated values (3-7).
- Named scenario bands derived from percentile distributions: Recovery (p90-p95), Stabilization (p50-p75), Continued Erosion (p25-p50), Accelerated Decline (p5-p10).
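
The simulation described above can be pictured with a small sketch: many AR(1) paths are simulated forward, and percentiles of the final values define the scenario bands. The starting value L=48 and the long-run level L*=80.9 come from this section. The persistence `phi=0.95`, the choice `sigma=2.0` (picked from inside the reported 0.45-4.45 range), the clipping to the 0-100 scale, and both function names are assumptions for illustration, not the Phase 5a engine's actual parameters or code.

```python
import random


def simulate_ar1_paths(start, long_run_mean, phi, sigma, years, n_paths, seed=42):
    """Simulate L[t+1] = L* + phi * (L[t] - L*) + noise; return sorted finals."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        level = start
        for _ in range(years):
            shock = rng.gauss(0.0, sigma)
            level = long_run_mean + phi * (level - long_run_mean) + shock
            # Assumption: keep the score on the documented 0-100 scale.
            level = min(100.0, max(0.0, level))
        finals.append(level)
    return sorted(finals)


def percentile(sorted_values, p):
    """Nearest-rank percentile (p in 0-100) of an already sorted list."""
    index = int(round(p / 100.0 * (len(sorted_values) - 1)))
    return sorted_values[index]


# phi and sigma below are hypothetical placeholders, not audit results.
finals = simulate_ar1_paths(
    start=48.0, long_run_mean=80.9, phi=0.95, sigma=2.0,
    years=15, n_paths=5000,
)
for p in (5, 25, 50, 75, 95):
    print("p%d" % p, round(percentile(finals, p), 1))
```

Because the generator is seeded, repeated runs print the same percentiles. Changing `phi` or `sigma` shows how sensitive the band edges are to those two parameters.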

### Phase 5b -- GDP Covariate

- Adding log(GDP per capita) to the liberty-yield regression improves R² from 0.209 to 0.362.
- GDP is a significant moderator of liberty dynamics in the AR(1) model (gamma=1.964).
- The Przeworski threshold test shows higher recovery rates for countries above $15K GDP per capita.
- The Lipset hypothesis (high GDP dampens downside volatility) receives partial support from the data.

---

## 7. Synthesis Documents

| File | Description |
|------|-------------|
| `00-MODEL-THESIS-MAP.md` | Master thesis model with all audit findings integrated. Start here for a structured overview of what holds and what does not. |
| `00-THESIS-ARCHITECTURE-DIAGRAM.svg` | Visual architecture diagram of the thesis structure. |

---

## 8. Caveats and Limitations

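1. **Hardcoded paths.** The Phase 1-4 scripts must be edited before running on a different machine (see Section 4). The Phase 5 scripts use relative paths and expect the directory layout described there.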

2. **Stochastic variation.** Bootstrap confidence intervals (Phase 1, Phase 2) and Monte Carlo simulations (Phase 4, Phase 5) use Python's `random` module. Results from Phases 1, 2, and 4 will vary between runs unless you set a fixed seed; the Phase 5 scripts already call `random.seed(42)`. The pre-generated results reflect one specific run.

3. **No external data fetching.** All analysis is performed against the included CSV. The scripts do not download or validate against live Freedom House, V-Dem, or other external data sources. Cross-validation values for external indices (Phase 3) are hardcoded within the scripts based on published scores.

4. **Standard library only.** The deliberate choice to avoid NumPy/SciPy/statsmodels means some statistical procedures (e.g., panel fixed-effects regression, formal AIC computation) are implemented from scratch. These implementations are correct for the purpose of this audit but are less battle-tested than library equivalents.

5. **CSV data completeness.** Some `iso3` fields are blank. The dataset spans 1800-2025, but coverage is sparse before 1950. Most statistical tests in the audit focus on the post-1972 period where Freedom House data is available.

6. **Scope.** This audit tests quantitative claims only. Qualitative arguments, historical narratives, and theoretical framing in the thesis are not evaluated by these scripts.

---

## 9. Reproducibility Checklist

Use this checklist to confirm a successful replication:

- [ ] Python 3.7+ is installed (`python3 --version`)
- [ ] `DATA_PATH` and `OUTPUT_PATH` updated in Phase 1-4 scripts (Phase 5 uses relative paths)
- [ ] `political-topology-flat.csv` is present and has 1,657 lines (1 header + 1,656 data rows)
- [ ] Phase 1 runs without errors and produces `phase1-audit-results.md`
- [ ] Phase 2 runs without errors and produces `phase2-model-hardening-results.md`
- [ ] Phase 3 runs without errors and produces `phase3-us-case-hardening-results.md`
- [ ] Phase 4 runs without errors and produces `phase4-missing-evidence-results.md`
- [ ] Phase 5a runs without errors and produces `phase5-recalibrated-mc-results.md`
- [ ] Phase 5b runs without errors and produces `phase5-gdp-covariate-results.md`
- [ ] Phase 1 result confirms Event Horizon threshold at L≈52-55 with recovery rate 3.0% (95% CI: 0.7-6.0%)
- [ ] Phase 2 result confirms AR(1) outperforms stage models (delta-AIC > 100)
- [ ] Phase 3 result confirms US liberty cross-validation mean is above 70
- [ ] Phase 4 result confirms data-driven P(tyranny) is near 0%
- [ ] Phase 5a result confirms AR(1)-based MC median converges toward L*=80.9 (median 10yr from L=48 ≈ 55-65)
- [ ] Phase 5a result confirms data-driven sigma range is 0.45-4.45
- [ ] Phase 5b result confirms GDP improves liberty-yield regression R²
- [ ] Qualitative conclusions match between your run and the pre-generated results (exact numbers may differ due to stochastic variation)

---

## 10. File Manifest

### Core replication files (required)

```
political-topology-flat.csv               Data     Master dataset (CSV)
phase1-foundation-audit.py                Script   Phase 1: Foundation audit
phase2-model-hardening.py                 Script   Phase 2: Model hardening
phase3-us-case-hardening.py               Script   Phase 3: US case hardening
phase4-missing-evidence.py                Script   Phase 4: Missing evidence
phase5-recalibrated-monte-carlo.py        Script   Phase 5a: Recalibrated MC engine
phase5-gdp-covariate.py                   Script   Phase 5b: GDP covariate analysis
```

### Pre-generated results (for comparison)

```
phase1-audit-results.md                   Results  Phase 1 output
phase2-model-hardening-results.md         Results  Phase 2 output
phase3-us-case-hardening-results.md       Results  Phase 3 output
phase4-missing-evidence-results.md        Results  Phase 4 output
phase5-recalibrated-mc-results.md         Results  Phase 5a output (in artifacts/)
phase5-gdp-covariate-results.md           Results  Phase 5b output (in artifacts/)
```

### Synthesis and reference

```
00-MODEL-THESIS-MAP.md                    Synthesis   Integrated thesis model with audit findings
00-THESIS-ARCHITECTURE-DIAGRAM.svg        Synthesis   Visual architecture diagram
00-CANONICAL-PARAMETERS.md                Reference   All canonical model parameters (definitive)
political-topology-data.xlsx              Data        Master dataset (Excel, 6 sheets)
human_capabilities_index.xlsx             Data        HCI dataset (15 indicators)
```

---

## 11. Contact

For questions about replication or data provenance, contact the package author.