Museum of Code

alphafold_2018

The Atomic Scale Origami Algorithm that Changed Humanity.

This is a deep architectural teardown of the codebase, generated by the blAST engine: a custom, AST-free, LLM-free static-analysis knowledge graph engine built for code repositories. It parses functions against 50+ unique metrics and rolls them up to the class, file, folder, and repo levels. It runs O(N) 1-level network analysis and builds full function call graphs for reachability. It supports 50+ languages, so you can fully analyze multi-language repos, and it can switch languages mid-file. It is CLI-based. The writing is human; the data and framework come from my automated data analysis pipeline.

git sparse-checkout set alphafold_casp13

Historical Significance

Status: The Atomic Scale Origami Algorithm
Why it matters: It brought Protein Shape into Focus
Architects: John Jumper and the DeepMind Team

For 50 years, the 'Folding Problem' was the holy grail of biology. We knew the ingredients of life but we couldn't predict their shape.

And in biology, shape is function—it dictates how drugs work, how diseases spread, and how life survives.

AlphaFold changed the rules of the game. For decades, getting from a gene’s sequence to a protein’s shape took a lifetime of lab work. AlphaFold collapsed that time to seconds.

It’s like humming a simple melody into a computer and having it instantly transformed into an award-winning, top-of-the-charts song, in any genre, again and again.

But instead of music, it produces the shapes of proteins, the very machines that keep us alive and go awry in different diseases. We can finally predict the machines that control our health and life on this planet.

Architectural Synthesis

1. Information Flow & Purpose (The Executive Summary)
This is not a traditional software application; it is a highly specialized, brute-force mathematical pipeline. Data flows from massive pre-compiled weight tensors (the 13 binary .pb and .h5 "Dark Matter" files) directly into tightly encapsulated Python scripts. With an Encapsulation Ratio of 1.0 and a mere 1,756 lines of executable code driving the entire system, the architecture relies on intense computational density rather than sprawling object-oriented abstraction.

2. Notable Structures & Topology
The dependency graph is startlingly flat. A network topology with an Average Path Length of 0.0, 0 Articulation Points, and 0.0% Cyclic Loop Density indicates that these files do not form a deep, interconnected web. Instead, they act as highly isolated utility scripts processing data in sequence. However, this flat structure incurs a massive Architectural Drift (Z-Score: 4.66). The system heavily deviates from standard Python conventions, sacrificing modularity for immediate, linear execution.
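To make the topology numbers concrete, here is a minimal sketch of how average path length and articulation points can be computed over a file-level dependency graph. It is not blAST's implementation, which the report does not describe; the function names and both toy graphs are assumptions for illustration. A graph with no edges, like the one reported here, yields 0.0 and no articulation points.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (0.0 if none)."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search from each node along outgoing edges.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


def articulation_points(graph):
    """Nodes whose removal splits the graph, treating edges as undirected."""
    undirected = {n: set() for n in graph}
    for n, targets in graph.items():
        for t in targets:
            undirected[n].add(t)
            undirected.setdefault(t, set()).add(n)

    def component_count(skip=None):
        seen, count = set(), 0
        for start in undirected:
            if start == skip or start in seen:
                continue
            count += 1
            seen.add(start)
            stack = [start]
            while stack:
                cur = stack.pop()
                for nb in undirected[cur]:
                    if nb != skip and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        return count

    base = component_count()
    return [n for n in undirected if component_count(skip=n) > base]


# Toy graph with no edges between files (assumed shape, for illustration only).
flat = {"contacts.py": [], "contacts_network.py": [], "run_eval.sh": []}
# Toy chain a -> b -> c, where b is the single point of failure.
chain = {"a": ["b"], "b": ["c"], "c": []}

print(average_path_length(flat), articulation_points(flat))
print(average_path_length(chain), articulation_points(chain))
```

The brute-force articulation check removes each node in turn and recounts components, which is fine for a 33-file repo; a larger graph would call for Tarjan's linear-time algorithm instead.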

3. Security & Vulnerabilities
From a zero-trust perspective, the ecosystem is perfectly sterile—0 Shadow APIs, 0 Typosquatting hits, and 0 Supply Chain Anomalies. However, operational safety is severely compromised by a 40.9% Verification Risk and only 1 active Test Suite. This is the definitive hallmark of "Academic Research Code": it was built rapidly to prove a thesis for a publication, not test-driven for enterprise production. It relies entirely on the mathematical brilliance of its authors rather than programmatic guardrails.

4. Outliers & Extremes
The structural extremities reveal the friction of deployment. contacts_network.py acts as a "Blind Bottleneck"—a God Node calculating spatial distances at an agonizing O(N^6) time complexity, yet crippled by a 100% Documentation Risk. Simultaneously, the deployment pipeline itself (run_eval.sh) collapses under 100% Cognitive Load and 75% Tech Debt. The team was clearly focused on the neural network, treating the operational shell as a brittle afterthought, further evidenced by a chaotic 51.5% "Civil War" formatting clash (Tabs vs. Spaces) across the codebase.
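The report does not say how the Civil War score is derived. As a rough sketch of the underlying measurement only, the snippet below counts tab-indented and space-indented lines in a source string and flags a file as mixed when both styles appear. The function names and the sample text are hypothetical.

```python
def indent_counts(text):
    """Count lines whose indentation starts with a tab versus a space."""
    tabs = spaces = 0
    for line in text.splitlines():
        if line.startswith("\t"):
            tabs += 1
        elif line.startswith(" "):
            spaces += 1
    return {"tabs": tabs, "spaces": spaces}


def is_mixed(text):
    """True when a file contains both tab-indented and space-indented lines."""
    counts = indent_counts(text)
    return counts["tabs"] > 0 and counts["spaces"] > 0


# Hypothetical sample: one function indented with a tab, one with spaces.
sample = "def f():\n\treturn 1\n\ndef g():\n    return 2\n"

print(indent_counts(sample), is_mixed(sample))
```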

5. Recommended Next Steps (Refactoring for Stability)

  • Decouple the God Node: Fracture contacts_network.py into distinct, documented modules to lower the cognitive load and isolate the hazardous O(N^6) spatial logic.
  • Establish Verification Guardrails: Introduce unit test coverage to the core contacts.py orchestrators to reduce the 40.9% Verification Risk before attempting to scale the algorithm.
  • Standardize the Deployment Shell: Rewrite the brittle run_eval.sh script into a formalized Python orchestration tool to eliminate the extreme Tech Debt and cognitive load at the execution boundary.
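As an illustration of the third recommendation, the sketch below shows one shape a Python replacement for run_eval.sh could take. The contents of run_eval.sh are not shown in this report, so every flag here (`--target`, `--output-dir`, `--dry-run`) and the arguments passed to contacts.py are assumptions, not the real interface.

```python
import argparse
import subprocess
import sys
from pathlib import Path


def parse_args(argv=None):
    """Parse the (hypothetical) options the shell script would otherwise hard-code."""
    parser = argparse.ArgumentParser(description="Sketch of a run_eval.sh replacement")
    parser.add_argument("--target", required=True, help="target name (assumed flag)")
    parser.add_argument("--output-dir", type=Path, default=Path("out"))
    parser.add_argument("--dry-run", action="store_true",
                        help="print the command instead of running it")
    return parser.parse_args(argv)


def build_command(args):
    """Build the child-process command as a list, avoiding shell quoting issues."""
    return [
        sys.executable,
        "contacts.py",  # script name from the report; its real flags are unknown
        f"--target={args.target}",
        f"--output_dir={args.output_dir}",
    ]


def main(argv=None):
    args = parse_args(argv)
    cmd = build_command(args)
    if args.dry_run:
        print(" ".join(cmd))
        return 0
    args.output_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError on a non-zero exit instead of failing silently.
    return subprocess.run(cmd, check=True).returncode
```

Calling `main(["--target", "T0001", "--dry-run"])` prints the command without executing it. Keeping `build_command` separate from `main` is the design point: the command construction becomes unit-testable, which a shell script does not offer.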

Global System Scorecard

Ecosystem Composition

Test Suites: 1
Doc/Prose: 5
Build/Make: 1
Config/JSON: 0

Network Health

Global Archetype: Cluster 3
Z-Score Drift: 4.66
Cyclic Loop Density: 0.0%

Zero-Trust Audit

Typosquat Hits: 0
Binary Anomalies: 0
Blacklist/Unknown Pkgs: 0

Architecture & Scale

Total Files: 33
Coding LOC: 1756
Total Classes: 6
Encapsulation Ratio: 1.0

Extended Topology

Avg Path Length: 0.0
Articulation Points: 0
Assortativity: 0.0
Modularity: 0.0

Extended Security

Shadow APIs: 0

Global Risk Exposures (Averages)

Cog Load: 7.57
Deep Churn: 0.0
Error/Safety Risk: 5.73
Tech Debt: 21.6
Doc Risk: 21.84
Verification Risk: 40.93
Stability (Heat): 0.0
Graveyard: 0.0
API Exposure: 1.4
Concurrency: 0.38
State Flux: 5.98
Civil War (Tabs/Spaces): 51.52

What to look for:

Sort by Fragility to find orchestrators that pull the system together—these are highly coupled and break easily if external APIs change. Sort by Popularity to find the load-bearing pillars; if these fail, the ecosystem collapses. A healthy system balances mass across many nodes; a fragile one consolidates it into a single 'God Node'.

File Name, Ecosystem Role, Structural Mass, Fragility, Popularity

What to look for:

This matrix exposes the multi-dimensional technical debt of the architecture. Sort by Cumulative Risk to prioritize your refactoring efforts. Look for the deadly trio: High Cog Load, High State Flux, and High Test Risk. A file with low 'Cog Load' but extreme 'State Flux' is easy to read but mutates data dangerously.

File Name, Cumulative Risk, Cognitive Load, Tech Debt, State Flux, Test Risk, Safety Risk, Concurrency, API Exposure, Graveyard, Churn
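The two patterns described above can be expressed as simple filters over the matrix. The sketch below is an assumption-laden illustration: the `FileRisk` record, the thresholds, and the sample rows are invented for the example and are not values from this report.

```python
from dataclasses import dataclass


@dataclass
class FileRisk:
    name: str
    cog_load: float
    state_flux: float
    test_risk: float


def deadly_trio(rows, threshold=50.0):
    """Files where all three scores meet the (assumed) threshold, worst first."""
    hits = [
        r for r in rows
        if r.cog_load >= threshold
        and r.state_flux >= threshold
        and r.test_risk >= threshold
    ]
    return sorted(hits, key=lambda r: r.cog_load + r.state_flux + r.test_risk, reverse=True)


def quiet_mutators(rows, low=20.0, high=50.0):
    """Files that are easy to read (low Cog Load) but mutate state heavily."""
    return [r for r in rows if r.cog_load <= low and r.state_flux >= high]


# Invented rows for illustration only.
rows = [
    FileRisk("a.py", cog_load=80, state_flux=70, test_risk=90),
    FileRisk("b.py", cog_load=10, state_flux=85, test_risk=40),
    FileRisk("c.py", cog_load=60, state_flux=55, test_risk=50),
]

print([r.name for r in deadly_trio(rows)])
print([r.name for r in quiet_mutators(rows)])
```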

What to look for:

This is the raw heuristic telemetry driving the physics engine. Use this to manually verify the automated risk scores and investigate specifically why a file was flagged. Scan for severe outliers in structural signatures—from high Struct Branch density to dangerous State Bailout Hits.

File Name, Mass, Max Big-O, Cumulative Risk, File Archetype, LOC, Cog Raw, Ownership Entropy, Silo Risk, Raw Churn Freq, Pagerank Score, Closeness Score, Producer Ratio, Avg Func Loc, Avg Func Complexity, Max Func Complexity, Avg Func Args, Func Complexity Gini, Func Internal Density, Dependency Density, Encapsulation Ratio, Ai Threat Confidence, Func Z Max, Func Z Mean, Func Z Median, Pct Z Above 5, Pct Z Above 15, Repo Z Score, Is Malware, Has Credentials, Binary Anomaly, Glassworm Flag, Token Mass, Financial Read Cost, Agentic Black Hole, Requires Hitl, Appsec Rce Funnel, Appsec God Mode, Appsec Exfiltration, Hallucination Zone, Silent Mutation Risk, Struct Branch, Struct Linear, Struct Args, Struct Func Start, Struct Class Start, Def Safety, State Safety Neg, State Danger, Arch Io, Arch Api, State Flux, State Graveyard, Def Doc, Def Test, Arch Concurrency, Arch Ui Framework, Struct Closures, Arch Globals, Struct Decorators, Struct Generics, Struct Comprehensions, Arch Scientific, State Heat Triggers, Arch Import, Def Ownership, State Planned Debt, State Fragile Debt, Def Spec Exposure, Arch Ssr Boundaries, Arch Events, Arch Dependency Injection, Struct Macros, State Pointers, State Memory Alloc, Arch Inline Asm, Def Telemetry, State Print Hits, State Cast Hits, State Bailout Hits, State Halt Hits, Bitwise Hits, Def Sync Locks, Def Freeze Hits, Def Cleanup, Def Encapsulation, Def Listeners, Def Test Skip, Struct Tabs, Struct Spaces, Arch Hardware, Arch Crypto, Def Auth, Arch Ipc, Arch Feature Flags, Arch Serialization, Arch Regex, Arch Time, Llm Api, Llm Orchestrator, Llm Vector Store, Llm Local Compute, Ai Tools, Ai Memory, Ai Logic Loop, Ml Traditional, Dl Frameworks, Lazy Evaluation, Vectorized Math, Struct Var Decl, Struct Camel Case, Struct Snake Case, Struct Pascal Case, Struct Upper Case, Struct Short Vars, Struct Long Vars, State Slop Duplicates, State Slop Orphans, Threat Obfuscated, Threat Bypasses, Threat Network Hooks, Threat Eval Exec, Threat Env Mutation, Sec Graveyard, Threat Crypto Math, Threat Stego Imports, Threat Homoglyphs, Threat Private Info, Threat Extension Mismatch, Threat Entropy, Threat Tainted Injection, Prompt Injection, Agentic Rce

What to look for:

Here we isolate the architecture down to the atomic level. Sort by Big-O to find recursive or highly nested algorithms that threaten performance (O(N^6) or worse). Sort by Impact Mass to locate massive, monolithic functions that violate the Single Responsibility Principle and need to be fractured into smaller, testable units.

Function Name, Parent File, Impact Mass, Big-O, Recursive?, Function Archetype, LOC, Args, Outbound Calls, Function Drift, Token Mass, Keyword Density, Struct Branch, Struct Linear, Struct Args, Struct Func Start, Struct Class Start, Def Safety, State Safety Neg, State Danger, Arch Io, Arch Api, State Flux, State Graveyard, Def Doc, Def Test, Arch Concurrency, Arch Ui Framework, Struct Closures, Arch Globals, Struct Decorators, Struct Generics, Struct Comprehensions, Arch Scientific, State Heat Triggers, Arch Import, Def Ownership, State Planned Debt, State Fragile Debt, Def Spec Exposure, Civil War, Arch Ssr Boundaries, Arch Events, Arch Dependency Injection, Struct Macros, State Pointers, State Memory Alloc, Arch Inline Asm, Def Telemetry, State Print Hits, State Cast Hits, State Bailout Hits, State Halt Hits, Bitwise Hits, Def Sync Locks, Def Freeze Hits, Def Cleanup, Def Encapsulation, Def Listeners, Def Test Skip, Struct Tabs, Struct Spaces, Arch Hardware, Arch Crypto, Def Auth, Arch Ipc, Arch Feature Flags, Arch Serialization, Arch Regex, Arch Time, Llm Api, Llm Orchestrator, Llm Vector Store, Llm Local Compute, Ai Tools, Ai Memory, Ai Logic Loop, Ml Traditional, Dl Frameworks, Lazy Evaluation, Vectorized Math, Struct Var Decl, Struct Camel Case, Struct Snake Case, Struct Pascal Case, Struct Upper Case, Struct Short Vars, Struct Long Vars, State Slop Duplicates, State Slop Orphans, Threat Obfuscated, Threat Bypasses, Threat Network Hooks, Threat Eval Exec, Threat Env Mutation, Sec Graveyard, Threat Crypto Math, Threat Stego Imports, Threat Homoglyphs, Threat Private Info, Threat Extension Mismatch, Threat Entropy, Threat Tainted Injection, Prompt Injection, Agentic Rce