Scriptable Agents: Running Python Inside Strategy

Key Takeaways
- Text generation is table stakes: The next frontier is agents that execute real computation, not just write about it
- Skills as executable modules: Python scripts embedded in Skills (like market-sizing.py) transform agents from consultants into analysts
- The computation gap: Most AI agents can explain Monte Carlo simulations but can't run them—scriptable agents close this gap
- MCP + executable Skills: Combining Model Context Protocol with script execution creates agents that both think and compute
- Deterministic backbone matters: When agents run actual code, reliability infrastructure (Google ADK) becomes critical, not optional
The Uncomfortable Truth About AI "Analysis"
Here's the dirty secret about most AI agents: they're sophisticated bullshitters when it comes to numbers.
Ask your average AI assistant to "analyze market size for a B2B SaaS in the HR tech space," and you'll get beautifully formatted prose about TAM, SAM, and SOM. It'll cite frameworks. It'll sound authoritative. But here's what it won't do: actually run the numbers.
It won't pull real data, apply statistical models, or generate Monte Carlo simulations to show you confidence intervals. It'll just... write about what analysis should look like. It's the business equivalent of someone describing how to cook a meal instead of actually cooking it.
This is the core failure of first-generation AI agents: they're incredible at appearing analytical while doing zero actual computation.
Why Text Generation Hit a Wall
The AI Board Room—with its specialized agents like Atlas (strategy), Cipher (analytics), and Nova (operations)—was designed to move beyond generic ChatGPT responses. Each agent loads specialized Skills via SKILL.md files, bringing domain expertise to the table.
But even with specialized knowledge, there's a fundamental limitation: language models generate language. They predict the next token, not the next calculation.
When Cipher needs to model customer acquisition costs across different channels with varying conversion rates, LLM-native approaches force it to either:
- Approximate with rough mental math (unreliable)
- Describe what the calculation should be (useless)
- Generate code as text for humans to run elsewhere (friction-heavy)
None of these options are acceptable for a solo founder making real business decisions at 11 PM on a Sunday.
Enter Scriptable Skills: Code as Native Intelligence
Here's where it gets interesting. What if Skills weren't just markdown files with expertise, but executable modules that agents could actually run?
Imagine a Skill file that looks like this:
```markdown
# SKILL.md: Market Sizing Analysis

## Core Competency
Statistical market analysis using bottom-up and top-down methodologies

## Executable Tools
- market-sizing.py: Monte Carlo simulation for TAM/SAM/SOM
- cohort-analysis.py: Customer lifetime value modeling
- sensitivity.py: Multi-variable sensitivity analysis

## When to Execute
When user requests numerical market analysis, financial projections, or uncertainty quantification
```
Now when you ask Cipher to "analyze the addressable market for my AI coaching platform," it doesn't just write about it—it executes market-sizing.py with your parameters, runs actual simulations, and returns real distributions with confidence intervals.
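To make that concrete, here is a minimal sketch of what a market-sizing.py-style Monte Carlo model could look like. The distributions, parameter values, and output shape are all hypothetical placeholders; a real Skill would draw its inputs from your actual data.

```python
# Hypothetical sketch of a market-sizing.py-style tool: Monte Carlo TAM
# estimate with confidence intervals. All distributions and parameter
# values below are illustrative, not real market data.
import math
import random

def simulate_tam(n_runs: int = 10_000, seed: int = 42) -> dict:
    """TAM ~= target companies x adoption rate x annual contract value."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_runs):
        companies = rng.lognormvariate(math.log(50_000), 0.3)  # target companies
        adoption = rng.betavariate(2, 18)                      # share that adopts
        price = max(rng.gauss(6_000, 1_200), 0)                # annual contract value
        samples.append(companies * adoption * price)
    samples.sort()
    return {
        "p10": samples[int(0.10 * n_runs)],
        "median": samples[n_runs // 2],
        "p90": samples[int(0.90 * n_runs)],
    }
```

Instead of one point estimate, the agent can report the p10/median/p90 spread—the "real distributions with confidence intervals" rather than a single confident-sounding number.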
This is the difference between an agent that can talk about strategy and one that can compute strategy.
The Technical Architecture: MCP + Executable Skills
The magic happens at the intersection of three technologies:
1. Model Context Protocol (MCP)
MCP already enables agents to use external tools—APIs, databases, calculators. It's the bridge between language models and the real world. But traditionally, these tools are pre-built, generic utilities.
2. Skills as Modular Expertise
The AI Board Room's Skill system loads specialized knowledge dynamically. When you're talking market strategy, Atlas loads competitive-positioning.md. When you're analyzing metrics, Cipher loads unit-economics.md.
3. Executable Scripts Within Skills
Here's the innovation: Skills can bundle Python scripts as first-class citizens. Not as text to be generated, but as actual executables the agent can invoke via MCP.
When Cipher loads financial-modeling.md, it also gets access to dcf-analysis.py, burn-rate-calculator.py, and scenario-planner.py. These aren't suggestions—they're tools it can actually use.
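One way to wire this up is a registry that maps each Skill's bundled scripts to callables the agent runtime can invoke. The registry API below is a hypothetical sketch (the actual interface isn't specified here); only the skill and tool names mirror the article's examples.

```python
# Illustrative registry: a Skill's bundled scripts become first-class,
# invokable tools. The SkillToolRegistry API itself is hypothetical.
from typing import Callable, Dict

class SkillToolRegistry:
    def __init__(self):
        self._tools: Dict[str, Callable[..., dict]] = {}

    def register(self, skill: str, tool: str):
        def decorator(fn: Callable[..., dict]):
            self._tools[f"{skill}/{tool}"] = fn
            return fn
        return decorator

    def invoke(self, skill: str, tool: str, **params) -> dict:
        key = f"{skill}/{tool}"
        if key not in self._tools:
            raise KeyError(f"No executable tool {key!r} loaded")
        return self._tools[key](**params)

registry = SkillToolRegistry()

@registry.register("financial-modeling", "burn-rate-calculator.py")
def burn_rate(cash: float, monthly_burn: float) -> dict:
    # Months of runway at the current spend rate
    return {"runway_months": round(cash / monthly_burn, 1)}
```

For example, `registry.invoke("financial-modeling", "burn-rate-calculator.py", cash=500_000, monthly_burn=40_000)` returns `{"runway_months": 12.5}`—a tool call, not generated text.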
Why This Matters for Solo Founders
If you're building a startup solo, you're already wearing 12 hats. You're the CEO, the product manager, the marketer, and the analyst. You don't have time to:
- Export data to Excel
- Remember which Python libraries do what
- Debug code at midnight when you just need a quick sensitivity analysis
- Hire a data analyst for a 2-hour project
Scriptable agents collapse this entire workflow. You're having a strategic conversation with Atlas about pricing strategy, and mid-conversation you say, "Actually, can we model what happens if we increase price 20% but lose 15% of customers?"
Atlas doesn't pause. It doesn't tell you to open a spreadsheet. It invokes pricing-elasticity.py, runs the simulation with your current metrics from the User Dossier, and shows you the results in context of the strategic discussion.
The computation becomes invisible. The insight becomes immediate.
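The pricing question above reduces to simple arithmetic that a script like pricing-elasticity.py could answer instantly. This sketch (with made-up function and parameter names) shows the core calculation: a 20% price increase retained by 85% of customers nets out to roughly a 2% revenue gain.

```python
# Sketch of the pricing-elasticity check from the conversation above:
# raise price 20%, lose 15% of customers. The function name and inputs
# are illustrative, not the actual pricing-elasticity.py.
def revenue_impact(price_change: float, churn: float) -> float:
    """Fractional revenue change: (1 + price_change) * (1 - churn) - 1."""
    return (1 + price_change) * (1 - churn) - 1

delta = revenue_impact(0.20, 0.15)
print(f"Revenue change: {delta:+.1%}")  # prints "Revenue change: +2.0%"
```

Trivial math, but the point is who runs it: the agent, mid-conversation, against your real numbers.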
The Deterministic Backbone: Why Reliability Matters 10x More
Here's where things get serious: when agents were just generating text, hallucinations were annoying. When agents are executing code that influences business decisions, hallucinations become dangerous.
This is why the Google ADK (Agent Development Kit) and deterministic backbone architecture are critical. When Cipher decides to run market-sizing.py, you need:
- Deterministic action extraction: The system must reliably identify when to execute vs. when to explain
- Sandboxed execution: Scripts run in isolated environments with proper resource limits
- Output validation: Results pass through the Critic Agent before being presented as facts
- Audit trails: Every execution is logged with inputs, outputs, and reasoning
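A minimal sketch of that reliability layer, with a placeholder critic function standing in for the Critic Agent (the real validation interface is not specified here):

```python
# Illustrative wrapper: every script execution is validated before its
# result is presented, and logged with inputs, outputs, and timing.
import time
from typing import Callable

AUDIT_LOG = []

def run_with_audit(tool_name: str, tool: Callable[..., dict],
                   critic: Callable[[dict], bool], **inputs) -> dict:
    started = time.time()
    result = tool(**inputs)
    approved = critic(result)          # stand-in for the Critic Agent
    AUDIT_LOG.append({
        "tool": tool_name,
        "inputs": inputs,
        "result": result,
        "approved": approved,
        "elapsed_s": round(time.time() - started, 3),
    })
    if not approved:
        raise ValueError(f"Critic rejected output of {tool_name}")
    return result
```

Rejected outputs never reach the conversation, and every run leaves a replayable trail.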
The A2A (Agent-to-Agent) protocol becomes crucial here. When Atlas realizes it needs computational analysis, it doesn't just "call a function"—it formally delegates to Cipher with explicit context, success criteria, and validation requirements.
Real-World Scenarios
Scenario 1: Market Entry Decision
You: "Should I expand to the UK market?"
Atlas (loading market-entry.md): "Let me analyze this systematically. First, I need to model the addressable market and unit economics in GBP."
[Atlas delegates to Cipher via A2A]
Cipher (executing market-sizing.py and localization-cost-model.py): "Based on comparable SaaS penetration rates and your current CAC/LTV, here are three scenarios..."
[Returns actual probability distributions, not guesses]
Atlas: "Given these numbers, I recommend a soft launch with these specific constraints..."
Scenario 2: Pricing Optimization
You: "I'm thinking of switching to usage-based pricing."
Cipher (executing pricing-model-comparison.py): "I've modeled your current customer cohorts against three usage-based structures. Here's the revenue impact over 18 months with 80% confidence intervals..."
The agent doesn't just have opinions—it has computed projections based on your actual data.
The Voice Mode Multiplier
Now add Native Audio to this mix. You're driving to a meeting, and you verbally brainstorm with Nova about a new product feature. Mid-conversation, you wonder about pricing implications.
Nova—via A2A—brings Cipher into the conversation. Cipher runs the pricing model while you're still talking. By the time you finish your thought, it has results.
This is the future: strategic conversations that seamlessly incorporate real-time computation, all happening at the speed of speech.
Implementation Reality Check
Let's be clear: this isn't vaporware, but it's also not trivial. Building scriptable agents requires:
- Secure sandbox environments for code execution
- Careful Skill design to know when to compute vs. explain
- Robust error handling when scripts fail
- Clear user consent around what code runs when
- Critic Agent validation to catch nonsense outputs
The User Dossier becomes even more important—scripts need access to your metrics, but with proper privacy controls.
This is why the deterministic backbone matters. You're not just stringing together API calls; you're building a reliable system where agents make consequential computational decisions.
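At its simplest, sandboxed execution means running a script in a separate process with a hard timeout and captured output, as in this sketch. Real isolation needs much more (containers, filesystem restrictions, resource limits); this only shows the shape.

```python
# Minimal sketch of isolated script execution: a child interpreter,
# a hard timeout, and captured stdout/stderr. Not production sandboxing.
import subprocess
import sys

def execute_script(code: str, timeout_s: float = 5.0) -> str:
    """Run analysis code in a separate Python process with a timeout."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout_s,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"Script failed: {proc.stderr.strip()}")
    return proc.stdout.strip()
```

A runaway simulation raises `subprocess.TimeoutExpired` instead of hanging the conversation, and a crashing script surfaces its error rather than silently producing garbage.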
Beyond Python: The Broader Vision
Python is the obvious starting point—it's the lingua franca of data analysis and modeling. But the scriptable agent concept extends further:
- SQL scripts for direct database analysis
- R scripts for specialized statistical modeling
- Shell scripts for infrastructure automation
- Custom binaries for domain-specific simulation
The key insight is that agents should be able to invoke any computational tool, not just generate text about it.
The Competitive Moat
Here's the provocative take: companies building AI agents without executable capabilities are building the wrong product.
Text generation is commoditizing rapidly. Every startup has an "AI consultant" chatbot now. The differentiation isn't in how well your agent can describe analysis—it's in whether it can actually do analysis.
Scriptable agents create a moat because they require:
- Deep integration between LLMs and execution environments
- Sophisticated delegation protocols (A2A)
- Reliability infrastructure (deterministic backbone)
- Domain expertise encoded as executable Skills
This is hard to replicate. It's not just prompt engineering—it's actual systems architecture.
Call to Action: Experience Computation-Native AI
The AI Board Room at JobInterview.live is pioneering this scriptable agent approach. Our specialized agents—Atlas, Cipher, Nova, and the team—don't just talk strategy; they compute it.
Whether you're sizing a market, modeling pricing scenarios, or analyzing customer cohorts, you're working with agents that can both think and calculate.
Ready to move beyond AI that just sounds smart to AI that actually is smart?
Try the AI Board Room at JobInterview.live and experience the difference between agents that describe analysis and agents that perform it.
The future of AI assistance isn't better text generation. It's invisible computation woven into natural conversation.
Welcome to the era of scriptable agents.