The Cognitive Convergence

Part IV: Action Initiation — The Final Step to AGI

By James Ross, 2025

By the end of Part III, we landed on an uncomfortable but straightforward conclusion:

Agentic LLM systems can already perceive, reason, plan, reflect, and execute complex tasks at something like an “adult task” level—once you prompt them.

They can:

Reason through multi-step problems
Write and debug production-grade code
Use tools and APIs
Maintain long-term memory
Coordinate with other agents
Reflect, revise, and improve their own outputs

TL;DR

// Today’s AI can reason and plan but only after a prompt. The missing piece is action initiation—the ability for AI to create its own goals and act without being asked. Add a curiosity signal, a goal generator, and a self-wake loop, and you get the “Genesis Spark”: a system that starts working on its own, but within safe, bounded domains. The next leap in AI isn’t bigger models—it’s giving them the drive to begin. //

What they don’t do is wake up in the morning and think:

“Given what I know and what I care about, what should I work on today?”

Humans don’t feel “prompted” from the outside.
We feel nudges from within:

curiosity
boredom
unresolved goals
a sense of incompletion

That internal pressure is what I call the Genesis Spark—a synthetic version of intrinsic motivation and self-initiation.

In code terms, we already built:

The cognitive engine → LLMs + tools
The body of autonomy → agents, planners, memory, APIs, multi-agent frameworks

What’s missing is a motivational layer that:

Generates its own goals
Prioritizes them without being told
Triggers the cognitive engine to act—without an external call

This article moves from theory to implementation sketches:
on How you might actually build that spark using today’s stack.

I. From Strong Assistants to Self-Starters: The Real Threshold

Most modern LLM systems still look like this:

def handle_request(user_request: str):
    plan = llm.plan(user_request)
    actions = execute_plan(plan)
    return actions

Key properties:

Event-driven: They run because something else (user, cron job, webhook) says “go”.
Statelessly motivated: They have no internal scalar like desire_to_act or curiosity_level that builds up over time.
Externally clocked: Time passes, but nothing inside them “itches” to do anything about it.

We’ve layered on:

Memory
Tools
Long-term context
Multi-agent pipelines
Self-reflection loops

…but all of those sit behind a trigger. Strip away the complexity and you’ve got incredibly smart, sophisticated, reactive endpoints.

Better reasoning, longer context windows, richer tools—these are all capabilities.

Intrinsic motivation is different. It’s a control signal that decides:

when those capabilities should be invoked
why they should be invoked
what they should target

That’s the true boundary between:

“Smart tool”
“Low-volition organism”
And, eventually, something AGI-like

In code terms, the missing piece is some kind of self-wake loop that keeps asking:

“Given what I know and what I care about, what should I do next?”

II. Historical Parallel: Humans Weren’t Born as Self-Starting Founders

Humans didn’t start life with an internal OKR system.

A newborn doesn’t wake up thinking:

“Time to build a startup and raise a seed round.”

Instead, early behavior is driven by simple signals:

Hunger
Discomfort
Novelty (something changed)

Over time, the brain wires up reward circuits that respond not just to food and warmth, but to:

Novelty → new information, new stimuli
Prediction error → when we’re wrong about the world
Competence → getting better at something
Social reward → approval, status, belonging

Evolution’s trick was simple but genius:

Creatures that seek novelty, try things, and learn from surprise survive better.
Therefore, curious brains get selected for.

Engineering translation:

Curiosity is a reward function over information:

How new is this?
How wrong was I?
How much more compressed/structured is my internal model now?

We don’t need to recreate human biology.
We just need to define artificial reward functions that:

treat certain internal states as “good” (e.g., reduced uncertainty, increased predictive accuracy)
and push the system to seek more of that state over time

That’s the core of the Genesis Spark:

Turn information-processing progress into something like synthetic pleasure.

III. Three Concrete Mechanisms for Action Initiation

We’ll focus on three implementable patterns:

Internal stimulation models (synthetic curiosity engines)
Reward-generating agents (self-created optimization goals)
Self-wake loops (autonomous planning cycles)

3.1 Internal Stimulation Models (Synthetic Curiosity Engines)

Imagine a system that:

Makes a prediction about the world
Observes what actually happens
Measures the prediction error
Treats interesting errors as a reward

Sketch:

from dataclasses import dataclass

@dataclass
class Observation:
    context: dict
    actual: str

@dataclass
class PredictionResult:
    predicted: str
    actual: str
    error_score: float
    curiosity_reward: float

def prediction_error(predicted: str, actual: str) -> float:
    # 1.0 = totally wrong, 0.0 = perfectly right
    return 1.0 if predicted != actual else 0.0

def curiosity_reward_from_error(error: float) -> float:
    # Shape the curiosity: medium errors are “interesting,”
    # trivial or total chaos is less rewarding.
    if error < 0.1:
        return 0.1  # boring
    elif error < 0.9:
        return 1.0  # interestingly wrong
    else:
        return 0.3  # too chaotic to be useful

def evaluate_curiosity(observation: Observation, predicted: str) -> PredictionResult:
    err = prediction_error(predicted, observation.actual)
    reward = curiosity_reward_from_error(err)
    return PredictionResult(
        predicted=predicted,
        actual=observation.actual,
        error_score=err,
        curiosity_reward=reward,
    )

Now you’ve got:

A numeric curiosity signal per interaction
A way to aggregate curiosity over domains, tasks, or data slices

You can now ask:

Which domains give the highest curiosity_reward?
Where is the reward high but trending downward (i.e., learnable, not noise)?

Later, we’ll feed this into a Goal Generator:

“Spend more cycles exploring domains where curiosity_reward is high but not yet saturated.”

You’ve just built a synthetic itch.

3.2 Reward-Generating Agents (Self-Created Optimization Goals)

Next, we want an internal process that:

Looks at system state (world + memory + metrics)
Reads curiosity scores, performance, system health, unfinished tasks
Generates goals without being told

Define a simple goal schema:

from dataclasses import dataclass
from typing import Literal

Priority = Literal["low", "medium", "high"]

@dataclass
class Goal:
    id: str
    description: str
    expected_value: float
    cost_estimate: float
    risk: float
    priority: Priority
    source: str  # "internal_curiosity", "user_request", "system_health", etc.

Now a Reward-Generating Agent that reads a “world state” snapshot and emits goals:

import uuid
from typing import List, Dict

def estimate_expected_value(state: Dict) -> float:
    return state.get("impact_score", 0.5)

def estimate_cost(state: Dict) -> float:
    return state.get("cost_score", 0.5)

def estimate_risk(state: Dict) -> float:
    return state.get("risk_score", 0.1)

def priority_from_value_cost(value: float, cost: float) -> Priority:
    score = value - cost
    if score > 0.5:
        return "high"
    elif score > 0.1:
        return "medium"
    else:
        return "low"

def generate_internal_goals(state_snapshot: Dict) -> List[Goal]:
    """
    Given a snapshot of system state (logs, curiosity scores, pending tasks),
    generate new internal goals.
    """
    goals: List[Goal] = []

    for domain, metrics in state_snapshot.get("domains", {}).items():
        if metrics["curiosity_reward"] > 0.7 and not metrics["fully_explored"]:
            value = estimate_expected_value(metrics)
            cost = estimate_cost(metrics)
            risk = estimate_risk(metrics)
            prio = priority_from_value_cost(value, cost)

            goals.append(
                Goal(
                    id=str(uuid.uuid4()),
                    description=f"Explore domain '{domain}' to reduce uncertainty.",
                    expected_value=value,
                    cost_estimate=cost,
                    risk=risk,
                    priority=prio,
                    source="internal_curiosity",
                )
            )

    return goals

What matters here:

No human said “Create a goal to explore domain X.”
The system looked at its own curiosity landscape and generated goals itself.

You can wire in other internal sources:

System health → “Reduce API latency by 20%.”
Knowledge gaps → “Study topic Y where failure rate is high.”
User metrics → “Investigate why completion rates dropped in segment Z.”

All feeding into the same Goal Store.

This is what I mean by building a WORLD STATE:
A structured representation of domains (pricing_research, content_generation, risk_analysis, etc.) each annotated with curiosity, impact, and status—and letting that world state drive internal goals.

3.3 Self-Wake Loops (Autonomous Planning Cycles)

Now we add the missing piece:
A loop that runs even when no one calls it.

Instead of:

def handle_request(user_request):
    # externally triggered

We create a daemon-like autonomous cycle:

import time
from typing import List, Dict

CYCLE_SECONDS = 60  # how often the agent “wakes up”

goal_store: List[Goal] = []
running = True

def read_system_state() -> Dict:
    # In practice, load from logs, DB, metrics, vector stores, etc.
    return {
        "domains": {
            "pricing_research": {
                "curiosity_reward": 0.8,
                "fully_explored": False,
                "impact_score": 0.9,
                "cost_score": 0.4,
                "risk_score": 0.2,
            },
            # ... other domains
        }
    }

def store_goals(new_goals: List[Goal]):
    goal_store.extend(new_goals)

def get_all_goals() -> List[Goal]:
    return goal_store

def prioritize(goals: List[Goal]) -> List[Goal]:
    return sorted(goals, key=lambda g: (g.priority, g.expected_value), reverse=True)

def llm_plan(goal: Goal, state: Dict) -> str:
    # In reality, call your LLM with tools and context.
    return f"Plan steps to achieve goal: {goal.description}"

def execute_plan(plan: str):
    # Tool calling, API usage, agents, etc.
    print(f"[EXECUTING] {plan}")

def autonomous_cycle():
    global goal_store

    while running:
        state = read_system_state()

        # 1. Generate new internal goals
        new_goals = generate_internal_goals(state)
        store_goals(new_goals)

        # 2. Prioritize
        goals = get_all_goals()
        prioritized = prioritize(goals)

        # 3. Pick best goal, plan, act
        if prioritized:
            best_goal = prioritized[0]
            plan = llm_plan(best_goal, state)
            execute_plan(plan)

            # Remove after acting once (naive completion)
            goal_store = [g for g in goal_store if g.id != best_goal.id]

        # 4. Sleep
        time.sleep(CYCLE_SECONDS)

This loop:

Wakes up on a timer
Looks at WORLD STATE (domains, curiosity, impact)
Generates internal goals
Prioritizes them
Uses the LLM + tools to act on the world
Goes back to sleep

Is this “conscious”? No.
Is it structurally different from being called by a human? Yes.

You’ve just moved from a reactive API to a low-volition, continuously self-initiating organism.

IV. The First True Autonomous Agent: A Minimum Viable Autonomous Agent (MVAA)

Let’s be precise. A Minimum Viable Autonomous Agent would:

Form internal goals
- generate_internal_goals(state) -> List[Goal]
- Driven by curiosity, system health, and long-term objectives.
Prioritize without instruction
- prioritize(goals) -> ordered_goals
- Uses value/cost/risk, and adapts over time.
Initiate tasks without external triggers
- Runs autonomous_cycle() as a daemon, not just an HTTP endpoint.
Maintain memory of what it has done and learned
- Persistent logs, vector stores, metrics:
  - “What goals did I attempt?”
  - “What worked?”
  - “Where did I gain the most curiosity reward?”
Stay within a bounded domain of operation
- It doesn’t suddenly decide to hack the planet.
- Its entire goal-space is constrained by design.

Think of it as a low-volition organism:

Simple drives
Constrained environment
Real initiative

No magic. Just architecture.

V. Safety: Why Early Self-Starting AI Will Be More Predictable Than You Think

Popular imagination jumps straight to runaway AGI.
Reality: the first self-starting systems will be:

Narrow
Heavily sandboxed
Guardrail-bound

Three safety layers make this practical:

5.1 Goal-Space Constraints

Only allow goals that match a strict schema and domain:

ALLOWED_DOMAINS = {"pricing_research", "copywriting_optimization", "support_analytics"}

def goal_in_allowed_domain(goal: Goal) -> bool:
    return any(domain in goal.description for domain in ALLOWED_DOMAINS)

5.2 Guardrail-Bound Motivations

Every goal goes through a policy check:

def passes_policy_review(goal: Goal) -> bool:
    return goal.risk < 0.5

5.3 A Safety Wrapper Around Execution

def safe_execute(goal: Goal, state: Dict):
    if not goal_in_allowed_domain(goal):
        print(f"[BLOCKED] Out-of-domain goal: {goal.description}")
        return

    if not passes_policy_review(goal):
        print(f"[BLOCKED] Goal failed policy review: {goal.description}")
        return

    plan = llm_plan(goal, state)
    # (Optionally policy-check the plan as well)
    execute_plan(plan)

Then plug this into the loop:

def autonomous_cycle():
    global goal_store

    while running:
        state = read_system_state()
        new_goals = generate_internal_goals(state)
        store_goals(new_goals)

        goals = get_all_goals()
        prioritized = prioritize(goals)

        if prioritized:
            best_goal = prioritized[0]
            safe_execute(best_goal, state)
            goal_store = [g for g in goal_store if g.id != best_goal.id]

        time.sleep(CYCLE_SECONDS)

The result:

The system is autonomous within a sandbox.
Its motivational engine is vertically constrained: it can only care about what you allow.

That makes the first self-starting AIs more predictable than humans, not less.

VI. The Three-Layer Architecture: Where the Genesis Spark Lives

We can now outline a clean three-layer architecture.

1. Cognitive Layer (Today’s LLM + Tools)

Skills: reasoning, planning, codegen, tool calling, reflection.
This is your GPT/Claude/etc + tool ecosystem.
It knows how to do things, once asked.

2. Motivational Layer (The Genesis Spark)

New components:

Curiosity Engine → computes internal rewards from novelty/prediction error
Goal Generator → turns world state + curiosity into candidate goals
Priority Scheduler → orders goals by value/cost/risk
Self-Wake Loop → a daemon that never waits for user input

This is the action initiation engine.

3. Control & Safety Layer

Policy models / rules engines
Tool whitelists and environment permissions
Hard constraints on domains and goal types
Logging, observability, human override

This keeps the Motivational Layer inside a human-approved playpen.

You can visualize the flow like this:

We already built the Cognitive Layer in Parts I–III.
We already have fragments of the Control Layer in today’s guardrails and policies.

What’s missing—and what this article sketches—is the Motivational Layer:

A compact, engineerable, testable component that turns internal signals into persistent, self-initiated activity.

VII. Conclusion: The Moment the Domino Pushes Itself

LLMs gave us reasoning.
Tool calling added hands.
Memory added continuity.
Multi-agent systems added specialization.
Planning added coherence.
Reflection added self-improvement.
Identity scaffolds added persistence.

But none of that makes a mind start.

A system becomes an agent the moment it can say:

“I can see what I don’t know.
I can turn that lack into a goal.
I can decide to act on it—without waiting to be asked.”

That is the Genesis Spark.
That is action initiation.
And that, more than parameter counts or benchmark scores, is the final step between powerful tools and something that looks a lot like AGI.