How to Do Software Engineering in the Agent Era

— How OpenAI Used Codex to Build a One-Million-Line Codebase System

Recently, OpenAI’s engineering team published an article that is well worth reading for engineers:

Original: https://openai.com/zh-Hans-CN/index/harness-engineering/

The article describes an experiment they ran:

Build a real software product without a single line of the codebase being written by humans.

All code, including:

application logic
tests
CI/CD
documentation
ops scripts
internal tools

was generated entirely by Codex.

And the human engineers did only one thing:

Design the system, constraints, and feedback loops.

In the end, with 3 engineers, they built a system approaching 1 million lines of code in 5 months.

This article actually reveals an important trend:

Software engineering is shifting from “writing code” to “designing systems that AI can work in.”

Below I’ll break down the core content of the article and analyze what it means for future engineering practices.

1. The experiment: a product whose code is generated entirely by AI

The OpenAI team started from an empty Git repository.

The initial architecture was generated automatically by Codex + GPT-5, including:

project structure
CI configuration
code formatting rules
package management
application framework
AGENTS.md

They even went this far:

The documentation that tells the AI how to work in the repository was also written by AI.

Results after 5 months:

about 1 million lines of code
1500+ Pull Requests
only 3 engineers on the team
an average of 3.5 PRs per engineer per day
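
As a quick sanity check, the arithmetic below works out how many days of work the quoted PR rate implies. It uses only the figures listed above; treating a month as 30 calendar days is an assumption made for this estimate.

```python
# Figures quoted in the summary above (1500 is the stated lower bound).
total_prs = 1500
engineers = 3
prs_per_engineer_per_day = 3.5
months = 5

# Days each engineer would need at the quoted rate to reach the PR total.
days_implied = total_prs / engineers / prs_per_engineer_per_day

# Assumption: a month is treated as 30 calendar days.
calendar_days = months * 30

print(f"days implied by the PR rate: {days_implied:.0f}")
print(f"calendar days in {months} months: {calendar_days}")
```

The rate implies about 143 days per engineer, which is close to the 150 calendar days in five months, so the numbers are roughly consistent with one another.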

The system already had:

internal daily active users
external Alpha users

In other words:

This wasn’t a demo; it was a real, running product.

Their core principle was:

Humans steer, agents execute.

2. The engineer’s role changed

In this setup:

Engineers hardly write code.

Engineers’ work becomes:

1️⃣ design the system structure
2️⃣ define task goals
3️⃣ design feedback loops
4️⃣ build tools that AI can use

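In other words,

Before: engineers wrote the code themselves.

Now: engineers describe the goal and the constraints, and the agent writes the code.

The whole flow looks roughly like this: an engineer defines a task, Codex implements it and opens a PR, and the PR is reviewed and merged.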

Quite often:

PR review is also done by AI.

3. The system must be “readable to Agents”

A very important lesson:

Knowledge the agent can’t see effectively doesn’t exist.

For example:

Slack discussions
Google Docs
human memory

are all invisible knowledge to AI.

So they made a key decision:

Turn the code repository into the single source of truth.

Everything is written into the repo:

architecture docs
design docs
product specs
execution plans
technical debt
decision records

These documents are organized as a directory tree inside the repository, alongside the code.

Through this structure, AI can:

find information
reason about the system
modify the code

They call this approach:

Agent-readable repository
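
One way to keep the repository as the single source of truth is to have an automated check fail when a category of knowledge has no home in the repo. The sketch below is a minimal illustration, not the team's tooling. The directory names in `REQUIRED_DOCS` are hypothetical stand-ins for the document types listed above, and `missing_knowledge` is a made-up helper name.

```python
from pathlib import Path

# Hypothetical layout: one directory per knowledge type named in the text.
REQUIRED_DOCS = {
    "architecture docs": "docs/architecture",
    "design docs": "docs/design",
    "product specs": "docs/product-specs",
    "execution plans": "docs/exec-plans",
    "technical debt": "docs/tech-debt",
    "decision records": "docs/decisions",
}


def missing_knowledge(repo_root: Path) -> list[str]:
    """Return the knowledge types that have no Markdown file in the repo."""
    missing = []
    for kind, rel_dir in REQUIRED_DOCS.items():
        directory = repo_root / rel_dir
        # A directory counts only if it holds at least one .md file.
        has_docs = directory.is_dir() and any(directory.rglob("*.md"))
        if not has_docs:
            missing.append(kind)
    return missing
```

Calling `missing_knowledge(Path("."))` from a CI job and failing on a non-empty result would flag knowledge that still lives only in chat threads or in someone's head.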

4. Don’t write 1000-line prompts

They tried putting every rule and instruction into one giant AGENTS.md.

It failed completely.

Reasons:

1 Context windows are a scarce resource

Long documents crowd out:

code
tasks
important context

2 Too many rules become ineffective

When everything is important:

actually nothing is important.

3 Docs rot

If the maintenance cost is too high:

docs become:

a graveyard of outdated rules

What they ended up doing:

AGENTS.md is only a table of contents

It lists where the detailed documents live in the repo instead of repeating their content.

That is:

Give AI a map, not a manual.
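
The "map, not manual" idea can be checked mechanically: keep the file short, and make sure every document it points to exists. The sketch below assumes AGENTS.md uses ordinary Markdown links. The function name `check_agents_map` and the 100-line budget are hypothetical choices, not values from the article.

```python
import re
from pathlib import Path

# Matches Markdown links such as [Architecture](docs/architecture.md).
LINK_PATTERN = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def check_agents_map(repo_root: Path, max_lines: int = 100) -> list[str]:
    """Return problems found in AGENTS.md; an empty list means it passes."""
    text = (repo_root / "AGENTS.md").read_text(encoding="utf-8")
    problems = []

    # A table of contents should stay small (the budget is an assumption).
    line_count = len(text.splitlines())
    if line_count > max_lines:
        problems.append(f"AGENTS.md has {line_count} lines (budget {max_lines})")

    for target in LINK_PATTERN.findall(text):
        if target.startswith(("http://", "https://", "#")):
            continue  # only repo-relative links are checked
        path = target.split("#", 1)[0]  # drop any section anchor
        if not (repo_root / path).exists():
            problems.append(f"AGENTS.md links to missing file: {path}")
    return problems
```

A check like this also guards against doc rot: a link to a deleted document fails fast instead of quietly misleading the agent.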

5. Architecture must be strict

Agents have a characteristic in this environment:

They will copy existing patterns.

If the architecture isn’t strict:

the code will drift quickly.

So they enforced:

a strict layered architecture


Rules:

can only depend downward
cross-layer calls are not allowed

All rules are enforced automatically through:

custom linter
structural tests

That is:

Architectural constraints are enforced by machines.
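
A structural test of this kind can be quite small. The sketch below parses Python source with the standard `ast` module and reports imports that point upward in a layer list, which covers the "depend only downward" rule. The layer names in `LAYERS`, and the assumption that each layer is a top-level package, are hypothetical. The article's actual layers and linter are not reproduced here.

```python
import ast
from typing import Optional

# Hypothetical layers, lowest first. A module may import only from
# its own layer or from layers that appear earlier in this list.
LAYERS = ["model", "repo", "service", "ui"]


def layer_of(module_name: str) -> Optional[int]:
    """Return the layer index of a dotted module name, or None if unlayered."""
    top = module_name.split(".")[0]
    return LAYERS.index(top) if top in LAYERS else None


def upward_imports(module_name: str, source: str) -> list[str]:
    """List absolute imports in `source` that reach into a higher layer."""
    own_layer = layer_of(module_name)
    if own_layer is None:
        return []
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]
        else:
            continue  # relative imports and other nodes are ignored here
        for target in targets:
            target_layer = layer_of(target)
            if target_layer is not None and target_layer > own_layer:
                violations.append(f"{module_name} imports {target}")
    return violations
```

For example, `upward_imports("repo.users", "import service.billing")` reports one violation, because the hypothetical `repo` layer sits below `service`. Running such a check in CI means an agent that copies a bad pattern gets a failing build rather than a merged PR.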

6. Throughput changed engineering culture

Agents write code far faster than humans.

That creates a new problem:

code throughput far exceeds humans’ ability to review.

So they changed traditional engineering culture:

Before: a change waited for careful human review before it could merge.

Now: changes merge quickly, and problems are fixed in follow-up changes.

Because:

The cost of waiting is higher than the cost of fixing.

7. AI can automatically complete the entire development workflow

As tooling has improved, Codex can now complete the entire development workflow on its own:

1️⃣ validate repo state
2️⃣ reproduce a bug
3️⃣ record a bug video
4️⃣ implement the fix
5️⃣ run the app to verify
6️⃣ record a fix video
7️⃣ open a PR
8️⃣ handle review
9️⃣ fix CI
🔟 auto-merge

Humans only step in when necessary.
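
The ten steps can be pictured as a pipeline that runs until something fails and only then asks for a person. The sketch below shows that control flow and nothing more; it is not how Codex is implemented. `run_workflow` and the `run_step` callable are hypothetical, and each real step would call out to the agent and its tools.

```python
from typing import Callable, Optional

# The ten steps listed above, in order.
STEPS = [
    "validate repo state",
    "reproduce the bug",
    "record a bug video",
    "implement the fix",
    "run the app to verify",
    "record a fix video",
    "open a PR",
    "handle review",
    "fix CI",
    "auto-merge",
]


def run_workflow(run_step: Callable[[str], bool]) -> tuple[list[str], Optional[str]]:
    """Run the steps in order and stop at the first failure.

    Returns the completed steps and the step that needs a human,
    or None when the whole workflow finished on its own.
    """
    completed = []
    for step in STEPS:
        if not run_step(step):
            return completed, step  # escalate this step to a person
        completed.append(step)
    return completed, None
```

With a `run_step` that always succeeds, the second return value is `None` and no human is involved. With one that fails on "fix CI", the function returns the nine completed steps and names "fix CI" as the point where attention is needed.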

8. A new problem: AI will create “technical debt”

Fully AI-driven development brings one issue:

AI will copy wrong patterns.

For example:

inconsistent implementations
unreasonable abstractions
duplicated tooling

Their initial approach was:

clean up AI code every Friday.

Later they found it didn’t scale.

Their final solution was:

Write engineering principles into the code.

For example:

ban YOLO parsing
enforce typed SDK
prioritize shared utility libraries

Then:

run a Refactor Agent regularly.

Like:

automatic garbage collection (GC).
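
To make the first two principles concrete: reading "YOLO parsing" as reaching into untyped data and hoping the fields are there, the alternative is to validate the shape once at the boundary and hand typed objects to the rest of the code. The sketch below shows that pattern with a standard-library dataclass. The `PullRequest` type, its fields, and `parse_pull_request` are hypothetical examples, not part of any SDK mentioned in the article.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PullRequest:
    number: int
    title: str
    merged: bool


def parse_pull_request(raw: dict[str, Any]) -> PullRequest:
    """Validate an untyped payload once, at the boundary."""
    number = raw.get("number")
    title = raw.get("title")
    merged = raw.get("merged", False)
    # bool is a subclass of int, so reject it explicitly for `number`.
    if not isinstance(number, int) or isinstance(number, bool):
        raise ValueError("'number' must be an integer")
    if not isinstance(title, str) or not title:
        raise ValueError("'title' must be a non-empty string")
    if not isinstance(merged, bool):
        raise ValueError("'merged' must be a boolean")
    return PullRequest(number=number, title=title, merged=merged)
```

Code past this function works with `PullRequest` and never touches the raw dictionary, so a linter rule can ban raw dictionary access outside the boundary module and a periodic refactoring pass has one obvious pattern to converge on.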

9. The truly scarce resource: human attention

The article ends with a single sentence:

The bottleneck in software engineering is no longer writing code, but human attention.

The core questions of future engineering systems become:

how to design environments
how to design feedback loops
how to design constraint systems

rather than:

writing code.

10. What this implies for the future of software engineering

The article reveals a very important trend:

Software engineering is entering:

Agent-first engineering

Future engineers will be more like:

system designers

rather than:

code laborers

Core capabilities in the future may be:

architecture design
Agent workflow
engineering automation
feedback loop design
engineering knowledge modeling

rather than:

writing CRUD

Closing

This article really shows one thing:

AI is not just a code-writing tool.

It is changing:

the structure of software engineering itself.

Future software teams may look like this one: a few engineers steering agents that write the code.

And the core value of engineers will become:

Designing systems in which AI can work efficiently.
