How Claude Code Works - Harness, Context, and Memory Explained
How Claude Code Works - Harness, Context, and Memory Explained
This post is written for users who have used Claude Code a few times. It may be overwhelming for first-time users. Surprisingly, over 60% of this post was written by Claude Code itself.
While using Claude Code, I kept wondering, “How does this thing actually work internally?” I figured that understanding the internals would help me use it better, so I read through the official documentation and summarized what I learned.
Surprisingly, Claude Code also wrote about 60% of this post.
Before We Begin
Claude Code is constantly being updated, so referring to the official
documentation is always the best approach. This post is based on my
experience as of February 2026.
- Official docs: https://code.claude.com/docs/en/settings
Summary
- Claude Code is an Agentic Harness that wraps an AI model to make it actually capable of doing work
- Each session has its own Context, which is the entire set of information sent to the AI model
- Context includes Memory (CLAUDE.md, Auto Memory), which persists across sessions
- Unfortunately, this post does not cover Tools, Sandbox, Permission, or Extensions (MCP, Skills, Hooks). I plan to write about those separately if time permits.
What Is Claude Code?
Let’s dive into what Claude Code actually is. To understand Claude Code, you need to know three concepts: AI Model, Harness, and AI Agent.
AI Model: Inference
The Claude AI model is an LLM (Large Language Model). An LLM probabilistically predicts the next token and selects the one with the highest probability. This process is called inference.
As of February 2026, Claude Code only supports Claude AI models, and
you can switch between models using the /model
command.
During inference, the term “token” comes up. A token is the smallest unit that an AI model uses to process text. An important distinction: token ≠ word. In English, roughly 1 word ≈ 1 token, but in Korean, a single character can be split into multiple tokens.
There are two types of tokens:
| Type | Description |
|---|---|
| Input token | Everything the model reads: user questions, system prompts, prior context |
| Output token | The response the model generates |
Since tokens are the processing unit of AI models, they are used to
measure model usage. In Claude Code, you can check your token usage with
the /stats command.
# Check token usage
/stats
Harness: The System That Wraps the AI Model to Make It Work
An AI model cannot read files, modify code, or execute commands on its own. It only generates text by predicting tokens. Therefore, a system is needed to wrap the AI model and enable it to actually perform work. This system is called the Harness.
The word “harness” originally refers to the gear placed on a horse to control it. The concept is similar in AI. A Harness is the entire system that wraps an AI model (LLM) to use tools, manage permissions, and maintain state. Just as an OS manages a computer system, a Harness manages an AI system.
CLAUDE.md, Skills, and similar components are part of the Harness.
AI Agent: AI Model + Harness
An AI Agent is software that leverages an AI model. An AI Agent uses an AI model to autonomously make decisions and take actions (Act). In doing so, the AI Agent also utilizes the Harness that manages the AI system.
| Component | Role |
|---|---|
| AI Model | Generates text. Cannot act on its own |
| Harness | Executes tools, manages permissions, maintains state and context |
| AI Agent | Autonomously makes decisions and takes actions |
Claude Code Is an Agentic Harness
So where does Claude Code fit among AI Model, Harness, and AI Agent? Based on the official documentation, I believe Claude Code is an Agentic Harness. It wraps the Claude AI model and provides tool execution, context management, and an execution environment.
Claude Code serves as the agentic harness around Claude: it provides the tools, context management, and execution environment that turn a language model into a capable coding agent.
Source: https://code.claude.com/docs/en/how-claude-code-works
When a user says “fix this bug,” the Claude AI model decides “I need to read the file,” and Claude Code (Harness) actually reads the file and returns the result to the model. When the model then decides “I need to modify this part,” Claude Code makes the code change. This cycle repeats until the bug is fixed.
How Does Claude Code Work? (Agentic Loop)
Claude Code (Harness) operates through a repeating cycle called the Agentic Loop. It repeats three steps:
Agentic Loop documentation: https://code.claude.com/docs/en/how-claude-code-works
- Gather Context: Reads files, searches code, and understands the project structure
- Take Action: Modifies code and executes commands
- Verify Results: Runs tests and checks results
The Agentic Loop is the core of how Claude Code works. When the AI model decides “what to do next,” Claude Code (Harness) actually executes the tool and passes the result back to the AI model. The model then decides its next action based on that result. This cycle continues until the task is complete.
We, the users, are also part of this loop. We can intervene at any point to change direction, provide additional information, or ask it to try a different approach.
Components of Claude Code (Harness)
Claude Code provides six main capabilities as a Harness. This post only covers Sessions and Context (including Memory). I plan to cover the rest separately if time permits.
| Component | Description |
|---|---|
| Tools | Bash, file read/write, web search, MCP servers — executes real actions |
| Permission | Per-tool permission settings, user approval gates |
| Sandbox | Executes tools safely in an isolated environment |
| Session | Connection unit with the user. Session save and restore |
| Context (incl. Memory) | Maintains conversation context, CLAUDE.md, Context Window management |
| Extensibility | Extends functionality via MCP, Skills, and Hooks |
Session
What Is a Session?
A Session is the connection that persists from when Claude
Code starts until it is terminated. Users make work requests to
Claude Code through sessions. Running claude in the
terminal starts a new session, and exiting with /exit or
Ctrl+C ends the session.
In web browsers or desktop apps, a session is created when a user makes a work request.
When a Session Starts
When a session is created, Claude Code performs initialization tasks to prepare for work. This includes loading MCP servers, Skills, Context, and more.
A session holds data for each session, which is called context. Context is maintained in memory during the session and backed up to a local file (JSONL). For more details about context, see the Context chapter.
Checking the Session ID
Each session has a unique ID. You can find the session ID in
/status.
/statusWhere Is Session Context Stored?
Session context is maintained in memory and backed up to a local file (JSONL). The file serves as a backup, so it is not used during the active session. The backup exists for the purpose of restoring terminated sessions.
The file storage path is ~/.claude/projects/.
~/.claude/projects/{project-path}/{session-UUID}.jsonl
For example, if you run Claude Code in the my-project
directory, the session file is created at the following path:
~/.claude/projects/my-project/f41ff972-7bab-4719-8a7a-e564732bdaf0.jsonl
In JSONL (JSON Lines) files, the session context is recorded line by
line. Running claude multiple times in the same directory
creates that many JSONL files. For example, I asked a question about
sessions in Claude Code like this:
My question is first saved to the context, then backed up to the
JSONL file. Note that JSONL files can be filtered with the
jq command.
cat f41ff972-7bab-4719-8a7a-e564732bdaf0.jsonl | jq -r 'select(.type == "user") | .message.content'Restoring a Previous Session
Because session context is backed up to files, even if you accidentally close a session, you can restore it. There are two options for resuming a previous session:
–continue: Resume the Most Recent Session
claude --continue finds the most recently used session’s
JSONL file in the current directory and reloads the conversation
context.
claude --continueHow is the “most recent session” determined? There is no pointer like git HEAD. It simply selects the most recent file based on the JSONL file’s modification time (mtime).
An important caveat: --continue searches for session
files based on the current directory. If you run it from a different
directory than where the previous session was executed, you’ll get an
error like this:
$ claude --continue
No conversation found to continue–resume: Select a Specific Session
claude --resume lets you select a session from a list.
You don’t need to memorize session IDs — just pick from the list to
resume a session.
claude --resumeContext
What Is Context?
AI models do not remember previous conversations. Every request is processed as a new one. That’s why Claude Code (Harness) constructs the necessary context for each request and sends it to the AI model.
Context is the entire set of information that Claude Code (Harness) sends along with each request to the AI model. As explained in the Harness chapter, context management is a core responsibility of Claude Code.
Types of Context
The main types of context are as follows:
| Type | Description |
|---|---|
| System prompt | Claude Code’s default system instructions |
| System tools | Tool definitions (Read, Edit, Bash, etc.) |
| Skills | Skill metadata |
| Messages | Conversation history so far (questions + responses + tool results) |
| Autocompact buffer | Compressed context |
Context Window (Size)
The Context Window is the maximum size that context can hold. The Claude model’s context window is 200k tokens.
As conversations accumulate, the context grows, and when it approaches the 200k token limit, management is needed. The management methods are Context Reset and Context Compaction, explained below.
Memory
Among the elements loaded into Context, there is something special. Memory is instructions/information that persists across sessions. While other Context content (conversations, tool results) disappears when a session ends, Memory is permanently stored in files and reloaded into context in the next session.
Reference: https://code.claude.com/docs/en/memory
Claude Code’s Memory comes in two main types:
1. CLAUDE.md files: Instructions/rules written directly
by the user
2. Auto Memory: Learning content automatically recorded
by Claude during work
CLAUDE.md
CLAUDE.md is an instruction document for Claude
Code. It is automatically read and loaded into context when a
session starts. You can edit it with the /memory
command.
| Type | Location | Purpose | Sharing Scope |
|---|---|---|---|
| User memory | ~/.claude/CLAUDE.md |
Personal settings (global) | Only you |
| Project memory | ./CLAUDE.md |
Shared project rules | Entire team (git) |
| Project memory (local) | ./CLAUDE.local.md |
Personal project settings | Only you (auto-gitignored) |
| Project rules | .claude/rules/*.md |
Modular project rules | Entire team (git) |
Auto Memory
Auto Memory is learning content automatically recorded by Claude during work. It automatically saves project patterns, debugging insights, user preferences, and more.
As of February 2026, Auto Memory is being gradually rolled out, so
some users have it enabled by default while others do not. You can check
whether Auto Memory is enabled with the /memory
command.
/memoryTo enable Auto Memory, set the environment variable
CLAUDE_CODE_DISABLE_AUTO_MEMORY:
CLAUDE_CODE_DISABLE_AUTO_MEMORY=0The storage location is
~/.claude/projects/<project>/memory/. The
MEMORY.md file serves as an index, and detailed content is
managed in separate files.
~/.claude/projects/<project>/memory/
├── MEMORY.md # Index (first 200 lines loaded at session start)
├── debugging.md # Debugging pattern notes
├── api-conventions.md # API design decisions
└── ...
Context Reset (/clear)
/clear is a command that resets the current session’s
context. All conversation history, file contents read, etc. are
cleared and the session returns to its initial state. Memory
(CLAUDE.md, Auto Memory) is stored in files, so it gets reloaded.
/clearAn important distinction: /clear does not delete
the JSONL session file. It only resets the context — the
session file remains intact.
Context Compaction
As conversations grow longer, the context keeps getting larger. What happens when it approaches the 200k token limit? If Auto Compact is enabled, it compresses automatically. If disabled, Claude Code notifies the user to compress the context.
Users can manually compress context with the /compact
command.
/compactAfter context compaction, you may lose detailed information from previous conversations. Since the summary only retains the essentials, it may not be able to answer detailed questions like “show me the changes you made in the third file earlier.”
References
- How Claude Code works: https://code.claude.com/docs/en/how-claude-code-works
- Claude Code settings: https://code.claude.com/docs/en/settings
- Claude Code CLI reference: https://code.claude.com/docs/en/cli-reference
- Claude Code sandbox release: https://www.anthropic.com/engineering/claude-code-sandboxing
- Claude Code compaction: https://platform.claude.com/docs/en/build-with-claude/compaction#how-compaction-works
- Claude Code context window: https://platform.claude.com/docs/en/build-with-claude/context-windows
Comments
Post a Comment