How Claude Code Works - Harness, Context, and Memory Explained

image reference: ChatGPT

This post is written for users who have used Claude Code a few times. It may be overwhelming for first-time users. Surprisingly, over 60% of this post was written by Claude Code itself.

While using Claude Code, I kept wondering, “How does this thing actually work internally?” I figured that understanding the internals would help me use it better, so I read through the official documentation and summarized what I learned.

Surprisingly, Claude Code also wrote about 60% of this post.

Before We Begin

Claude Code is constantly being updated, so referring to the official documentation is always the best approach. This post is based on my experience as of February 2026.
- Official docs: https://code.claude.com/docs/en/settings

Summary

Claude Code is an Agentic Harness that wraps an AI model to make it actually capable of doing work
Each session has its own Context, which is the entire set of information sent to the AI model
Context includes Memory (CLAUDE.md, Auto Memory), which persists across sessions
Unfortunately, this post does not cover Tools, Sandbox, Permission, or Extensions (MCP, Skills, Hooks). I plan to write about those separately if time permits.

What Is Claude Code?

Let’s dive into what Claude Code actually is. To understand Claude Code, you need to know three concepts: AI Model, Harness, and AI Agent.

AI Model: Inference

The Claude AI model is an LLM (Large Language Model). An LLM probabilistically predicts the next token and selects the one with the highest probability. This process is called inference.

As of February 2026, Claude Code only supports Claude AI models, and you can switch between models using the /model command.

During inference, the term “token” comes up. A token is the smallest unit that an AI model uses to process text. An important distinction: token ≠ word. In English, roughly 1 word ≈ 1 token, but in Korean, a single character can be split into multiple tokens.

There are two types of tokens:

Type	Description
Input token	Everything the model reads: user questions, system prompts, prior context
Output token	The response the model generates

image reference: ChatGPT

Since tokens are the processing unit of AI models, they are used to measure model usage. In Claude Code, you can check your token usage with the /stats command.

# Check token usage
/stats

Harness: The System That Wraps the AI Model to Make It Work

An AI model cannot read files, modify code, or execute commands on its own. It only generates text by predicting tokens. Therefore, a system is needed to wrap the AI model and enable it to actually perform work. This system is called the Harness.

The word “harness” originally refers to the gear placed on a horse to control it. The concept is similar in AI. A Harness is the entire system that wraps an AI model (LLM) to use tools, manage permissions, and maintain state. Just as an OS manages a computer system, a Harness manages an AI system.

CLAUDE.md, Skills, and similar components are part of the Harness.

AI Agent: AI Model + Harness

An AI Agent is software that leverages an AI model. An AI Agent uses an AI model to autonomously make decisions and take actions (Act). In doing so, the AI Agent also utilizes the Harness that manages the AI system.

Component	Role
AI Model	Generates text. Cannot act on its own
Harness	Executes tools, manages permissions, maintains state and context
AI Agent	Autonomously makes decisions and takes actions

Claude Code Is an Agentic Harness

So where does Claude Code fit among AI Model, Harness, and AI Agent? Based on the official documentation, I believe Claude Code is an Agentic Harness. It wraps the Claude AI model and provides tool execution, context management, and an execution environment.

Claude Code serves as the agentic harness around Claude: it provides the tools, context management, and execution environment that turn a language model into a capable coding agent.
Source: https://code.claude.com/docs/en/how-claude-code-works

When a user says “fix this bug,” the Claude AI model decides “I need to read the file,” and Claude Code (Harness) actually reads the file and returns the result to the model. When the model then decides “I need to modify this part,” Claude Code makes the code change. This cycle repeats until the bug is fixed.

How Does Claude Code Work? (Agentic Loop)

Claude Code (Harness) operates through a repeating cycle called the Agentic Loop. It repeats three steps:

Agentic Loop documentation: https://code.claude.com/docs/en/how-claude-code-works

Gather Context: Reads files, searches code, and understands the project structure
Take Action: Modifies code and executes commands
Verify Results: Runs tests and checks results

The Agentic Loop is the core of how Claude Code works. When the AI model decides “what to do next,” Claude Code (Harness) actually executes the tool and passes the result back to the AI model. The model then decides its next action based on that result. This cycle continues until the task is complete.

We, the users, are also part of this loop. We can intervene at any point to change direction, provide additional information, or ask it to try a different approach.

Components of Claude Code (Harness)

Claude Code provides six main capabilities as a Harness. This post only covers Sessions and Context (including Memory). I plan to cover the rest separately if time permits.

Component	Description
Tools	Bash, file read/write, web search, MCP servers — executes real actions
Permission	Per-tool permission settings, user approval gates
Sandbox	Executes tools safely in an isolated environment
Session	Connection unit with the user. Session save and restore
Context (incl. Memory)	Maintains conversation context, CLAUDE.md, Context Window management
Extensibility	Extends functionality via MCP, Skills, and Hooks

Session

What Is a Session?

A Session is the connection that persists from when Claude Code starts until it is terminated. Users make work requests to Claude Code through sessions. Running claude in the terminal starts a new session, and exiting with /exit or Ctrl+C ends the session.

In web browsers or desktop apps, a session is created when a user makes a work request.

When a Session Starts

When a session is created, Claude Code performs initialization tasks to prepare for work. This includes loading MCP servers, Skills, Context, and more.

A session holds data for each session, which is called context. Context is maintained in memory during the session and backed up to a local file (JSONL). For more details about context, see the Context chapter.

Checking the Session ID

Each session has a unique ID. You can find the session ID in /status.

/status

Where Is Session Context Stored?

Session context is maintained in memory and backed up to a local file (JSONL). The file serves as a backup, so it is not used during the active session. The backup exists for the purpose of restoring terminated sessions.

The file storage path is ~/.claude/projects/.

~/.claude/projects/{project-path}/{session-UUID}.jsonl

For example, if you run Claude Code in the my-project directory, the session file is created at the following path:

~/.claude/projects/my-project/f41ff972-7bab-4719-8a7a-e564732bdaf0.jsonl

In JSONL (JSON Lines) files, the session context is recorded line by line. Running claude multiple times in the same directory creates that many JSONL files. For example, I asked a question about sessions in Claude Code like this:

My question is first saved to the context, then backed up to the JSONL file. Note that JSONL files can be filtered with the jq command.

cat f41ff972-7bab-4719-8a7a-e564732bdaf0.jsonl |  jq -r 'select(.type == "user") | .message.content'

Restoring a Previous Session

Because session context is backed up to files, even if you accidentally close a session, you can restore it. There are two options for resuming a previous session:

–continue: Resume the Most Recent Session

claude --continue finds the most recently used session’s JSONL file in the current directory and reloads the conversation context.

claude --continue

How is the “most recent session” determined? There is no pointer like git HEAD. It simply selects the most recent file based on the JSONL file’s modification time (mtime).

An important caveat: --continue searches for session files based on the current directory. If you run it from a different directory than where the previous session was executed, you’ll get an error like this:

$ claude --continue
No conversation found to continue

–resume: Select a Specific Session

claude --resume lets you select a session from a list. You don’t need to memorize session IDs — just pick from the list to resume a session.

claude --resume

Context

What Is Context?

AI models do not remember previous conversations. Every request is processed as a new one. That’s why Claude Code (Harness) constructs the necessary context for each request and sends it to the AI model.

Context is the entire set of information that Claude Code (Harness) sends along with each request to the AI model. As explained in the Harness chapter, context management is a core responsibility of Claude Code.

Types of Context

The main types of context are as follows:

Type	Description
System prompt	Claude Code’s default system instructions
System tools	Tool definitions (Read, Edit, Bash, etc.)
Skills	Skill metadata
Messages	Conversation history so far (questions + responses + tool results)
Autocompact buffer	Compressed context

Context Window (Size)

The Context Window is the maximum size that context can hold. The Claude model’s context window is 200k tokens.

As conversations accumulate, the context grows, and when it approaches the 200k token limit, management is needed. The management methods are Context Reset and Context Compaction, explained below.

Memory

Among the elements loaded into Context, there is something special. Memory is instructions/information that persists across sessions. While other Context content (conversations, tool results) disappears when a session ends, Memory is permanently stored in files and reloaded into context in the next session.

Reference: https://code.claude.com/docs/en/memory

Claude Code’s Memory comes in two main types:
1. CLAUDE.md files: Instructions/rules written directly by the user
2. Auto Memory: Learning content automatically recorded by Claude during work

CLAUDE.md

CLAUDE.md is an instruction document for Claude Code. It is automatically read and loaded into context when a session starts. You can edit it with the /memory command.

Type	Location	Purpose	Sharing Scope
User memory	`~/.claude/CLAUDE.md`	Personal settings (global)	Only you
Project memory	`./CLAUDE.md`	Shared project rules	Entire team (git)
Project memory (local)	`./CLAUDE.local.md`	Personal project settings	Only you (auto-gitignored)
Project rules	`.claude/rules/*.md`	Modular project rules	Entire team (git)

Auto Memory

Auto Memory is learning content automatically recorded by Claude during work. It automatically saves project patterns, debugging insights, user preferences, and more.

As of February 2026, Auto Memory is being gradually rolled out, so some users have it enabled by default while others do not. You can check whether Auto Memory is enabled with the /memory command.

/memory

To enable Auto Memory, set the environment variable CLAUDE_CODE_DISABLE_AUTO_MEMORY:

CLAUDE_CODE_DISABLE_AUTO_MEMORY=0

The storage location is ~/.claude/projects/<project>/memory/. The MEMORY.md file serves as an index, and detailed content is managed in separate files.

~/.claude/projects/<project>/memory/
├── MEMORY.md          # Index (first 200 lines loaded at session start)
├── debugging.md       # Debugging pattern notes
├── api-conventions.md # API design decisions
└── ...

Context Reset (/clear)

/clear is a command that resets the current session’s context. All conversation history, file contents read, etc. are cleared and the session returns to its initial state. Memory (CLAUDE.md, Auto Memory) is stored in files, so it gets reloaded.

/clear

An important distinction: /clear does not delete the JSONL session file. It only resets the context — the session file remains intact.

Context Compaction

As conversations grow longer, the context keeps getting larger. What happens when it approaches the 200k token limit? If Auto Compact is enabled, it compresses automatically. If disabled, Claude Code notifies the user to compress the context.

Users can manually compress context with the /compact command.

/compact

After context compaction, you may lose detailed information from previous conversations. Since the summary only retains the essentials, it may not be able to answer detailed questions like “show me the changes you made in the third file earlier.”

References

How Claude Code works: https://code.claude.com/docs/en/how-claude-code-works
Claude Code settings: https://code.claude.com/docs/en/settings
Claude Code CLI reference: https://code.claude.com/docs/en/cli-reference
Claude Code sandbox release: https://www.anthropic.com/engineering/claude-code-sandboxing
Claude Code compaction: https://platform.claude.com/docs/en/build-with-claude/compaction#how-compaction-works
Claude Code context window: https://platform.claude.com/docs/en/build-with-claude/context-windows

Search This Blog

Akbun