🧠Claudeintermediate

Claude Code in India: How I Actually Run It Daily (Max 20x, Fork Subagents, the 5-Hour Meter)

Claude Code is my daily driver across a multi-site empire. Install and INR pricing in five minutes, then the operator settings nobody tells you: fork-subagent cache sharing, the session meter that burns invisibly, and the MCP toolbox I lean on.

By··8 min read·Reviewed
Claude Code Setup in India, Install, Pricing (INR), and Your First Project, claude on AutoKaam

Claude Code is Anthropic's command-line coding agent, the official way to drive Claude Opus from a terminal. It is my daily driver, running across a dozen-plus projects on one Linux box in India: a jobs site, a finance newsroom, govt-form automation, trading bots, and the content pipeline behind this site.

Most setup guides stop at install and pricing. That part takes five minutes. The part that decides whether Claude Code is a toy or a force multiplier is the operator layer underneath: how the session meter actually bills, the one env flag that fits twice as many subagents in the same cap, and the MCP tools that replace whole categories of one-off scripts.

Why terminal, not IDE

Cursor is a VS Code fork. Claude Code is terminal-native, file-aware, git-aware, with a 1M-token context window. If you work over SSH, live in tmux or neovim, run a commit-heavy workflow, or navigate large codebases, the terminal agent fits the hand better. I keep both: Claude Code for planning and large multi-file work, Cursor for quick autocomplete.

Install in five minutes

Prerequisite: Node 18+. It installs from npm.

npm install -g @anthropic-ai/claude-code

# or with bun
bun install -g @anthropic-ai/claude-code

claude --version

Pricing in INR, and which plan a solo operator wants

Two auth paths.

Subscription (what I run)

  • Pro: $20/mo (~Rs 1,660). Covers light, occasional Claude Code use.
  • Max 5x: $100/mo (~Rs 8,300). Higher limits, priority.
  • Max 20x: $200/mo (~Rs 16,600). The realistic floor if Claude Code is your primary tool.

If you code four-plus hours a day across multiple projects, Max 20x is the only subscription tier that stops you hitting the wall mid-afternoon. At ~Rs 16,600 a month it replaces far more in pay-as-you-go API spend for the same volume, and the billing is flat: a heavy day costs the same as a light one. For a solo operator running an empire of small projects, that flat ceiling is the whole point.

claude
# first run prompts login, browser opens, auth flow completes

API key (overflow and pipelines)

export ANTHROPIC_API_KEY="sk-ant-..."
# add to .bashrc / .zshrc for persistence

The API rail bills per token, separate from your Max window. The smart move is not "Max or API", it is knowing which work belongs on which rail, and I will come back to it.

One thing the Max OAuth path does not get: most Anthropic beta headers (server-side compaction, the memory tool, code execution, Files API, tool search) are API-key only. The CLI says so in claude --help. If a tutorial tells you to enable a beta on a Max plan, it silently does nothing.

Your first project

Point it at a real repo and give it a concrete task.

cd ~/my-project && claude
Read src/app/page.tsx and refactor it into smaller components.
Create src/components/ with logical splits.
Update imports. Run tsc --noEmit to verify no errors.

It reads the file, proposes a plan, asks permission on each edit, creates the components, then runs the type check. You approve or reject every step. The interactive permission model is slower than a fully autonomous agent and it is the reason I trust it on a live empire: nothing gets deleted behind my back.

The 5-hour meter nobody warns you about

This is the single most expensive misunderstanding on the Max plan, and it cost me a postmortem to learn.

Max does not meter only token-equivalent rupees. It also meters per the 5-hour session window, counted by connections and requests, not just tokens burned. The two budgets move independently. You can have a healthy cache hit rate, a low token-dollar figure for the day, and still watch the Max percentage bar burn down fast.

The trap is headless one-shot runs. Every fresh claude -p pays a fixed tax: a new API connection, full session priming (system prompt, CLAUDE.md, skills, MCP handshake), and a cache write a one-turn session never reads back. Stack enough and the connection count, not the token count, hits the ceiling.

My rules of thumb for one-shot claude -p:

  • Under 10 a day: fine.
  • 10 to 30 a day: noticeable.
  • Over 30 a day: problematic.
  • Over 70 a day: fatal to the window.

So before wiring claude -p per item into a pipeline, count the calls per day. If it clears 10, redesign it as one long-running session that reuses cache, or move it onto the direct-API rail, which bills rupees and never touches the Max window. A loop firing a fresh agent per draft is the classic way to nuke your own afternoon.

The one env flag that doubles your agent budget

If you run subagent fanouts, this is the highest-leverage line in your shell config.

# ~/.bashrc, Claude Code v2.1.117 or newer
export CLAUDE_CODE_FORK_SUBAGENT=1

With this set, forked subagents inherit the parent session's prompt cache and read it at roughly 10 percent of full input cost. Without it, every subagent cold-starts and pays full freight. On a large fanout the gap is dramatic: a fresh-context spread of nineteen agents can burn close to a million input tokens, while the same fanout with shared cache lands under half that. It fits a bit over twice as many agents inside one cap window.

Two caveats from running it for real:

  • Cron does not load ~/.bashrc. If a scheduled job does fanout, set the flag inside the crontab line or wrapper script, or it silently runs cold.
  • Warm the cache first. Fire one subagent, wait for its first streamed token (that hydrates the shared cache), then fan out the rest. Confirm it works via cache_read_input_tokens in a subagent response: after warmup it should be most of the input.

Design for the cache first

The Claude Code team treats cache hit rate like uptime, with alerts when it drops, and real production sessions sit around 96 percent. On a Max plan that hit rate maps straight to how much cap headroom you have left, so cache design is the budget, not a micro-optimisation. What actually moves the number:

  • Keep the system prompt slim. Every byte of your global CLAUDE.md is re-sent as a cache read on every message. A bloated context file can quietly dominate your whole month's token volume even at the discounted rate. Push detail into on-demand files.
  • Put dynamic content at the end. Timestamps, request IDs, per-run context go after the stable prefix. The auto cache breakpoint lands after the last cacheable block, so volatile data up front changes the prefix hash every request and you never get a hit.
  • Lock tool order. Tool definitions cache as one prefix. Reorder them between subagent calls and you invalidate the whole tool cache.
  • Never swap models mid-session. Switching Opus to a smaller model partway through costs more than letting Opus finish, because the cache invalidates. Route by session boundary.

/fast mode on Opus

When I need quicker turnarounds without dropping to a weaker model, /fast toggles a faster output path on Opus. Same model, same reasoning quality, lower response latency. The right lever when wall-clock time is the bottleneck, not when you need a cheaper model.

The MCP toolbox I actually use

Out of the box Claude Code is useful. The next half hour of setup is where it becomes the daily driver. MCP servers add new tools to the agent. Register them in ~/.claude/.mcp.json and run /mcp to confirm they connected. The three I lean on as an Indian operator:

  • adobe-pdf: Adobe PDF Services over REST for govt forms, invoices, OCR, and table extraction with proper layout. It replaced pdftk and PyPDF in my workflow entirely.
  • gemma-vision: a local vision model on my own GPU for screenshot debugging and document data extraction. Running locally, it does the image work without spending Anthropic context tokens.
  • claude-context: semantic search across my indexed repos. I reach for it before grep -r on any cross-repo refactor.

The pattern that matters: reach for an MCP tool before writing a one-off script or installing a library. Most glue work you would hand-roll is already a tool call. When no existing server covers your stack, write a custom MCP server in Python and the agent gains a typed tool for exactly your job.

Hooks and custom slash commands

Hooks are shell commands the harness runs on lifecycle events. They live in ~/.claude/settings.json under the hooks key. I use a SessionStart hook that loads a per-project RESUME.md if one exists in the working directory, so the agent wakes up knowing where I left off.

{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "test -f ./RESUME.md && cat ./RESUME.md || true" }
        ]
      }
    ]
  }
}

The output is injected into context, and because the hook lives in the global settings file it survives across every session and project.

Custom slash commands are markdown files in ~/.claude/commands/<name>.md. Each becomes a /<name> command, and the body of the file is the prompt that runs.

mkdir -p ~/.claude/commands
cat > ~/.claude/commands/lint.md <<'EOF'
Run `npm run lint` and `npm run typecheck`. Fix any errors you can without
changing public APIs. Report what you fixed and what needs a human.
EOF

The CLAUDE.md file in a project root is read automatically every session. Keep it tight, since per the cache rule above it costs you on every message.

# Project Context

## Stack
- Next.js + TypeScript + Tailwind
- PocketBase backend
- Deployed on Cloudflare Pages

## Conventions
- One component per file in src/components/
- No Docker, direct installs only

Indian network note

On some ISPs the route to Anthropic is slow. Cloudflare WARP (free) usually fixes it: warp-cli register then warp-cli connect.

When things break

Auth token expired mid-session. You see "Authentication failed" partway through. Run claude logout, restart, finish the browser flow. Context is lost on re-auth, so save key state to a file first.

MCP server failed to start. A bad entry fails silently. Run claude --debug once and read the startup log. Most common cause: a wrong path in ~/.claude/.mcp.json. Second: the server's own dependency missing.

Context approaching the limit on a 1M run. Warnings arrive near the top of the window. Type /compact to summarise prior turns and reset. Save anything you do not want compressed to a file first, since compaction is lossy by design.

Tool denied without a clear reason. Check your permission mode with /status. The deny-list in ~/.claude/settings.json blocks dangerous patterns by default. If a command is legitimate, allowlist it explicitly. Never reach for --dangerously-skip-permissions as a shortcut.

Claude Code vs Codex CLI vs Aider

Codex CLI is OpenAI's terminal agent, currently routing GPT-5.x. More autonomous loop, fewer permission prompts, longer multi-step runs before checking in. Trade-off: less surgical control, and it occasionally races into edits you did not intend.

Aider is the open-source veteran. Bring your own API key, works with any OpenAI-compatible endpoint, lightest footprint of the three. Permission model is git-based, every edit is a commit, so revert is one command. No native MCP, no 1M context.

Claude Code is the most polished, the only one with a 1M context window plus native MCP, and the interactive permission model is the feature, not the friction. For a solo operator running many small projects from India, it is my default daily driver.

Quick takeaways

  • Max 20x (~Rs 16,600/mo) is the realistic floor if Claude Code is your primary tool. The flat ceiling is the value.
  • The 5-hour window meters connections and requests, not just tokens. Over 30 headless claude -p runs a day burns it invisibly. Push high-frequency jobs to the API rail.
  • export CLAUDE_CODE_FORK_SUBAGENT=1 shares the parent cache at ~10 percent cost, fitting roughly twice as many agents per window. Cron needs the flag set explicitly.
  • Design for prompt caching first: slim system prompt, dynamic content last, locked tool order, no mid-session model swaps. /fast gives faster Opus output at the same quality when wall-clock time is the bottleneck.

Also see the Cursor vs Copilot comparison for a decision framework.