Reference Library

AutoKaam Playbook

30 tools, reviewed by an operator. INR pricing, the spec sheet I actually run, and the one thing each tool gets wrong.

Last reviewed across the whole library: 2026-05-06. 30 entries live.

Dev Tools

10 entries

Claude.ai Pro: ~Rs 1,700/mo (USD 20). Claude Max (5h sessions): ~Rs 8,500/mo (USD 100). API Sonnet 4.6: USD 3 input / USD 15 output per 1M tokens. API Opus 4.7: USD 15 input / USD 75 output per 1M tokens.

Claude, Anthropic's Sonnet and Opus Families

The model I reach for when reasoning matters more than throughput.

Read playbook
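A quick way to sanity-check the API rates above before committing a batch job: per-call cost is just tokens times the per-million rate. A minimal sketch using the Sonnet and Opus numbers from this entry; the token counts are illustrative, not measurements.

```python
# Worked cost check for the per-1M-token rates quoted above.
# Rates come from this entry; the token counts are illustrative only.
RATES_USD_PER_1M = {
    "sonnet": {"input": 3.0, "output": 15.0},
    "opus": {"input": 15.0, "output": 75.0},
}

def call_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    r = RATES_USD_PER_1M[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# Example: a 40K-token prompt with a 2K-token reply.
print(f"Sonnet: ${call_cost_usd('sonnet', 40_000, 2_000):.3f}")  # ~$0.150
print(f"Opus:   ${call_cost_usd('opus', 40_000, 2_000):.3f}")    # ~$0.750
```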

Included with Claude Pro at no extra charge. Claude Max (recommended for empire-scale work): ~Rs 8,500/mo. Pay-as-you-go API also supported.

Claude Code, the CLI Agent I Run All Day

Anthropic's terminal-first coding agent; the empire's daily driver for code work.

Read playbook

Free tier with limits. Pro: Rs 1,700/mo (USD 20). Business: Rs 3,400/seat/mo (USD 40).

Cursor, the IDE I Tried and the Empire's Soft Pass

Genuinely good IDE-pane AI; I just prefer the terminal flow for empire work.

Read playbook

Free extension. API costs at vendor rate. Ollama-backed runs at zero marginal cost.

Cline, the VSCode Agent That Stays Out of My Way

The free open-source alternative to Cursor's agent mode, with bring-your-own-key.

Read playbook

Free open source. API costs at vendor rate. Sessions average Rs 200-400 with Sonnet.

Aider, the Terminal Pair-Programmer With a Soul

git-aware command-line AI coding; the original good open-source agent and still excellent.

Read playbook

Free tier (limited GPT-5.4-mini). ChatGPT Go: Rs 399/mo (India-specific). Plus: Rs 1,700/mo. Pro: Rs 16,500/mo.

ChatGPT, the Default Most Indians Started With

Still the most-used AI app in India, and a defensible choice for general-purpose work.

Read playbook

GPT-5.4: USD 5/15 per 1M tokens (input/output). GPT-5.4-mini: USD 0.15/0.6. Sora video: per-second, varies by quality. Embeddings: USD 0.10 per 1M tokens.

OpenAI API, the Developer Surface I Use Sparingly

GPT-5.4-mini for cost-sensitive batch work; structured output is the genuine win.

Read playbook
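Since structured output is called out as the genuine win, here is a minimal sketch of the JSON-schema response format through the official openai Python SDK. The model name follows this entry's naming and the schema is a made-up example; check the current SDK docs before leaning on the exact shape.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask for output that must conform to a JSON schema (illustrative schema).
resp = client.chat.completions.create(
    model="gpt-5.4-mini",  # model name as quoted in this entry
    messages=[{"role": "user", "content": "Extract city and PIN code: 'Ship to Indore 452001'"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "address",
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}, "pin": {"type": "string"}},
                "required": ["city", "pin"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
)
print(resp.choices[0].message.content)  # guaranteed-parseable JSON string
```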

Free tier with Gemini 3 Flash. Google AI Pro: Rs 1,950/mo (often bundled with YT Premium in India). Ultra tier: Rs 19,500/mo (Veo unlimited, longer context).

Gemini, Google's Surprisingly Strong Family

Gemini 3 Pro is the multimodal leader; Gemini 3 Flash is the cheap-and-fast bet.

Read playbook

Gemini 3 Pro: USD 3.50 / 10.50 per 1M tokens (input/output). Gemini 3 Flash: USD 0.10 / 0.40 (volatile; check current rates). Free tier available via AI Studio for development.

Gemini API, the Google Developer Surface

Cheap, fast, multimodal-strong; production-viable with caveats around pricing volatility.

Read playbook
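A minimal call sketch through the google-generativeai Python package. The model string follows this entry's "Gemini 3 Flash" naming and should be verified against whatever AI Studio currently lists.

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Model id follows this entry's naming; confirm the exact string in AI Studio.
model = genai.GenerativeModel("gemini-3-flash")

# Multimodal-friendly: generate_content also accepts images alongside text.
resp = model.generate_content("Summarise this invoice line in one sentence: 'GST 18% on Rs 4,200'")
print(resp.text)
```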

DeepSeek V3.2: USD 0.14 / 0.28 per 1M tokens (input/output, peak). Off-peak: ~50 percent discount. Available via OpenRouter at a small markup.

DeepSeek, the Cheapest Reasoning Tier I Trust

DeepSeek V3.2 at USD 0.14 per 1M is the budget-tier reasoning model that actually works.

Read playbook
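DeepSeek's API speaks the OpenAI wire format, so the same client works with a swapped base_url. A minimal sketch; the model id ("deepseek-chat" here) is an assumption to map against the V3.2 tier quoted above, and the off-peak window should be checked on their pricing page.

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint; only the base_url and key change.
client = OpenAI(
    api_key="sk-...",                    # DeepSeek key, not an OpenAI key
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-chat",  # assumed id; map it to the V3.2 tier quoted above
    messages=[{"role": "user", "content": "Classify this support ticket as billing/tech/other: ..."}],
)
print(resp.choices[0].message.content)

# Rough budget math at the peak rate quoted above (USD 0.14 in / 0.28 out per 1M):
# 1M input + 0.2M output ~= 0.14 + 0.056 = ~USD 0.20 per run; off-peak roughly half.
```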

Local LLMs

10 entries

Free, open source. Compute cost on consumer hardware is electricity, well under a rupee per active inference hour on a 65W desktop at typical Indian tariffs.

Ollama, the Local Model Runtime I Actually Trust

One binary, one model registry, zero cloud dependency. The default I reach for first.

Read playbook
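Once the daemon is running, Ollama serves a local HTTP API on port 11434. A minimal sketch with requests, assuming a Gemma-class model has already been pulled.

```python
import requests

# One-shot generation against the local Ollama daemon (default port 11434).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma2:2b",   # assumes `ollama pull gemma2:2b` has been run
        "prompt": "Rewrite this field note as two bullet points: ...",
        "stream": False,        # return one JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["response"])
```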

Free, open source. Compile time on an M75q is under two minutes; on a Pi 4B, about ten minutes.

llama.cpp, the Engine Under Most Local Inference

Compile once, run anything. Where I go when Ollama does not expose the knob I need.

Read playbook
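When the knob you need sits below Ollama's surface, the llama-cpp-python bindings expose llama.cpp's parameters directly. A minimal sketch, assuming a GGUF file already on disk (the path is a placeholder).

```python
from llama_cpp import Llama

# Load a local GGUF with the low-level knobs Ollama hides behind defaults.
llm = Llama(
    model_path="./models/model.gguf",  # placeholder path to a downloaded GGUF
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload every layer if a GPU is present, else ignored
    n_threads=6,       # pin CPU threads explicitly
)

out = llm("Q: What is the GST rate on restaurant bills in India?\nA:", max_tokens=64, stop=["\n"])
print(out["choices"][0]["text"].strip())
```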

Free for personal use. The commercial-use license is in flux for 2026; treat it as not licensed for production.

LM Studio, the GUI On-Ramp for People Who Hate Terminals

Polished desktop app for local models. I do not run it, but it converts non-developers fast.

Read playbook

Free, open source. Compute cost via RunPod is about Rs 42 per hour for L4 (24GB VRAM) and Rs 250 to Rs 800 per hour for A100 or H100.

vLLM, Serving Throughput That Defends a GPU Bill

Production inference for teams. I only run it on RunPod, never local, and the math works.

Read playbook
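The throughput case is easiest to see in vLLM's offline batch mode: one engine, many prompts, continuous batching. A minimal sketch, assuming a CUDA GPU (the L4 box mentioned above) and an open-weight model small enough to fit; the model id is an example.

```python
from vllm import LLM, SamplingParams

# Offline batch inference: one engine instance amortised across the whole prompt list.
llm = LLM(model="google/gemma-2-9b-it")  # example open-weight model; needs a real GPU

params = SamplingParams(temperature=0.2, max_tokens=128)
prompts = [f"Summarise review #{i} in one line: ..." for i in range(500)]

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```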

Free, open weights. Compute cost is local hardware electricity, effectively zero for personal use.

Gemma, the Open Family I Actually Reach For

Google's open-weight line. The 2B is my Pi default; the 9B is my desktop default; vision is the one I run daily.

Read playbook

Free open weights. Cerebras Cloud free tier: 30 RPM. Cerebras paid tier: roughly Rs 50 per 1M input tokens and Rs 100 per 1M output for qwen-3-235B.

Qwen, Where Cerebras Speed Plus Open Weights Actually Compose

Alibaba's model family; my pick when grunt extraction has to fly through the 30 RPM free quota.

Read playbook
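The free-tier quota is requests per minute, so the trick is pacing, not parallelism. A minimal sketch of a throttled extraction loop against Cerebras's OpenAI-compatible endpoint; the base URL and model id are assumptions to verify in their console.

```python
import time
from openai import OpenAI

# Cerebras exposes an OpenAI-compatible API; base_url and model id are assumptions.
client = OpenAI(api_key="csk-...", base_url="https://api.cerebras.ai/v1")

REQUESTS_PER_MINUTE = 30
SLEEP_S = 60 / REQUESTS_PER_MINUTE  # 2 seconds between calls keeps the free tier happy

docs = ["raw page 1 ...", "raw page 2 ...", "raw page 3 ..."]
for doc in docs:
    resp = client.chat.completions.create(
        model="qwen-3-235b",  # model id per this entry; confirm the exact string
        messages=[{"role": "user", "content": f"Extract company name and city as JSON:\n{doc}"}],
    )
    print(resp.choices[0].message.content)
    time.sleep(SLEEP_S)  # stay under 30 RPM
```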

Empire grant of 200M credits, valid till 2026-05-28; after that, OpenRouter rates apply: K2.6 about USD 1 / USD 3 per 1M tokens, V2-Flash about USD 0.09 / USD 0.29 per 1M.

Xiaomi MiMo, the Empire Grunt LLM I Got 200M Credits Of

K2.6 for craft, V2-Pro for extract, V2-Flash for cheap. Wired into 4 ingest scripts as primary.

Read playbook
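The craft/extract/cheap split above is just a routing table in practice. A minimal sketch of how an ingest script might pick a tier through OpenRouter's OpenAI-compatible API; the model ids mirror this entry's naming and are assumptions, as is the routing itself.

```python
from openai import OpenAI

# OpenRouter speaks the OpenAI wire format; the model ids below mirror this entry's
# naming and are assumptions, not verified catalogue ids.
client = OpenAI(api_key="sk-or-...", base_url="https://openrouter.ai/api/v1")

MODEL_TIERS = {
    "craft":   "xiaomi/mimo-k2.6",      # long-form writing and review passes
    "extract": "xiaomi/mimo-v2-pro",    # structured pulls from messy sources
    "cheap":   "xiaomi/mimo-v2-flash",  # bulk classification and dedupe
}

def run(tier: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL_TIERS[tier],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(run("cheap", "Is this listing a duplicate of 'Sharma Traders, Indore'? yes/no: ..."))
```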

Free open weights. Compute cost on consumer hardware is unfavorable above 8B-class. Hosted API is roughly Rs 12 per 1M input tokens, Rs 23 per 1M output for V3.2.

DeepSeek Local, the Pricing Disruptor I Mostly Run Hosted

The V3.2 weights are open. I downloaded them, learned the lesson, and went back to the API.

Read playbook

Free, open source. Compute cost on consumer hardware is electricity, effectively zero per hour of audio.

Whisper, the Local Transcription I Run on Every Voice Memo

OpenAI's open ASR model. Powers the empire field-note pattern. Zero rupees per hour of audio.

Read playbook
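The field-note pattern is essentially one function call once the open-source whisper package and ffmpeg are installed. A minimal sketch; the file name is a placeholder.

```python
import whisper

# Load a local Whisper checkpoint; "small" is a reasonable CPU-friendly middle ground.
model = whisper.load_model("small")

# Transcribe a voice memo (ffmpeg must be on PATH for audio decoding).
result = model.transcribe("field-note-2026-05-06.m4a", language="en")

print(result["text"])                 # full transcript
for seg in result["segments"]:        # timestamped segments, handy for note indexing
    print(f"[{seg['start']:.1f}s] {seg['text'].strip()}")
```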

Free, open source. CPU image generation is uneconomic in real time. A RunPod L4 batch run costs roughly Rs 14 per 200 images, including warm-up.

Diffusers, the Image-Gen Path I Use Sparingly

Hugging Face's library. Honest about what does not work without a real GPU.

Read playbook
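On a rented L4 the economics only work if you batch, so the sketch below loads the pipeline once and loops prompts through it. The model id is an example, and the whole thing assumes a CUDA GPU, not a CPU.

```python
import torch
from diffusers import DiffusionPipeline

# Load once (the expensive part), then amortise warm-up across the whole batch.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example model id
    torch_dtype=torch.float16,
).to("cuda")

prompts = [f"flat vector illustration of a chai stall, variant {i}" for i in range(8)]
for i, prompt in enumerate(prompts):
    image = pipe(prompt, num_inference_steps=30).images[0]
    image.save(f"out_{i:03d}.png")
```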

Workflows

4 entries

Infra

6 entries

Free open source; running cost depends on host (Rs 1,200/mo Oracle ARM via Coolify is the empire baseline)

FastAPI, the Empire's Default Python Serving Layer

Async by default, OpenAPI for free, and the framework I reach for first when I need an HTTP endpoint.

Read playbook
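"Async by default, OpenAPI for free" shows up in the smallest possible app: declare the route and /docs is generated for you. A minimal sketch; the endpoint and model are made up.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="empire-endpoint")  # interactive OpenAPI docs served at /docs

class Lead(BaseModel):
    name: str
    city: str

@app.post("/leads")
async def create_lead(lead: Lead) -> dict:
    # The request body is validated against the Lead model before this runs.
    return {"ok": True, "lead": lead.model_dump()}

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000
```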

Free open source; running cost depends on host (Rs 1,200/mo Oracle ARM via Coolify supports 5+ empire apps)

PocketBase, the Empire Backend I Run Across Every Project

SQLite plus auth plus realtime plus admin UI in one binary; I have not regretted picking it once.

Read playbook
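PocketBase ships as one binary, but from Python you just talk to its REST API. A minimal sketch with requests; the collection names, record fields, and credentials are all placeholders.

```python
import requests

BASE = "http://127.0.0.1:8090"  # default PocketBase address; adjust for your host

# Authenticate against a placeholder 'users' auth collection.
auth = requests.post(
    f"{BASE}/api/collections/users/auth-with-password",
    json={"identity": "ops@example.com", "password": "change-me"},
).json()
token = auth["token"]

# Create a record in a placeholder 'leads' collection.
rec = requests.post(
    f"{BASE}/api/collections/leads/records",
    headers={"Authorization": token},
    json={"name": "Sharma Traders", "city": "Indore"},
).json()
print(rec["id"])
```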

Free open source; Oracle ARM free tier covers 4-core 24GB; Rs 1,200/mo for paid scale-out instances

Coolify on Oracle ARM, the Empire Hosting Stack

The Heroku-style PaaS I actually run, on the cheapest serious cloud bare-metal in 2026.

Read playbook

Free at entry; Pro from Rs 2,000/mo (only needed for advanced WAF / analytics)

Cloudflare Pages, the Empire Static Hosting I Have Never Regretted

Free static hosting, fast global edge, and the autokaam.com deploy backbone.

Read playbook

Free at 100K daily requests; paid plans from Rs 500/mo for higher volume and larger CPU budgets

Cloudflare Workers, the Edge Runtime I Use for the Sharp Bits

100K free daily requests, sub-50ms cold starts, and the empire's ads-txt + redirect layer.

Read playbook

Airtel bundle (free for 500 transactions/mo) until 2026-04-30; direct plans from Rs 4,000/mo at the entry tier

Adobe PDF Services, the Empire's PDF Backbone

Real Adobe-grade PDF tooling for free via the Airtel-bundled S2S subscription, 500 transactions a month.

Read playbook