For Artur & Girlfriend — prepared by Claude, based on Alfonso's month of agent-building learnings so you can skip the hard parts.
A personal AI agent that lives on your laptop, talks to you through a messaging app, and remembers your preferences over time. Think of it like having a quirky assistant that can search the web, manage files, and do tasks on its own schedule.
The platform is called OpenClaw — it's open source, runs locally, and gives you a real agent (not just a chatbot). You'll connect it to an LLM API for the brain, a messaging app for the interface, and optionally give it skills, memory, and a personality.
The best part: OpenClaw supports multiple agents on one laptop. So you can each have your own agent — with its own name, personality, memory, and connected tools — all running from the same gateway. More on that later.
OpenClaw — the agent platform. Runs on your laptop as a local gateway. It handles the agent loop, memory, skills, heartbeat (autonomous behavior), and messaging integrations. Install it, configure one JSON file, and you're running.
An LLM API — the brain. This is the service your agent calls to think. You'll need an API key from a provider. There are two ways providers charge: flat-rate plans and per-token pricing. Here's the difference.
An API (Application Programming Interface) is how your agent talks to the AI model. When you send a message to your agent, it packages that message and sends it over the internet to a service (like MiniMax or Google), which runs it through a large language model and sends a response back. The API is the door your agent knocks on. You get an API key (basically a password) that lets your agent access the service.
Tokens are the units AI models use to measure text. Roughly, 1 token is about 3/4 of a word. "Hello, how are you today?" is about 7 tokens. Every message you send costs input tokens, and every response costs output tokens.
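If you want a feel for the arithmetic, the rule of thumb above is easy to sketch in a few lines (a rough estimate only; real tokenizers split on subwords and punctuation, so actual counts vary):

```python
def estimate_tokens(text):
    """Back-of-envelope token estimate: 1 token is roughly 3/4 of a word,
    i.e. about 4/3 tokens per word. Real tokenizers differ a bit."""
    return round(len(text.split()) * 4 / 3)

print(estimate_tokens("Hello, how are you today?"))  # -> 7, matching the example above
```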
Flat-rate plans (like MiniMax Coding Plans) give you a fixed number of prompts per time window for a monthly fee. You don't think about tokens — you just send messages until you hit your limit, then wait for the window to roll over.
Per-token pricing (pay-as-you-go) charges you based on exactly how much text goes in and out. Cheaper if you use it lightly, but costs can add up if your agent is doing a lot of autonomous work.
For a personal project, flat-rate is simpler to reason about. You know exactly what you're spending.
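To see how the two pricing models compare, here's the arithmetic with placeholder numbers. The per-token rates below are invented for illustration, not any provider's actual pricing; plug in real numbers from your provider's pricing page:

```python
# Illustrative flat-rate vs pay-as-you-go arithmetic. The per-token
# prices are made-up placeholders, NOT any provider's actual rates.
FLAT_RATE_MONTHLY = 10.00     # e.g. a $10/month flat plan
PRICE_IN_PER_M = 0.30         # assumed $ per million input tokens
PRICE_OUT_PER_M = 1.20        # assumed $ per million output tokens

def per_token_cost(input_tokens, output_tokens):
    """Monthly pay-as-you-go cost for a given token volume."""
    return (input_tokens / 1e6) * PRICE_IN_PER_M + (output_tokens / 1e6) * PRICE_OUT_PER_M

# A fairly chatty agent: 2,000 requests/month, ~1,500 tokens in, ~500 out each.
monthly = per_token_cost(2000 * 1500, 2000 * 500)
print(f"pay-as-you-go: ${monthly:.2f}/mo vs flat rate: ${FLAT_RATE_MONTHLY:.2f}/mo")
```

The point isn't which number wins at these made-up rates; it's that pay-as-you-go scales with usage, while the flat rate is a known fixed cost.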
MiniMax is an AI company (based in China, strong models) that offers flat-rate Coding Plans: $10, $20, and $50/month tiers. The $10 Starter gives you 1,500 model requests per rolling 5-hour window — that's a lot of back-and-forth with your agent throughout the day. The rolling window means your oldest usage "falls off" over time, not a hard daily reset.
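The rolling window is worth a moment, because it behaves differently from a daily reset. A minimal Python sketch of the idea (not OpenClaw's or MiniMax's actual implementation, just the concept):

```python
import time
from collections import deque

class RollingWindowQuota:
    """Sketch of a rolling-window limit (like 1,500 requests per 5 hours):
    old requests fall out of the window continuously instead of everything
    resetting at a fixed boundary."""

    def __init__(self, limit=1500, window_s=5 * 3600):
        self.limit = limit
        self.window_s = window_s
        self.stamps = deque()  # timestamps of requests still inside the window

    def try_request(self, now=None):
        now = time.time() if now is None else now
        # Requests older than the window have "fallen off" -- forget them.
        while self.stamps and now - self.stamps[0] >= self.window_s:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False

# Tiny demo: limit of 3 requests per 10 seconds.
quota = RollingWindowQuota(limit=3, window_s=10)
print([quota.try_request(now=t) for t in (0, 1, 2, 3, 12)])
# -> [True, True, True, False, True]: t=3 is over the limit, but by t=12
#    the t=0..2 requests have rolled off, so capacity frees up again.
```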
Their model is MiniMax-M2.7, which runs at around 50 tokens per second normally and up to 100 TPS during off-peak hours. It's OpenAI-compatible, which means most code examples and tools work with it out of the box. The plan also includes image understanding and web search via MCP (the Model Context Protocol, a standard that lets your agent use external tools).
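To see what "OpenAI-compatible" means in practice, here's a standard-library sketch of the request shape such providers accept. The base URL, key, and model name are placeholders; take the real values from your provider's docs:

```python
import json
from urllib import request

# "OpenAI-compatible" means the provider accepts the same JSON shape as
# OpenAI's /v1/chat/completions endpoint, so generic tooling just works.
BASE_URL = "https://api.example.com/v1"   # placeholder -- use your provider's URL
API_KEY = "sk-your-key-here"              # placeholder -- your real key
MODEL = "MiniMax-M2.7"

def build_chat_request(user_message):
    """Build (but don't send) an OpenAI-style chat completion request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 16384,
    }
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("Hello, agent!")
print(req.full_url)
# To actually send it: request.urlopen(req) -- needs a real key and endpoint.
```

You'll rarely write this by hand (OpenClaw does it for you); it's just to demystify what's on the wire.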
Alfonso currently uses MiniMax at $20/month for heavier workloads. $10 is a great starting point — you can bump up later if you want to give your agent more autonomous work.
Google offers free Gemini API access through AI Studio (aistudio.google.com). No credit card required. You sign up with a Google account and get an API key immediately. The free tier gives you access to models like Gemini 2.5 Pro with a massive 1-million-token context window.
The rate limits on the free tier are modest (5-15 requests per minute depending on the model, 100-1000 per day), but that's more than enough for getting started and experimenting before committing to a paid plan.
OpenClaw can run multiple fully isolated agents inside one gateway process. You don't need two laptops or two installs — you run one gateway, and it hosts two separate "brains" inside it.
Each agent gets its own:
So Artur's agent could be connected to his WhatsApp, and his girlfriend's agent to hers. Or you could both use Telegram with separate bots. Each agent has its own personality, its own memory, and its own set of tools.
This is where it gets interesting. Each agent can have different skills and integrations. For example:
The agents don't share memory or context unless you explicitly tell them to. They're fully independent.
This is especially relevant for Artur given his FP&A background. Google Cloud Console ($300 in free credits for each new account) isn't just about Gemini API calls — it's a gateway to Google's entire cloud platform:
The $300 in free credits covers a lot of experimentation with these services. And since each Google account gets its own API credentials, this naturally supports the "one agent per person" setup — each agent authenticates as a different user and sees only that user's data.
How you talk to your agent. OpenClaw supports several:
Use whatever you already have. The messaging layer is just plumbing. If you're running two agents, you can use the same platform with two separate bot accounts, or different platforms entirely.
Use GitHub from day one. Here's why:
Your agent's config, personality, and skills are all text files — they are your project. Git tracks every change, so when you tweak the personality and it gets weird, you can roll back in one command.
You'll change the personality 10 times the first weekend. Without version control, you'll lose the version that worked.
Two people, one project. You and your girlfriend can both make changes without overwriting each other. Git handles the merge. Without it, you'll be texting "don't touch the config file, I'm editing it."
It's free — private repos, unlimited. And it's your backup. Laptop dies, agent lives on GitHub. Clone it onto a new machine and you're back in 5 minutes.
You don't need to be a git expert. Learn git add, git commit -m "message", git push, and git pull. That's 95% of what you'll use.
These are the learnings from Alfonso's month of building and iterating on agents.
OpenClaw uses a file called SOUL.md — it defines who your agent is. Tone, style, boundaries, how it talks. This is the single most impactful file in your whole setup. Spend time on it. Write it together — it's actually fun.
A good SOUL.md has:
OpenClaw has built-in memory — your agent remembers past conversations and can search through them. This is what makes it feel like your agent instead of a generic chatbot. It works out of the box with local embeddings (no extra API cost).
Put a few files in its workspace: notes about yourselves, things you care about, project context. The agent will search these when relevant. The more context it has, the better it performs.
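Under the hood, memory search compares embeddings: vectors that encode meaning, so similar ideas end up near each other. A toy illustration with hand-made three-dimensional vectors (real embedding models produce hundreds of dimensions; this only shows why the agent finds notes by meaning rather than by keyword):

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction (same meaning-ish)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Pretend each note was embedded into a tiny 3-d vector by hand.
notes = {
    "Artur prefers short morning briefings": [0.9, 0.1, 0.0],
    "Favorite cuisine: Georgian": [0.1, 0.9, 0.1],
    "Project idea: budget tracker": [0.0, 0.2, 0.9],
}

# Pretend embedding of the query "how does Artur like briefings?"
query_vec = [0.8, 0.2, 0.1]
best = max(notes, key=lambda text: cosine(notes[text], query_vec))
print(best)  # -> "Artur prefers short morning briefings"
```

Notice the match works even though "briefings" never had to literally appear in the stored note; direction in vector space does the work.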
OpenClaw has a skill system — small capability packages your agent can use. There's a built-in set plus a community hub (ClawHub). Start with the defaults, then add what you need: web search, file management, scholarly search, or build your own (a skill is just a markdown file describing what the tool does).
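For a feel of what "a skill is just a markdown file" might look like, here's a hypothetical sketch. The section names and structure here are guesses for illustration; check OpenClaw's skill docs for the format it actually expects before copying this:

```markdown
# weather-check (hypothetical example skill)

Fetch the current weather for a named city.

## When to use
When the user asks about weather, forecasts, or whether to bring an umbrella.

## How
Call the weather tool with the city name, then summarize temperature and
conditions in one friendly sentence.
```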
The "heartbeat" is OpenClaw's mechanism for your agent to do things on its own — check in, reflect, run scheduled tasks. You can set it to send a morning briefing, do an evening reflection, or proactively work on things you've assigned.
Start simple: maybe a morning message and an evening summary. You can make it more autonomous later.
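As a starting point, a HEARTBEAT.md along these lines captures the "morning message, evening summary" idea. Treat it as a sketch; the exact format OpenClaw expects may differ:

```markdown
<!-- HEARTBEAT.md: what the agent does when it wakes up (illustrative sketch) -->
- Morning (first beat after 7am): send a short briefing -- weather, today's plans, one suggestion.
- Evening (first beat after 9pm): summarize the day and ask one reflective question.
- Otherwise: stay quiet unless an assigned task is pending.
```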
Things Alfonso spent time on that you should skip:
Don't optimize retrieval weights. Alfonso literally wrote a research paper on this. For a personal agent, the defaults are fine. The agent can just search again if it doesn't find what it needs.
Don't over-engineer the architecture. Alfonso's first system was 19,000 lines of Python across 34 files with 18 background services. His current setup does more useful work with ~800 lines and 1 service. Start minimal. Add complexity only when you feel the pain of not having it.
Don't use Docker unless you already love Docker. It eats 2-4GB of RAM for no reason on a personal laptop. OpenClaw runs natively.
Don't stress about the "best" model. MiniMax M2.7, Gemini, Claude — they're all strong enough for a personal agent. Start with whatever's cheapest and easiest, and you can swap models later since OpenClaw makes it a one-line config change.
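Concretely, swapping models would just mean editing the default model name in openclaw.json, for example (the model names are illustrative):

```json
{
  "models": {
    "default": "MiniMax-M2.7"
  }
}
```

Change that one string to another provider's model name (and its API key elsewhere in the config), restart the gateway, and your agent has a new brain.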
Here's a loose plan for when you two sit down to do this. No rush — the whole point is to have fun with it.
Session 1: Get it running (2-3 hours)
1. Install OpenClaw on the laptop (follow their docs, it's straightforward)
2. Sign up for MiniMax Starter ($10/mo) or grab a free Gemini API key from AI Studio
3. Get the API key, put it in your OpenClaw config (openclaw.json)
4. Set up your messaging bot (WhatsApp or Telegram)
5. Send your first message — confirm it responds
Session 2: Make it yours (2-3 hours)
1. Write your SOUL.md together — give your agent a name, personality, vibe
2. Create a USER.md with context about both of you
3. Add a few skills from ClawHub
4. git init and push to a private GitHub repo
5. (Optional) Set up the second agent with its own workspace and messaging binding
Whenever you feel like going deeper:
1. Set up a morning briefing cron job
2. Add some files to the agent's memory (notes, preferences, project ideas)
3. Start a conversation thread and see how it remembers context
4. Tweak the personality based on how the first few days felt
Your main config file is openclaw.json. The key fields:
```json
{
  "models": {
    "default": "your-model-name"
  },
  "messaging": {
    "telegram": { ... }
  },
  "heartbeat": {
    "every": "6h"
  },
  "memory": {
    "enabled": true
  }
}
```
The workspace directory (~/.openclaw/workspace/) is where your agent's personality and memory files live: SOUL.md, USER.md, MEMORY.md, HEARTBEAT.md. These are the files you'll edit most.
For multiple agents, each agent gets its own workspace under ~/.openclaw/agents/ — same file structure, fully isolated.
Hard-won insights from Alfonso's build that'll save you time:
Frameworks add more complexity than they solve at your scale. Alfonso ripped out an agent framework and replaced it with 280 lines of direct API calls. The result was simpler, faster, and easier to debug. OpenClaw already IS your framework — don't add another one on top.
Commit your config changes before experimenting. git commit -m "working personality v3" before you start tweaking. You'll thank yourself.
If you change a config file, restart the gateway. The number one "bug" is editing a file and wondering why nothing changed. The running process still has the old version.
Set token limits generously. If your agent's responses are getting cut off or it's doing weird multi-step loops, your max_tokens is too low. 16,384 is a safe default. Too low causes truncation cascades that cost MORE tokens, not fewer.
Temperature 0.3 for utility, 0.7+ for creativity. Lower temperature = more focused, shorter responses. Higher = more creative but verbose.
Your agent will be bad at first. That's normal. The personality takes iteration, the memory takes time to build up, and you'll learn what instructions actually stick vs. what gets ignored. Give it a week of daily use before judging.
Prepared by Claude, drawing on Alfonso's experience: a month of agent engineering, 10,000+ agent iterations, 4,500+ task episodes, and a published research paper on retrieval optimization. Compiled so you can skip straight to the fun part.
(March 2026)