Project Scope Document
Version 2 · February 2026

Building Your Personal
AI Knowledgebase

A practical guide for synthesising years of collected research, articles, documents, and websites using modern AI tools — designed for a Mac user who wants real answers from their own content. Now updated with Readwise API integration details, file sync options, and a new phase on making your knowledgebase publicly accessible.

📍 Mac · Australia 📦 Medium Scale (500–2000 files) 🎯 Synthesis-focused 💰 Subscription-friendly
What's new in v2
  • Confirmed: Readwise API access is free with your existing subscription — no extra charge
  • Updated Phase 3 to use AnythingLLM's built-in Readwise connector (no manual export needed)
  • Added file sync options: REST API scripting and Claude Cowork for automated local file updates
  • Updated Safari Reading List guidance — switch to Readwise Reader browser extension going forward
  • New Phase 4: Making your knowledgebase publicly accessible (ngrok, Cloudflare Tunnel, cloud hosting)
  • Updated budget with optional cloud hosting cost for public access
  • Added Month 3 to the Getting Started roadmap

How an AI Knowledgebase Works

Before diving into tools, it helps to understand the underlying idea. Traditional search finds documents that contain your keywords. An AI knowledgebase does something much more powerful: it understands your content and can synthesise answers across many sources at once — like having a research assistant who has read everything you've collected.

The technical term is RAG — Retrieval Augmented Generation. In plain language: when you ask a question, the system finds the most relevant pieces from your documents, then an AI model reads those pieces and writes you a synthesised answer, complete with source references. You're not limited to "which document mentioned this?" — you can ask "what do my sources collectively say about X?"

Layer 1

📥 Capture

Getting your content in — PDFs, web pages, newsletters, Word files, presentations, and emails.

Layer 2

🗄️ Index

The system reads and "understands" your content, storing it so an AI can search it by meaning, not just keywords.

Layer 3

💬 Query

You ask questions in plain English and get synthesised answers drawn from your own material.
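
To make the three layers above concrete, here is a toy sketch in Python. It is not how NotebookLM or AnythingLLM work internally: real systems compare meaning using embeddings, whereas this sketch swaps in simple word-count similarity so that it runs with no extra installs. The file names and texts are made up for illustration, and the final prompt is only built, not sent to any AI model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs):
    """Layer 2 (Index): store one word-count vector per document."""
    return {name: Counter(tokenize(text)) for name, text in docs.items()}


def retrieve(index, question, k=2):
    """Layer 3 (Query), part 1: rank documents by similarity to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(index, key=lambda name: cosine(index[name], q), reverse=True)
    return ranked[:k]


def build_prompt(docs, sources, question):
    """Layer 3 (Query), part 2: package the retrieved text, with source names, for a model."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in sources)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


# Layer 1 (Capture): hypothetical sample content standing in for your files.
docs = {
    "sleep.pdf": "Regular sleep improves memory and decision making.",
    "budget.docx": "A monthly budget tracks income and spending.",
    "focus.md": "Deep work and good sleep both improve decision making.",
}
question = "What improves decision making?"
index = build_index(docs)
sources = retrieve(index, question)
prompt = build_prompt(docs, sources, question)
```

The source names in square brackets are what lets the final answer cite which documents it drew from.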

Note on your paid Substack content: Readwise Reader lets you forward your Substack emails to a special inbox address. Because a paid subscriber receives the full article by email, Readwise captures the complete content, with no scraping and no terms-of-service issues.

Tool Approaches: Pros & Cons

There are five realistic approaches for your situation, ranging from "start today, no setup" to "powerful but takes a weekend." All tools are available in Australia.

Option 1 — Google NotebookLM
Free · Web-based

Google's AI research tool lets you upload documents and websites, then ask questions that get synthesised answers with citations. Very polished and easy to use — you can be up and running in an hour.

✓ Pros
  • Completely free
  • Excellent synthesis quality
  • Always cites its sources
  • Handles PDF, Word, websites
  • No technical setup needed
✗ Cons
  • 50 sources per notebook max
  • No automation — all manual upload
  • Can't handle Excel or PPT natively
  • No email/newsletter integration
  • Your data goes to Google
Option 3 — Mem.ai
All-in-one · ~$23 AUD/mo

Mem is an AI-powered note-taking and knowledgebase tool that tries to automatically connect ideas across everything you add to it. It's closer to a smart notebook than a document search tool.

✓ Pros
  • All-in-one: capture & query
  • Automatically surfaces connections
  • Clean, simple interface
  • Good at creative synthesis
✗ Cons
  • More expensive for what you get
  • Weaker on large file ingestion
  • No Excel/PPT support
  • Less control over source fidelity
  • Your data is in their cloud
Option 5 — Custom RAG Pipeline
Technical · Variable cost

Build your own system using developer tools like LlamaIndex or LangChain, a vector database (like Chroma), and a frontend. This is what professional AI developers build, and gives you complete control over every aspect of the system.

✓ Pros
  • Total control and flexibility
  • Full automation possible
  • Can handle any data source
  • Excellent learning opportunity
✗ Cons
  • Requires coding knowledge
  • Multi-week setup project
  • You maintain it yourself
  • Overkill for personal use

The Implementation Plan

Given your content mix — local files, web reading list, paid newsletters, and the desire to synthesise rather than just search — the best approach is a two-tool combination, introduced in phases so you're getting value from day one without being overwhelmed.

The stack: Readwise Reader (to capture web content and newsletters) + AnythingLLM (to query your local files, then connect everything via Readwise's built-in connector). Together these cover 100% of your content sources. You can start with Phase 1 alone and add subsequent phases when ready.

Phase 1 — Set Up Your Content Capture (Week 1)
Readwise Reader + NotebookLM · ~2–3 hours total setup
1

Sign up for Readwise Reader

Go to read.readwise.io. There is a 60-day free trial, then roughly $10–13 USD/month ($9.99/month billed annually, or $12.99 month-to-month). Download the Mac app, and the iOS app if you read on your phone.

2

Install the browser extension (Safari or Chrome)

This adds a button to your browser. When you're on any webpage, click it to instantly save the full article to your Readwise library — formatted cleanly, with no ads. This becomes your replacement for Safari's Reading List going forward.

3

Import your existing Safari Reading List (one-time)

In Safari: File → Export → Bookmarks. This gives you an HTML file. In Readwise, go to Import → Reading List and upload it. Readwise will attempt to fetch the full content of each URL.

💡 Do this in batches of 100–200 at a time to avoid timeouts
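
One way to prepare those batches is sketched below in Python. It assumes the Safari export is a standard bookmarks HTML file whose links sit in `<a href="...">` tags; the batch size of 150 is just a value inside the suggested 100–200 range, and how each batch is then handed to Readwise is left open.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect every web link found in the <a> tags of a bookmarks export."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> also matches.
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith("http"):
                self.urls.append(href)


def extract_urls(html):
    """Return all http(s) links in the order they appear."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.urls


def batches(items, size=150):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

For example, `batches(extract_urls(open("bookmarks.html").read()))` would return a list of URL lists, where `bookmarks.html` is a placeholder for whatever name you gave the export.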
4

Set up your Readwise email address for newsletters

In Readwise settings, you'll find a personal email address (e.g. yourname@readwise.io). Forward your Substack and Medium subscription emails to this address. Future emails will automatically appear in your Readwise library with full article content — including paid Substack newsletters since the full text arrives in your inbox.

💡 In Gmail/Apple Mail: create a filter to auto-forward emails from substack.com and medium.com
5

Retrieve your Readwise API key (you'll need it in Phase 3)

Go to readwise.io/access_token while logged in. Copy and save your API key somewhere safe. This is included free with your subscription — there's no additional charge or tier required to use it. You'll use this to connect Readwise directly to AnythingLLM in Phase 3.

💡 Free with subscription — no extra cost
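
If you want to confirm the key works before Phase 3, a short check like the following may help. This is a sketch that assumes Readwise's token-check endpoint (`/api/v2/auth/`, which answers 204 for a valid token, as I understand its API documentation); the function names are my own.

```python
import urllib.error
import urllib.request

# Endpoint as described in Readwise's API documentation (assumption: unchanged since).
AUTH_URL = "https://readwise.io/api/v2/auth/"


def build_auth_request(token):
    """Prepare a GET request carrying the token in the Authorization header."""
    return urllib.request.Request(AUTH_URL, headers={"Authorization": f"Token {token}"})


def token_is_valid(token):
    """Return True if Readwise answers 204 (No Content) for this token."""
    try:
        with urllib.request.urlopen(build_auth_request(token), timeout=10) as resp:
            return resp.status == 204
    except urllib.error.HTTPError:
        # A rejected token comes back as an HTTP error status.
        return False
```

Calling `token_is_valid("paste-your-key-here")` needs an internet connection; avoid saving the real key inside a script you might share.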
6

Start using NotebookLM for synthesis

Go to notebooklm.google.com (free). Create your first notebook on a topic. Upload relevant PDFs and Word docs from your hard drive. Add website URLs from your Readwise library. Ask it questions like "synthesise what my sources say about X" or "what are the common themes across these articles?"

Phase 2 — Ingest Your Local Files into AnythingLLM (Week 2–3)
AnythingLLM · ~3–4 hours setup, handles all your local files
1

Download AnythingLLM for Mac

Go to anythingllm.com and download the desktop app. It's a standard Mac installer — open it like any other app.

2

Get an Anthropic API key

Go to console.anthropic.com, create an account, and add a small amount of credit (~$10 USD to start). This is what AnythingLLM uses to power the AI responses. You're charged by usage (the amount of text processed per query), so casual use costs cents per day.

💡 Anthropic's Claude is recommended over OpenAI for synthesis quality
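
To see roughly where "a few dollars a month" comes from, here is a small worked estimate in Python. The per-token prices and the token counts per query are assumptions for illustration only; check console.anthropic.com for current rates and your own usage.

```python
# Assumed prices in USD per million tokens (illustrative, not authoritative).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def query_cost(input_tokens, output_tokens):
    """Cost in USD of one query, given tokens sent in and tokens generated."""
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000


def monthly_cost(queries_per_day, input_tokens, output_tokens, days=30):
    """Cost in USD of a month of similar queries."""
    return queries_per_day * days * query_cost(input_tokens, output_tokens)


# Hypothetical usage: 10 questions a day, each sending ~6,000 tokens of
# retrieved document text and receiving a ~500-token answer.
per_query = query_cost(6000, 500)
per_month = monthly_cost(10, 6000, 500)
```

Under these assumptions a query costs about 2.5 US cents and a month of use about $7.65 USD. Most of the cost comes from the document text sent with each question, not from the answer.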
3

Connect the API key in AnythingLLM settings

Open AnythingLLM → Settings → LLM Provider. Select Anthropic, paste your API key, and choose Claude Sonnet as your model. Click Save.

4

Create workspaces by topic

Think of workspaces like themed notebooks. For example: "Business Strategy", "Health & Wellbeing", "Technology", "Personal Finance". You can always reorganise later — start with 3–5 broad topics.

5

Drag in your files

AnythingLLM accepts PDF, Word (docx), Excel (xlsx), PowerPoint (pptx), and plain text files. Simply drag files into the workspace. It processes them in the background — larger files take a minute or two.

💡 Start with your 'To Read' PDF folder: drag the whole folder in at once, or in batches of 50–100 files if it is large
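
Before dragging a large folder in, it can help to see what it actually contains. The sketch below counts files by type and lists any that fall outside the formats named above. The `.txt` extension is my stand-in for "plain text", and the folder path in the usage note is hypothetical.

```python
from collections import Counter
from pathlib import Path

# Extensions taken from the formats listed above; ".txt" stands in for "plain text".
SUPPORTED = {".pdf", ".docx", ".xlsx", ".pptx", ".txt"}


def inventory(folder):
    """Count files by extension and list those outside the supported set."""
    counts = Counter()
    unsupported = []
    for path in Path(folder).expanduser().rglob("*"):
        # Skip folders and hidden files such as .DS_Store.
        if path.is_file() and not path.name.startswith("."):
            ext = path.suffix.lower()
            counts[ext] += 1
            if ext not in SUPPORTED:
                unsupported.append(path)
    return counts, unsupported
```

For example, `counts, unsupported = inventory("~/Documents/To Read")` would show how many PDFs you have and which files may need converting first.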
6

Start querying your files

Type questions like "What frameworks for decision making appear in my documents?", "Summarise the key ideas across my business strategy files", or "What do my sources say about [topic]?" — AnythingLLM will respond with a synthesised answer and tell you which documents it drew from.

Phase 3 — Connect Everything & Automate (Month 2)
Once comfortable with Phase 1 & 2, bridge the two systems and reduce manual effort
1

Connect Readwise directly to AnythingLLM

AnythingLLM has a built-in Readwise connector — no manual file exports needed. In AnythingLLM, go to Settings → Data Connectors → Readwise. Enter your API key (the one you saved in Phase 1, Step 5). AnythingLLM will pull in your highlights and saved articles directly, sitting alongside your local documents in the same searchable workspace.

💡 If the connector isn't visible, exporting highlights as markdown files from Readwise and dragging them in is a reliable fallback
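
If you end up using the markdown fallback and would rather script it, the conversion step might look like the sketch below. It assumes each item from Readwise's export API is a dictionary with `title`, `author`, `source_url`, and a `highlights` list whose entries carry `text` and `note`; treat those field names as assumptions to verify against the API documentation. The sample book is invented.

```python
def book_to_markdown(book):
    """Turn one exported Readwise item into a small markdown document."""
    lines = [f"# {book.get('title') or 'Untitled'}"]
    if book.get("author"):
        lines.append(f"Author: {book['author']}")
    if book.get("source_url"):
        lines.append(f"Source: {book['source_url']}")
    lines.append("")
    for highlight in book.get("highlights", []):
        # Each highlight becomes a markdown block quote.
        lines.append(f"> {highlight['text'].strip()}")
        if highlight.get("note"):
            lines.append(f"Note: {highlight['note'].strip()}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical sample in the assumed shape of an exported item.
sample = {
    "title": "Deep Work",
    "author": "Cal Newport",
    "highlights": [{"text": " Focus is a skill. ", "note": "key idea"}],
}
markdown = book_to_markdown(sample)
```

Saving each result as a `.md` file gives you plain files that can be dragged into a workspace like any other document.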
2

Keep your Readwise content fresh

Once the Readwise connector is set up, refreshing it in AnythingLLM pulls in any newly saved content. It's not fully "set and forget" in the background, but it's far less friction than manual exports — a quick refresh once a week or month keeps your knowledgebase current.
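
For anyone scripting their own refresh instead of using the connector, "pull in only what is new" can be expressed as a date filter on the export request. This sketch assumes Readwise's export endpoint accepts `updatedAfter` (an ISO 8601 timestamp) and `pageCursor` query parameters, as I understand its API documentation. It only builds the URL and does not send anything.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint and parameter names as described in Readwise's API docs (assumption).
EXPORT_URL = "https://readwise.io/api/v2/export/"


def export_url(updated_after=None, page_cursor=None):
    """Build the export URL, optionally limited to items changed after a date."""
    params = {}
    if updated_after is not None:
        params["updatedAfter"] = updated_after.isoformat()
    if page_cursor is not None:
        params["pageCursor"] = page_cursor
    return EXPORT_URL + ("?" + urlencode(params) if params else "")


# Hypothetical example: everything saved since the last refresh on 1 Feb 2026.
last_refresh = datetime(2026, 2, 1, tzinfo=timezone.utc)
url = export_url(updated_after=last_refresh)
```

Recording the date of each refresh and passing it in next time keeps a weekly or monthly pull small.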

3

Keep your local files fresh — manual or scripted

AnythingLLM has no native "watch folder" that auto-syncs when you update files on your hard drive. For now, re-dragging updated files is the straightforward approach. If you want to automate it, AnythingLLM has a REST API that lets you script uploads — a small script can be triggered whenever a file changes, essentially doing what a file-watching tool would do.

Claude Cowork (currently in beta) is designed for exactly this kind of desktop file automation and may eventually handle this for you without any coding. Worth revisiting as it develops.

💡 For most use cases, a monthly "drag and update" is sufficient — automation is optional
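
The scripted route could start from something like the sketch below, which handles only the "which files changed?" half. The upload itself is passed in as a function, because the exact AnythingLLM API call (endpoint, port, and API key) should be taken from its own API documentation and is not guessed at here. All names are my own.

```python
from pathlib import Path


def snapshot(folder):
    """Map each file path under `folder` to its last-modified time."""
    root = Path(folder).expanduser()
    return {str(p): p.stat().st_mtime for p in root.rglob("*") if p.is_file()}


def changed_files(before, after):
    """Paths that are new, or whose modified time differs, since `before`."""
    return sorted(p for p, mtime in after.items() if before.get(p) != mtime)


def sync_once(folder, before, upload):
    """Upload every new or changed file, then return the fresh snapshot.

    `upload` is any function taking a file path; in practice it would wrap
    the AnythingLLM API call (not shown here, see its API documentation).
    """
    after = snapshot(folder)
    for path in changed_files(before, after):
        upload(path)
    return after
```

Run on a schedule, for example `state = sync_once("~/Documents/To Read", state, print)` with `state = {}` the first time, this lists what would be uploaded; swapping `print` for a real upload function completes the loop.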
4

Consider Obsidian as a long-term home (optional)

If you find yourself wanting to write notes and build connections between ideas, not just query files, Obsidian is a free local note-taking app popular with researchers. Readwise offers an official Obsidian plugin, and because your notes are plain text (Markdown) files that stay on your Mac, they can be added to AnythingLLM like any other document. Worth exploring in month 2–3.

Phase 4 — Make it Publicly Accessible (Optional, Month 3+)
Share your knowledgebase with external users while keeping control
1

Understand your options

Running AnythingLLM locally means only you can access it by default. Sharing it externally requires one of the four approaches below, each with different tradeoffs on cost, effort, and reliability. For a small trusted group, the tunneling approach is the easiest starting point.

2

Choose your access method

The four main approaches, from easiest to most robust:

⭐ Easiest — Start here

Tunneling tools

ngrok or Cloudflare Tunnel create a public web address that routes traffic to your Mac. Run a single terminal command, share the URL. Cloudflare Tunnel is free and more reliable for ongoing access. Ideal for demos or small groups.

Most reliable

Cloud hosting (VPS)

Deploy AnythingLLM on a cloud server (DigitalOcean, Vast.ai, RunPod). Anyone accesses it via the server's address. Removes dependence on your Mac being on — best for ongoing shared access. ~$10–20 AUD/month.

DIY option

Static IP + port forwarding

If your ISP provides a static IP, you can open a router port pointing to your Mac. Works, but less secure. Requires your Mac to always be on and your internet connection to stay stable.

Most polished

Web interface wrapper

Put a simple web front-end on top of AnythingLLM's built-in API server. Users get a branded chat interface. Can be combined with any of the hosting methods above.
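
For the tunneling option, the "single terminal command" can be sketched as follows. The Python below only builds the command and checks whether the tool is installed. It does not start a tunnel. Port 3001 is an assumption about where your AnythingLLM instance listens, so confirm it in the app before using the command.

```python
import shutil

# Assumption: the local AnythingLLM server listens on this port.
LOCAL_PORT = 3001


def tunnel_command(tool, port=LOCAL_PORT):
    """Return the terminal command that exposes a local port through a tunnel."""
    if tool == "cloudflared":
        # Cloudflare's quick tunnel: prints a temporary public URL when run.
        return ["cloudflared", "tunnel", "--url", f"http://localhost:{port}"]
    if tool == "ngrok":
        return ["ngrok", "http", str(port)]
    raise ValueError(f"unknown tunneling tool: {tool}")


def is_installed(tool):
    """True if the tool's executable can be found on this Mac's PATH."""
    return shutil.which(tool) is not None


command_line = " ".join(tunnel_command("cloudflared"))
```

Here `command_line` is the text you would paste into Terminal. Set up authentication before running it, since the resulting URL is reachable by anyone who has it.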

3

Secure it before you share it

Regardless of which method you choose, always add authentication before sharing a URL. AnythingLLM supports API keys and password protection — without these, your knowledgebase and your Anthropic API costs are exposed to anyone who finds the link.

💡 Set a strong password in AnythingLLM → Settings → Security before enabling any external access
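
If you would like a password that is hard to guess, a few lines of Python can generate one. This is a generic sketch using Python's built-in `secrets` module; the length of 24 characters is my own choice, not an AnythingLLM requirement.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length=24):
    """Return a random password drawn from letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = generate_password()
```

Store the result in a password manager and do not reuse it elsewhere.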
4

Know your practical limits

For tunneling (ngrok / Cloudflare): your home internet upload speed and whether your Mac stays powered on 24/7 will affect reliability. For simultaneous users, your local hardware may become a bottleneck. Cloud hosting removes both constraints cleanly, and is worth the ~$10–20 AUD/month if you're sharing with more than 2–3 people regularly.

Recommended public access path: Start with Cloudflare Tunnel (free, more stable than ngrok for ongoing use) to test sharing with a small group. If you want reliable 24/7 access for multiple users, move to a small VPS like DigitalOcean (~$10–20 AUD/month). Always secure with a password before sharing any URL.

Handling Your Specific Content Types

Content Type | Recommended Approach | Difficulty
PDFs (To Read folder) | Drag entire folder into AnythingLLM. For large folders, do it in batches of 50–100 files. | ⭐ Easy
Word & Excel files | AnythingLLM handles these natively; drag and drop directly into a workspace. | ⭐ Easy
PowerPoint files | AnythingLLM ingests PPT content. Note: it reads the text on slides, not visual diagrams or charts. | ⭐ Easy
Safari Reading List | Going forward: switch to the Readwise Reader browser extension, which saves articles just as quickly as Safari Reading List; everything flows automatically into Readwise and then AnythingLLM. Existing list: export as HTML bookmarks from Safari (File → Export → Bookmarks), then import into Readwise Reader, which fetches full article text. | ⭐⭐ Moderate
Paid Substack | Forward emails to your Readwise inbox address. The full article is in your email, so Readwise captures it automatically. | ⭐ Easy
Free Substack & Medium | Same email forwarding approach as paid. Alternatively, use the Readwise browser extension when reading online. | ⭐ Easy
Apple Notes | Apple Notes can export individual notes as PDFs. For bulk export, a free tool like "Exporter" on the Mac App Store converts all notes to plain text files you can drag into AnythingLLM. | ⭐⭐ Moderate
Readwise highlights & articles | Use AnythingLLM's built-in Readwise connector (Settings → Data Connectors → Readwise). Paste your API key from readwise.io/access_token, which is included free with your subscription. Refresh the connector periodically to pull in newly saved content. | ⭐ Easy

Monthly Budget (AUD Approximate)

Readwise Reader (annual plan) | ~$17 / mo
Readwise API access | Included ✓
AnythingLLM app | Free
Anthropic API usage (casual queries) | ~$5–15 / mo
Google NotebookLM | Free
Estimated Total (Phases 1–3) | ~$22–32 AUD / month
Phase 4: Optional Public Access
Cloudflare Tunnel (tunneling) | Free
Cloud VPS (DigitalOcean / Hetzner), if needed | ~$10–20 / mo
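
The totals above can be checked, and adapted to your own choices, with a few lines of Python. The figures are the approximate AUD amounts from the budget; the combined "with VPS" range is simply their sum and is not a quoted price.

```python
def total(items):
    """Sum (low, high) monthly estimates across all line items."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


# Approximate monthly AUD figures from the budget, as (low, high) pairs.
core = {
    "Readwise Reader (annual plan)": (17, 17),
    "Readwise API access": (0, 0),
    "AnythingLLM app": (0, 0),
    "Anthropic API usage": (5, 15),
    "Google NotebookLM": (0, 0),
}
public_access = {
    "Cloudflare Tunnel": (0, 0),
    "Cloud VPS": (10, 20),
}

phases_1_to_3 = total(core)
with_vps = total({**core, **public_access})
```

This gives $22–32 AUD per month for Phases 1–3, and $32–52 if you add a cloud VPS for public access.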

Cost tip: You can trial this for free — NotebookLM is completely free and Readwise offers a 60-day trial. Start there before committing to any subscriptions. Your Readwise API key is included at no extra cost once you subscribe.

Your Roadmap

Day 1

Start with NotebookLM

Create a free account, upload 10–15 PDFs from your To Read folder, and ask it a synthesis question. This shows you what's possible immediately.

Day 2–3

Sign up for Readwise

Install the browser extension, set up email forwarding for Substack, import your Safari Reading List, and save your API key from readwise.io/access_token.

Week 2

Set up AnythingLLM

Download the Mac app, get an Anthropic API key, and create your first workspace with your PDF folder.

Month 2

Connect & automate

Connect Readwise to AnythingLLM via the built-in connector, settle on a weekly refresh routine, and explore scripting or Cowork for local file automation.

Month 3

Share with others

Set up Cloudflare Tunnel to share with a small trusted group. Add password protection first. Evaluate whether a cloud VPS makes sense for your usage level.