A practical guide for synthesising years of collected research, articles, documents, and websites using modern AI tools — designed for a Mac user who wants real answers from their own content. Now updated with Readwise API integration details, file sync options, and a new phase on making your knowledgebase publicly accessible.
Before diving into tools, it helps to understand the underlying idea. Traditional search finds documents that contain your keywords. An AI knowledgebase does something much more powerful: it understands your content and can synthesise answers across many sources at once — like having a research assistant who has read everything you've collected.
The technical term is RAG — Retrieval Augmented Generation. In plain language: when you ask a question, the system finds the most relevant pieces from your documents, then an AI model reads those pieces and writes you a synthesised answer, complete with source references. You're not limited to "which document mentioned this?" — you can ask "what do my sources collectively say about X?"
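To make the retrieval half of RAG concrete, here is a toy sketch in Python. It ranks stored text chunks against a question using a crude bag-of-words similarity; real systems (including the tools below) use neural embedding models and a vector database instead, so treat this purely as an illustration of the loop, not how any particular tool works.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding: a bag-of-words vector.
    # Real RAG systems use neural embedding models instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=2):
    # The "Retrieval" in RAG: rank stored chunks by similarity to the
    # question and keep the top k. These chunks, plus the question,
    # become the prompt the AI model answers from ("Generation").
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The shape of every RAG system is this same two-step loop: retrieve the most relevant pieces, then hand them to a language model to write the synthesised answer.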
A knowledgebase works in three steps:

1. Capture. Getting your content in — PDFs, web pages, newsletters, Word files, presentations, and emails.
2. Understand. The system reads and "understands" your content, storing it so an AI can search it by meaning, not just keywords.
3. Ask. You ask questions in plain English and get synthesised answers drawn from your own material.
Note on your Paid Substack content: There's an elegant solution. Readwise Reader lets you forward your Substack emails to a special inbox address. Since you receive the full article in your email as a paid subscriber, Readwise captures the complete content — no scraping or terms-of-service issues.
There are five realistic approaches for your situation, ranging from "start today, no setup" to "powerful but takes a weekend." All tools are available in Australia.
1. NotebookLM. Google's AI research tool lets you upload documents and websites, then ask questions that get synthesised answers with citations. Very polished and easy to use — you can be up and running in an hour.
2. Readwise Reader. A "read it later" app on steroids — it captures web pages, PDFs, and newsletters (including paid Substack via email forwarding), highlights content, and exports everything. Use it as your capture layer, feeding curated content into NotebookLM for synthesis. This is the lowest-friction way to build meaningful coverage of your web content.
3. Mem. An AI-powered note-taking and knowledgebase tool that tries to automatically connect ideas across everything you add to it. It's closer to a smart notebook than a document search tool.
4. AnythingLLM. A free, open-source desktop app that runs on your Mac. You drag in your files (PDFs, Word, Excel, PPT, web pages), and it creates a local AI knowledgebase you can chat with. It connects to AI APIs like Anthropic or OpenAI for the intelligence layer. Your files never leave your computer — it's entirely private. This is the most powerful option for your local file collection, and it now connects directly to Readwise via API.
5. Build your own. Use developer tools like LlamaIndex or LangChain, a vector database (like Chroma), and a frontend. This is what professional AI developers build, and it gives you complete control over every aspect of the system.
Given your content mix — local files, web reading list, paid newsletters, and the desire to synthesise rather than just search — the best approach is a two-tool combination, introduced in phases so you're getting value from day one without being overwhelmed.
The stack: Readwise Reader (to capture web content and newsletters) + AnythingLLM (to query your local files, then connect everything via Readwise's built-in connector). Together these cover 100% of your content sources. You can start with Phase 1 alone and add subsequent phases when ready.
Go to read.readwise.io — they have a 60-day free trial, then ~$11 USD/month ($9.99 billed annually, $12.99 month-to-month). Download the Mac app and iOS app if you use your phone for reading.
This adds a button to your browser. When you're on any webpage, click it to instantly save the full article to your Readwise library — formatted cleanly, with no ads. This becomes your replacement for Safari's Reading List going forward.
In Safari: File → Export → Bookmarks. This gives you an HTML file. In Readwise, go to Import → Reading List and upload it. Readwise will attempt to fetch the full content of each URL.
💡 Do this in batches of 100–200 at a time to avoid timeouts

In Readwise settings, you'll find a personal email address (e.g. yourname@readwise.io). Forward your Substack and Medium subscription emails to this address. Future emails will automatically appear in your Readwise library with full article content — including paid Substack newsletters, since the full text arrives in your inbox.
💡 In Gmail/Apple Mail: create a filter to auto-forward emails from substack.com and medium.com

Go to readwise.io/access_token while logged in. Copy and save your API key somewhere safe. It's included free with your subscription — there's no additional charge or tier required. You'll use this key to connect Readwise directly to AnythingLLM in Phase 3.
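If you ever want your saved content outside any one tool, that same key drives Readwise's export endpoint directly. A minimal Python sketch — the v2 `/export/` route, the `updatedAfter` and `pageCursor` parameters, and the `nextPageCursor` field follow Readwise's published API documentation, but verify them there before relying on this:

```python
import json
import urllib.request
from urllib.parse import urlencode

READWISE_TOKEN = "YOUR_API_KEY"  # the key you saved from readwise.io/access_token

def build_export_request(updated_after=None, cursor=None):
    # Readwise's documented v2 export endpoint. updatedAfter lets you
    # fetch only content saved since your last sync.
    params = {}
    if updated_after:
        params["updatedAfter"] = updated_after
    if cursor:
        params["pageCursor"] = cursor
    url = "https://readwise.io/api/v2/export/"
    if params:
        url += "?" + urlencode(params)
    return urllib.request.Request(
        url, headers={"Authorization": f"Token {READWISE_TOKEN}"}
    )

def fetch_everything():
    # Page through the full export: each result is a book or article
    # with its highlights attached.
    results, cursor = [], None
    while True:
        with urllib.request.urlopen(build_export_request(cursor=cursor)) as resp:
            page = json.load(resp)
        results.extend(page["results"])
        cursor = page.get("nextPageCursor")
        if not cursor:
            return results
```

You won't need this for the setup below — AnythingLLM's connector does the same thing for you — but it's reassuring to know your library is never locked in.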
💡 Free with subscription — no extra cost

Go to notebooklm.google.com (free). Create your first notebook on a topic. Upload relevant PDFs and Word docs from your hard drive. Add website URLs from your Readwise library. Ask it questions like "synthesise what my sources say about X" or "what are the common themes across these articles?"
Go to anythingllm.com and download the desktop app. It's a standard Mac installer — open it like any other app.
Go to console.anthropic.com, create an account, and add a small amount of credit (~$10 USD to start). This is what AnythingLLM uses to power the AI responses. You'll be charged per query — casual use costs pennies per day.
💡 Anthropic's Claude is recommended over OpenAI for synthesis quality

Open AnythingLLM → Settings → LLM Provider. Select Anthropic, paste your API key, and choose Claude Sonnet as your model. Click Save.
Think of workspaces like themed notebooks. For example: "Business Strategy", "Health & Wellbeing", "Technology", "Personal Finance". You can always reorganise later — start with 3–5 broad topics.
AnythingLLM accepts PDF, Word (docx), Excel (xlsx), PowerPoint (pptx), and plain text files. Simply drag files into the workspace. It processes them in the background — larger files take a minute or two.
💡 Start with your 'To Read' PDF folder — drag the whole folder in at once

Type questions like "What frameworks for decision making appear in my documents?", "Summarise the key ideas across my business strategy files", or "What do my sources say about [topic]?" AnythingLLM will respond with a synthesised answer and tell you which documents it drew from.
AnythingLLM has a built-in Readwise connector — no manual file exports needed. In AnythingLLM, go to Settings → Data Connectors → Readwise. Enter your API key (the one you saved in Phase 1, Step 5). AnythingLLM will pull in your highlights and saved articles directly, sitting alongside your local documents in the same searchable workspace.
💡 If the connector isn't visible, exporting highlights as markdown files from Readwise and dragging them in is a reliable fallback

Once the Readwise connector is set up, refreshing it in AnythingLLM pulls in any newly saved content. It's not fully "set and forget" in the background, but it's far less friction than manual exports — a quick refresh once a week or month keeps your knowledgebase current.
AnythingLLM has no native "watch folder" that auto-syncs when you update files on your hard drive. For now, re-dragging updated files is the straightforward approach. If you want to automate it, AnythingLLM has a REST API that lets you script uploads — a small script can be triggered whenever a file changes, essentially doing what a file-watching tool would do.
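As a sketch of what such a script could look like: the version below polls a folder for changed PDFs and pushes them to AnythingLLM via curl. Everything about the endpoint is an assumption to check against your version's API docs — that the developer API is enabled, that it listens on localhost:3001, and that the upload route is `/api/v1/document/upload` with a multipart `file` field.

```python
import subprocess
import time
from pathlib import Path

# Assumed values — adjust to your setup and AnythingLLM's API docs.
WATCH_DIR = Path.home() / "Documents" / "ToRead"
API_KEY = "YOUR_ANYTHINGLLM_API_KEY"
UPLOAD_URL = "http://localhost:3001/api/v1/document/upload"

def changed_files(directory, seen):
    # Return files whose modification time is newer than the last pass.
    changed = []
    for path in Path(directory).glob("*.pdf"):
        mtime = path.stat().st_mtime
        if seen.get(path) != mtime:
            seen[path] = mtime
            changed.append(path)
    return changed

def upload(path):
    # Push one file to AnythingLLM as a multipart form upload via curl.
    subprocess.run(
        ["curl", "-s", "-X", "POST", UPLOAD_URL,
         "-H", f"Authorization: Bearer {API_KEY}",
         "-F", f"file=@{path}"],
        check=True,
    )

def watch(directory, interval=60):
    seen = {}
    changed_files(directory, seen)  # prime with the current state
    while True:                     # then upload only future changes
        for path in changed_files(directory, seen):
            upload(path)
        time.sleep(interval)

# To run: watch(WATCH_DIR)
```

Saved as a small script and launched at login, this approximates the "watch folder" feature AnythingLLM lacks — but again, the monthly manual drag is perfectly fine for most people.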
Claude Cowork (currently in beta) is designed for exactly this kind of desktop file automation and may eventually handle this for you without any coding. Worth revisiting as it develops.
💡 For most use cases, a monthly "drag and update" is sufficient — automation is optional

If you find yourself wanting to write notes and build connections between ideas — not just query files — Obsidian is a free local note-taking app beloved by researchers. It integrates natively with both Readwise and AnythingLLM, and your notes are just plain text files on your Mac forever. Worth exploring in month 2–3.
Running AnythingLLM locally means only you can access it by default. Sharing it externally requires one of four approaches below — each with different tradeoffs on cost, effort, and reliability. For a small trusted group, the tunneling approach is the easiest starting point.
The four main approaches, from easiest to most robust:

1. Tunneling. ngrok or Cloudflare Tunnel create a public web address that routes traffic to your Mac. Run a single terminal command (e.g. `cloudflared tunnel --url http://localhost:3001`, adjusting the port to wherever AnythingLLM is listening) and share the URL it prints. Cloudflare Tunnel is free and more reliable for ongoing access. Ideal for demos or small groups.
2. Cloud hosting. Deploy AnythingLLM on a cloud server (DigitalOcean, Vast.ai, RunPod). Anyone accesses it via the server's address. Removes dependence on your Mac being on — best for ongoing shared access. ~$10–20 AUD/month.
3. Port forwarding. If your ISP provides a static IP, you can open a router port pointing to your Mac. It works, but it's less secure, and it requires your Mac to always be on and your internet connection to stay stable.
4. Custom front-end. Put a simple web front-end on top of AnythingLLM's built-in API server so users get a branded chat interface. Can be combined with any of the hosting methods above.
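To show what a custom front-end actually does, here's the core request it would send for one chat turn, sketched in Python. The workspace slug `business-strategy`, the default local port 3001, and the `/api/v1/workspace/{slug}/chat` route with a `textResponse` field are assumptions based on AnythingLLM's developer API docs — confirm them for your version.

```python
import json
import urllib.request

API_KEY = "YOUR_ANYTHINGLLM_API_KEY"       # from AnythingLLM's API settings
BASE = "http://localhost:3001/api/v1"      # assumed default local address

def build_chat_request(workspace_slug, message):
    # The request a front-end sends for one chat turn: a JSON body with
    # the user's message, authenticated with a bearer key.
    body = json.dumps({"message": message, "mode": "chat"}).encode()
    return urllib.request.Request(
        f"{BASE}/workspace/{workspace_slug}/chat",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def ask(workspace_slug, message):
    # Send the question and return the model's synthesised answer text.
    with urllib.request.urlopen(build_chat_request(workspace_slug, message)) as resp:
        return json.load(resp).get("textResponse")
```

A real front-end is just this call wrapped in a web page: a text box, a submit button, and somewhere to render the answer and its sources.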
Regardless of which method you choose, always add authentication before sharing a URL. AnythingLLM supports API keys and password protection — without these, your knowledgebase and your Anthropic API costs are exposed to anyone who finds the link.
💡 Set a strong password in AnythingLLM → Settings → Security before enabling any external access

For tunneling (ngrok / Cloudflare): your home internet upload speed and whether your Mac stays powered on 24/7 will affect reliability. With simultaneous users, your local hardware may become a bottleneck. Cloud hosting removes both constraints cleanly and is worth the ~$10–15 AUD/month if you're sharing with more than 2–3 people regularly.
Recommended public access path: Start with Cloudflare Tunnel (free, more stable than ngrok for ongoing use) to test sharing with a small group. If you want reliable 24/7 access for multiple users, move to a small VPS like DigitalOcean (~$10–15 AUD/month). Always secure with a password before sharing any URL.
| Content Type | Recommended Approach | Difficulty |
|---|---|---|
| PDFs (To Read folder) | Drag entire folder into AnythingLLM. For large folders, do it in batches of 50–100 files. | ⭐ Easy |
| Word & Excel files | AnythingLLM handles these natively — drag and drop directly into a workspace. | ⭐ Easy |
| PowerPoint files | AnythingLLM ingests PPT content. Note: it reads the text on slides, not visual diagrams or charts. | ⭐ Easy |
| Safari Reading List | Going forward: Switch to the Readwise Reader browser extension — saves articles just as quickly as Safari Reading List, and everything flows automatically into Readwise and then AnythingLLM. Existing list: Export as HTML bookmarks from Safari (File → Export → Bookmarks), import into Readwise Reader which fetches full article text. | ⭐⭐ Moderate |
| Paid Substack | Forward emails to your Readwise inbox address. The full article is in your email, so Readwise captures it automatically. | ⭐ Easy |
| Free Substack & Medium | Same email forwarding approach as paid. Alternatively, use the Readwise browser extension when reading online. | ⭐ Easy |
| Apple Notes | Apple Notes can export individual notes as PDFs. For bulk export, a free tool like "Exporter" on the Mac App Store converts all notes to plain text files you can drag into AnythingLLM. | ⭐⭐ Moderate |
| Readwise highlights & articles | Use AnythingLLM's built-in Readwise connector (Settings → Data Connectors → Readwise). Paste your API key from readwise.io/access_token — included free with your subscription. Refresh the connector periodically to pull in newly saved content. | ⭐ Easy |
Cost tip: You can trial this for free — NotebookLM is completely free and Readwise offers a 60-day trial. Start there before committing to any subscriptions. Your Readwise API key is included at no extra cost once you subscribe.