WordPress lets you export your entire blog as an XML file (WXR format). With a short Lua connector, you can parse that export and make every post searchable by your AI tools — no plugins, no WordPress API, no PHP.
This is what you need when you want to ask Cursor “What did I write about Kubernetes last year?” and get an actual answer grounded in your own writing.
Step 1: Export your WordPress content
In your WordPress admin, go to Tools → Export → All content → Download Export File. You’ll get an XML file like wordpress-export.xml.
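Before writing the connector, it can help to confirm that the download really is a WXR file with the content you expect. The following is a minimal sketch for a stock Lua interpreter (not the connector runtime); the helper name summarize_wxr is hypothetical, and the file name is assumed to match the one above.

```lua
-- Count <item> blocks in a WXR string and tally them by post type.
local function summarize_wxr(xml)
  local counts, total = {}, 0
  for item in xml:gmatch("<item>(.-)</item>") do
    total = total + 1
    -- Try the CDATA-wrapped form first, then a plain-text form.
    local post_type = item:match("<wp:post_type><!%[CDATA%[(.-)%]%]>")
      or item:match("<wp:post_type>(.-)</wp:post_type>")
      or "unknown"
    counts[post_type] = (counts[post_type] or 0) + 1
  end
  return total, counts
end

-- Read the export if it is present; the path is an assumption.
local path = "wordpress-export.xml"
local f = io.open(path, "r")
if f then
  local xml = f:read("*a")
  f:close()
  local total, counts = summarize_wxr(xml)
  print(string.format("%s: %d items", path, total))
  for post_type, n in pairs(counts) do
    print(string.format("  %-12s %d", post_type, n))
  end
else
  print("could not open " .. path)
end
```

If the tally shows mostly attachment or nav_menu_item entries and few posts, the export was probably not made with "All content" selected.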
Step 2: Write the connector
Create connectors/wordpress.lua and put the XML file next to it:
connectors/
  wordpress.lua
  wordpress-export.xml
Here’s the full connector — it pattern-matches the WXR XML to extract posts:
connector = {}
connector.name = "wordpress"
connector.description = "Parse WordPress WXR XML export into individual posts"

function connector.scan(config)
  local xml = fs.read(config.file or "wordpress-export.xml")
  local items = {}

  for item_block in xml:gmatch("<item>(.-)</item>") do
    local title = item_block:match("<title>(.-)</title>") or "Untitled"
    -- Some exports wrap the title in CDATA; unwrap it if present
    title = title:match("^<!%[CDATA%[(.-)%]%]>$") or title
    -- Accept both the CDATA-wrapped and the plain form of post_type
    local post_type = item_block:match("<wp:post_type><!%[CDATA%[(.-)%]%]>")
      or item_block:match("<wp:post_type>(.-)</wp:post_type>")
      or "post"

    -- Only index posts and pages, skip attachments/nav items
    if post_type == "post" or post_type == "page" then
      local link = item_block:match("<link>(.-)</link>") or ""
      local pub_date = item_block:match("<pubDate>(.-)</pubDate>") or ""
      local creator = item_block:match("<dc:creator><!%[CDATA%[(.-)%]%]>") or ""

      -- Extract body from CDATA
      local body = item_block:match(
        "<content:encoded><!%[CDATA%[(.-)%]%]></content:encoded>"
      ) or ""

      -- Strip HTML tags for cleaner chunks
      body = body:gsub("<[^>]+>", " ")
      -- Decode common entities; &amp; goes last to avoid double-decoding
      body = body:gsub("&nbsp;", " ")
      body = body:gsub("&lt;", "<")
      body = body:gsub("&gt;", ">")
      body = body:gsub("&amp;", "&")
      -- Collapse whitespace and trim
      body = body:gsub("%s+", " ")
      body = body:match("^%s*(.-)%s*$") or body

      -- Extract categories and tags
      local tags = {}
      for tag in item_block:gmatch('<category domain="post_tag".-<!%[CDATA%[(.-)%]%]>') do
        table.insert(tags, tag)
      end
      for cat in item_block:gmatch('<category domain="category".-<!%[CDATA%[(.-)%]%]>') do
        table.insert(tags, cat)
      end

      -- Skip near-empty entries (50 characters or fewer)
      if #body > 50 then
        table.insert(items, {
          id = link ~= "" and link or title,
          title = title,
          body = body,
          url = link,
          metadata = {
            author = creator,
            published = pub_date,
            type = post_type,
            tags = table.concat(tags, ", "),
          },
        })
      end
    end
  end

  log.info(string.format("Parsed %d posts/pages from WordPress export", #items))
  return items
end

return connector
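To see what those patterns capture, here is a small standalone sketch that runs the same kind of matching against a hand-written item. It uses only the Lua standard library, so it does not need the fs or log globals that the connector runtime provides. The sample item and the clean_html helper are made up for illustration; real exports carry many more fields.

```lua
-- Sample WXR item, trimmed for illustration (all values are made up).
-- The level-1 long bracket [=[ ... ]=] is needed because CDATA ends in "]]".
local sample = [=[
<item>
  <title>How I Migrated to K8s</title>
  <link>https://example.com/k8s-migration/</link>
  <dc:creator><![CDATA[alice]]></dc:creator>
  <content:encoded><![CDATA[<p>Three weekends of <b>YAML</b> wrangling &amp; testing.</p>]]></content:encoded>
  <wp:post_type><![CDATA[post]]></wp:post_type>
  <category domain="post_tag" nicename="kubernetes"><![CDATA[Kubernetes]]></category>
</item>
]=]

-- Strip tags, decode a few common entities, and normalize whitespace.
local function clean_html(html)
  local text = html:gsub("<[^>]+>", " ")
  text = text:gsub("&nbsp;", " ")
  text = text:gsub("&lt;", "<")
  text = text:gsub("&gt;", ">")
  text = text:gsub("&amp;", "&") -- decoded last so "&amp;lt;" stays "&lt;"
  text = text:gsub("%s+", " ")
  return (text:match("^%s*(.-)%s*$"))
end

-- "(.-)" is a lazy capture: it stops at the first closing delimiter.
local item = sample:match("<item>(.-)</item>")
local post = {
  title = item:match("<title>(.-)</title>") or "Untitled",
  url = item:match("<link>(.-)</link>") or "",
  author = item:match("<dc:creator><!%[CDATA%[(.-)%]%]>") or "",
  tag = item:match('<category domain="post_tag".-<!%[CDATA%[(.-)%]%]>'),
  body = clean_html(
    item:match("<content:encoded><!%[CDATA%[(.-)%]%]></content:encoded>") or ""
  ),
}

print(post.title)  -- How I Migrated to K8s
print(post.author) -- alice
print(post.tag)    -- Kubernetes
print(post.body)   -- Three weekends of YAML wrangling & testing.
```

Pattern matching keeps the connector free of dependencies, but it is not a real XML parser: it relies on the element layout WordPress emits, so an export that has been hand-edited or reformatted may need adjusted patterns.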
Step 3: Configure and sync
Add to your ctx.toml:
[connectors.script.wordpress]
path = "connectors/wordpress.lua"
file = "wordpress-export.xml"
Then sync:
$ ctx sync script:wordpress
sync script:wordpress
fetched: 147 items
upserted documents: 147
chunks written: 892
ok
Every blog post is now chunked, indexed, and searchable.
Step 4: Search your writing
$ ctx search "kubernetes deployment"
1. [0.91] script:wordpress / How I Migrated to K8s
"After three weekends of YAML wrangling, I finally moved everything..."
2. [0.78] script:wordpress / DevOps Lessons from 2024
"The biggest win was containerizing the legacy monolith..."
Step 5: Connect to Cursor
Start the server and add it to your workspace:
$ ctx serve mcp
.cursor/mcp.json:
{
  "mcpServers": {
    "my-blog": {
      "url": "http://127.0.0.1:7331/mcp"
    }
  }
}
Now ask Cursor: “Search my blog for posts about deployment automation” — and it pulls from your actual writing.
Tips
- Comments too? Add another gmatch loop for <wp:comment> blocks inside each item. Append them to the post body or emit them as separate documents.
- Multiple exports? Use fs.list(".") with a glob and loop over all XML files; each gets parsed separately but lands in the same database.
- Incremental updates? Re-export from WordPress periodically and re-run ctx sync. The connector upserts by id (the post URL), so unchanged posts are skipped.
- HTML rendering? The connector strips HTML tags for cleaner text. If you want to preserve formatting (code blocks, lists), replace the gsub("<[^>]+>", " ") line with more selective stripping.
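Two of these tips can be sketched in plain Lua. The helper names extract_comments and strip_selectively are hypothetical, and the wp:comment_author and wp:comment_content element names reflect the WXR layout as I understand it, so check them against your own export before relying on them.

```lua
-- Collect "author: text" strings from the <wp:comment> blocks of one item.
local function extract_comments(item_block)
  local comments = {}
  for block in item_block:gmatch("<wp:comment>(.-)</wp:comment>") do
    local author = block:match("<wp:comment_author><!%[CDATA%[(.-)%]%]>")
      or "anonymous"
    local content = block:match("<wp:comment_content><!%[CDATA%[(.-)%]%]>")
      or ""
    if content ~= "" then
      table.insert(comments, author .. ": " .. content)
    end
  end
  return comments
end

-- Keep list items and paragraph breaks as line structure, then drop the
-- remaining tags. Newlines survive because only spaces and tabs collapse.
local function strip_selectively(html)
  local text = html
  -- %f[%s>] makes sure "<li" is a whole tag name (so "<link>" is not hit)
  text = text:gsub("<li%f[%s>][^>]*>", "\n- ")
  text = text:gsub("<br%s*/?>", "\n")
  text = text:gsub("</p>", "\n\n")
  text = text:gsub("<[^>]+>", "")
  text = text:gsub("[ \t]+", " ")
  text = text:gsub("\n\n\n+", "\n\n")
  return (text:match("^%s*(.-)%s*$"))
end

print(strip_selectively("<p>Steps:</p><ul><li>Build</li><li>Deploy</li></ul>"))
```

Inside the connector, the comment strings could be appended to body with table.concat(comments, "\n"), and strip_selectively could stand in for the tag-stripping and whitespace lines. Preserving indentation inside <pre> blocks would need additional handling that this sketch does not attempt.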