Building MacCrab
MacCrab started as an experiment: a way to see how useful a Mac security tool I could build using Claude and Codex. It's now in an alpha release state that I run on some of my own machines every day.
I enjoy the process of building with AI tools, and people have asked me how to take on larger projects with them. My first answer: learn to like the messy bit in the middle, where you have an idea, a model that can do a lot of the heavy lifting, and no clear sense of whether the idea is any good. There are two threads tangled together here: what a local-first Mac security tool should look like now that developer machines are running coding agents and AI dev environments, and what AI-assisted engineering actually looks like when the goal is production software you'd want other people to install. Both turned out to be more interesting than I expected.
Quick note on what MacCrab is, before any of that makes sense. It's a lightweight, local-first detection and response tool for macOS. It uses Apple's Endpoint Security framework to watch what's happening on your Mac and flags things that look off. It runs entirely on-device by default, sits alongside the other Mac security tools you might already use, and is open source. You can find it at maccrab.com.
On the shoulders of giants
Before any of the build story, the honest acknowledgement: I didn't decide to write this in a vacuum. In 2018 I took the SANS FOR518 Mac/iOS forensics course in London taught by the course author Sarah Edwards. I've also been running Objective-See's tools on my own machines for years. LuLu, BlockBlock, KnockKnock, OverSight, and the rest are genuinely best-in-class at what they each do, and a chunk of how I think about Mac defence comes from using them. Little Snitch is the other inspiration here. The way it surfaces network behaviour to the user without ever assuming you wanted a fleet-management console is the lineage any sensible Mac security tool sits in.
MacCrab isn't trying to replace any of that. It's an additional layer that sits alongside the tools that already do their jobs well.
Starting with the question, not the code
The first thing I did wasn't write any code. I downloaded a stack of open-source Mac security tools, including Objective-See's and a handful of others, and asked the model to review them. I wanted a meta-analysis: where did these tools converge, where did they diverge, what kind of user did each one really serve, and was there any room left for something new?
There was, but a narrow one. The free tools tend to focus on a single surface (network connections, persistence, kext loading, login items, camera and mic access) and do that one thing very well. The commercial tools in the space tend to assume a fleet and a console somewhere in the cloud. The thing I kept coming back to was something in between: a single tool, on a single Mac, doing real-time behavioural detection across surfaces, with no console anywhere, and with first-class awareness of the AI coding tools people are now running on their dev machines.
Once I had a gap I thought I could build something useful in, I asked the model to research the technical shape of the build: how Endpoint Security actually works in practice, what a production-grade Swift client looks like, what the entitlement process is, where people get tripped up. By the time I started writing code I had a reasonable map of the territory.
Finding a nice feedback loop
While the design work was happening I was also doing some Apple security research and bug-bounty work, again with Claude and Codex helping. I can't write about most of that yet, but the shape of it (subsystems behaving in ways the docs don't quite predict, undocumented edges, the patterns that show up when something is misbehaving) turned out to map cleanly onto detection ideas. The bug-bounty research started feeding the tool, and the tool started giving me a better vantage point for the bug-bounty research. That loop kept me motivated for longer than I expected.
Getting opinionated
Once there was a working v1.0 I had to decide what kind of tool I wanted it to be. A few half-built features got cut because they would have nudged the project toward "lightweight EDR with a cloud console". The world already has plenty of those, and the bigger names do them better than I would. I wanted MacCrab to stay firmly local, firmly additive, and small enough that someone could point their own AI agent at the source in an afternoon and understand what it was doing. I'm a strong believer in forking projects. You are free to fork this one and I hope you build a better one so I can use it!
Then I started running it as a daily driver across a handful of my own Macs (an Apple Silicon MacBook Pro and a Mac mini) to see how it held up against real workloads instead of synthetic ones.
Figuring out the process
Around this time I joined the Apple Developer Program and applied for the Endpoint Security entitlement for my account. ES is a privileged capability for good reasons, and going through the application process gave me a much better feel for what Apple expects from tools that operate at this level. It's one of those things that's much easier to do early than to discover you need halfway through, and the process was clearer and quicker than I'd been told to expect.
Then came the release pipeline: notarisation, signing, integrating Sparkle auto-update, a Homebrew cask, an MDM profile, a permanent feed URL. Then the website. By the time the first public alpha went out, the plumbing was probably 50% of the total work, and it was the part where AI assistance helped least. Models are very good at the inside of a Swift file. They're less good at figuring out a process, or at noticing that your Sparkle feed and your Homebrew cask have quietly drifted out of sync. Most of that you still do yourself for now.
Shipping the alpha, and what came next
The first public release is tagged as alpha on purpose. The whole point was to get it onto other people's machines and see what surfaced. A few of the things that came up are worth writing down, partly because they were useful lessons and partly because they're easy to laugh at in retrospect. None of them were disasters. They're just the texture of shipping. Beyond prompting "make this safe", I test each release live on my own machines before shipping it.
Issue 1: The Homebrew cask quietly stopped updating
For a stretch of releases, anyone installing through brew install --cask kept getting an old version of MacCrab even though GitHub Releases and the auto-update feed were both moving forward fine. The repo had two copies of the Homebrew cask file, and my release script was diligently updating the one Homebrew wasn't actually reading. Most people were getting their updates through Sparkle, so it took a while for anyone to notice. The fix was tiny once we found it, and the moral was just "if two files have to stay in lockstep, automate it or delete one of them."
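That "automate it" moral is a small script, not a big one. Here's a sketch in Python of a release-time check; the regex assumes the cask files use a standard version stanza, and the file paths in the test are hypothetical, not MacCrab's actual repo layout:

```python
import re
import sys

# A Homebrew cask declares its version as: version "1.2.3"
VERSION_RE = re.compile(r'version\s+"([^"]+)"')

def cask_version(path):
    """Extract the version string from a Homebrew cask file."""
    with open(path) as f:
        match = VERSION_RE.search(f.read())
    if match is None:
        raise ValueError(f"no version stanza found in {path}")
    return match.group(1)

def check_lockstep(paths):
    """Return True only if every cask file declares the same version."""
    versions = {path: cask_version(path) for path in paths}
    if len(set(versions.values())) > 1:
        for path, version in versions.items():
            print(f"{path}: {version}", file=sys.stderr)
        return False
    return True
```

Wired into CI (fail the build when check_lockstep returns False), this would have caught the drift on the first release where the two copies disagreed.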
Issue 2: The DB that wouldn't shrink
Around v1.6.10, a few users mentioned that MacCrab's events database had grown pretty large on their machines. There was a size cap setting in the UI that was supposed to handle this. It turned out the cap was being read but never enforced. A clean case of code that looked done but wasn't connected. v1.6.12 wired it up. Then v1.6.13 noticed that even when it ran, the file on disk wasn't visibly shrinking, because SQLite's write-ahead log was still holding the old pages. A checkpoint pair around the cleanup fixed that. Then v1.6.14 fixed the UI slider that adjusts the cap, which was passing the new value into a config object the daemon never re-read. None of those were big bugs, but together they're a useful reminder that "the feature exists" and "the feature works end-to-end" are different things.
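The WAL half of that story is easy to reproduce outside the app. Here's a minimal sketch in Python (not MacCrab's actual Swift code; the events table schema is assumed) of what "enforce the cap end-to-end" has to include before the file on disk actually shrinks:

```python
import sqlite3

def enforce_size_cap(db_path, max_rows):
    """Trim an events table to its newest max_rows rows, then reclaim disk.

    Deleting rows only marks pages as free. In WAL mode the old pages
    also linger in the -wal file until a checkpoint flushes them, and
    the main file keeps its size until a VACUUM rebuilds it.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "DELETE FROM events WHERE id NOT IN "
            "(SELECT id FROM events ORDER BY id DESC LIMIT ?)",
            (max_rows,),
        )
        conn.commit()
        # Flush the write-ahead log back into the main database file
        # and truncate the -wal file.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
        # Rebuild the file so the freed pages are returned to the OS.
        conn.execute("VACUUM")
    finally:
        conn.close()
```

Each step here maps to one of the point releases in the story: the DELETE is the cap that v1.6.12 wired up, and the checkpoint is what v1.6.13 added once the file refused to shrink.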
Issue 3: MacCrab kept flagging itself
For a couple of releases the rules picked up on suspicious file-access patterns that turned out to be MacCrab itself doing its job. The detections weren't wrong. They just hadn't learned what their own host looks like. Easy to fix once spotted, and a small reminder that an EDR running on the machine it's watching is, inevitably, watching itself too.
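Conceptually the fix is just an allowlist keyed on the tool's own identity. A toy sketch in Python (all names and paths here are hypothetical, and a real check should match on the code-signing identifier rather than a path, since paths are easy to spoof):

```python
# Hypothetical install location; a production check would verify the
# signing identifier of the acting process instead of trusting a path.
SELF_PATH_PREFIXES = ("/Applications/MacCrab.app/",)

def is_self_event(event):
    """True when the acting process is the tool's own binary."""
    process_path = event.get("process_path", "")
    return any(process_path.startswith(p) for p in SELF_PATH_PREFIXES)

def filter_events(events):
    """Drop events generated by the tool's own activity before the
    detection rules ever see them."""
    return [e for e in events if not is_self_event(e)]
```

Filtering this early, before the rules run, also keeps the tool's own churn out of the events database, which matters when you're simultaneously trying to keep that database small.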
The pattern underneath
The DB-size and Homebrew-cask issues both rhyme. A thing you've added is in the tree, has a UI surface, and is technically present, but the wire from the surface to the bit that does the work is missing or pointing somewhere else. This turns out to be the kind of bug AI assistance is worst at finding. Both halves of the wire compile and pass their tests in isolation. Nothing looks broken from inside any one file. You only spot the gap when you have the whole thing running on a real machine and notice that the user-visible behaviour disagrees with what the code seems to say. Catching that earlier is the thing I'm trying to learn from this whole stretch. The answer so far is mostly real users on real machines, which is why I'd rather ship early than wait for a clean v1.0. It helps that in a past life I was a functional tester, so I really try to dogfood these projects until they actually work.
Where the project is now
MacCrab is open source under Apache 2.0, ships on Homebrew and as a notarised DMG, and is on a fairly regular release cadence. It's still alpha (something I won't change until the time is right), but it's stable enough that I run it everywhere I can. It plays nicely with the Objective-See suite.
If you do anything risky on a Mac (and most of us do, between coding agents, AI dev environments, browser extensions, and the general everything-is-online state of things), it might be useful to you too. If you do try it and something seems off, please tell me. The releases since the alpha have all started life as someone noticing something odd, and that's been the best feedback loop I've had on the whole project.