
A note on funding: CypherpunkGuide carries no surveillance advertising — no ad networks, tracking pixels, or sponsored content. It is funded by transparent streams: reader donations now; subscription and editorially-aligned affiliate later. We answer to our readers, not to advertisers.
For as long as operational security has existed — OPSEC, the discipline of protecting information by thinking the way the people who want it think — it has rested on one picture of the adversary: a person. An investigator with a budget. A stalker with patience. A recruiter, a border officer, an ex. You learned to build a threat model — a short, honest map of what you are protecting, who wants it, what they can realistically do, and what it costs you to stop them — and then you spent your effort where that map said it mattered. For two decades of digital life, that map was enough.
It is now wrong in four specific places, because the adversary is increasingly not a person but a machine. A machine does not get tired, does not forget, does not need a warrant to read what is already public, and does not work at human scale. The shift is not hypothetical: in a March 2026 Pew Research summary, 50% of U.S. adults said they feel more concerned than excited about the spread of AI, up from 37% in 2021, and an earlier Pew survey of people familiar with AI found 81% expecting their personal information to be used in ways they would find uncomfortable. The concern is rational. We keep this site’s own server logs under watch for the dozen-or-so self-identifying AI crawler user-agents (GPTBot, ClaudeBot, PerplexityBot and their peers), and they arrive continuously, on their own schedule, not ours. Google-Extended, often listed alongside them, is a robots.txt control token rather than a user-agent string: it governs training use but does not itself show up in a log.
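For readers who run their own site, the log watch described above can be approximated in a few lines. This is a minimal sketch, not our production tooling: the function name, the sample log lines, and the IP addresses are made up for illustration, and it assumes each request is logged on one line with the user-agent string included, as in the common "combined" access-log layout.

```python
from collections import Counter

# Crawler tokens named in the text above. Google-Extended is left out on
# purpose: it is a robots.txt token and does not appear as a user-agent.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_ai_crawler_hits(log_lines, tokens=AI_CRAWLER_TOKENS):
    """Count log lines that mention each crawler token (case-insensitive)."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in tokens:
            if token.lower() in lowered:
                hits[token] += 1
                break  # attribute each request to at most one crawler
    return hits


# Made-up sample lines; the addresses come from documentation-only ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2026:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.8 - - [01/Jan/2026:00:00:02 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.4 - - [01/Jan/2026:00:00:03 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '203.0.113.7 - - [01/Jan/2026:00:00:04 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]

if __name__ == "__main__":
    for token, count in count_ai_crawler_hits(SAMPLE_LOG).most_common():
        print(f"{token}: {count}")
```

A substring match only catches crawlers that announce themselves. Anything that spoofs a browser user-agent passes through uncounted, so treat the totals as a floor.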
We built the four-assumption frame below after working through the privacy guidance that already exists — and finding that most of it either secures enterprise AI systems or stops at a list of consumer tools, leaving the individual’s own threat model unwritten.
So how do you rebuild a threat model when the adversary is a machine? Not by hunting for a delete button — none reaches a model’s trained weights. You rebuild it the way you would after learning the locks on your house no longer fit the door: assumption by assumption. Below are the four that AI breaks, what each one changes, and where your remaining effort actually moves your exposure rather than merely soothing you.
| Classic OPSEC assumed… | The machine adversary instead… | Your real lever |
|---|---|---|
| Linking scattered data is slow, manual work | Correlates millions of fragments cheaply and instantly | Reduce what is linkable across contexts |
| You only expose what you choose to post | Infers unposted facts from patterns | Manage the signal, not just the statement |
| Deleting at the source removes the data | Has already absorbed copies into model weights | Prevent at publication; deletion is partial |
| Identity needs your participation to forge | Synthesizes your voice, face, and writing | Pre-register trust; minimise raw samples |
Assumption 1 — Correlation Is No Longer Slow
Correlation at scale is the first assumption AI breaks. A machine can join data points that are individually harmless — a reused username, a photo’s embedded location, the cadence of when you post — into a single profile faster and far more cheaply than any human investigator ever could. The old protection was friction: linking your accounts took a person hours, so most adversaries never bothered. That friction is gone.
Correlation here means connecting separate pieces of information into one picture. The danger was never any single post; it was the join. Your professional handle and your anonymous one share a turn of phrase. A landscape photo carries GPS coordinates in its metadata — the invisible data attached to a file, recording where and when it was made. A delivery review, a race result, a public wishlist: each is trivial alone, and together they are a dossier. Machines are built precisely to find those joins across millions of records at once.
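To make "the join" concrete, here is a toy sketch of how fragments that look unrelated collapse into one profile once any two of them share a linkable value. Everything in it is invented for illustration: the records, the handles, and the choice of `handle` and `phrase` as link keys are assumptions, and a real system would use far fuzzier matching than exact equality.

```python
from collections import defaultdict

# Entirely made-up fragments, one per "harmless" public source.
FRAGMENTS = [
    {"source": "forum", "handle": "riverfox", "phrase": "to be fair, though"},
    {"source": "race_results", "handle": "riverfox", "city": "Leeds"},
    {"source": "photo_site", "handle": "rf_photos", "phrase": "to be fair, though"},
    {"source": "wishlist", "handle": "someone_else", "city": "Porto"},
]


def link_fragments(fragments, link_keys=("handle", "phrase")):
    """Group fragments that share any value for any of the link keys."""
    parent = list(range(len(fragments)))

    def find(i):
        # Follow parent pointers to the group's representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (key, value) -> index of the first fragment carrying it
    for i, fragment in enumerate(fragments):
        for key in link_keys:
            if key not in fragment:
                continue
            token = (key, fragment[key])
            if token in first_seen:
                parent[find(i)] = find(first_seen[token])  # merge the groups
            else:
                first_seen[token] = i

    groups = defaultdict(list)
    for i, fragment in enumerate(fragments):
        groups[find(i)].append(fragment)
    return list(groups.values())


if __name__ == "__main__":
    for profile in link_fragments(FRAGMENTS):
        print(sorted(f["source"] for f in profile))
```

The photo account never reuses the forum handle, yet it lands in the same profile because of one repeated turn of phrase. Linking is transitive: a single shared feature is enough to connect everything attached to either side.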
This reframes a classic rule. “Don’t post anything sensitive” was always incomplete, because the sensitive thing is often emergent — it appears only when fragments combine. The discipline that replaces it is compartmentation: deliberately preventing your contexts from sharing linkable features. Different identities get different usernames, different writing registers, different devices and networks where it matters; metadata gets stripped before anything leaves your hands. When the state itself is the one compelling the data that later gets correlated, that is a related threat with its own playbook — When the Government Leaks Your Data.
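As one illustration of stripping metadata before a file leaves your hands, the sketch below removes every APP1 segment from a JPEG, which is where EXIF data (including GPS coordinates) and XMP are stored. It is a simplified, standard-library-only example under stated assumptions: the function name is ours, it assumes a well-formed file, and it does not touch metadata held in other segments or in other file formats. For real use, a maintained tool such as ExifTool is the safer choice.

```python
import struct


def strip_app1_segments(jpeg_bytes: bytes) -> bytes:
    """Return a copy of a JPEG with every APP1 segment (EXIF, XMP) removed."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError(f"expected a segment marker at byte {pos}")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy it untouched.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment_end = pos + 2 + length
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg_bytes[pos:segment_end]
        pos = segment_end
    out += jpeg_bytes[pos:]
    return bytes(out)


if __name__ == "__main__":
    # A tiny synthetic "JPEG": SOI, APP0, APP1 (fake EXIF), SOS, data, EOI.
    sample = (
        b"\xff\xd8"
        + b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
        + b"\xff\xe1" + struct.pack(">H", 12) + b"Exif\x00\x00GPS!"
        + b"\xff\xda" + struct.pack(">H", 4) + b"\x00\x00" + b"\x12\x34"
        + b"\xff\xd9"
    )
    cleaned = strip_app1_segments(sample)
    print(len(sample), "->", len(cleaned), "bytes; EXIF present:", b"Exif" in cleaned)
```

Whatever tool you use, check the output rather than trusting it: open the cleaned file in a metadata viewer and confirm the location fields are gone.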
Assumption 2 — You Expose More Than You Post
Inference is the second broken assumption: a model can deduce facts you never disclosed — your likely location, employer, health status, relationships, or sexual orientation — from patterns in what you did post. The old mental model was a ledger: your exposure equalled the sum of what you typed. Inference turns that ledger into a surface, where the negative space speaks too.
The mechanism is ordinary machine learning. Given enough examples, a model learns that people who write a certain way, follow certain accounts, and post at certain hours tend to share traits — and it applies that pattern to you. You did not state your city; your photo backdrops, your “good morning” timestamps, and the local slang you reuse imply it. This is why aggressive deletion can feel productive and change little: removing a single post rarely removes the pattern that lets the inference stand.
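The timestamp example is simple enough to sketch without any machine learning at all. The toy function below guesses a poster's UTC offset from nothing but the hours at which they post, by finding the quietest stretch of the day and treating it as sleep. The function name and both parameters are assumptions made for illustration (seven hours of sleep starting around midnight local time), and a real model would combine many weaker signals rather than rely on one crude rule.

```python
from collections import Counter
from datetime import datetime, timezone


def guess_utc_offset(post_times_utc, sleep_hours=7, assumed_local_sleep_start=0):
    """Guess a UTC offset (in whole hours) from the quietest posting window.

    Expects timezone-aware datetimes. Assumes the quietest `sleep_hours`-long
    stretch is sleep that begins at `assumed_local_sleep_start` local time.
    """
    per_hour = Counter(t.astimezone(timezone.utc).hour for t in post_times_utc)

    def posts_in_window(start_hour):
        return sum(per_hour[(start_hour + k) % 24] for k in range(sleep_hours))

    # First UTC hour of the quietest window (ties resolve to the earliest hour).
    quiet_start_utc = min(range(24), key=posts_in_window)
    # local = utc + offset, so offset = local - utc, folded into -12..+11.
    return (assumed_local_sleep_start - quiet_start_utc + 12) % 24 - 12


if __name__ == "__main__":
    # Made-up week of posts at 05:15 through 21:15 UTC every day.
    posts = [
        datetime(2026, 1, day, hour, 15, tzinfo=timezone.utc)
        for day in range(1, 8)
        for hour in range(5, 22)
    ]
    print("guessed UTC offset:", guess_utc_offset(posts))
```

Note what deleting one post does here: nothing. The guess depends on the shape of the whole histogram, which is why removing individual items rarely removes the inference.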
The lever is to manage the signal, not just the statement. Vary or blur the patterns an adversary would mine — posting times, location backdrops, the linguistic fingerprint that ties two identities together — and treat any data that reveals relationships and location as the highest-value target, because those are what inference compounds fastest. For most people the realistic goal is not to defeat inference but to raise its error rate enough that you are no longer the cheapest profile to build. A fuller dissection of how these inference chains run end to end is coming in a companion piece in this series.
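One way to blur posting times is to decouple when you write from when you publish. The sketch below picks a random publication time within a window after a post is ready; the function name and the 24-hour default are assumptions for illustration, not a recommendation tuned to any particular threat.

```python
import random
from datetime import datetime, timedelta, timezone


def jittered_publish_time(ready_at, max_delay_hours=24.0, rng=None):
    """Return a publish time drawn uniformly from the window after `ready_at`."""
    rng = rng or random.Random()
    delay_seconds = rng.uniform(0, max_delay_hours * 3600)
    return ready_at + timedelta(seconds=delay_seconds)


if __name__ == "__main__":
    ready = datetime(2026, 1, 1, 8, 0, tzinfo=timezone.utc)
    for _ in range(3):
        print(jittered_publish_time(ready).isoformat())
```

A window as wide as a full day flattens the daily rhythm that reveals a time zone, while a delay of an hour or two only smears it. Jitter also does nothing for the other signals named above, such as photo backdrops or local slang.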
Assumption 3 — Deletion No Longer Reaches the Data
Permanence is the third assumption AI breaks. Once your public text or image has been absorbed into a model’s training data, deleting the original does not remove what the model has already learned — there is no “delete” that reaches inside trained weights. The old promise was reversibility: a mistake could be unpublished. Against a model, publication is closer to a one-way door.
Public posts, captions, and images are collected into web-scale datasets and used to train language and image models. The best known is Common Crawl, an archive of the public web that many major labs draw on for training. The research field of machine unlearning, which tries to make a trained model forget specific data, treats the problem as genuinely hard and unsolved at scale; the only reliable remedy is retraining without the data, which owners almost never do for one person. And ingestion is not a harmless blur: security researchers have demonstrated that fragments of training data can be extracted back out of large models (Carlini et al., 2021).
This is the dimension where the AI age meets the older problem of the permanent web most directly, so rather than repeat that material here, we hand off. The full audit playbook for what survives deletion (backups, brokers, archives, and the training corpora) lives in How Permanent Is Your Social Media Footprint? The threat-model consequence to carry forward is blunt: timing beats cleanup. Because ingestion is continuous, the only fully effective control is not publishing the sensitive thing in the first place. Every defense applied afterward is partial, and that single fact should reorder your priorities away from deletion tools and toward what you release at all.
Assumption 4 — Your Voice and Face Are Now Credentials
Synthetic identity is the fourth broken assumption: with a small sample of your voice, face, or writing, a model can generate convincing forgeries — and the same biometric features you treat as proof of “you” become raw material for impersonating you. The old assumption was that forging your identity required your participation or your secrets. It now requires only your published media.
A few seconds of clear audio is enough for voice cloning; a handful of photos is enough for a synthetic likeness; a corpus of your posts is enough to mimic your writing. This collapses a quiet protection most people relied on without noticing — that a familiar voice or face was self-authenticating. It is also not an evenly distributed risk. Impersonation, fabricated intimate imagery, and voice-based fraud fall disproportionately on women and on anyone with a motivated harasser, which makes this dimension a matter of bodily and reputational sovereignty, not merely data hygiene.
Two levers apply. The first is minimisation: limit the volume and clarity of raw biometric samples you publish (fewer high-fidelity voice clips, fewer face-forward photos tied to your legal name), accepting that this is mitigation, not a cure. The second is pre-registered trust: agree in advance, and out of band (over a separate channel an attacker cannot intercept), on a verification step with the people who matter. A shared word, a callback number, or a second channel all work, provided the check is mandatory, so that a cloned voice on the phone cannot manufacture enough urgency to skip it. A dedicated treatment of voice- and face-as-credential, with the family-verification protocol in full, is coming in this series.
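For pairs who both have a device to hand, the shared-word idea can be extended into a challenge and response, so that the spoken answer changes on every call and an overheard code cannot be replayed. This is a hedged sketch of that variation, not the family-verification protocol mentioned above: the function names, the six-digit format, and the example secret are all assumptions, and a plain agreed word remains the low-tech equivalent.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> str:
    """Random six-digit challenge, read aloud by the person receiving the call."""
    return f"{secrets.randbelow(10**6):06d}"


def response_code(shared_secret: bytes, challenge: str) -> str:
    """Six-digit answer derived from the pre-agreed secret and the challenge."""
    digest = hmac.new(shared_secret, challenge.encode("utf-8"), hashlib.sha256).digest()
    return f"{int.from_bytes(digest[:4], 'big') % 10**6:06d}"


def verify(shared_secret: bytes, challenge: str, spoken_code: str) -> bool:
    """Check the caller's answer using a constant-time comparison."""
    expected = response_code(shared_secret, challenge)
    return hmac.compare_digest(expected.encode("utf-8"), spoken_code.encode("utf-8"))


if __name__ == "__main__":
    secret = b"agreed in person, never sent online"  # hypothetical shared secret
    challenge = new_challenge()
    answer = response_code(secret, challenge)  # computed on the caller's device
    print("challenge:", challenge, "answer:", answer)
    print("verified:", verify(secret, challenge, answer))
```

The design point is the same as for the shared word: the secret is exchanged once, over a channel the attacker cannot reach, and a voice alone is never enough to pass the check.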
Rebuilding the Model — A Four-Dimension Checklist
Rebuilding your threat model for the AI age means re-asking the four classic OPSEC questions against a machine adversary, then re-deciding where your effort changes real exposure. You do not need to defend every dimension equally; you need to find which one is your weakest link and start there. In working through this frame ourselves, the dimension we see underestimated most is inference — people guard what they say and forget that the patterns around it speak just as loudly.
Walk your own situation through the four dimensions — in rough priority order for someone without a specific adversary:
| Dimension | What the machine does | Your lever | Where to start |
|---|---|---|---|
| Permanence | Retains what you publish inside model weights | Publish less; treat the public version as undeletable | First — it is irreversible |
| Synthetic identity | Forges voice and face from small samples | Minimise raw samples; pre-register out-of-band verification | First — high personal harm |
| Correlation | Joins scattered fragments into one profile cheaply | Compartment: separate usernames, devices, stripped metadata | Next |
| Inference | Deduces unposted facts from your patterns | Manage the signal: blur location, routine, relationship cues | Ongoing |
Order them against your own life, not this table. The ranking is a default for someone without a specific adversary; if your weakest link sits lower in the table, start there.
A note on what not to over-invest in: regulation. The EU AI Act begins applying most of its provisions on 2 August 2026, but its most demanding obligations for high-risk systems were pushed back — under the May 2026 “Digital Omnibus” agreement — to December 2027 and August 2028. Data-protection regulators are engaging seriously; the European Data Protection Board’s Opinion 28/2024, adopted 18 December 2024, set out how GDPR principles apply to AI models, including when a model can be considered anonymous and what unlawfully-trained models risk. This is a live frontier worth tracking — and a poor thing to rely on. Your threat model has to hold in the years before the law catches up, which is exactly why it has to be yours.
“Privacy is necessary for an open society in the electronic age. … We cannot expect governments, corporations, or other large, faceless organizations to grant us privacy out of their beneficence.” — Eric Hughes, A Cypherpunk’s Manifesto, 1993
That sentence was written about cryptography and email. It reads now as a description of the machine adversary: the tools changed, the principle did not. You build the model because no one builds it for you. Then you spend your effort where it moves your real exposure — and you keep the rest of the Privacy pillar close, because each dimension here has its own deeper map.
Bottom Line — Which Dimension Is Your Weakest Link?
The right level of AI-age OPSEC is the one that matches your threat model. Which dimension is your weakest link depends on who you are protecting yourself from.
- If you are a general user with no specific adversary: the highest-leverage moves are permanence and synthetic identity. Pause before you publish anything you would not want kept permanently, and cut your most identifiable raw voice and face samples. Skip the rest until you have a reason.
- If you maintain separate identities — a pseudonymous creator, an activist, anyone whose contexts must not connect: correlation is your front line. Compartment ruthlessly; one reused username can undo everything else.
- If you carry asymmetric risk — women facing harassment, survivors, public-facing professionals: prioritise synthetic identity and inference, and treat the out-of-band verification protocol as non-optional.
Across all four, one truth carries over from the human-adversary era: you cannot reliably delete your way to safety after the fact. You can only model the adversary you actually have, decide deliberately, and publish less of what you would not want a machine to keep.
Frequently Asked Questions
What is AI-age OPSEC? AI-age OPSEC is operational security rebuilt for a machine adversary. Classic OPSEC modelled a human investigator with finite time; AI-age OPSEC models a system that correlates data at scale, infers what you never posted, retains what you publish inside model weights, and can synthesize your voice and face. In practice it means re-running the standard threat-model questions (what you protect, who wants it, what they can do, and what it costs you to stop them) against those four capabilities.
Can AI really deanonymize me from “anonymous” data? Often, yes. Anonymity by omission — leaving your name off a post — is weak against inference and correlation, because a model can re-identify you from patterns and from joins across separate datasets. Strong unlinkability comes from compartmentation (separate usernames, devices, networks, and stripped metadata), not from simply withholding your name.
Does opting out of AI training actually help? Partly, and mostly going forward. Opt-outs and “do not train” signals can reduce future ingestion where platforms honour them, but they do not reach data already absorbed into trained models, and machine unlearning remains unsolved at scale. Treat opt-out as one prevention control among several, not as a delete button.
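If you publish on a site you control, the most common "do not train" signal is a robots.txt rule per crawler token. The sketch below builds such a file and checks it with Python's standard `urllib.robotparser`. The helper names and the example.org URL are hypothetical, and the token list is simply the four named earlier in this article. Compliance is voluntary on the crawler's side, so this is a request, not an enforcement mechanism.

```python
from urllib.robotparser import RobotFileParser

# Tokens named earlier in this article; add others you want to refuse.
AI_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def build_robots_txt(tokens=AI_TOKENS) -> str:
    """Disallow the whole site for each listed token, allow everyone else."""
    blocks = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    blocks.append("User-agent: *\nAllow: /")
    return "\n\n".join(blocks) + "\n"


def is_allowed(robots_txt: str, agent_token: str, url: str) -> bool:
    """Ask the standard-library parser whether `agent_token` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    robots = build_robots_txt()
    print(robots)
    for agent in ("GPTBot", "Google-Extended", "SomeOtherBot"):
        print(agent, is_allowed(robots, agent, "https://example.org/post"))
```

As the answer above says, a rule like this only affects future crawls by crawlers that honour it. It does nothing about copies already collected.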
Will the EU AI Act protect me as an individual? Not soon, and not as a substitute for your own threat model. Most of the Act’s provisions apply from August 2026, but its strictest high-risk obligations were deferred to December 2027 and August 2028 under the May 2026 Digital Omnibus agreement. Regulation is a slow, uneven backstop; the controls in this article are what you hold in the meantime.
| # | Source | URL | Archive |
|---|---|---|---|
| 1 | Pew Research Center — “What the data says about Americans’ views of AI” (Mar 2026) | https://www.pewresearch.org/short-reads/2026/03/12/key-findings-about-how-americans-view-artificial-intelligence/ | https://web.archive.org/web/*/https://www.pewresearch.org/short-reads/2026/03/12/key-findings-about-how-americans-view-artificial-intelligence/ |
| 2 | Pew Research Center — “How Americans View Data Privacy” (Oct 2023) | https://www.pewresearch.org/internet/2023/10/18/how-americans-view-data-privacy/ | https://web.archive.org/web/*/https://www.pewresearch.org/internet/2023/10/18/how-americans-view-data-privacy/ |
| 3 | EU Artificial Intelligence Act — Implementation Timeline | https://artificialintelligenceact.eu/implementation-timeline/ | https://web.archive.org/web/*/https://artificialintelligenceact.eu/implementation-timeline/ |
| 4 | EDPB — Opinion 28/2024 on AI models and GDPR (18 Dec 2024) | https://www.edpb.europa.eu/news/news/2024/edpb-opinion-ai-models-gdpr-principles-support-responsible-ai_en | https://web.archive.org/web/*/https://www.edpb.europa.eu/news/news/2024/edpb-opinion-ai-models-gdpr-principles-support-responsible-ai_en |
| 5 | Carlini et al. — “Extracting Training Data from Large Language Models” (USENIX Security 2021) | https://arxiv.org/abs/2012.07805 | https://web.archive.org/web/*/https://arxiv.org/abs/2012.07805 |