
A note on funding: CypherpunkGuide carries no surveillance advertising — no ad networks, tracking pixels, or sponsored content. It is funded by transparent streams: reader donations now; subscription and editorially-aligned affiliate later. We answer to our readers, not to advertisers.
I type into a chat box the way I’d speak near a microphone whose cable I can’t see: I assume it is recording, because the cheapest safe assumption is the one that turns out to be true. Most people do the opposite. An assistant answers in a calm, conversational voice, the window feels private, and so we tell it the things we would tell a doctor or a lawyer — the medical worry, the draft resignation letter, the half-formed plan, the password we shouldn’t be pasting. The interface is built to feel like a conversation. The back end is built to keep a record.
Those are different things, and the gap between them is where the privacy problem lives. Pew Research found in March 2026 that 50% of U.S. adults are more concerned than excited about the spread of AI — up from 37% in 2021 — and an earlier survey found a large majority expecting their personal information to be used in ways they would find uncomfortable. And yet the same tools collect, by design, a more intimate stream of disclosure than search ever did. A search query is a few keywords; a chat is a confession with follow-up questions. What happens to that record afterward is governed not by the friendly tone of the reply but by each provider’s data policy, its retention schedule, its human-review pipeline, and — increasingly — by court orders the provider does not control.
So the question worth answering is not “is my AI assistant private?” — that framing invites a yes-or-no marketing answer. The useful question is: what, specifically, does each assistant keep; what does turning off “training” actually stop; and what survives every setting you can reach? Below is that audit, provider by provider, as of mid-2026 — with the honest caveat that these policies change often, so the dated claims here are a starting point to verify against each company’s current page, not a substitute for it.
| What the interface suggests | What the system actually does |
|---|---|
| A private, in-the-moment conversation | A logged record tied to your account |
| “Delete” makes it gone | “Delete” hides it from you; copies may persist |
| Turning off training protects you | Training is one use; retention, review, and disclosure are others |
| The reply is just for me | Samples may be read by humans to “improve the model” |
Why a Chat Is a Record, Not a Conversation
An AI chat is a data record subject to at least five separate uses (model training, human review, retention, legal disclosure, and biometric processing of voice or images), and a privacy setting usually governs only the first of them. Treating “opt out of training” as “make it private” is the central mistake, because the other four uses run on their own rules, and the most damaging ones are the ones no toggle touches. Naming the five uses separately is what turns a vague unease into a checklist you can actually audit.
The first use is training: your conversations become material that shapes future versions of the model. The second is human review: to measure quality and catch abuse, providers let trained staff or contractors read a sample of real conversations — a practice every major lab discloses somewhere in its policy. The third is retention: even after you delete a chat, copies commonly persist in abuse-monitoring systems, backups, and legal holds for a defined window, or longer if a conversation is flagged. The fourth is disclosure: a record that exists can be subpoenaed, produced in litigation, or handed over under a preservation order — none of which you control. The fifth, growing fastest, is biometric processing: voice input and uploaded images carry data — a voiceprint, a face — that is durable in a way text is not.
Hold those five apart and the rest of this article is just filling in a grid: for each assistant, which uses are on by default, which you can switch off, and which you cannot. The regulatory backdrop is shifting too — the EU’s AI Act began applying transparency duties to general-purpose models in August 2025, with its broader provisions reaching full application on 2 August 2026 — but regulation moves slowly and unevenly, so the practical defense is still to know the grid and act on it yourself.
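One way to make the grid concrete is as a small data structure. The sketch below is illustrative only: the `Use`, `Control`, and `AssistantPolicy` names are hypothetical, and the `ExampleAssistant` row holds placeholder values rather than any real provider's policy.

```python
from dataclasses import dataclass
from enum import Enum


class Use(Enum):
    """The five uses of a chat record named in the text."""
    TRAINING = "training"
    HUMAN_REVIEW = "human review"
    RETENTION = "retention"
    DISCLOSURE = "disclosure"
    BIOMETRIC = "biometric processing"


class Control(Enum):
    """How much say the user has over one use (hypothetical categories)."""
    OFF_BY_DEFAULT = "off by default"
    USER_CAN_DISABLE = "on by default, user can switch off"
    NO_USER_CONTROL = "no user-facing control"


@dataclass
class AssistantPolicy:
    name: str
    grid: dict  # maps each Use to a Control

    def beyond_reach(self):
        """Return the uses that no setting lets the user switch off."""
        return [use for use in Use if self.grid[use] is Control.NO_USER_CONTROL]


# Placeholder row: values are made up to show the shape of the audit,
# not taken from any provider's actual policy.
example = AssistantPolicy(
    name="ExampleAssistant",
    grid={
        Use.TRAINING: Control.USER_CAN_DISABLE,
        Use.HUMAN_REVIEW: Control.NO_USER_CONTROL,
        Use.RETENTION: Control.NO_USER_CONTROL,
        Use.DISCLOSURE: Control.NO_USER_CONTROL,
        Use.BIOMETRIC: Control.OFF_BY_DEFAULT,
    },
)

for use in example.beyond_reach():
    print(f"{example.name}: {use.value} has no user-facing control")
```

Filling in one such row per assistant from its current policy page turns the audit into a question you can ask of the data: which uses remain once every available toggle is off.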
What Each Assistant Actually Keeps
As of mid-2026 the major consumer assistants share a default most users miss — they train on your chats unless you opt out, with regional exceptions such as the EU and UK — and all of them retain some data after deletion and reserve a path for human review. The table below is the cross-platform summary; the paragraphs after it carry the nuance, because a one-word cell (“Yes”) hides conditions that matter. Read the cell, then read the caveat.
| Assistant | Trains on chats by default (consumer) | Opt-out path | Notably kept after you delete |
|---|---|---|---|
| ChatGPT (OpenAI) | Yes | Settings → Data Controls | Abuse-monitoring copies; data under legal hold |
| Claude (Anthropic) | Yes, unless you opt out (since Sep 2025) | Privacy / data settings | Safety-flagged content; data under legal hold |
| Gemini (Google) | Yes (via “Gemini Apps Activity”) | Turn Activity off | Human-reviewed samples, kept up to 3 years |
| Copilot (Microsoft) | Yes, unless opted out (EU/UK: off by default) | Settings toggle | ~18-month rolling window |
| Meta AI | Yes (your AI chats) | Limited; none outside EU/UK | Content used for ads/personalization |
ChatGPT (OpenAI). For free, Plus, and Pro accounts, conversations are used to improve the models by default; you turn this off under Settings → Data Controls. OpenAI’s own help pages describe abuse-monitoring retention that persists for a window after deletion, and the company has stated that Team, Enterprise, and API customers are not trained on by default. The larger lesson of 2025 sits outside the settings page entirely: in the New York Times litigation, OpenAI was placed under a court order to preserve output data — including content users believed they had deleted. A toggle you control is no match for a hold you don’t.
Claude (Anthropic). Since a consumer-terms change that took effect in late September 2025, Anthropic trains on Free, Pro, and Max chats unless you opt out through your data settings — the same opt-out posture as ChatGPT, not the opt-in many users still assume. Two caveats compound it: conversations flagged for safety or policy review can be used and retained even after you opt out, and Anthropic does not publish what triggers a flag. I write this as an author whose own words are produced with Claude, which is exactly why I will not soften it: read the current privacy page rather than trust any summary, including this one.
Gemini (Google). Google ties training to your Gemini Apps Activity setting; with it on, conversations can be used to improve services, and turning it off stops that — at the cost of your chat history. The detail most people miss is review retention: a sample of conversations selected for human review is disconnected from your account but, per Google’s help pages, can be kept for an extended period — up to three years — regardless of your normal auto-delete window. Workspace (work/school) accounts are governed by different, generally stricter terms.
Copilot (Microsoft). Microsoft’s privacy FAQ states that, except for certain user categories or those who have opted out, it uses interactions across Bing, MSN, and Copilot for AI training — so the consumer default is not hands-off, though users in the EU, UK, and Switzerland are excluded by default. Consumer history runs on a default 18-month window you can clear. Microsoft 365 Copilot inside a work or school account is a different product: prompts and responses are treated as organizational data under enterprise terms and are not used to train the foundation models. The free consumer tier and the work tier are not the same privacy regime, even when the icon looks identical.
Meta AI. Meta uses interactions with its AI — your chats with the assistant, plus public posts — to train and, per its late-2025 update, to personalize ads and feeds. Meta has stated it does not use the content of private messages with friends and family for this — but your conversations with the Meta AI assistant are in scope, and outside the EU and UK there is effectively no opt-out from the ad use, only an exclusion for sensitive-topic categories like health and politics. Of the five, this is the one where the line between “assistant” and “ad platform” is thinnest.
The Opt-Out Playbook and Its Limits
The single highest-value action is to find each assistant’s training control and set it the way you want before your next sensitive chat — but treat opt-out as reducing one stream of exposure, not as making the conversation private. Do it anyway: shrinking the training surface is real and worth the two minutes. Just don’t mistake a quieter pipe for a closed one. Here is the path on each, as of mid-2026, with the standing reminder to verify against the live settings page.
- ChatGPT: go to Settings → Data Controls and turn off the option to improve the model. For one-off sensitive questions, use a Temporary Chat, which is excluded from training and auto-deletes, though abuse-monitoring retention can still apply for a window.
- Claude: open your privacy/data settings and turn the training control off; since late September 2025 the consumer default is opt-out, not opt-in, so the choice is yours to make rather than assume.
- Gemini: turn off Gemini Apps Activity to stop training use; understand that this also costs you your ongoing chat history, and that previously selected human-review samples remain on their own retention clock.
- Copilot: turn off the model-improvement and personalization toggles in settings (outside the EU/UK the consumer default includes your interactions in training); for genuinely sensitive work, a managed Microsoft 365 (work) account is treated more protectively than the free consumer app.
- Meta AI: apply whatever regional opt-out and ad-settings controls are available to you, and operate from the assumption that this assistant is the most ad-integrated of the set.
Two cross-cutting habits beat any single toggle. First, don’t paste what you can’t afford to have kept — secrets, full identity documents, another person’s private data — because the durable defense is upstream of the model, at the point of input. Second, separate accounts by purpose, so a work question and a medical worry don’t accumulate against one profile. For the broader discipline this sits inside, the EFF’s Surveillance Self-Defense is a level-headed reference.
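The first habit can be partly mechanized. The sketch below is a minimal pre-paste check, not a complete secret scanner: `PATTERNS` and `redact` are hypothetical names, and the four regexes are illustrative assumptions that will miss many secrets and flag some harmless text.

```python
import re

# Illustrative patterns only. A real scanner needs far more coverage, and
# regexes like these produce both false positives and false negatives.
PATTERNS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "private key block": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "possible card number": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "US SSN-style number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a labeled placeholder and report what was found.

    Returns the redacted text and a list of (label, count) pairs.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED {label}]", text)
        if count:
            findings.append((label, count))
    return text, findings


draft = "Mail me at jane@example.org, card 4111 1111 1111 1111."
safe_text, found = redact(draft)
print(safe_text)
print(found)
```

Running the check locally, before anything reaches the chat box, keeps the defense at the point of input. An empty findings list means only that these few patterns did not match, not that the text is safe to paste.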
What Opt-Out Cannot Reach
Opting out of training stops future model improvement from using your data; it does not delete the past, undo retention, prevent a breach, or block a subpoena — and those are the exposures with the highest consequences. This is the section the marketing pages skip, and it is the one that should shape what you type. Four things sit beyond the reach of every toggle on every platform, and naming them is the point of the whole audit.
The first is the past. Anything already used to train a deployed model cannot be pulled back out of it; opt-out is prospective, never retroactive. The second is retention after deletion — abuse-monitoring stores, backups, and legal holds keep copies on schedules you don’t set, and a conversation flagged for safety can persist far longer than your normal history. The third is disclosure: a record that exists is discoverable, and as the 2025 preservation order against OpenAI showed, “I deleted it” is not a status a court is obliged to honor. The fourth is a breach — the strongest internal policy in the world is only as good as the security around the database, and a stored confession is a stored liability for whoever holds it.
This is where regulation enters, and where to keep expectations sober. The EU AI Act’s transparency obligations for general-purpose AI — documentation, a summary of training data, copyright duties — began in August 2025 and sit within its full application on 2 August 2026, which is real progress on disclosure about systems. It is not, however, a delete button for your data, and open-source models carry lighter obligations. Rules raise the floor over time; they do not retroactively unsay what you already typed. The working conclusion is unglamorous and durable: the only data that can’t leak, be subpoenaed, or be retained past its welcome is the data you never put into the box.
The Uneven Risk: Voice, Likeness, and Who Pays Most
Voice and image inputs raise the stakes because they carry biometric data — a voiceprint, facial geometry — that is durable and uniquely yours, and the harms from its capture fall hardest on women and other already-targeted people. Text can be rewritten; a leaked voiceprint cannot be reissued like a password. As assistants add voice modes and image understanding, the audit has to extend past words to the biometric layer, because that is where the worst-case outcomes now concentrate.
The mechanism is simple and well-documented. As little as 10 to 15 seconds of someone’s voice — OpenAI said its own Voice Engine needed just 15 — is now enough to drive convincing synthetic speech, which is why the FCC ruled in February 2024 that AI-generated voices in robocalls are “artificial” under the Telephone Consumer Protection Act and require prior consent. Feed an assistant your voice routinely and you are normalizing the capture of the exact material that impersonation needs. Uploaded photos extend the same logic to faces. None of this requires the provider to act in bad faith; it only requires the data to exist and, someday, to leak or be misused.
And the burden is not evenly shared. Impersonation, fabricated intimate imagery, and the harassment-to-doxxing pipeline land disproportionately on women, on public-facing professionals, and on activists — the same asymmetry I traced in the case for treating likeness as a credential and in defending against coordinated doxxing. For anyone carrying that risk, the practical rule is stricter than for the general user: keep voice and face out of consumer assistants you don’t have to use, prefer text, and treat any biometric input as effectively permanent once it leaves your device.
Bottom Line: How Much Should You Lock Down?
The right defense scales with your threat model, not with one master setting: casual users need three habits, professionals need separated accounts and stricter tiers, and anyone carrying asymmetric risk should treat voice and biometric input as permanently off-limits on consumer assistants. There is no universal answer, only the question of who you are protecting yourself from and a few habits that pay off at every level.
- If you have no specific adversary: turn off training on the assistants you use, prefer a temporary/incognito chat mode for sensitive one-offs, and keep secrets and identity documents out of the box entirely. That covers most of the realistic risk for most people.
- If you handle others’ data or sensitive work: use a managed work account where the terms are stricter, separate accounts by purpose, and assume anything you type could later be retained, reviewed, or disclosed regardless of your settings.
- If you carry asymmetric risk — women facing harassment, activists, public-facing professionals: keep voice and face out of consumer assistants, minimize what you disclose to any of them, and treat biometric input as permanent.
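The three tiers above can be written down as a lookup. This is a sketch only: the tier keys (`"casual"`, `"professional"`, `"asymmetric"`) and the `checklist` helper are hypothetical labels, and merging tiers for someone who fits more than one is an assumption, not something the tiers themselves prescribe.

```python
# Habits per tier, paraphrased from the three bullets above.
HABITS = {
    "casual": [
        "turn off training on the assistants you use",
        "prefer a temporary/incognito chat mode for sensitive one-offs",
        "keep secrets and identity documents out of the box entirely",
    ],
    "professional": [
        "use a managed work account where the terms are stricter",
        "separate accounts by purpose",
        "assume anything typed could later be retained, reviewed, or disclosed",
    ],
    "asymmetric": [
        "keep voice and face out of consumer assistants",
        "minimize what you disclose to any assistant",
        "treat biometric input as permanent",
    ],
}


def checklist(*tiers):
    """Merge the habits for every tier that applies, in order, without duplicates."""
    merged = []
    for tier in tiers:
        for habit in HABITS[tier]:
            if habit not in merged:
                merged.append(habit)
    return merged


for habit in checklist("casual", "asymmetric"):
    print("-", habit)
```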
Underneath all three is one principle that no policy change will overturn: an opt-out shapes how your data is used, but only restraint at the keyboard governs whether the data exists. Audit the settings, by all means — then write as if the record outlives the setting, because it does.
Frequently Asked Questions
Does ChatGPT use my conversations to train its models?
By default, yes, on free, Plus, and Pro accounts, as of mid-2026 — you can turn this off under Settings → Data Controls. Turning it off stops future training use but does not delete past data already used, and OpenAI’s own pages describe abuse-monitoring copies that persist for a window after deletion. Business tiers (Team, Enterprise, API) are not trained on by default. Because these policies change, confirm the current setting on OpenAI’s data controls page rather than relying on any summary.
If I delete an AI chat, is it really gone?
Usually not entirely. Deletion removes the conversation from your visible history, but copies commonly remain in abuse-monitoring systems, backups, and any legal hold, each on a retention schedule you do not control. A conversation flagged for safety review can persist longer still. The 2025 preservation order in the New York Times case against OpenAI is the clearest illustration that “deleted” is not always permanent when a court is involved.
Which AI assistant is the most private by default?
As of mid-2026 the honest answer is “none of them, by default” — ChatGPT, Claude (since its September 2025 consumer-terms change), Gemini, Copilot, and Meta AI all train on consumer chats unless you opt out, with regional carve-outs such as the EU and UK. The meaningful differences are in the details you have to act on: where the opt-out lives, how long human-reviewed samples are kept, and what a safety flag can override. Every provider also retains some data after deletion and reserves a path for human review and legal disclosure, so “private by default” is the wrong thing to shop for — opting out and disclosing less is the control you actually hold.
Is it safe to use voice mode or upload photos to an AI assistant?
Treat it as higher-risk than text. Voice and images carry biometric data — a voiceprint, facial geometry — that is durable and uniquely identifying, and seconds of audio can be enough to drive a convincing voice clone. The data existing is the risk, independent of provider intent. If you carry elevated risk of impersonation or harassment, prefer text, and keep voice and face out of consumer assistants you are not required to use.
Will the EU AI Act make AI assistants private?
No — it improves transparency, not personal data deletion. Its transparency obligations for general-purpose AI — technical documentation, a summary of training data — began in August 2025 and sit within the Act’s full application on 2 August 2026, which helps you understand the systems. It does not retroactively remove data you already submitted, and open-source models carry lighter duties. Regulation raises the floor over time; it is not a substitute for restraint about what you type.
| # | Source | URL | Archived |
|---|---|---|---|
| 1 | OpenAI — Data Controls FAQ | https://help.openai.com/en/articles/7730893-data-controls-faq | https://web.archive.org/web/*/https://help.openai.com/en/articles/7730893-data-controls-faq |
| 2 | Anthropic — Privacy Policy | https://www.anthropic.com/legal/privacy | https://web.archive.org/web/*/https://www.anthropic.com/legal/privacy |
| 3 | Anthropic — Updates to Consumer Terms and Privacy Policy (2025) | https://www.anthropic.com/news/updates-to-our-consumer-terms | https://web.archive.org/web/*/https://www.anthropic.com/news/updates-to-our-consumer-terms |
| 4 | Google — Gemini Apps & your data | https://support.google.com/gemini/answer/13594961 | https://web.archive.org/web/*/https://support.google.com/gemini/answer/13594961 |
| 5 | Microsoft — Copilot privacy FAQ | https://support.microsoft.com/en-us/microsoft-copilot/privacy-faq-for-microsoft-copilot | https://web.archive.org/web/*/https://support.microsoft.com/en-us/microsoft-copilot/privacy-faq-for-microsoft-copilot |
| 6 | Meta — Privacy Policy | https://www.facebook.com/privacy/policy/ | https://web.archive.org/web/*/https://www.facebook.com/privacy/policy/ |
| 7 | European Commission — Regulatory framework on AI (AI Act) | https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai | https://web.archive.org/web/*/https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai |
| 8 | Pew Research — Key findings about how Americans view AI (March 2026) | https://www.pewresearch.org/short-reads/2026/03/12/key-findings-about-how-americans-view-artificial-intelligence/ | https://web.archive.org/web/*/https://www.pewresearch.org/short-reads/2026/03/12/key-findings-about-how-americans-view-artificial-intelligence/ |
| 9 | NYT v. OpenAI — court preservation order (reporting) | https://decrypt.co/323950/openai-challenges-court-order-user-data-nyt-lawsuit | https://web.archive.org/web/*/https://decrypt.co/323950/openai-challenges-court-order-user-data-nyt-lawsuit |
| 10 | FCC — AI-Generated Voices in Robocalls Are Illegal (2024) | https://www.fcc.gov/document/fcc-makes-ai-generated-voices-robocalls-illegal | https://web.archive.org/web/*/https://www.fcc.gov/document/fcc-makes-ai-generated-voices-robocalls-illegal |
| 11 | Electronic Frontier Foundation — Surveillance Self-Defense | https://ssd.eff.org/ | https://web.archive.org/web/*/https://ssd.eff.org/ |
This audit is one half of a larger map. The threat model that makes AI a first-class adversary — and the assumptions it breaks — is laid out in OPSEC in the AI Age: Rebuilding Your Threat Model, and the deep-dive on how models infer identity from what you publish is AI Deanonymization: How Inference Undoes Your Anonymity. When the record is taken from an institution rather than typed by you, the companion playbook is When the Government Leaks Your Data; and for the biometric stakes raised above, see Your Face and Voice Are Now Credentials.


