Frequently Asked Questions
Everything you need to know about Privacy AI — privacy, models, features, and pricing.
What Makes Privacy AI Unique
Privacy AI runs GGUF models via llama.cpp and MLX models via Apple's MLX framework entirely on your iPhone, iPad, or Mac: no server, no account, no internet connection required. The ChatGPT, Claude, Gemini, and Perplexity apps all require a server connection for every message; Privacy AI does not. Once you download a model, it works indefinitely with no network access.
Privacy AI is one of the few iPhone and iPad apps with full MCP (Model Context Protocol) support, including a built-in marketplace. MCP lets the AI connect to external tools and data sources (files, calendars, databases, web search, custom APIs) directly from your iPhone. You can browse and install MCP servers from the in-app marketplace without any manual configuration. No other iOS AI assistant ships an MCP marketplace.
Privacy AI runs llama.cpp natively on iPhone and iPad, supporting any GGUF model you download from HuggingFace. This includes LLaMA 3, Mistral, Qwen, Gemma, Phi, DeepSeek, and hundreds of community fine-tunes. The app handles quantization selection, context length configuration, and iCloud-based model sync, so the same model file works across your iPhone, iPad, and Mac. Very few iOS apps run llama.cpp natively; Privacy AI is the most full-featured of them, pairing local models with cloud provider integration.
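As a rough illustration of what the app automates, the sketch below fetches a GGUF file from HuggingFace's raw file endpoint with URLSession. The repository and file name are examples only, and in Privacy AI the built-in model browser performs this step for you.

```swift
import Foundation

// Illustrative only: HuggingFace serves raw model files at /<repo>/resolve/<revision>/<file>.
// The repo and file name below are examples, not an endorsement of a specific build.
let modelURL = URL(string:
    "https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf")!

let task = URLSession.shared.downloadTask(with: modelURL) { tempFile, _, error in
    guard let tempFile, error == nil else { return }
    // Move the finished download out of the temporary directory before the system cleans it up.
    let destination = URL.documentsDirectory.appendingPathComponent(modelURL.lastPathComponent)
    try? FileManager.default.moveItem(at: tempFile, to: destination)
    print("Saved \(destination.lastPathComponent)")
}
task.resume()
```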
Privacy AI supports local GGUF models, local MLX models, Apple Foundation Models (iOS 26+), and 15+ cloud providers all within the same app. You can switch between a locally running LLaMA 3 model and GPT-4o in the same conversation thread without losing context. No other iOS AI app combines on-device inference with this breadth of cloud provider integrations.
Privacy AI connects to 15+ AI providers using your own API key: OpenAI, Anthropic Claude, Google Gemini, Groq, Perplexity, Mistral, DeepSeek, xAI Grok, Kimi, MiniMax, HuggingFace, OpenRouter (which itself aggregates 100+ models), and any OpenAI-compatible self-hosted server. You pay providers directly at their published rates; Privacy AI adds no markup on tokens.
Privacy AI connects directly to self-hosted servers, including Ollama, LM Studio, vLLM, LocalAI, Jan, and any llama.cpp server, from your iPhone over your local network or VPN. This means you can run large models on your Mac or home server and use them from your phone without exposing them to the internet. Configure a custom server URL in Settings under Providers.
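To show what "OpenAI-compatible" means in practice, here is a minimal Swift sketch of the kind of request an app can send to an Ollama server on your network. The host address and model name are assumptions; port 11434 and the /v1/chat/completions path are Ollama's standard OpenAI-compatible endpoint.

```swift
import Foundation

// Sketch only: 192.168.1.20 stands in for a Mac on your local network.
// Plain-HTTP traffic to a local host needs an App Transport Security exception on iOS.
Task {
    do {
        var request = URLRequest(url: URL(string: "http://192.168.1.20:11434/v1/chat/completions")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        let payload: [String: Any] = [
            "model": "llama3",
            "messages": [["role": "user", "content": "Hello from my iPhone"]]
        ]
        request.httpBody = try JSONSerialization.data(withJSONObject: payload)
        let (data, _) = try await URLSession.shared.data(for: request)
        print(String(decoding: data, as: UTF8.self))
    } catch {
        print("Request failed:", error)
    }
}
```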
Privacy & Security
When you use local models (GGUF or MLX), Privacy AI performs all inference entirely on your device. No text, images, or conversation data ever leaves your iPhone, iPad, or Mac. There are no background analytics calls, no telemetry on your prompts, and no backend servers operated by Privacy AI that process your queries. When you choose to use a cloud provider such as OpenAI or Anthropic, your messages are sent directly from your device to that provider; Privacy AI never acts as a relay or proxy for those requests.
Privacy is enforced at the architecture level, not through a privacy policy alone. Local models run fully on-device using llama.cpp (GGUF) or MLX (Apple Silicon), which means your data never touches the internet. For cloud providers, you supply your own API key and the connection goes directly from your device to the provider — Privacy AI has no visibility into those calls. The built-in Protocol Inspector lets you see every network request the app makes in real time, so you can independently verify that no unexpected traffic occurs.
Privacy AI includes a built-in Protocol Inspector that logs every HTTP request, response, and SSE frame made by the app. You can tap any entry to see the full headers and body. For deeper verification, you can also route all traffic through mitmproxy or Charles Proxy; the app works normally under a proxy. This level of transparency is intentional: you should never have to take our word for it.
Model Support
Privacy AI supports three categories of models. First, local GGUF models via llama.cpp: you can download any GGUF model from HuggingFace directly into the app, including LLaMA 3, Mistral, Qwen, Gemma, Phi, and hundreds of others. Second, MLX models optimized for Apple Silicon, which run 20-30% faster than GGUF on M-series chips. Third, cloud providers including OpenAI (GPT-4o, o1, o3), Anthropic Claude, Google Gemini, Groq, Perplexity, Mistral, DeepSeek, xAI Grok, and more. Apple's on-device Foundation Models (iOS 26+) are also supported for low-latency local inference.
Go to Settings, add a new provider, and enter your API key from OpenAI, Anthropic, or Google. Privacy AI connects directly to the provider using the official API — no intermediary, no markup on tokens. You can switch between providers and models mid-conversation without losing your chat history. The Pro subscription is required to use cloud providers.
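For context, this is roughly what the direct connection looks like once your key is configured: a single HTTPS request to OpenAI's public Chat Completions endpoint, authorized with your own key. It is a sketch, not Privacy AI's actual implementation; the model name and key are placeholders.

```swift
import Foundation

// Sketch of a direct call to OpenAI's Chat Completions API with a user-supplied key.
// Nothing in this path passes through a Privacy AI server.
func askOpenAI(_ prompt: String, apiKey: String) async throws -> Data {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let payload: [String: Any] = [
        "model": "gpt-4o",
        "messages": [["role": "user", "content": prompt]]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: payload)
    let (data, _) = try await URLSession.shared.data(for: request)
    return data
}
```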
Any model available in GGUF format on HuggingFace can be downloaded and run locally. Privacy AI includes a built-in model browser that shows file sizes, context lengths, and quantization levels so you can pick the right model for your device. Models are stored in your iCloud Drive or local storage and can be shared across your iPhone, iPad, and Mac.
GGUF is a file format for quantized language models, used by llama.cpp to run large models efficiently on CPU and GPU. A 7B parameter model in Q4 quantization typically requires 4-5 GB of storage and can run on an iPhone 15 Pro or any M-series Mac. MLX is Apple's machine learning framework optimized for Apple Silicon (M1/M2/M3/M4 chips). MLX models take full advantage of the GPU and unified memory architecture of those chips, making them 20-30% faster than GGUF on the same hardware. Both formats work fully offline.
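A quick back-of-the-envelope calculation explains the storage figures above. Assuming roughly 4.8 bits per weight for a Q4_K_M-style quantization (an approximation, since real GGUF files add metadata and keep some tensors at higher precision), model size scales linearly with parameter count:

```swift
// Rough storage estimate for a quantized model: parameters × bits-per-weight, in bytes.
func estimatedGigabytes(parameters: Double, bitsPerWeight: Double) -> Double {
    parameters * bitsPerWeight / 8 / 1_000_000_000
}

print(estimatedGigabytes(parameters: 7e9, bitsPerWeight: 4.8))   // ≈ 4.2 GB
print(estimatedGigabytes(parameters: 70e9, bitsPerWeight: 4.8))  // ≈ 42 GB
```

The same formula puts a 70B model at roughly 42 GB at Q4, which matches the upper end of the storage figures quoted in the next section.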
Offline & Storage
Once you have downloaded a local GGUF or MLX model, Privacy AI works completely offline, with no internet connection at all. This includes chat, document processing, image analysis, voice input, speech synthesis, and all 40+ built-in tools. The only features that require internet access are cloud provider integrations (OpenAI, Claude, etc.) and model downloads from HuggingFace.
Storage requirements depend on the model size and quantization level. Small models (1B-3B parameters) typically use 1-2 GB. Mid-size models (7B-8B) use 4-6 GB at Q4 quantization. Large models (13B-70B) range from 8 GB to 40+ GB. Privacy AI shows exact file sizes before download. Models can be stored in iCloud Drive to share them across devices without re-downloading, or kept in local device storage for faster access.
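If you want to sanity-check whether a model will fit before downloading, a sketch like the one below reads the volume's available capacity. The exact check Privacy AI performs is not documented, so treat this as illustrative.

```swift
import Foundation

// Sketch: check free space before starting a multi-gigabyte download.
// The "important usage" figure includes purgeable space the system can reclaim,
// which is the value Apple recommends consulting for user-initiated downloads.
let documents = URL.documentsDirectory
if let values = try? documents.resourceValues(forKeys: [.volumeAvailableCapacityForImportantUsageKey]),
   let available = values.volumeAvailableCapacityForImportantUsage {
    print(String(format: "%.1f GB free for model downloads", Double(available) / 1_000_000_000))
}
```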
Comparison
The ChatGPT app requires an OpenAI account and an internet connection for every message, and all your prompts are processed on OpenAI's servers. Privacy AI runs models fully on your device, so no prompt data ever leaves your phone when you use local models. It also supports 15+ providers simultaneously (not just OpenAI), lets you bring your own API keys with no markup, and includes a Protocol Inspector so you can independently verify exactly what network traffic the app generates. The ChatGPT app, Claude app, and Gemini app each lock you into a single provider; Privacy AI works with all of them plus local models.
Perplexity is a cloud-only answer engine with no local model support, no API key management, and no offline capability. LM Studio is a desktop application for macOS and Windows only — it does not run on iPhone or iPad. Ollama is a server runtime that requires a separate computer to host models; it has no native iPhone or iPad app. Privacy AI runs natively on iPhone, iPad, and Mac, runs models fully on-device with no server required, and also connects to Ollama or LM Studio if you prefer that setup. It is the only option in this category that works as a standalone iPhone app with no external dependency.
Features
MCP (Model Context Protocol) is an open standard that lets AI assistants connect to external tools and data sources — for example, reading files, searching the web, querying databases, or calling APIs. Privacy AI includes full MCP support with a built-in marketplace where you can browse and install MCP servers. Once installed, those tools are automatically available to any model you are using. MCP support requires the Pro subscription.
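Under the hood, MCP messages are JSON-RPC 2.0: "tools/list" discovers available tools and "tools/call" invokes one. The snippet below shows roughly what a tool invocation looks like on the wire; the tool name and arguments are hypothetical examples, not tools bundled with the app.

```swift
// Hypothetical MCP tool invocation, shown as the JSON-RPC payload a client sends to a server.
let toolCall = """
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_files",
    "arguments": { "query": "quarterly report" }
  }
}
"""
print(toolCall)
```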
Privacy AI includes on-device speech recognition powered by Whisper (via WhisperKit), which means your voice is transcribed locally without sending audio to a server. The app also supports Natural Talk mode, a hands-free voice conversation interface. Text-to-speech is available in 53 voice styles using on-device MLX Audio or system voices. Real-time translation is supported in 256 languages, fully on-device.
Privacy AI can read and process PDFs, Word documents (.docx), EPUB ebooks, HTML pages, YouTube video transcripts, audio files, and more. Documents are converted to clean Markdown on-device and then passed to the AI model as context. OCR is available for scanned documents. You can also export conversations to Markdown, PDF, HTML, or EPUB format.
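To give a sense of what "converted on-device" involves, here is a minimal sketch of PDF text extraction using Apple's PDFKit. Privacy AI's actual pipeline is not public; this only illustrates that the step needs no network access.

```swift
import PDFKit

// Extract plain text from every page of a PDF, entirely on-device.
func plainText(fromPDF url: URL) -> String {
    guard let document = PDFDocument(url: url) else { return "" }
    return (0..<document.pageCount)
        .compactMap { document.page(at: $0)?.string }
        .joined(separator: "\n\n")
}
```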
Platform & Pricing
Privacy AI is a universal app that runs natively on iPhone (iOS 18.6+), iPad (iPadOS 18.6+), and Mac (macOS 26+). Your conversations, model settings, and preferences sync across all devices via iCloud. Local model files can also be stored in iCloud Drive and shared across devices.
Privacy AI is free to download. The free plan includes all local AI features — you can run GGUF and MLX models, use Apple Foundation Models (iOS 26+), access 40+ built-in tools, and sync everything via iCloud at no cost. The Pro subscription unlocks cloud provider integrations (OpenAI, Claude, Gemini, etc.), the MCP marketplace, and custom API provider configuration.
The Pro subscription adds cloud AI provider support (bring your own API key for OpenAI, Anthropic Claude, Google Gemini, Groq, Perplexity, Mistral, DeepSeek, xAI Grok, and 10+ more), access to the MCP marketplace for connecting external tools and data sources, and the ability to configure custom OpenAI-compatible providers including Ollama, LM Studio, and self-hosted servers. All local model features remain free.