Voice memo to task list via Whisper

A mobile app that transcribes voice memos with Whisper and extracts structured tasks, deadlines, and project labels — so thinking out loud during a walk turns into an organized to-do list.

The idea

You dictate a rambling voice memo on your commute — half brain-dump, half to-do list — and the app turns it into structured tasks by the time you sit down at your desk. It uses OpenAI Whisper for transcription, then a small extraction prompt to pull out action items, optional deadlines, and project labels. The result lands in a local task store you can export to JSON or push to a task manager of your choice.

Why build this

Voice is faster than typing for capturing ideas in motion, but the output, an unstructured audio blob, is nearly useless for task management. Converting audio to text is largely a solved problem; converting that text to structured tasks is still a manual step for most people. With whisper.cpp, Whisper is now fast enough to run on-device at usable accuracy, and extraction with a small language model is cheap and, paired with a quick review step, reliable enough. The people who would use this are already recording voice memos. They just need the extraction layer.

Stack sketch

React Native (Expo) for cross-platform iOS/Android
A React Native binding for whisper.cpp, such as whisper.rn, for on-device transcription (no audio leaves the phone)
Claude Haiku (hosted, so the transcript text leaves the phone) or a model served by Ollama on a machine you control, for task extraction via a structured-output prompt returning [{ task, deadline?, project? }]
SQLite via expo-sqlite for local task storage
Optional: n8n webhook to forward accepted tasks to Todoist, Linear, or Notion
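
A minimal TypeScript sketch of the extraction step, assuming the model is asked for a JSON array in the [{ task, deadline?, project? }] shape listed above. The names buildExtractionPrompt and parseExtractedTasks, the prompt wording, and the YYYY-MM-DD deadline format are illustrative assumptions rather than part of any SDK, and the actual call to Claude Haiku or Ollama is left out.

```typescript
// Shape from the stack sketch: [{ task, deadline?, project? }].
interface ExtractedTask {
  task: string;
  deadline?: string; // assumed format: ISO date such as "2025-03-14"
  project?: string;
}

// Hypothetical prompt builder; the wording is illustrative, not tuned.
function buildExtractionPrompt(transcript: string, today: string): string {
  return [
    "Extract action items from the voice memo transcript below.",
    "Respond with only a JSON array of objects shaped like",
    '{ "task": string, "deadline"?: "YYYY-MM-DD", "project"?: string }.',
    `Resolve relative dates against today's date, ${today}.`,
    "Return [] if the memo contains no action items.",
    "",
    "Transcript:",
    transcript,
  ].join("\n");
}

// Validate the model's reply instead of trusting it blindly.
function parseExtractedTasks(reply: string): ExtractedTask[] {
  // Models sometimes wrap JSON in prose or fences; keep the outermost array.
  const start = reply.indexOf("[");
  const end = reply.lastIndexOf("]");
  if (start === -1 || end <= start) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];

  const tasks: ExtractedTask[] = [];
  for (const item of parsed) {
    if (typeof item !== "object" || item === null) continue;
    const record = item as Record<string, unknown>;
    // A task without usable text is dropped.
    if (typeof record.task !== "string" || record.task.trim() === "") continue;

    const result: ExtractedTask = { task: record.task.trim() };
    // Deadlines that are not plain ISO dates are dropped, not guessed at.
    if (typeof record.deadline === "string" && /^\d{4}-\d{2}-\d{2}$/.test(record.deadline)) {
      result.deadline = record.deadline;
    }
    if (typeof record.project === "string" && record.project.trim() !== "") {
      result.project = record.project.trim();
    }
    tasks.push(result);
  }
  return tasks;
}
```

Returning an empty array on malformed output keeps the review screen simple: the user sees "no tasks found" and can retry, rather than the app crashing on a reply that is not valid JSON.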

Scope for v1

Record a voice memo in-app or import an existing audio file from the share sheet
Transcribe on-device with the Whisper small model (transcription works offline)
Extract tasks with a single-shot prompt; return a JSON array of task objects
Review screen: accept, edit, or discard each extracted task individually
Export accepted tasks to a plain JSON file
Out of scope: cloud sync, user accounts, native integrations with third-party task managers, background processing
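
The review and export steps above could be modeled as a small reducer plus a serializer. This is a sketch under assumptions: ReviewItem, ReviewAction, reviewReducer, and exportAcceptedTasks are hypothetical names, and the numeric id is assumed to come from wherever the tasks are stored locally.

```typescript
type ReviewStatus = "pending" | "accepted" | "discarded";

interface ReviewItem {
  id: number; // assumed: a local row id or list index
  task: string;
  deadline?: string;
  project?: string;
  status: ReviewStatus;
}

type EditableFields = Partial<Pick<ReviewItem, "task" | "deadline" | "project">>;

type ReviewAction =
  | { type: "accept"; id: number }
  | { type: "discard"; id: number }
  | { type: "edit"; id: number; changes: EditableFields };

// Pure reducer: returns a new array and leaves the input untouched,
// so it can back React's useReducer or be unit-tested on its own.
function reviewReducer(items: ReviewItem[], action: ReviewAction): ReviewItem[] {
  return items.map((item): ReviewItem => {
    if (item.id !== action.id) return item;
    switch (action.type) {
      case "accept":
        return { ...item, status: "accepted" };
      case "discard":
        return { ...item, status: "discarded" };
      case "edit":
        // Editing changes the fields but not the review status.
        return { ...item, ...action.changes };
    }
  });
}

// Serialize only accepted tasks, without ids or review status.
function exportAcceptedTasks(items: ReviewItem[]): string {
  const accepted = items
    .filter((item) => item.status === "accepted")
    .map(({ task, deadline, project }) => ({ task, deadline, project }));
  // JSON.stringify omits properties whose value is undefined.
  return JSON.stringify(accepted, null, 2);
}
```

Keeping edit separate from accept means a user can fix a misheard project name and still decide afterwards whether the task is worth keeping.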

Where it could go

The most direct expansion is native task-manager integration: push accepted tasks straight into Todoist, Apple Reminders, or Linear without a copy-paste step. Beyond v1's share-sheet file import, a dedicated share extension could run the extraction flow on recordings from the stock Voice Memos app without the user switching to a separate recorder.

A longer-term direction is passive meeting capture: run transcription on a recorded call, extract all action items with speaker attribution, and deliver each person their own list. That requires diarization (for example pyannote.audio for speaker turns, aligned against Whisper's word-level timestamps) and a more complex review UX, but the extraction core carries over largely unchanged.
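
To make the speaker-attribution step concrete, here is a sketch of one common approach: give each transcribed word to the speaker turn it overlaps most in time, then merge consecutive words by the same speaker. The TimedWord and SpeakerTurn shapes are assumptions for illustration and do not match the exact output format of Whisper or any diarization library.

```typescript
interface TimedWord { text: string; start: number; end: number } // seconds
interface SpeakerTurn { speaker: string; start: number; end: number } // seconds
interface Utterance { speaker: string; text: string }

// Length in seconds of the time range shared by two intervals (0 if disjoint).
function overlap(
  a: { start: number; end: number },
  b: { start: number; end: number },
): number {
  return Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start));
}

// Assign each word to the turn with the largest overlap, then merge
// consecutive words from the same speaker into one utterance.
function attributeWords(words: TimedWord[], turns: SpeakerTurn[]): Utterance[] {
  const utterances: Utterance[] = [];
  for (const word of words) {
    let bestSpeaker = "unknown"; // used when no turn overlaps the word
    let bestOverlap = 0;
    for (const turn of turns) {
      const shared = overlap(word, turn);
      if (shared > bestOverlap) {
        bestOverlap = shared;
        bestSpeaker = turn.speaker;
      }
    }
    const last = utterances[utterances.length - 1];
    if (last !== undefined && last.speaker === bestSpeaker) {
      last.text += " " + word.text;
    } else {
      utterances.push({ speaker: bestSpeaker, text: word.text });
    }
  }
  return utterances;
}
```

The resulting utterances could be passed to the extraction prompt with a speaker prefix on each line, so each action item can be traced back to whoever said it.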

Watch out for

Whisper's small model makes consistent errors on proper nouns, internal project names, and fast speech, so users need a low-friction editing step before any task is committed. Don't skip the accept/edit screen. Recording length also matters: on-device transcription of a five-minute memo is fine, but longer recordings can drain the battery noticeably and take thirty seconds or more to transcribe. Cap v1 at five minutes and make the limit visible before recording starts.
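
A sketch of how the five-minute cap might surface in the recorder UI. MAX_RECORDING_SECONDS, recordingStatus, and the label wording are hypothetical; the recorder itself is assumed to report elapsed seconds on each tick.

```typescript
const MAX_RECORDING_SECONDS = 5 * 60; // v1 cap: five minutes

interface RecordingStatus {
  remainingSeconds: number;
  label: string; // text for the recorder screen
  shouldStop: boolean; // true once the cap is reached
}

// Format whole seconds as m:ss, for example 239 -> "3:59".
function formatClock(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}

// Derive what the UI should show from the elapsed recording time.
function recordingStatus(elapsedSeconds: number): RecordingStatus {
  const elapsed = Math.max(0, Math.floor(elapsedSeconds));
  const remainingSeconds = Math.max(0, MAX_RECORDING_SECONDS - elapsed);
  return {
    remainingSeconds,
    // Before recording starts, state the limit; afterwards, count down.
    label:
      elapsed === 0
        ? `Up to ${formatClock(MAX_RECORDING_SECONDS)} per memo`
        : `${formatClock(remainingSeconds)} left`,
    shouldStop: remainingSeconds === 0,
  };
}
```

Calling recordingStatus(0) before the user taps record gives the "limit visible before recording starts" text, and the same function drives the countdown and the automatic stop once recording is under way.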