Write documents without a keyboard or mouse, completely
AgentDoc is a voice-controlled text editor for people who can't, shouldn't, or simply don't want to type. From the first word to the exported PDF, every operation (writing, formatting, page breaks, navigation, table of contents, export) is driven by voice. There is no required keyboard interaction at any point. No installation, no license fee, no training period, no platform restriction. Open the editor in any modern browser and speak your first sentence.
What "completely hands-free" actually means here
Most "voice writing" tools transcribe speech to text and then make you reach for the keyboard for everything else: formatting, fixing the cursor, choosing a font, navigating to page 3. AgentDoc closes that gap. Concretely, all of the following work by voice:
- Writing. "Add a heading called Conclusion. Then write a closing paragraph summarising the key points."
- Formatting. "Make the recipient's name bold and navy blue. Change the body font to Lora."
- Restructuring. "Insert a page break before the conclusion. Move the second paragraph to the end."
- Navigation. "Go to page 3. Read the second paragraph."
- Export. "Download this as a PDF."
- Verification. The agent confirms what it changed, out loud and on screen, after every edit. You never have to guess whether the action landed.
For people with motor disabilities
AgentDoc is designed to remove the keyboard and mouse entirely from the document-writing workflow. People with the following conditions can use it as a primary writing tool:
- Tetraplegia, quadriplegia, and high-level spinal cord injury: write, format, and export full documents without functional hand or arm movement. Pair with a sip-and-puff switch, single-button switch, or eye-tracker for the initial mic activation if needed.
- ALS / motor neuron disease: voice often remains usable longer than fine motor control, and the agent's verification-based architecture helps limit cascading errors when speech becomes more variable.
- Multiple sclerosis, Parkinson's disease, essential tremor, dystonia: when tremor or coordination problems make precise typing or cursor placement difficult or impossible, voice is often the modality least affected.
- Cerebral palsy, muscular dystrophy: full document control without the fine-motor demands of touch typing.
- Amputation, congenital limb difference: no assumption that two hands and ten fingers are available.
- Severe visual impairment or blindness: pairs naturally with screen readers; voice in, voice confirmation out, no reliance on a visual toolbar.
Important: this page describes capabilities, not a medical recommendation. Whether AgentDoc is the right tool for any individual depends on speech intelligibility, cognitive load tolerance, and clinical context. Caregivers and therapists are welcome to evaluate it directly; see the section below.
For RSI, post-injury, and temporary needs
Not every reason to stop typing is permanent. AgentDoc is also for people who used to type fine and need a competent voice writing tool right now:
- RSI, carpal tunnel, tendonitis: the most common occupational reason developers, journalists, lawyers, and writers seek voice tools. Use AgentDoc for the parts of the day where typing is most painful (often long-form writing) without giving up document formatting.
- Broken wrist, post-surgical recovery, dominant-arm injury: temporary loss of typing ability shouldn't mean missed deadlines. Dictate the full document, format it, export it, send it.
- Arthritis, chronic pain, fibromyalgia: voice on bad days, keyboard on good days, same document either way.
- Migraine and post-concussive symptoms: when staring at a screen and tapping keys makes the pain worse, dictating with eyes closed is a viable alternative.
A free Dragon NaturallySpeaking alternative that also formats
If you've used Dragon and been frustrated by the $299 to $699 price tag, the Windows-only restriction, the dropped Mac support, the long training period, or the fact that it transcribes but doesn't actually edit your document, AgentDoc is built around exactly that gap. Free. Browser-based. macOS, Windows, Linux, iOS, Android: anywhere you have a microphone and Chrome, Safari, Firefox, or Edge. No training. And formatting works mid-sentence, not as an afterthought.
For caregivers, family members, and therapists
If you're evaluating AgentDoc on behalf of a patient, family member, or client, here is what's relevant:
- Zero-friction trial. No app install, no license purchase, no hardware beyond a microphone. Open the page, create an account with an email address, start speaking. Suitable for evaluation in a single appointment.
- No training period. Unlike speaker-adapted dictation systems, AgentDoc does not require minutes or hours of voice training. Speech recognition is provided by Google Gemini Live, which is speaker-independent.
- Documented reliability. The underlying agent's tool-call accuracy, failure modes, and design decisions are documented in an academic thesis evaluating 20 distinct workflow configurations across 13 scenarios. See /research.
- Suitable hardware. Any modern computer or smartphone with a built-in or external microphone. Sip-and-puff switches, single-switch interfaces, and eye-trackers are compatible; they are only needed for initial mic activation, not for the writing itself.
- What it doesn't replace. AgentDoc is a writing tool, not an AAC device, not a clinical communication system, and not a substitute for established assistive technology assessment. It is one option among several.
The questions people actually ask before trying this
Can someone with ALS or tetraplegia write a complete letter with this, without help?
Yes. From an empty document to an exported PDF letter, no keyboard or mouse interaction is required. The voice session is started by clicking the microphone button (which can be triggered by any assistive switch that simulates a click), and from that point everything is voice: headings, paragraphs, formatting, page layout, signature placement, and PDF export.
Is this a free alternative to Dragon NaturallySpeaking?
Yes. AgentDoc is free to use, runs in any modern browser, and works on macOS, Windows, Linux, iOS, and Android. No license, no per-user fee, no platform exclusivity, no training period. Unlike Dragon, AgentDoc is not just a transcription engine: it formats, restructures, navigates, and exports documents by voice through an AI agent that understands document context.
I have RSI. Can I still write professional documents (cover letter, report, contract)?
Yes. The use cases AgentDoc is built for include exactly this: dictating a formal letter, applying mid-sentence formatting like "make the recipient's name bold," inserting headings and page breaks, generating a table of contents, and exporting a clean PDF. Nothing about the workflow assumes a working keyboard.
How is this different from iOS Dictation, Windows Voice Access, or Google Voice Typing?
Those tools transcribe speech into text at the current cursor location. They don't format, don't restructure, can't insert page breaks, can't navigate, and can't export. The moment you need to do anything beyond entering text, you are back at the keyboard or mouse. AgentDoc operates at the document level: every editing operation is exposed to the voice agent.
Will Dragon work better for me, or this?
Dragon builds on decades of speaker-adaptive recognition and may produce slightly more accurate transcription for some users with non-standard speech. AgentDoc's advantage is the editor itself: full document control by voice, real-time multi-page formatted output, and PDF export without ever leaving voice mode. If your bottleneck is transcription accuracy on accented or atypical speech, Dragon may still win. If your bottleneck is "I can dictate text but I can't format or restructure the document," AgentDoc is built to solve that.
Are my documents and my voice secure?
Voice audio is streamed to Google Gemini Live for processing. Documents are stored on AgentDoc servers in Germany under DSGVO/GDPR rules. Per-user document isolation, JWT-based authentication, and HTTPS apply throughout. For especially sensitive documents (medical records, legal correspondence, attorney-client material), the text chat mode is recommended: it uses the same agent and the same tools, but no audio is streamed externally.
Does this work with screen readers?
The editor surface is text-based and works with VoiceOver, NVDA, and JAWS for navigation between controls. The voice mode itself is the recommended way to read and edit documents β say "read me page 2" and the agent reads the content out loud, identifying headings and paragraph boundaries.
Does this also work in German?
Yes. Gemini Live supports German (and ~30 other languages) for both speech recognition and the agent's spoken confirmation. Commands such as "Mach die Überschrift fett und blau" ("Make the heading bold and blue") or "Füge einen Seitenumbruch vor 'Schlussfolgerung' ein" ("Insert a page break before 'Conclusion'") work just as they do in English.
Try it now, without commitment
AgentDoc is free, browser-based, and requires no install. Open the editor, press the microphone button, and dictate your first instruction. If you're evaluating it for someone else, the same trial works.
Open the voice editor
Read the research
Behind the scenes
- Voice-First Document Editing with Gemini Live: architecture and tradeoffs of the native-audio voice path that powers AgentDoc's accessibility experience.