Dictation
Intro
Dictation enables voice-to-text input. It captures and transcribes user speech in real time, offering an additional way to enter text.
Dictation in compact Joule panel (left) and regular Joule panel (right)
Usage
Do
- Request microphone access from first-time users.
- Keep the keyboard visible until the user closes it.
- Provide clear listening feedback and maintain cursor position during transcription.
- Enable editing after dictation.
Don't
- Hide listening or processing status.
- Hide the stop action. Users must always have a way to exit the session.
- Auto-submit or overwrite user input.
- Clutter the UI with elements that might distract from the transcription process.
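The guidance on cursor position and on not overwriting user input can be illustrated with a small sketch. It assumes one possible reading: dictated text is inserted at the current cursor position and the text already in the field is left intact. The `InputState` type and the `insertTranscript` function are hypothetical names used for illustration only; they are not part of any Joule API.

```typescript
// Hypothetical model of an input field: its text plus a cursor offset.
interface InputState {
  text: string;
  cursor: number; // index into text, from 0 to text.length
}

// Insert a transcribed phrase at the cursor instead of replacing existing text.
function insertTranscript(state: InputState, transcript: string): InputState {
  // Clamp the cursor so an out-of-range value cannot corrupt the text.
  const cursor = Math.min(Math.max(state.cursor, 0), state.text.length);
  const before = state.text.slice(0, cursor);
  const after = state.text.slice(cursor);
  const phrase = transcript.trim();

  // Add a separating space only where the neighbouring character is not whitespace.
  const lead = before.length > 0 && !/\s$/.test(before) ? " " : "";
  const trail = after.length > 0 && !/^\s/.test(after) ? " " : "";

  return {
    text: before + lead + phrase + trail + after,
    // Place the cursor directly after the inserted phrase.
    cursor: before.length + lead.length + phrase.length,
  };
}
```

Because the function returns a new state rather than mutating the old one, the text the user typed before dictating is never lost, and the result stays editable like any other input.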
Anatomy
A. "Stop/Cancel" Button
Ends the active session.
Note: Any text transcribed during the session is retained in the input field (element B) after stopping, allowing the user to review or manually edit before sending.
B. Placeholder/Text Area
Displays placeholder text before the user speaks, then the live transcription once speech is detected. If the session is stopped, this area retains the captured text.
C. Audio Visualizer
Indicates microphone/speaking activity.
D. "Submit/Send" Button
Sends the retained transcription to Joule.
Dictation anatomy
States
Idle State
The component is waiting for user initiation. It displays instructional placeholder text, such as "Begin speaking ...", and the "Submit/Send" action remains inactive until text is present.
Active Listening
Triggered once recording begins. The audio visualizer provides real-time feedback of voice input, and the text area populates with the live transcription. The "Submit/Send" action becomes active so the user can submit when ready.
Idle state (left) and active state (right)
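The two states can be sketched as a pure function from state and transcript to what the component shows. This is a minimal illustration, not the Joule implementation: the `DictationView` type and the `render` function are hypothetical, and the sketch assumes that "Submit/Send" is enabled exactly when transcribed text is present.

```typescript
type DictationState = "idle" | "listening";

// Hypothetical view model for the dictation component.
interface DictationView {
  state: DictationState;
  transcript: string;
  placeholder: string | null; // shown only while no text is present
  visualizerActive: boolean;  // audio visualizer reacts to voice input
  sendEnabled: boolean;       // "Submit/Send" action
}

function render(state: DictationState, transcript: string): DictationView {
  const hasText = transcript.trim().length > 0;
  return {
    state,
    transcript,
    // Instructional placeholder is replaced by the live transcription.
    placeholder: hasText ? null : "Begin speaking ...",
    // The visualizer gives feedback only while recording is in progress.
    visualizerActive: state === "listening",
    // Assumption: sending is possible as soon as any text has been captured.
    sendEnabled: hasText,
  };
}
```

Deriving the view from the state in one place keeps the listening feedback and the send action from drifting out of sync with the transcription.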
Behavior and Interaction
Happy Path
The user initiates the dictation session, speaks while receiving real-time visual and text feedback, and submits the query from dictation mode.
Dictating message and sending from dictation mode
Stopping Dictation
When a user initiates a dictation session and stops the recording before submitting, they return to the default input mode. Stopping the session does not clear the input; the live transcription persists in the text area. This allows the user to review the captured text, make manual edits if necessary, add attachments, or resume the conversation later without losing their progress.
Dictating message, stopping, and sending from default input field
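The stop behavior can be sketched as a small set of transitions on a message composer. All names here (`Composer`, `startDictation`, `appendTranscript`, `stopDictation`, `submit`) are hypothetical, and clearing the field after a successful send is an assumption rather than something this guideline specifies.

```typescript
type InputMode = "default" | "dictation";

// Hypothetical composer state: the current input mode and the field contents.
interface Composer {
  mode: InputMode;
  text: string;
}

function startDictation(c: Composer): Composer {
  return { ...c, mode: "dictation" };
}

// Append a transcribed segment; text already in the field is never overwritten.
function appendTranscript(c: Composer, segment: string): Composer {
  const s = segment.trim();
  if (c.mode !== "dictation" || s === "") return c;
  return { ...c, text: c.text === "" ? s : `${c.text} ${s}` };
}

// Stopping only switches the mode back; the captured text is retained.
function stopDictation(c: Composer): Composer {
  return { ...c, mode: "default" };
}

// Sending works from either mode. Nothing is sent automatically or when empty.
function submit(c: Composer, send: (text: string) => void): Composer {
  if (c.text.trim() === "") return c;
  send(c.text);
  // Assumption: the field is cleared once the message has been sent.
  return { mode: "default", text: "" };
}
```

Keeping `stopDictation` free of any change to `text` is the point of the sketch: the user can stop, edit the transcription, and send from the default input field.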
Dictation with Content
When a user has an active attachment or a specific mode selected, the dictation component adapts to include this context within the query. The user can initiate dictation while a file, such as a PDF, is attached to the message. The live transcription is captured alongside the existing attachment, allowing the user to provide voice instructions or questions related to that specific content before submitting.
Dictation with content
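As a rough sketch of how the transcription travels with its context, the draft below keeps attachments and the selected mode untouched while dictation fills in the text. The `Draft`, `Attachment`, and `Query` types and both functions are hypothetical and serve only to illustrate the behavior described above.

```typescript
// Hypothetical types for a message draft and the query sent to the assistant.
interface Attachment {
  name: string;
  mimeType: string;
}

interface Draft {
  text: string;
  attachments: readonly Attachment[];
  mode: string | null; // a specific mode selected by the user, if any
}

interface Query {
  text: string;
  attachments: Attachment[];
  mode: string | null;
}

// Dictation updates only the text; attachments and mode are carried over.
function withTranscript(draft: Draft, transcript: string): Draft {
  return { ...draft, text: transcript };
}

// The submitted query combines the transcription with the existing context.
function buildQuery(draft: Draft): Query {
  return {
    text: draft.text.trim(),
    attachments: [...draft.attachments],
    mode: draft.mode,
  };
}
```

For example, a draft with a PDF attached and the dictated text "Summarize the key risks" produces a query that contains both the question and the file.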
Resources
Joule for Android: Dictation