Enter the text you want the agent to speak to the user.

If you enter multiple responses, the agent will automatically choose one at random every time the Speak node is triggered.
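The random selection described above can be pictured with a short sketch. This is an illustration only, not the platform's implementation: the function name `pick_response` and the sample greetings are hypothetical, and it assumes every response is equally likely, which the text does not state.

```python
import random


def pick_response(responses, rng=random):
    """Return one of the configured responses, chosen at random.

    Assumption: each response is equally likely to be picked.
    """
    if not responses:
        raise ValueError("a Speak node needs at least one response")
    return rng.choice(responses)


# Hypothetical response variations for a greeting Speak node.
greetings = [
    "Hi, thanks for calling. How can I help?",
    "Hello! What can I do for you today?",
    "Welcome. Tell me what you need.",
]

# Each call may return a different greeting.
print(pick_response(greetings))
```

Because any of the responses may be spoken on any trigger, each one should make sense on its own at that point in the flow.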

The agent can either read the text in the robotic voice you selected when you created the agent, or play a human voice recording that you upload.

The supported recording file types are: wav, mp3, ogg. Maximum file size is 4MB.
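If you prepare recordings in bulk, a quick local check before uploading can catch files that would be rejected. The sketch below is one possible way to do that and is not part of the product. The helper name `check_recording` is hypothetical, and it assumes that "4MB" means 4 * 1024 * 1024 bytes, which the platform may count differently.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".ogg"}
# Assumption: "4MB" is read as 4 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 4 * 1024 * 1024


def check_recording(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or 'none'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, over the {MAX_SIZE_BYTES}-byte limit"
        )
    return problems


# Hypothetical file names and sizes.
print(check_recording("greeting.wav", 1_200_000))
print(check_recording("greeting.flac", 5_000_000))
```

For a file on disk, the size can be read with `Path(filename).stat().st_size` and passed as `size_bytes`. Note that the check looks only at the file extension, not at the actual audio encoding.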

Use clear, concise, example-oriented instructions. Users need to understand exactly what to provide to the agent at that specific point in the flow. As a designer, aim to achieve this in as few words as possible, especially for voice agents.

Use Recording or Use Parameter

If you want to include a recording in a Speak node, you have two options:

Use Recording - Play a human voice recording as the agent's response. The recording can be one that was already uploaded to the Recordings property, or one that you upload on the spot.

Use Parameter - If you have collected a parameter that is associated with an entity that has recordings, you can select that parameter here. The agent then speaks the parameter value using the recording you previously uploaded to the entity.

When to use the Speak node

Use the Speak node to greet users as a first interaction. It’s a good idea to always have a Speak node as the first node in every agent, introducing the service and providing a clear statement of the agent's abilities and features.


If your virtual agent needs to answer knowledge questions (such as opening hours), then after using “Classification” to classify the request into the right intent, you can use the “Speak” node to reply to the caller. This applies only when the agent is supposed to respond without expecting any user input, or when the conversation should end there.

Include an intent for small talk or as a catch-all. You can build the intent to accept utterances such as “Who are you?” or “What's the weather like today?” to create a more natural experience.
