Collect Input
The Collect Input node is designed to prompt the user with a question and capture a specific piece of information (parameter) from their response. Unlike the generic Listen node, this node validates the user's input against a specific Entity type to ensure the data matches the expected format before proceeding.
To deliver the prompt, you can either use the text-to-speech voice you selected when you created the agent or upload a recording of a human voice.
Supported recording file types are WAV, MP3, and OGG. The maximum file size is 4 MB.
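Before uploading a recording, it can help to check it against the limits above. The following is a minimal sketch, not part of AI Studio: the function name check_recording is hypothetical, and reading "4MB" as 4 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import Path

# Limits taken from the text above; the exact byte count for "4MB" is an assumption.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".ogg"}
MAX_SIZE_BYTES = 4 * 1024 * 1024


def check_recording(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_SIZE_BYTES}")
    return problems
```

For a file on disk, the size can be read with Path("greeting.wav").stat().st_size and passed in as size_bytes.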
When to Use This Node
This node is best used in scenarios where you need to validate the user's answer against a specific set of rules. Common use cases include:
Strict Data Entry: Collecting specific formats like User IDs, Phone Numbers, Zip Codes, or Dates.
"How can I help you?" (Menu Selection): Asking an open-ended question where the answer must match a specific list of services (e.g., Sales, Support, Billing). This acts as a "Natural Language Menu."
Pro Tip
If you want to capture an open-ended response without validating it (e.g., recording a voicemail or a long complaint), use the Listen node instead.

Setting Up the Collect Input Node
Select Parameter
Select the parameter you want to fill. You may need to create a new parameter depending on your use case.
Define Fallback behavior (optional).
What happens if the user stays silent or says the wrong thing? Customize the "Missed" and "No Input" exit points.

Node Configuration
No Input & Missed
You'll find two additional tabs next to the parameter configuration:
No Input - Triggered when the caller doesn't provide any response (stays silent). The agent repeats the prompt based on your retry settings before activating the "No Input" flow. Use this to define fallback behavior, such as routing to a live agent.
Missed - Triggered when the caller's input doesn't match the selected entity. The agent repeats the prompt based on your retry settings before activating the "Missed" flow. Define how the agent should handle invalid responses, such as asking a clarifying question or transferring the call.
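The retry behavior described above can be sketched as a small simulation. This is an illustration only, not AI Studio's implementation: the function name run_collect_input, the exit labels, and the use of separate retry counters for silence and invalid input are assumptions.

```python
from typing import Callable, Optional


def run_collect_input(
    responses: list[Optional[str]],
    is_valid: Callable[[str], bool],
    no_input_retries: int = 2,
    missed_retries: int = 2,
) -> tuple[str, Optional[str]]:
    """Simulate one Collect Input node over a scripted list of caller responses.

    None stands for silence. Returns (exit_name, value).
    Separate retry counters per failure type are an assumption of this sketch.
    """
    silent_attempts = 0
    invalid_attempts = 0
    for response in responses:
        if response is None:
            silent_attempts += 1
            if silent_attempts > no_input_retries:
                return ("no_input", None)  # retries exhausted: "No Input" flow
        elif is_valid(response):
            return ("collected", response)  # parameter filled, flow continues
        else:
            invalid_attempts += 1
            if invalid_attempts > missed_retries:
                return ("missed", None)  # retries exhausted: "Missed" flow
    # The script ran out before the node resolved (for example, the caller hung up).
    return ("incomplete", None)


def is_zip_code(text: str) -> bool:
    # Stand-in for entity validation: five ASCII digits.
    return len(text) == 5 and all(ch in "0123456789" for ch in text)
```

With two retries allowed, a caller who stays silent once, says something invalid once, and then gives a valid zip code still reaches the "collected" exit; three silences in a row lead to the "No Input" flow.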

"Skip this node if value is already collected"
If the parameter value was already collected on a previous node (or on this same node, if the caller returned to it during the same conversation), you can choose to skip this Collect Input node and keep the original value. If you would like to overwrite the value, leave this box unchecked.
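The effect of the checkbox can be sketched as follows. This is a hypothetical model rather than AI Studio code: resolve_parameter, the collected dictionary, and the ask_caller callback are names invented for illustration.

```python
from typing import Callable


def resolve_parameter(
    collected: dict[str, str],
    name: str,
    ask_caller: Callable[[], str],
    skip_if_collected: bool,
) -> str:
    """Return the parameter value after the node runs, updating `collected`."""
    if skip_if_collected and name in collected:
        return collected[name]  # node is skipped; the original value is kept
    collected[name] = ask_caller()  # node runs; any earlier value is overwritten
    return collected[name]
```

With the box checked, a value already stored under the parameter name is returned untouched; with it unchecked, the caller is asked again and the new answer replaces the old one.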

Caller's Response Input - Speech vs. DTMF
The caller can respond either via speech or using the keypad (DTMF). You can toggle on both Speech and DTMF if you'd like to give the caller the option to respond either way. At least one of these must be switched on.
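The toggle rule above, together with the limits listed in the Speech and DTMF sections that follow, can be modelled as a small settings check. This is a sketch under stated assumptions: the class and field names are hypothetical, the 5-second "No Input" value is a placeholder (the page gives no default), and the lower bounds for "Time Out" and "Max Digits" are assumed.

```python
from dataclasses import dataclass


@dataclass
class InputSettings:
    # Hypothetical field names; limits and defaults are the ones this page lists.
    speech_enabled: bool = True
    dtmf_enabled: bool = False
    detect_silence_seconds: float = 1.0  # Speech: 0.4 to 5, default 1
    no_input_seconds: float = 5.0        # Speech: 1 to 60 (5 is an assumed placeholder)
    dtmf_timeout_seconds: float = 10.0   # DTMF: default 10, maximum 60
    max_digits: int = 20                 # DTMF: default and maximum 20

    def effective_dtmf_timeout(self) -> float:
        # With both modes on, "Time Out" follows the "No Input" value.
        if self.speech_enabled and self.dtmf_enabled:
            return self.no_input_seconds
        return self.dtmf_timeout_seconds

    def problems(self) -> list[str]:
        found = []
        if not (self.speech_enabled or self.dtmf_enabled):
            found.append("enable Speech, DTMF, or both")
        if self.speech_enabled:
            if not 0.4 <= self.detect_silence_seconds <= 5:
                found.append("Detect Silence must be 0.4 to 5 seconds")
            if not 1 <= self.no_input_seconds <= 60:
                found.append("No Input must be 1 to 60 seconds")
        if self.dtmf_enabled:
            # Lower bounds below are assumptions; the page states only the maximums.
            if not 0 < self.dtmf_timeout_seconds <= 60:
                found.append("Time Out must be at most 60 seconds")
            if not 1 <= self.max_digits <= 20:
                found.append("Max Digits must be at most 20")
        return found
```

An empty list from problems() means the combination is consistent with the limits on this page; it says nothing about how AI Studio itself validates the node.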

Speech
"Detect Silence" - You can also control how long the system will wait after the user stops speaking to decide whether the input was complete. The default value is one second. The range of possible values is between 0.4 and five seconds.
"No Input" - You can control how long the system will wait for the user's input by adding a number of seconds to the "No Input" field in the node. The range of possible values is between one second and sixty seconds. Once this time frame passes, the agent will trigger the retry logic until it reaches the last retry and moves to the "No Input" flow. You can add as many retries as you see fit.
"Context Keywords" - List words you expect the caller to say to improve recognition quality. The agent listens for these words in the caller's input and uses them, for example, to help classify the input into the proper intent.
"Should Record" - Choose the "Should Record" option to record and generate a short audio file of the value collected in a parameter. Once the recorded parameter has been filled by a caller, the system will generate a unique URL including the voice-recorded value for later use.
DTMF
The caller has the option to respond using the keypad. The following settings are related to the keypad:
"Time Out" - Set how many seconds after the caller finishes entering digits the result is submitted. The default value is 10 seconds and the maximum is 60 seconds. If both Speech and DTMF are toggled on, the "Time Out" value is the same as the "No Input" value.
"Max Digits" - The number of digits the user can press. The default is 20 digits, which is also the maximum.
"Submit on Hash" - Choose 'yes' if you'd like the caller's response to be submitted when they press the # key.
Barge-In
AI Studio lets your users interrupt your virtual assistant to provide their input. This is useful, for example, for returning customers who already know the extension codes or the options within your agent and are in a hurry.

To accommodate returning users and create a customized experience for them, create Users Parameters so that you can skip collecting information they have already provided.
To enable barge-in, open each Collect Input node where you want it enabled, scroll to the bottom of the node, and toggle the switch.
Enabling barge-in switches on the ability to interrupt the virtual assistant with both speech and DTMF input.
Pro Tip
Adjust the noise sensitivity so that background noise does not “barge in” on the virtual assistant.
Node Noise Sensitivity
This feature improves transcription performance for short user prompts (e.g., "Yes", "No", or "Cancel") by allowing you to customize sensitivity for specific nodes.
Node Noise Sensitivity addresses potential performance issues regarding short user prompts, which can sometimes result in blank or incomplete transcriptions. For example, short responses like "Yes" or "No" might result in a blank transcription, or a word like "Cancel" might appear as "ancel".
By adjusting sensitivity at the node level, designers can optimize how the Virtual Agent listens for specific types of expected input.
When Switched OFF (Default): The node uses the general agent-level sensitivity settings.
When Switched ON: The node applies the specific sensitivity value set in this field, overriding the agent-level setting.
Enable this feature in your Collect Input node by scrolling down to the bottom and toggling "Enable Node Noise Sensitivity" to ON.
Pro Tip
The value defaults to 40 when first enabled. You can adjust the slider between Low and High to match the specific requirements of that node.
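The override rule for this toggle can be summarized in a few lines. This is a sketch only: effective_sensitivity is a hypothetical helper, and treating the agent-level setting as a number on the same scale as the node slider is an assumption.

```python
from typing import Optional

DEFAULT_NODE_SENSITIVITY = 40  # value shown when the toggle is first enabled


def effective_sensitivity(
    agent_level: int,
    node_enabled: bool = False,
    node_value: Optional[int] = None,
) -> int:
    """Return the sensitivity a node would use under the rules described above."""
    if not node_enabled:
        return agent_level  # toggle OFF: the agent-level setting applies
    # Toggle ON: the node value overrides the agent-level setting.
    return DEFAULT_NODE_SENSITIVITY if node_value is None else node_value
```

A node with the toggle off inherits whatever the agent uses; a node with the toggle on and no adjustment uses 40.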

This feature is available for the Telephony channel across all NLU engines and can be accessed within any Listen or Collect Input node.
Entity Ambiguation
Entity Ambiguation is a feature that helps your agent handle "ties" when a user's input matches more than one possible answer - for example, if a caller asks for "Ben" but your contact list has both "Ben Miller" and "Ben Brookes".
To fix this, you enable the Ambiguation setting in your Collect Input node and create a special multi-value parameter to catch these duplicates. This creates a dedicated path in your flow where the agent can politely ask the user to clarify exactly who or what they meant (e.g., "Did you mean Ben Miller or Ben Brookes?") before moving forward.
For details, see the Entity Ambiguation documentation.
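The "tie" situation from the example above can be illustrated with a simple name match. This is a minimal sketch and not how AI Studio matches entities: match_contacts and next_step are hypothetical helpers, and whole-word matching on name parts is an assumed stand-in for entity recognition.

```python
def match_contacts(utterance: str, contacts: list[str]) -> list[str]:
    """Return every contact that shares at least one name part with the utterance."""
    words = set(utterance.lower().split())
    return [c for c in contacts if words & set(c.lower().split())]


def next_step(utterance: str, contacts: list[str]) -> str:
    matches = match_contacts(utterance, contacts)
    if not matches:
        return "missed"  # nothing matched the entity
    if len(matches) == 1:
        return f"collected: {matches[0]}"  # a single match fills the parameter
    # Several matches: hold them all and ask the caller to clarify.
    return "clarify: Did you mean " + " or ".join(matches) + "?"
```

Here the list returned by match_contacts plays the role of the multi-value parameter: it holds all tied candidates so the clarification prompt can read them back.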

Customize your Agent Prompts with SSML
Speech Synthesis Markup Language (SSML) is an XML-based markup language that lets you fine-tune how the Vonage Text-to-Speech (TTS) engine reads your text. You can use it to vary the rate and pitch of speech and to have selected material read as a specific type of content, such as digits, dates, or numbers. This gives prompts a more human touch and keeps the user journey from sounding robotic.
By wrapping your text in <speak> tags, you can insert specific commands to control the auditory experience, such as adding pauses with <break>, adjusting the speed, pitch, and volume using <prosody>, or ensuring specific formatting for numbers and dates via <say-as>.
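As an illustration of the tags named above, the snippet below assembles a short SSML prompt and checks that it is well-formed XML using Python's standard library. The prompt wording and attribute values (time="500ms", rate="slow", interpret-as="digits") are example choices drawn from common SSML usage; which values the Vonage TTS engine accepts for a given voice is not confirmed here.

```python
import xml.etree.ElementTree as ET

# Example prompt; wording and attribute values are illustrative assumptions.
ssml = (
    "<speak>"
    "Thanks for calling."
    '<break time="500ms"/>'
    '<prosody rate="slow">Please say or enter your account number.</prosody>'
    " Your reference is "
    '<say-as interpret-as="digits">4921</say-as>.'
    "</speak>"
)

# Parsing confirms the markup is well-formed XML; it does not check engine support.
root = ET.fromstring(ssml)
tags = [element.tag for element in root.iter()]
```

If a tag is left unclosed or an attribute is missing its quotes, ET.fromstring raises a ParseError, which makes this a quick sanity check before pasting the prompt into a node.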