Investigating AI Agent Email Address Fabrication

Published on March 31, 2025

Jose E. Puente

CEO at Reality Border

AI voice agents using OpenAI’s Realtime API have shown a tendency to misrecord email addresses, sometimes fabricating an address based on the user’s name instead of using the actual address provided. For example, if a user named Alejandro Fernandez says his email is "sandra1289@gmail.com," the agent might incorrectly record "alejandro.fernandez@gmail.com," assuming the email should match the name. This behavior stems from a combination of model biases, pattern-recognition errors (hallucinations), transcription mistakes, and context-based assumptions. Below, we analyze why this happens and recommend how to prevent it, ensuring more accurate email collection in AI-driven conversations.

Potential Reasons for Incorrect Email Generation

Model Hallucinations and Biases

Large language models sometimes “hallucinate” details when uncertain: they generate plausible-sounding but incorrect information. In this case, the model perceives a pattern that isn’t actually in the input (an email derived from the user’s name) and fabricates data that fits it.

Pattern Recognition Errors (Overfitting to Email Formats)

The AI agent could be overfitting to common email patterns associated with names. If the model’s training data included many examples of emails derived from a person’s name, it may treat that pattern as a rule. For instance, hearing a name like “Alejandro Fernandez” can implicitly trigger the learned pattern “alejandro.fernandez@…”. The legitimate email "sandra1289@gmail.com" does not match the name, which the model might misinterpret as an error or incomplete info. As a result, it confabulates a more “reasonable” email (e.g. one based on the name) to fill the perceived gap. In short, the model’s strong expectation of the name@domain format can override the actual input, a classic sign of over-generalization. Research on AI behavior notes that such misinterpretations happen when an AI “creates outputs that are nonsensical or altogether inaccurate” by erroneously matching learned patterns.

Transcription or Input Mistakes

If this AI agent interacts via voice (as is common with the Realtime API), speech-to-text (STT) errors can contribute to the problem. Email addresses with alphanumeric mixes are easy to mishear – for example, "sandra1289" could be partially lost or garbled in transcription. In cases where the STT misses some characters or the domain, the language model might fill in the blanks using context. It essentially guesses the email based on the user’s name and common email conventions. Notably, developers have observed the Realtime API getting names and numbers wrong even when the STT transcription is correct. In one forum report, an AI voice agent consistently misheard a user’s name and phone number (“I’m Saul” was acknowledged as “Hi, Ahmed”, and digit sequences were repeated back incorrectly) despite accurate transcription logs. This suggests the issue isn’t purely the audio input – the model itself may reinterpret or overwrite details. Thus, even with perfect input, the AI’s parsing of that input can be faulty, especially for complex data like emails.

Context-Based Assumptions and Memory

Conversational AI keeps track of context (conversation history), but it can sometimes misuse that context. If the user’s name “Alejandro Fernandez” is stored in memory, the model might inadvertently blend that into new information. For example, upon hearing an email that doesn’t obviously correspond to the name, the agent might assume the user meant a name-based email or that it misheard. The model’s drive to maintain conversational coherence can backfire here: it over-anchors on the known name context. Essentially, it trusts the context (user name) over the actual utterance, producing an email that “makes sense” with the name. This is a kind of faulty logic or assumption: the AI thinks it’s helping by correcting a perceived discrepancy. Additionally, if the conversation is lengthy or the model’s short-term memory is taxed, it might not retain the exact sequence of characters in the email (especially a random string like sandra1289). Instead, it recalls “user’s email” and reconstructs it from available cues (the name), leading to a fabricated result.

OpenAI’s Handling of Structured Data in Conversations

Collecting structured data (like emails, phone numbers, addresses) has historically been a challenge for language models in free-form conversation. By default, models like GPT-4 or GPT-3.5 treat every response as natural language generation, which means they might alter specifics (capitalize words, fix “errors”, or, as seen, substitute values) instead of copying verbatim. OpenAI is aware of these pitfalls. In fact, developers have reported strange bugs where ChatGPT would truncate or change provided contact info – for example, turning "myname@mydomain.com" into "myname@domain.com" or altering names inadvertently. Such issues highlight that without guidance, the model might not preserve exact structured details.

Recent solutions: OpenAI has introduced features to improve the reliability of structured data capture in dialogues:

Function Calling

This feature (launched mid-2023) allows developers to define functions (like save_email(email)) that the model can invoke with parsed arguments. Instead of returning an email as part of a narrative reply, the model can return a JSON-like function call containing the email exactly. This makes the extraction of emails or phone numbers more deterministic. Essentially, the model is guided to produce structured output (the function and parameters) when it recognizes certain prompts. In practice, the developer provides a schema for the function argument (e.g., define the email parameter as a string). The model, upon the user saying “My email is X,” can respond by calling save_email("sandra1289@gmail.com"). This helps because the model’s output in function-call mode is strictly the data in JSON, reducing the chance of hallucinating or formatting it incorrectly. (If the model tried to call the function with "alejandro.fernandez@gmail.com" – a wrong value – the developer could catch that discrepancy immediately.) Function calling thus shifts the task from free-form generation to structured parsing, which the model tends to handle more faithfully for well-defined data.
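To make this concrete, here is a minimal sketch of that flow in Python using the Chat Completions tools interface. The save_email tool comes from the example above; the model name and surrounding setup are assumptions:

```python
from openai import OpenAI

client = OpenAI()

# Declare save_email so the model returns the address as structured
# arguments rather than prose. Tool and model names are illustrative.
tools = [{
    "type": "function",
    "function": {
        "name": "save_email",
        "description": "Store the user's email address exactly as provided.",
        "parameters": {
            "type": "object",
            "properties": {"email": {"type": "string"}},
            "required": ["email"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o",
    tools=tools,
    messages=[{"role": "user", "content": "My email is sandra1289@gmail.com."}],
)

# If the model chose to call the tool, the email arrives as a JSON argument
# string that can be inspected and validated before it is stored.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```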

Structured Output with JSON Schema

In August 2024, OpenAI introduced Structured Outputs, which extends the earlier JSON mode (late 2023): developers can now supply a JSON schema that the model’s response is guaranteed to follow.
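Concretely, a response_format payload for this feature might look like the sketch below; the schema name contact and the field layout are illustrative:

```python
# A strict Structured Outputs schema: the model's reply must be a JSON
# object with exactly one string field, "email".
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "contact",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"email": {"type": "string"}},
            "required": ["email"],
            "additionalProperties": False,
        },
    },
}
```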

Memory and Conversation Management

The Realtime API manages conversation state on the server, which means the model has a running memory of the dialogue. However, developers can influence this state. OpenAI’s guidance suggests techniques like summarization or injected system messages to correct the model’s course. For instance, once the user’s email has been gathered, a developer might insert a hidden note or a confirmation turn, e.g. a system message reading “The user’s email is XYZ.” This can anchor the model’s memory to the correct value going forward. (The Realtime API’s documentation and community examples emphasize managing sessions and even using tools within the conversation.)
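As a sketch, such an anchoring message can be injected into a Realtime session with a conversation.item.create event. The helper below only builds the JSON payload; the surrounding WebSocket plumbing is assumed:

```python
import json

def anchor_email_event(email: str) -> str:
    """Build a Realtime API event that pins the confirmed email in context.

    Sketch only: send the returned JSON over the session's WebSocket.
    """
    return json.dumps({
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "system",
            "content": [{
                "type": "input_text",
                "text": f"The user's email is {email}. Use it verbatim; do not alter it.",
            }],
        },
    })
```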

Why the issue still occurs: If an AI agent is not using function calls or strict schemas, it will handle emails as plain conversation text. In that mode, as the reports above show, the model might treat an email address like any other sentence fragment, subject to autocorrection and “making sense” of it. The Realtime API’s voice-enabled model (a GPT-4o-based realtime model) may have some biases about what a “proper” email sounds like, leading to these substitutions. Also, since the Realtime API is relatively new, some quirks are still being discovered by developers (e.g. trouble with names/numbers as shown in community forums). In summary, OpenAI has provided means to get structured data reliably, but those must be explicitly integrated. Without them, an AI agent may default to its generative behavior, which can be error-prone for exact data capture.

Mitigation Strategies and Recommendations

To enhance the accuracy of email collection in AI-driven conversations, especially using the OpenAI Realtime API, consider the following steps:

Explicit Prompt Instructions

Craft your system or developer prompts to emphasize verbatim retention of user-provided data. For example, you might add a guideline: “If the user provides an email or other contact info, do not alter it in any way. Repeat it exactly as given.” Developers have found some success with this approach – one report noted improvements after instructing the model “Do not make a mistake or change the name or email address. Use the information exactly as provided.” By clearly telling the AI its role is to record (not reinterpret) the email, you reduce creative alterations. Additionally, you can enclose the email in quotes or a code block in the model’s response, which often helps the model treat it as a literal snippet. While prompt-based solutions aren’t foolproof, they can mitigate minor pattern errors by making the AI more cautious about changing user-specific data.
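In a Realtime session, such instructions go into the session configuration. A minimal sketch follows; the exact wording is illustrative, not a tested prompt:

```python
# A session.update event carrying verbatim-capture instructions.
session_update = {
    "type": "session.update",
    "session": {
        "instructions": (
            "When the user provides an email address or other contact info, "
            "record and repeat it exactly as given. Never substitute a "
            "name-based address or 'correct' what you heard."
        ),
    },
}
```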

Use Function Calling for Data Extraction

Leverage OpenAI’s function calling capability to handle emails as structured data rather than free text. Define a function (e.g., {"name": "store_email", "parameters": {"type":"object","properties": {"email":{"type":"string"}}}}) and include it in the model’s available tools. When the user says their email, the model can output a function call like store_email({"email": "sandra1289@gmail.com"}) instead of a normal message. This approach forces the model into a parsing mode – it knows it should output a JSON with an email value. Because the function schema expects a string, the model is less inclined to invent or modify that string arbitrarily. In essence, function calling makes the model’s job easier and more deterministic: it just needs to plug the heard email into the function. Many developers use this to reliably capture contact info, as it sidesteps the quirks of natural language generation. It’s a robust way to prevent fabrication, since any hallucinated email would clearly stand out as incorrect when the function is called (and the developer can reject or correct it).
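When running over the Realtime API, the tool is declared in the session configuration and the parsed arguments come back as function-call events. A sketch is below; the event and field names follow the Realtime API reference as we understand it, so verify them against the current docs:

```python
import json

# Tool declaration for a Realtime session (sent in a session.update event).
session_tools = {
    "type": "session.update",
    "session": {
        "tools": [{
            "type": "function",
            "name": "store_email",
            "description": "Store the user's email address exactly as spoken.",
            "parameters": {
                "type": "object",
                "properties": {"email": {"type": "string"}},
                "required": ["email"],
            },
        }],
    },
}

def handle_event(event: dict) -> None:
    # Once the model finishes a tool call, the arguments arrive as a
    # JSON string that can be validated before being stored.
    if event.get("type") == "response.function_call_arguments.done":
        email = json.loads(event.get("arguments", "{}")).get("email", "")
        print("Captured:", email)  # hand off to validation / CRM, etc.
```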

Enable JSON Strict Mode (Structured Output)

If using models and versions that support JSON mode or structured outputs (e.g., GPT-4 with response_format schema), turn it on for collecting emails. By supplying a JSON schema for, say, a contact object with an “email” field and setting "strict": true, you instruct the model to output something like {"email": "sandra1289@gmail.com"} exactly. This constrained decoding dramatically reduces errors. The model cannot easily drift into a name-based hallucination because any deviation (like extra words or a different address) would violate the JSON format. OpenAI’s update on structured outputs was specifically to handle cases where models wouldn’t stick to a required format, which was a “challenge” for developers.
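A sketch of an end-to-end call with strict mode enabled (model name illustrative); because of the schema, the reply is guaranteed to parse as a JSON object with an email field:

```python
import json
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Capture the user's email verbatim."},
        {"role": "user", "content": "My email is sandra1289@gmail.com."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "contact",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"email": {"type": "string"}},
                "required": ["email"],
                "additionalProperties": False,
            },
        },
    },
)

# The content is constrained to the schema, so this parse cannot drift
# into free-form prose.
print(json.loads(resp.choices[0].message.content)["email"])
```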

Lower the Model’s Temperature

In scenarios where the model should not be creative or infer anything, set the temperature to 0 (or very low). A high temperature (e.g., 0.8 or 1.0) introduces randomness, meaning the model might “experiment” with different phrasings – or email formats – even if it heard the user correctly. In the case of email capture, you want deterministic behavior. Community experts have pointed out that for retrieval or factual tasks, temperatures near 0 are preferred to minimize deviations. By lowering temperature, the AI is more likely to output exactly what was said (it reduces the chance it will, say, swap “sandra1289” for some other tokens). Essentially, this makes the model act more like a reliable transcriber and less like a creative writer. In practice, if you’ve noticed the agent fabricating an email at temperature 0.7, try 0 or 0.1 – it should stick closer to the input. (Note that the Realtime API enforces a higher minimum temperature than the text endpoints; there, use the lowest value the API accepts.)
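For the text endpoints this is a one-parameter change; a minimal sketch (model name illustrative):

```python
from openai import OpenAI

client = OpenAI()

# Deterministic capture: temperature 0 makes the model transcribe
# rather than improvise.
resp = client.chat.completions.create(
    model="gpt-4o",
    temperature=0,
    messages=[
        {"role": "system", "content": "Repeat the user's email exactly as given."},
        {"role": "user", "content": "My email is sandra1289@gmail.com."},
    ],
)
print(resp.choices[0].message.content)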

Confirmation and Validation

Implement a confirmation step to verify the email with the user. After the AI agent records the email, have it read back what it understood: “Let me confirm, your email is sandra1289@gmail.com, correct?”. This serves two purposes: if the AI’s version is wrong, the user can catch it and correct it, and if it’s right, the user affirms it (giving the model positive feedback in context). In testing voice agents, developers found that even when users corrected the AI’s mistake, the model sometimes persisted with the wrong info. To combat this, you might need to explicitly erase or override the incorrect memory. One technique is to use a system message or a high-priority assistant message upon correction, e.g., “(System: The user’s actual email is sandra1289@gmail.com. Forget any other address.)”. While this is a bit of a workaround, it can force the model to drop the hallucinated address and use the confirmed one going forward. Apart from user confirmation, consider programmatic validation: for instance, use a simple regex or an email validation API on the captured email string. If the format differs wildly from what was spoken (e.g., the user said a number but the captured email has none), flag it for manual review or an immediate re-prompt. Basic validation can catch obvious fabrications (like mismatched name), prompting the AI to ask the user for clarification rather than logging an incorrect email silently.
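A sketch of such programmatic checks is below; plausible_capture is a hypothetical helper and the heuristics are deliberately simple:

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def plausible_capture(captured: str, transcript: str) -> bool:
    """Cheap sanity checks before logging a captured email (hypothetical helper)."""
    if not EMAIL_RE.match(captured):
        return False
    # If the user spoke digits but the captured address has none, the model
    # likely swapped in a name-based guess.
    if any(c.isdigit() for c in transcript) and not any(c.isdigit() for c in captured):
        return False
    # The local part should appear somewhere in the raw transcript.
    local = captured.split("@")[0].lower()
    return local in transcript.lower().replace(" ", "")
```

A mismatch flags the capture for a re-prompt or manual review rather than silently logging a fabricated address.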

Fine-Tuning the Model

If this problem is frequent and you have access to fine-tuning, you can fine-tune the AI on examples of correct behavior. For instance, train on conversation transcripts where the user’s exact email is provided and the assistant correctly repeats or stores it without changes. Fine-tuning can help the model unlearn some of its undesirable biases – in this case, the tendency to default to name-based emails. By providing many examples of oddly formatted emails that must be preserved as-is, the model can adjust its expectations. It will learn that “random” alphanumeric emails are valid and should not be reformatted. However, fine-tuning is a heavier solution: it requires data and can be costly, and one must be careful not to introduce new issues (like the model might then overfit to always trusting the user input, even when it’s unclear). OpenAI’s documentation notes that fine-tuning is useful for aligning the model with domain-specific language or style, which could include how to handle contact info. If feasible, this targeted training can make the agent inherently better at structured data capture. That said, with new features like function calling and structured outputs, fine-tuning might not be necessary unless the simpler fixes don’t suffice.
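For illustration, a sketch of one training record in the chat fine-tuning format; a real dataset would need many varied, oddly formatted addresses:

```python
import json

# One example where the assistant preserves a name-mismatched address
# exactly. Appended to a JSONL file for fine-tuning.
record = {
    "messages": [
        {"role": "system", "content": "Record contact info verbatim."},
        {"role": "user", "content": "My email is sandra1289@gmail.com."},
        {"role": "assistant", "content": "Saved: sandra1289@gmail.com."},
    ]
}
with open("verbatim_emails.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```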

Robust Parsing of Spoken Emails

In a voice assistant context, spelling out an email (e.g., “S-A-N-D-R-A-1-2-8-9”) is tough for an AI to handle reliably. Ensure your speech recognition pipeline is optimized for dictation of email addresses. Some STT services have modes or enhanced vocabulary for email/URL transcription. Using those can reduce the initial error that the language model has to deal with. If the STT still returns a slightly imperfect string, consider using a post-processing script to correct obvious issues (like converting “at gmail dot com” to “@gmail.com”). Essentially, reduce the burden on the AI model by feeding it the cleanest possible text. The less uncertain or noisy the input, the less likely the model will hallucinate a correction. In cases where the AI still seems to fabricate, an external integration might help – for example, after capturing what it thinks is the email, you could use an email verification API (like Twilio SendGrid’s validation service) to check if that address is deliverable. If it comes back invalid, the agent can inform the user there was an error and ask to repeat the email, this time more slowly or in a phonetic way. This kind of fallback validation loop can catch mistakes that slipped past the AI and give it another chance with better guidance.
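A sketch of such a post-processing step; the substitution table is deliberately minimal and would need locale- and domain-specific expansion:

```python
import re

def normalize_spoken_email(text: str) -> str:
    """Convert common spoken email forms to a literal address (sketch only)."""
    s = text.lower().strip()
    s = re.sub(r"\s+at\s+", "@", s)    # "sandra1289 at gmail dot com"
    s = re.sub(r"\s+dot\s+", ".", s)   # -> "sandra1289@gmail.com"
    s = re.sub(r"\s+(underscore|under score)\s+", "_", s)
    s = s.replace(" ", "")             # collapse spelled-out letters
    return s
```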

AI agents occasionally err by imposing learned patterns (like name-based emails) on user input. This happens due to the model’s training tendencies and the free-form nature of its responses. To ensure accuracy in email collection, one must guide the model with structured approaches or clear instructions. By using OpenAI’s advanced features (function calls, JSON schemas), adjusting model settings, and validating critical data, we can largely eliminate the fabrication of incorrect emails. The goal is to make the AI agent a precise secretary rather than a creative guesser when it comes to email addresses. Implementing the above recommendations will help align the agent’s behavior with user expectations, resulting in reliable capture of emails and other structured data in conversational interactions.
