Cookiy Research · AI Voice Assessment · 12 Interviews · April 2026

Voice is present. Depth is missing.

Twelve participants tried an AI-voiced personality test. The voice was polite, sometimes welcome, rarely load-bearing — and never the reason a result felt true.

8 of 12 treat the voice as courtesy, not core value
Executive summary — three sentences
Participants reliably finished the test, but the AI voice was rarely what made results feel true. What landed was question design and nuance, not voice presence; when questions felt tailored, results resonated, and when they felt generic, the voice could not save them. A small but loud minority read constant voice presence as interruption or surveillance, requesting silence or refusing to continue.
8 of 12 · describe voice as a courtesy feature
6 of 12 · found results too generic or shallow
5 of 12 · flagged voice as a focus disruptor
2 of 12 · withdrew or set a hard privacy boundary
01

AI voice gets tolerated as courtesy, not as the thing that makes results feel true.

8 of 12 described the voice as pleasant-to-neutral but not a reason the results landed. What made results feel personal was question design and wording; the voice rode on top of whatever was underneath. No participant named voice as the decisive positive.

Counter: 2 of 12 (P05, P06) felt the voice made the test more interactive and 'easier'. Neither claimed the voice was the source of the result quality — only that it made completion smoother.

Implication. Push voice into an optional 'companion' role and let the question set carry the experience. Measure 'results resonated' against question-set versions, not voice variants. Voice is polish; questions are the product.

"It's very generic. It could apply to anyone."— P04 · skeptical analyst
02

Generic results read as hollow no matter how warm the voice is.

6 of 12 called results generic, shallow, or uncertain. Language included 'could apply to anyone', 'quite childish', and 'three answers is too few'. When the question set felt narrow, the voice did not recover the experience.

Counter: 2 of 12 (P05, P10) experienced results as 'tailored' and 'really good', describing specific passages that matched their self-image. The signal is about question design variance — the same voice product produced both reactions.

Implication. Measure result depth directly: ask participants to name a single phrase from their result that felt specifically theirs. Train on that signal, not on voice-satisfaction metrics that confound delivery with content.
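One crude way to operationalize that check, sketched in TypeScript below. The function name and normalization are assumptions for illustration, not a description of Cookiy's tooling; a production version would match more loosely than a substring test.

```ts
// Hypothetical "felt specifically theirs" probe: did the participant name a
// phrase that actually appears in their delivered result text?
function phraseFeltTheirs(resultText: string, namedPhrase: string): boolean {
  // Normalize whitespace and case before comparing.
  const norm = (s: string) => s.toLowerCase().replace(/\s+/g, " ").trim();
  const phrase = norm(namedPhrase);
  return phrase.length > 0 && norm(resultText).includes(phrase);
}
```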

"The survey is quite childish — there's not enough range on these questions. Three answers is too few."— P09 · skeptical analyst
03

Constant voice presence is focus friction — quiet beats empathetic.

5 of 12 described the voice as distracting mid-task. The friction was not tone; it was presence. Participants had to context-switch between answering on-screen and acknowledging the voice. 'Check in at the start / middle / end' was the explicit request, not 'be warmer'.

Counter: 2 of 12 (P05, P06) welcomed the continuous presence, finding it 'more interactive'. Tolerance for voice companionship varies sharply between participants — it is not a tuning knob, it is a toggle.

Implication. Ship a 'start / midpoint / end' voice cadence as the default. Offer 'always on' as an opt-in for participants who prefer companionship. Let participants switch mid-session without restarting.
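A minimal sketch of what that cadence toggle could look like, in TypeScript. Everything here is illustrative: the types, field names, and checkpoint logic are assumptions, not Cookiy's actual implementation.

```ts
// Hypothetical sketch: voice cadence as a session-level toggle, not a tuning knob.
type VoiceCadence = "checkpoints" | "always_on";

interface SessionState {
  cadence: VoiceCadence;
  questionIndex: number;   // 0-based position in the question set
  totalQuestions: number;
}

// Default cadence: speak only at start, midpoint, and end.
function shouldSpeak(s: SessionState): boolean {
  if (s.cadence === "always_on") return true;
  const midpoint = Math.floor(s.totalQuestions / 2);
  return (
    s.questionIndex === 0 ||
    s.questionIndex === midpoint ||
    s.questionIndex === s.totalQuestions - 1
  );
}

// Mid-session switch: change only the cadence flag, never the progress,
// so participants are never forced to restart.
function setCadence(s: SessionState, cadence: VoiceCadence): SessionState {
  return { ...s, cadence };
}
```

The design point the sketch encodes is the finding itself: cadence is a discrete toggle per participant, switchable at any question, rather than a continuous warmth dial.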

"I'm trying to answer you and interact with you, although you're AI. I still need to switch my eye to refocus on the task."— P04 · skeptical analyst
04

Open-ended answers are the unmet demand — radio buttons feel childish.

3 of 12 (P04, P09 explicitly; P07 implicitly) pushed back against multiple-choice formats. P09 made the strongest case: an LLM interpreting an open-ended answer would capture 'the nuance of your unique answers' — a capability the current three-button flow cannot touch.

Counter: 9 of 12 completed the multiple-choice format without complaint. Most did not voice a format preference at all; the demand for open-ended input is a minority signal, but it matches the 'results felt generic' complaint from a different angle.

Implication. Add an 'elaborate' micro-prompt after each multiple-choice answer: 'one sentence on why'. Feed both the choice and the prose to the interpretation model. Test whether resulting summaries move the 'felt specifically theirs' metric.
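One way the combined payload could be structured, sketched in TypeScript. The schema and field names are assumptions for illustration; the study does not describe the real interpretation pipeline.

```ts
// Hypothetical "elaborate" micro-prompt payload: the interpretation model
// sees both the structured choice and the free-text sentence.
interface Answer {
  questionId: string;
  choice: string;        // the multiple-choice selection
  elaboration?: string;  // "one sentence on why", optional
}

// Flatten answers into a prompt fragment for the interpretation model,
// keeping choice and elaboration side by side per question.
function toInterpretationPrompt(answers: Answer[]): string {
  return answers
    .map((a) => {
      const why = a.elaboration ? ` Why: "${a.elaboration}"` : "";
      return `Q${a.questionId}: chose "${a.choice}".${why}`;
    })
    .join("\n");
}
```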

"If there's an AI LLM interpreting what the person says, that would be more personalized."— P09 · skeptical analyst
05

Hard boundaries show up immediately and cannot be coaxed.

2 of 12 (P02, P07) set explicit limits — P02 withdrew over data usage concerns ('I don't want to be used for social media'); P07 asked the voice to be quiet repeatedly. P03 exited mid-session. The pattern is small in count but decisive in behaviour: these participants leave rather than adapt.

Counter: 9 of 12 proceeded without friction, either because they accepted the consent terms outright or had lower privacy sensitivity. The boundary-setters were not persuadable by incentive.

Implication. Treat early-exit rate as a first-class funnel metric, not as failed recruitment. Give participants a visible 'pause voice' and 'end session with partial data' control on every screen.
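A sketch of how early exits could be recorded as funnel data rather than discarded, in TypeScript. The event taxonomy below is hypothetical; the study only establishes that boundary-setters leave rather than adapt.

```ts
// Hypothetical sketch: early exits as first-class funnel events,
// not failed recruitment. All names are illustrative.
type ExitReason = "completed" | "privacy_boundary" | "voice_friction" | "unknown";

interface SessionRecord {
  participantId: string;
  exitReason: ExitReason;
  answersCaptured: number; // partial data kept, with consent, on early exit
}

// Early-exit rate over a batch of sessions: a funnel metric, not noise.
function earlyExitRate(sessions: SessionRecord[]): number {
  if (sessions.length === 0) return 0;
  const exits = sessions.filter((s) => s.exitReason !== "completed").length;
  return exits / sessions.length;
}
```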

"I don't consent. I don't want to be used for social media purpose."— P02 · boundary protector

Four archetypes

The Skeptical Analyst · 2 of 12
"It's very generic. It could apply to anyone." — P04 · skeptical analyst

The Boundary Protector · 2 of 12
"Can you be quiet because you're distracting?" — P07 · boundary protector

The Flow-Seeker · 4 of 12
"They were tailored to my personality. It kinda resonated with me." — P05 · flow-seeker

The Pragmatic Consumer · 2 of 12
"I think so." — P01 · pragmatic consumer

Voice is polite. Questions do the work.

The path to results that feel personal runs through question design and open-ended prompts, not warmer voice delivery.

Powered by Cookiy AI

The consumers in this report. You could talk to them.

Cookiy AI runs AI-moderated consumer interviews at scale — in hours, not weeks.

Learn more →