The Hostile Design of Corporate Hold Queues

How an accident of telephone engineering became a machine for managing human behavior — and why nobody had to conspire to build it

There is an acoustic environment that tens of millions of Americans are conscripted into every day, and almost none of them choose it. It lives most aggressively within the institutions people cannot route around: hospital billing departments, insurance carriers, pharmacy benefit managers, retail banks, credit card servicers, utilities, telecoms. These are, not coincidentally, the precise sectors where the caller arrives already depleted — sick, broke, frightened, grieving, or simply running out of the daylight hours in which any of these offices deign to answer a phone.

The texture is uncannily uniform across every one of them. Audio compressed until it has the timbre of a transistor radio left out in the rain. A six-second instrumental loop that returns and returns like a tongue worrying a chipped tooth. An advertisement that arrives louder than the music it interrupts. A patronizing wellness aphorism. An interruption every twelve to forty seconds. A volume that lurches without warning. Stretches of dead air engineered to make you certain the line has dropped. And then, occasionally, a human voice — faint, hurried, and gone before your mouth can form a word.

No one would build this room and then choose to sit in it. Yet it has become infrastructure, as unremarked-upon as the carpet in a government office. The interesting question is not whether it is unpleasant. Everyone agrees it is unpleasant. The interesting question is the one almost no one asks aloud: who decided it should feel this way, and what are they getting in return?

The answer is more unsettling than the conspiracy a frustrated caller might reach for, precisely because it is not a conspiracy at all.

An accident, then a business

Hold music was not invented to soothe anyone. It was discovered, by accident, by a man who owned a factory.

In 1962, the entrepreneur Alfred Levy found that a loose wire in his building had grounded itself against a steel girder, turning the entire structure into a crude radio receiver; callers on hold suddenly heard a broadcast signal from a station next door. Levy patented the effect in 1966 as the “Telephone Hold Program System,” and essentially every on-hold device in use today descends from that single patent. The original purpose was narrow and almost humane: silence on an early phone line read as a dropped call, so callers hung up. A study cited for decades found that roughly 70% of callers who met with dead air abandoned the line within 60 seconds, convinced they had been disconnected or disrespected. Sound — any sound — kept them there.

Notice the verb. Kept them there. From the very first patent, the function of hold audio was retention, not comfort. The two happened to align in 1966. They have been quietly diverging ever since.

The divergence began the moment someone realized what they were holding. The trade itself remembers the origin story with disarming candor: a man on hold with his bank heard a radio commercial come through the line and grasped, in an instant, the entire future of the medium. The pitch that built an industry was a single sentence, and it is still repeated, more or less verbatim, by the vendors who sell “on-hold messaging solutions” today: you already have a captive audience — use the wait to market to them.

That sentence is the whole psychology, stated without embarrassment. The waiting room had been reclassified. It was no longer dead time the institution owed you. It was inventory. By the late twentieth century, there was a trade association, a conference circuit, an awards program for the year’s finest hold productions, and a vendor’s lament that ninety-nine percent of the market remained unconverted — “wide open,” as one early operator put it, with the appetite of a man surveying unfenced land.

What had been a reassurance became a channel. What had been your time became theirs to sell back to you.

The refusal to let you settle

Here is the design choice that gives the modern hold queue away. A system built to calm a distressed person would do the obvious things. It would hold a steady volume. It would offer long, uninterrupted passages of low-information sound. It would keep a predictable cadence. It would permit silence. It would, at intervals, simply tell you the truth about how long you have left to wait.

Almost no system does any of this. They do, with striking consistency, the opposite. They interrupt. The brain begins to adapt to the loop — the loop is monotonous, and monotony is the doorway to disengagement — and then, precisely as the mind starts to drift somewhere more bearable: Did you know you can manage your account online? The interruption is not a glitch. It is the mechanism. It reaches into the half-second before you would have set the phone down and yanks your attention back to the queue.

Behavioral psychology has a name for the family this belongs to. An environment that delivers stimulus at unpredictable intervals — never quite letting the subject relax, never quite delivering the thing they are waiting for — is an intermittent reinforcement schedule, and it is the single most powerful behavioral conditioning pattern ever documented. B.F. Skinner mapped it in the 1950s with rats and pigeons pressing levers for rewards that came on no fixed schedule, and he found that variable, unpredictable reinforcement produced the most persistent, most extinction-resistant behavior of any pattern he tested. Skinner himself compared the apparatus to a slot machine and noted, without apparent remorse, that he had effectively turned his pigeons into gamblers.

The slot machine is the purest commercial expression of the principle — it is why slot machines generate the majority of casino revenue, and why a person at one will keep feeding it long after the fun has curdled into something closer to a trance. The same schedule is the engine under the social-media notification, the infinite scroll, the pull-to-refresh: a reward that might arrive on this attempt, or the next, or the one after, keeping the subject in a state of low, continuous vigilance. Not engaged. Not at peace. Just alert enough not to walk away.

The hold queue runs on this exact current. The “reward” the caller waits for is a human being, and it arrives on a schedule the caller cannot predict, and the system will not disclose. Every interruption, every false fragment of a voice, every lurch from music to advertisement to dead air is a small, unpredictable event that resets the attentional clock. The agent might pick up during this loop. Or the next. You do not dare set the phone down. You have been, in the most literal behaviorist sense, conditioned to stay on the lever.

It is worth being precise about what that costs the body. Telephony was engineered to carry intelligible speech down a thin pipe, which means it strips away bass, spatial depth, dynamic range, and the harmonic warmth that makes sound feel like it occupies a room. Layer on top of that the automated leveling and aggressive volume-maximizing of a cheap on-hold system, and you get audio that is flattened, sharp, and physiologically fatiguing. Music suffers the worst of all, because it is about dynamics and space, and the medium destroys both. A six-second saxophone phrase that the brain could resolve and dismiss if it were allowed to breathe instead repeats four hundred times without organic variation, and an unresolved, repetitive stimulus is a recognized recipe for attentional fatigue. The system does not merely bore you. It produces a measurable, low-grade stress — anticipatory vigilance plus sensory abrasion — and then bills it to your afternoon.

The medical line is the obscene case

If the hold queue is a behavioral instrument, the medical sector is where it is played at its most grotesque, because the emotional context has nothing in common with ordinary commerce. The person calling a hospital, an insurer, a specialty pharmacy, a billing office, or a nurse line is frequently in pain, sleep-deprived, financially terrified, elderly, cognitively overloaded, or in the middle of a family emergency. They have not called to be marketed to. They have called because something has gone wrong with a body.

What greets them is, with depressing regularity, the same chaos refined for cheap speakers: shrill high-mids, hyper-compressed speech, a light-jazz loop, a synthetic warmth that fools no one, and an unbroken stream of reminders about portals, apps, surveys, and policy changes. A patient calling to fight for chemotherapy authorization or to dispute a surgical bill is routed through an audio environment designed by a telecom marketing vendor whose entire professional expertise is the monetization of waiting. The contradiction is total, and it has attracted almost no public scrutiny, in part because each individual instance is too small to make a story, and the cumulative weight is too diffuse to assign to anyone.

What the system is actually optimizing for

Companies will tell you the hold experience exists to reassure and inform the caller. The economics tell a different story, and the economics are not hidden — they are sold openly, in vendor decks, as a feature.

The governing fact of the modern contact center is that a human is the most expensive thing in it. By common industry estimates, a live phone interaction costs somewhere between three and twelve dollars; a self-service interaction costs roughly a dime, and by some accountings effectively nothing. Gartner’s widely cited figure puts live phone support at nearly eight dollars per interaction, against ten cents for self-service — a ratio of roughly eighty to one. The entire discipline that has grown up around this gap even has a clinical name: call deflection, the art of redirecting a caller away from the costly human and toward the cheap app, chatbot, or FAQ. McKinsey has reported that firms that shift customers to digital channels can cut costs by around 30%. The vendors are explicit that a successful deflection rate means a large fraction of the people who picked up the phone never reach a person at all.

Read the hold queue against that arithmetic, and it stops being mysterious. A genuinely pleasant wait would be a strategic error: it would increase retention, keeping callers patiently in the queue until they consumed an expensive human. The cost-minimizing design points the other way — toward shorter calls, fewer completed calls, and the steady migration of frustrated humans onto the self-service channels the recorded voice keeps advertising. The interruptions that promise “you can do this online” are not non-sequiturs. They are the funnel speaking its own name. The phone system, staffed by humans, is engineered in part to push humans away from humans.

This is the quiet inversion at the heart of it. The hold queue is frequently described by the firms that deploy it as customer service. A more honest description is that it is a filter — one that rewards persistence, exhausts the irresolute, and routes the marginal caller toward the channel that costs the company nothing. The unpleasantness is not incidental to that function. The unpleasantness is the function. A queue that was easy to tolerate would filter no one.

The behavioral economist Cass Sunstein gave the broader phenomenon a useful name: sludge — friction deliberately engineered into a process to slow people down, wear them out, and quietly thin the ranks of those who complete it. Sunstein’s central observation is that sludge is rarely an accident or mere incompetence; it is frequently a choice, because friction is profitable to whoever is on the other side of it. Companies make subscriptions effortless to enter and a labyrinth to exit. Governments have historically used administrative burden to suppress the take-up of benefits and even the exercise of the vote. The common thread is that the obstacle serves the institution and offloads the cost onto the individual, who experiences it not as policy but as bad luck. The hold queue is sludge rendered in sound: a toll booth that takes its fee in time, attention, and a thin film of stress, collected from people who never agreed to pay it and have nowhere else to drive.            

The historical foil, named precisely

It is tempting, sitting in that audio, to reach for something more lurid — to suspect that somewhere a behavioral scientist designed the loop specifically to break you down. We do not have to speculate about whether the American government has ever engineered deliberate psychological manipulation of unwitting citizens, because it has, and the record is declassified.

From 1953 to roughly 1973, the CIA ran Project MKUltra, an umbrella of subprojects explicitly devoted to mind control and behavior modification, conducted — in the words of the official investigations that later exposed it — on “both unwitting and cognizant human subjects.” It dosed people with LSD without their knowledge; one of its victims, the Army scientist Frank Olson, died falling from a hotel window days after the program’s chief, the chemist Sidney Gottlieb, secretly spiked his drink. In 1973, anticipating scrutiny, CIA Director Richard Helms ordered the program’s files destroyed; investigators were left to reconstruct it from a misfiled cache of roughly 20,000 financial records that had escaped the shredder and surfaced in response to a 1977 Freedom of Information Act request. The Church Committee, examining MKUltra alongside a wider pattern of abuses, reached a conclusion that still anchors the law of human experimentation: intelligence agencies had undermined the constitutional rights of citizens, and there exists no inherent authority for any agency to do so. The episode is the reason institutional review boards and informed-consent requirements now govern research at every American university and hospital.

MKUltra is the right reference point here for an instructive reason — and the instruction runs opposite to the one a conspiracy-minded reader expects. MKUltra was illegal. It was a secret. It required a director to incinerate the evidence and a federal committee to drag it into daylight. It is the model of behavioral manipulation as a furtive crime that the system tried to hide.

The hold queue is the reverse in every respect. It is legal. It is documented in the vendors’ own marketing. No files need to be shredded because nothing was ever concealed: the 80-to-1 cost ratio, the deflection targets, and the captive-audience pitch are printed in trade publications and sold from conference stages. No malevolent designer sat down to engineer your distress. A thousand individually rational decisions — to cut a cost, to fill a silence, to monetize a wait, to nudge a caller toward a cheaper channel — accreted into a machine that does to millions of stressed people, every single day, a milder version of what the CIA had to break the law to attempt on a few. That is the genuinely disquieting finding. The most pervasive behavioral conditioning environment most Americans will ever inhabit did not require a conspiracy, a crime, or a secret. It required only a spreadsheet and the absence of anyone whose job was to care.

Why we don’t revolt

We tolerate it because the injury is low-grade and distributed, which is the perfect profile for an abuse that never gets named. No single hold call is a catastrophe. It is four minutes, six minutes, twenty on a bad day with a denied claim. But multiply those minutes across a population and the sectors that cannot be avoided, and you arrive at a civilization-scale sensory tax — millions of cumulative hours of manufactured, monetized irritation, levied disproportionately on the sick and the broke, with no line item and no recipient to bill.

It joins a wider family of infrastructural friction we have stopped seeing: the airport announcement that will not stop, the fluorescent hum of the waiting room, the kiosk that replaced the clerk, the notification that interrupts the dinner. Each is individually trivial. Each is individually deniable. Collectively, they amount to a steady transfer of psychological labor from institutions to the individuals they are nominally serving — a transfer so gradual and so universal that it reads as the weather rather than as a decision anyone made.

And there is a deeper cultural shift underneath: the corporate fear of silence itself. Silence is unmonetized, uncontrolled, faintly risky. So the gaps get filled with disclosures, surveys, reminders, and the eternal “your call is important to us,” until the caller occupies an environment of total informational saturation that reliably produces the opposite of reassurance. It produces exhaustion. The thing sold as care is experienced as siege.

The humane version is trivial to build

The final tell is how easy the alternative would be. A hold system designed for the human on the line rather than the cost on the ledger is not a moonshot. It is a weekend of engineering. A true silence option. Volume normalization so the ads cannot ambush you. Long, uninterrupted ambient passages that let the brain actually settle. An honest, continuously updated wait estimate. Text-callback so no one has to hold at all. User-selectable audio. A low-stimulation mode for neurodivergent or medically distressed callers. A clear signal a half-second before a human joins, so you are not caught mid-drift. The elimination of the loop and the ads entirely.

None of this is technically hard. All of it has existed, in pieces, for years. That it is so rarely assembled is the most eloquent evidence of intent we have — not the lurid intent of a hidden designer bent on your suffering, but the banal intent encoded in every system that is never built. The humane queue goes unbuilt because the current one is already doing its job: managing behavior, suppressing the expensive human interaction, monetizing captive attention, and quietly converting the institution’s operating costs into your stress.

The hold queue is not a bad design. Bad design is an accident. This is good design, pointed away from you. It is a small, almost comically mundane window into the defining move of modern systems — to optimize relentlessly for the organization while offloading the psychological burden onto the individual, and to do it so smoothly, so legally, and so universally that the person paying the tax mistakes it for the natural order of things.

The most effective manipulation never has to hide. It only has to feel normal.

Implementation Directive: Converting the Hold Queue Into a Caller-Centered Experience

A standing instruction for contact-center operations, customer-experience leadership, and telephony engineering

This directive establishes the requirements and the sequence for replacing a conventional, marketing-driven hold system with one designed around the person on the line. It is written to be adopted as policy, not as an aspiration. Each requirement names the harm it removes, the owner responsible for it, and the condition that signals completion, so that progress can be verified rather than asserted.

The governing principle is simple and applies to every section below. The wait is the institution’s cost to absorb, not the caller’s attention to monetize. Every choice that follows descends from that single reframe.

Phase Zero — Reform the measurement before changing anything else

No audio change will survive a quarterly review while the operations team is still graded on the metrics that produced the old system. As long as average handle time is a number to drive down and deflection rate is a number to drive up, every humane change registers on the dashboard as a failure: calls run longer, fewer callers are pushed away, and the cost-per-contact line rises. Asking the operations director to deliberately worsen the figure they are bonused on is asking them to volunteer for a bad review. They will, reasonably, decline. The reform, therefore, begins with the scoreboard.

The Chief Customer Officer or VP of Customer Experience is responsible for this phase, and it must be complete before any engineering work begins. Average handle time is demoted from an optimization target to a diagnostic-only metric, watched for anomalies but never rewarded for shrinking. Deflection rate is removed entirely from any individual’s incentive compensation. In their place, three metrics are installed and weighted as the primary measures of queue performance: resolution on first human contact, which counts a problem solved by a person rather than a caller routed away from one; wait-estimate accuracy, the mean gap between the time promised and the time actually waited; and caller-reported stress, gathered from a brief post-call survey. This phase costs nothing but a decision and a revised compensation sheet. It is the most important step in the entire directive and the one most likely to be skipped, because it is the only one that asks leadership to give something up.

The completion condition is a published, signed metrics framework in which handle time and deflection no longer appear as rewarded targets, and the three caller-centered metrics are live on the operations dashboard.
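As a rough illustration of how the three replacement metrics might be computed, here is a minimal sketch. The record fields (promisedSeconds, waitedSeconds, resolvedOnFirstHuman, stressScore) are hypothetical and not taken from any platform's reporting schema, and treating the "mean gap" as a mean absolute difference is an assumption.

```javascript
// Hypothetical call records; every field name here is an assumption made
// for illustration, not part of any contact-center platform's schema.
function queueMetrics(calls) {
  if (calls.length === 0) {
    throw new Error("queueMetrics needs at least one call record");
  }
  const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;

  // Wait-estimate accuracy: mean absolute gap between promised and actual wait.
  const meanEstimateGapSeconds = mean(
    calls.map((c) => Math.abs(c.waitedSeconds - c.promisedSeconds))
  );

  // Resolution on first human contact: share of calls solved by the first agent.
  const firstHumanResolutionRate = mean(
    calls.map((c) => (c.resolvedOnFirstHuman ? 1 : 0))
  );

  // Caller-reported stress: averaged only over callers who answered the survey.
  const surveyed = calls.filter((c) => typeof c.stressScore === "number");
  const meanStress =
    surveyed.length > 0 ? mean(surveyed.map((c) => c.stressScore)) : null;

  return { meanEstimateGapSeconds, firstHumanResolutionRate, meanStress };
}
```

Note that nothing in this sketch rewards a shorter call or a deflected caller; handle time would live on a separate, diagnostic-only view.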

Phase One — Deploy the silence option and normalize volume

These are the fastest available wins and exist to build internal credibility for the harder phases that follow, so they come first.

The telephony or CCaaS administrator configures a single keypress that lets the caller mute all hold audio and wait in clean silence, punctuated only by a soft tone roughly every 30 seconds to confirm that the line is still live and the caller’s place is still held. This restores hold audio to its sole legitimate original function, which was to prove the connection was active, and discards everything that accreted onto that function over the following decades. In parallel, all queue audio is passed through output normalization so that no element, advertisements above all, can play louder than the ambient bed. The most aggressive dynamic compression is removed because it is the source of the physical fatigue callers describe, even when they cannot name its cause.

None of this requires a new vendor, a capital expenditure, or a procurement cycle. It is configuration on any modern platform and is achievable within days. The completion condition is that a caller can elect to hold in silence at any point in the queue and that no audio element exceeds the normalized ceiling.
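The volume ceiling can be pictured as a simple attenuation rule. The sketch below assumes each audio element's loudness has already been measured offline in decibel-style units (for example LUFS); the function name and the sample numbers are hypothetical, and a production system would more likely rely on the platform's or an audio tool's own normalization.

```javascript
// Loudness values are assumed to be measured offline (for example in LUFS).
// gainToCeiling is a hypothetical helper, not a platform API.
function gainToCeiling(elementLoudnessDb, bedLoudnessDb) {
  // Only ever attenuate: an element already quieter than the bed is left alone.
  const gainDb = Math.min(0, bedLoudnessDb - elementLoudnessDb);
  // Convert decibels to a linear amplitude multiplier.
  const linear = Math.pow(10, gainDb / 20);
  return { gainDb, linear };
}

// A promo measured at -14 against an ambient bed at -20 is pulled down 6 dB.
const promo = gainToCeiling(-14, -20);
// A cue measured at -24 is already under the ceiling and is not boosted.
const cue = gainToCeiling(-24, -20);
```

The rule deliberately never boosts anything, so the ambient bed stays the loudest thing a caller can hear.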

Phase Two — Replace reassurance theater with honest estimates and place-holding callback

The phrase “your call is important to us” is removed from every script. It is reassurance theater: it conveys no information, and its repetition in the absence of information actively corrodes trust. It is replaced with a continuously updated, truthful statement of the caller’s position and expected wait time, drawn from queue depth and a rolling average handle time — data the platform already collects. Estimates are presented as small ranges and rounded conservatively. Promising a short wait that does not materialize is itself a form of manipulation, and the directive forbids it; an honest, long estimate is required even when the news is unwelcome.
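A minimal sketch of such an estimate follows. It assumes a crude model (queue position times rolling average handle time, divided by the number of agents taking calls) and a hypothetical spread factor of 1.25 for the upper bound; neither the model nor the parameter names come from a specific platform.

```javascript
// Crude wait estimate, assuming callers ahead are served in parallel by the
// agents currently taking calls. The spread factor is an assumption.
function waitEstimate(position, rollingHandleSeconds, activeAgents, spread = 1.25) {
  if (position < 1 || activeAgents < 1 || rollingHandleSeconds <= 0) {
    throw new RangeError("position and activeAgents must be >= 1, handle time > 0");
  }
  const expectedMinutes = (position * rollingHandleSeconds) / activeAgents / 60;

  // Conservative rounding: both ends of the range round up, never down.
  const lowMinutes = Math.ceil(expectedMinutes);
  const highMinutes = Math.ceil(expectedMinutes * spread);

  const range =
    lowMinutes === highMinutes ? `about ${lowMinutes}` : `${lowMinutes} to ${highMinutes}`;
  const message =
    `You are number ${position} in line. Expected wait in minutes: ${range}.`;
  return { lowMinutes, highMinutes, message };
}

// Sixth in line, five-minute rolling handle time, three agents on the queue.
const estimate = waitEstimate(6, 300, 3);
```

Rounding up at both ends means the caller is more likely to be answered early than late, which is the direction of error the directive permits.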

Alongside the honest estimate, the system offers a callback that preserves the caller’s place in line. The caller may hang up, retain their exact position, and receive an outbound call when an agent becomes available. This is the rare reform that serves the caller and the budget in the same motion: it ends the captive-attention dynamic completely while eliminating the cost of the held line. For that reason, it should be foregrounded to finance leadership as the phase where doing right and reducing cost point in the same direction.

The operations director and the telephony administrator share ownership of this phase. The completion condition is that every caller receives a truthful, updating estimate, that no caller hears the banned reassurance phrase, and that a caller electing callback verifiably retains their queue position.
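The place-holding behavior can be sketched with an in-memory queue in which electing callback changes how a caller will be reached but never where they stand. The HoldQueue class and its method names are hypothetical; a real deployment would use the platform's own queue and outbound-dial features.

```javascript
// Hypothetical in-memory model of a place-holding callback queue.
class HoldQueue {
  constructor() {
    this.entries = [];
  }

  // Add a caller at the back of the line and return their 1-based position.
  join(callerId) {
    this.entries.push({ callerId, mode: "holding" });
    return this.entries.length;
  }

  // 1-based position, or null if the caller is not in the queue.
  positionOf(callerId) {
    const index = this.entries.findIndex((e) => e.callerId === callerId);
    return index === -1 ? null : index + 1;
  }

  // Switch to callback WITHOUT moving the entry: the position is retained.
  requestCallback(callerId) {
    const entry = this.entries.find((e) => e.callerId === callerId);
    if (!entry) return false;
    entry.mode = "callback";
    return true;
  }

  // When an agent frees up, serve the head of the line either way.
  nextForAgent() {
    const entry = this.entries.shift();
    if (!entry) return null;
    const action = entry.mode === "callback" ? "dial-out" : "connect";
    return { callerId: entry.callerId, action };
  }
}
```

Because a callback entry is never re-queued, the completion condition (a verifiably retained position) reduces to checking positionOf before and after the request.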

Phase Three — Build the low-stimulation accessibility path

A low-stimulation queue is established as a selectable mode and as the default fallback when a caller makes no selection. It contains a long, stable, reduced-level ambient passage with no interruptions, no advertisements, no status speech, and a single gentle cue delivered shortly before an agent joins. It exists for neurodivergent callers, for callers in pain or under cognitive load, and for anyone who is simply overwhelmed, which, on a medical or financial line, is a substantial share of the people calling.

This path is to be understood and documented as a reasonable accommodation under applicable disability law, not as a courtesy. Framing it correctly moves it from a discretionary nicety into a compliance obligation, which is the budget category in which it will actually be funded. Legal and accessibility counsel should review and affirm the framing. The completion condition is a documented low-stimulation mode, reviewed by counsel, available on demand, and active by default in the absence of a caller selection.
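One way to express "selectable, and the default when nothing is selected" is a lookup that falls back to the low-stimulation profile for any missing or unrecognized input. The menu digits and the profile fields below are hypothetical, chosen only to illustrate the fallback rule.

```javascript
// Hypothetical queue profiles; field names and menu digits are assumptions.
const QUEUE_MODES = {
  "low-stimulation": { ambientBed: true, statusSpeech: false, ads: false, handoffCue: true },
  silence: { ambientBed: false, statusSpeech: false, ads: false, handoffCue: true },
  informed: { ambientBed: true, statusSpeech: true, ads: false, handoffCue: true },
};

const MENU = { "1": "informed", "2": "silence", "3": "low-stimulation" };

// No selection, a timeout, or an unknown key all resolve to low-stimulation.
function resolveQueueMode(digits) {
  const known = Object.prototype.hasOwnProperty.call(MENU, digits);
  const mode = known ? MENU[digits] : "low-stimulation";
  return { mode, ...QUEUE_MODES[mode] };
}
```

Making the fallback the accommodation, rather than an opt-in buried in a menu, means an overwhelmed caller who presses nothing still lands on the gentlest path.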

Phase Four — Sever or rewrite the marketing-on-hold relationship

This phase is placed last deliberately because it is the only one with an organized internal opponent. The on-hold media vendor’s commercial incentive runs directly opposite to this directive; the vendor is paid to fill silence with marketing, so reducing the marketing reduces the vendor’s billing. Attempting this reform first invites it to be killed before the earlier phases have built the evidence to defend it.

By this point, the caller-centered metrics from Phase Zero will show what the earlier phases produced: fewer complaints, higher first-contact resolution, and lower reported stress. With that record in hand, leadership directs that advertising and promotional content be substantially reduced and, on medical and financial lines, eliminated. Where a contractual relationship with a media vendor obstructs this, the relationship is renegotiated or ended. The completion condition is the removal of promotional content from sensitive lines and a documented ceiling on it everywhere else.

Standing requirements

The following hold permanently across all phases and all queues.

The wait estimate must always be truthful, even when the wait is long. No script may use content-free reassurance in place of information. Silence must always be available to the caller on demand. The low-stimulation path must always be reachable. Advertising must never exceed the volume of the ambient bed and must never appear at all on lines serving people in medical or financial distress. The caller-centered metrics are reviewed at every operations cadence, and any change that improves a legacy efficiency metric at the expense of a caller-centered one is presumed to be a regression and must be explicitly justified before it is retained.
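The presumed-regression rule in the last requirement is mechanical enough to check automatically. The sketch below assumes before-and-after metric snapshots with hypothetical field names; it flags a change only when a legacy efficiency metric improved while at least one caller-centered metric worsened.

```javascript
// Direction in which each metric counts as "better". Field names are
// hypothetical and mirror no particular dashboard.
const LEGACY = { avgHandleSeconds: "lower", deflectionRate: "higher" };
const CALLER = {
  firstHumanResolutionRate: "higher",
  meanEstimateGapSeconds: "lower",
  meanStress: "lower",
};

function improved(direction, before, after) {
  return direction === "lower" ? after < before : after > before;
}

function worsened(direction, before, after) {
  return direction === "lower" ? after > before : after < before;
}

// True when a legacy metric got better at the expense of a caller metric,
// which the directive presumes to be a regression needing justification.
function isPresumedRegression(before, after) {
  const legacyImproved = Object.keys(LEGACY).some((key) =>
    improved(LEGACY[key], before[key], after[key])
  );
  const callerWorsened = Object.keys(CALLER).some((key) =>
    worsened(CALLER[key], before[key], after[key])
  );
  return legacyImproved && callerWorsened;
}
```

A flag from this check does not block a change on its own; it marks the change as one that must be explicitly justified before it is retained.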

The technical work described here is, in its entirety, achievable by an existing engineering team on existing platforms within a normal release cycle. The directive’s difficulty was never technical. It was the decision, and this document is the record of that decision.

Fixing the Hold Queue: A One-Page Proposal

For the VP of Customer Experience / Chief Customer Officer

The situation. Our hold queue was never deliberately designed to frustrate callers. It is the accumulated result of ordinary decisions — fill the silence, use the wait to market, push routine calls to cheaper channels, hold down handle time — each rational on its own, which together produced an experience our customers describe as hostile. The people most exposed to it are those already under strain: customers calling about a bill they can’t pay, a claim that was denied, or a medical question that frightens them. We are adding friction to people at their lowest, and our current metrics reward us for it.

The opportunity. The fix is not a technology problem. Every change below is achievable by our existing telephony team on our existing platform within a normal release cycle. The reason it hasn’t happened is that no one has been asked, and the operations dashboard quietly penalizes anyone who tries. That makes this a decision you can make, not a project you have to wait on.

What changes, and why each one is safe to approve:

The first move is a measurement change, not an audio change. As long as we grade operations on average handle time and deflection rate, every humane improvement reads as a failure on their scoreboard, and the work dies in review. We demote handle time to a diagnostic, remove deflection rate from individual incentives, and reward three things instead: problems resolved on first contact with a human, the accuracy of the wait times we quote, and caller-reported stress. This costs nothing but a revised compensation sheet, and nothing else on this page survives without it.

We then give callers control over their own wait. A keypress for silence, so an overwhelmed caller can hold in calm rather than get caught in a marketing loop. An honest, continuously updated wait estimate that replaces “your call is important to us” with a real number. And a callback that holds the caller’s place in line, so they can hang up and we phone them when an agent is free.

That last item, callback, is the one to show finance first, because it saves money in the same motion that it serves the customer. It eliminates the cost of the held line entirely while ending the captive-wait dynamic that drives our complaints. Doing right and cutting cost point in the same direction here, which is rare and worth leading with.

Finally, we build a low-stimulation path (no interruptions, no ads, a gentle cue before an agent joins) for neurodivergent and distressed callers. Counsel should confirm the framing, but this is plausibly a required accessibility accommodation rather than a courtesy, which means it is funded out of compliance rather than fought for out of discretion.

What I am asking for. Approval to begin with the measurement change and the silence option, both of which carry no cost and no risk, and a commitment to review the caller-centered metrics at our next operations cadence. The harder steps — reducing the marketing-on-hold load — come later, defended by the complaint and resolution data the early steps will produce.

The honest scope of the claim. I am not promising that fixing our hold queue transforms the company. I am promising something I can verify within a quarter: lower caller stress, fewer complaints, faster resolution, reduced held-line cost through callback, and the closing of a real accessibility exposure. There is a larger argument that millions of these small frustrations, across every institution, add up to a measurable civic cost — and I believe it — but I am not asking you to fund a theory. I am asking you to approve a set of changes that pay for themselves and are also the right thing to do.

The decision. The engineering is trivial, and the early phases are free. The only thing standing between our customers and a queue that treats them like people is a decision that sits with you. I’m asking you to make it.


Below is a reference implementation to present alongside the proposal.

/**
 * humane-hold-queue.js
 * ---------------------------------------------------------------------------
 * A reference implementation of a CALLER-CENTERED hold/queue experience.
 *
 * Design thesis: the unpleasantness of conventional hold systems is not a
 * technical limitation. It is a set of choices encoded against the caller.
 * This service makes the opposite choices, and each one is small. The file is
 * deliberately readable rather than clever, because the argument of the whole
 * project is that the humane version was always cheap to build — what was
 * missing was the decision to build it.
 *
 * Built on Twilio Programmable Voice + TaskRouter-style queueing, but the
 * SHAPE of every handler maps directly onto Amazon Connect contact flows,
 * Genesys, Five9, or NICE. Where a concept is platform-specific I say so.
 *
 * Principles implemented here, each traceable to a specific harm it removes:
 *   1. SILENCE ON DEMAND      — caller can mute audio and wait in calm silence
 *   2. HONEST ESTIMATES       — real position + ETA, never "important to us"
 *   3. PLACE-HOLDING CALLBACK — hang up, keep your spot, get called back
 *   4. NON-INTERRUPTING AUDIO — long stable ambient bed, no ad resets
 *   5. LOW-STIMULATION MODE   — accessibility path: zero interruptions
 *   6. VOLUME NORMALIZATION   — nothing is allowed to be louder than the bed
 *   7. CLEAN HANDOFF SIGNAL   — a gentle cue a half-second before a human joins
 *
 * Run:  npm i express twilio body-parser
 *       node humane-hold-queue.js
 * Point your Twilio number's voice webhook at  POST /incoming
 * ---------------------------------------------------------------------------
 */

const express = require("express");
const bodyParser = require("body-parser");
const twilio = require("twilio");

const { VoiceResponse } = twilio.twiml;
const app = express();
app.use(bodyParser.urlencoded({ extended: false }));

// ---------------------------------------------------------------------------
// CONFIG
// ---------------------------------------------------------------------------
// PUBLIC_URL is wherever this service is reachable (ngrok in dev, your host in
// prod). Twilio fetches audio and posts callbacks here.
const PUBLIC_URL = process.env.PUBLIC_URL || "https://example.com";

// A single, long, low-information ambient bed. NOT a 6-second loop.
// The harm we are removing is the unresolved-repetition fatigue described in
// the essay: the brain cannot dismiss a tight loop, so it stays abraded.
// A several-minute ambient passage lets attention actually settle. Host your
// own normalized audio; this is a placeholder path.
const AMBIENT_BED_URL = `${PUBLIC_URL}/audio/ambient-bed-5min.mp3`;

// A quiet "still here" tone for the silence option — reassurance WITHOUT
// reclaiming attention. This is the entire original 1960s purpose of hold
// audio (prove the line is live) stripped of the marketing that accreted onto
// it over sixty years.
const HEARTBEAT_TONE_URL = `${PUBLIC_URL}/audio/heartbeat-soft.mp3`;

// The pre-agent handoff cue. A single soft tone, ~0.5s before the human joins,
// so the caller is not caught mid-drift and the agent does not open to silence.
const HANDOFF_CUE_URL = `${PUBLIC_URL}/audio/handoff-soft.mp3`;

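// How often quiet mode re-checks the queue and sounds its heartbeat. 30s is a
// floor: long enough to permit disengagement, short enough that the caller is
// never left wondering whether the line is still live. The music paths
// re-check when the ambient bed finishes instead.
const REEVALUATE_SECONDS = 30;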

// ---------------------------------------------------------------------------
// QUEUE STATE
// ---------------------------------------------------------------------------
// In production this is your CCaaS's queue (Connect/Genesys/TaskRouter). Here
// it's an in-memory model so the logic is visible. The only data we need to be
// HONEST with the caller is data the platform already collects: position and
// rolling average handle time. The estimate problem was never a data problem.
const queue = {
  waiting: [], // [{ callSid, joinedAt, mode, phoneNumber }]
  rollingAvgHandleSeconds: 240, // updated from real completions in prod
  availableAgents: 1,

  position(callSid) {
    return this.waiting.findIndex((c) => c.callSid === callSid);
  },

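  // Honest ETA: callers ahead of you, divided by agents, times avg handle time.
  // We round UP; humaneWaitPhrase turns the result into a small range, never a
  // single optimistic number. Over-promising a short wait is itself a
  // manipulation; we refuse it.
  etaSeconds(callSid) {
    const pos = this.position(callSid);
    if (pos < 0) return 0;
    // With no agent free, the caller at the front is still waiting on a call
    // in progress, so count them; otherwise position 0 would hear "you're
    // next, connecting now" while every agent is busy.
    const ahead = this.availableAgents > 0 ? pos : pos + 1;
    // Never divide by zero agents: that yields NaN or Infinity, which would
    // be read aloud as a nonsense wait.
    const agents = Math.max(1, this.availableAgents);
    return Math.ceil((ahead / agents) * this.rollingAvgHandleSeconds);
  },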

  add(call) {
    this.waiting.push(call);
  },
  remove(callSid) {
    this.waiting = this.waiting.filter((c) => c.callSid !== callSid);
  },
};

// Turn raw seconds into a humane, honest phrase. We round UP, give a range for
// short waits, and never say "important to us." If the wait is genuinely long,
// we SAY SO, because the alternative (concealing it to keep the caller
// captive) is the exact harm.
function humaneWaitPhrase(seconds) {
  // A non-finite estimate must never be read aloud as a number.
  if (!Number.isFinite(seconds)) {
    return "We can't estimate your wait right now, but you are still in line.";
  }
  if (seconds <= 0) return "You're next. An agent is connecting now.";
  const mins = Math.ceil(seconds / 60); // round up: never under-promise
  if (mins <= 1) return "Your wait is about a minute.";
  if (mins <= 10) return `Your wait is roughly ${mins} to ${mins + 2} minutes.`;
  // Long wait: be honest AND offer the exit. Honesty without an exit is just
  // bad news; honesty with callback is respect.
  return (
    `Your wait is currently about ${mins} minutes. ` +
    `You don't have to stay on the line. Press 2 and we'll call you back ` +
    `without losing your place.`
  );
}

// ---------------------------------------------------------------------------
// 1. INCOMING CALL — the front door
// ---------------------------------------------------------------------------
// First contact sets the entire tone. We do three things differently from a
// conventional system: we speak plainly, we offer the low-stimulation path
// immediately (accessibility is not buried in a menu), and we never open with
// an advertisement.
app.post("/incoming", (req, res) => {
  const twiml = new VoiceResponse();
  const callSid = req.body.CallSid;

  queue.add({
    callSid,
    joinedAt: Date.now(),
    mode: "standard",
    phoneNumber: req.body.From,
  });

  // Gather a single keypress. Note <Say> here is plain and brief — no jingle,
  // no "your call is important." We tell the caller what's true and what they
  // can control. Control is the antidote to the captive-attention dynamic.
  const gather = twiml.gather({
    numDigits: 1,
    action: "/menu",
    method: "POST",
    timeout: 6,
  });
  gather.say(
    { voice: "Polly.Joanna-Neural" },
    "Thanks for calling. You're in the queue and we'll connect you to a person. " +
      "While you wait, you have options. " +
      "Press 1 for quiet — silence with a soft tone so you know you're still connected. " +
      "Press 2 to get a callback without losing your place in line. " +
      "Press 3 for a low-stimulation wait with no interruptions. " +
      "Or just stay on, and we'll keep you updated."
  );

  // If they press nothing, that is a valid choice. Default to the calm,
  // non-interrupting standard hold rather than to marketing.
  twiml.redirect("/hold/standard");
  res.type("text/xml").send(twiml.toString());
});

// ---------------------------------------------------------------------------
// MENU ROUTER
// ---------------------------------------------------------------------------
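app.post("/menu", (req, res) => {
  const choice = req.body.Digits;
  const routes = {
    "1": "/hold/quiet",
    "2": "/callback/offer",
    "3": "/hold/low-stim",
  };
  // Record the demand signals the /metrics scoreboard reports. Without this
  // the quiet-mode and low-stim counters would stay at zero forever.
  if (choice === "1") humaneMetrics.quietModeChosen += 1;
  if (choice === "3") humaneMetrics.lowStimChosen += 1;

  const twiml = new VoiceResponse();
  twiml.redirect(routes[choice] || "/hold/standard");
  res.type("text/xml").send(twiml.toString());
});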

// ---------------------------------------------------------------------------
// 2 + 4 + 6. STANDARD HOLD — honest estimate, stable bed, no ad resets
// ---------------------------------------------------------------------------
// This is the default. Compare against a conventional queue: instead of a
// 6-second loop punctured every 12-40s by a louder ad ("intermittent
// reinforcement"), we play ONE long ambient passage and speak only between
// passes of that bed, with a true status. The re-greet is information the
// caller asked to receive, not an attention-reset they didn't.
app.post("/hold/standard", (req, res) => {
  const twiml = new VoiceResponse();
  const callSid = req.body.CallSid;

  // If an agent is free, hand off cleanly (principle 7). Remove the caller
  // from the waiting list first so everyone behind them moves up.
  if (queue.position(callSid) === 0 && queue.availableAgents > 0) {
    queue.remove(callSid);
    return connectToAgent(twiml, res);
  }

  const eta = queue.etaSeconds(callSid);

  // Wrap the status and the bed in a <Gather> so the "press 2" callback offer
  // in the wait phrase actually works while the caller is holding.
  const gather = twiml.gather({
    numDigits: 1,
    action: "/menu",
    method: "POST",
    timeout: 1,
  });
  gather.say({ voice: "Polly.Joanna-Neural" }, humaneWaitPhrase(eta));

  // <Play> the long, volume-normalized bed once. Because it is one long file
  // rather than a tight loop, the brain can disengage. The trade-off in this
  // polling sketch is that the queue is only re-checked when the bed ends; a
  // platform queue can bridge the caller mid-bed the moment an agent frees up.
  gather.play(AMBIENT_BED_URL);

  // After the bed, come back and re-evaluate honestly.
  twiml.redirect("/hold/standard");
  res.type("text/xml").send(twiml.toString());
});

// ---------------------------------------------------------------------------
// 1. QUIET MODE — silence with a heartbeat
// ---------------------------------------------------------------------------
// The caller chose silence. We honor it completely: no music, no ads, no
// speech except a single soft tone every 30s that means only "still connected,
// still holding." This restores hold audio to its sole legitimate 1960s
// function and discards everything that was bolted on afterward.
app.post("/hold/quiet", (req, res) => {
  const twiml = new VoiceResponse();
  const callSid = req.body.CallSid;

  if (queue.position(callSid) === 0 && queue.availableAgents > 0) {
    queue.remove(callSid); // leave the waiting list so the line behind advances
    return connectToAgent(twiml, res);
  }

  // ~30 seconds of true silence, then one soft heartbeat tone, then repeat.
  twiml.pause({ length: REEVALUATE_SECONDS });
  twiml.play(HEARTBEAT_TONE_URL);
  twiml.redirect("/hold/quiet");
  res.type("text/xml").send(twiml.toString());
});

// ---------------------------------------------------------------------------
// 5. LOW-STIMULATION MODE — the accessibility path
// ---------------------------------------------------------------------------
// For neurodivergent, cognitively-loaded, in-pain, or simply overwhelmed
// callers. Zero interruptions, zero speech until handoff, a single gentle
// ambient bed. TwiML's <Play> has no volume control, so the "reduced level"
// must be mastered into the audio file itself; this sketch reuses the standard
// bed as a placeholder. Framed legally, this is a reasonable accommodation,
// not a luxury, which is how it gets funded.
app.post("/hold/low-stim", (req, res) => {
  const twiml = new VoiceResponse();
  const callSid = req.body.CallSid;

  if (queue.position(callSid) === 0 && queue.availableAgents > 0) {
    queue.remove(callSid); // leave the waiting list so the line behind advances
    return connectToAgent(twiml, res, { lowStim: true });
  }

  // No status speech at all: interruption is the thing this caller most needs
  // to avoid. Just the bed, replayed when it ends, with the handoff cue doing
  // the signaling later.
  twiml.play(AMBIENT_BED_URL);
  twiml.redirect("/hold/low-stim");
  res.type("text/xml").send(twiml.toString());
});

// ---------------------------------------------------------------------------
// 3. PLACE-HOLDING CALLBACK — the alignment win
// ---------------------------------------------------------------------------
// This is the reform that serves the caller AND the budget at once: it ends
// the captive-attention dynamic entirely and removes the held-line cost. The
// caller's position is preserved; we phone them when an agent is ready.
app.post("/callback/offer", (req, res) => {
  const twiml = new VoiceResponse();
  const callSid = req.body.CallSid;
  const call = queue.waiting.find((c) => c.callSid === callSid);

  twiml.say(
    { voice: "Polly.Joanna-Neural" },
    "Got it. We'll keep your place in line and call you back at this number " +
      "when an agent is ready. You can hang up now. Goodbye."
  );

  // Mark this caller as awaiting callback. In prod, your queue keeps their
  // ordinal position; a worker watches for agent availability and originates
  // an outbound call (see dispatchCallback below). We DO NOT lose their spot;
  // doing so would just relocate the punishment.
  if (call) call.mode = "awaiting-callback";

  twiml.hangup();
  res.type("text/xml").send(twiml.toString());
});

// A worker process (cron / queue listener) calls this when the awaiting-callback
// caller reaches the front and an agent frees up. It originates the outbound
// call via the Twilio REST API and bridges to the agent.
async function dispatchCallback(call) {
  const client = twilio(process.env.TWILIO_SID, process.env.TWILIO_AUTH);
  await client.calls.create({
    to: call.phoneNumber,
    from: process.env.TWILIO_NUMBER,
    // When they answer, go straight to a clean handoff — they already waited.
    url: `${PUBLIC_URL}/callback/connect`,
  });
}

app.post("/callback/connect", (req, res) => {
  const twiml = new VoiceResponse();
  twiml.say(
    { voice: "Polly.Joanna-Neural" },
    "Hi — this is your callback. Connecting you to an agent now."
  );
  connectToAgent(twiml, res);
});

// ---------------------------------------------------------------------------
// 7. CLEAN HANDOFF — never drop a human into silence, never startle the caller
// ---------------------------------------------------------------------------
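// A single soft cue plays just before the bridge, so the caller surfaces from
// whatever they drifted into and the agent doesn't open to dead air. Small,
// but it's the difference between a jarring snap and a handoff that feels
// human on both ends.
function connectToAgent(twiml, res, opts = {}) {
  // opts.lowStim is accepted so the low-stimulation path can diverge later
  // (a softer cue, say); this sketch plays the same cue for everyone.
  twiml.play(HANDOFF_CUE_URL); // ~0.5s gentle tone
  // Bridge to your agent pool. In prod this is <Enqueue> into TaskRouter, a
  // Connect queue, or a SIP <Dial> to the available agent. Shown here as a
  // plain <Dial><Number>; AGENT_NUMBER and its fallback are placeholders.
  // (<Dial><Queue> is the AGENT-side noun: it pulls a waiting caller out of
  // a Twilio queue, so it is the wrong tool on the caller's leg.)
  const dial = twiml.dial();
  dial.number(process.env.AGENT_NUMBER || "+15555550100");
  if (res) res.type("text/xml").send(twiml.toString());
}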

// ---------------------------------------------------------------------------
// MEASUREMENT REFORM, IN CODE — Phase Zero made literal
// ---------------------------------------------------------------------------
// You cannot manage what you don't measure, and the conventional dashboard
// measures exactly the things that reward cruelty (handle time down, deflection
// up). Here we log the metrics that reward care, so the operations director's
// scoreboard can actually change. This is the most important function in the
// file even though it touches no audio.
const humaneMetrics = {
  callsHandledFirstContact: 0, // resolved with a HUMAN, not deflected away
  callbacksHonored: 0, // place-holding callbacks completed on time
  estimateAccuracy: [], // |promised - actual|, lower is better
  quietModeChosen: 0, // demand signal for the silence option
  lowStimChosen: 0, // demand signal for accessibility need

  recordEstimate(promisedSec, actualSec) {
    this.estimateAccuracy.push(Math.abs(promisedSec - actualSec));
  },
  meanEstimateError() {
    if (!this.estimateAccuracy.length) return 0;
    const sum = this.estimateAccuracy.reduce((a, b) => a + b, 0);
    return Math.round(sum / this.estimateAccuracy.length);
  },
};

app.get("/metrics", (_req, res) => {
  // Expose the humane scoreboard. Note what is ABSENT: average handle time as a
  // thing to minimize, deflection rate as a thing to maximize. Removing those
  // from the place of honor is the whole reform; the audio changes follow from
  // it almost automatically.
  res.json({
    callsHandledOnFirstHumanContact: humaneMetrics.callsHandledFirstContact,
    callbacksHonored: humaneMetrics.callbacksHonored,
    meanWaitEstimateErrorSeconds: humaneMetrics.meanEstimateError(),
    quietModeDemand: humaneMetrics.quietModeChosen,
    lowStimDemand: humaneMetrics.lowStimChosen,
  });
});

// ---------------------------------------------------------------------------
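// Start the server only when run directly (node humane-hold-queue.js), so the
// module can be required from tests without opening a port.
const PORT = process.env.PORT || 3000;
if (require.main === module) {
  app.listen(PORT, () => {
    console.log(`Humane hold queue listening on :${PORT}`);
    console.log("Point your Twilio voice webhook at POST /incoming");
  });
}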

module.exports = { app, queue, humaneWaitPhrase, humaneMetrics };
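
The file describes a worker that phones a caller back once they reach the front of the line and an agent is free, but it leaves that worker unwritten. What follows is one possible sketch of it, not the definitive version. The names nextCallbackDue and callbackWorkerTick are hypothetical, the queue object only mirrors the in-memory shape used above, and the dispatch argument stands in for dispatchCallback so the logic can run without Twilio credentials.

```javascript
// Sketch only. `queue` mirrors the in-memory shape used in the file above;
// `dispatch` stands in for dispatchCallback. Both helper names are
// hypothetical and not part of any library.

// Pure selection step: who, if anyone, should be phoned back right now?
function nextCallbackDue(queue) {
  if (queue.availableAgents <= 0) return null; // nobody free to take the call
  const front = queue.waiting[0];
  if (!front || front.mode !== "awaiting-callback") return null;
  return front;
}

// One pass of the worker. On success the caller leaves the waiting list so
// the line behind them advances; on failure they keep their place and the
// next tick retries.
async function callbackWorkerTick(queue, dispatch) {
  const due = nextCallbackDue(queue);
  if (!due) return null;
  try {
    await dispatch(due);
  } catch (err) {
    console.error("callback dispatch failed; will retry:", err.message);
    return null;
  }
  queue.waiting = queue.waiting.filter((c) => c.callSid !== due.callSid);
  return due;
}

// Tiny demonstration with a fake dispatcher instead of the Twilio REST call.
const demoQueue = {
  availableAgents: 1,
  waiting: [
    { callSid: "CA-demo-1", mode: "awaiting-callback", phoneNumber: "+15555550101" },
    { callSid: "CA-demo-2", mode: "standard", phoneNumber: "+15555550102" },
  ],
};

callbackWorkerTick(demoQueue, async (call) => {
  console.log(`would call back ${call.phoneNumber}`);
}).then((placed) => {
  console.log(placed ? `placed ${placed.callSid}` : "nothing due");
});
```

In the service above this could run on a timer, for example setInterval(() => callbackWorkerTick(queue, dispatchCallback), 5000). The five-second interval is an arbitrary choice. Keeping the selection step pure means the "do not lose their spot" rule can be tested without placing a single real call.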