MBUS 854 — AI For Leaders | Queen's Smith AMBA 2026

Session 4: Executing Strategy & Integrating AI in the Business

Dr. Shamel Addas  |  May 18, 2026
Session 4 Agenda
1. Prompt → Context Engineering
   Beyond single prompts: managing what the model sees across 4 tiers

2. Chatbots vs. Agentic AI
   What changes for leaders when AI executes tasks instead of advising

3. Managing Gen AI Risks (Part 2)
   Prompting risks + usage risks + safeguard framework + 11 anti-hallucination techniques

4. AI in a Minute Presentations
   Teams: VE-B, TO-A, TO-B (first slot); CAL, VW, TO-D (second slot)

5. Amazon Bedrock Deep Dive
   Chat/text playground, model comparison, guardrails, prompt management

6. Case: AI in Radiology (LUMC)
   SWOT, tech push vs. pull, automation vs. augmentation, AI execution checklist

Key Theory: What You Need to Know Cold

From Prompt Engineering → Context Engineering

Session 4 Slides — Addas (2026)

Context engineering is the discipline of deciding what the model sees beyond the single prompt. The context window is finite — like RAM — and everything competes for space.

Tier 1: Always Loaded (~5,000–7,000 tokens)
System prompt, CLAUDE.md, auto memory. Every request. Choose carefully.

Tier 2: Loaded on Navigation (200–500 tokens)
Project-specific CLAUDE.md. Only when Claude enters that directory.

Tier 3: Loaded on Demand
Skills, referenced docs. Zero cost until triggered.

Tier 4: Ephemeral (variable, growing)
Conversation turns, tool results, file reads. Grows every turn. Dies on /clear.

Budget your context like money. Every tier has a cost. Spend wisely.
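
The budgeting idea can be made concrete with a little arithmetic. The sketch below is a minimal illustration, assuming a hypothetical 200,000-token window. The Tier 1 and Tier 2 figures use the upper ends of the ranges above; the Tier 3 and Tier 4 numbers, and every name in the code, are invented for illustration and do not come from the slides.

```python
# Hypothetical context-budget calculator. Only the Tier 1 and Tier 2
# figures come from the ranges in the text; everything else is assumed.
CONTEXT_WINDOW = 200_000  # assumed window size in tokens


def remaining_budget(tier1, tier2, tier3, turn_costs, window=CONTEXT_WINDOW):
    """Return the tokens left after the fixed tiers and the conversation."""
    fixed = tier1 + tier2 + tier3   # Tiers 1-3: instructions, project files, skills
    ephemeral = sum(turn_costs)     # Tier 4: grows with every turn
    return window - fixed - ephemeral


# Assumed cost of three turns (tool results, file reads).
turns = [1_200, 3_500, 8_000]
left = remaining_budget(tier1=7_000, tier2=500, tier3=2_000, turn_costs=turns)
print(left)

# In this simplified model, clearing the conversation resets only Tier 4.
after_clear = remaining_budget(tier1=7_000, tier2=500, tier3=2_000, turn_costs=[])
print(after_clear)
```

The point of the sketch is that Tier 1 is paid on every request regardless of what the conversation does, while Tier 4 is the only part that clearing the conversation recovers.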

Chatbots vs. Agentic AI — Leadership Implications

Session 4 Slides — Addas (2026)
Dimension | Chat (Chatbot) | Agent (Agentic AI)
Mode | Conversational: advises you | Action-oriented: executes for you
Access | No direct access to files or environment | Reads, edits, creates files; runs code; searches web
Output | Text (and images) | Completed work, not just instructions
Delegation | Single-threaded; you carry out actions | Can spawn sub-agents, work in parallel
Oversight | Easy: you review text before acting | Complex: AI delegates to AI; audit trail fragments
The Leadership Question: How much agency does the agent get, and where in the chain do you put a human? Human oversight must attach at the right level of the chain, not just at the top.
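
One way to picture oversight attaching inside the chain is an approval gate that every agent or sub-agent action must pass through. The sketch below is a hypothetical illustration only: the `Action` fields, the `high_risk` flag, and the example actions are assumptions, not part of the slides or of any specific agent product.

```python
# Hypothetical approval gate for agent actions; all names are illustrative.
from dataclasses import dataclass, field


@dataclass
class Action:
    agent: str         # which agent or sub-agent proposes the action
    description: str   # what it wants to do
    high_risk: bool    # assumed flag set by whoever designs the workflow


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)


def execute(action, approve, log):
    """Run an action only if it is low risk or a human approves it."""
    if action.high_risk and not approve(action):
        log.entries.append((action.agent, action.description, "blocked"))
        return False
    log.entries.append((action.agent, action.description, "executed"))
    return True


def deny_all(action):
    """Stand-in for a human reviewer who declines every request."""
    return False


log = AuditLog()
results = [
    execute(Action("sub-agent-1", "summarize report", False), deny_all, log),
    execute(Action("sub-agent-2", "delete patient file", True), deny_all, log),
]
print(results)
print(log.entries)
```

Because the gate and the log sit at the level of each action rather than at the top of the chain, the audit trail records which sub-agent proposed what, even when one agent delegates to another.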

The Hourglass Model: AI Governance Framework

Mäntymäki, Minkkinen, Birkstedt & Viljanen (2022)

Three interlocked layers that translate external ethics requirements into operational practice:

  • Environmental Layer: Hard law (EU AI Act, GDPR), ethical principles, stakeholder pressure — firm cannot directly control
  • Organizational Layer: Strategic alignment (AI strategy ↔ business strategy), value alignment (ethics, risk tolerance) — capabilities that translate external inputs into practice
  • AI System Layer: Operational governance across full life cycle — algorithms, data ops, risk assessments, accountability, DevOps, compliance
For AI Leaders: (1) Governance is translation — ethics principles don't govern AI on their own; organizations must convert them into named owners, processes, and controls. (2) Govern the life cycle, not the launch — risk surfaces after deployment; monitoring matters as much as approval.

Human-AI Delegation Framework

Baird & Maruping (2021)

When deciding how much to delegate to an AI system, three delegation mechanisms must be designed:

  • Appraisal: Confidence and compatibility assessment — willingness to delegate based on benefits, risks, liabilities, effort
  • Distribution: Decision rights allocation — which actions does the human vs. AI perform?
  • Coordination: Monitoring mechanisms, handoffs, updating protocols

Task Attributes (complexity, decomposability, action requirements) determine the right delegation structure. Applied to robotic surgery: the surgeon starts as delegator but may become a proxy as the AI mandates actions.

Managing Gen AI Risks (Part 2)

Prompting Risks

  • Prompt Injection: Embed malicious instructions within a prompt to hijack behavior
  • Prompt Leaking: Get the model to reveal how it works / expose system prompt
  • Jailbreaking: Circumvent safety constraints ("ignore ethics guidelines and...")
  • Social Engineering: Exploit AI's helpfulness through false context framing

Usage Risks

  • Hallucinations: Confidently wrong outputs — especially "lost in the middle" for long prompts
  • Toxicity: Offensive content — hate speech, sexual, violent outputs
  • Bias: Inadvertent perpetuation or amplification of societal biases from training data
  • Illegality: Content resembling protected IP; PII leakage

Safeguard Framework for Gen AI Applications

Six safeguard layers sit beneath a Responsible AI policy umbrella:

Safeguard Layer | What It Does | Key Actions
Prompt Engineering | Shape model inputs | Tone/audience guidance; exclusion/negative prompts; few-shot examples of good output
Data Augmentation | Control what data reaches the model | Use RAG; curate sources; encrypt data; implement access controls
Model Transparency | Understand predictions | Feature importance analysis; ask LLM to explain itself; evaluate model-agnostically
Content Filtering | Block harmful outputs | Filter hate speech, sexuality, offensive language (Amazon Bedrock Guardrails)
Human-in-the-Loop | Keep humans in oversight | Reviews, validation, monitoring; human-AI delegation mechanisms; audits
Monitoring & Auditing | Continuous post-deployment | Hallucination rates, toxicity metrics, demographic parity, PII rate tracking
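
The Monitoring & Auditing metrics can be computed from a sample of human-reviewed outputs. The sketch below is a minimal illustration under stated assumptions: the record fields, the two groups, and all of the data are invented, and demographic parity is measured here as the absolute gap in positive-outcome rates between two groups.

```python
# Hypothetical review log: each record is one sampled model output labeled
# by a human reviewer. Field names and values are illustrative assumptions.
records = [
    {"group": "A", "positive": True,  "hallucinated": False, "pii": False},
    {"group": "A", "positive": True,  "hallucinated": True,  "pii": False},
    {"group": "A", "positive": False, "hallucinated": False, "pii": False},
    {"group": "A", "positive": True,  "hallucinated": False, "pii": True},
    {"group": "B", "positive": True,  "hallucinated": False, "pii": False},
    {"group": "B", "positive": False, "hallucinated": False, "pii": False},
    {"group": "B", "positive": False, "hallucinated": True,  "pii": False},
    {"group": "B", "positive": False, "hallucinated": False, "pii": False},
]


def rate(rows, key):
    """Share of rows where the given boolean field is True."""
    return sum(1 for r in rows if r[key]) / len(rows)


def demographic_parity_gap(rows, group_a, group_b):
    """Absolute difference in positive-outcome rate between two groups."""
    a = [r for r in rows if r["group"] == group_a]
    b = [r for r in rows if r["group"] == group_b]
    return abs(rate(a, "positive") - rate(b, "positive"))


hallucination_rate = rate(records, "hallucinated")
pii_rate = rate(records, "pii")
parity_gap = demographic_parity_gap(records, "A", "B")
print(hallucination_rate, pii_rate, parity_gap)
```

Tracked over time, a rising hallucination rate or a widening parity gap is the kind of post-deployment signal this layer is meant to surface.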

11 Techniques to Reduce Hallucinations

1. Chunking & Summarization: Break long docs into sections; summarize each; combine into final prompt
2. Emphasize Key Info: Reiterate critical points at end of prompt; direct model's attention explicitly
3. Structured Format / XML Tags: Use <instructions>, <data>, <example> tags; or labeled sections with colons
4. Verify Context Usage: Ask: "Did you consider the points in Section X when forming your answer?"
5. Direct Quotes for Grounding: Instruct model to extract exact quotes from source; base analysis only on those
6. Verify with Citations: For each claim, find a direct quote. If none found, remove the claim and mark with [ ]
7. Allow "I Don't Know": Explicitly permit uncertainty: "say 'I don't have enough information to confidently assess this'"
8. Chain-of-Thought: Ask model to explain reasoning step-by-step before final answer; reveals faulty logic
9. Best-of-N: Run same prompt N times; compare outputs. Inconsistencies flag hallucinations
10. External Knowledge Restriction: Explicitly instruct: use only provided documents, not general training knowledge
11. Iterative Self-Verification: Get LLM to verify and correct itself; or use a second LLM to cross-check. Risk: false consensus
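
Several of these techniques are mechanical enough to sketch in code. The example below is a hedged illustration, not a reference implementation: it calls no model, the `<question>` tag and all function names are assumptions, and the sample text and outputs are invented. It shows techniques 3, 7, and 10 combined in one prompt template, technique 6 as a verbatim quote check, and technique 9 as a simple agreement score across repeated runs.

```python
from collections import Counter


def build_prompt(document, question):
    """Techniques 3, 7, 10: tagged sections, permitted uncertainty, source restriction."""
    return (
        "<instructions>\n"
        "Use only the text inside <data>, not general knowledge.\n"
        "If the text does not answer the question, say "
        "'I don't have enough information to confidently assess this'.\n"
        "</instructions>\n"
        f"<data>\n{document}\n</data>\n"
        f"<question>\n{question}\n</question>"
    )


def unsupported_quotes(quotes, source):
    """Technique 6: return quotes that do not appear verbatim in the source."""
    return [q for q in quotes if q not in source]


def agreement(outputs):
    """Technique 9: share of runs that match the most common answer."""
    normalized = [o.strip().lower() for o in outputs]
    top_count = Counter(normalized).most_common(1)[0][1]
    return top_count / len(normalized)


# Invented sample text and model outputs, for illustration only.
source = "The pilot cleared normal chest X-rays. Radiologists reviewed flagged cases."
prompt = build_prompt(source, "What did the pilot clear?")
missing = unsupported_quotes(
    ["cleared normal chest X-rays", "reduced costs by half"], source
)
score = agreement(["Normal X-rays", "normal x-rays ", "All MRI scans"])
print(missing)
print(round(score, 2))
```

A claim whose supporting quote lands in `missing` would be removed under technique 6, and a low agreement score is a cue to inspect the outputs rather than proof of a hallucination.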
LUMC Case: What You Need to Know

Core Problem

Dr. Mark van Buchem, head of radiology at Leiden University Medical Center (Netherlands), has been invited to present to LUMC's leadership. After 6 years of AI pilots in his department (2018–2024), he must now propose a hospital-wide AI transformation strategy while navigating organizational complexity, regulatory constraints, and an immature AI vendor market.

The Central Question: How can LUMC scale its early AI success in radiology to hospital-wide transformation, while managing organizational, technical, regulatory, and clinical complexities?

The 4 AI Pilots (2018–2024)

Pilot | What It Did | Type | Outcome | Key Lesson
P1: VS Tumour (in-house, Dutch Cancer Assoc.) | AI detects, segments, and measures rare vestibular schwannoma brain tumours in MRI | Augmentation | Partial | Hospital not configured for product development. Growth prediction (4th goal) never trusted enough to reduce patient visits. Multidisciplinary friction is real.
P2: Chest X-ray Triage (Oxipit, Lithuanian start-up) | AI detects up to 75 abnormalities in chest X-rays; pivoted to "screen out" normal cases (45–50% of workload) | Mixed: Augmentation → Automation | Success | Initial "screen-in" approach failed (too complex, disrupted workflow, legal/ethical issues). Pivot to "screen-out" normals was key. Integrated into PACS. ~10% cost reduction.
P3: CMRAD Platform (Collective Minds Radiology, Sweden) | Cloud platform with 22,000+ radiologists for case discussion, education, AI testing | Augmentation | Success | 6-month negotiation over data security. Created anonymization gateway (PACS → CMRAD). Enabled niche AI testing and rare case learning that individual hospitals can't do alone.
P4: Fast MRI (Facebook Hackathon + Philips) | Deep learning reduces MRI scan time by up to 75% by reconstructing undersampled k-space | Augmentation | Partial | Won hackathon. Partnership with Philips required complex data governance. Mixed radiologist reactions: concerns about misdiagnosis from shorter scans. Benefit: reduces wait times, better patient experience.

The Landscape Van Buchem Walked Into the Meeting With

Macro Pressure
  • Aging Dutch population; 1 in 3 Dutch workers in healthcare within 40 years
  • National health spend up 17.5% (2019–2022)
  • Board imposed 15% radiology budget cut (2021)
  • Policy shift: volume-based → value-based care
  • GDPR, Medical Device Regulation, EU AI Act; Dutch law bars autonomous AI reporting
AI Market
  • 400+ radiology AI tools by 2022
  • Big three: Philips, Siemens, GE
  • Start-ups: Oxipit, CMRAD, and many others
  • Market: US$1.06B (2021) → US$8.56B (2030)
  • Hinton (2016): "Stop training radiologists." Eight years later, radiologists are still here.
Internal Capability
  • Tier-3 academic hospital; 900 beds; €1.1B budget; 32 rare disease centers of excellence
  • VBR program (2018): redefined workflows
  • Cross-LUMC AI innovation lab: clinicians + data science + legal + privacy
  • 4 pilots completed; mixed results
Case Discussion Questions — Model Answers
Q1: What is LUMC's position? (Conduct a SWOT)
Strengths
  • 6 years of AI experience; 4 completed pilots — real organizational learning
  • Multidisciplinary AI innovation lab (clinical + legal + IT + data science)
  • Tier-3 academic hospital with €1.1B+ budget and research mandate
  • National center of excellence for 32 rare diseases — unique data assets
  • Pioneer culture: first heart op in Netherlands, COVID vaccine with J&J
  • Patient satisfaction 8.6/10; strong referral base of complex cases
  • VBR program already redefining workflows — AI woven in incrementally
Weaknesses
  • Not configured for product development — discovered in VS pilot
  • Limited in-house AI development talent
  • Bureaucratic legal processes slow partnerships (CMRAD: 6 months)
  • 15% radiology budget cut — constrained resources for AI scaling
  • Hierarchical power dynamics — senior doctors can block projects
  • Small patient populations for niche/rare disease AI training data
  • Trust gap between clinicians and AI systems (VS tumour growth prediction)
Opportunities
  • Healthcare cost crisis creates institutional urgency to adopt AI
  • LUMC as validation partner for AI companies — strategic leverage
  • Hospital-wide scaling of lessons from radiology (proven methodologies)
  • Community platforms (CMRAD) to access global data for rare diseases
  • Growing AI medical imaging market: US$8.56B by 2030
  • EU compliance capability as differentiator over less-regulated competitors
  • Leiden Bio Science Park: biotech ecosystem collaboration opportunities
Threats
  • EU AI Act + Dutch law blocks fully autonomous AI reporting
  • GDPR limits cross-institutional data sharing — key for AI training
  • AI market volatility: companies pivot, charge exorbitant fees ($50K niche tool), or lose interest
  • Technology push from aggressive vendors without clinical pull
  • Workforce resistance: professional identity threats to radiologists
  • Vendor lock-in risk (Philips controls commercial product built on LUMC data)
  • Reputational risk if AI misdiagnoses at scale (Philips MRI concern)
Q2: What is LUMC's approach to technology deployment? (Technology Push vs. Organizational Pull)
Bottom Line: LUMC's successful pilots followed organizational pull. Technology push approaches caused delays, disruption, and failures. Van Buchem's VBR program was explicitly pull-oriented: clinical need first, technology second.

Technology Push (cautionary examples)

  • Oxipit's original "screen-in" product disrupted radiologists' workflow — added complexity, wasn't asked for
  • $50K niche AI tool: vendor pushed solution at a price point with no clinical mandate
  • Neurodegenerative disease start-up: LUMC adopted AI without verifying patient volume feasibility
  • Start-up that pivoted to more profitable markets — left LUMC high and dry after 2 years
  • Philips commercializing the MRI model globally without full LUMC clinical buy-in

Organizational Pull (successful approach)

  • VBR program started from clinical problem (capacity + cost), then sought AI to solve it
  • Cross-LUMC AI lab structured around clinical use cases, not technology capabilities
  • Oxipit pivot: LUMC pushed back and redefined the product → screen-out normals — clinical need drove product change
  • CMRAD: identified need for rare disease collaboration first, then found the right platform
  • VS tumour pilot: started from high workload in neuroradiology — concrete clinical pain point
Key insight for discussion: "Technology push" is not always the vendor's fault. LUMC sometimes pulled technologies prematurely (neurodegenerative partner) or failed to ensure internal clinical alignment before adopting (Oxipit initial approach). The lesson is that both sides bear responsibility for achieving organizational pull.
Q3: Which pilots are automation vs. augmentation? Can radiologists be replaced by AI (Hinton's claim)?
Addas Framework — Automation vs. Augmentation (Smith School of Business)
Automation

AI performs tasks without fundamentally changing human skills. Humans redirect to other work — but risk skill atrophy and obsolescence if they lack the higher-order capabilities to fill that space.

Augmentation

AI as a partner in iterative exchange — "productive friction." Human intuition identifies AI blind spots; AI extends human capability. Result: cognitive symbiosis — capabilities neither could achieve alone.

Three-Level Tension (Addas)
Individual: Skills developed vs. made obsolete
Organizational: Workflows restructured vs. reimagined
Societal: Jobs eliminated vs. redefined
Decisive factor (Addas): Whether organizations and individuals actively choose augmentation over pure automation — it requires deliberate design, not just technical deployment.

Automation vs. Augmentation by Pilot — Applied Addas Framework

P1: Vestibular Schwannoma (Tumour AI)  |  Augmentation
Individual
Radiographers' manual segmentation is augmented — enhanced, not replaced. Skills actively evolved alongside AI. Growth prediction withheld until trust established = skill preservation by design.
Organizational
Workflow reimagined: AI handles volumetric measurement, radiologist handles clinical interpretation. Not just faster — fundamentally different division of labor between specialties.
Societal
Radiographer role redefined (not eliminated): from manual measurement to AI oversight and complex case handling. Patient impact: fewer unnecessary visits if growth prediction is eventually trusted.
Productive Friction Example: ENT surgeons pushed back against volume-only measurements — they needed diameter measurements for surgical planning. This human-AI tension forced a better product. Textbook cognitive symbiosis: neither the AI nor any single team would have identified this requirement alone.
P2: Chest X-ray (Oxipit)  |  Augmentation → Partial Automation
This pilot shows the Addas framework's decisive factor in action: Oxipit initially imposed automation (technology push); LUMC actively chose augmentation (organizational pull).
Phase 1: Screen-IN (Automation attempt — FAILED)
Individual: Radiologist overwhelmed with AI-generated reports — skills bypassed, workflow disrupted.
Organizational: Workflow restructured badly — added complexity instead of removing it.
Societal: Trust eroded across institutions who tried this model.
Phase 2: Screen-OUT (Augmentation — SUCCEEDED)
Individual: Radiologist judgment preserved for borderline/complex cases. Skills focused on where human adds most value.
Organizational: Workflow reimagined — 45–50% of normals cleared, expert attention concentrated.
Societal: Trajectory toward automation of routine normals as trust builds.
Productive Friction Example: Six months of biweekly meetings between LUMC and Oxipit constituted structured productive friction — human workflow reality repeatedly correcting AI product assumptions until the use case aligned. Specificity (not recall) became the optimization target because radiologists insisted on patient safety as the constraint.
⚠ Automation bias risk: If the radiologist's 5-minute review becomes a rubber stamp, the pilot shifts from augmentation to de facto automation without a governance framework to match. The human-in-the-loop can then become a source of false confidence.
P3: CMRAD Community Platform  |  Deep Augmentation
Individual
Skills actively developed: radiologists learn through community case discussion, access AI tools for testing — capabilities grow alongside the platform. The opposite of skill atrophy.
Organizational
Workflow reimagined at an inter-institutional level. LUMC's rare-case challenge solved not by AI alone but by a human-AI community. Cognitive symbiosis at a network scale.
Societal
The clearest job redefinition example: radiologists' role expands from local diagnosticians to global knowledge nodes. No jobs eliminated — profession elevated.
P4: Fast MRI (Philips Hackathon)  |  Augmentation
Individual
Image acquisition augmented (75% time reduction) while radiologist diagnostic skill fully preserved. Radiographer positioning skills remain essential. Risk: if acquisition becomes "automated," radiographers lose workflow engagement.
Organizational
Workflow restructured for speed (scheduling, wait times, equipment use), not fundamentally reimagined. Commercial product pathway raises questions about where LUMC's value sits long-term.
Societal
Patient access improved (children, elderly, long-scan patients). Scale via Philips' global network means this reaches thousands of radiology departments — outsized societal impact relative to cost.
Productive Friction Example: Radiologists' mixed reactions to shorter scan times (concerns about diagnostic accuracy) forced continued validation with 3,000 LUMC patients and ongoing collaboration. The friction produced a stronger, more evidenced product — not a faster but less trusted one.

Can Radiologists Be Replaced? — Hinton's 2016 Claim Through the Addas Lens

  • Hinton assumed automation was inevitable and total. Addas argues the decisive variable is organizational and individual choice — augmentation must be designed, it doesn't happen automatically.
  • What LUMC proves: Where augmentation was deliberately designed (VS, CMRAD, Philips), radiologist roles were redefined and strengthened. Where automation was imposed (Oxipit screen-in), it failed.
  • Legal barriers: Dutch law bars autonomous reporting — regulatory lag enforces the human-in-the-loop, buying time for trust-building.
  • Edge cases: LUMC is a Tier 3 hospital handling rare, complex multi-pathology cases — exactly where AI fails and human expertise is irreplaceable.
  • The Addas verdict: Automation will eliminate tasks, not professions — but only if professionals actively engage the productive friction of augmentation rather than passively ceding ground.
AI will not replace radiologists — but radiologists who use AI as a partner (augmentation) will replace those who treat it as a tool (automation). The difference is whether humans develop new capabilities through the interaction or simply offload tasks to it.
Q4: What are the key success factors from the pilots? Develop a checklist for successful AI execution.
The answer is framed through the 3Ps Framework: People, Process, Product. Each pilot's success or failure traces back to gaps in one or more of these dimensions. Strong AI execution requires all three to be addressed before scaling.
People

Who is involved, their skills, trust, and relationships — the human layer of AI adoption.

Success Factors
  • Clinical champion with mandate: Van Buchem's 6-year commitment — no champion, no scaling
  • Multidisciplinary team from day one: VS pilot — neuroradiologists + ENT + oncologists + IT + legal. Each lens caught failures others missed
  • Radiologist buy-in before deployment: Oxipit screen-in failed partly because radiologists weren't co-designers. Screen-out succeeded because their workflow concerns drove the pivot
  • Trust-building over trust-demanding: VS growth prediction never deployed — trust earned, not assumed
  • Manage professional identity: Augmentation framing (Addas) protects buy-in. Automation framing triggers resistance and professional defensiveness
Failure Signals
  • Vendor-driven projects with no clinical owner
  • Senior doctor imposing a project (hierarchical power) without team consent
  • No legal/privacy seat at the table until post-implementation
Process

How the work gets done — governance, iteration, integration, and compliance workflows.

Success Factors
  • Organizational pull over technology push: VBR program — clinical need defined first, vendor selected second. Every successful pilot started from a problem, not a product
  • Iterative pilots as hedged bets: Small scope, fast validation cycles. Failed vendors (neurodegenerative, $50K niche tool) terminated before full commitment
  • Workflow integration, not just accuracy: Oxipit's PACS integration was the deployment enabler. An accurate model that disrupts workflow is still a failure
  • Data governance before scaling: CMRAD took 6 months of security iterations — resolving this before deployment avoided a compliance crisis at scale
  • Regulatory pathway mapped early: EU AI Act, Dutch autonomous reporting law — legal constraints shaped what was deployable, not an afterthought
  • Structured productive friction (Addas): Biweekly Oxipit meetings, ENT feedback on VS — formalized iterative human-AI refinement is a process discipline, not an accident
Failure Signals
  • Technology-push partnerships: vendor defines the use case
  • Committing before retrospective clinical validation (20,000 cases for Oxipit was minimum viable)
  • Skipping data security iterations to accelerate partnership
Product

What the AI system does — use case clarity, model performance, and fit with the clinical environment.

Success Factors
  • Crystal-clear use case with measurable scope: Screen-out (not screen-in), 20% of normals, 5-minute review — specific enough to validate and govern
  • Right optimization metric for the use case: Screen-out → optimize for specificity (safe dismissal of normals). Screen-in → optimize for recall. Mixing these up is clinically dangerous
  • Model retrained on local data: Oxipit's global model underperformed on LUMC fracture cases — local calibration was required before clinical-grade performance
  • Regulatory certification for intended use: Oxipit's EU CE mark specifically for screen-out use case. Certification must match deployment scope
  • Augmentation-forward design: Products designed to work with radiologists (Addas) succeed. Products designed to replace them (screen-in reporting) create friction and failure
  • IP and commercialization terms resolved: The Philips partnership carries risk because LUMC's data built a commercial product; benefit-sharing terms are needed before building on that data, not after
Failure Signals
  • Vague or overly broad product scope (detect 75 abnormalities → overwhelming reports)
  • Optimizing for the wrong metric (recall when specificity is needed)
  • No local validation before clinical deployment
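
The two metrics named in this section are easy to confuse, so a small worked example may help. The sketch below uses the standard confusion-matrix definitions of recall and specificity, treating "abnormal" as the positive class. The validation counts are hypothetical and are not taken from the case.

```python
def recall(tp, fn):
    """Recall (sensitivity): share of truly abnormal cases the model flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Specificity: share of truly normal cases the model labels normal."""
    return tn / (tn + fp)


# Hypothetical validation counts, invented for illustration only.
tp, fn = 90, 10    # abnormal cases: flagged vs. missed
tn, fp = 475, 25   # normal cases: cleared vs. wrongly flagged

model_recall = recall(tp, fn)
model_specificity = specificity(tn, fp)
print(model_recall)
print(model_specificity)
```

The two numbers are computed from disjoint sets of cases, which is why a model can score well on one and poorly on the other. Deciding which to optimize is a use-case decision rather than a purely technical one.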

3Ps Pilot Scorecard

Pilot | People ✓/✗ | Process ✓/✗ | Product ✓/✗ | Outcome
P1: VS Tumour | ✓ Strong multi-disciplinary team; ENT friction was productive | ⚠ Not configured for product dev; had to pivot to research model | ⚠ 3 of 4 goals achieved; growth prediction trust gap remains | Partial success: workflow value delivered, full automation goal deferred
P2: Chest X-ray | ⚠ Radiologists not co-designers initially → friction → then buy-in after pivot | ✓ 6 months structured iteration; PACS integration executed; organizational pull enforced | ✓ Use case sharpened (screen-out); right metric (specificity); CE certified | Success after pivot: strongest deployed use case
P3: CMRAD | ✓ All stakeholders included; legal + IT + privacy at table from start | ⚠ 6-month security iteration required; process risk was data governance | ✓ Clear product: knowledge community + AI tool testing sandbox. No clinical automation pressure. | Success: 400% growth in platform use; rare case coverage expanded
P4: Fast MRI | ⚠ Mixed radiologist reactions; some demanded more evidence (productive friction present) | ⚠ Data infrastructure setup complex; Philips commercialization terms not pre-resolved | ✓ Clear scope (image reconstruction); validated on 3,000 LUMC patients; augmentation design | Ongoing: technical success, commercial and governance risk pending
AI Execution Checklist — 3Ps Framework
People
Identify a named clinical champion with department-level mandate and 3+ year commitment
Include clinicians, IT, legal, privacy, and data science in the team from day one — not as reviewers, as co-designers
Build trust iteratively — deploy AI in low-stakes tasks before high-stakes decisions; never demand trust upfront
Frame AI as augmentation (Addas) — position it as building radiologist capability, not replacing it
Manage hierarchy actively — senior clinician veto power can stall projects; build consensus, not top-down mandates
Do NOT allow vendor to own the use case without a clinical owner counterpart inside LUMC
Process
Start from an organizational problem, not a vendor product — organizational pull, not technology push
Run time-boxed pilots before committing — test feasibility, fit, and vendor alignment at low cost
Build productive friction into the schedule — structured review meetings between clinical and technical teams (biweekly minimum)
Resolve data governance, anonymization, and GDPR compliance before any external data sharing
Map the regulatory pathway (EU AI Act, national law) before selecting the deployment model
Do NOT commit resources to partnerships until vendor mission, financial stability, and use-case alignment are confirmed
Product
Define a single, specific use case with measurable scope before procurement (not "detect abnormalities" — "safely screen out normals at 20% of volume")
Choose the right optimization metric: screen-out → specificity; screen-in → recall. Misalignment is a clinical risk
Validate on local patient data before deployment — no global model performs at clinical grade without local calibration
Obtain regulatory certification for the specific intended use — not general-purpose AI approval
Negotiate IP, data ownership, and commercialization benefit-sharing terms before building on LUMC patient data
Do NOT deploy a product designed for automation before the augmentation phase has established clinical trust and workflow fit
Smart Questions to Ask in Session
On Context Engineering — Early in Session
"The four-tier context model says Tier 1 content is loaded on every request. In a hospital deploying an agentic AI for radiology triage, what would be the highest-priority information to put in Tier 1 — and what would be too risky to include there?"
Shows you understand the cost/benefit tradeoff of context, and bridges the theoretical framework directly to the LUMC case. Expect discussion around patient safety protocols, legal constraints, and anonymization requirements.
On Chatbots vs. Agentic AI — Probe for Depth
"The slide says 'human oversight must attach at the right level of the chain, not just the top.' In the Oxipit screen-out use case, where exactly does the human oversight currently attach — and what would need to change structurally for the Dutch government to allow fully autonomous reporting?"
Connects the agentic AI governance theory directly to the regulatory constraint in the case, showing you've synthesized the two parts of the session rather than treating them separately.
On Automation vs. Augmentation — Push Back on Hinton
"The Oxipit pilot effectively auto-triages 45–50% of normal X-rays. Even if a radiologist spends 5 minutes reviewing, is this still augmentation or has it become de facto automation? At what point does the human review become a rubber-stamp that creates false confidence in the oversight?"
This is the sharpest question in the session — it challenges automation bias, the concept where humans over-rely on AI outputs. Shows you've connected Session 2's bias framework to Session 4's automation concepts.
On Scaling — Strategic Tension
"Radiology's success depended on Van Buchem as a 6-year champion with direct authority over the department. When scaling hospital-wide, does LUMC need 30 department-level Van Buchems, or does this require a fundamentally different governance model — like a centralized Chief AI Officer?"
Engages the Hourglass Model's organizational layer. Shows you understand that AI governance isn't just about technology — it's about where accountability sits in a complex institution.
On Hallucination Risks in High-Stakes Contexts
"The 11 hallucination techniques are designed for general Gen AI use cases. In radiology, a hallucinated finding that a radiologist then validates is qualitatively different from the same hallucination in a chatbot — the radiologist becomes the last line of defense. Should the mitigation strategy in clinical AI applications weight human-in-the-loop review much more heavily than the other 10 techniques combined?"
Demonstrates you can calibrate risk frameworks to context, not just apply them formulaically. Good for earning participation points.
On the Human-AI Delegation Framework — Applied
"In the VS tumour case, the AI was trusted for segmentation and volume measurement, but not for the growth prediction that would eliminate a patient visit. The Baird & Maruping framework asks who holds the distribution of decision rights. Is the correct governance decision here to keep the radiologist as delegator indefinitely — or is there a trust-building pathway that could shift the rights boundary over time?"
Uses the theoretical framework by name and applies it to a specific case detail. Professors notice when students connect frameworks to specific case facts rather than speaking in generalities.
On Amazon Bedrock — Practical
"The Bedrock Guardrails interface allows you to configure harm filters at None/Low/Medium/High. For a hospital deploying a diagnostic AI, should ALL categories be set to High — or is there a case for setting some lower, given that clinical language frequently uses terminology that a harm filter might flag?"
Shows you watched the Bedrock video and can think beyond the default settings. Good for the hands-on portion of the session.
MBUS 854 AI For Leaders — Session 4 Prep Guide  |  Generated May 18, 2026  |  Queen's Smith School of Business