MBUS 854 — Session 4 Prep

Session 4 Agenda

1. Prompt → Context Engineering
Beyond single prompts: managing what the model sees across 4 tiers

2. Chatbots vs. Agentic AI
What changes for leaders when AI executes tasks instead of advising

3. Managing Gen AI Risks (Part 2)
Prompting risks + usage risks + safeguard framework + 11 anti-hallucination techniques

4. AI in a Minute Presentations
Teams: VE-B, TO-A, TO-B (first slot); CAL, VW, TO-D (second slot)

5. Amazon Bedrock Deep Dive
Chat/text playground, model comparison, guardrails, prompt management

6. Case: AI in Radiology (LUMC)
SWOT, tech push vs. pull, automation vs. augmentation, AI execution checklist

Key Theory: What You Need to Know Cold

From Prompt Engineering → Context Engineering

Session 4 Slides — Addas (2026)

Context engineering is the discipline of deciding what the model sees beyond the single prompt. The context window is finite — like RAM — and everything competes for space.

Tier 1: Always Loaded (~5,000–7,000 tokens)
System prompt, CLAUDE.md, auto memory. Sent with every request, so choose carefully.

Tier 2: Loaded on Navigation (200–500 tokens)
Project-specific CLAUDE.md. Loaded only when Claude enters that directory.

Tier 3: Loaded on Demand
Skills, referenced docs. Zero cost until triggered.

Tier 4: Ephemeral (variable, growing)
Conversation turns, tool results, file reads. Grows every turn and is discarded on /clear.

Budget your context like money. Every tier has a cost. Spend wisely.
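The budgeting idea can be made concrete with a small tally. The sketch below is illustrative only: the `ContextItem` class, the item names, the token estimates, and the 200,000-token window are assumptions made for this example, not figures from the slides.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    name: str
    tier: int    # 1 = always loaded, 2 = on navigation, 3 = on demand, 4 = ephemeral
    tokens: int  # rough estimate; real counts depend on the tokenizer


def tier_totals(items):
    """Sum the estimated tokens held by each tier."""
    totals = {1: 0, 2: 0, 3: 0, 4: 0}
    for item in items:
        totals[item.tier] += item.tokens
    return totals


def remaining_budget(items, window_tokens):
    """Tokens left in the window after everything currently loaded."""
    return window_tokens - sum(item.tokens for item in items)


# Hypothetical snapshot of one session (all numbers are made up).
loaded = [
    ContextItem("system prompt + CLAUDE.md + auto memory", tier=1, tokens=6_000),
    ContextItem("project CLAUDE.md", tier=2, tokens=400),
    ContextItem("skill: report template", tier=3, tokens=1_500),
    ContextItem("conversation so far", tier=4, tokens=12_000),
    ContextItem("tool results and file reads", tier=4, tokens=30_000),
]
WINDOW = 200_000  # assumed window size

totals = tier_totals(loaded)
print(totals)
print("remaining:", remaining_budget(loaded, WINDOW))

# Tier 4 is discarded on /clear. This sketch assumes the other tiers stay loaded.
after_clear = [item for item in loaded if item.tier != 4]
print("remaining after /clear:", remaining_budget(after_clear, WINDOW))
```

In this made-up session the ephemeral tier dominates the spend, which is the usual reason to clear or summarize a long conversation before trimming the always-loaded tier.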

Chatbots vs. Agentic AI — Leadership Implications

Session 4 Slides — Addas (2026)

Dimension	Chat (Chatbot)	Agent (Agentic AI)
Mode	Conversational — advises you	Action-oriented — executes for you
Access	No direct access to files or environment	Reads, edits, creates files; runs code; searches web
Output	Text (and images)	Completed work — not just instructions
Delegation	Single-threaded; you carry out actions	Can spawn sub-agents, work in parallel
Oversight	Easy — you review text before acting	Complex — AI delegates to AI; audit trail fragments

The Leadership Question: How much agency does the agent get, and where in the chain do you put a human? Human oversight must attach at the right level of the chain, not just at the top.

The Hourglass Model: AI Governance Framework

Mäntymäki, Minkkinen, Birkstedt & Viljanen (2022)

Three interlocked layers that translate external ethics requirements into operational practice:

Environmental Layer: Hard law (EU AI Act, GDPR), ethical principles, stakeholder pressure — firm cannot directly control
Organizational Layer: Strategic alignment (AI strategy ↔ business strategy), value alignment (ethics, risk tolerance) — capabilities that translate external inputs into practice
AI System Layer: Operational governance across full life cycle — algorithms, data ops, risk assessments, accountability, DevOps, compliance

For AI Leaders: (1) Governance is translation — ethics principles don't govern AI on their own; organizations must convert them into named owners, processes, and controls. (2) Govern the life cycle, not the launch — risk surfaces after deployment; monitoring matters as much as approval.

Human-AI Delegation Framework

Baird & Maruping (2021)

When deciding how much to delegate to an AI system, three delegation mechanisms must be designed:

Appraisal: Confidence and compatibility assessment — willingness to delegate based on benefits, risks, liabilities, effort
Distribution: Decision rights allocation — which actions does the human vs. AI perform?
Coordination: Monitoring mechanisms, handoffs, updating protocols

Task Attributes (complexity, decomposability, action requirements) determine the right delegation structure. Applied to robotic surgery: the surgeon starts as delegator but may become a proxy as the AI mandates actions.

Managing Gen AI Risks (Part 2)

Prompting Risks

Prompt Injection: Embed malicious instructions within a prompt to hijack behavior
Prompt Leaking: Get the model to reveal how it works / expose system prompt
Jailbreaking: Circumvent safety constraints ("ignore ethics guidelines and...")
Social Engineering: Exploit AI's helpfulness through false context framing

Usage Risks

Hallucinations: Confidently wrong outputs; the risk rises with long prompts, where models tend to overlook material placed mid-context ("lost in the middle")
Toxicity: Offensive content — hate speech, sexual, violent outputs
Bias: Inadvertent perpetuation or amplification of societal biases from training data
Illegality: Content resembling protected IP; PII leakage

Safeguard Framework for Gen AI Applications

The following safeguard layers sit beneath a Responsible AI policy umbrella:

Safeguard Layer	What It Does	Key Actions
Prompt Engineering	Shape model inputs	Tone/audience guidance; exclusion/negative prompts; few-shot examples of good output
Data Augmentation	Control what data reaches the model	Use RAG; curate sources; encrypt data; implement access controls
Model Transparency	Understand predictions	Feature importance analysis; ask LLM to explain itself; evaluate model-agnostically
Content Filtering	Block harmful outputs	Filter hate speech, sexual content, and offensive language (e.g., Amazon Bedrock Guardrails)
Human-in-the-Loop	Keep humans in oversight	Reviews, validation, monitoring; human-AI delegation mechanisms; audits
Monitoring & Auditing	Continuous post-deployment	Hallucination rates, toxicity metrics, demographic parity, PII rate tracking
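The monitoring metrics in the last row can be computed from a log of human-reviewed responses. The following is a minimal sketch under stated assumptions: the record fields (`group`, `approved`, `hallucinated`, `toxic`, `pii`) and the eight sample records are hypothetical, and demographic parity is measured here as the largest gap in positive-outcome rate between groups.

```python
from collections import defaultdict

# Hypothetical review log: one record per model response, labelled by a reviewer.
records = [
    {"group": "A", "approved": True,  "hallucinated": False, "toxic": False, "pii": False},
    {"group": "A", "approved": True,  "hallucinated": True,  "toxic": False, "pii": False},
    {"group": "A", "approved": True,  "hallucinated": False, "toxic": False, "pii": False},
    {"group": "A", "approved": False, "hallucinated": False, "toxic": False, "pii": False},
    {"group": "B", "approved": True,  "hallucinated": False, "toxic": True,  "pii": False},
    {"group": "B", "approved": True,  "hallucinated": False, "toxic": False, "pii": False},
    {"group": "B", "approved": False, "hallucinated": True,  "toxic": False, "pii": False},
    {"group": "B", "approved": False, "hallucinated": False, "toxic": False, "pii": True},
]


def rate(rows, field):
    """Share of records where the given boolean field is True."""
    return sum(row[field] for row in rows) / len(rows)


def demographic_parity_gap(rows, outcome="approved"):
    """Largest difference in positive-outcome rate between any two groups."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row["group"]].append(row[outcome])
    rates = {group: sum(values) / len(values) for group, values in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


gap, group_rates = demographic_parity_gap(records)
print("hallucination rate:", rate(records, "hallucinated"))
print("toxicity rate:", rate(records, "toxic"))
print("PII rate:", rate(records, "pii"))
print("approval rate by group:", group_rates, "gap:", gap)
```

Tracked over time, a rising rate or a widening gap is the kind of signal that would send an issue back to the human-in-the-loop layer.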

11 Techniques to Reduce Hallucinations

1. Chunking & Summarization
Break long docs into sections; summarize each; combine into final prompt

2. Emphasize Key Info
Reiterate critical points at end of prompt; direct model's attention explicitly

3. Structured Format / XML Tags
Use <instructions>, <data>, <example> tags, or labeled sections with colons

4. Verify Context Usage
Ask: "Did you consider the points in Section X when forming your answer?"

5. Direct Quotes for Grounding
Instruct model to extract exact quotes from source; base analysis only on those

6. Verify with Citations
For each claim, find a direct quote. If none is found, remove the claim and mark the removal with [ ]

7. Allow "I Don't Know"
Explicitly permit uncertainty: "say 'I don't have enough information to confidently assess this'"

8. Chain-of-Thought
Ask model to explain reasoning step-by-step before final answer; reveals faulty logic

9. Best-of-N
Run the same prompt N times and compare outputs. Inconsistencies can flag hallucinations

10. External Knowledge Restriction
Explicitly instruct: use only provided documents, not general training knowledge

11. Iterative Self-Verification
Get the LLM to verify and correct itself, or use a second LLM to cross-check. Risk: false consensus
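Several of these techniques combine naturally in one prompt template. The sketch below shows one possible arrangement of techniques 2, 3, 5, 7, and 10, plus a crude Best-of-N check (technique 9). The function names, the tag layout, and the `fake_model` stand-in are assumptions for illustration; a real setup would replace `fake_model` with an actual model call and would likely compare answers more loosely than by exact string match.

```python
from collections import Counter


def build_grounded_prompt(question, documents):
    """Assemble a prompt with XML-style tags, a documents-only restriction,
    a quote-first instruction, permission to say "I don't know",
    and a closing reminder that repeats the key instruction."""
    docs = "\n".join(
        f'<document index="{i}">\n{text}\n</document>'
        for i, text in enumerate(documents, start=1)
    )
    return (
        "<instructions>\n"
        "Use only the documents inside <data>, not general knowledge.\n"
        "First extract exact quotes that support your answer, then answer "
        "based only on those quotes.\n"
        "If the documents do not contain enough information, say: "
        "\"I don't have enough information to confidently assess this.\"\n"
        "</instructions>\n"
        f"<data>\n{docs}\n</data>\n"
        f"<question>\n{question}\n</question>\n"
        "Reminder: answer only from quotes found in <data>."
    )


def best_of_n(ask, prompt, n=5):
    """Run the same prompt n times and report how often the top answer appears.
    `ask` is any callable that takes a prompt string and returns an answer string."""
    answers = [ask(prompt).strip().lower() for _ in range(n)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / n, answers


# Stand-in for a real model call so the sketch runs offline (hypothetical).
_canned = iter(["45-50%", "45-50%", "45-50%", "20%", "45-50%"])


def fake_model(prompt):
    return next(_canned)


prompt = build_grounded_prompt(
    "What share of chest X-rays did the screen-out approach clear?",
    ["Pilot notes: the screen-out approach covered 45-50% of the workload."],
)
answer, agreement, all_answers = best_of_n(fake_model, prompt, n=5)
print(answer, agreement)
if agreement < 1.0:
    print("Answers disagree; review before trusting:", sorted(set(all_answers)))
```

Disagreement between runs does not prove a hallucination, and full agreement does not rule one out; the check only tells a reviewer where to look first.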

LUMC Case: What You Need to Know

Core Problem

Dr. Mark van Buchem, head of radiology at Leiden University Medical Center (Netherlands), has been invited to present to LUMC's leadership. After 6 years of AI pilots in his department (2018–2024), he must now propose a hospital-wide AI transformation strategy while navigating organizational complexity, regulatory constraints, and an immature AI vendor market.

The Central Question: How can LUMC scale its early AI success in radiology to hospital-wide transformation, while managing organizational, technical, regulatory, and clinical complexities?

The 4 AI Pilots (2018–2024)

Pilot	What It Did	Type	Outcome	Key Lesson
P1: VS Tumour (in-house, Dutch Cancer Assoc.)	AI detects, segments, and measures rare vestibular schwannoma brain tumours in MRI	Augmentation	Partial	Hospital not configured for product development. Growth prediction (4th goal) never trusted enough to reduce patient visits. Multidisciplinary friction is real.
P2: Chest X-ray Triage (Oxipit, Lithuanian start-up)	AI detects up to 75 abnormalities in chest X-rays; pivoted to "screen out" normal cases (45–50% of workload)	Mixed (Augmentation → Automation)	Success	Initial "screen-in" approach failed (too complex, disrupted workflow, legal/ethical issues). Pivot to "screen-out" normals was key. Integrated into PACS. ~10% cost reduction.
P3: CMRAD Platform (Collective Minds Radiology, Sweden)	Cloud platform with 22,000+ radiologists for case discussion, education, AI testing	Augmentation	Success	6-month negotiation over data security. Created anonymization gateway (PACS → CMRAD). Enabled niche AI testing and rare case learning that individual hospitals can't do alone.
P4: Fast MRI (Facebook Hackathon + Philips)	Deep learning reduces MRI scan time by up to 75% by reconstructing undersampled k-space	Augmentation	Partial	Won hackathon. Partnership with Philips required complex data governance. Mixed radiologist reactions: concerns about misdiagnosis from shorter scans. Benefit: reduces wait times, better patient experience.

The Landscape Van Buchem Walked Into the Meeting With

Macro Pressure

Aging Dutch population; 1 in 3 Dutch workers in healthcare within 40 years
National health spend up 17.5% (2019–2022)
Board imposed 15% radiology budget cut (2021)
Policy shift: volume-based → value-based care
GDPR, Medical Device Regulation, EU AI Act; Dutch law bars autonomous AI reporting

AI Market

400+ radiology AI tools by 2022
Big three: Philips, Siemens, GE
Start-ups: Oxipit, CMRAD, and many others
Market: US$1.06B (2021) → US$8.56B (2030)
Hinton (2016): "Stop training radiologists." Eight years later, radiologists are still here.

Internal Capability

Tier-3 academic hospital; 900 beds; €1.1B budget; 32 rare disease centers of excellence
VBR program (2018): redefined workflows
Cross-LUMC AI innovation lab: clinicians + data science + legal + privacy
4 pilots completed; mixed results
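For discussion it can help to turn the market figures above into an annual growth rate. This is a small worked calculation, assuming steady compound growth between the two quoted endpoints; the `cagr` helper is written for this note and is not part of the case.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoints."""
    return (end_value / start_value) ** (1 / years) - 1


# US$1.06B in 2021 growing to US$8.56B in 2030, as quoted in the case notes.
growth = cagr(1.06, 8.56, 2030 - 2021)
print(f"Implied annual growth: {growth:.1%}")
```

The implied rate is roughly 26% a year, which helps explain the vendor pressure described in the case.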

Case Discussion Questions — Model Answers

Q1: What is LUMC's position? (Conduct a SWOT)

Strengths

6 years of AI experience; 4 completed pilots — real organizational learning
Multidisciplinary AI innovation lab (clinical + legal + IT + data science)
Tier-3 academic hospital with €1.1B+ budget and research mandate
National center of excellence for 32 rare diseases — unique data assets
Pioneer culture: first heart operation in the Netherlands, COVID vaccine with J&J
Patient satisfaction 8.6/10; strong referral base of complex cases
VBR program already redefining workflows — AI woven in incrementally

Weaknesses

Not configured for product development — discovered in VS pilot
Limited in-house AI development talent
Bureaucratic legal processes slow partnerships (CMRAD: 6 months)
15% radiology budget cut — constrained resources for AI scaling
Hierarchical power dynamics — senior doctors can block projects
Small patient populations for niche/rare disease AI training data
Trust gap between clinicians and AI systems (VS tumour growth prediction)

Opportunities

Healthcare cost crisis creates institutional urgency to adopt AI
LUMC as validation partner for AI companies — strategic leverage
Hospital-wide scaling of lessons from radiology (proven methodologies)
Community platforms (CMRAD) to access global data for rare diseases
Growing AI medical imaging market: US$8.56B by 2030
EU compliance capability as differentiator over less-regulated competitors
Leiden Bio Science Park: biotech ecosystem collaboration opportunities

Threats

EU AI Act + Dutch law block fully autonomous AI reporting
GDPR limits cross-institutional data sharing — key for AI training
AI market volatility: companies pivot, charge exorbitant fees ($50K niche tool), or lose interest
Technology push from aggressive vendors without clinical pull
Workforce resistance: professional identity threats to radiologists
Vendor lock-in risk (Philips controls commercial product built on LUMC data)
Reputational risk if AI misdiagnoses at scale (Philips MRI concern)

Q2: What is LUMC's approach to technology deployment? (Technology Push vs. Organizational Pull)

Bottom Line: LUMC's successful pilots followed organizational pull. Technology push approaches caused delays, disruption, and failures. Van Buchem's VBR program was explicitly pull-oriented: clinical need first, technology second.

Technology Push (cautionary examples)

Oxipit's original "screen-in" product disrupted radiologists' workflow — added complexity, wasn't asked for
$50K niche AI tool: vendor pushed solution at a price point with no clinical mandate
Neurodegenerative disease start-up: LUMC adopted AI without verifying patient volume feasibility
Start-up that pivoted to more profitable markets — left LUMC high and dry after 2 years
Philips commercializing the MRI model globally without full LUMC clinical buy-in

Organizational Pull (successful approach)

VBR program started from clinical problem (capacity + cost), then sought AI to solve it
Cross-LUMC AI lab structured around clinical use cases, not technology capabilities
Oxipit pivot: LUMC pushed back and redefined the product → screen-out normals — clinical need drove product change
CMRAD: identified need for rare disease collaboration first, then found the right platform
VS tumour pilot: started from high workload in neuroradiology — concrete clinical pain point

Key insight for discussion: "Technology push" is not always the vendor's fault. LUMC sometimes pulled technologies prematurely (neurodegenerative partner) or failed to ensure internal clinical alignment before adopting (Oxipit initial approach). The lesson is that both sides bear responsibility for achieving organizational pull.

Q3: Which pilots are automation vs. augmentation? Can radiologists be replaced by AI (Hinton's claim)?

Addas Framework — Automation vs. Augmentation (Smith School of Business)

Automation

AI performs tasks without fundamentally changing human skills. Humans redirect to other work — but risk skill atrophy and obsolescence if they lack the higher-order capabilities to fill that space.

Augmentation

AI as a partner in iterative exchange — "productive friction." Human intuition identifies AI blind spots; AI extends human capability. Result: cognitive symbiosis — capabilities neither could achieve alone.

Three-Level Tension (Addas)

Individual

Skills developed vs. made obsolete

Organizational

Workflows restructured vs. reimagined

Societal

Jobs eliminated vs. redefined

Decisive factor (Addas): Whether organizations and individuals actively choose augmentation over pure automation — it requires deliberate design, not just technical deployment.

Automation vs. Augmentation by Pilot — Applied Addas Framework

P1 — Vestibular Schwannoma (Tumour AI) Augmentation

Individual

Radiographers' manual segmentation is augmented — enhanced, not replaced. Skills actively evolved alongside AI. Growth prediction withheld until trust established = skill preservation by design.

Organizational

Workflow reimagined: AI handles volumetric measurement, radiologist handles clinical interpretation. Not just faster — fundamentally different division of labor between specialties.

Societal

Radiographer role redefined (not eliminated): from manual measurement to AI oversight and complex case handling. Patient impact: fewer unnecessary visits if growth prediction is eventually trusted.

Productive Friction Example: ENT surgeons pushed back against volume-only measurements — they needed diameter measurements for surgical planning. This human-AI tension forced a better product. Textbook cognitive symbiosis: neither the AI nor any single team would have identified this requirement alone.

P2 — Chest X-ray (Oxipit) Augmentation → Partial Automation

This pilot shows the Addas framework's decisive factor in action: Oxipit initially imposed automation (technology push); LUMC actively chose augmentation (organizational pull).

Phase 1: Screen-IN (Automation attempt — FAILED)

Individual: Radiologist overwhelmed with AI-generated reports — skills bypassed, workflow disrupted.
Organizational: Workflow restructured badly — added complexity instead of removing it.
Societal: Trust eroded across institutions that tried this model.

Phase 2: Screen-OUT (Augmentation — SUCCEEDED)

Individual: Radiologist judgment preserved for borderline/complex cases. Skills focused on where human adds most value.
Organizational: Workflow reimagined — 45–50% of normals cleared, expert attention concentrated.
Societal: Trajectory toward automation of routine normals as trust builds.

Productive Friction Example: Six months of biweekly meetings between LUMC and Oxipit constituted structured productive friction — human workflow reality repeatedly correcting AI product assumptions until the use case aligned. Specificity (not recall) became the optimization target because radiologists insisted on patient safety as the constraint.

⚠ Automation bias risk: If the radiologist's 5-minute review becomes a rubber stamp, this transitions from augmentation to de facto automation without the governance framework to match. Human-in-the-loop review can then become a source of false confidence.

P3 — CMRAD Community Platform Deep Augmentation

Individual

Skills actively developed: radiologists learn through community case discussion, access AI tools for testing — capabilities grow alongside the platform. The opposite of skill atrophy.

Organizational

Workflow reimagined at an inter-institutional level. LUMC's rare-case challenge solved not by AI alone but by a human-AI community. Cognitive symbiosis at a network scale.

Societal

The clearest job redefinition example: radiologists' role expands from local diagnosticians to global knowledge nodes. No jobs eliminated — profession elevated.

P4 — Fast MRI (Philips Hackathon) Augmentation

Individual

Image acquisition augmented (75% time reduction) while radiologist diagnostic skill fully preserved. Radiographer positioning skills remain essential. Risk: if acquisition becomes "automated," radiographers lose workflow engagement.

Organizational

Workflow restructured for speed (scheduling, wait times, equipment use), not fundamentally reimagined. Commercial product pathway raises questions about where LUMC's value sits long-term.

Societal

Patient access improved (children, elderly, long-scan patients). Scale via Philips' global network means this reaches thousands of radiology departments — outsized societal impact relative to cost.

Productive Friction Example: Radiologists' mixed reactions to shorter scan times (concerns about diagnostic accuracy) forced continued validation with 3,000 LUMC patients and ongoing collaboration. The friction produced a stronger, more evidenced product — not a faster but less trusted one.

Can Radiologists Be Replaced? — Hinton's 2016 Claim Through the Addas Lens

Hinton assumed automation was inevitable and total. Addas argues the decisive variable is organizational and individual choice — augmentation must be designed, it doesn't happen automatically.
What LUMC proves: Where augmentation was deliberately designed (VS, CMRAD, Philips), radiologist roles were redefined and strengthened. Where automation was imposed (Oxipit screen-in), it failed.
Legal barriers: Dutch law bars autonomous reporting — regulatory lag enforces the human-in-the-loop, buying time for trust-building.
Edge cases: LUMC is a Tier 3 hospital handling rare, complex multi-pathology cases — exactly where AI fails and human expertise is irreplaceable.
The Addas verdict: Automation will eliminate tasks, not professions — but only if professionals actively engage the productive friction of augmentation rather than passively ceding ground.

AI will not replace radiologists — but radiologists who use AI as a partner (augmentation) will replace those who treat it as a tool (automation). The difference is whether humans develop new capabilities through the interaction or simply offload tasks to it.

Q4: What are the key success factors from the pilots? Develop a checklist for successful AI execution.

Framed through the 3Ps Framework: People, Process, Product. Each pilot's success or failure traces back to gaps in one or more of these dimensions. Strong AI execution requires all three to be addressed before scaling.

People

Who is involved, their skills, trust, and relationships — the human layer of AI adoption.

Success Factors

Clinical champion with mandate: Van Buchem's 6-year commitment — no champion, no scaling
Multidisciplinary team from day one: VS pilot — neuroradiologists + ENT + oncologists + IT + legal. Each lens caught failures others missed
Radiologist buy-in before deployment: Oxipit screen-in failed partly because radiologists weren't co-designers. Screen-out succeeded because their workflow concerns drove the pivot
Trust-building over trust-demanding: VS growth prediction never deployed — trust earned, not assumed
Manage professional identity: Augmentation framing (Addas) protects buy-in. Automation framing triggers resistance and professional defensiveness

Failure Signals

Vendor-driven projects with no clinical owner
Senior doctor imposing a project (hierarchical power) without team consent
No legal/privacy seat at the table until post-implementation

Process

How the work gets done — governance, iteration, integration, and compliance workflows.

Success Factors

Organizational pull over technology push: VBR program — clinical need defined first, vendor selected second. Every successful pilot started from a problem, not a product
Iterative pilots as hedged bets: Small scope, fast validation cycles. Failed vendors (neurodegenerative, $50K niche tool) terminated before full commitment
Workflow integration, not just accuracy: Oxipit's PACS integration was the deployment enabler. An accurate model that disrupts workflow is still a failure
Data governance before scaling: CMRAD took 6 months of security iterations — resolving this before deployment avoided a compliance crisis at scale
Regulatory pathway mapped early: EU AI Act, Dutch autonomous reporting law — legal constraints shaped what was deployable, not an afterthought
Structured productive friction (Addas): Biweekly Oxipit meetings, ENT feedback on VS — formalized iterative human-AI refinement is a process discipline, not an accident

Failure Signals

Technology-push partnerships: vendor defines the use case
Committing before retrospective clinical validation (20,000 cases for Oxipit was minimum viable)
Skipping data security iterations to accelerate partnership

Product

What the AI system does — use case clarity, model performance, and fit with the clinical environment.

Success Factors

Crystal-clear use case with measurable scope: Screen-out (not screen-in), 20% of normals, 5-minute review — specific enough to validate and govern
Right optimization metric for the use case: Screen-out → optimize for specificity (safe dismissal of normals). Screen-in → optimize for recall. Mixing these up is clinically dangerous
Model retrained on local data: Oxipit's global model underperformed on LUMC fracture cases — local calibration was required before clinical-grade performance
Regulatory certification for intended use: Oxipit's EU CE mark specifically for screen-out use case. Certification must match deployment scope
Augmentation-forward design: Products designed to work with radiologists (Addas) succeed. Products designed to replace them (screen-in reporting) create friction and failure
IP and commercialization terms resolved: Philips partnership risk, since LUMC's data built a commercial product; benefit-sharing terms are needed before the build, not after
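The metric choice for screen-out versus screen-in is easier to reason about with the definitions written out. The sketch below uses hypothetical validation counts and makes one labeled assumption that the notes do not state: the positive class is "normal, safe to screen out", so a false positive is an abnormal X-ray that was wrongly cleared.

```python
def specificity(tn, fp):
    """True negative rate: share of actual negatives the model labels negative."""
    return tn / (tn + fp)


def recall(tp, fn):
    """True positive rate: share of actual positives the model labels positive."""
    return tp / (tp + fn)


# Hypothetical counts. Assumed convention: positive = "normal, safe to screen out".
tp = 450  # normal X-rays correctly screened out
fn = 350  # normal X-rays still sent to a radiologist (costs time only)
tn = 198  # abnormal X-rays correctly kept for radiologist review
fp = 2    # abnormal X-rays wrongly screened out (the dangerous error)

spec = specificity(tn, fp)
rec = recall(tp, fn)
print(f"specificity: {spec:.3f}  (share of abnormal cases kept for review)")
print(f"recall:      {rec:.3f}  (share of normal cases cleared)")
```

Under this convention, specificity governs patient safety and recall governs how much workload is removed, so a screen-out tool can accept modest recall in exchange for very high specificity. With the opposite convention (positive = abnormal) the two names swap roles, which is why the positive class should be stated whenever these metrics are quoted.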

Failure Signals

Vague or overly broad product scope (detect 75 abnormalities → overwhelming reports)
Optimizing for the wrong metric (recall when specificity is needed)
No local validation before clinical deployment

3Ps Pilot Scorecard

Pilot	People ✓/✗	Process ✓/✗	Product ✓/✗	Outcome
P1: VS Tumour	✓ Strong multi-disciplinary team; ENT friction was productive	⚠ Not configured for product dev; had to pivot to research model	⚠ 3 of 4 goals achieved; growth prediction trust gap remains	Partial success — workflow value delivered, full automation goal deferred
P2: Chest X-ray	⚠ Radiologists not co-designers initially → friction → then buy-in after pivot	✓ 6 months structured iteration; PACS integration executed; organizational pull enforced	✓ Use case sharpened (screen-out); right metric (specificity); CE certified	Success after pivot — strongest deployed use case
P3: CMRAD	✓ All stakeholders included; legal + IT + privacy at table from start	⚠ 6-month security iteration required; process risk was data governance	✓ Clear product: knowledge community + AI tool testing sandbox. No clinical automation pressure.	Success — 400% growth in platform use; rare case coverage expanded
P4: Fast MRI	⚠ Mixed radiologist reactions; some demanded more evidence — productive friction present	⚠ Data infrastructure setup complex; Philips commercialization terms not pre-resolved	✓ Clear scope (image reconstruction); validated on 3,000 LUMC patients; augmentation design	Ongoing — technical success, commercial and governance risk pending

AI Execution Checklist — 3Ps Framework

People

✓ Identify a named clinical champion with department-level mandate and 3+ year commitment

✓ Include clinicians, IT, legal, privacy, and data science in the team from day one, not as reviewers but as co-designers

✓ Build trust iteratively: deploy AI in low-stakes tasks before high-stakes decisions; never demand trust upfront

✓ Frame AI as augmentation (Addas): position it as building radiologist capability, not replacing it

✓ Manage hierarchy actively: senior clinician veto power can stall projects; build consensus, not top-down mandates

✗ Do NOT allow a vendor to own the use case without a clinical owner counterpart inside LUMC

Process

✓ Start from an organizational problem, not a vendor product: organizational pull, not technology push

✓ Run time-boxed pilots before committing: test feasibility, fit, and vendor alignment at low cost

✓ Build productive friction into the schedule: structured review meetings between clinical and technical teams (biweekly minimum)

✓ Resolve data governance, anonymization, and GDPR compliance before any external data sharing

✓ Map the regulatory pathway (EU AI Act, national law) before selecting the deployment model

✗ Do NOT commit resources to partnerships until vendor mission, financial stability, and use-case alignment are confirmed

Product

✓ Define a single, specific use case with measurable scope before procurement (not "detect abnormalities" but "safely screen out normals at 20% of volume")

✓ Choose the right optimization metric: screen-out → specificity; screen-in → recall. Misalignment is a clinical risk

✓ Validate on local patient data before deployment; no global model performs at clinical grade without local calibration

✓ Obtain regulatory certification for the specific intended use, not general-purpose AI approval

✓ Negotiate IP, data ownership, and commercialization benefit-sharing terms before building on LUMC patient data

✗ Do NOT deploy a product designed for automation before the augmentation phase has established clinical trust and workflow fit

Smart Questions to Ask in Session

On Context Engineering — Early in Session

"The four-tier context model says Tier 1 content is loaded on every request. In a hospital deploying an agentic AI for radiology triage, what would be the highest-priority information to put in Tier 1 — and what would be too risky to include there?"

Shows you understand the cost/benefit tradeoff of context, and bridges the theoretical framework directly to the LUMC case. Expect discussion around patient safety protocols, legal constraints, and anonymization requirements.

On Chatbots vs. Agentic AI — Probe for Depth

"The slide says 'human oversight must attach at the right level of the chain, not just the top.' In the Oxipit screen-out use case, where exactly does the human oversight currently attach — and what would need to change structurally for the Dutch government to allow fully autonomous reporting?"

Connects the agentic AI governance theory directly to the regulatory constraint in the case, showing you've synthesized the two parts of the session rather than treating them separately.

On Automation vs. Augmentation — Push Back on Hinton

"The Oxipit pilot effectively auto-triages 45–50% of normal X-rays. Even if a radiologist spends 5 minutes reviewing, is this still augmentation or has it become de facto automation? At what point does the human review become a rubber-stamp that creates false confidence in the oversight?"

This is the sharpest question in the session — it challenges automation bias, the concept where humans over-rely on AI outputs. Shows you've connected Session 2's bias framework to Session 4's automation concepts.

On Scaling — Strategic Tension

"Radiology's success depended on Van Buchem as a 6-year champion with direct authority over the department. When scaling hospital-wide, does LUMC need 30 department-level Van Buchems, or does this require a fundamentally different governance model — like a centralized Chief AI Officer?"

Engages the Hourglass Model's organizational layer. Shows you understand that AI governance isn't just about technology — it's about where accountability sits in a complex institution.

On Hallucination Risks in High-Stakes Contexts

"The 11 hallucination techniques are designed for general Gen AI use cases. In radiology, a hallucinated finding that a radiologist then validates is qualitatively different from the same hallucination in a chatbot — the radiologist becomes the last line of defense. Should the mitigation strategy in clinical AI applications weight human-in-the-loop review much more heavily than the other 10 techniques combined?"

Demonstrates you can calibrate risk frameworks to context, not just apply them formulaically. Good for earning participation points.

On the Human-AI Delegation Framework — Applied

"In the VS tumour case, the AI was trusted for segmentation and volume measurement, but not for the growth prediction that would eliminate a patient visit. The Baird & Maruping framework asks who holds the distribution of decision rights. Is the correct governance decision here to keep the radiologist as delegator indefinitely — or is there a trust-building pathway that could shift the rights boundary over time?"

Uses the theoretical framework by name and applies it to a specific case detail. Professors notice when students connect frameworks to specific case facts rather than speaking in generalities.

On Amazon Bedrock — Practical

"The Bedrock Guardrails interface allows you to configure harm filters at None/Low/Medium/High. For a hospital deploying a diagnostic AI, should ALL categories be set to High — or is there a case for setting some lower, given that clinical language frequently uses terminology that a harm filter might flag?"

Shows you watched the Bedrock video and can think beyond the default settings. Good for the hands-on portion of the session.

Session 4: Executing Strategy & Integrating AI in the Business
