Who the players are, what's changed since 2023, Stanford AI Index 2026 highlights.
Tokenization, next-token prediction, temperature. What you can and can't control.
Teams MSA, TO-E, VE-A present. Class votes and scores.
Interactive Ivey case. Role-play as Deltacore leadership navigating a GenAI strategy decision.
Five pillars: model choice, context design, prompt design, data quality, safeguards.
Teams VC, TO-C, OTT present. Hands-on PartyRock prototype exercise.
The Gen AI market has consolidated around a small number of dominant frontier model providers, with a larger ecosystem of fine-tuned and open-source models around them. Know the key players and their strategic positioning.
The Stanford AI Index tracks AI progress annually. Three 2026 findings are particularly relevant for leaders:
| Trend | What's Happening | Leader Implication |
|---|---|---|
| Training cost: up | Frontier model training costs have grown 10-100× since 2020. GPT-4-scale training runs cost $50–100M+. Only a handful of companies globally can afford frontier model development. | Your org will be a consumer of frontier models, not a builder. The strategic question is which provider to bet on, not whether to build your own. |
| Inference cost: down | Cost per 1M tokens dropped 10-100× in 2024 alone. Running AI is becoming close to free. This is what enables mass enterprise deployment. | Cost is no longer the barrier to AI adoption at scale. Speed of integration, governance, and change management are the new bottlenecks. |
| Emissions: up sharply | AI training and inference are now material contributors to tech company emissions. A single GPT-4 training run emits as much CO2 as ~300 transatlantic flights. | ESG-conscious boards and regulators are starting to ask about AI's carbon footprint. Factor this into vendor selection and model sizing decisions. |
LLMs don't read words — they read tokens. A token is roughly 4 characters or 0.75 words. "Unbelievable" might be 3-4 tokens; "AI" is 1 token. This matters because:
At its core, an LLM is a system trained to predict: "given all the tokens so far, what token comes next?" It does this by learning a probability distribution over its vocabulary (~50,000+ tokens) and sampling from it.
This means:
Temperature controls how much randomness the model introduces when sampling from the probability distribution:
| Temperature | Behavior | Best For |
|---|---|---|
| 0 | Always picks the highest-probability token. Deterministic — same input → same output every time. | Factual extraction, code generation, classification |
| 0.3–0.7 | Slightly varied. More natural language, some creativity. | General business writing, summarization |
| 1.0+ | High randomness. Creative but potentially incoherent or factually wrong. | Brainstorming, creative writing, ideation |
This is Dr. Addas's framework for structuring how organizations should think about deploying Gen AI. The five pillars are interdependent — weakness in one undermines the others. This will be directly relevant to the group project.
Model Choice: Bigger ≠ better for every task. A smaller, cheaper model fine-tuned on your domain often outperforms GPT-4 on specific tasks at 1/100th the cost. The tradeoff is: fine-tuning requires data, expertise, and maintenance. For most MBA projects → use a frontier model via API.
Context Design: The system prompt is the most powerful lever most people don't use well. A well-crafted system prompt (role, constraints, output format, examples) can eliminate 80% of prompt engineering from individual queries.
Prompt Design: Key techniques: zero-shot (just ask), few-shot (give examples), chain-of-thought (ask it to think step-by-step), role prompting (you are a...). Chain-of-thought dramatically improves performance on reasoning tasks.
Data Quality (RAG): Retrieval-Augmented Generation = give the model your documents at query time instead of fine-tuning. Faster to deploy, easier to update, more auditable. The #1 enterprise pattern for custom AI applications.
Safeguards: Output filters (block toxic content), constitutional AI (model trained to refuse harmful requests), human-in-the-loop for high-stakes outputs, and logging/audit trails for compliance.
Deltacore Analytics is an Ivey Publishing interactive case. The format is different from a standard case — you'll be navigating decisions in real-time, not discussing a static narrative. The session is 65 minutes and involves role-playing as leadership navigating a Gen AI strategy decision.
The case explores: When should a B2B analytics company build proprietary Gen AI capabilities vs. integrating existing LLM APIs? And within that: automation vs augmentation — does Gen AI replace Deltacore's analysts or make them more valuable?
Each team presents a 60-second summary of a current AI article (HBR, MIT Sloan, or similar). The goal: distill a complex AI topic into something a non-technical executive would find immediately actionable. Class provides scores and feedback.
The grading criteria typically reward: clarity, relevance to business leaders, one sharp insight (not a Wikipedia summary), and time discipline.
PartyRock is AWS's no-code Gen AI builder (bedrock.aws.amazon.com/partyrock). You can build a working AI app — chatbot, document analyzer, recommendation engine — in under 30 minutes with zero code. It's the fastest way to understand what Bedrock can do.
Bedrock is the enterprise layer: a managed service that gives API access to 30+ models from Anthropic, Amazon, Meta, Mistral, and others through a single endpoint, with enterprise security and compliance controls.