<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Krystian Safjan's Blog</title><link href="https://www.safjan.com/" rel="alternate"/><link href="https://www.safjan.com/feeds/all.atom.xml" rel="self"/><id>https://www.safjan.com/</id><updated>2026-04-09T00:00:00+02:00</updated><subtitle>Data Scientist and Team Leader writing about Machine Learning, MLOps, and Python</subtitle><entry><title>The Real Cost of Model Migration - What Swapping LLMs Actually Requires</title><link href="https://www.safjan.com/the-real-cost-of-model-migration-what-swapping-llms-actually-requires/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-04-09T00:00:00+02:00</published><updated>2026-04-09T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-04-09:/the-real-cost-of-model-migration-what-swapping-llms-actually-requires/</id><summary type="html">&lt;p&gt;Model deprecations are routine. What they expose underneath - unmeasured quality, model-coupled prompts, unversioned behavior - rarely is. Here's what a migration actually requires, from evaluation to prompt portability to rollout, based on doing this a few times the hard way.&lt;/p&gt;</summary><content type="html">&lt;p&gt;So OpenAI deprecated &lt;code&gt;gpt-4o-mini&lt;/code&gt;. Or some other model you've built your whole system around just got a sunset date. The email lands and your first thought is: &lt;em&gt;how hard can a model swap be?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I've been through this a few times now. The API call swap? Easy. Twenty minutes, tops. But the swap has a way of revealing every shortcut and assumption your system has been quietly depending on. That's the part people don't usually mention until you're already in it.&lt;/p&gt;
&lt;h2 id="table-of-contents"&gt;Table of contents&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="#what-migration-actually-is"&gt;What migration actually is&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-model-options-in-2026"&gt;The model options in 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-0--know-what-youre-migrating-from"&gt;Phase 0 - Know what you're migrating from&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-1--build-your-evaluation-harness-first"&gt;Phase 1 - Build your evaluation harness first&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-2--the-prompt-portability-problem"&gt;Phase 2 - The prompt portability problem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-3--automated-prompt-optimization-with-dspy"&gt;Phase 3 - Automated prompt optimization with DSPy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-4--reasoning-models-where-they-belong"&gt;Phase 4 - Reasoning models: where they belong&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-5--handling-missing-parameters"&gt;Phase 5 - Handling missing parameters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-6--risk-assessment"&gt;Phase 6 - Risk assessment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-7--progressive-rollout"&gt;Phase 7 - Progressive rollout&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#phase-8--post-migration-monitoring"&gt;Phase 8 - Post-migration monitoring&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-systems-audit-you-should-run-regardless"&gt;The systems audit you should run regardless&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="what-migration-actually-is"&gt;What migration actually is&lt;/h2&gt;
&lt;p&gt;Here's the (uncomfortable) truth: a model migration is really a systems audit. It just happens to come with a deadline someone else set for you.&lt;/p&gt;
&lt;p&gt;When you swap the model under a RAG pipeline, you're removing the environment your system's behavior was calibrated in. And you get to find out how much of that behavior was intentional versus... just kind of happened over time.&lt;/p&gt;
&lt;p&gt;Three things consistently surface during migration that were invisible before:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quality often wasn't measured.&lt;/strong&gt; A lot of production LLM systems have never been formally evaluated. No golden dataset, no faithfulness score, no format compliance check. "Quality" is whatever the team last looked at and didn't complain about. You can't claim "no quality loss" if you never measured quality to begin with.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Your prompt is coupled to the old model.&lt;/strong&gt; That system prompt you spent weeks on? It's not a specification. It's a negotiation artifact, the residue of back-and-forth between your intentions and one specific model's quirks. Swap the model and you haven't ported a prompt. You've orphaned it. (This one hurts. More in &lt;a href="#phase-2--the-prompt-portability-problem"&gt;Phase 2&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model behavior often isn't versioned.&lt;/strong&gt; Pinning a model name isn't the same as pinning model behavior. OpenAI updates weights behind dated aliases without telling you. If you're using &lt;code&gt;gpt-4o-mini&lt;/code&gt; as a floating pointer, you may have already had a silent behavioral change in production. If your observability didn't catch it... well, that tells you something about your observability.&lt;/p&gt;
&lt;p&gt;The teams who migrate cleanly aren't the ones with the best migration plans. They're the ones who treated their LLM system like a real production system long before a deadline showed up.&lt;/p&gt;
&lt;h2 id="the-model-options-in-2026"&gt;The model options in 2026&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Note: While this article uses OpenAI models as examples, the migration patterns, evaluation strategies, and architectural decisions apply to any LLM provider. The same principles work whether you're migrating between Anthropic's Claude models, open-source models, or any other provider.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Before you plan anything, you need to know what you're migrating &lt;em&gt;to&lt;/em&gt;. OpenAI's current model family has two very different architectures, and picking the wrong one can create more problems than the deprecation itself.&lt;/p&gt;
&lt;h3 id="standard-instruction-models"&gt;Standard instruction models&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://platform.openai.com/docs/models/gpt-5.4"&gt;GPT-5.4&lt;/a&gt;, &lt;a href="https://platform.openai.com/docs/models/gpt-5.4-mini"&gt;GPT-5.4-mini&lt;/a&gt;, and &lt;a href="https://platform.openai.com/docs/models/gpt-5.4-nano"&gt;GPT-5.4-nano&lt;/a&gt; are the current flagship models. They support variable &lt;code&gt;reasoning_effort&lt;/code&gt; (none/low/medium/high/xhigh), plus all the standard parameters: &lt;code&gt;temperature&lt;/code&gt;, system prompts, JSON mode, function calling, streaming. For most RAG answer generation workloads, one of these is where you should land.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Recommended default for gpt-4o-mini replacement: &lt;code&gt;gpt-5.4-nano&lt;/code&gt;&lt;/strong&gt; -comparable cost tier (&lt;span class="math"&gt;\(0.20 per million input tokens vs $0.10 for gpt-4o-mini), significantly more capable, fully API-compatible. If you need the extra capability and can handle the cost, &lt;code&gt;gpt-5.4-mini&lt;/code&gt; (\)&lt;/span&gt;0.75/MTok input) is a strong middle option.&lt;/p&gt;
&lt;h3 id="pure-reasoning-models"&gt;Pure reasoning models&lt;/h3&gt;
&lt;p&gt;The &lt;a href="https://openai.com/index/introducing-o3-and-o4-mini/"&gt;o-series models&lt;/a&gt; (o3, o4-mini) are pure reasoning models without the hybrid flexibility of GPT-5.x. They &lt;em&gt;only&lt;/em&gt; do reasoning, don't support &lt;code&gt;temperature&lt;/code&gt; or standard sampling parameters, and use &lt;code&gt;reasoning_effort&lt;/code&gt; as the sole control. These are specialists.&lt;/p&gt;
&lt;p&gt;Here's what most people get wrong: for typical RAG pipelines, pure reasoning models aren't better answer generators. They're decision infrastructure. I'll get into exactly where they earn their cost in &lt;a href="#phase-4--reasoning-models-where-they-belong"&gt;Phase 4&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nx"&gt;Model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;selection&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;decision&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="nx"&gt;Is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;synthesis&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;genuinely&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;hop&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;implies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;contradicts&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;therefore&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;No&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Standard&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gpt&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m m-Double"&gt;5.4&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;nano&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;gpt&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m m-Double"&gt;5.4&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;mini&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Yes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;high&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;stakes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;real&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;consequences&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;No&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Standard&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reasoning_effort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;low&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;medium&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Yes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Consider&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;pure&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reasoning&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;o3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;o4&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;mini&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;                        &lt;/span&gt;&lt;span class="nx"&gt;OR&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;standard&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reasoning_effort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;high&lt;/span&gt;
&lt;span class="w"&gt;                        &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;verification&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;layer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;over&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;synthesis&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="phase-0-know-what-youre-migrating-from"&gt;Phase 0 -Know what you're migrating from&lt;/h2&gt;
&lt;p&gt;I know, I know. You want to start swapping things. But before you write a line of migration code, document what your system actually does right now. Otherwise you'll have no way to tell if the migration worked or just seemed like it did.&lt;/p&gt;
&lt;h3 id="inventory-your-integration-surface"&gt;Inventory your integration surface&lt;/h3&gt;
&lt;p&gt;Pull every place in your codebase where the model name appears. This includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Direct API calls with &lt;code&gt;model=&lt;/code&gt; parameter&lt;/li&gt;
&lt;li&gt;Configuration files and environment variables&lt;/li&gt;
&lt;li&gt;Any SDK initialization that sets a default model&lt;/li&gt;
&lt;li&gt;Evaluation scripts that may be pinned to the old model for judging&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For each integration point, record: the model name, the prompt template used, the parameters passed (&lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;max_tokens&lt;/code&gt;, etc.), and the expected output format.&lt;/p&gt;
&lt;h3 id="document-implicit-behavioral-assumptions"&gt;Document implicit behavioral assumptions&lt;/h3&gt;
&lt;p&gt;This part's harder because these things rarely get written down. You need to look for anywhere your code processes model outputs and makes assumptions about what they look like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;JSON parsing of model responses (field names, nesting depth)&lt;/li&gt;
&lt;li&gt;Regex or string matching on output format&lt;/li&gt;
&lt;li&gt;Length-based truncation or display logic&lt;/li&gt;
&lt;li&gt;Citation extraction that assumes a specific citation format&lt;/li&gt;
&lt;li&gt;Any code that branches on output content&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these is a behavioral assumption about your current model. And each assumption is a place where things can quietly break.&lt;/p&gt;
&lt;h3 id="get-a-pre-migration-quality-snapshot"&gt;Get a pre-migration quality snapshot&lt;/h3&gt;
&lt;p&gt;Run your current system against a sample of real production queries and save the outputs. This is your before-state. Even if you don't have a formal eval harness yet, just having the raw outputs lets you compare later. Future-you will be grateful.&lt;/p&gt;
&lt;h2 id="phase-1-build-your-evaluation-harness-first"&gt;Phase 1 - Build your evaluation harness first&lt;/h2&gt;
&lt;p&gt;I can't stress this enough. Everything else you do, prompt changes, model selection, rollout strategy, is going to be guided by what your evals tell you. Skip this and you'll discover regressions in production. I've watched teams do this. The cost of fixing things at that point is genuinely 10x higher.&lt;/p&gt;
&lt;h3 id="build-a-golden-dataset"&gt;Build a golden dataset&lt;/h3&gt;
&lt;p&gt;Sample 200–500 real queries from production logs. For each, store:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The original query&lt;/li&gt;
&lt;li&gt;The retrieved context chunks&lt;/li&gt;
&lt;li&gt;The current model's answer (this becomes your reference)&lt;/li&gt;
&lt;li&gt;For as many as you can afford: a human-verified "ideal" answer&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Notice: with this approach you are preparing for testing answer generation part of the RAG not the retriever.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Don't sample uniformly.&lt;/strong&gt; Stratify on purpose. Include:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stratum&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Easy factual queries&lt;/td&gt;
&lt;td&gt;Regression canary, should never fail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-chunk synthesis&lt;/td&gt;
&lt;td&gt;Where model capability actually matters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conflicting context&lt;/td&gt;
&lt;td&gt;Tests faithfulness under pressure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Out-of-scope queries&lt;/td&gt;
&lt;td&gt;Refusal behavior regression&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge cases your team knows about&lt;/td&gt;
&lt;td&gt;The ones that broke things before&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Aim for at least 50 human-verified examples. 200 is significantly better.&lt;/p&gt;
&lt;h3 id="what-to-evaluate"&gt;What to evaluate&lt;/h3&gt;
&lt;p&gt;For a RAG system, here's what you actually need to measure:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Faithfulness&lt;/strong&gt; - does the answer only claim things supported by the retrieved context? This is the big one. A model that hallucinates confidently is scarier than one that refuses to answer. Use the &lt;a href="https://docs.ragas.io/en/latest/concepts/metrics/faithfulness.html"&gt;Ragas faithfulness metric&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Answer relevance&lt;/strong&gt; - does it answer what was actually asked? (&lt;a href="https://docs.ragas.io/en/latest/concepts/metrics/answer_relevance.html"&gt;Ragas answer relevance&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Format compliance&lt;/strong&gt; - does the output match your schema? JSON structure, citation format, length constraints. You'll likely need a custom LLM-as-judge metric here because format requirements vary widely.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Refusal accuracy&lt;/strong&gt; - when the context doesn't contain the answer, does the model say "I don't know" instead of making something up?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Groundedness&lt;/strong&gt; - can you trace specific claims back to specific chunks? Similar to faithfulness but more granular.&lt;/p&gt;
&lt;h3 id="evaluation-tooling"&gt;Evaluation tooling&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://docs.ragas.io/"&gt;Ragas&lt;/a&gt; automates faithfulness, answer relevance, context precision, and context recall scoring using an LLM-as-judge approach. Point it at your golden dataset and run both old and new model outputs through it to get comparable scores.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.promptfoo.dev/"&gt;PromptFoo&lt;/a&gt; works well for regression testing during prompt iteration. Define test cases with expected outputs or assertions and run them against multiple models simultaneously, which is exactly the side-by-side comparison you need during migration.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://smith.langchain.com/"&gt;LangSmith&lt;/a&gt; or &lt;a href="https://www.braintrust.dev/"&gt;Braintrust&lt;/a&gt; if you want persistent experiment tracking. They store eval runs with scores, let you diff outputs visually, and can alert on regressions. Worth setting up if this migration will take more than a week.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://mlflow.org/docs/latest/genai/tracing/"&gt;MLflow&lt;/a&gt; for teams already in the MLflow ecosystem. It has native LLM tracking and integrates directly with DSPy (covered in Phase 3).&lt;/p&gt;
&lt;h3 id="define-passfail-gate-criteria"&gt;Define pass/fail gate criteria&lt;/h3&gt;
&lt;p&gt;Do this &lt;em&gt;before&lt;/em&gt; running any evals. Seriously. If you define criteria after seeing results, you'll unconsciously anchor them to whatever the new model happens to achieve. Human brains are terrible at this.&lt;/p&gt;
&lt;p&gt;Example gate criteria:
- Faithfulness score ≥ 0.92
- Format compliance ≥ 0.98
- Answer relevance ≥ 0.90
- P95 latency within 20% of baseline
- Refusal accuracy ≥ 0.95 on out-of-scope stratum&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart TD
    A[Sample 200-500 production queries] --&gt; B[Stratify by difficulty and type]
    B --&gt; C[Run current model - store as reference]
    C --&gt; D[Human verify 50-200 examples]
    D --&gt; E[Define metric weights per dimension]
    E --&gt; F[Set gate criteria before migration begins]
    F --&gt; G[Eval harness ready]

    style A fill:#1e3a5f,color:#b8d4f0
    style G fill:#1a3d2e,color:#8fd4b0
&lt;/pre&gt;

&lt;h2 id="phase-2-the-prompt-portability-problem"&gt;Phase 2 - The prompt portability problem&lt;/h2&gt;
&lt;p&gt;Okay, this is the part that causes the most pain, and people don't usually talk about it honestly.&lt;/p&gt;
&lt;h3 id="written-prompts-vs-tuned-prompts"&gt;Written prompts vs tuned prompts&lt;/h3&gt;
&lt;p&gt;There's a difference between a prompt that &lt;em&gt;specifies&lt;/em&gt; behavior and a prompt that was &lt;em&gt;tweaked until outputs stopped looking weird&lt;/em&gt;. It's a bigger difference than you'd think.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A written prompt&lt;/strong&gt; starts from a behavioral spec. You know what the system must do, what it must not do, what the output format looks like. The constraints are verifiable regardless of which model you're running:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Answer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;using&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;only&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;information&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;present&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;provided&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;If&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;does&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;not&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;contain&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sufficient&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;information&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;respond&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;exactly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;&amp;quot;I cannot answer this from the available information.&amp;quot;&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;Format&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;citations&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;source_id&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;inline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;immediately&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;after&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;they&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;support&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;These survive a model swap. You can read the prompt and tell whether any output satisfies them, regardless of which model made it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A tuned prompt&lt;/strong&gt; is what most of us actually have. It starts from a vague goal and grows through patches. You wrote a first draft. Outputs were mostly fine but the model kept adding chatty preambles, so you added "be concise." Then it started truncating, so you added "be thorough but concise." Citations were inconsistent, so you added "always cite sources." Then it started over-citing, so you added "cite only when directly referencing a specific fact."&lt;/p&gt;
&lt;p&gt;Sound familiar? Six months later your prompt is 800 words and full of stuff like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"Do not add unnecessary preambles"&lt;/em&gt; - patch for a greeting behavior specific to an old model weight&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"Avoid repeating the question in your answer"&lt;/em&gt; - patch for a retriggering behavior&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"Use natural language, not bullet points unless the question explicitly asks for a list"&lt;/em&gt; - patch for a formatting regression after a silent weight update&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these describe what your system is &lt;em&gt;supposed&lt;/em&gt; to do. They're band-aids for specific past failures of a model that no longer exists.&lt;/p&gt;
&lt;h3 id="why-tuned-prompts-are-so-common"&gt;Why tuned prompts are so common&lt;/h3&gt;
&lt;p&gt;If you're feeling called out right now, don't. The vast majority of production RAG prompts are more tuned than written.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Patching is faster than specifying.&lt;/strong&gt; When you're iterating on a RAG system, you see problems and fix them. The fastest fix is usually adding an instruction. Writing a proper spec requires knowing all failure modes before you've seen them, which is... impossible when failures emerge from interaction with real data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Nothing forces you to notice until migration.&lt;/strong&gt; A tuned prompt works. It produces acceptable outputs on the current model. The coupling only becomes visible when you remove the thing it's coupled to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Many teams never wrote a spec to begin with.&lt;/strong&gt; Writing a specification for "correct behavior" before building the system requires a level of foresight that's genuinely hard to have. So the prompt &lt;em&gt;became&lt;/em&gt; the spec over time, simultaneously the behavioral specification and the accumulated technical debt. Good luck telling them apart from the inside.&lt;/p&gt;
&lt;h3 id="what-to-do-about-it"&gt;What to do about it&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Start with prompt archaeology.&lt;/strong&gt; Before you touch anything, go through every instruction in your current prompt and label it as either:
- &lt;code&gt;SPEC&lt;/code&gt; - this describes intended behavior, survives model changes
- &lt;code&gt;PATCH&lt;/code&gt; - this suppresses a specific failure, may not be relevant to new model&lt;/p&gt;
&lt;p&gt;In my experience, most 500+ word prompts end up being about 40% spec and 60% patch. The patches are candidates for removal or replacement after you test compatibility with the new model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Run a compatibility baseline before touching anything.&lt;/strong&gt; Take your existing prompt verbatim, swap only the model name, run your golden dataset through it, and score with Ragas. This tells you how much of the prompt is still needed versus how much is suppressing behaviors the new model doesn't even have.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reconstruct minimally.&lt;/strong&gt; Where scores dropped, investigate the failure patterns before changing the prompt. Don't just start adding instructions (that's how you got here). Common adaptation points:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;th&gt;Diagnosis&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Markdown in plain-text outputs&lt;/td&gt;
&lt;td&gt;New model has higher markdown affinity&lt;/td&gt;
&lt;td&gt;Explicit format instruction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Longer responses than expected&lt;/td&gt;
&lt;td&gt;Different default length calibration&lt;/td&gt;
&lt;td&gt;Token budget instruction + few-shot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Citation format drift&lt;/td&gt;
&lt;td&gt;Model interprets citation spec differently&lt;/td&gt;
&lt;td&gt;Few-shot examples, not more words&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Refusal behavior change&lt;/td&gt;
&lt;td&gt;Different threshold for "insufficient context"&lt;/td&gt;
&lt;td&gt;Explicit refusal instruction with example&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON key naming change&lt;/td&gt;
&lt;td&gt;Model paraphrases key names&lt;/td&gt;
&lt;td&gt;Strict schema in system prompt or JSON mode&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Few-shot examples are the most reliable format anchor.&lt;/strong&gt; More reliable than additional instructions. They act as a behavioral anchor that survives prompt wording differences across model versions. For format-critical RAG outputs, 2-3 few-shot examples will do more work than three paragraphs of format instructions.&lt;/p&gt;
&lt;h2 id="phase-3-automated-prompt-optimization-with-dspy"&gt;Phase 3 - Automated prompt optimization with DSPy&lt;/h2&gt;
&lt;p&gt;So manual prompt iteration is what got you into the tuned-prompt mess. &lt;a href="https://dspy.ai/"&gt;DSPy&lt;/a&gt; offers a way out: it learns the optimal prompt for your specific data and target model automatically, guided by whatever metric you care about.&lt;/p&gt;
&lt;h3 id="what-dspy-actually-does"&gt;What DSPy actually does&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://github.com/stanfordnlp/dspy"&gt;DSPy&lt;/a&gt; separates the &lt;em&gt;interface&lt;/em&gt; (what should the model do) from the &lt;em&gt;implementation&lt;/em&gt; (how do you tell it). You define Signatures (declarative input/output specs) and DSPy optimizers figure out the instructions and examples that best satisfy your metric on your data.&lt;/p&gt;
&lt;p&gt;Why this matters for migration: &lt;strong&gt;the prompt that worked on gpt-4o-mini isn't guaranteed to be optimal for gpt-5.4-nano&lt;/strong&gt;. DSPy relearns the prompt for the new model automatically. You get model-specific optimization without the manual iteration that created the problem in the first place.&lt;/p&gt;
&lt;h3 id="translating-your-pipeline-into-dspy"&gt;Translating your pipeline into DSPy&lt;/h3&gt;
&lt;p&gt;Your answer generation step becomes a Signature:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dspy&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RAGAnswer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Signature&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;Answer the question using only the provided context chunks.&lt;/span&gt;
&lt;span class="sd"&gt;    Cite sources using [chunk_id] notation.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;retrieved context chunks with IDs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OutputField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;faithful, cited answer&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RAGPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;respond&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChainOfThought&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RAGAnswer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;respond&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Switching the target model is one line. Your golden dataset and metric stay identical:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before: gpt-4o-mini&lt;/span&gt;
&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openai/gpt-4o-mini&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# After: gpt-5.4-nano - golden dataset unchanged, metric unchanged&lt;/span&gt;
&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openai/gpt-5.4-nano&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="choosing-an-optimizer"&gt;Choosing an optimizer&lt;/h3&gt;
&lt;pre class="mermaid"&gt;
flowchart TD
    A[Start: need to optimize prompt for new model] --&gt; B{How many pipeline stages?}

    B --&gt;|Single stage| C{Budget and time?}
    B --&gt;|Multi-stage| D[MIPROv2 or GEPA]

    C --&gt;|Quick and cheap| E[BootstrapFewShot&lt;br/&gt;~$0.10, ~5 min]
    C --&gt;|Thorough| F[MIPROv2&lt;br/&gt;~$1.50-5, ~30 min]

    D --&gt; G{Complex failures&lt;br/&gt;need diagnosis?}
    G --&gt;|No| F
    G --&gt;|Yes| H[GEPA&lt;br/&gt;Expensive but deepest optimization]

    E --&gt; I[Save compiled program as artifact]
    F --&gt; I
    H --&gt; I

    style E fill:#1a3d2e,color:#8fd4b0
    style F fill:#1e3a5f,color:#b8d4f0
    style H fill:#3d1a2e,color:#d4b0c0
    style I fill:#3d2e1a,color:#d4c0a0
&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dspy.ai/api/optimizers/BootstrapFewShot/"&gt;BootstrapFewShot&lt;/a&gt;&lt;/strong&gt; - the baseline. Generates complete demonstrations for each stage of your program, keeping only those that pass your metric. Use this first. Cheap, fast, often sufficient for single-stage answer generation migration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://dspy.ai/api/optimizers/MIPROv2/"&gt;MIPROv2&lt;/a&gt;&lt;/strong&gt; - the workhorse. Runs three stages: bootstrapping to collect high-scoring traces, grounded proposal to draft potential instructions, then discrete search to evaluate instruction-example combinations. Costs ~$1.50-5 at medium auto setting, takes 20-40 minutes. Worth running before any production rollout.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;dspy.teleprompt&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MIPROv2&lt;/span&gt;

&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MIPROv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;your_faithfulness_metric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;medium&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;# light / medium / heavy&lt;/span&gt;
    &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;compiled_rag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;RAGPipeline&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;train_examples&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# 20% of golden dataset&lt;/span&gt;
    &lt;span class="n"&gt;valset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_examples&lt;/span&gt;        &lt;span class="c1"&gt;# 80% of golden dataset&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save as versioned artifact&lt;/span&gt;
&lt;span class="n"&gt;compiled_rag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;rag_pipeline_gpt54nano_v1.json&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dspy.ai/api/optimizers/GEPA/overview/"&gt;GEPA&lt;/a&gt;&lt;/strong&gt; - the newest. Rather than optimizing only the globally best candidate (which leads to local optima), GEPA maintains a Pareto frontier: candidates that achieve the highest score on at least one evaluation instance. It uses a Teacher model to analyze failures and propose targeted fixes. Use for multi-stage pipelines where interaction effects between stages matter. Requires a strong reflection model (&lt;code&gt;gpt-5.4&lt;/code&gt; or higher recommended).&lt;/p&gt;
&lt;h3 id="the-data-split-that-matters"&gt;The data split that matters&lt;/h3&gt;
&lt;p&gt;This one's counterintuitive: &lt;strong&gt;20% training, 80% validation&lt;/strong&gt;. Yes, reversed from what you're used to. Prompt-based optimizers overfit to small training sets way more aggressively than neural networks. A prompt optimized on 60 examples and validated on 240 will generalize far better than the other way around.&lt;/p&gt;
&lt;p&gt;You can get real value from as few as 30 training examples. The validation set is where quality is actually measured.&lt;/p&gt;
&lt;h3 id="dspy-in-migration-tool-vs-framework"&gt;DSPy in migration: tool vs framework&lt;/h3&gt;
&lt;p&gt;Teams adopt DSPy as a permanent architectural layer when the problem only needed a one-time tool.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Use DSPy as a migration tool:&lt;/strong&gt; run MIPROv2 against your golden dataset on the new model, inspect the optimized prompt, extract it as a string, and deploy it without DSPy in your runtime path. Optimization benefit, no framework dependency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Adopt DSPy as a framework&lt;/strong&gt; only if you're going to keep experimenting: swapping models regularly, adding pipeline stages, running continuous optimization against production feedback. That's where the framework investment pays off.&lt;/p&gt;
&lt;h3 id="dspy-limitations-to-know-before-committing"&gt;DSPy limitations to know before committing&lt;/h3&gt;
&lt;p&gt;A few things to be aware of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Your metric quality determines everything.&lt;/strong&gt; DSPy optimizes whatever metric you give it. Weak metric? You'll get a confidently wrong optimized prompt. Make sure faithfulness is in there, not just answer relevance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The optimized prompt can look weird.&lt;/strong&gt; MIPROv2 generates instructions that work but can be verbose and repetitive in ways a human wouldn't write. Debugging means re-running evals, not reading the prompt.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reasoning models need special handling.&lt;/strong&gt; DSPy doesn't have first-class support for &lt;code&gt;reasoning_effort&lt;/code&gt; yet. When targeting pure reasoning models (o4-mini, o3) or GPT-5.x with specific reasoning effort levels, wrap the model call in a custom &lt;code&gt;dspy.LM&lt;/code&gt; class that sets &lt;code&gt;reasoning_effort&lt;/code&gt; at initialization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch the adapter layer.&lt;/strong&gt; DSPy's adapters wrap your instructions in scaffolding. With reasoning models where prompt verbosity interferes with internal reasoning chains, this can produce unexpected behavior. Test it.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="phase-4-reasoning-models-where-they-belong"&gt;Phase 4 - Reasoning models: where they belong&lt;/h2&gt;
&lt;p&gt;When people hear "more capable model family," their instinct is to slot the reasoning model right where the standard model was: answer generation. This is usually wrong.&lt;/p&gt;
&lt;p&gt;Think of it like hiring. You wouldn't pay a principal engineer to write CRUD endpoints. You hire them to make decisions when things are ambiguous. Same logic here.&lt;/p&gt;
&lt;h3 id="the-rag-pipeline-with-reasoning-model-placement"&gt;The RAG pipeline with reasoning model placement&lt;/h3&gt;
&lt;pre class="mermaid"&gt;
flowchart TD
    Q[User query] --&gt; QD

    subgraph REASONING ["Reasoning model layer (low-medium effort)"]
        QD["Query decomposition&lt;br/&gt;+ intent routing"]
        VER["Faithfulness&lt;br/&gt;verification"]
    end

    subgraph STANDARD ["Standard model layer"]
        RET["Vector/hybrid&lt;br/&gt;retrieval"]
        RNK["Reranking +&lt;br/&gt;conflict detection"]
        SYN["Answer&lt;br/&gt;synthesis"]
    end

    QD --&gt; RET
    RET --&gt; RNK
    RNK --&gt; SYN
    SYN --&gt; VER
    VER --&gt; ANS[Response]

    style REASONING fill:#1e2a3f,color:#a0c4f0,stroke:#3a5a8f
    style STANDARD fill:#1a2e1f,color:#90c4a0,stroke:#2a5a3a
&lt;/pre&gt;

&lt;h3 id="slot-1-query-decomposition-strong-fit-highest-roi"&gt;Slot 1: Query decomposition - strong fit, highest ROI&lt;/h3&gt;
&lt;p&gt;Most enterprise RAG queries are compound, ambiguous, or require unpacking implicit assumptions before retrieval. A reasoning model at this stage:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Decomposes "what's our exposure if we exit the APAC contracts before Q3 given the renegotiation clauses?" into 3–4 targeted retrieval sub-queries&lt;/li&gt;
&lt;li&gt;Classifies intent (factual / comparative / policy / out-of-scope) to route appropriately&lt;/li&gt;
&lt;li&gt;Identifies which sub-questions are answerable from the knowledge base vs. need human input&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The reasoning happens &lt;em&gt;before&lt;/em&gt; retrieval, so latency cost doesn't compound. Use &lt;code&gt;reasoning_effort: low&lt;/code&gt; because query decomposition rarely needs deep thinking. You pay for one reasoning call and get better chunks in return for every subsequent step.&lt;/p&gt;
&lt;h3 id="slot-2-context-reranking-and-conflict-detection-strong-fit"&gt;Slot 2: Context reranking and conflict detection - strong fit&lt;/h3&gt;
&lt;p&gt;Standard &lt;a href="https://www.sbert.net/docs/cross_encoder/usage/usage.html"&gt;cross-encoder rerankers&lt;/a&gt; like &lt;a href="https://huggingface.co/BAAI/bge-reranker-v2-m3"&gt;BGE&lt;/a&gt; or &lt;a href="https://cohere.com/rerank"&gt;Cohere Rerank&lt;/a&gt; score chunk-query relevance mechanically. A reasoning model can do something they can't: spot when retrieved chunks &lt;em&gt;contradict each other&lt;/em&gt;. That's a signal to retry retrieval, not attempt synthesis over conflicting information.&lt;/p&gt;
&lt;p&gt;Especially useful in legal, compliance, and financial RAG where "relevance" means logical applicability, not just semantic similarity.&lt;/p&gt;
&lt;h3 id="slot-3-answer-synthesis-on-complex-documents-maybe"&gt;Slot 3: Answer synthesis on complex documents - maybe&lt;/h3&gt;
&lt;p&gt;Use a reasoning model for synthesis only when the answer requires multi-hop inference across chunks (A implies B, B contradicts C, therefore...) or the domain is high-stakes with real consequences for wrong answers.&lt;/p&gt;
&lt;p&gt;Don't bother when the answer is directly stated in one or two chunks, or when query volume is high and latency matters. If you're generating templated output from straightforward lookups, a standard model is fine.&lt;/p&gt;
&lt;h3 id="slot-4-agentic-orchestration-good-fit-underused"&gt;Slot 4: Agentic orchestration - good fit, underused&lt;/h3&gt;
&lt;p&gt;If you have multi-source RAG (vector DB + SQL + APIs + document store), you need something that decides &lt;em&gt;which retrieval path to take&lt;/em&gt;, in what order, and when to stop. That's a planning problem, not a synthesis problem. Reasoning models are trained through RL to reason about &lt;em&gt;when and how to use tools&lt;/em&gt;, not just call them when told.&lt;/p&gt;
&lt;p&gt;Use the reasoning model as the orchestration brain. Let cheaper instruction models handle the actual synthesis once the right context is assembled.&lt;/p&gt;
&lt;h3 id="slot-5-faithfulness-verification-high-value-underused"&gt;Slot 5: Faithfulness verification - high value, underused&lt;/h3&gt;
&lt;p&gt;A second pass after generation: "Given only the following context, does this answer make claims not supported by the context? Flag the specific unsupported sentences."&lt;/p&gt;
&lt;p&gt;&lt;code&gt;reasoning_effort: low&lt;/code&gt;. You're verifying, not generating. One cheap call that acts as your hallucination guardrail. The economics really work here for high-stakes outputs.&lt;/p&gt;
&lt;h2 id="phase-5-handling-parameter-differences"&gt;Phase 5 - Handling parameter differences&lt;/h2&gt;
&lt;p&gt;If you're migrating between standard GPT-5.x models (like gpt-4o-mini to gpt-5.4-nano), this section mostly doesn't apply. The standard parameters work identically.&lt;/p&gt;
&lt;p&gt;The main difference: GPT-5.x models add &lt;code&gt;reasoning_effort&lt;/code&gt; (none/low/medium/high/xhigh) as an optional parameter. Set it to &lt;code&gt;none&lt;/code&gt; for standard instruction-following behavior, or use low/medium/high when you need reasoning.&lt;/p&gt;
&lt;p&gt;If you're migrating to or using pure reasoning models (o-series), here's the complete parameter gap:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;GPT-5.x models&lt;/th&gt;
&lt;th&gt;Pure reasoning (o-series)&lt;/th&gt;
&lt;th&gt;Migration path&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;temperature&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;reasoning_effort&lt;/code&gt; instead, or prompt template variants&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;top_p&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Not needed, reasoning stabilizes output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;max_tokens&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;max_completion_tokens&lt;/code&gt; (covers thinking + output)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;presence_penalty&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Not typically used in RAG anyway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;frequency_penalty&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Not typically used in RAG anyway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reasoning_effort&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅ (none/low/medium/high/xhigh)&lt;/td&gt;
&lt;td&gt;✅ (low/medium/high only)&lt;/td&gt;
&lt;td&gt;Built-in for GPT-5.x, only option for o-series&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;system&lt;/code&gt; prompt&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Treated as developer message&lt;/td&gt;
&lt;td&gt;Don't use both system and developer message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;streaming&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Limited (o3 with access)&lt;/td&gt;
&lt;td&gt;Use progress indicators, not streaming text&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="temperature-and-reasoning_effort"&gt;Temperature and reasoning_effort&lt;/h3&gt;
&lt;p&gt;For RAG, you were almost certainly running &lt;code&gt;temperature=0&lt;/code&gt; or close to it. You wanted determinism, not creativity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;With GPT-5.x models:&lt;/strong&gt; You can still use &lt;code&gt;temperature=0&lt;/code&gt; for deterministic output. If you want reasoning on specific queries, add &lt;code&gt;reasoning_effort: low&lt;/code&gt; or &lt;code&gt;medium&lt;/code&gt;. The two parameters work together.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;With pure reasoning models (o-series):&lt;/strong&gt; No temperature control. The reasoning process itself stabilizes output, giving you consistency by default. If you needed temperature for diversity (generating multiple answer variants), replace it with explicit prompt variants or use a GPT-5.x model with &lt;code&gt;temperature&lt;/code&gt; + &lt;code&gt;reasoning_effort: none&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id="the-max_completion_tokens-trap"&gt;The &lt;code&gt;max_completion_tokens&lt;/code&gt; trap&lt;/h3&gt;
&lt;p&gt;Set this generously. Here's why: thinking tokens don't appear in your output but they're billed and count against this budget. At &lt;code&gt;high&lt;/code&gt; effort, a single complex query can burn 10,000-50,000 thinking tokens. If &lt;code&gt;max_completion_tokens&lt;/code&gt; is too low, the model's reasoning gets truncated mid-thought. You get degraded output and no obvious error signal. Start at 16,000 for medium effort, 32,000+ for high effort, and keep an eye on actual consumption.&lt;/p&gt;
&lt;h3 id="prompt-verbosity-inversion-with-reasoning"&gt;Prompt verbosity inversion with reasoning&lt;/h3&gt;
&lt;p&gt;This one's weird if you've internalized standard prompt engineering advice. When using reasoning (either pure o-series models or GPT-5.x with &lt;code&gt;reasoning_effort&lt;/code&gt; &amp;gt; none), being explicit and repetitive hurts. Over-specified prompts interfere with the internal reasoning chain. The model follows your constraints mechanically instead of actually reasoning toward the goal.&lt;/p&gt;
&lt;p&gt;Write shorter, higher-trust prompts when reasoning is active. Define the &lt;em&gt;goal&lt;/em&gt; and &lt;em&gt;constraints&lt;/em&gt; but leave the &lt;em&gt;method&lt;/em&gt; to the model. I know this feels uncomfortable. It's the opposite of everything we learned, but it works.&lt;/p&gt;
&lt;p&gt;For GPT-5.x with &lt;code&gt;reasoning_effort: none&lt;/code&gt;, standard verbose prompts work fine.&lt;/p&gt;
&lt;h2 id="phase-6-risk-assessment"&gt;Phase 6 - Risk assessment&lt;/h2&gt;
&lt;p&gt;Before any real traffic touches the new model, take a step back and run a structured risk assessment against your eval results.&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart TD
    EVAL[Run eval suite on new model] --&gt; F1 &amp; F2 &amp; F3 &amp; F4

    F1{Format compliance&lt;br/&gt;≥ threshold?}
    F2{Faithfulness&lt;br/&gt;≥ threshold?}
    F3{Latency P95&lt;br/&gt;within range?}
    F4{Edge case&lt;br/&gt;behavior OK?}

    F1 --&gt;|Fail| R1[Debug: markdown bleed,&lt;br/&gt;JSON key drift,&lt;br/&gt;length calibration]
    F2 --&gt;|Fail| R2[Debug: world knowledge&lt;br/&gt;bleeding through,&lt;br/&gt;chunk attribution loss]
    F3 --&gt;|Fail| R3[Evaluate cost/latency&lt;br/&gt;tradeoff, consider&lt;br/&gt;model tier change]
    F4 --&gt;|Fail| R4[Expand golden set&lt;br/&gt;for edge cases,&lt;br/&gt;refusal tuning]

    F1 &amp; F2 &amp; F3 &amp; F4 --&gt;|All pass| PROCEED[Proceed to rollout]

    R1 &amp; R2 &amp; R3 &amp; R4 --&gt; FIX[Fix and re-evaluate]
    FIX --&gt; EVAL

    style PROCEED fill:#1a3d2e,color:#8fd4b0
    style FIX fill:#3d1a1a,color:#d4a0a0
&lt;/pre&gt;

&lt;h3 id="risk-catalog"&gt;Risk catalog&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Format regression&lt;/strong&gt; - one of the most common things to break. Newer models may wrap answers in markdown when you didn't ask, change capitalization, add disclaimers, or subtly alter JSON key naming. Fixable, but only if you're measuring it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Faithfulness change&lt;/strong&gt; - this one can go either direction, which is what makes it tricky. More capable models are generally more faithful to retrieved context, but they also have stronger world-knowledge priors that can bleed through. Watch for answers that are factually correct but not actually grounded in the retrieved context. That's a subtle and dangerous failure mode.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latency change&lt;/strong&gt; - gpt-5.4-nano is generally comparable to gpt-4o-mini for standard workloads. gpt-5.4-mini and gpt-5.4 (full) are slower. When you enable reasoning (&lt;code&gt;reasoning_effort: medium&lt;/code&gt; or higher), latency increases significantly. At &lt;code&gt;high&lt;/code&gt; or &lt;code&gt;xhigh&lt;/code&gt; effort, a single query can take 30-90+ seconds. Pure reasoning models (o-series) operate in this higher-latency range. Measure P50, P95, P99 before committing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cost change&lt;/strong&gt; - run a cost projection on your golden set query lengths x new model pricing before migration. The golden set gives you a realistic token distribution. Don't estimate from toy examples.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Streaming behavior&lt;/strong&gt; - if your UI depends on streaming tokens, verify the new model's streaming token patterns don't break downstream parsing. Partial JSON parsing, in particular, can fail on different tokenization patterns.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Silent behavioral drift post-migration&lt;/strong&gt; - even after successful migration, model providers update weights silently. This is the argument for pinning to a specific version string (&lt;code&gt;gpt-5.4-nano-2026-03-15&lt;/code&gt; or similar dated versions) rather than a floating alias like &lt;code&gt;gpt-5.4-nano&lt;/code&gt;. Floating aliases trade reproducibility for automatic access to improvements.&lt;/p&gt;
&lt;h2 id="phase-7-progressive-rollout"&gt;Phase 7 - Progressive rollout&lt;/h2&gt;
&lt;p&gt;I recommend don't big-bang this. The cost of a bad rollout isn't the rollout itself. It's the user impact during the time it takes you to notice and roll back.&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart LR
    A[Shadow mode&lt;br/&gt;48-72h&lt;br/&gt;Log only] --&gt;|Metrics stable| B[5% traffic&lt;br/&gt;Gate check]
    B --&gt;|Pass| C[20% traffic&lt;br/&gt;Gate check]
    C --&gt;|Pass| D[50% traffic&lt;br/&gt;Gate check]
    D --&gt;|Pass| E[100%&lt;br/&gt;Full cutover]

    B --&gt;|Fail| R[Rollback + investigate]
    C --&gt;|Fail| R
    D --&gt;|Fail| R

    style A fill:#2a2a1e,color:#c4c490
    style E fill:#1a3d2e,color:#8fd4b0
    style R fill:#3d1a1a,color:#d4a0a0
&lt;/pre&gt;

&lt;h3 id="shadow-mode-first"&gt;Shadow mode first&lt;/h3&gt;
&lt;p&gt;Route a percentage of production traffic to the new model, log both responses, but serve only the old model's response to users. Run for 48-72 hours to collect real distribution data. This is the only way to validate behavior on production traffic without user exposure.&lt;/p&gt;
&lt;p&gt;Yeah, shadow mode requires infrastructure. You need to fire two model calls per request and log both. Worth it though. You'll be surprised how many behavioral differences show up on real traffic that your golden dataset missed entirely.&lt;/p&gt;
&lt;h3 id="gate-criteria-at-each-step"&gt;Gate criteria at each step&lt;/h3&gt;
&lt;p&gt;Define your numeric pass/fail criteria before the rollout begins. At each step, automated checks verify:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;gate_criteria&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;faithfulness_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;gt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0.92&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;format_compliance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;gt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0.98&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;answer_relevance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;gt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0.90&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;p95_latency_ms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;lt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;3000&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;refusal_accuracy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;gt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0.95&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;error_rate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;lt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0.001&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If any gate fails, the rollout stops. Not pauses for discussion. Stops. The investigation happens before traffic increases, not during.&lt;/p&gt;
&lt;h3 id="traffic-segmentation-for-canary-rollout"&gt;Traffic segmentation for canary rollout&lt;/h3&gt;
&lt;p&gt;If you can segment production traffic by query type or user segment, use it. Start with lower-stakes queries. Factual lookups before complex synthesis. Internal users before customers.&lt;/p&gt;
&lt;h3 id="keep-rollback-ready-for-a-week"&gt;Keep rollback ready for a week&lt;/h3&gt;
&lt;p&gt;Don't retire the old model integration the day you hit 100%. Keep the model name in a config variable so rollback is a single config change, not a code deployment. Wait at least a week before cleaning up the old path.&lt;/p&gt;
&lt;h2 id="phase-8-post-migration-monitoring"&gt;Phase 8 - Post-migration monitoring&lt;/h2&gt;
&lt;p&gt;You're not done when you hit 100% traffic. You've just established a new baseline that will itself drift.&lt;/p&gt;
&lt;h3 id="pin-your-model-version-string"&gt;Pin your model version string&lt;/h3&gt;
&lt;p&gt;This is the one I see skipped most often.&lt;/p&gt;
&lt;p&gt;Use versioned model strings like &lt;code&gt;gpt-5.4-nano-2026-03-15&lt;/code&gt;, not floating aliases like &lt;code&gt;gpt-5.4-nano&lt;/code&gt;. OpenAI updates weights behind floating aliases without telling you. When they do, your system's behavior changes and you did nothing. You won't get a deprecation notice. You'll get unexplained metric shifts in your dashboards... if you have dashboards.&lt;/p&gt;
&lt;p&gt;If you &lt;em&gt;want&lt;/em&gt; automatic access to model improvements, fine. But make that a deliberate choice, not something you fell into because you didn't think about it.&lt;/p&gt;
&lt;h3 id="ongoing-eval-cadence"&gt;Ongoing eval cadence&lt;/h3&gt;
&lt;p&gt;Run your eval suite on a random sample of production queries weekly. Use an LLM-as-judge approach so you don't need human annotation at scale. A weekly automated run catches behavioral drift from silent weight updates, shifts in what users are asking, or changes to your retrieval corpus that interact with generation in unexpected ways.&lt;/p&gt;
&lt;p&gt;Set up alerting on eval score degradation. If faithfulness drops more than 3 points week-over-week, investigate. Don't wait for users to complain.&lt;/p&gt;
&lt;h3 id="what-your-observability-actually-needs-to-cover"&gt;What your observability actually needs to cover&lt;/h3&gt;
&lt;p&gt;At minimum:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Token consumption per request (input, output, and thinking tokens if applicable)&lt;/li&gt;
&lt;li&gt;The actual model version string used (not the alias, the resolved version)&lt;/li&gt;
&lt;li&gt;Total latency per request, not just generation time&lt;/li&gt;
&lt;li&gt;Output format validation pass/fail&lt;/li&gt;
&lt;li&gt;Downstream parsing failures from unexpected output format&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://smith.langchain.com/"&gt;LangSmith&lt;/a&gt;, &lt;a href="https://phoenix.arize.com/"&gt;Arize Phoenix&lt;/a&gt;, and &lt;a href="https://www.helicone.ai/"&gt;Helicone&lt;/a&gt; all handle these. One thing worth mentioning: generic APM monitoring won't give you what you need for an LLM system. The failure modes are different. The metrics that matter are different. It's a different kind of system.&lt;/p&gt;
&lt;h2 id="the-systems-audit-you-should-run-regardless"&gt;The systems audit you should run regardless&lt;/h2&gt;
&lt;p&gt;Model deprecation is a forcing function. The problems it exposes have been there for months or years. The teams in the best position during migration aren't the ones who saw this specific deprecation coming. They're the ones who built systems that were observable and measurable from the start.&lt;/p&gt;
&lt;p&gt;Two questions worth sitting with, migration or not:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Can you actually measure output quality?&lt;/strong&gt; Not "does the team think it looks fine." Can you produce a number? If not, you're flying blind. A model change, a corpus change, a prompt tweak, a silent weight update... any of these will shift behavior and you won't know.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Is your prompt a specification or a pile of patches?&lt;/strong&gt; Do the archaeology exercise. If the patch ratio is high (and it probably is), that's a liability that compounds with every future model change.&lt;/p&gt;
&lt;p&gt;The migration is the deadline. But the real investment is in system discipline that makes the &lt;em&gt;next&lt;/em&gt; migration boring instead of terrifying.&lt;/p&gt;
&lt;h2 id="further-reading"&gt;Further reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dspy.ai/"&gt;DSPy Documentation and Tutorials&lt;/a&gt; - official docs with RAG optimization examples&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.ragas.io/"&gt;Ragas Framework&lt;/a&gt; - RAG-specific evaluation metrics&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.openai.com/docs/deprecations"&gt;OpenAI Model Deprecation Policy&lt;/a&gt; - deprecation schedule and migration guidance&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/introducing-o3-and-o4-mini/"&gt;OpenAI o3 and o4-mini Technical Report&lt;/a&gt; - capabilities and API parameters for reasoning models&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.promptfoo.dev/"&gt;PromptFoo&lt;/a&gt; - LLM regression testing and model comparison&lt;/li&gt;
&lt;li&gt;&lt;a href="https://smith.langchain.com/"&gt;LangSmith&lt;/a&gt; - LLM observability and experiment tracking&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.braintrust.dev/"&gt;Braintrust&lt;/a&gt; - eval-driven LLM development platform&lt;/li&gt;
&lt;li&gt;&lt;a href="https://phoenix.arize.com/"&gt;Arize Phoenix&lt;/a&gt; - open-source LLM tracing and evaluation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2406.11695"&gt;MIPRO: Optimizing Instructions and Demonstrations&lt;/a&gt; - paper behind the MIPROv2 optimizer&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2507.19457"&gt;GEPA: Reflective Prompt Evolution&lt;/a&gt; - paper behind the GEPA optimizer&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/BAAI/bge-reranker-v2-m3"&gt;BGE Reranker Models&lt;/a&gt; - open-source cross-encoder reranking&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cohere.com/rerank"&gt;Cohere Rerank API&lt;/a&gt; - managed reranking service&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mlflow.org/docs/latest/llm-tracking.html"&gt;MLflow LLM Tracking&lt;/a&gt; - experiment tracking with native DSPy integration&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.helicone.ai/"&gt;Helicone&lt;/a&gt; - LLM observability and cost tracking&lt;/li&gt;
&lt;/ul&gt;
&lt;script type="module"&gt;
    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
    mermaid.initialize({ startOnLoad: true });
&lt;/script&gt;

&lt;script type="text/javascript"&gt;if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
    var align = "center",
        indent = "0em",
        linebreak = "false";

    if (false) {
        align = (screen.width &lt; 768) ? "left" : align;
        indent = (screen.width &lt; 768) ? "0em" : indent;
        linebreak = (screen.width &lt; 768) ? 'true' : linebreak;
    }

    var mathjaxscript = document.createElement('script');
    mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
    mathjaxscript.type = 'text/javascript';
    mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.3/latest.js?config=TeX-AMS-MML_HTMLorMML';

    var configscript = document.createElement('script');
    configscript.type = 'text/x-mathjax-config';
    configscript[(window.opera ? "innerHTML" : "text")] =
        "MathJax.Hub.Config({" +
        "    config: ['MMLorHTML.js']," +
        "    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'none' } }," +
        "    jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
        "    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
        "    displayAlign: '"+ align +"'," +
        "    displayIndent: '"+ indent +"'," +
        "    showMathMenu: true," +
        "    messageStyle: 'normal'," +
        "    tex2jax: { " +
        "        inlineMath: [ ['\\\\(','\\\\)'] ], " +
        "        displayMath: [ ['$$','$$'] ]," +
        "        processEscapes: true," +
        "        preview: 'TeX'," +
        "    }, " +
        "    'HTML-CSS': { " +
        "        availableFonts: ['STIX', 'TeX']," +
        "        preferredFont: 'STIX'," +
        "        styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
        "        linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
        "    }, " +
        "}); " +
        "if ('default' !== 'default') {" +
            "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
                "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
                "VARIANT['normal'].fonts.unshift('MathJax_default');" +
                "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
                "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
                "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
            "});" +
            "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
                "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
                "VARIANT['normal'].fonts.unshift('MathJax_default');" +
                "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
                "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
                "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
            "});" +
        "}";

    (document.body || document.getElementsByTagName('head')[0]).appendChild(configscript);
    (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
&lt;/script&gt;</content><category term="Machine Learning"/><category term="llm-migration"/><category term="model-deprecation"/><category term="rag-systems"/><category term="prompt-engineering"/><category term="production-ai"/><category term="ai-evaluation"/><category term="llmops"/><category term="reasoning-models"/><category term="dspy"/><category term="prompt-optimization"/><category term="ai-architecture"/><category term="retrieval-augmented-generation"/><category term="ai-reliability"/><category term="gpt-4o-mini"/><category term="model-versioning"/><category term="ai-observability"/><category term="progressive-rollout"/><category term="faithfulness-evaluation"/><category term="technical-debt"/><category term="ai-engineering"/></entry><entry><title>The illusion of control in AI-assisted engineering</title><link href="https://www.safjan.com/the-illusion-of-control-in-ai-assisted-engineering/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-03-03T00:00:00+01:00</published><updated>2026-03-03T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-03-03:/the-illusion-of-control-in-ai-assisted-engineering/</id><summary type="html">&lt;h2 id="the-illusion-of-control-in-ai-assisted-engineering"&gt;The illusion of control in AI-assisted engineering&lt;/h2&gt;
&lt;p&gt;Dashboards are green. Reviews complete on time. Audits pass. And the organization is slowly losing track of what its own systems actually do.&lt;/p&gt;
&lt;p&gt;I keep running into this in teams that use AI-assisted engineering a …&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-illusion-of-control-in-ai-assisted-engineering"&gt;The illusion of control in AI-assisted engineering&lt;/h2&gt;
&lt;p&gt;Dashboards are green. Reviews complete on time. Audits pass. And the organization is slowly losing track of what its own systems actually do.&lt;/p&gt;
&lt;p&gt;I keep running into this in teams that use AI-assisted engineering a lot. The controls are all still there. They're just not controlling what anyone thinks they're controlling.&lt;/p&gt;
&lt;h3 id="the-distinction-nobody-separates"&gt;The distinction nobody separates&lt;/h3&gt;
&lt;p&gt;When teams adopt AI-assisted development, the first question management asks is: &lt;em&gt;Is a human still reviewing the output?&lt;/em&gt; Yes. Code review gates are there. Architecture sign-offs happen. Compliance checklists get filled in.&lt;/p&gt;
&lt;p&gt;But the word "oversight" covers two different things. One is &lt;em&gt;operational control&lt;/em&gt;: can you approve, reject, or modify the output? The other is &lt;em&gt;epistemic control&lt;/em&gt;: do you actually understand what was built, why it works this way, and what it assumes?&lt;/p&gt;
&lt;p&gt;AI-assisted workflows keep the first and slowly destroy the second. Every governance framework I've seen was designed when these two things were inseparable. The engineer who wrote a module was the same person who reviewed it and signed off on it. They understood it because they built it. That coupling is broken now, and most orgs haven't noticed.&lt;/p&gt;
&lt;h3 id="what-governance-theater-actually-looks-like"&gt;What governance theater actually looks like&lt;/h3&gt;
&lt;p&gt;I watched this happen on a project last year. A team was building an integration layer between an enterprise platform and a third-party data provider. AI tooling generated the service contracts, transformation logic, and error-handling paths. Engineers reviewed everything. And here's what surprised me: the code review was thorough. More thorough than most reviews I've seen. Line-by-line, comments on the PR, long discussions about edge cases. Architecture board signed off.&lt;/p&gt;
&lt;p&gt;Six months later, a cascading failure traced back to a silent retry pattern in the error-handling logic that misbehaved under load. The postmortem asked who understood the design intent behind that module. Nobody. The AI generated it. The engineer reviewed it carefully, caught formatting issues and a null-check bug, but never asked &lt;em&gt;why the retry logic was structured that way&lt;/em&gt; because the review was focused on "is this correct," not "what is this assuming."&lt;/p&gt;
&lt;p&gt;The review wasn't sloppy. That's what made it disturbing. The process did exactly what it was designed to do. It just wasn't designed for code that arrived without anyone having thought through the design.&lt;/p&gt;
&lt;p&gt;The compliance version is worse. Imagine software company under financial regulation gets asked during an external audit to explain the reasoning behind an algorithmic approach touching customer data classification. The honest answer? It was AI-generated and the team validated the outputs without tracing the reasoning. No compliance framework accepts that. You either make up a rationale after the fact or admit a gap the auditors weren't looking for.&lt;/p&gt;
&lt;h3 id="how-responsibility-diffuses"&gt;How responsibility diffuses&lt;/h3&gt;
&lt;p&gt;In a normal engineering workflow, you can point at someone and say: you designed this, you understood it, you signed off. Architecture review boards exist partly to create that paper trail. When something breaks, you know who to ask.&lt;/p&gt;
&lt;p&gt;AI-assisted workflows put something in the authorship chain that can't be asked, can't explain itself, and can't take responsibility when a regulator shows up. Responsibility doesn't move to the AI. It just gets spread thin across everyone who touched the output. Everyone reviewed it. Nobody wrote it. When an incident review or an auditor asks "who understood this design," nobody answers.&lt;/p&gt;
&lt;p&gt;You don't notice this in any one sprint. But after a year of it, an organization ends up with a growing pile of production systems where nobody recorded the design intent, nobody captured the assumptions, and the engineers maintaining the code can't explain why a service boundary is where it is or why the retry logic works the way it does. The code passes tests. The architecture diagrams exist. The understanding doesn't.&lt;/p&gt;
&lt;h3 id="the-slower-risks"&gt;The slower risks&lt;/h3&gt;
&lt;p&gt;Leadership conversations about AI in engineering tend to focus on the obvious stuff: hallucinated code, security vulnerabilities, IP leakage. Those are real. The risks that worry me more take longer to show up.&lt;/p&gt;
&lt;p&gt;One is architectural drift. When dozens of teams across an organization generate code through AI without shared context, each team's output embeds slightly different assumptions about data models, error handling, and service contracts. Service A assumes retries are idempotent. Service B doesn't. Both pass their own tests. The inconsistency only surfaces when they're under load together, and by then the damage spreads fast.&lt;/p&gt;
&lt;p&gt;Another: AI-generated systems are unusually hard to change later. Normal technical debt is annoying but manageable because somebody on the team once understood the code. When a system was generated and reviewed but not really authored, the next team inherits a black box. They can't refactor confidently because nobody can tell them what the original constraints were. So they wrap it, route around it, or just don't touch it. The codebase hardens in a way that's different from normal decay.&lt;/p&gt;
&lt;p&gt;Regulators are catching up, too. In financial services and telecom especially, the questions about AI-assisted code generation are getting specific. The organizations I'd worry about aren't the ones using AI the most. They're the ones that adopted it without changing their traceability and governance to match.&lt;/p&gt;
&lt;h3 id="reframing-the-question"&gt;Reframing the question&lt;/h3&gt;
&lt;p&gt;The question isn't whether humans are in the loop. They are. The question is whether the human hitting "approve" on a PR could explain the design to an auditor, rebuild the module if the original was lost, or predict how it fails under conditions the tests don't cover.&lt;/p&gt;
&lt;p&gt;Reviewing code for correctness and understanding a system's architecture are different activities. Review processes were designed when they were the same activity. AI-assisted workflows have split them apart, and no one updated the process to compensate.&lt;/p&gt;
&lt;p&gt;Some of this is fixable. Not every AI-generated module needs deep human comprehension. Boilerplate, scaffolding, config, sure. But integration contracts, error-handling strategies, anything that encodes assumptions about how other systems behave? Someone needs to understand those, and "reviewed the PR" isn't the same as understanding them. Architectural reviews need to start asking "why is this designed this way" instead of just "does this look right." Traceability for AI-generated code needs to capture the assumptions, not just the output.&lt;/p&gt;
&lt;p&gt;If leadership teams keep the review ceremonies running while the understanding underneath thins out, they're not managing risk. They're performing risk management. And that works until someone actually tests it, which in regulated industries tends to happen during an incident or an audit, when the consequences are already serious.&lt;/p&gt;
&lt;h3 id="questions-worth-sitting-with"&gt;Questions worth sitting with&lt;/h3&gt;
&lt;p&gt;Can your organization tell the difference between systems your teams approved and systems they actually understand? Does anyone track that? I've never seen it tracked.&lt;/p&gt;
&lt;p&gt;If a regulator or an acquirer asked your team to walk through the design intent behind a critical integration, who would take the call? Would they be explaining what they know, or reverse-engineering it on the spot?&lt;/p&gt;
&lt;p&gt;Have your sign-off processes changed at all since AI adoption started? Every code review approval and architecture sign-off carries an implicit claim: "I understood this." If that claim has quietly become "I checked this," the paperwork is still flowing but the assurance behind it isn't there anymore.&lt;/p&gt;
&lt;p&gt;I don't have good answers to any of this. I'm working through it myself.&lt;/p&gt;</content><category term="Machine Learning"/></entry><entry><title>Version Your Vectors - Index Versioning as the Missing Layer in RAG Observability and Compliance</title><link href="https://www.safjan.com/version-your-vectors-index-versioning-as-the-missing-layer-in-rag/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-02-26T00:00:00+01:00</published><updated>2026-02-26T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-02-26:/version-your-vectors-index-versioning-as-the-missing-layer-in-rag/</id><summary type="html">&lt;p&gt;Your RAG answered correctly yesterday. Today, it contradicts itself. Nothing obvious changed — except the index. Retrieval drift is silent, cumulative, and rarely audited. This piece explains how to make it observable and reproducible - what must be versioned beyond vectors, how to enable point-in-time reconstruction, and when lightweight metadata is enough. Includes practical tracing patterns, replay strategies, and a risk-based decision guide for production systems.&lt;/p&gt;</summary><content type="html">&lt;h2 id="1-the-invisible-problem-why-rag-systems-silently-drift"&gt;1. The invisible problem: why RAG systems silently drift&lt;/h2&gt;
&lt;p&gt;Most teams building Retrieval-Augmented Generation systems obsess over the LLM, its version, its temperature, its prompt. The component that actually changes most often and most quietly is the index: the corpus of documents, their chunked representations, and the embeddings that map them into vector space.&lt;/p&gt;
&lt;p&gt;Think about what happens when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A policy document is updated, but the old version lingers in the index alongside the new one.&lt;/li&gt;
&lt;li&gt;An embedding model is upgraded, shifting similarity rankings for every query.&lt;/li&gt;
&lt;li&gt;A reindexing job silently drops a document partition due to a pipeline bug.&lt;/li&gt;
&lt;li&gt;A new batch of documents is ingested, and one source starts dominating 80% of retrievals on a topic.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In each case, the RAG system's answers change, but the LLM didn't. The model is the same. The prompt is the same. The retrieval configuration is the same. Yet the user gets a different answer today than they got last week.&lt;/p&gt;
&lt;p&gt;Without index versioning, you cannot answer the most basic audit question:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"Why did the system give this answer on January 10th?"&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is like running a production database without transaction logs or backups. You can serve queries, but you cannot explain, debug, or defend any historical result.&lt;/p&gt;
&lt;p&gt;The index is the part of a RAG pipeline that changes the most, and almost nobody versions it.&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart LR
    Q[Query] --&gt; M[Same Model]
    M --&gt; P[Same Prompt]
    P --&gt; I{Changed Index}
    I --&gt;|v1| A1[Answer A]
    I --&gt;|v2| A2[Different Answer]

    style I fill:#f96,stroke:#333,color:#000
    style A2 fill:#f96,stroke:#333,color:#000
&lt;/pre&gt;

&lt;h2 id="2-use-cases-that-justify-index-versioning"&gt;2. Use cases that justify index versioning&lt;/h2&gt;
&lt;p&gt;Index versioning is not a theoretical exercise. Here are concrete scenarios where missing versioning causes real operational, legal, or quality failures.&lt;/p&gt;
&lt;h3 id="21-debugging-and-root-cause-analysis"&gt;2.1 Debugging and root cause analysis&lt;/h3&gt;
&lt;p&gt;Answer quality suddenly degrades. Users report that the system "stopped knowing" about a topic, or started contradicting itself. The engineering team investigates:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Was it a model change? No, same model, same temperature.&lt;/li&gt;
&lt;li&gt;Was it a prompt change? No, same template.&lt;/li&gt;
&lt;li&gt;Was it an index change? Unknown.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Without index versioning, you cannot isolate retrieval drift from generation drift. You're debugging blind. With version tags on every retrieval request, you can immediately correlate quality drops with specific index changes: "Hallucination rate jumped 15% after &lt;code&gt;index_v23&lt;/code&gt; was deployed, which coincided with a re-chunking of the legal policy corpus."&lt;/p&gt;
&lt;h3 id="22-data-drift-detection"&gt;2.2 Data drift detection&lt;/h3&gt;
&lt;p&gt;The knowledge base is not static. Documents are added, updated, and removed. Embedding models are upgraded. Chunking strategies evolve.&lt;/p&gt;
&lt;p&gt;Index versioning lets you track:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Corpus drift: How many documents were added or removed between versions?&lt;/li&gt;
&lt;li&gt;Embedding drift: Did switching from &lt;code&gt;text-embedding-ada-002&lt;/code&gt; to &lt;code&gt;text-embedding-3-large&lt;/code&gt; change retrieval behavior?&lt;/li&gt;
&lt;li&gt;Source dominance shifts: Is one internal document now responsible for 70% of answers on a topic, when it used to be 30%?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are not hypothetical concerns. In enterprise deployments, source dominance drift is one of the most common causes of biased or narrow answers, and it is invisible without versioned tracking.&lt;/p&gt;
&lt;h3 id="23-regulatory-compliance-and-audit"&gt;2.3 Regulatory compliance and audit&lt;/h3&gt;
&lt;p&gt;This is where index versioning goes from "nice to have" to "legally required."&lt;/p&gt;
&lt;p&gt;Under the EU AI Act (Regulation 2024/1689), high-risk AI systems must maintain records sufficient to reconstruct past decisions. If a RAG system supports employment decisions, credit assessments, or healthcare recommendations, a regulator may ask:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"Reconstruct the knowledge base state and the retrieval trace for the answer provided to user X on date Y."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Without index versioning, this is impossible.&lt;/p&gt;
&lt;p&gt;In financial services, auditors need to verify what knowledge base was active when an AI-assisted recommendation was generated. In healthcare, regulators need to confirm which clinical guidelines were indexed at the time of an AI-supported diagnosis.&lt;/p&gt;
&lt;h3 id="24-reproducibility-for-evaluation"&gt;2.4 Reproducibility for evaluation&lt;/h3&gt;
&lt;p&gt;RAG evaluation is meaningless without controlled variables. If you're comparing two retrieval strategies, two embedding models, or two chunking approaches, you need to hold the index constant, or at least know exactly how it differed.&lt;/p&gt;
&lt;p&gt;Index versioning gives you:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A/B testing of embedding models with rollback to the control version.&lt;/li&gt;
&lt;li&gt;Regression testing: "Did the new index version degrade performance on our golden test set?"&lt;/li&gt;
&lt;li&gt;Longitudinal evaluation: tracking quality metrics across index versions over months.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="25-incident-response-and-rollback"&gt;2.5 Incident response and rollback&lt;/h3&gt;
&lt;p&gt;A corrupted ingestion pipeline indexes the wrong documents. A hallucination spike is traced to a specific batch of poorly formatted PDFs. A critical policy document was accidentally excluded during re-indexing.&lt;/p&gt;
&lt;p&gt;In each case, the response must be fast:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Identify the problematic index version.&lt;/li&gt;
&lt;li&gt;Rollback to the last known-good version.&lt;/li&gt;
&lt;li&gt;Serve from the rolled-back index while the issue is resolved.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Without versioning, step 1 is a forensic investigation, step 2 is impossible, and step 3 requires rebuilding the entire index from scratch.&lt;/p&gt;
&lt;h3 id="26-counterfactual-replay"&gt;2.6 Counterfactual replay&lt;/h3&gt;
&lt;p&gt;For bias investigations and fairness audits, you need to ask:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"What would the system have answered with last month's index?"&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This requires the ability to replay a query against a historical index state. Counterfactual replay is the best method we have for demonstrating that a system change improved (or degraded) fairness, accuracy, or safety.&lt;/p&gt;
&lt;h2 id="3-what-exactly-needs-versioning"&gt;3. What exactly needs versioning?&lt;/h2&gt;
&lt;p&gt;People often equate "index versioning" with "snapshotting the vector store." That's necessary but not enough. A RAG index is a composite artifact made of multiple independently changing components:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;What changes&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Document corpus&lt;/td&gt;
&lt;td&gt;Documents added, removed, or updated&lt;/td&gt;
&lt;td&gt;The content of the knowledge base&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chunk boundaries&lt;/td&gt;
&lt;td&gt;Chunking strategy or parameters change&lt;/td&gt;
&lt;td&gt;Same document produces different retrieval units&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embedding model&lt;/td&gt;
&lt;td&gt;Model version upgraded&lt;/td&gt;
&lt;td&gt;Same text maps to different vectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector index structure&lt;/td&gt;
&lt;td&gt;HNSW/IVF parameters, rebuild&lt;/td&gt;
&lt;td&gt;Approximate search behavior changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Metadata and filters&lt;/td&gt;
&lt;td&gt;ACLs, tags, timestamps, source labels&lt;/td&gt;
&lt;td&gt;What is retrievable changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt templates&lt;/td&gt;
&lt;td&gt;System prompt evolves&lt;/td&gt;
&lt;td&gt;How retrieved context is used changes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;A "version" is the combination of all these, not just the vector snapshot. If you version the vectors but not the chunking strategy, the same document will produce different answers and you won't know why.&lt;/p&gt;
&lt;p&gt;In practice, your version identifier should encode or reference all of these components. A simple approach is a composite version string or a version manifest:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;index_version&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;v47&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;corpus_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;a3f8c1...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;chunk_strategy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;semantic_v2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;embedding_model&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text-embedding-3-large&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;embedding_model_version&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;2025-09&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;vector_index_type&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;HNSW&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;document_count&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;14832&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;last_updated&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;2026-02-15T08:30:00Z&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;prompt_template_version&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;v12&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="4-regulatory-context-when-versioning-becomes-mandatory"&gt;4. Regulatory context: when versioning becomes mandatory&lt;/h2&gt;
&lt;h3 id="41-eu-ai-act-risk-classification-for-rag"&gt;4.1 EU AI Act risk classification for RAG&lt;/h3&gt;
&lt;p&gt;The EU AI Act does not regulate "RAG systems" as a category. It regulates AI systems based on their application domain and impact.&lt;/p&gt;
&lt;p&gt;A RAG system becomes high-risk when deployed in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Employment and worker management (screening, recruitment, task allocation)&lt;/li&gt;
&lt;li&gt;Access to essential services (credit scoring, insurance, social benefits)&lt;/li&gt;
&lt;li&gt;Education and vocational training (admissions, assessment)&lt;/li&gt;
&lt;li&gt;Law enforcement (evidence evaluation, risk assessment)&lt;/li&gt;
&lt;li&gt;Healthcare (clinical decision support, triage)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A RAG system used as an internal productivity tool (e.g., searching company wikis, drafting emails) is generally limited-risk or minimal-risk. Transparency obligations apply (users must know they're interacting with AI), but the heavy documentation and logging requirements do not.&lt;/p&gt;
&lt;p&gt;For deployers integrating third-party LLMs into RAG pipelines: even if the base model is compliant, your retrieval layer, index, and data governance are your responsibility. The model provider's compliance does not cover your index.&lt;/p&gt;
&lt;h3 id="42-what-the-regulation-actually-requires"&gt;4.2 What the regulation actually requires&lt;/h3&gt;
&lt;p&gt;For high-risk systems, the EU AI Act mandates:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Article 12 (Record-Keeping): Automatic logging of events sufficient to trace system behavior throughout its lifecycle. For RAG, this means logging retrieval traces, index versions, and context assembly decisions, not just input/output.&lt;/li&gt;
&lt;li&gt;Article 11 (Technical Documentation): Description of the system architecture, data sources, training data (or in RAG's case, indexed data), and their versions.&lt;/li&gt;
&lt;li&gt;Article 14 (Human Oversight): The system must allow effective human supervision, including the ability to review retrieval results and override decisions.&lt;/li&gt;
&lt;li&gt;Article 9 (Risk Management): Ongoing monitoring, testing, and mitigation, which for RAG includes detecting index drift, evaluating retrieval quality, and maintaining rollback capability.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The regulation does not require opening neural weights or explaining attention patterns. It requires system-level traceability and governance. Index versioning is the mechanism that makes this possible for the retrieval layer.&lt;/p&gt;
&lt;h3 id="43-proportional-versioning-match-effort-to-risk"&gt;4.3 Proportional versioning: match effort to risk&lt;/h3&gt;
&lt;p&gt;The principle here is proportionality. Not every RAG system needs a full replay architecture. But every RAG system needs &lt;em&gt;some&lt;/em&gt; level of version tracking.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Risk level&lt;/th&gt;
&lt;th&gt;What to track&lt;/th&gt;
&lt;th&gt;How to implement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Low risk (internal assistant)&lt;/td&gt;
&lt;td&gt;Embedding model version, document count, last update timestamp&lt;/td&gt;
&lt;td&gt;Metadata tags on telemetry spans&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium risk (decision support)&lt;/td&gt;
&lt;td&gt;Above + document hash sets, retrieval traces, index snapshot references&lt;/td&gt;
&lt;td&gt;Structured audit events with blob storage references&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High risk (regulatory decisions)&lt;/td&gt;
&lt;td&gt;Above + full replay capability, immutable audit trail, point-in-time index reconstruction&lt;/td&gt;
&lt;td&gt;Time-travel capable vector DB + archived prompts + evaluation metrics store&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Don't over-engineer for low-risk systems. Don't under-engineer for high-risk systems. The cost of getting this wrong runs in both directions: wasted engineering effort on one end, regulatory exposure on the other.&lt;/p&gt;
&lt;h2 id="5-solution-architectures-from-lightweight-to-full-replay"&gt;5. Solution architectures: from lightweight to full replay&lt;/h2&gt;
&lt;pre class="mermaid"&gt;
graph BT
    L0["&lt;b&gt;Level 0: Metadata Tagging&lt;/b&gt;&lt;br&gt;~zero cost &lt;br&gt;&lt;br&gt; basic drift detection"]
    L1["&lt;b&gt;Level 1: Content Hashing&lt;/b&gt;&lt;br&gt;low cost &lt;br&gt;&lt;br&gt; change detection + audit trail"]
    L2["&lt;b&gt;Level 2: Index Snapshots&lt;/b&gt;&lt;br&gt;medium cost &lt;br&gt;&lt;br&gt; point-in-time recovery + A/B testing"]
    L3["&lt;b&gt;Level 3: Full Replay&lt;/b&gt;&lt;br&gt;high cost &lt;br&gt;&lt;br&gt; counterfactual analysis + regulatory demo"]
    L4["&lt;b&gt;Level 4: Self-Healing Index&lt;/b&gt;&lt;br&gt;highest cost &lt;br&gt;&lt;br&gt; automated drift detection + remediation"]

    L0 --&gt; L1 --&gt; L2 --&gt; L3 --&gt; L4

    style L0 fill:#c8e6c9,stroke:#333,color:#000
    style L1 fill:#a5d6a7,stroke:#333,color:#000
    style L2 fill:#fff9c4,stroke:#333,color:#000
    style L3 fill:#ffcc80,stroke:#333,color:#000
    style L4 fill:#ef9a9a,stroke:#333,color:#000
&lt;/pre&gt;

&lt;h3 id="51-level-0-metadata-tagging-minimum-viable-versioning"&gt;5.1 Level 0: metadata tagging (minimum viable versioning)&lt;/h3&gt;
&lt;p&gt;The cheapest possible starting point. Tag every retrieval request with basic version metadata:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;index_version&lt;/code&gt;, a monotonically increasing identifier&lt;/li&gt;
&lt;li&gt;&lt;code&gt;embedding_model_version&lt;/code&gt;, which model generated the vectors&lt;/li&gt;
&lt;li&gt;&lt;code&gt;document_count&lt;/code&gt;, how many documents are in the index&lt;/li&gt;
&lt;li&gt;&lt;code&gt;last_updated_timestamp&lt;/code&gt;, when the index was last modified&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Store these as OpenTelemetry span attributes on every &lt;code&gt;rag.retrieval&lt;/code&gt; span.&lt;/p&gt;
&lt;p&gt;Cost: Near zero.
Value: Basic drift detection, version correlation with quality metrics, and historical analysis.&lt;/p&gt;
&lt;p&gt;This is the absolute minimum. If you are not doing this today, start here.&lt;/p&gt;
&lt;h3 id="52-level-1-content-hashing-and-change-detection"&gt;5.2 Level 1: content hashing and change detection&lt;/h3&gt;
&lt;p&gt;At ingestion time, compute a hash for each document and each chunk. Store a &lt;code&gt;document_hash_set_id&lt;/code&gt; that uniquely identifies the set of documents (and their versions) in the index.&lt;/p&gt;
&lt;p&gt;This gets you:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Change detection without diff storage: know &lt;em&gt;that&lt;/em&gt; the index changed between v46 and v47, and &lt;em&gt;which documents&lt;/em&gt; changed.&lt;/li&gt;
&lt;li&gt;Selective re-embedding: only re-embed documents whose content hash changed, saving API costs.&lt;/li&gt;
&lt;li&gt;Audit trail: prove exactly which document versions were indexed at any point.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A practical schema extension for pgvector or similar:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;TABLE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;COLUMN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;content_hash&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;TABLE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;COLUMN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;model_version&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;TABLE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;COLUMN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;indexed_at&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DEFAULT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This pattern is well-documented in production pgvector deployments where content hash and text diff ratios determine whether re-embedding is necessary.&lt;/p&gt;
&lt;h3 id="53-level-2-index-snapshots-with-point-in-time-recovery"&gt;5.3 Level 2: index snapshots with point-in-time recovery&lt;/h3&gt;
&lt;p&gt;This is where you gain the ability to reconstruct past index states.&lt;/p&gt;
&lt;p&gt;Several approaches, depending on your vector store:&lt;/p&gt;
&lt;p&gt;Database-native time travel:
- LanceDB provides built-in versioning with zero-cost snapshots. Every mutation creates a new version. You can check out any historical version and query against it.
- Qdrant supports collection snapshots that can be stored and restored.
- Milvus offers point-in-time recovery through its storage layer.&lt;/p&gt;
&lt;p&gt;Alias-based zero-downtime deployment:
- In Elasticsearch/OpenSearch, create timestamped indexes (&lt;code&gt;rag_index_2026_02_15&lt;/code&gt;) and use aliases (&lt;code&gt;rag_index_current&lt;/code&gt;) to swap atomically. Old indexes remain queryable for audit.&lt;/p&gt;
&lt;p&gt;Storage-based snapshots:
- For FAISS or ChromaDB, periodically snapshot the index files to blob storage (S3, Azure Blob). Tag with the version manifest from Section 3.&lt;/p&gt;
&lt;p&gt;Blue-green deployment for model upgrades:
- When upgrading the embedding model, build the new index in parallel. Route a percentage of traffic to the new index. Compare quality metrics. If acceptable, switch the alias. If not, rollback is instantaneous.&lt;/p&gt;
&lt;h3 id="54-level-3-full-replay-architecture"&gt;5.4 Level 3: full replay architecture&lt;/h3&gt;
&lt;p&gt;Store enough state to deterministically replay any historical request:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Index snapshot reference, which version was active&lt;/li&gt;
&lt;li&gt;Retrieved document IDs and scores, the actual retrieval results&lt;/li&gt;
&lt;li&gt;Exact prompt sent to LLM, including the assembled context&lt;/li&gt;
&lt;li&gt;Model version and parameters: temperature, top_p, model name&lt;/li&gt;
&lt;li&gt;Evaluation metrics: groundedness score, hallucination risk, confidence&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With all five, you can answer:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"What would this query have returned with the index from three months ago?"&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is what makes counterfactual analysis, bias investigations, and regulatory demonstrations possible.&lt;/p&gt;
&lt;p&gt;The Drift-Adapter approach offers an interesting optimization here: instead of maintaining full parallel indexes for every embedding model version, use a lightweight learned transformation layer to map new query embeddings into the legacy embedding space. This achieves 95-99% of retrieval performance recovery at a fraction of the storage cost.&lt;/p&gt;
&lt;h3 id="55-level-4-self-healing-index-with-drift-detection"&gt;5.5 Level 4: self-healing index with drift detection&lt;/h3&gt;
&lt;p&gt;The most mature pattern: the index monitors its own health and triggers corrective actions.&lt;/p&gt;
&lt;p&gt;Continuous monitoring:
- Track retrieval quality metrics (precision@k, nDCG, confidence score distributions) per index version.
- Monitor content freshness: flag embeddings older than a configurable threshold.
- Detect semantic drift: compare embedding distributions between index versions.&lt;/p&gt;
&lt;p&gt;Automated drift detection signals:
- Content hash changes without corresponding re-embedding
- Retrieval score distribution shift beyond a threshold
- Embedding age exceeding freshness policy
- Source dominance index crossing a configured limit&lt;/p&gt;
&lt;p&gt;Automated remediation:
- Trigger selective re-embedding for stale content
- Alert on anomalous quality metric changes
- Initiate alias-based index rebuild if drift exceeds tolerance&lt;/p&gt;
&lt;p&gt;This pattern has been demonstrated in production using Elasticsearch with alias-based rebuilds, where the system continuously monitors query quality and initiates zero-downtime reindexing when drift is detected.&lt;/p&gt;
&lt;h2 id="6-implementing-observability-around-index-versions"&gt;6. Implementing observability around index versions&lt;/h2&gt;
&lt;h3 id="61-opentelemetry-span-design"&gt;6.1 OpenTelemetry span design&lt;/h3&gt;
&lt;p&gt;Model every RAG request as a span hierarchy. The &lt;code&gt;index.version&lt;/code&gt; attribute must be present on the root span and the retrieval span:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;rag.request (root span)
  ├── index.version: &amp;quot;v47&amp;quot;
  ├── embedding.model: &amp;quot;text-embedding-3-large&amp;quot;
  ├── rag.retrieval
  │     ├── top_k: 5
  │     ├── retrieved_doc_ids: [&amp;quot;doc_123&amp;quot;, &amp;quot;doc_456&amp;quot;, ...]
  │     ├── retrieval_scores: [0.92, 0.87, ...]
  │     └── filter_applied: true
  ├── rag.context_build
  │     └── context_token_count: 2400
  ├── rag.generation
  │     ├── model: &amp;quot;gpt-4o&amp;quot;
  │     └── temperature: 0.1
  └── rag.evaluation
        ├── groundedness_score: 0.91
        └── hallucination_risk: 0.08
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This structure is natively understood by Azure Application Insights, Jaeger, SigNoz, and other OpenTelemetry-compatible backends.&lt;/p&gt;
&lt;h3 id="62-metrics-to-track"&gt;6.2 Metrics to track&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;What it reveals&lt;/th&gt;
&lt;th&gt;Alert threshold (example)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Index freshness&lt;/td&gt;
&lt;td&gt;Time since last index update&lt;/td&gt;
&lt;td&gt;&amp;gt; 7 days for active knowledge bases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document churn rate&lt;/td&gt;
&lt;td&gt;Additions/removals per period&lt;/td&gt;
&lt;td&gt;&amp;gt; 20% churn in 24 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embedding model version distribution&lt;/td&gt;
&lt;td&gt;Mixed model versions in index&lt;/td&gt;
&lt;td&gt;Any version mismatch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retrieval score distribution&lt;/td&gt;
&lt;td&gt;Shift in confidence over time&lt;/td&gt;
&lt;td&gt;Mean score drops &amp;gt; 10% week-over-week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source dominance index&lt;/td&gt;
&lt;td&gt;Single source answering disproportionately&lt;/td&gt;
&lt;td&gt;Any source &amp;gt; 60% of retrievals for a topic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unsupported claim rate&lt;/td&gt;
&lt;td&gt;Claims not grounded in retrieved context&lt;/td&gt;
&lt;td&gt;&amp;gt; 5% of responses&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="63-alerting-on-index-related-anomalies"&gt;6.3 Alerting on index-related anomalies&lt;/h3&gt;
&lt;p&gt;Configure alerts for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Hallucination rate spike within 24 hours of a reindexing event&lt;/li&gt;
&lt;li&gt;Sudden drop in average retrieval confidence scores&lt;/li&gt;
&lt;li&gt;Unexpected change in document count (indicating accidental deletion or duplication)&lt;/li&gt;
&lt;li&gt;Retrieval latency spike (possibly caused by index corruption or misconfiguration)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The important thing is correlating quality metrics with index version transitions. Without the version tag, an alert tells you something is wrong. With it, the alert tells you &lt;em&gt;what caused it&lt;/em&gt;.&lt;/p&gt;
&lt;h3 id="64-governance-dashboards"&gt;6.4 Governance dashboards&lt;/h3&gt;
&lt;p&gt;For teams operating RAG in regulated environments:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Index version timeline with quality metrics overlaid, so you can see exactly when quality changed and which index version was responsible.&lt;/li&gt;
&lt;li&gt;Document lifecycle view: when was each document added, modified, or removed from the index?&lt;/li&gt;
&lt;li&gt;Cross-version comparison: side-by-side retrieval results for the same query across two index versions.&lt;/li&gt;
&lt;li&gt;Regulatory audit view: for a given date range, show all index versions that were active and their associated quality metrics.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="7-practical-decision-framework"&gt;7. Practical decision framework&lt;/h2&gt;
&lt;p&gt;Use this decision tree to choose your versioning level:&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart TD
    Start([What versioning level do you need?]) --&gt; Q1{Regulated domain?&lt;br&gt;finance, healthcare,&lt;br&gt;employment, legal}
    Q1 --&gt;|Yes| R1[Level 2+ with&lt;br&gt;immutable audit trail]
    Q1 --&gt;|No| Q2{Index changes&lt;br&gt;more than weekly?}
    Q2 --&gt;|Yes| R2[Level 1+&lt;br&gt;change detection]
    Q2 --&gt;|No| Q3{Need to explain&lt;br&gt;past answers?}
    Q3 --&gt;|Yes| R3[Level 3&lt;br&gt;replay capability]
    Q3 --&gt;|No| Q4{High-risk under&lt;br&gt;EU AI Act?}
    Q4 --&gt;|Yes| R4[Level 3-4&lt;br&gt;immutable logs +&lt;br&gt;point-in-time reconstruction]
    Q4 --&gt;|No| Q5{Experimenting with&lt;br&gt;embeddings or chunking?}
    Q5 --&gt;|Yes| R5[Level 2+&lt;br&gt;blue-green + A/B]
    Q5 --&gt;|No| R6[Level 0&lt;br&gt;metadata tagging&lt;br&gt;is sufficient]

    style R1 fill:#ef9a9a,stroke:#333,color:#000
    style R4 fill:#ef9a9a,stroke:#333,color:#000
    style R3 fill:#ffcc80,stroke:#333,color:#000
    style R2 fill:#fff9c4,stroke:#333,color:#000
    style R5 fill:#fff9c4,stroke:#333,color:#000
    style R6 fill:#c8e6c9,stroke:#333,color:#000
&lt;/pre&gt;

&lt;h2 id="8-common-pitfalls"&gt;8. Common pitfalls&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;You version the vectors but forget the chunking strategy.&lt;/strong&gt; The same document, chunked differently, retrieves differently and answers differently. Change your chunking approach without bumping the index version and your version history lies to you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Nobody tracks prompt template changes.&lt;/strong&gt; Index is the same. Model is the same. But someone tweaked the system prompt last Tuesday, and now identical retrieved context produces different answers. If prompt versions aren't in your audit trail, you have a blind spot.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;You snapshot the whole index when you don't need to.&lt;/strong&gt; A million documents at 1536 dimensions is several gigabytes per snapshot. Often, storing the document hash set, the embedding model version, and the index config is enough to rebuild the index on demand. You don't always need the raw vectors sitting in cold storage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No retention policy.&lt;/strong&gt; Keeping every version forever gets expensive and may run afoul of GDPR data minimization rules. Set retention windows based on actual regulatory requirements: maybe six months for an internal tool, five to ten years for healthcare or finance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;You treat versioning as a one-time setup.&lt;/strong&gt; Pipelines change. New data sources show up. Embedding models get upgraded. Chunking strategies evolve. If your versioning system can't keep up, it becomes stale metadata that nobody trusts. Build it as infrastructure you maintain, not a script you ran once.&lt;/p&gt;
&lt;h2 id="9-index-versioning-as-a-first-class-concern"&gt;9. Index versioning as a first-class concern&lt;/h2&gt;
&lt;p&gt;The retrieval index is the part of your RAG system that changes the most and gets governed the least. It shifts more often than the model, more quietly than the prompt, and its effect on answer quality is larger than either.&lt;/p&gt;
&lt;p&gt;Versioning your index is not optional for any team that cares about debugging, compliance, or evaluation. Without it, you can't explain past answers, you can't detect drift, and you can't roll back when something breaks.&lt;/p&gt;
&lt;p&gt;How much versioning you need depends on what's at stake:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Level 0: tag every request with &lt;code&gt;index_version&lt;/code&gt; and &lt;code&gt;embedding_model_version&lt;/code&gt;. It costs nothing. Do this today.&lt;/li&gt;
&lt;li&gt;Level 1-2: add these when your index changes often or when someone asks why the system gave a particular answer last month.&lt;/li&gt;
&lt;li&gt;Level 3-4: invest here when regulators require it or when your system's outputs carry real consequences for real people.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The question most RAG systems can't answer today is straightforward:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"What did the system know, and how did it use that knowledge, on a specific date?"&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you can answer that, you're ahead of almost everyone.&lt;/p&gt;
&lt;script type="module"&gt;
    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
    mermaid.initialize({ startOnLoad: true });
&lt;/script&gt;</content><category term="Machine Learning"/><category term="rag"/><category term="observability"/><category term="explainability"/><category term="versioning"/><category term="eu-ai-act"/><category term="vector-database"/><category term="data-drift"/><category term="compliance"/><category term="audit"/></entry><entry><title>Publishing a Python CLI Tool to Homebrew</title><link href="https://www.safjan.com/publishing-python-cli-tool-to-homebrew/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-02-05T00:00:00+01:00</published><updated>2026-02-05T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-02-05:/publishing-python-cli-tool-to-homebrew/</id><summary type="html">&lt;h2 id="overview"&gt;Overview&lt;/h2&gt;
&lt;p&gt;Homebrew distribution lets users install your Python CLI with &lt;code&gt;brew install your-tool&lt;/code&gt;. The process: create a GitHub "tap" repository, generate a Ruby formula file, test it, and publish.&lt;/p&gt;
&lt;h2 id="prerequisites"&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;Your package must be on PyPI with an &lt;code&gt;sdist&lt;/code&gt; (source distribution), not …&lt;/p&gt;</summary><content type="html">&lt;h2 id="overview"&gt;Overview&lt;/h2&gt;
&lt;p&gt;Homebrew distribution lets users install your Python CLI with &lt;code&gt;brew install your-tool&lt;/code&gt;. The process: create a GitHub "tap" repository, generate a Ruby formula file, test it, and publish.&lt;/p&gt;
&lt;h2 id="prerequisites"&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;Your package must be on PyPI with an &lt;code&gt;sdist&lt;/code&gt; (source distribution), not just wheels. All dependencies need &lt;code&gt;sdist&lt;/code&gt; too. Define entry points in &lt;code&gt;pyproject.toml&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;[project.scripts]&lt;/span&gt;
&lt;span class="n"&gt;your-cli-tool&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;your_package.cli:main&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="step-by-step-process"&gt;Step-by-Step Process&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;1. Create a tap repository.&lt;/strong&gt; A Homebrew tap is just a GitHub repository with a specific naming convention. Create a repo named &lt;code&gt;homebrew-{something}&lt;/code&gt; under your account. When someone runs &lt;code&gt;brew tap yourusername/something&lt;/code&gt;, Homebrew looks for &lt;code&gt;github.com/yourusername/homebrew-something&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Users will install your tool like this:&lt;/span&gt;
brew&lt;span class="w"&gt; &lt;/span&gt;tap&lt;span class="w"&gt; &lt;/span&gt;yourusername/tools
brew&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;your-cli-tool

&lt;span class="c1"&gt;# Or in one command using the full path:&lt;/span&gt;
brew&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;yourusername/tools/your-cli-tool
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Inside your tap repository, create a &lt;code&gt;Formula&lt;/code&gt; directory for your Ruby &lt;code&gt;.rb&lt;/code&gt; formula files. See &lt;a href="https://docs.brew.sh/How-to-Create-and-Maintain-a-Tap"&gt;How to Create and Maintain a Tap&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Generate the formula&lt;/strong&gt; using &lt;a href="https://github.com/tdsmith/homebrew-pypi-poet"&gt;homebrew-pypi-poet&lt;/a&gt; in a fresh venv:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/tmp&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;venv&lt;span class="w"&gt; &lt;/span&gt;venv&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;venv/bin/activate
pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;your-cli-tool&lt;span class="w"&gt; &lt;/span&gt;homebrew-pypi-poet
poet&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;your-cli-tool&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;your-cli-tool.rb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;3. Test locally&lt;/strong&gt; before pushing:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nv"&gt;HOMEBREW_NO_INSTALL_FROM_API&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;brew&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;--build-from-source&lt;span class="w"&gt; &lt;/span&gt;--verbose&lt;span class="w"&gt; &lt;/span&gt;--debug&lt;span class="w"&gt; &lt;/span&gt;./your-cli-tool.rb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;4. Push to GitHub.&lt;/strong&gt; Users install via &lt;code&gt;brew install yourusername/tapname/your-cli-tool&lt;/code&gt;.&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart LR
    A[Create tap repo] --&gt; B[Generate formula with poet]
    B --&gt; C[Test locally]
    C --&gt; D[Push to GitHub]
    D --&gt; E[Users can brew install]
&lt;/pre&gt;

&lt;h2 id="common-pitfalls"&gt;Common Pitfalls&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Missing sdist&lt;/strong&gt; - Package or dependency lacks source distribution on PyPI&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Contaminated poet env&lt;/strong&gt; - Always use a fresh venv for formula generation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No license&lt;/strong&gt; - Required for homebrew-core submission&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="going-further"&gt;Going Further&lt;/h2&gt;
&lt;p&gt;For homebrew-core submission (pre-compiled bottles, &lt;code&gt;brew search&lt;/code&gt; discoverability), see &lt;a href="https://github.com/Homebrew/homebrew-core/blob/master/CONTRIBUTING.md"&gt;homebrew-core CONTRIBUTING&lt;/a&gt;. For automated updates, see &lt;a href="https://til.simonwillison.net/homebrew/auto-formulas-github-actions"&gt;GitHub Actions automation&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="sources-and-further-reading"&gt;Sources and Further Reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/homebrew/packaging-python-cli-for-homebrew"&gt;Packaging a Python CLI tool for Homebrew&lt;/a&gt; - Simon Willison's comprehensive walkthrough&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.brew.sh/Python-for-Formula-Authors"&gt;Python for Formula Authors&lt;/a&gt; - Official Homebrew Python documentation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.brew.sh/Formula-Cookbook"&gt;Formula Cookbook&lt;/a&gt; - Writing and testing formulas&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/tdsmith/homebrew-pypi-poet"&gt;homebrew-pypi-poet&lt;/a&gt; - Formula generation tool&lt;/li&gt;
&lt;/ul&gt;
&lt;script type="module"&gt;
    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
    mermaid.initialize({ startOnLoad: true });
&lt;/script&gt;</content><category term="note"/><category term="python"/><category term="homebrew"/><category term="cli"/><category term="packaging"/><category term="distribution"/><category term="macos"/></entry><entry><title>The limiting factor at work isn't writing code anymore</title><link href="https://www.safjan.com/the-limiting-factor-at-work-isnt-writing-code-anymore/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-01-28T00:00:00+01:00</published><updated>2026-01-28T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-01-28:/the-limiting-factor-at-work-isnt-writing-code-anymore/</id><summary type="html">&lt;p&gt;In a Hacker News &lt;a href="https://news.ycombinator.com/item?id=46782811"&gt;discussion&lt;/a&gt; on the article &lt;a href="https://www.oneusefulthing.org/p/management-as-ai-superpower"&gt;Management as AI Superpower&lt;/a&gt; by Ethan Mollick, I came across a comment that resonated strongly with a realization I’ve had recently while coding with a factory.ai droid:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The limiting factor at work …&lt;/p&gt;&lt;/blockquote&gt;</summary><content type="html">&lt;p&gt;In a Hacker News &lt;a href="https://news.ycombinator.com/item?id=46782811"&gt;discussion&lt;/a&gt; on the article &lt;a href="https://www.oneusefulthing.org/p/management-as-ai-superpower"&gt;Management as AI Superpower&lt;/a&gt; by Ethan Mollick, I came across a comment that resonated strongly with a realization I’ve had recently while coding with a factory.ai droid:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The limiting factor at work isn’t writing code anymore. It’s deciding what to build and catching when things go sideways.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(&lt;a href="https://news.ycombinator.com/item?id=46783415"&gt;link&lt;/a&gt; by &lt;a href="https://news.ycombinator.com/user?id=augusteo"&gt;augusteo&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;This observation forces a shift in priorities.&lt;/p&gt;
&lt;h2 id="deciding-what-to-build"&gt;Deciding what to build&lt;/h2&gt;
&lt;p&gt;Creative minds tend to have no shortage of ideas. With agentic coding, it’s incredibly tempting to start immediately: the goal feels only “a few hours of AI agent work” away, something you can even run in the background.&lt;/p&gt;
&lt;p&gt;Starting to build something immediately, however, incurs real costs:
- mental engagement
Once you commit, you have to keep making decisions: should you continue, pivot, or drop the project? Dropping it comes with a psychological cost—unfinished work, a sense of abandonment, or the feeling of being unable to finish. These are easy mental traps to fall into. You end up with one more thing occupying your cognitive space.&lt;/p&gt;
&lt;p&gt;The unfinished projects creating cognitive and emotional debt - “psychological technical debt”.  AI accelerates its accumulation by lowering the barrier to starting but not to finishing.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;expected returns on investment
You still need to ask “why” you are building something—ideally several times, as in the &lt;a href="https://en.wikipedia.org/wiki/Five_whys"&gt;Five Whys&lt;/a&gt; method. Even in the era of AI agents, resources remain limited. Your time is certainly finite. I’ll set aside, for now, considerations about token costs—both in dollars and in the broader sense of shared planetary resources.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Decision latency can be a competitive advantage. This could be expanded into a framework for “pre-coding decision filters” in an agentic world: how to deliberately slow down before starting, when everything pushes you to start immediately.&lt;/p&gt;
&lt;h2 id="catching-when-things-go-sideways"&gt;Catching when things go sideways&lt;/h2&gt;
&lt;p&gt;In many of my projects, there hasn’t been enough time to add strong guardrails for quickly detecting when things start to go wrong. Test suites need to be designed with clear principles and intent.&lt;/p&gt;
&lt;p&gt;Today, it’s easy to get close to 100% test coverage with AI agents. The problem lies deeper, in how models are trained: they are optimized to produce an answer rather than to challenge assumptions. I’ve experienced that when a test suite is AI-generated from existing code, it can be fundamentally flawed. The tests often mirror the current implementation of the business logic—even if that logic is incorrect.&lt;/p&gt;
&lt;p&gt;Without solid specifications or clearly stated expectations, generated tests tend to confirm what already exists rather than falsify it. They won’t catch flaws unless there is a reliable external reference, such as a well-defined specification. “AI tests can be bad,” is known  but more: they systematically reflect existing assumptions unless anchored to external specifications. This confirmatory nature of AI-generated tests is a blind spot for teams that want to quickly close testing gaps and reduce technical debt.&lt;/p&gt;
&lt;p&gt;From a Python-specific perspective, extensive usage of typing mechanisms available in Python can help catching bugs and inconsistencies in implementation. &lt;/p&gt;
&lt;h2 id="other-thoughts-from-the-discussion"&gt;Other thoughts from the discussion&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;The skills that matter are the same ones that make someone a good manager of people.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(&lt;a href="https://news.ycombinator.com/item?id=46783415"&gt;link&lt;/a&gt; by &lt;a href="https://news.ycombinator.com/user?id=augusteo"&gt;augusteo&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;The current situation creates pressure for a fast track into managerial competence. Developers supervising and mentoring AI agents get many hours of practice in planning, delegating, and communicating work in a clear, actionable, and precise way. This becomes a kind of manager’s dojo: a training ground entered almost automatically when stepping into AI-assisted coding. Agent supervision exercises the same muscles as people management: planning, delegation, expectation-setting, and review. This reframes career progression and challenges the idea that management skills only come from managing people. I expect in a near future an abundance of technically inclined people with strong managerial skills, even if they have never formally led a human team.&lt;/p&gt;
&lt;p&gt;Another comment adds an important nuance:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The apparent speedup only holds if we ignore the cost of comprehension and review; once those are included, the comparison becomes less about raw code throughput and more about where and how understanding is generated in the process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(&lt;a href="https://news.ycombinator.com/item?id=46792118"&gt;link&lt;/a&gt; by &lt;a href="https://news.ycombinator.com/user?id=ithkuil"&gt;ithkuil&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;This reframes productivity not as code output, but as the efficiency of understanding.&lt;/p&gt;</content><category term="note"/><category term="agentic-coding"/><category term="vibe-coding"/><category term="software-development"/><category term="software-project"/></entry><entry><title>Using MLflow-RAGAS Integration Without Tracing</title><link href="https://www.safjan.com/using-mlflowragas-integration-without-tracing/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-01-11T00:00:00+01:00</published><updated>2026-01-11T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-01-11:/using-mlflowragas-integration-without-tracing/</id><summary type="html">&lt;ul&gt;
&lt;li&gt;&lt;a href="#option-1-static-data-approach-recommended"&gt;Option 1: Static Data Approach (Recommended)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#key-points"&gt;Key Points&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#option-2-simple-predict_fn-no-tracing-decorator"&gt;Option 2: Simple predict_fn (No Tracing Decorator)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#comparison-when-to-use-each-approach"&gt;Comparison: When to Use Each Approach&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#complete-working-example"&gt;Complete Working Example&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#recommendation"&gt;Recommendation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can use MLflow's RAGAS integration to calculate evaluation scores and log them to experiments &lt;strong&gt;without&lt;/strong&gt; implementing …&lt;/p&gt;</summary><content type="html">&lt;ul&gt;
&lt;li&gt;&lt;a href="#option-1-static-data-approach-recommended"&gt;Option 1: Static Data Approach (Recommended)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#key-points"&gt;Key Points&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#option-2-simple-predict_fn-no-tracing-decorator"&gt;Option 2: Simple predict_fn (No Tracing Decorator)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#comparison-when-to-use-each-approach"&gt;Comparison: When to Use Each Approach&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#complete-working-example"&gt;Complete Working Example&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#recommendation"&gt;Recommendation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can use MLflow's RAGAS integration to calculate evaluation scores and log them to experiments &lt;strong&gt;without&lt;/strong&gt; implementing MLflow traces in your RAG pipeline. This is useful when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You have an existing RAG pipeline you don't want to modify&lt;/li&gt;
&lt;li&gt;You're doing batch evaluation of pre-computed results&lt;/li&gt;
&lt;li&gt;You want to quickly test RAGAS metrics without full instrumentation&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="option-1-static-data-approach-recommended"&gt;Option 1: Static Data Approach (Recommended)&lt;/h2&gt;
&lt;p&gt;Pre-compute your RAG outputs, then pass everything to &lt;code&gt;mlflow.genai.evaluate()&lt;/code&gt; with the &lt;code&gt;outputs&lt;/code&gt; field already populated.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;mlflow&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;mlflow.genai.scorers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Faithfulness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FactualCorrectness&lt;/span&gt;

&lt;span class="c1"&gt;# Your existing RAG pipeline (no MLflow code needed)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_rag_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;Your RAG implementation - completely unchanged.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;my_retriever&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;my_llm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;

&lt;span class="c1"&gt;# Golden dataset&lt;/span&gt;
&lt;span class="n"&gt;golden_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What is MLflow?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;MLflow is an open-source platform for ML lifecycle management.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;contexts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;MLflow is an open-source platform...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Expected contexts&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;# ... more samples&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Step 1: Run your RAG pipeline and collect outputs&lt;/span&gt;
&lt;span class="n"&gt;eval_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;golden_dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retrieved_contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;my_rag_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;inputs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;outputs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;retrieved_contexts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retrieved_contexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Required for Faithfulness&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;expectations&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;expected_answer&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;contexts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;  &lt;span class="c1"&gt;# For ContextRecall&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Step 2: Run evaluation with static data (no predict_fn)&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;rag-evaluation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;static-data-evaluation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;evaluation_mode&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;static_data&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;num_samples&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Data already includes outputs&lt;/span&gt;
        &lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;Faithfulness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openai:/gpt-4o-mini&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;FactualCorrectness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openai:/gpt-4o-mini&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Results logged to run: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active_run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="key-points"&gt;Key Points&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;No &lt;code&gt;predict_fn&lt;/code&gt; needed&lt;/strong&gt; - outputs are pre-computed&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;retrieved_contexts&lt;/code&gt; in outputs&lt;/strong&gt; - Required for Faithfulness metric to work&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Your RAG code stays unchanged&lt;/strong&gt; - No decorators or MLflow imports needed in your pipeline&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="option-2-simple-predict_fn-no-tracing-decorator"&gt;Option 2: Simple predict_fn (No Tracing Decorator)&lt;/h2&gt;
&lt;p&gt;If you prefer the &lt;code&gt;predict_fn&lt;/code&gt; approach but don't want tracing:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple_rag_predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;Wrapper around your RAG - no @mlflow.trace decorator.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;my_retriever&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;my_llm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;retrieved_contexts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Evaluation data without outputs (they&amp;#39;ll be generated)&lt;/span&gt;
&lt;span class="n"&gt;eval_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;inputs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;expectations&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;expected_answer&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;contexts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;golden_dataset&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;predict_fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;simple_rag_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="comparison-when-to-use-each-approach"&gt;Comparison: When to Use Each Approach&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Tracing&lt;/th&gt;
&lt;th&gt;Faithfulness&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static data&lt;/strong&gt; (Option 1)&lt;/td&gt;
&lt;td&gt;Not needed&lt;/td&gt;
&lt;td&gt;Works (needs &lt;code&gt;retrieved_contexts&lt;/code&gt; in outputs)&lt;/td&gt;
&lt;td&gt;Existing pipelines, batch evaluation, CI/CD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;predict_fn with @mlflow.trace&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;Works (reads from RETRIEVER spans)&lt;/td&gt;
&lt;td&gt;New pipelines, debugging, detailed traces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;predict_fn without tracing&lt;/strong&gt; (Option 2)&lt;/td&gt;
&lt;td&gt;Not needed&lt;/td&gt;
&lt;td&gt;Works (needs &lt;code&gt;retrieved_contexts&lt;/code&gt; in return)&lt;/td&gt;
&lt;td&gt;Quick integration, simple pipelines&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="complete-working-example"&gt;Complete Working Example&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;Complete example: Evaluate any RAG pipeline with RAGAS metrics in MLflow.&lt;/span&gt;
&lt;span class="sd"&gt;No tracing required in your RAG implementation.&lt;/span&gt;
&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;mlflow&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;mlflow.genai.scorers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Faithfulness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FactualCorrectness&lt;/span&gt;

&lt;span class="c1"&gt;# Configure MLflow&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_tracking_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://localhost:5000&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Or your tracking server&lt;/span&gt;
&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;my-rag-evaluation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Your RAG pipeline (example - replace with your implementation)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyRAGPipeline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
        &lt;span class="n"&gt;contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Context: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Answer:&amp;quot;&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize your pipeline&lt;/span&gt;
&lt;span class="n"&gt;rag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MyRAGPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Your evaluation dataset&lt;/span&gt;
&lt;span class="n"&gt;test_questions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What is X?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;X is...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;How does Y work?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Y works by...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Build evaluation data with pre-computed outputs&lt;/span&gt;
&lt;span class="n"&gt;eval_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;inputs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;question&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;outputs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;retrieved_contexts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;expectations&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;expected_answer&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ground_truth&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Run evaluation&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;rag-eval-no-tracing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Log your pipeline configuration&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_params&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;retriever_k&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;llm_model&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;gpt-4o-mini&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;num_samples&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# Evaluate with RAGAS scorers&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;Faithfulness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openai:/gpt-4o-mini&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;FactualCorrectness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openai:/gpt-4o-mini&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Results are automatically logged to MLflow&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Evaluation complete!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;View results: mlflow ui --port 5000&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Run ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active_run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Access results programmatically&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;eval_results_table&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="recommendation"&gt;Recommendation&lt;/h2&gt;
&lt;p&gt;For existing RAG pipelines, use &lt;strong&gt;Option 1 (Static Data)&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Minimal changes&lt;/strong&gt; - Your RAG code stays completely unchanged&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batch-friendly&lt;/strong&gt; - Run your pipeline once, evaluate multiple times with different scorers&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CI/CD compatible&lt;/strong&gt; - Easy to integrate into automated testing pipelines&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Full metric support&lt;/strong&gt; - All RAGAS metrics work when you include &lt;code&gt;retrieved_contexts&lt;/code&gt; in outputs&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The only requirement is that your RAG pipeline returns both the answer and the retrieved contexts, which most RAG implementations already do.&lt;/p&gt;</content><category term="note"/><category term="mlflow"/><category term="ragas"/><category term="RAG-evaluation"/><category term="rag-optimization"/><category term="rag"/></entry><entry><title>RAG Evaluation with RAGAS and MLflow - A Practical Guide</title><link href="https://www.safjan.com/ragas-mlflow-rag-evaluation-tutorial/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2026-01-08T00:00:00+01:00</published><updated>2026-01-11T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2026-01-08:/ragas-mlflow-rag-evaluation-tutorial/</id><summary type="html">&lt;p&gt;A comprehensive tutorial demonstrating RAG evaluation using RAGAS metrics through MLflow integration. Learn to build a minimal RAG pipeline with LangChain, create golden evaluation datasets, and systematically assess retrieval quality using Faithfulness, Context Precision, Context Recall, and Factual Correctness metrics. Supports OpenAI, Azure OpenAI, and Ollama backends.&lt;/p&gt;</summary><content type="html">&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h1 id="RAG-Evaluation-with-RAGAS-and-MLflow"&gt;RAG Evaluation with RAGAS and MLflow&lt;/h1&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;p&gt;&lt;strong&gt;Table of contents&lt;/strong&gt;&lt;a id="toc0_"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc1_"&gt;Why This Tutorial?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc2_"&gt;What You'll Learn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc3_"&gt;Prerequisites&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc4_"&gt;Setup and Configuration&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc4_1_"&gt;LLM Provider Configuration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc5_"&gt;Sample Knowledge Base&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc6_"&gt;Minimal RAG Pipeline&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc6_1_"&gt;Enable MLflow Tracing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc7_"&gt;Load Golden Dataset&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc7_1_"&gt;Generate RAG Responses for Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc8_"&gt;RAGAS Evaluation with MLflow&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc8_1_"&gt;Prepare Evaluation Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc9_"&gt;MLflow Results Analysis&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc9_1_"&gt;Interpreting RAGAS Scores&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc10_"&gt;Common Pitfalls and Solutions&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc10_1_"&gt;Model URI Format&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc10_2_"&gt;Function Signature Must Match Input Keys&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc10_3_"&gt;RETRIEVER Spans for Context Metrics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc10_4_"&gt;Judge Model Limitations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc11_"&gt;Extras: Comparing RAG Variants with MLflow&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc11_1_"&gt;Example: Comparing Chunk Sizes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc11_2_"&gt;Comparing Results in MLflow UI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc12_"&gt;How to inspect results in MLflow UI:&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc12_1_"&gt;Select the experiment to inspect&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc12_2_"&gt;Configure comparison of the runs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc12_3_"&gt;View Comparison Results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc13_"&gt;More from MLflow&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#toc13_1_"&gt;built-in metrics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc13_2_"&gt;Guidelines-based LLM Scorers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc13_3_"&gt;MCP server&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#toc14_"&gt;References, further reading&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=2
	maxLevel=6
	/vscode-jupyter-toc-config --&gt;
&lt;!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL --&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="Why-This-Tutorial?"&gt;&lt;a id="toc1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Why This Tutorial?&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Evaluating RAG pipelines is surprisingly difficult. You can build a working retrieval system in an afternoon, but answering "Is it actually good?" requires systematic measurement.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The challenge:&lt;/strong&gt; Manual evaluation doesn't scale. Eyeballing a few responses tells you almost nothing about overall quality. You need metrics that capture different aspects of RAG performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Is the retriever finding relevant documents?&lt;/li&gt;
&lt;li&gt;Is the LLM staying faithful to the retrieved context (not hallucinating)?&lt;/li&gt;
&lt;li&gt;Are the answers factually correct?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; RAGAS provides standardized metrics for RAG evaluation. MLflow provides experiment tracking. Together, they enable systematic, reproducible evaluation that you can run on every pipeline change.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What made this tricky:&lt;/strong&gt; The MLflow-RAGAS integration looks simple in the docs, but getting it to work with a real LangChain pipeline required navigating several non-obvious requirements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Specific model URI formats for different providers&lt;/li&gt;
&lt;li&gt;Function signatures that match MLflow's expectations&lt;/li&gt;
&lt;li&gt;Proper tracing spans for context-aware metrics&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This tutorial documents what actually works, including the gotchas I encountered along the way.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;p&gt;This tutorial demonstrates how to evaluate Retrieval-Augmented Generation (RAG) systems using &lt;strong&gt;RAGAS&lt;/strong&gt; (Retrieval Augmented Generation Assessment) metrics through &lt;strong&gt;MLflow&lt;/strong&gt; integration.&lt;/p&gt;
&lt;p&gt;NOTE: RAGAS is a third-party evaluation library. For more details, visit the &lt;a href="https://github.com/explodinggradients/ragas"&gt;RAGAS GitHub repository&lt;/a&gt;. At the time of writing (January 2026), apart from RAGAS, MLFlow supports another &lt;a href="https://mlflow.org/docs/latest/genai/eval-monitor/scorers/third-party/"&gt;third-party scorer/evaluation&lt;/a&gt; library: DeepEval.&lt;/p&gt;
&lt;h2 id="What-You'll-Learn"&gt;&lt;a id="toc2_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;What You'll Learn&lt;/a&gt;&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Build a minimal RAG pipeline&lt;/strong&gt; using LangChain and FAISS&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Create a golden evaluation dataset&lt;/strong&gt; with expected answers&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Evaluate RAG quality&lt;/strong&gt; using RAGAS metrics (Faithfulness, Context Precision, Context Recall, Factual Correctness)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Track results in MLflow&lt;/strong&gt; for systematic comparison&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Support multiple LLM providers&lt;/strong&gt;: OpenAI, Azure OpenAI, and Ollama&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="Prerequisites"&gt;&lt;a id="toc3_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Prerequisites&lt;/a&gt;&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Python 3.10+&lt;/li&gt;
&lt;li&gt;API key for your chosen LLM provider&lt;/li&gt;
&lt;li&gt;Basic understanding of RAG concepts&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="Setup-and-Configuration"&gt;&lt;a id="toc4_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Setup and Configuration&lt;/a&gt;&lt;/h2&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;warnings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pd&lt;/span&gt;

&lt;span class="n"&gt;warnings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filterwarnings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"ignore"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="LLM-Provider-Configuration"&gt;&lt;a id="toc4_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;LLM Provider Configuration&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;This tutorial supports three LLM providers. Choose your provider and configure the appropriate environment variables:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Required Environment Variables&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;OPENAI_API_KEY&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure OpenAI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AZURE_OPENAI_ENDPOINT&lt;/code&gt;, &lt;code&gt;AZURE_OPENAI_API_KEY&lt;/code&gt;, &lt;code&gt;AZURE_OPENAI_DEPLOYMENT_NAME&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;None (runs locally on &lt;code&gt;http://localhost:11434&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LLMProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;OPENAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"openai"&lt;/span&gt;
    &lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"azure_openai"&lt;/span&gt;
    &lt;span class="n"&gt;OLLAMA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ollama"&lt;/span&gt;


&lt;span class="c1"&gt;# === CONFIGURE YOUR PROVIDER HERE ===&lt;/span&gt;
&lt;span class="n"&gt;PROVIDER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;

&lt;span class="c1"&gt;# Model names per provider&lt;/span&gt;
&lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_DEPLOYMENT_NAME"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_EMBEDDING_DEPLOYMENT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"llama3.2:3b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"nomic-embed-text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Using provider: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Chat model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'chat_model'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Embedding model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'embedding_model'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Using provider: azure_openai
Chat model: gpt-4o-mini
Embedding model: text-embedding-ada-002-v2
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_environment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Validate required environment variables for the selected provider."""&lt;/span&gt;
    &lt;span class="n"&gt;required_vars&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s2"&gt;"AZURE_OPENAI_ENDPOINT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;"AZURE_OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;missing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;required_vars&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;EnvironmentError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Missing environment variables for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Please set them before continuing."&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Set provider-specific env vars for litellm (used by RAGAS scorers)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setdefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"OLLAMA_API_BASE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"http://localhost:11434"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"OLLAMA_API_BASE set to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'OLLAMA_API_BASE'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Set litellm Azure env vars from Azure OpenAI vars&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_ENDPOINT"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_API_BASE"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_API_BASE"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_ENDPOINT"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setdefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_API_VERSION"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_API_VERSION"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"2024-02-01"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Azure litellm env vars configured"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Environment validated for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;validate_environment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Azure litellm env vars configured
Environment validated for azure_openai
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AzureChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AzureOpenAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_community.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_community.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOllama&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Factory function to create LLM instance based on provider."""&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;AzureChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;azure_deployment&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;api_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_API_VERSION"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"2024-02-01"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ChatOllama&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:11434"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Factory function to create embeddings instance based on provider."""&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;AzureOpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;azure_deployment&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;api_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AZURE_OPENAI_API_VERSION"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"2024-02-01"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:11434"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_mlflow_model_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Get MLflow model URI for RAGAS scorers (uses litellm format)."""&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"openai:/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'chat_model'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AZURE_OPENAI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Azure format: azure/&amp;lt;deployment_name&amp;gt;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"azure:/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'chat_model'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LLMProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Ollama format for litellm: ollama/&amp;lt;model_name&amp;gt;&lt;/span&gt;
        &lt;span class="c1"&gt;# Note: ollama_chat format has issues with litellm, use ollama/ prefix&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"ollama:/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'chat_model'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;


&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;mlflow_model_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_mlflow_model_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"LLM initialized: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Embeddings initialized: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"MLflow model URI: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mlflow_model_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;LLM initialized: AzureChatOpenAI
Embeddings initialized: AzureOpenAIEmbeddings
MLflow model URI: azure:/gpt-4o-mini
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="Sample-Knowledge-Base"&gt;&lt;a id="toc5_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Sample Knowledge Base&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;We'll create a small knowledge base about &lt;strong&gt;MLflow&lt;/strong&gt; - fitting for a tutorial that uses MLflow for evaluation! This dataset contains key concepts that our RAG system will retrieve from.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Load knowledge base from external file&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data/knowledge_base.json"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;KNOWLEDGE_BASE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Knowledge base contains &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KNOWLEDGE_BASE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; documents"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KNOWLEDGE_BASE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;preview&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;preview&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Knowledge base contains 20 documents
  1. MLflow Tracking is an API and UI for logging parameters, code versions, metrics,...
  2. The MLflow Model Registry is a centralized model store that provides model linea...
  3. MLflow GenAI provides specialized tools for developing and evaluating generative...
  4. RAGAS (Retrieval Augmented Generation Assessment) is an evaluation framework int...
  5. MLflow Projects package code in a reusable, reproducible form. A project is simp...
  6. MLflow's autolog feature automatically logs metrics, parameters, and models duri...
  7. The MLflow Model format is a standard format for packaging machine learning mode...
  8. Evaluation in MLflow can be performed using mlflow.evaluate() for traditional ML...
  9. MLflow Model Serving enables deploying models as REST API endpoints. You can ser...
  10. MLflow Recipes (formerly MLflow Pipelines) provide predefined templates for comm...
  11. The MLflow CLI provides commands for running projects, serving models, and manag...
  12. MLflow's REST API allows programmatic access to the tracking server. Endpoints i...
  13. MLflow experiments organize runs into logical groups. Each experiment has a uniq...
  14. MLflow provides run comparison capabilities through the UI and API. The Compare ...
  15. MLflow artifacts are files associated with runs, such as models, data files, and...
  16. Model signatures in MLflow define the expected input and output schema for model...
  17. MLflow on Databricks provides managed MLflow tracking, model registry, and model...
  18. MLflow supports multiple environment managers for reproducibility. Projects can ...
  19. MLflow Prompt Engineering tools help develop and version prompts for LLM applica...
  20. MLflow integrates with LangChain through mlflow.langchain module. The integratio...
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_community.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_core.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Document&lt;/span&gt;

&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;KNOWLEDGE_BASE&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Split into &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; chunks"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Split into 20 chunks
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"k"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Vector store created with &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ntotal&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; vectors"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Vector store created with 20 vectors
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;test_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"What metrics does RAGAS provide?"&lt;/span&gt;
&lt;span class="n"&gt;test_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Test query: '&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;test_query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Retrieved &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; documents:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;--- Document &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; ---"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Test query: 'What metrics does RAGAS provide?'
Retrieved 3 documents:

--- Document 1 ---
RAGAS (Retrieval Augmented Generation Assessment) is an evaluation framework integrated with MLflow for assessing RAG pipelines. Key metrics include: Faithfulness (measures if the answer is grounded i...

--- Document 2 ---
MLflow GenAI provides specialized tools for developing and evaluating generative AI applications. It includes mlflow.genai.evaluate() for systematic assessment of LLM outputs using configurable scorer...

--- Document 3 ---
Evaluation in MLflow can be performed using mlflow.evaluate() for traditional ML models or mlflow.genai.evaluate() for generative AI applications. For GenAI, evaluation uses Scorer objects that can be...
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="Minimal-RAG-Pipeline"&gt;&lt;a id="toc6_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Minimal RAG Pipeline&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;We'll build a simple RAG chain using LangChain's LCEL (LangChain Expression Language) that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Retrieves relevant context from our FAISS vector store&lt;/li&gt;
&lt;li&gt;Formats a prompt with the context and question&lt;/li&gt;
&lt;li&gt;Generates an answer using the LLM&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_core.output_parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain_core.runnables&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;

&lt;span class="n"&gt;RAG_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"""&lt;/span&gt;
&lt;span class="s2"&gt;You are a helpful assistant answering questions about MLflow.&lt;/span&gt;
&lt;span class="s2"&gt;Use ONLY the following context to answer the question.&lt;/span&gt;
&lt;span class="s2"&gt;If the context doesn't contain the answer, say "I don't have enough information to answer this question."&lt;/span&gt;

&lt;span class="s2"&gt;Context:&lt;/span&gt;
&lt;span class="si"&gt;{context}&lt;/span&gt;

&lt;span class="s2"&gt;Question: &lt;/span&gt;&lt;span class="si"&gt;{question}&lt;/span&gt;

&lt;span class="s2"&gt;Answer:&lt;/span&gt;
&lt;span class="s2"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;rag_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;RAG_PROMPT&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RAG chain created successfully"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;RAG chain created successfully
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;test_answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rag_chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"What is MLflow Tracking?"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Test Question: What is MLflow Tracking?"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Answer: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;test_answer&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Test Question: What is MLflow Tracking?

Answer: MLflow Tracking is an API and UI for logging parameters, code versions, metrics, and artifacts when running your machine learning code. It allows you to log and query experiments using Python, REST, R API, and Java API. The MLflow Tracking component lets you log source code, models, and visualizations. Each run records: code version, start and end time, source, parameters, metrics, and artifacts.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="Enable-MLflow-Tracing"&gt;&lt;a id="toc6_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Enable MLflow Tracing&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;MLflow's LangChain integration can automatically capture traces of our RAG pipeline invocations. This is essential for evaluation - RAGAS scorers analyze these traces to compute metrics.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;mlflow&lt;/span&gt;

&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_experiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RAG-Evaluation-Tutorial"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;langchain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;autolog&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_traces&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"MLflow experiment: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_experiment_by_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'RAG-Evaluation-Tutorial'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"LangChain autologging enabled with tracing"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="application/vnd.jupyter.stderr"&gt;
&lt;pre&gt;2026/01/11 09:22:25 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2026/01/11 09:22:25 INFO mlflow.store.db.utils: Updating database tables
2026/01/11 09:22:25 INFO alembic.runtime.migration: Context impl SQLiteImpl.
2026/01/11 09:22:25 INFO alembic.runtime.migration: Will assume non-transactional DDL.
2026/01/11 09:22:25 INFO alembic.runtime.migration: Context impl SQLiteImpl.
2026/01/11 09:22:25 INFO alembic.runtime.migration: Will assume non-transactional DDL.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;MLflow experiment: RAG-Evaluation-Tutorial
LangChain autologging enabled with tracing
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: If you don't have tracing enabled in your RAG there is still possibility to pass the evaluation data to MLflow to ease analysis. If this is your case please refer to the text on my blog explaining how to do it: &lt;a href="https://safjan.com/using-mlflowragas-integration-without-tracing"&gt;RAG Evaluation with RAGAS and MLflow - without tracing&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="Load-Golden-Dataset"&gt;&lt;a id="toc7_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Load Golden Dataset&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;A &lt;strong&gt;golden dataset&lt;/strong&gt; (also called ground truth or evaluation dataset) contains:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Questions&lt;/strong&gt;: User queries we want to evaluate&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Expected Answers&lt;/strong&gt;: The correct/ideal responses&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Expected Contexts&lt;/strong&gt; (optional): Which documents should be retrieved&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This dataset allows us to systematically measure our RAG system's quality.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Load golden dataset from external file&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data/golden_dataset.json"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;GOLDEN_DATASET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;eval_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GOLDEN_DATASET&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Golden dataset contains &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GOLDEN_DATASET&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; evaluation samples"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;eval_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Golden dataset contains 20 evaluation samples
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child jp-OutputArea-executeResult"&gt;

&lt;div class="jp-RenderedHTMLCommon jp-RenderedHTML jp-OutputArea-output jp-OutputArea-executeResult" data-mime-type="text/html"&gt;
&lt;div&gt;
&lt;style scoped=""&gt;
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
&lt;/style&gt;
&lt;table border="1" class="dataframe"&gt;
&lt;thead&gt;
&lt;tr style="text-align: right;"&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;question&lt;/th&gt;
&lt;th&gt;ground_truth&lt;/th&gt;
&lt;th&gt;contexts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;td&gt;What is MLflow Tracking used for?&lt;/td&gt;
&lt;td&gt;MLflow Tracking is used for logging parameters...&lt;/td&gt;
&lt;td&gt;[MLflow Tracking is an API and UI for logging ...&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th&gt;1&lt;/th&gt;
&lt;td&gt;What features does the MLflow Model Registry p...&lt;/td&gt;
&lt;td&gt;The MLflow Model Registry provides model linea...&lt;/td&gt;
&lt;td&gt;[The MLflow Model Registry is a centralized mo...&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="Generate-RAG-Responses-for-Evaluation"&gt;&lt;a id="toc7_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Generate RAG Responses for Evaluation&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;We'll run our RAG pipeline on each question and collect the responses along with the retrieved contexts. This data will be used by RAGAS scorers.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Define traced RAG function for evaluation&lt;/span&gt;
&lt;span class="c1"&gt;# IMPORTANT: Function parameter names must match keys in data['inputs']&lt;/span&gt;
&lt;span class="c1"&gt;# Since inputs={'question': ...}, the function must accept 'question' parameter&lt;/span&gt;

&lt;span class="nd"&gt;@mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;span_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"CHAIN"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traced_rag_predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Traced RAG prediction function for mlflow.genai.evaluate().&lt;/span&gt;
&lt;span class="sd"&gt;    &lt;/span&gt;
&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        question: The question to answer (matches inputs['question'] key)&lt;/span&gt;
&lt;span class="sd"&gt;    &lt;/span&gt;
&lt;span class="sd"&gt;    Returns:&lt;/span&gt;
&lt;span class="sd"&gt;        dict with 'response' and 'retrieved_contexts' for RAGAS scorers&lt;/span&gt;
&lt;span class="sd"&gt;    """&lt;/span&gt;
    &lt;span class="c1"&gt;# Retrieval step - creates RETRIEVER span&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"retriever"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;span_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"RETRIEVER"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;retrieved_docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;retrieved_docs&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_inputs&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_outputs&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;"retrieved_contexts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Generation step - creates LLM span&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"generator"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;span_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"LLM"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rag_chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_inputs&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"contexts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_outputs&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;"response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"retrieved_contexts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Preview a sample from the golden dataset&lt;/span&gt;
&lt;span class="c1"&gt;# Note: With predict_fn approach, answers are generated during evaluation&lt;/span&gt;
&lt;span class="n"&gt;sample_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Sample evaluation record #&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sample_idx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;eval_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sample_idx&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'question'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Expected Answer: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;eval_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sample_idx&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'ground_truth'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Show what the traced function would produce for this question&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;--- Testing RAG response for this question ---"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;traced_rag_predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sample_idx&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'question'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;RAG Answer: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;test_output&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'response'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Retrieved Contexts (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_output&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'retrieved_contexts'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;):&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_output&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'retrieved_contexts'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  Context &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Sample evaluation record #3:

Question: What metrics does RAGAS provide for RAG evaluation?

Expected Answer: RAGAS provides four key metrics: Faithfulness (measures if the answer is grounded in context), Context Precision (evaluates if relevant documents are ranked higher), Context Recall (checks if context contains all needed information), and Factual Correctness (compares output against expected answers).

--- Testing RAG response for this question ---

RAG Answer: RAGAS provides the following key metrics for RAG evaluation: Faithfulness, Context Precision, Context Recall, and Factual Correctness.

Retrieved Contexts (3):

  Context 1: RAGAS (Retrieval Augmented Generation Assessment) is an evaluation framework integrated with MLflow ...

  Context 2: Evaluation in MLflow can be performed using mlflow.evaluate() for traditional ML models or mlflow.ge...

  Context 3: MLflow GenAI provides specialized tools for developing and evaluating generative AI applications. It...

&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="RAGAS-Evaluation-with-MLflow"&gt;&lt;a id="toc8_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;RAGAS Evaluation with MLflow&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Now we'll use MLflow's RAGAS integration to evaluate our RAG pipeline. The key metrics we'll compute:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;What it measures&lt;/th&gt;
&lt;th&gt;Required Data&lt;/th&gt;
&lt;th&gt;Common Failure&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Faithfulness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Is the answer grounded in retrieved context?&lt;/td&gt;
&lt;td&gt;answer, contexts&lt;/td&gt;
&lt;td&gt;Missing RETRIEVER spans&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Precision&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Are relevant docs ranked higher?&lt;/td&gt;
&lt;td&gt;question, contexts, ground_truth&lt;/td&gt;
&lt;td&gt;No ground_truth provided&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Recall&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Does context contain needed info?&lt;/td&gt;
&lt;td&gt;contexts, ground_truth&lt;/td&gt;
&lt;td&gt;No ground_truth provided&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Factual Correctness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Does answer match expected?&lt;/td&gt;
&lt;td&gt;answer, ground_truth&lt;/td&gt;
&lt;td&gt;Semantic mismatch (strict)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;These metrics provide an initial quantitative assessment of RAG quality across multiple dimensions. There are more RAGAS tool metrics available through the  MLflow integration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on LLM Judge&lt;/strong&gt;: RAGAS metrics use an LLM as a judge. For best results, use &lt;strong&gt;OpenAI&lt;/strong&gt; (gpt-4o-mini) as the judge model even if you're using Ollama for RAG generation. Ollama/local models may have issues with litellm's structured output parsing. Set &lt;code&gt;JUDGE_PROVIDER = LLMProvider.OPENAI&lt;/code&gt; below if you encounter scoring errors with Ollama.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Test litellm connectivity (optional - helps debug scoring issues)&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;litellm&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_litellm_connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Test if litellm can connect to the model."""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;litellm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s2"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"Say 'test' and nothing else."&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
            &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"✓ litellm connection successful: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  Response: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"✗ litellm connection failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;

&lt;span class="c1"&gt;# Test the judge model connection&lt;/span&gt;
&lt;span class="n"&gt;judge_model_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_mlflow_model_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Testing judge model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;litellm_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_litellm_connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;litellm_ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;⚠️  Consider using OpenAI as judge model for reliable scoring."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"   Set JUDGE_PROVIDER = LLMProvider.OPENAI in the next cell."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Testing judge model: azure:/gpt-4o-mini


&lt;span class="ansi-red-intense-fg ansi-bold"&gt;Provider List: https://docs.litellm.ai/docs/providers&lt;/span&gt;


&lt;span class="ansi-red-intense-fg ansi-bold"&gt;Provider List: https://docs.litellm.ai/docs/providers&lt;/span&gt;

✗ litellm connection failed: azure:/gpt-4o-mini
  Error: BadRequestError: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call.

⚠️  Consider using OpenAI as judge model for reliable scoring.
   Set JUDGE_PROVIDER = LLMProvider.OPENAI in the next cell.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;mlflow.genai.scorers.ragas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Faithfulness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ContextPrecision&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ContextRecall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;FactualCorrectness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Configure the judge model for RAGAS evaluation&lt;/span&gt;
&lt;span class="c1"&gt;# For reliable scoring, use OpenAI even when using Ollama for RAG generation&lt;/span&gt;
&lt;span class="n"&gt;JUDGE_PROVIDER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PROVIDER&lt;/span&gt;  &lt;span class="c1"&gt;# Change to LLMProvider.OPENAI for better results&lt;/span&gt;
&lt;span class="n"&gt;judge_model_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_mlflow_model_uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JUDGE_PROVIDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Judge model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"(Change JUDGE_PROVIDER to LLMProvider.OPENAI if scoring fails with Ollama)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Note: ContextPrecision and ContextRecall require traces with RETRIEVER spans&lt;/span&gt;
&lt;span class="c1"&gt;# For evaluation without traces, use Faithfulness and FactualCorrectness&lt;/span&gt;
&lt;span class="n"&gt;scorers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;Faithfulness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;FactualCorrectness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="c1"&gt;# These require traces with retriever spans - may show errors without proper tracing:&lt;/span&gt;
    &lt;span class="n"&gt;ContextPrecision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ContextRecall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;judge_model_uri&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Configured &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; RAGAS scorers:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scorer&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scorer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Judge model: azure:/gpt-4o-mini
(Change JUDGE_PROVIDER to LLMProvider.OPENAI if scoring fails with Ollama)

Configured 4 RAGAS scorers:
  - Faithfulness
  - FactualCorrectness
  - ContextPrecision
  - ContextRecall
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="Prepare-Evaluation-Data"&gt;&lt;a id="toc8_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Prepare Evaluation Data&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;MLflow's &lt;code&gt;genai.evaluate()&lt;/code&gt; expects data in a specific format. We need to map our data to the expected schema.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Prepare evaluation data for predict_fn approach&lt;/span&gt;
&lt;span class="c1"&gt;# With predict_fn, we pass inputs and expectations - outputs come from the traced function&lt;/span&gt;
&lt;span class="n"&gt;eval_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;eval_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="s2"&gt;"inputs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
        &lt;span class="s2"&gt;"expectations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"ground_truth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ground_truth"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;"contexts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"contexts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;  &lt;span class="c1"&gt;# For ContextRecall&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Prepared &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; samples for evaluation"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Sample format:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  inputs: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'inputs'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  expectations: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'expectations'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'expectations'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ground_truth'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  ground_truth contexts: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'expectations'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'ground_truth'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; items"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Note: outputs will be generated by traced_rag_predict() during evaluation"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"      ground_truth enables ContextPrecision and ContextRecall metrics"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Prepared 20 samples for evaluation

Sample format:
  inputs: ['question']
  expectations: ['ground_truth', 'contexts']
  ground_truth contexts: 205 items

Note: outputs will be generated by traced_rag_predict() during evaluation
      ground_truth enables ContextPrecision and ContextRecall metrics
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Running RAGAS evaluation with traced predict_fn..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"This generates traces with RETRIEVER spans for Faithfulness metric.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"ragas-evaluation-traced"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;"chat_model"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"num_samples"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"retriever_k"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"evaluation_mode"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"predict_fn"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Use predict_fn to generate traces with RETRIEVER spans&lt;/span&gt;
    &lt;span class="c1"&gt;# This allows Faithfulness scorer to access retrieved_contexts&lt;/span&gt;
    &lt;span class="n"&gt;eval_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;predict_fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;traced_rag_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;run_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Evaluation complete! Run ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="application/vnd.jupyter.stderr"&gt;
&lt;pre&gt;2026/01/11 09:22:33 INFO mlflow.models.evaluation.utils.trace: Auto tracing is temporarily enabled during the model evaluation for computing some metrics and debugging. To disable tracing, call `mlflow.autolog(disable=True)`.
2026/01/11 09:22:33 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset. To disable this check, set the MLFLOW_GENAI_EVAL_SKIP_TRACE_VALIDATION environment variable to True.
2026/01/11 09:22:33 WARNING mlflow.tracing.fluent: Failed to start span VectorStoreRetriever: 'NonRecordingSpan' object has no attribute 'context'. For full traceback, set logging level to debug.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Running RAGAS evaluation with traced predict_fn...
This generates traces with RETRIEVER spans for Faithfulness metric.

&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="application/vnd.jupyter.stderr"&gt;
&lt;pre&gt;2026/01/11 09:22:34 WARNING mlflow.tracing.fluent: Failed to start span RunnableSequence: 'NonRecordingSpan' object has no attribute 'context'. For full traceback, set logging level to debug.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Evaluating:   0%|          | 0/20 [Elapsed: 00:00, Remaining: ?] &lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;
✨ Evaluation completed.

Metrics and evaluation results are logged to the MLflow run:
  Run name: &lt;span class="ansi-blue-intense-fg"&gt;ragas-evaluation-traced&lt;/span&gt;
  Run ID: &lt;span class="ansi-blue-intense-fg"&gt;35029b87d0e542128dedd53531ba0710&lt;/span&gt;

To view the detailed evaluation results with sample-wise scores,
open the &lt;span class="ansi-yellow-intense-fg ansi-bold"&gt;Traces&lt;/span&gt; tab in the Run page in the MLflow UI.


Evaluation complete! Run ID: 35029b87d0e542128dedd53531ba0710
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="MLflow-Results-Analysis"&gt;&lt;a id="toc9_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;MLflow Results Analysis&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Let's examine the evaluation results both programmatically and understand how to view them in the MLflow UI.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RAGAS EVALUATION RESULTS"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;eval_results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"eval_results"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Find RAGAS scorer columns (Faithfulness, FactualCorrectness, Context*)&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pd&lt;/span&gt;
&lt;span class="n"&gt;ragas_metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Faithfulness'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'FactualCorrectness'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'ContextPrecision'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'ContextRecall'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;value_columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; 
                 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/value'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ragas_metrics&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;error_columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; 
                 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/error'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ragas_metrics&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;RAGAS Metrics:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"-"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;successful_metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;failed_metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;value_columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Convert to numeric, coercing errors to NaN&lt;/span&gt;
    &lt;span class="n"&gt;numeric_col&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_numeric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'coerce'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;non_null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;numeric_col&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;non_null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;mean_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;non_null&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;std_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;non_null&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;non_null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  ✓ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mean_val&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (±&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std_val&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;) [&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;success_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; samples]"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;successful_metrics&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  ✗ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: NO SCORES (0/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; samples succeeded)"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;failed_metrics&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Summary: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;successful_metrics&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; metrics succeeded, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;failed_metrics&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; metrics failed"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"   Total samples: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Error diagnostics (if any metrics failed)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;failed_metrics&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"🔍 DIAGNOSTIC: Error Details for Failed Metrics"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;error_columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;metric_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/error'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;❌ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c1"&gt;# Get first unique error message&lt;/span&gt;
            &lt;span class="n"&gt;unique_errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;unique_errors&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;  &lt;span class="c1"&gt;# Show max 2 unique errors&lt;/span&gt;
                &lt;span class="c1"&gt;# Truncate long error messages&lt;/span&gt;
                &lt;span class="n"&gt;err_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;err_str&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"   &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;err_str&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"-"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Common fixes:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"   1. Use OpenAI as judge: JUDGE_PROVIDER = LLMProvider.OPENAI"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"   2. For Ollama: ensure model is running and OLLAMA_API_BASE is set"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"   3. ContextPrecision/ContextRecall require traces with RETRIEVER spans"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;✅ All metrics computed successfully!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;============================================================
RAGAS EVALUATION RESULTS
============================================================

RAGAS Metrics:
----------------------------------------
  ✓ ContextRecall/value: 0.853 (±0.196) [20/20 samples]
  ✓ ContextPrecision/value: 0.967 (±0.116) [20/20 samples]
  ✓ Faithfulness/value: 0.984 (±0.047) [20/20 samples]
  ✓ FactualCorrectness/value: 0.613 (±0.250) [20/20 samples]

Summary: 4 metrics succeeded, 0 metrics failed
   Total samples: 20

✅ All metrics computed successfully!
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Helper function to extract question from request column&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Extract question from MLflow request column."""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;&lt;span class="p"&gt;))[:&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;request_data&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;

&lt;span class="c1"&gt;# Display results summary with metric columns&lt;/span&gt;
&lt;span class="n"&gt;available_cols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;value_columns&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results_summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;available_cols&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Add question column from request data&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s2"&gt;"request"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results_summary&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"request"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extract_question&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;results_summary&lt;/span&gt;


&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Identifying Low-Scoring Samples:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"-"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;value_columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;numeric_col&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_numeric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'coerce'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;low_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;numeric_col&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
        &lt;span class="n"&gt;low_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;low_mask&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;low_scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;⚠️  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &amp;lt; 0.5: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;low_scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; samples"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;low_scores&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extract_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"request"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}))&lt;/span&gt;
                &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;numeric_col&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;notna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"    - [&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.2f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;
Identifying Low-Scoring Samples:
----------------------------------------

⚠️  ContextRecall/value &amp;lt; 0.5: 1 samples
    - [0.33] What is MLflow GenAI used for?...

⚠️  ContextPrecision/value &amp;lt; 0.5: 1 samples
    - [0.50] How can you run MLflow Projects?...

⚠️  FactualCorrectness/value &amp;lt; 0.5: 6 samples
    - [0.46] What is MLflow Tracking used for?...
    - [0.30] What is MLflow GenAI used for?...
    - [0.47] How can you run MLflow Projects?...
    - [0.12] What is Faithfulness in RAGAS?...
    - [0.31] What frameworks support MLflow autolog?...
    - [0.40] How can you access MLflow programmatically via REST API?...
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="Interpreting-RAGAS-Scores"&gt;&lt;a id="toc9_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Interpreting RAGAS Scores&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;All RAGAS metrics return scores between 0.0 and 1.0. Here's rough guidance:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Score Range&lt;/th&gt;
&lt;th&gt;Interpretation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0.9 - 1.0&lt;/td&gt;
&lt;td&gt;Excellent - production ready&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0.7 - 0.9&lt;/td&gt;
&lt;td&gt;Good - minor improvements needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0.5 - 0.7&lt;/td&gt;
&lt;td&gt;Fair - significant room for improvement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt; 0.5&lt;/td&gt;
&lt;td&gt;Poor - investigate specific failures&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Important caveats:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;These thresholds are guidelines, not absolutes&lt;/li&gt;
&lt;li&gt;Different applications have different quality requirements&lt;/li&gt;
&lt;li&gt;Low FactualCorrectness often reflects semantic similarity issues, not actual incorrectness&lt;/li&gt;
&lt;li&gt;Focus on relative improvements when comparing variants, not absolute scores&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;p&gt;To view detailed results in the MLflow UI:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Start MLflow UI (if not running):
&lt;code&gt;$ mlflow ui --port 5000&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open &lt;a href="http://localhost:5000"&gt;http://localhost:5000&lt;/a&gt; in your browser&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Navigate to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Experiment: 'RAG-Evaluation-Tutorial'&lt;/li&gt;
&lt;li&gt;Run: 'ragas-evaluation'&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the run details, you'll find:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Parameters: model configuration&lt;/li&gt;
&lt;li&gt;Metrics: aggregate RAGAS scores&lt;/li&gt;
&lt;li&gt;Artifacts: detailed evaluation tables&lt;/li&gt;
&lt;li&gt;Traces: individual RAG invocations&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Run ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Run ID: 35029b87d0e542128dedd53531ba0710
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"🎉 Tutorial Complete!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"""&lt;/span&gt;
&lt;span class="s2"&gt;Summary:&lt;/span&gt;
&lt;span class="s2"&gt;  - Provider: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;  - Model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_CONFIG&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PROVIDER&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'chat_model'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;  - Samples evaluated: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;  - MLflow Run ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;

&lt;span class="s2"&gt;View results: mlflow ui --port 5000&lt;/span&gt;
&lt;span class="s2"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;
============================================================
🎉 Tutorial Complete!
============================================================

Summary:
  - Provider: azure_openai
  - Model: gpt-4o-mini
  - Samples evaluated: 20
  - MLflow Run ID: 35029b87d0e542128dedd53531ba0710

View results: mlflow ui --port 5000

&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="Common-Pitfalls-and-Solutions"&gt;&lt;a id="toc10_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Common Pitfalls and Solutions&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;During development of this tutorial, several non-obvious issues emerged:&lt;/p&gt;
&lt;h3 id="Model-URI-Format"&gt;&lt;a id="toc10_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Model URI Format&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;MLflow uses litellm under the hood. The URI format matters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI: &lt;code&gt;openai:/gpt-4o-mini&lt;/code&gt; (note the colon-slash)&lt;/li&gt;
&lt;li&gt;Azure: &lt;code&gt;azure:/deployment-name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Ollama: &lt;code&gt;ollama:/llama3.2:3b&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Using &lt;code&gt;azure/&lt;/code&gt; instead of &lt;code&gt;azure:/&lt;/code&gt; will fail silently or produce cryptic errors.&lt;/p&gt;
&lt;h3 id="Function-Signature-Must-Match-Input-Keys"&gt;&lt;a id="toc10_2_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Function Signature Must Match Input Keys&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;When using &lt;code&gt;predict_fn&lt;/code&gt;, the function parameter names must exactly match the keys in your &lt;code&gt;inputs&lt;/code&gt; dictionary:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# If your data has: {"inputs": {"question": "..."}}&lt;/span&gt;
&lt;span class="c1"&gt;# Your function MUST be: def predict(question: str)  # NOT def predict(query: str)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id="RETRIEVER-Spans-for-Context-Metrics"&gt;&lt;a id="toc10_3_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;RETRIEVER Spans for Context Metrics&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Faithfulness, ContextPrecision, and ContextRecall require traces with RETRIEVER-type spans. Without them, these metrics return errors or incorrect values. The &lt;code&gt;traced_rag_predict&lt;/code&gt; function in this tutorial creates these spans explicitly.&lt;/p&gt;
&lt;h3 id="Judge-Model-Limitations"&gt;&lt;a id="toc10_4_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Judge Model Limitations&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;RAGAS metrics use an LLM as a judge. Local models (Ollama) may struggle with the structured output parsing that RAGAS requires. For reliable scoring, consider using OpenAI/Azure as the judge even when your RAG uses a different provider.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;hr/&gt;
&lt;h2 id="Extras:-Comparing-RAG-Variants-with-MLflow"&gt;&lt;a id="toc11_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Extras: Comparing RAG Variants with MLflow&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;One of MLflow's key strengths is enabling systematic A/B comparisons between different RAG configurations. Here's how to structure experiments comparing variants like chunk sizes, models, or retrieval strategies.&lt;/p&gt;
&lt;h3 id="Example:-Comparing-Chunk-Sizes"&gt;&lt;a id="toc11_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Example: Comparing Chunk Sizes&lt;/a&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Comparing RAG Variants: Different Chunk Sizes&lt;/span&gt;
&lt;span class="c1"&gt;# This demonstrates how to evaluate the same RAG pipeline with different configurations&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Running chunk size comparison experiments..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;CHUNK_SIZES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;experiment_run_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;CHUNK_SIZES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Testing chunk_size=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Rebuild the vector store with new chunk size&lt;/span&gt;
    &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Update the global vectorstore and retriever used by traced_rag_predict&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;
    &lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"k"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"   Created &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; chunks"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Run evaluation&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"chunk-size-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"chunk_size"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"chunk_overlap"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"num_chunks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        
        &lt;span class="n"&gt;eval_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;predict_fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;traced_rag_predict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;scorers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;experiment_run_ids&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"   ✓ Run ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Completed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CHUNK_SIZES&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; experiments. Run IDs saved for comparison."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Running chunk size comparison experiments...
============================================================

Testing chunk_size=50
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="application/vnd.jupyter.stderr"&gt;
&lt;pre&gt;2026/01/11 09:23:10 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset. To disable this check, set the MLFLOW_GENAI_EVAL_SKIP_TRACE_VALIDATION environment variable to True.
2026/01/11 09:23:10 WARNING mlflow.tracing.fluent: Failed to start span VectorStoreRetriever: 'NonRecordingSpan' object has no attribute 'context'. For full traceback, set logging level to debug.
2026/01/11 09:23:10 WARNING mlflow.tracing.fluent: Failed to start span RunnableSequence: 'NonRecordingSpan' object has no attribute 'context'. For full traceback, set logging level to debug.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;   Created 175 chunks
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Evaluating:   0%|          | 0/20 [Elapsed: 00:00, Remaining: ?] &lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;
✨ Evaluation completed.

Metrics and evaluation results are logged to the MLflow run:
  Run name: &lt;span class="ansi-blue-intense-fg"&gt;chunk-size-50&lt;/span&gt;
  Run ID: &lt;span class="ansi-blue-intense-fg"&gt;34464d4c3bb34a0a8ce4d43760149f1e&lt;/span&gt;

To view the detailed evaluation results with sample-wise scores,
open the &lt;span class="ansi-yellow-intense-fg ansi-bold"&gt;Traces&lt;/span&gt; tab in the Run page in the MLflow UI.

   ✓ Run ID: 34464d4c3bb34a0a8ce4d43760149f1e

Testing chunk_size=150
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="application/vnd.jupyter.stderr"&gt;
&lt;pre&gt;2026/01/11 09:23:33 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset. To disable this check, set the MLFLOW_GENAI_EVAL_SKIP_TRACE_VALIDATION environment variable to True.
2026/01/11 09:23:33 WARNING mlflow.tracing.fluent: Failed to start span VectorStoreRetriever: 'NonRecordingSpan' object has no attribute 'context'. For full traceback, set logging level to debug.
2026/01/11 09:23:33 WARNING mlflow.tracing.fluent: Failed to start span RunnableSequence: 'NonRecordingSpan' object has no attribute 'context'. For full traceback, set logging level to debug.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;   Created 62 chunks
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Evaluating:   0%|          | 0/20 [Elapsed: 00:00, Remaining: ?] &lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;
✨ Evaluation completed.

Metrics and evaluation results are logged to the MLflow run:
  Run name: &lt;span class="ansi-blue-intense-fg"&gt;chunk-size-150&lt;/span&gt;
  Run ID: &lt;span class="ansi-blue-intense-fg"&gt;c77546d4d930426b98ebeb1bf1a09a3f&lt;/span&gt;

To view the detailed evaluation results with sample-wise scores,
open the &lt;span class="ansi-yellow-intense-fg ansi-bold"&gt;Traces&lt;/span&gt; tab in the Run page in the MLflow UI.

   ✓ Run ID: c77546d4d930426b98ebeb1bf1a09a3f

============================================================
Completed 2 experiments. Run IDs saved for comparison.
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="Comparing-Results-in-MLflow-UI"&gt;&lt;a id="toc11_2_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Comparing Results in MLflow UI&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;After running multiple variants:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Open MLflow UI: &lt;code&gt;mlflow ui --port 5000&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Navigate to your experiment&lt;/li&gt;
&lt;li&gt;Select runs to compare using checkboxes&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Compare&lt;/strong&gt; to see side-by-side metrics&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Chart&lt;/strong&gt; view to visualize metric differences&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You can also compare programmatically:&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;&lt;div class="jp-Cell jp-CodeCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;



&lt;div class="highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Compare Results Programmatically&lt;/span&gt;
&lt;span class="c1"&gt;# Query MLflow for runs and display a formatted comparison table&lt;/span&gt;

&lt;span class="n"&gt;experiment_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RAG-Evaluation-Tutorial"&lt;/span&gt;

&lt;span class="c1"&gt;# Get runs with chunk_size parameter (our comparison experiments)&lt;/span&gt;
&lt;span class="n"&gt;runs_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_runs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;experiment_names&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;experiment_name&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;filter_string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"params.chunk_size != ''"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"params.chunk_size ASC"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runs_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"No chunk size comparison runs found. Run the comparison cell above first."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Debug: show available metric columns&lt;/span&gt;
    &lt;span class="n"&gt;metric_cols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;runs_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"metrics."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Available metric columns (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metric_cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; total):"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;metric_cols&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;  &lt;span class="c1"&gt;# Show first 8&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"  - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Define metrics we want (will search for partial matches)&lt;/span&gt;
    &lt;span class="n"&gt;metric_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Faithfulness"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"FactualCorrectness"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ContextPrecision"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ContextRecall"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Find actual column names (may have backticks or different format)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_metric_col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="sd"&gt;"""Find column containing metric_name in its name."""&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;metric_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="s2"&gt;"mean"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
    
    &lt;span class="n"&gt;comparison_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;runs_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"Run Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"tags.mlflow.runName"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s2"&gt;"Chunk Size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"params.chunk_size"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s2"&gt;"Num Chunks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"params.num_chunks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;metric_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;metric_names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;find_metric_col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runs_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;notna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;metric_name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt;
        &lt;span class="n"&gt;comparison_data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="n"&gt;comparison_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;comparison_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Chunk Size Comparison Results"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"="&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;comparison_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Find best configuration&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s2"&gt;"FactualCorrectness"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;comparison_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;best_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comparison_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"FactualCorrectness"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;"N/A"&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;idxmax&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;✨ Best configuration: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;comparison_df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;best_idx&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'Run Name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;


&lt;/div&gt;



&lt;div class="jp-OutputArea jp-Cell-outputArea"&gt;
&lt;div class="jp-OutputArea-child"&gt;

&lt;div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain"&gt;
&lt;pre&gt;Available metric columns (4 total):
  - metrics.Faithfulness/mean
  - metrics.FactualCorrectness/mean
  - metrics.ContextRecall/mean
  - metrics.ContextPrecision/mean

Chunk Size Comparison Results
================================================================================
      Run Name Chunk Size Num Chunks Faithfulness FactualCorrectness ContextPrecision ContextRecall
chunk-size-150        150         62        0.911              0.768            0.992         0.890
chunk-size-150        150         62        0.915              0.781            0.983         0.834
 chunk-size-50         50        175        0.942              0.743            0.983         0.840
 chunk-size-50         50        175        0.944              0.758            0.983         0.844

✨ Best configuration: chunk-size-150
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="How-to-inspect-results-in-MLflow-UI:"&gt;&lt;a id="toc12_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;How to inspect results in MLflow UI:&lt;/a&gt;&lt;/h2&gt;&lt;h3 id="Select-the-experiment-to-inspect"&gt;&lt;a id="toc12_1_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Select the experiment to inspect&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;img alt="" src="http://safjan.com/images/ragas-mlflow/01_experiments.png"/&gt;
&lt;strong&gt;Figure 1&lt;/strong&gt;: Select the experiment you want to analyze.&lt;/p&gt;
&lt;p&gt;Experiment type should be automatically recognized as "GenAI Evaluation" - when opening the experiment for the first time - you need to confirm this. Perhaps there is a way to pass this parameter when creating the experiment via code, but I have not found it yet.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="Configure-comparison-of-the-runs"&gt;&lt;a id="toc12_2_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;Configure comparison of the runs&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;img alt="" src="http://safjan.com/images/ragas-mlflow/02_runs.png"/&gt;
&lt;strong&gt;Figure 2&lt;/strong&gt;: Runs overview - high level overview of the results achieved for the various configurations/variant of your RAG under evaluation.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You can &lt;strong&gt;edit experiment name and description&lt;/strong&gt; here. Informative names help when comparing multiple experiments. Description can provide additional context about the experiment's purpose.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;NOTE: Perhaps there is a way to pass description when creating the experiment via code, but I have not found it yet.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;In this view you can see all the runs (evaluate RAG variants) that belongs to this experiment. Each run corresponds to a different RAG configuration (e.g., different chunk sizes, models, etc.). You can see &lt;strong&gt;parameters&lt;/strong&gt; (e.g., model name, chunk size), &lt;strong&gt;aggregated metrics&lt;/strong&gt; (e.g., mean Faithfulness, mean Context Precision). The displayed columns with parameters, metrics can be customized using the "Columns" button on the top right, so you can focus on the most relevant information.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you want to do the comparison of two runs, select the runs you want to compare by checking the checkboxes next to each run. Note that, the second run you select will be treated as the "baseline" run in the comparison. The score changes of the first selected run will be calculated against the second selected run.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can select multiple runs to compare their metrics side-by-side. This is useful for evaluating different RAG configurations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can also select columns to display in the comparison table.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h3 id="View-Comparison-Results"&gt;&lt;a id="toc12_3_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;View Comparison Results&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;img alt="Compare - two pannels" src="http://safjan.com/images/ragas-mlflow/05_questions_compare_single.png"/&gt;
&lt;strong&gt;Figure 3&lt;/strong&gt;: Comparison of individual question results for selected runs.
In this view, you can see detailed comparison of individual question results for the selected runs. The zoom of the single panel is presented in the Figure 4** below.  This helps to analyze how each RAG configuration performed on specific queries. You can inspects full RAG output and details like retrieved contexts for each question.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Zoom at the detailed results" src="http://safjan.com/images/ragas-mlflow/03_compare.png"/&gt;
&lt;strong&gt;Figure 4&lt;/strong&gt;: Zoom at the detailed results for the single variant&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;p&gt;&lt;img alt="Detailed view showing metrics for each question" src="http://safjan.com/images/ragas-mlflow/04_questions_compare_both.png"/&gt;
&lt;strong&gt;Figure 5&lt;/strong&gt;: Comparison of individual question results for selected runs - detailed view showing metrics for each question.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="More-from-MLflow"&gt;&lt;a id="toc13_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;More from MLflow&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;This tutorial focused on RAG evaluation using RAGAS metrics. MLflow offers many more features for RAG or GenAI model management, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;built-in metrics&lt;/strong&gt;
&lt;a href="https://mlflow.org/docs/latest/genai/eval-monitor/scorers/llm-judge/predefined/"&gt;MLFlow predefined&lt;/a&gt; metrics for GenAI models&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Guidelines-based LLM Scorers&lt;/strong&gt;
&lt;a href="https://mlflow.org/docs/latest/genai/eval-monitor/scorers/llm-judge/guidelines/"&gt;Guidelines-based LLM Scorers&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;from MLflow Documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Guidelines is a powerful scorer class designed to let you quickly and easily customize evaluation by defining natural language criteria that are framed as pass/fail conditions. It is ideal for checking compliance with rules, style guides, or information inclusion/exclusion.&lt;/p&gt;
&lt;p&gt;Guidelines have the distinct advantage of being easy to explain to business stakeholders ("we are evaluating if the app delivers upon this set of rules") and, as such, can often be directly written by domain experts.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;MCP server&lt;/strong&gt;
See the documentation on how to add MLflow MCP server in poular IDEs and Agentic conding tools: &lt;a href="https://mlflow.org/docs/latest/genai/mcp/index.html"&gt;MLflow MCP Server&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;...and many more features. Explore the &lt;a href="https://mlflow.org/docs/latest/genai/index.html"&gt;MLflow GenAI documentation&lt;/a&gt; for more details.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="jp-Cell jp-MarkdownCell jp-Notebook-cell"&gt;


&lt;div class="jp-InputArea jp-Cell-inputArea"&gt;&lt;div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown"&gt;
&lt;h2 id="References,-further-reading"&gt;&lt;a id="toc14_"&gt;&lt;/a&gt;&lt;a href="#toc0_"&gt;References, further reading&lt;/a&gt;&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;the code for this tutorial is available on my GitHub: &lt;a href="https://github.com/izikeros/2026-01-08-ragas-in-mlfow-rag-eval-demo"&gt;2026-01-08-ragas-in-mlfow-rag-eval-demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mlflow.org/docs/latest/genai/eval-monitor/scorers/third-party/ragas/"&gt;MLflow Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mlflow.org/docs/latest/genai/flavors/langchain/notebooks/langchain-retriever"&gt;Introduction to RAG with MLflow and LangChain&lt;/a&gt; - MLflow documentation - exemplary implementation of RAG with LangChain and MLflow (without RAGAS evaluation).&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MoRaouf/rag_evaluation_and_tracking"&gt;GitHub - rag_evaluation_and_tracking&lt;/a&gt; - This project houses a Retrieval Augmented Generation (RAG) LLM application built for robust and context-aware text generation. It leverages the combined power of LangChain for orchestration, MLflow for tracking and experimentation, DVC for version control, and RAGAS for evaluation.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;script type="text/javascript"&gt;if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
    var mathjaxscript = document.createElement('script');
    mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
    mathjaxscript.type = 'text/javascript';
    mathjaxscript.src = '//cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
    mathjaxscript[(window.opera ? "innerHTML" : "text")] =
        "MathJax.Hub.Config({" +
        "    config: ['MMLorHTML.js']," +
        "    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
        "    jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
        "    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
        "    displayAlign: 'center'," +
        "    displayIndent: '0em'," +
        "    showMathMenu: true," +
        "    tex2jax: { " +
        "        inlineMath: [ ['$','$'] ], " +
        "        displayMath: [ ['$$','$$'] ]," +
        "        processEscapes: true," +
        "        preview: 'TeX'," +
        "    }, " +
        "    'HTML-CSS': { " +
        " linebreaks: { automatic: true, width: '95% container' }, " +
        "        styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'black ! important'} }" +
        "    } " +
        "}); ";
    (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
&lt;/script&gt;
</content><category term="Data Science"/><category term="python"/><category term="mlflow"/><category term="ragas"/><category term="rag"/><category term="llm"/><category term="evaluation"/><category term="langchain"/><category term="tutorial"/></entry><entry><title>Working Faster with Git Worktrees and AI-Based Multi-Workflow Development</title><link href="https://www.safjan.com/git-worktrees-ai-multi-workflows/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-11-28T00:00:00+01:00</published><updated>2025-11-28T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-11-28:/git-worktrees-ai-multi-workflows/</id><summary type="html">&lt;p&gt;A practical, hands-on guide to turning a single codebase into a multi-workflow environment using Git worktrees, VS Code, and AI coding assistants. The tutorial shows how to isolate experiments, compare results, and speed up refactoring work with agents and Copilot.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;em&gt;A practical step-by-step tutorial using Git worktrees, small and large refactorings, and GitHub Copilot.&lt;/em&gt;&lt;/p&gt;
&lt;h2 id="prerequisites"&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;Before starting, ensure you have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Git 2.5+&lt;/strong&gt; (check with &lt;code&gt;git --version&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Python 3.8+&lt;/strong&gt; and pip&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VS Code&lt;/strong&gt; (optional but recommended)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;pytest&lt;/strong&gt; (&lt;code&gt;pip install pytest&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt; or another AI coding assistant&lt;/li&gt;
&lt;li&gt;Basic familiarity with Git branching and command-line operations&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;There is a moment when a project grows from a single-track workflow into something more complex. The codebase stays small, but you want to try a few big ideas at once. Or you want GitHub Copilot (or any AI coding agent) to attempt two different solutions while you continue working normally. One branch is not enough. Stashes are too messy. Full clones eat time and attention.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://git-scm.com/docs/git-worktree"&gt;Git worktrees&lt;/a&gt; solve this elegantly. They let one repository produce several working directories, each on its own branch, all sharing the same underlying store. One directory becomes the human-edited version. Another becomes the AI sandbox. A third holds an alternative AI proposal. You compare the results side by side without the mental reset of switching branches.&lt;/p&gt;
&lt;p&gt;This tutorial shows you how to build a multi-workflow setup using a simple Python mini-project. You will run two AI-assisted refactoring tasks: one small and one heavy. You will then evaluate them, compare results, validate correctness, and merge only the parts worth keeping.&lt;/p&gt;
&lt;h2 id="starting-point-a-tiny-codebase"&gt;Starting Point: A Tiny Codebase&lt;/h2&gt;
&lt;h3 id="initial-project-setup"&gt;Initial Project Setup&lt;/h3&gt;
&lt;p&gt;First, create a fresh repository:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;mkdir&lt;span class="w"&gt; &lt;/span&gt;project
&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;project
git&lt;span class="w"&gt; &lt;/span&gt;init
git&lt;span class="w"&gt; &lt;/span&gt;checkout&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;develop
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Create a file called &lt;code&gt;stats.py&lt;/code&gt; with some simple statistics functions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# stats.py&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sorted_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted_values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted_values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted_values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;sorted_values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sorted_values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;variance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Wrong: variance uses mean, not sum of squared values directly&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Commit this baseline:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;stats.py
git&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Initial commit with stats module&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;There is an obvious bug in &lt;code&gt;variance&lt;/code&gt;. The right definition uses squared deviations from the mean.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Correct version of variance (population variance)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;variance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Note: This calculates population variance (dividing by N). Sample variance would use N-1.&lt;/p&gt;
&lt;p&gt;This is our baseline. A candidate for small refactoring is the variance fix. A candidate for heavy refactoring is reorganising everything into a class-based structure or splitting the module into multiple files.&lt;/p&gt;
&lt;h2 id="step-1-create-a-clean-main-worktree"&gt;Step 1: Create a Clean Main Worktree&lt;/h2&gt;
&lt;p&gt;You start in the main repository folder, checked out on &lt;code&gt;develop&lt;/code&gt;. This is your everyday environment.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;project
git&lt;span class="w"&gt; &lt;/span&gt;status
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Everything looks normal.&lt;/p&gt;
&lt;h2 id="step-2-prepare-worktrees-for-ai-workflows"&gt;Step 2: Prepare Worktrees for AI Workflows&lt;/h2&gt;
&lt;p&gt;Now create worktrees with new branches for experimentation. The &lt;code&gt;git worktree add&lt;/code&gt; command creates both the branch and the working directory in one step:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;ai/refactor-small&lt;span class="w"&gt; &lt;/span&gt;../refactor-small
git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;ai/refactor-heavy&lt;span class="w"&gt; &lt;/span&gt;../refactor-heavy
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Your directory structure now looks like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;parent-directory/
├── project/           # main worktree (develop branch)
│   └── stats.py
├── refactor-small/    # worktree (ai/refactor-small branch)
│   └── stats.py
└── refactor-heavy/    # worktree (ai/refactor-heavy branch)
    └── stats.py
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You now have three directories:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;project/&lt;/code&gt; for your normal coding&lt;/li&gt;
&lt;li&gt;&lt;code&gt;refactor-small/&lt;/code&gt; for the AI to attempt precise, localised edits&lt;/li&gt;
&lt;li&gt;&lt;code&gt;refactor-heavy/&lt;/code&gt; for a broad exploration of reorganising the project&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can verify existing workflows with:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;list
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each is a full working directory with its own branch and commit history, but the &lt;code&gt;.git&lt;/code&gt; data is shared. Switching happens simply by changing directories, not by resetting state.&lt;/p&gt;
&lt;h2 id="step-3-run-the-first-workflow-a-small-ai-refactor"&gt;Step 3: Run the First Workflow – A Small AI Refactor&lt;/h2&gt;
&lt;p&gt;Go into the &lt;code&gt;refactor-small&lt;/code&gt; directory and open your IDE.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;../refactor-small
code&lt;span class="w"&gt; &lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Open &lt;code&gt;stats.py&lt;/code&gt; and use &lt;strong&gt;GitHub Copilot Chat&lt;/strong&gt; to fix the variance implementation. Exemplary prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Clean up this file. Fix the variance bug, add type hints, and make error handling more robust. Keep changes minimal.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Review the AI's suggestions carefully. Accept changes that look correct. The result should be something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# stats.py after small refactor&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Cannot compute average of empty list&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Cannot compute median of empty list&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;variance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Cannot compute variance of empty list&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Commit the result.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;stats.py
git&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Small refactor: fix variance, add type hints, add basic validation&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Nothing touches the &lt;code&gt;project/&lt;/code&gt; folder. No stashes. No switching.&lt;/p&gt;
&lt;h2 id="step-4-run-the-heavy-workflow-a-large-scale-ai-refactor"&gt;Step 4: Run the Heavy Workflow – A Large-Scale AI Refactor&lt;/h2&gt;
&lt;p&gt;Open the second worktree in a &lt;strong&gt;separate VS Code window&lt;/strong&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;../refactor-heavy
code&lt;span class="w"&gt; &lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This time, ask your AI assistant:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Rewrite this module using an object-oriented structure. Split statistics operations into separate classes if needed. Add validation. Keep the public API clear and ergonomic. Keep the code in the same file do not split into multiple files.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Expect a much bigger rewrite. Verify changes and commit again:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;stats.py
git&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Major rewrite: object-oriented Stats class with reorganized structure&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now you have two valid, complete refactor attempts captured in versioned form, side by side, without any switching friction.&lt;/p&gt;
&lt;h2 id="step-5-validate-and-compare-the-results"&gt;Step 5: Validate and Compare the Results&lt;/h2&gt;
&lt;p&gt;Here is where multi-workflows shine. You can run tests, linters, or performance checks in each directory independently and in parallel.&lt;/p&gt;
&lt;p&gt;Back in the main project directory, create a test script:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_stats.py (in project/ and refactor-small/)&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variance&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_all&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;__main__&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;test_all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;✓ All tests passed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Copy this file to both worktrees. For the heavy version, create a different test:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_stats.py (in refactor-heavy/)&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;stats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Stats&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_stats_class&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Stats&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;variance&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;__main__&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;test_stats_class&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;All tests passed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Run tests separately in each worktree:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;project
python&lt;span class="w"&gt; &lt;/span&gt;test_stats.py
&lt;span class="c1"&gt;# Output: All tests passed&lt;/span&gt;

&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;../refactor-small
python&lt;span class="w"&gt; &lt;/span&gt;test_stats.py
&lt;span class="c1"&gt;# Output: All tests passed&lt;/span&gt;

&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;../refactor-heavy
python&lt;span class="w"&gt; &lt;/span&gt;test_stats.py
&lt;span class="c1"&gt;# Output: All tests passed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You get independent validation. If the heavy refactor has bugs, you know immediately without polluting the main line of work.&lt;/p&gt;
&lt;h2 id="step-6-decide-what-to-merge"&gt;Step 6: Decide What to Merge&lt;/h2&gt;
&lt;p&gt;You now have three evolving branches:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;develop&lt;/code&gt; with your original code&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ai/refactor-small&lt;/code&gt; with a careful, stable refactor&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ai/refactor-heavy&lt;/code&gt; with a more ambitious rewrite&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Decision time usually involves:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Code clarity&lt;/li&gt;
&lt;li&gt;Compatibility with existing imports&lt;/li&gt;
&lt;li&gt;Test coverage&lt;/li&gt;
&lt;li&gt;Future extensibility&lt;/li&gt;
&lt;li&gt;Performance&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For many teams, the small refactor becomes a straightforward merge:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;../project&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Return to main worktree&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;switch&lt;span class="w"&gt; &lt;/span&gt;develop
git&lt;span class="w"&gt; &lt;/span&gt;merge&lt;span class="w"&gt; &lt;/span&gt;ai/refactor-small
&lt;span class="c1"&gt;# Output: Fast-forward or Merge made by the &amp;#39;recursive&amp;#39; strategy.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;After merging, run tests again to ensure everything works:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;python&lt;span class="w"&gt; &lt;/span&gt;test_stats.py
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The heavy refactor might need additional time, iteration, or might stay as a long-running branch until more confidence is built. You can keep that worktree around for continued experimentation.&lt;/p&gt;
&lt;h2 id="step-7-cleaning-up"&gt;Step 7: Cleaning Up&lt;/h2&gt;
&lt;p&gt;After merging what you want, remove worktrees you no longer need.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;remove&lt;span class="w"&gt; &lt;/span&gt;../refactor-small
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For branches you want to keep but not actively work on:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Archive the heavy refactor branch for future reference&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;--move&lt;span class="w"&gt; &lt;/span&gt;ai/refactor-heavy&lt;span class="w"&gt; &lt;/span&gt;archive/refactor-heavy-2025-12
git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;remove&lt;span class="w"&gt; &lt;/span&gt;../refactor-heavy
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Or delete the branch entirely if you don't need it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;remove&lt;span class="w"&gt; &lt;/span&gt;../refactor-heavy
git&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;-D&lt;span class="w"&gt; &lt;/span&gt;ai/refactor-heavy
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If Git complains the directory still exists, delete it manually and run:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;rm&lt;span class="w"&gt; &lt;/span&gt;-rf&lt;span class="w"&gt; &lt;/span&gt;../refactor-heavy
git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;prune
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;List remaining worktrees:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;list
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="why-this-workflow-works-so-well"&gt;Why This Workflow Works So Well&lt;/h2&gt;
&lt;p&gt;This approach removes the friction of experimental coding. AI tools prefer entire files or folders as context, and they sometimes rewrite aggressively rather than patch small areas. Worktrees give them dedicated sandboxes, and they give you mental clarity: each branch corresponds to a purpose. You can compare outputs by simply opening two folders side by side instead of juggling switching commands that reset your IDE state.&lt;/p&gt;
&lt;p&gt;Worktrees also encourage discipline. You evaluate AI-generated changes deliberately, not impulsively. You treat each branch as a self-contained proposal. This makes working with powerful automated tools safer, faster, and easier to reason about.&lt;/p&gt;
&lt;p&gt;The biggest advantage is velocity without chaos: several workflows move forward independently, but none interfere with your main development branch. You keep the codebase clean, and you can run experiments all day without ever fearing that an agent will overwrite something you care about.&lt;/p&gt;
&lt;p&gt;If you use AI in your everyday coding, this structure should become second nature. It transforms the way you evaluate ideas, especially when dealing with both small fixes and large rewrites.&lt;/p&gt;
&lt;h2 id="troubleshooting-common-issues"&gt;Troubleshooting Common Issues&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: "fatal: 'ai/refactor-small' is already checked out at..."&lt;br&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: You can't check out the same branch in multiple worktrees. Use different branch names.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Changes appear in the wrong worktree&lt;br&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: Check which directory you're in with &lt;code&gt;pwd&lt;/code&gt;. Verify the branch with &lt;code&gt;git branch --show-current&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: VS Code commits to wrong branch&lt;br&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: Always open the worktree directory as the workspace root, not a parent folder. Check the branch indicator in the bottom-left corner.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Tests fail after AI refactor&lt;br&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: Review AI changes line-by-line. AI tools sometimes introduce subtle bugs. Use &lt;code&gt;git diff&lt;/code&gt; to see exactly what changed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Can't delete worktree&lt;br&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: Close all editors and terminals in that worktree, then use &lt;code&gt;git worktree remove --force&lt;/code&gt; if needed.&lt;/p&gt;
&lt;h2 id="extras"&gt;Extras&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;###&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Using&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;VS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Code&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Multiple&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Worktrees&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Without&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Losing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Your&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Mind&lt;/span&gt;

&lt;span class="nv"&gt;Now&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;comes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;tricky&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;part&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Developers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;get&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;confused&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nv"&gt;worktree&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;setups&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;not&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;because&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Git&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;hard&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;but&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;because&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;editor&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;happily&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;opens&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;whatever&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;folder&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;you&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;point&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;at&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;people&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;forget&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;which&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;workflow&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;they&lt;/span&gt;’&lt;span class="nv"&gt;re&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;editing&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;One&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;tiny&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;commit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;into&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;wrong&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;branch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;your&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;careful&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;separation&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;collapses&lt;/span&gt;.

&lt;span class="nv"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;easiest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;way&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;stay&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;sane&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;open&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;each&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;worktree&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;separate&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;VS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Code&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;window&lt;/span&gt;.

&lt;span class="k"&gt;If&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;you&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;open&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;root&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;project&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;folder&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;start&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;navigating&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;into&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;`.&lt;span class="nv"&gt;worktrees&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;...`,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;you&lt;/span&gt;’&lt;span class="nv"&gt;re&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;setting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;yourself&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;up&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;mistakes&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Instead&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;open&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;worktree&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;directory&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;itself&lt;/span&gt;:

```&lt;span class="nv"&gt;bash&lt;/span&gt;
&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;..&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;project&lt;/span&gt;
&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;..&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;refactor&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nv"&gt;small&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This gives each workflow its own VS Code instance, with its own branch badge in the bottom-left status bar. You never want both instances to point at the same folder tree.&lt;/p&gt;
&lt;p&gt;The moment you enter a file in the wrong window, you’ll see the branch indicator complaining. Git worktree integration in VS Code is surprisingly stable as long as each window corresponds to just one worktree.&lt;/p&gt;</content><category term="note"/><category term="git"/><category term="worktrees"/><category term="ai-coding"/><category term="refactoring"/><category term="workflows"/><category term="vscode"/><category term="productivity"/></entry><entry><title>Using Git Worktrees as Clean Rooms for AI-Assisted Coding</title><link href="https://www.safjan.com/git-worktrees-for-ai-coding/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-11-28T00:00:00+01:00</published><updated>2025-11-28T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-11-28:/git-worktrees-for-ai-coding/</id><summary type="html">&lt;p&gt;Learn how Git worktrees create isolated environments for AI-assisted coding, allowing you to keep your main development line clean while experimenting with AI suggestions in dedicated branches.&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-mental-model"&gt;The Mental Model&lt;/h2&gt;
&lt;p&gt;Git worktrees let a single repository sprout multiple working directories, each tied to a different branch but sharing the same object store. It feels almost like having parallel realities: one directory on &lt;code&gt;develop&lt;/code&gt;, another on a feature branch, yet another on a temporary test branch, all living side by side without fighting for checkout state. There is no deep magic behind this; Git simply keeps the &lt;code&gt;.git&lt;/code&gt; metadata in one place and mounts lightweight directory views that track their own branch heads.&lt;/p&gt;
&lt;p&gt;A tiny example explains the idea better than abstraction ever could:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Add a new worktree checked out to branch feature/refactor&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;../refactor-sandbox&lt;span class="w"&gt; &lt;/span&gt;feature/refactor
&lt;span class="sb"&gt;````&lt;/span&gt;

Your&lt;span class="w"&gt; &lt;/span&gt;filesystem&lt;span class="w"&gt; &lt;/span&gt;now&lt;span class="w"&gt; &lt;/span&gt;holds&lt;span class="w"&gt; &lt;/span&gt;two&lt;span class="w"&gt; &lt;/span&gt;siblings:

__OBSIDIAN_CODEBLOCK_1__

The&lt;span class="w"&gt; &lt;/span&gt;first&lt;span class="w"&gt; &lt;/span&gt;stays&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;line&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;work.&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;second&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;clean,&lt;span class="w"&gt; &lt;/span&gt;isolated&lt;span class="w"&gt; &lt;/span&gt;world&lt;span class="w"&gt; &lt;/span&gt;where&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;AI&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;generate,&lt;span class="w"&gt; &lt;/span&gt;rewrite,&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;destructively&lt;span class="w"&gt; &lt;/span&gt;refactor&lt;span class="w"&gt; &lt;/span&gt;code&lt;span class="w"&gt; &lt;/span&gt;without&lt;span class="w"&gt; &lt;/span&gt;trampling&lt;span class="w"&gt; &lt;/span&gt;over&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;ongoing&lt;span class="w"&gt; &lt;/span&gt;tasks.

Here&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;simplified&lt;span class="w"&gt; &lt;/span&gt;illustration&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;how&lt;span class="w"&gt; &lt;/span&gt;worktrees&lt;span class="w"&gt; &lt;/span&gt;relate&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;underlying&lt;span class="w"&gt; &lt;/span&gt;object&lt;span class="w"&gt; &lt;/span&gt;store:

__OBSIDIAN_CODEBLOCK_2__

This&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;core&lt;span class="w"&gt; &lt;/span&gt;arrangement:&lt;span class="w"&gt; &lt;/span&gt;many&lt;span class="w"&gt; &lt;/span&gt;worktrees,&lt;span class="w"&gt; &lt;/span&gt;one&lt;span class="w"&gt; &lt;/span&gt;shared&lt;span class="w"&gt; &lt;/span&gt;repository&lt;span class="w"&gt; &lt;/span&gt;brain.

&lt;span class="c1"&gt;## What You Can Do&lt;/span&gt;

The&lt;span class="w"&gt; &lt;/span&gt;real&lt;span class="w"&gt; &lt;/span&gt;advantage&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;worktrees&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;basic&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;but&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;freedom&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;running&lt;span class="w"&gt; &lt;/span&gt;multiple&lt;span class="w"&gt; &lt;/span&gt;contexts&lt;span class="w"&gt; &lt;/span&gt;at&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;same&lt;span class="w"&gt; &lt;/span&gt;time.&lt;span class="w"&gt; &lt;/span&gt;This&lt;span class="w"&gt; &lt;/span&gt;becomes&lt;span class="w"&gt; &lt;/span&gt;especially&lt;span class="w"&gt; &lt;/span&gt;useful&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;coding&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;agents&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;expect&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;clean&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;stable&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;state.

You&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;place&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;AI&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;dedicated&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;operate&lt;span class="w"&gt; &lt;/span&gt;as&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;were&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;only&lt;span class="w"&gt; &lt;/span&gt;thing&lt;span class="w"&gt; &lt;/span&gt;happening&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;project.&lt;span class="w"&gt; &lt;/span&gt;Meanwhile,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;normal&lt;span class="w"&gt; &lt;/span&gt;development&lt;span class="w"&gt; &lt;/span&gt;somewhere&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;without&lt;span class="w"&gt; &lt;/span&gt;stashing,&lt;span class="w"&gt; &lt;/span&gt;switching,&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;cleaning&lt;span class="w"&gt; &lt;/span&gt;up.

__OBSIDIAN_CODEBLOCK_3__

The&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;gets&lt;span class="w"&gt; &lt;/span&gt;its&lt;span class="w"&gt; &lt;/span&gt;own&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;folder,&lt;span class="w"&gt; &lt;/span&gt;untouched&lt;span class="w"&gt; &lt;/span&gt;by&lt;span class="w"&gt; &lt;/span&gt;half-written&lt;span class="w"&gt; &lt;/span&gt;ideas&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;files&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;forgot&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;commit.&lt;span class="w"&gt; &lt;/span&gt;When&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;produces&lt;span class="w"&gt; &lt;/span&gt;large&lt;span class="w"&gt; &lt;/span&gt;structural&lt;span class="w"&gt; &lt;/span&gt;changes,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;evaluate&lt;span class="w"&gt; &lt;/span&gt;them&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;isolation&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;merge&lt;span class="w"&gt; &lt;/span&gt;only&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;convinced&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;proposal&lt;span class="w"&gt; &lt;/span&gt;makes&lt;span class="w"&gt; &lt;/span&gt;sense.

You&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;go&lt;span class="w"&gt; &lt;/span&gt;further&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;up&lt;span class="w"&gt; &lt;/span&gt;multiple&lt;span class="w"&gt; &lt;/span&gt;competing&lt;span class="w"&gt; &lt;/span&gt;proposals:

__OBSIDIAN_CODEBLOCK_4__

Agents&lt;span class="w"&gt; &lt;/span&gt;operate&lt;span class="w"&gt; &lt;/span&gt;freely.&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;compare&lt;span class="w"&gt; &lt;/span&gt;their&lt;span class="w"&gt; &lt;/span&gt;results&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;human&lt;span class="w"&gt; &lt;/span&gt;eyes.

&lt;span class="c1"&gt;## Practical Scenarios&lt;/span&gt;

This&lt;span class="w"&gt; &lt;/span&gt;workflow&lt;span class="w"&gt; &lt;/span&gt;shines&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;working&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;tend&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;rewrite&lt;span class="w"&gt; &lt;/span&gt;code&lt;span class="w"&gt; &lt;/span&gt;rather&lt;span class="w"&gt; &lt;/span&gt;than&lt;span class="w"&gt; &lt;/span&gt;patch&lt;span class="w"&gt; &lt;/span&gt;it.&lt;span class="w"&gt; &lt;/span&gt;Many&lt;span class="w"&gt; &lt;/span&gt;agents&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;this.&lt;span class="w"&gt; &lt;/span&gt;A&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;acts&lt;span class="w"&gt; &lt;/span&gt;like&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;disposable&lt;span class="w"&gt; &lt;/span&gt;environment.

You&lt;span class="w"&gt; &lt;/span&gt;might&lt;span class="w"&gt; &lt;/span&gt;use&lt;span class="w"&gt; &lt;/span&gt;them&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;engaging&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;heavy&lt;span class="w"&gt; &lt;/span&gt;refactors,&lt;span class="w"&gt; &lt;/span&gt;upgrades&lt;span class="w"&gt; &lt;/span&gt;across&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;whole&lt;span class="w"&gt; &lt;/span&gt;codebase,&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;exploratory&lt;span class="w"&gt; &lt;/span&gt;changes&lt;span class="w"&gt; &lt;/span&gt;generated&lt;span class="w"&gt; &lt;/span&gt;by&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;agent,&lt;span class="w"&gt; &lt;/span&gt;where&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;risk&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;polluting&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;current&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;high.&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;tests&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;static&lt;span class="w"&gt; &lt;/span&gt;checks&lt;span class="w"&gt; &lt;/span&gt;inside&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;environment&lt;span class="w"&gt; &lt;/span&gt;as&lt;span class="w"&gt; &lt;/span&gt;well,&lt;span class="w"&gt; &lt;/span&gt;because&lt;span class="w"&gt; &lt;/span&gt;nothing&lt;span class="w"&gt; &lt;/span&gt;touches&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;working&lt;span class="w"&gt; &lt;/span&gt;directory.

There&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;situations&lt;span class="w"&gt; &lt;/span&gt;where&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;good&lt;span class="w"&gt; &lt;/span&gt;fit.&lt;span class="w"&gt; &lt;/span&gt;If&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;working&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;simple&lt;span class="w"&gt; &lt;/span&gt;repository&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;single&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;rely&lt;span class="w"&gt; &lt;/span&gt;heavily&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;ephemeral&lt;span class="w"&gt; &lt;/span&gt;changes,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;switching&lt;span class="w"&gt; &lt;/span&gt;branches&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;using&lt;span class="w"&gt; &lt;/span&gt;commit-based&lt;span class="w"&gt; &lt;/span&gt;snapshots&lt;span class="w"&gt; &lt;/span&gt;might&lt;span class="w"&gt; &lt;/span&gt;be&lt;span class="w"&gt; &lt;/span&gt;simpler.&lt;span class="w"&gt; &lt;/span&gt;Also,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;expects&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;operate&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;fully&lt;span class="w"&gt; &lt;/span&gt;isolated&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;rather&lt;span class="w"&gt; &lt;/span&gt;than&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;shared&lt;span class="w"&gt; &lt;/span&gt;store,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;real&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;safer.&lt;span class="w"&gt; &lt;/span&gt;Some&lt;span class="w"&gt; &lt;/span&gt;poorly&lt;span class="w"&gt; &lt;/span&gt;implemented&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;still&lt;span class="w"&gt; &lt;/span&gt;assume&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.git&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;lives&lt;span class="w"&gt; &lt;/span&gt;inside&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;working&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;those&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;layout.

A&lt;span class="w"&gt; &lt;/span&gt;real-world&lt;span class="w"&gt; &lt;/span&gt;example:&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;have&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;large&lt;span class="w"&gt; &lt;/span&gt;Python&lt;span class="w"&gt; &lt;/span&gt;codebase.&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;attempt&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;dependency&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;across&lt;span class="w"&gt; &lt;/span&gt;many&lt;span class="w"&gt; &lt;/span&gt;modules.&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;fresh&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;called&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;upgrade/fastapi-0.115&lt;span class="sb"&gt;`&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;runs&lt;span class="w"&gt; &lt;/span&gt;its&lt;span class="w"&gt; &lt;/span&gt;rewrite&lt;span class="w"&gt; &lt;/span&gt;steps&lt;span class="w"&gt; &lt;/span&gt;there.&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;stay&lt;span class="w"&gt; &lt;/span&gt;focused&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;bug&lt;span class="w"&gt; &lt;/span&gt;fix&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;develop&lt;span class="sb"&gt;`&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;undisturbed.&lt;span class="w"&gt; &lt;/span&gt;Later,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;review&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;agent’s&lt;span class="w"&gt; &lt;/span&gt;output,&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;tests,&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;only&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;merge.&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;separation&lt;span class="w"&gt; &lt;/span&gt;reduces&lt;span class="w"&gt; &lt;/span&gt;mistakes.

&lt;span class="c1"&gt;## The Fine Print&lt;/span&gt;

There&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;few&lt;span class="w"&gt; &lt;/span&gt;subtle&lt;span class="w"&gt; &lt;/span&gt;behaviors&lt;span class="w"&gt; &lt;/span&gt;worth&lt;span class="w"&gt; &lt;/span&gt;knowing&lt;span class="w"&gt; &lt;/span&gt;before&lt;span class="w"&gt; &lt;/span&gt;using&lt;span class="w"&gt; &lt;/span&gt;worktrees&lt;span class="w"&gt; &lt;/span&gt;as&lt;span class="w"&gt; &lt;/span&gt;part&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;AI-assisted&lt;span class="w"&gt; &lt;/span&gt;coding&lt;span class="w"&gt; &lt;/span&gt;workflow.&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;first&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;each&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;tracks&lt;span class="w"&gt; &lt;/span&gt;its&lt;span class="w"&gt; &lt;/span&gt;own&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;head.&lt;span class="w"&gt; &lt;/span&gt;If&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;forget&lt;span class="w"&gt; &lt;/span&gt;which&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;easy&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;wrong&lt;span class="w"&gt; &lt;/span&gt;branch.&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;fix&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;simple:&lt;span class="w"&gt; &lt;/span&gt;always&lt;span class="w"&gt; &lt;/span&gt;inspect&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;status&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;before&lt;span class="w"&gt; &lt;/span&gt;committing,&lt;span class="w"&gt; &lt;/span&gt;especially&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;moving&lt;span class="w"&gt; &lt;/span&gt;between&lt;span class="w"&gt; &lt;/span&gt;folders.

Another&lt;span class="w"&gt; &lt;/span&gt;nuance&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;deletion&lt;span class="w"&gt; &lt;/span&gt;cycle.&lt;span class="w"&gt; &lt;/span&gt;Removing&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;requires&lt;span class="w"&gt; &lt;/span&gt;both&lt;span class="w"&gt; &lt;/span&gt;removing&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;folder&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;cleaning&lt;span class="w"&gt; &lt;/span&gt;up&lt;span class="w"&gt; &lt;/span&gt;Git’s&lt;span class="w"&gt; &lt;/span&gt;internal&lt;span class="w"&gt; &lt;/span&gt;registry.&lt;span class="w"&gt; &lt;/span&gt;If&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;AI&lt;span class="w"&gt; &lt;/span&gt;tool&lt;span class="w"&gt; &lt;/span&gt;aggressively&lt;span class="w"&gt; &lt;/span&gt;deletes&lt;span class="w"&gt; &lt;/span&gt;directories,&lt;span class="w"&gt; &lt;/span&gt;Git&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;confused&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;until&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;worktree&lt;span class="w"&gt; &lt;/span&gt;prune&lt;span class="sb"&gt;`&lt;/span&gt;.

Performance&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;usually&lt;span class="w"&gt; &lt;/span&gt;excellent,&lt;span class="w"&gt; &lt;/span&gt;since&lt;span class="w"&gt; &lt;/span&gt;objects&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;shared,&lt;span class="w"&gt; &lt;/span&gt;but&lt;span class="w"&gt; &lt;/span&gt;some&lt;span class="w"&gt; &lt;/span&gt;agent&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;expect&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.git&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;be&lt;span class="w"&gt; &lt;/span&gt;physically&lt;span class="w"&gt; &lt;/span&gt;inside&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;folder.&lt;span class="w"&gt; &lt;/span&gt;Worktrees&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.git&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;file&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;points&lt;span class="w"&gt; &lt;/span&gt;elsewhere.&lt;span class="w"&gt; &lt;/span&gt;When&lt;span class="w"&gt; &lt;/span&gt;such&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;fail,&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;workaround&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;copy&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.git&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;repo&lt;span class="w"&gt; &lt;/span&gt;into&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;worktree.&lt;span class="w"&gt; &lt;/span&gt;This&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;safe&lt;span class="w"&gt; &lt;/span&gt;only&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;debugging&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;correct&lt;span class="w"&gt; &lt;/span&gt;fix&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;use&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;support&lt;span class="w"&gt; &lt;/span&gt;worktrees&lt;span class="w"&gt; &lt;/span&gt;properly.

Here&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;example&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;what&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;:

&lt;span class="sb"&gt;```&lt;/span&gt;bash
&lt;span class="c1"&gt;# Wrong: manually replacing the pointer file&lt;/span&gt;
rm&lt;span class="w"&gt; &lt;/span&gt;agent-run/.git
cp&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="w"&gt; &lt;/span&gt;project/.git&lt;span class="w"&gt; &lt;/span&gt;agent-run/.git&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Bad&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This produces a corrupted setup. The correct approach is to use tools that understand Git’s layout or run them in the primary worktree.&lt;/p&gt;
&lt;p&gt;Another detail: worktrees are lightweight but they do not isolate installed dependencies, temporary files, or build artefacts. If your agent modifies environment-level resources, you still need separate virtual environments. Using per-worktree environments avoids cross-contamination:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;python3&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;venv&lt;span class="w"&gt; &lt;/span&gt;venv
&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;venv/bin/activate
pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="w"&gt; &lt;/span&gt;requirements.txt
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Finally, merging large agent-generated changes becomes easier because diffs are clean and self-contained. The trade-off is that you must resist the temptation to accept everything wholesale. A worktree encourages discipline: treat the agent’s output as a draft, not a command.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: blindly accepting generated rewrite&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Lost validation&lt;/span&gt;

&lt;span class="c1"&gt;# Good: incorporate intention, keep safeguards&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;User not found&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here is the decision flow in a format Mermaid will accept:&lt;/p&gt;
&lt;pre class="mermaid"&gt;
flowchart TD
    A[Start AI task]
    B{Create worktree?}
    C[Isolated branch and folder]
    D[Agent operates safely]
    E[Human review]
    F{Accept changes?}
    G[Merge into main]
    H[Delete worktree]
    I[Risk mixing changes]

    A --&gt; B
    B --&gt; C
    B --&gt; I
    C --&gt; D
    D --&gt; E
    E --&gt; F
    F --&gt; G
    F --&gt; H
&lt;/pre&gt;

&lt;h2 id="see-also"&gt;See also:&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://git-scm.com/docs/git-worktree"&gt;Git - git-worktree Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Step-by-step tutorial to practice working with multiple worktrees &lt;a href="https://www.safjan.com/git-worktrees-ai-multi-workflows/"&gt;Working Faster with Git Worktrees and AI-Based Multi-Workflow Development&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;script type="module"&gt;
    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
    mermaid.initialize({ startOnLoad: true });
&lt;/script&gt;</content><category term="note"/><category term="git"/><category term="workflows"/><category term="agents"/><category term="ai"/><category term="refactoring"/></entry><entry><title>Avoiding Homebrew Upgrades That Require Sudo on macOS</title><link href="https://www.safjan.com/avoid-homebrew-sudo-upgrades/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-31T00:00:00+01:00</published><updated>2025-10-31T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-31:/avoid-homebrew-sudo-upgrades/</id><summary type="html">&lt;p&gt;Learn how to manage Homebrew upgrades on macOS to avoid sudo prompts, by differentiating between formulae and casks and using specific commands like &lt;code&gt;--formula&lt;/code&gt;, &lt;code&gt;--cask&lt;/code&gt;, and pinning problematic packages.&lt;/p&gt;</summary><content type="html">&lt;h2 id="understanding-the-problem"&gt;Understanding the Problem&lt;/h2&gt;
&lt;p&gt;Homebrew is designed to install software without root privileges. Most formulae live peacefully under &lt;code&gt;/usr/local&lt;/code&gt; or &lt;code&gt;/opt/homebrew&lt;/code&gt;, updating silently. Yet sometimes you hit an upgrade that stops mid-flow asking for your password. This happens when the formula is actually a &lt;strong&gt;cask&lt;/strong&gt; that installs via a macOS &lt;code&gt;.pkg&lt;/code&gt; file. Those installers, like &lt;code&gt;dotnet-sdk&lt;/code&gt;, Java, or Docker, need admin rights to write under &lt;code&gt;/Library&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Running &lt;code&gt;brew update &amp;amp;&amp;amp; brew upgrade&lt;/code&gt; in automation, or even during a morning maintenance routine, becomes painful if one package demands manual confirmation.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="o"&gt;==&lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;Upgrading&lt;span class="w"&gt; &lt;/span&gt;dotnet-sdk
&lt;span class="o"&gt;==&lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;Uninstalling&lt;span class="w"&gt; &lt;/span&gt;packages&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;sudo&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;which&lt;span class="w"&gt; &lt;/span&gt;may&lt;span class="w"&gt; &lt;/span&gt;request&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;password&lt;span class="o"&gt;)&lt;/span&gt;...
Password:
&lt;span class="sb"&gt;````&lt;/span&gt;

The&lt;span class="w"&gt; &lt;/span&gt;challenge&lt;span class="w"&gt; &lt;/span&gt;is:&lt;span class="w"&gt; &lt;/span&gt;how&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Homebrew&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;everything&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;_except_&lt;span class="w"&gt; &lt;/span&gt;those&lt;span class="w"&gt; &lt;/span&gt;sudo-needy&lt;span class="w"&gt; &lt;/span&gt;packages?

&lt;span class="c1"&gt;## How Homebrew Handles Upgrades&lt;/span&gt;

The&lt;span class="w"&gt; &lt;/span&gt;key&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;understanding&lt;span class="w"&gt; &lt;/span&gt;this&lt;span class="w"&gt; &lt;/span&gt;lies&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;distinction&lt;span class="w"&gt; &lt;/span&gt;between&lt;span class="w"&gt; &lt;/span&gt;**formulae**&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;**casks**.

-&lt;span class="w"&gt; &lt;/span&gt;**Formulae**&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;built&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;precompiled&lt;span class="w"&gt; &lt;/span&gt;binaries&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;locally.

-&lt;span class="w"&gt; &lt;/span&gt;**Casks**&lt;span class="w"&gt; &lt;/span&gt;wrap&lt;span class="w"&gt; &lt;/span&gt;macOS&lt;span class="w"&gt; &lt;/span&gt;installers&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.pkg&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.dmg&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;often&lt;span class="w"&gt; &lt;/span&gt;using&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;sudo&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;under&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;hood.


__OBSIDIAN_CODEBLOCK_1__

By&lt;span class="w"&gt; &lt;/span&gt;splitting&lt;span class="w"&gt; &lt;/span&gt;upgrades&lt;span class="w"&gt; &lt;/span&gt;this&lt;span class="w"&gt; &lt;/span&gt;way,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;formula&lt;span class="w"&gt; &lt;/span&gt;upgrades&lt;span class="w"&gt; &lt;/span&gt;safely&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;delay&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;cask&lt;span class="w"&gt; &lt;/span&gt;upgrades&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;you’re&lt;span class="w"&gt; &lt;/span&gt;ready&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;enter&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;password.


&lt;span class="c1"&gt;## Useful Patterns for Controlling Upgrades&lt;/span&gt;

There&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;few&lt;span class="w"&gt; &lt;/span&gt;practical&lt;span class="w"&gt; &lt;/span&gt;patterns&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;help&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;avoid&lt;span class="w"&gt; &lt;/span&gt;password&lt;span class="w"&gt; &lt;/span&gt;prompts&lt;span class="w"&gt; &lt;/span&gt;without&lt;span class="w"&gt; &lt;/span&gt;breaking&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;update&lt;span class="w"&gt; &lt;/span&gt;flow.

&lt;span class="c1"&gt;### 1. Pin the problem packages&lt;/span&gt;

Pinning&lt;span class="w"&gt; &lt;/span&gt;prevents&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;package&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;being&lt;span class="w"&gt; &lt;/span&gt;upgraded&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;cleaned&lt;span class="w"&gt; &lt;/span&gt;up.&lt;span class="w"&gt; &lt;/span&gt;It’s&lt;span class="w"&gt; &lt;/span&gt;ideal&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;hold&lt;span class="w"&gt; &lt;/span&gt;off&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;upgrades&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;known&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;require&lt;span class="w"&gt; &lt;/span&gt;admin&lt;span class="w"&gt; &lt;/span&gt;privileges.

__OBSIDIAN_CODEBLOCK_2__

You&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;unpin&lt;span class="w"&gt; &lt;/span&gt;later&lt;span class="w"&gt; &lt;/span&gt;with:

__OBSIDIAN_CODEBLOCK_3__

Pinned&lt;span class="w"&gt; &lt;/span&gt;packages&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;skipped&lt;span class="w"&gt; &lt;/span&gt;automatically&lt;span class="w"&gt; &lt;/span&gt;by&lt;span class="w"&gt; &lt;/span&gt;both&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;brew&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;brew&lt;span class="w"&gt; &lt;/span&gt;cleanup&lt;span class="sb"&gt;`&lt;/span&gt;.

&lt;span class="c1"&gt;### 2. Skip casks entirely&lt;/span&gt;

If&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;script&lt;span class="w"&gt; &lt;/span&gt;doesn’t&lt;span class="w"&gt; &lt;/span&gt;need&lt;span class="w"&gt; &lt;/span&gt;GUI&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;frameworks,&lt;span class="w"&gt; &lt;/span&gt;this&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;simplest&lt;span class="w"&gt; &lt;/span&gt;fix:

__OBSIDIAN_CODEBLOCK_4__

No&lt;span class="w"&gt; &lt;/span&gt;sudo&lt;span class="w"&gt; &lt;/span&gt;prompts,&lt;span class="w"&gt; &lt;/span&gt;no&lt;span class="w"&gt; &lt;/span&gt;surprises.

&lt;span class="c1"&gt;### 3. Filter out known offenders dynamically&lt;/span&gt;

If&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;more&lt;span class="w"&gt; &lt;/span&gt;flexible&lt;span class="w"&gt; &lt;/span&gt;approach,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;automate&lt;span class="w"&gt; &lt;/span&gt;selective&lt;span class="w"&gt; &lt;/span&gt;upgrading&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;simple&lt;span class="w"&gt; &lt;/span&gt;shell&lt;span class="w"&gt; &lt;/span&gt;loop:

__OBSIDIAN_CODEBLOCK_5__

This&lt;span class="w"&gt; &lt;/span&gt;simulates&lt;span class="w"&gt; &lt;/span&gt;each&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;first.&lt;span class="w"&gt; &lt;/span&gt;Anything&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;would&lt;span class="w"&gt; &lt;/span&gt;invoke&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;sudo&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;skipped.


&lt;span class="c1"&gt;## When to Use These Approaches&lt;/span&gt;

&lt;span class="c1"&gt;### When It Makes Sense&lt;/span&gt;

-&lt;span class="w"&gt; &lt;/span&gt;You’re&lt;span class="w"&gt; &lt;/span&gt;automating&lt;span class="w"&gt; &lt;/span&gt;Homebrew&lt;span class="w"&gt; &lt;/span&gt;upgrades&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;CI&lt;span class="w"&gt; &lt;/span&gt;job,&lt;span class="w"&gt; &lt;/span&gt;scheduled&lt;span class="w"&gt; &lt;/span&gt;task,&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;startup&lt;span class="w"&gt; &lt;/span&gt;script.
-&lt;span class="w"&gt; &lt;/span&gt;You’re&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;work&lt;span class="w"&gt; &lt;/span&gt;laptop&lt;span class="w"&gt; &lt;/span&gt;where&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;don’t&lt;span class="w"&gt; &lt;/span&gt;have&lt;span class="w"&gt; &lt;/span&gt;admin&lt;span class="w"&gt; &lt;/span&gt;rights.
-&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;keep&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;local&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;developer&lt;span class="w"&gt; &lt;/span&gt;tools&lt;span class="w"&gt; &lt;/span&gt;up&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;date&lt;span class="w"&gt; &lt;/span&gt;but&lt;span class="w"&gt; &lt;/span&gt;prefer&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;manually&lt;span class="w"&gt; &lt;/span&gt;review&lt;span class="w"&gt; &lt;/span&gt;framework-level&lt;span class="w"&gt; &lt;/span&gt;changes&lt;span class="w"&gt; &lt;/span&gt;like&lt;span class="w"&gt; &lt;/span&gt;.NET&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;Java.

&lt;span class="c1"&gt;### When Not to Use&lt;/span&gt;

-&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;manage&lt;span class="w"&gt; &lt;/span&gt;system-wide&lt;span class="w"&gt; &lt;/span&gt;development&lt;span class="w"&gt; &lt;/span&gt;environments&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;actually&lt;span class="w"&gt; &lt;/span&gt;_need_&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;latest&lt;span class="w"&gt; &lt;/span&gt;frameworks&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;compatibility.
-&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;rely&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;full&lt;span class="w"&gt; &lt;/span&gt;cask&lt;span class="w"&gt; &lt;/span&gt;ecosystem&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;e.g.,&lt;span class="w"&gt; &lt;/span&gt;GUI&lt;span class="w"&gt; &lt;/span&gt;apps&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;okay&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;interactive&lt;span class="w"&gt; &lt;/span&gt;upgrades.

&lt;span class="c1"&gt;## Things to Watch Out For&lt;/span&gt;

Skipping&lt;span class="w"&gt; &lt;/span&gt;casks&lt;span class="w"&gt; &lt;/span&gt;means&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;may&lt;span class="w"&gt; &lt;/span&gt;fall&lt;span class="w"&gt; &lt;/span&gt;behind&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;security&lt;span class="w"&gt; &lt;/span&gt;updates&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;those&lt;span class="w"&gt; &lt;/span&gt;apps.&lt;span class="w"&gt; &lt;/span&gt;If&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;pin&lt;span class="w"&gt; &lt;/span&gt;something,&lt;span class="w"&gt; &lt;/span&gt;remember&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;unpin&lt;span class="w"&gt; &lt;/span&gt;periodically&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;check&lt;span class="w"&gt; &lt;/span&gt;whether&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;new&lt;span class="w"&gt; &lt;/span&gt;version&lt;span class="w"&gt; &lt;/span&gt;resolves&lt;span class="w"&gt; &lt;/span&gt;past&lt;span class="w"&gt; &lt;/span&gt;issues.

Also,&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;every&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.pkg&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;installer&lt;span class="w"&gt; &lt;/span&gt;actually&lt;span class="w"&gt; &lt;/span&gt;uses&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;sudo&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;every&lt;span class="w"&gt; &lt;/span&gt;time—it&lt;span class="w"&gt; &lt;/span&gt;depends&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;how&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;vendor&lt;span class="w"&gt; &lt;/span&gt;packaged&lt;span class="w"&gt; &lt;/span&gt;it.&lt;span class="w"&gt; &lt;/span&gt;Sometimes&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;version&lt;span class="w"&gt; &lt;/span&gt;update&lt;span class="w"&gt; &lt;/span&gt;switches&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;simple&lt;span class="w"&gt; &lt;/span&gt;binary&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;.pkg&lt;span class="sb"&gt;`&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;suddenly&lt;span class="w"&gt; &lt;/span&gt;introducing&lt;span class="w"&gt; &lt;/span&gt;password&lt;span class="w"&gt; &lt;/span&gt;prompts.&lt;span class="w"&gt; &lt;/span&gt;You’ll&lt;span class="w"&gt; &lt;/span&gt;notice&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;quickly&lt;span class="w"&gt; &lt;/span&gt;because&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;automation&lt;span class="w"&gt; &lt;/span&gt;will&lt;span class="w"&gt; &lt;/span&gt;hang.

Here’s&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;mental&lt;span class="w"&gt; &lt;/span&gt;model&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;decide:

__OBSIDIAN_CODEBLOCK_6__

The&lt;span class="w"&gt; &lt;/span&gt;trick&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;control&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;boundary&lt;span class="w"&gt; &lt;/span&gt;between&lt;span class="w"&gt; &lt;/span&gt;convenience&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;responsibility.&lt;span class="w"&gt; &lt;/span&gt;Homebrew&lt;span class="w"&gt; &lt;/span&gt;gives&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;tools—&lt;span class="sb"&gt;`&lt;/span&gt;--formula&lt;span class="sb"&gt;`&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--cask&lt;span class="sb"&gt;`&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;pin&lt;span class="sb"&gt;`&lt;/span&gt;—but&lt;span class="w"&gt; &lt;/span&gt;expects&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;use&lt;span class="w"&gt; &lt;/span&gt;them&lt;span class="w"&gt; &lt;/span&gt;deliberately.

&lt;span class="c1"&gt;## The Fine Print&lt;/span&gt;

There’s&lt;span class="w"&gt; &lt;/span&gt;no&lt;span class="w"&gt; &lt;/span&gt;hidden&lt;span class="w"&gt; &lt;/span&gt;flag&lt;span class="w"&gt; &lt;/span&gt;like&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--ignore-sudo&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Homebrew’s&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;set.&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;have&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;rely&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;composition:&lt;span class="w"&gt; &lt;/span&gt;limiting&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;scope&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;pinning&lt;span class="w"&gt; &lt;/span&gt;selectively.&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;nice&lt;span class="w"&gt; &lt;/span&gt;part&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;all&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;this&lt;span class="w"&gt; &lt;/span&gt;integrates&lt;span class="w"&gt; &lt;/span&gt;cleanly&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;daily&lt;span class="w"&gt; &lt;/span&gt;maintenance&lt;span class="w"&gt; &lt;/span&gt;habits.

A&lt;span class="w"&gt; &lt;/span&gt;typical&lt;span class="w"&gt; &lt;/span&gt;non-interactive&lt;span class="w"&gt; &lt;/span&gt;safe&lt;span class="w"&gt; &lt;/span&gt;sequence&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;macOS&lt;span class="w"&gt; &lt;/span&gt;might&lt;span class="w"&gt; &lt;/span&gt;look&lt;span class="w"&gt; &lt;/span&gt;like&lt;span class="w"&gt; &lt;/span&gt;this:

&lt;span class="sb"&gt;```&lt;/span&gt;bash
brew&lt;span class="w"&gt; &lt;/span&gt;update
brew&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;--formula
brew&lt;span class="w"&gt; &lt;/span&gt;cleanup
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then, once a week or month, handle casks manually:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;brew&lt;span class="w"&gt; &lt;/span&gt;upgrade&lt;span class="w"&gt; &lt;/span&gt;--cask
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This rhythm keeps your CLI tools fresh while letting you decide when to deal with packages that need elevated rights. It’s a small adjustment that turns Homebrew into a far smoother experience on macOS.&lt;/p&gt;</content><category term="note"/><category term="homebrew"/><category term="macos"/><category term="automation"/><category term="dev-tools"/></entry><entry><title>Understanding Python Protocols - Structural Subtyping in Practice</title><link href="https://www.safjan.com/python-protocols-structural-subtyping/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-29T00:00:00+01:00</published><updated>2025-10-29T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-29:/python-protocols-structural-subtyping/</id><summary type="html">&lt;p&gt;Learn how Python protocols using structural subtyping in PEP 544 allow for flexible, decoupled code by defining shapes through method signatures rather than class inheritance. Explore examples of implementing read-only attributes and optional methods within protocols.&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-basic-idea"&gt;The Basic Idea&lt;/h2&gt;
&lt;p&gt;Introduced in &lt;strong&gt;Python 3.8&lt;/strong&gt; via &lt;strong&gt;&lt;a href="https://peps.python.org/pep-0544/"&gt;PEP 544&lt;/a&gt;&lt;/strong&gt;, &lt;em&gt;Protocols&lt;/em&gt; extend Python’s static typing system by allowing &lt;strong&gt;structural subtyping&lt;/strong&gt; — also called &lt;em&gt;static duck typing&lt;/em&gt;.&lt;br&gt;
It means that instead of checking &lt;em&gt;what a class inherits from&lt;/em&gt;, Python type checkers can verify &lt;em&gt;what methods and attributes&lt;/em&gt; it provides.&lt;/p&gt;
&lt;p&gt;Let’s unpack this idea with a simple example.&lt;/p&gt;
&lt;p&gt;```python
from typing import Protocol&lt;/p&gt;
&lt;h1 id="define-a-protocol-describing-the-shape-of-a-writer"&gt;Define a protocol describing the "shape" of a writer&lt;/h1&gt;
&lt;p&gt;class Writer(Protocol):
    def write(self, text: str) -&amp;gt; None:
        ...&lt;/p&gt;
&lt;h1 id="any-class-implementing-write-will-satisfy-this-protocol"&gt;Any class implementing 'write' will satisfy this protocol&lt;/h1&gt;
&lt;p&gt;class FileWriter:
    def write(self, text: str) -&amp;gt; None:
        print(f"Writing to file: {text}")&lt;/p&gt;
&lt;p&gt;class ConsoleLogger:
    def write(self, text: str) -&amp;gt; None:
        print(f"LOG: {text}")&lt;/p&gt;
&lt;p&gt;def log_message(writer: Writer, msg: str) -&amp;gt; None:
    writer.write(msg)&lt;/p&gt;
&lt;p&gt;log_message(ConsoleLogger(), "Hello")  # OK
log_message(FileWriter(), "Hello")     # OK
````&lt;/p&gt;
&lt;p&gt;Neither &lt;code&gt;FileWriter&lt;/code&gt; nor &lt;code&gt;ConsoleLogger&lt;/code&gt; inherit from &lt;code&gt;Writer&lt;/code&gt;.&lt;br&gt;
But both &lt;em&gt;structurally match&lt;/em&gt; it — they have a &lt;code&gt;write&lt;/code&gt; method with the same signature.&lt;br&gt;
This is the essence of protocols: &lt;strong&gt;“If it quacks like a duck, type checkers treat it as a duck.”&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="useful-patterns"&gt;Useful Patterns&lt;/h2&gt;
&lt;p&gt;Protocols aren’t just about pretending to be an interface. They unlock new expressive power for static type hints and help keep your code decoupled. Let’s explore some common variations.&lt;/p&gt;
&lt;h3 id="read-only-attributes"&gt;Read-only Attributes&lt;/h3&gt;
&lt;p&gt;You can describe expected attributes directly in a protocol.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_1&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here, any object with a &lt;code&gt;name&lt;/code&gt; attribute of type &lt;code&gt;str&lt;/code&gt; will pass the type check.&lt;/p&gt;
&lt;h3 id="optional-and-partial-protocols"&gt;Optional and Partial Protocols&lt;/h3&gt;
&lt;p&gt;Sometimes you don’t need to specify all possible methods — only those relevant to your function.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_2&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This works beautifully with many built-in classes (&lt;code&gt;io.TextIOWrapper&lt;/code&gt;, sockets, etc.).&lt;/p&gt;
&lt;h3 id="runtime-checkable-protocols"&gt;Runtime Checkable Protocols&lt;/h3&gt;
&lt;p&gt;By default, &lt;code&gt;isinstance()&lt;/code&gt; and &lt;code&gt;issubclass()&lt;/code&gt; don’t work with protocols.&lt;br&gt;
You can enable that explicitly using &lt;code&gt;@runtime_checkable&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_3&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Be cautious, though — runtime checking only verifies that a class &lt;em&gt;explicitly&lt;/em&gt; implements the methods (not dynamically added ones).&lt;/p&gt;
&lt;h2 id="where-it-shines"&gt;Where It Shines&lt;/h2&gt;
&lt;p&gt;Protocols shine in situations where &lt;strong&gt;you want flexibility without inheritance&lt;/strong&gt; — particularly in large systems or when using third-party libraries.&lt;/p&gt;
&lt;h3 id="when-to-use"&gt;When to Use&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;When you want to specify &lt;em&gt;expected behavior&lt;/em&gt; without forcing inheritance.&lt;/li&gt;
&lt;li&gt;When integrating code from multiple libraries that weren’t designed to work together.&lt;/li&gt;
&lt;li&gt;When designing APIs that rely on capabilities (“things that can write”, “things that can close”, etc.) rather than hierarchies.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="when-not-to-use"&gt;When Not to Use&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;When you actually need &lt;strong&gt;shared implementation&lt;/strong&gt; (then a base class is better).&lt;/li&gt;
&lt;li&gt;When your codebase doesn’t use static type checking tools like &lt;strong&gt;mypy&lt;/strong&gt;, &lt;strong&gt;pyright&lt;/strong&gt;, or &lt;strong&gt;pylance&lt;/strong&gt; — the benefits won’t materialize at runtime.&lt;/li&gt;
&lt;li&gt;When the protocol would describe a highly dynamic or runtime-altered interface.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="example-logging-abstraction"&gt;Example: Logging Abstraction&lt;/h3&gt;
&lt;p&gt;Let’s look at a practical scenario. Suppose you have a service that writes logs, but you want to support multiple backends.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_4&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You’ve decoupled your code from concrete classes — that’s clean design.&lt;/p&gt;
&lt;h2 id="the-fine-print"&gt;The Fine Print&lt;/h2&gt;
&lt;p&gt;Protocols are deceptively simple but have a few subtleties worth knowing.&lt;/p&gt;
&lt;h3 id="structural-vs-nominal-subtyping"&gt;Structural vs Nominal Subtyping&lt;/h3&gt;
&lt;p&gt;Traditional inheritance in Python is &lt;strong&gt;nominal&lt;/strong&gt;: types are related by declared lineage.&lt;br&gt;
Protocols enable &lt;strong&gt;structural subtyping&lt;/strong&gt;: types are compatible based on structure.&lt;/p&gt;
&lt;p&gt;The diagram below shows the conceptual difference.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_5&lt;/strong&gt;&lt;/p&gt;
&lt;h3 id="generic-protocols"&gt;Generic Protocols&lt;/h3&gt;
&lt;p&gt;You can make protocols generic to express relationships between types.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_6&lt;/strong&gt;&lt;/p&gt;
&lt;h3 id="common-mistakes"&gt;Common Mistakes&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;OBSIDIAN_CODEBLOCK_7&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Always match both &lt;strong&gt;method name&lt;/strong&gt; and &lt;strong&gt;signature&lt;/strong&gt; exactly — including annotations.&lt;br&gt;
Otherwise, the type checker will reject the class.&lt;/p&gt;
&lt;h3 id="performance-and-design-notes"&gt;Performance and Design Notes&lt;/h3&gt;
&lt;p&gt;Protocols exist only for &lt;em&gt;type checking&lt;/em&gt;; at runtime, they add virtually no overhead.&lt;br&gt;
They don’t enforce behavior — they just describe it.&lt;br&gt;
That’s a strength, but also a risk: they can give a false sense of safety if your codebase doesn’t use static analysis.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;HINT&lt;/strong&gt;: In real projects, protocols often replace ad-hoc duck typing checks like &lt;code&gt;hasattr(obj, "write")&lt;/code&gt; with explicit, type-checked contracts — a major readability and safety win. You can check your code base whether you have intensive usage of &lt;code&gt;hasattr&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="closing-thought"&gt;Closing Thought&lt;/h2&gt;
&lt;p&gt;Protocols bring static typing and duck typing together in a surprisingly elegant way.&lt;br&gt;
They don’t replace inheritance or abstract base classes — they &lt;strong&gt;complement&lt;/strong&gt; them.&lt;br&gt;
Where ABCs define &lt;em&gt;what must be implemented&lt;/em&gt;, protocols describe &lt;em&gt;what is already provided&lt;/em&gt;.&lt;/p&gt;
&lt;h2 id="references"&gt;References&lt;/h2&gt;
&lt;h2 id="-longer-bottom-up-introduction-to-protocols-python-protocols-leveraging-structural-subtyping-real-python"&gt;- longer, bottom-up introduction to Protocols: &lt;a href="https://realpython.com/python-protocol/"&gt;Python Protocols: Leveraging Structural Subtyping – Real Python&lt;/a&gt;&lt;/h2&gt;</content><category term="note"/><category term="python"/><category term="typing"/><category term="protocols"/><category term="static-typing"/><category term="pep544"/></entry><entry><title>Evolution of Type Hints in Python — From Comments to Inline Typing and Beyond</title><link href="https://www.safjan.com/evolution-of-type-hints-in-python/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-24T00:00:00+02:00</published><updated>2025-10-24T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-24:/evolution-of-type-hints-in-python/</id><summary type="html">&lt;p&gt;Learn about the evolution of type hints in Python, from initial comments to modern inline typing and key features introduced in each major version, enabling powerful static type checking with tools like mypy and pyright.&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-mental-model"&gt;The Mental Model&lt;/h2&gt;
&lt;p&gt;Python typing wasn’t born overnight. It crept into the language slowly, first as a loose suggestion and later as a core part of modern codebases.&lt;br&gt;
Originally, you could only hint types using comments (&lt;code&gt;# type: int&lt;/code&gt;), but as Python matured, its typing syntax grew more expressive, more compact, and more powerful.&lt;/p&gt;
&lt;p&gt;The journey started with &lt;strong&gt;PEP 484&lt;/strong&gt; in &lt;strong&gt;Python 3.5&lt;/strong&gt;, introducing the &lt;code&gt;typing&lt;/code&gt; module and annotations as first-class citizens. Since then, nearly every minor version brought a refinement or simplification, allowing developers to express richer constraints without resorting to verbose generics.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python 3.5 (PEP 484)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;greet_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Hello, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="err"&gt;````&lt;/span&gt;

&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;basic&lt;/span&gt; &lt;span class="n"&gt;form&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;static&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="n"&gt;opened&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;door&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="n"&gt;like&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;mypy&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pyright&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pylance&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt; &lt;span class="n"&gt;recently&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pyrefly&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;provide&lt;/span&gt; &lt;span class="n"&gt;editor&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt; &lt;span class="n"&gt;correctness&lt;/span&gt; &lt;span class="n"&gt;checks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;basic&lt;/span&gt; &lt;span class="n"&gt;form&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;static&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="n"&gt;opened&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;door&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="n"&gt;like&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;mypy&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;mypy&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;lang&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pyright&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;github&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;microsoft&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pyright&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pylance&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;marketplace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visualstudio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;&lt;span class="n"&gt;itemName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;python&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vscode&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;pylance&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt; &lt;span class="n"&gt;recently&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astral&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;pyrefly&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;pyrefly&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;provide&lt;/span&gt; &lt;span class="n"&gt;editor&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt; &lt;span class="n"&gt;correctness&lt;/span&gt; &lt;span class="n"&gt;checks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;## Key Features Through Versions&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.5 (PEP 484) — The Birth of Typing&lt;/span&gt;

&lt;span class="n"&gt;Introduced&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;typing&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Union&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;TypeVar&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Generic&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt; &lt;span class="n"&gt;became&lt;/span&gt; &lt;span class="n"&gt;official&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.6 — Variable Annotations&lt;/span&gt;

&lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;could&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="n"&gt;variables&lt;/span&gt; &lt;span class="n"&gt;directly&lt;/span&gt; &lt;span class="n"&gt;without&lt;/span&gt; &lt;span class="n"&gt;using&lt;/span&gt; &lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="n"&gt;__OBSIDIAN_CODEBLOCK_1__&lt;/span&gt;

&lt;span class="n"&gt;Also&lt;/span&gt; &lt;span class="n"&gt;introduced&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;typing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NamedTuple&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;typing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewType&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.7 — Postponed Evaluation (PEP 563)&lt;/span&gt;

&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="n"&gt;hints&lt;/span&gt; &lt;span class="n"&gt;were&lt;/span&gt; &lt;span class="n"&gt;treated&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;strings&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;via&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;__future__&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;delaying&lt;/span&gt; &lt;span class="n"&gt;their&lt;/span&gt; &lt;span class="n"&gt;evaluation&lt;/span&gt; &lt;span class="n"&gt;until&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;  
&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;solved&lt;/span&gt; &lt;span class="n"&gt;circular&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;issues&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;reduced&lt;/span&gt; &lt;span class="n"&gt;overhead&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="n"&gt;definitions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.8 — TypedDict and Literal Types&lt;/span&gt;

&lt;span class="n"&gt;Added&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;describing&lt;/span&gt; &lt;span class="n"&gt;dicts&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;specific&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Final&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;immutability&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;constant&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PEP&lt;/span&gt; &lt;span class="mi"&gt;544&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;structural&lt;/span&gt; &lt;span class="n"&gt;subtyping&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;__OBSIDIAN_CODEBLOCK_2__&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.9 — Built-in Generics (PEP 585)&lt;/span&gt;

&lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;was&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;huge&lt;/span&gt; &lt;span class="n"&gt;quality&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;of&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;life&lt;/span&gt; &lt;span class="n"&gt;upgrade&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;replaced&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;replaced&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;__OBSIDIAN_CODEBLOCK_3__&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.10 — Union Operator (PEP 604)&lt;/span&gt;

&lt;span class="n"&gt;Simplified&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Union&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;syntax&lt;/span&gt; &lt;span class="n"&gt;using&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;__OBSIDIAN_CODEBLOCK_4__&lt;/span&gt;

&lt;span class="n"&gt;Also&lt;/span&gt; &lt;span class="n"&gt;improved&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="n"&gt;narrowing&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;statements&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;structural&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="n"&gt;matching&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.11 — Self Type, Variadic Generics, TypedDict Enhancements&lt;/span&gt;

&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Self&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt; &lt;span class="n"&gt;returning&lt;/span&gt; &lt;span class="n"&gt;their&lt;/span&gt; &lt;span class="n"&gt;own&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PEP&lt;/span&gt; &lt;span class="mi"&gt;673&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;TypeVarTuple&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Unpack&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;variadic&lt;/span&gt; &lt;span class="n"&gt;generics&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PEP&lt;/span&gt; &lt;span class="mi"&gt;646&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;NotRequired&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Required&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;__OBSIDIAN_CODEBLOCK_5__&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.12 — The `typing` Cleanup&lt;/span&gt;

&lt;span class="n"&gt;Deprecated&lt;/span&gt; &lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;style&lt;/span&gt; &lt;span class="n"&gt;generics&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;etc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;  
&lt;span class="n"&gt;Introduced&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;TypeAliasType&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PEP&lt;/span&gt; &lt;span class="mi"&gt;695&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;made&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt; &lt;span class="n"&gt;simpler&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="n"&gt;__OBSIDIAN_CODEBLOCK_6__&lt;/span&gt;

&lt;span class="c1"&gt;### Python 3.13 — No More `from __future__ import annotations`&lt;/span&gt;

&lt;span class="n"&gt;Postponed&lt;/span&gt; &lt;span class="n"&gt;evaluation&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;removing&lt;/span&gt; &lt;span class="n"&gt;one&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;last&lt;/span&gt; &lt;span class="n"&gt;confusing&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="n"&gt;setup&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;## Useful Patterns&lt;/span&gt;

&lt;span class="c1"&gt;### Simplified Union Types&lt;/span&gt;

&lt;span class="n"&gt;Readable&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;concise&lt;/span&gt; &lt;span class="n"&gt;unions&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="n"&gt;idiomatic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="err"&gt;```&lt;/span&gt;&lt;span class="n"&gt;python&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="structural-subtyping-with-protocols"&gt;Structural Subtyping with Protocols&lt;/h3&gt;
&lt;p&gt;Protocol-based typing (instead of inheritance) allows flexible contracts:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Protocol&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SupportsClose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SupportsClose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="typeddict-for-json-like-data"&gt;TypedDict for JSON-like Data&lt;/h3&gt;
&lt;p&gt;Great for static checking of structured but dict-based data:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="self-type-for-fluent-interfaces"&gt;Self Type for Fluent Interfaces&lt;/h3&gt;
&lt;p&gt;Commonly used in builder-style classes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Self&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="a-quick-summary-of-typing-evolution"&gt;A Quick Summary of Typing Evolution&lt;/h3&gt;
&lt;pre class="mermaid"&gt;
graph LR
A[3.5: typing module (PEP 484)] --&gt; B[3.6: variable annotations]
B --&gt; C[3.7: postponed evaluation]
C --&gt; D[3.8: TypedDict, Literal, Protocol]
D --&gt; E[3.9: built-in generics]
E --&gt; F[3.10: union | operator]
F --&gt; G[3.11: Self, variadic generics]
G --&gt; H[3.12: TypeAliasType, PEP 695]
H --&gt; I[3.13: annotations postponed by default]
&lt;/pre&gt;

&lt;p&gt;Typing has moved from verbose and experimental to elegant and essential.&lt;br&gt;
Modern Python encourages typing as part of its design philosophy — readable, predictable, and expressive.&lt;/p&gt;
&lt;script type="module"&gt;
    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
    mermaid.initialize({ startOnLoad: true });
&lt;/script&gt;</content><category term="note"/><category term="python"/><category term="typing"/><category term="pep484"/><category term="static-typing"/><category term="mypy"/></entry><entry><title>Keeping performance results in a separate Git branch using `git checkout --orphan`</title><link href="https://www.safjan.com/git-checkout-orphan-gh-pages-performance-results/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-24T00:00:00+02:00</published><updated>2025-10-24T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-24:/git-checkout-orphan-gh-pages-performance-results/</id><summary type="html">&lt;p&gt;Learn how to use &lt;code&gt;git checkout --orphan&lt;/code&gt; to create a separate Git branch with no history, ideal for storing performance results or other generated content independently from your codebase. Discover the process and benefits of using orphan branches, including an example GitHub Actions workflow for publishing test results.&lt;/p&gt;</summary><content type="html">&lt;h2 id="understanding-orphan-branches-in-git"&gt;Understanding orphan branches in Git&lt;/h2&gt;
&lt;p&gt;There’s a lesser-known Git feature that allows you to start a branch with no history at all:&lt;br&gt;
&lt;code&gt;git checkout --orphan &amp;lt;branch&amp;gt;&lt;/code&gt;.&lt;br&gt;
It creates a new branch &lt;strong&gt;disconnected&lt;/strong&gt; from any commits or files that exist in the current branch.&lt;/p&gt;
&lt;p&gt;Think of it as a fresh repository living inside your repo — same &lt;code&gt;.git&lt;/code&gt; database, different history.&lt;br&gt;
Everything from your current branch remains in your working directory, but the new branch starts with no commits.&lt;br&gt;
You can then decide which files to keep, stage them, and make an initial commit that becomes the root of this branch.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Create a new orphan branch named &amp;#39;gh-pages&amp;#39;&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;checkout&lt;span class="w"&gt; &lt;/span&gt;--orphan&lt;span class="w"&gt; &lt;/span&gt;gh-pages

&lt;span class="c1"&gt;# Clean out files from the working directory&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;rm&lt;span class="w"&gt; &lt;/span&gt;-rf&lt;span class="w"&gt; &lt;/span&gt;.

&lt;span class="c1"&gt;# Add whatever you want to keep in this branch (e.g. built docs, reports)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Performance results&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;index.html
git&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;index.html
git&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Initial gh-pages commit&amp;quot;&lt;/span&gt;
&lt;span class="sb"&gt;````&lt;/span&gt;

At&lt;span class="w"&gt; &lt;/span&gt;this&lt;span class="w"&gt; &lt;/span&gt;point,&lt;span class="w"&gt; &lt;/span&gt;you’ve&lt;span class="w"&gt; &lt;/span&gt;got&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;brand&lt;span class="w"&gt; &lt;/span&gt;new&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;—&lt;span class="w"&gt; &lt;/span&gt;no&lt;span class="w"&gt; &lt;/span&gt;link&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;project’s&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;history.&lt;span class="w"&gt;  &lt;/span&gt;
It’s&lt;span class="w"&gt; &lt;/span&gt;perfect&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;things&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;don’t&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;tangled&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;codebase:&lt;span class="w"&gt; &lt;/span&gt;generated&lt;span class="w"&gt; &lt;/span&gt;reports,&lt;span class="w"&gt; &lt;/span&gt;documentation,&lt;span class="w"&gt; &lt;/span&gt;benchmarks,&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;built&lt;span class="w"&gt; &lt;/span&gt;assets.

__OBSIDIAN_CODEBLOCK_1__

Unlike&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;normal&lt;span class="w"&gt; &lt;/span&gt;branch,&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;orphan&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;doesn’t&lt;span class="w"&gt; &lt;/span&gt;have&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;parent&lt;span class="w"&gt; &lt;/span&gt;commit,&lt;span class="w"&gt; &lt;/span&gt;so&lt;span class="w"&gt; &lt;/span&gt;merges&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;rebases&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;don’t&lt;span class="w"&gt; &lt;/span&gt;make&lt;span class="w"&gt; &lt;/span&gt;sense&lt;span class="w"&gt; &lt;/span&gt;—&lt;span class="w"&gt; &lt;/span&gt;it’s&lt;span class="w"&gt; &lt;/span&gt;meant&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;live&lt;span class="w"&gt; &lt;/span&gt;independently.

&lt;span class="c1"&gt;## Using it with GitHub Actions&lt;/span&gt;

Let’s&lt;span class="w"&gt; &lt;/span&gt;say&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;automated&lt;span class="w"&gt; &lt;/span&gt;performance&lt;span class="w"&gt; &lt;/span&gt;tests&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;CI&lt;span class="w"&gt; &lt;/span&gt;pipeline&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;**publish&lt;span class="w"&gt; &lt;/span&gt;their&lt;span class="w"&gt; &lt;/span&gt;results**&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;separate&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;gh-pages&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;—&lt;span class="w"&gt; &lt;/span&gt;so&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;host&lt;span class="w"&gt; &lt;/span&gt;them&lt;span class="w"&gt; &lt;/span&gt;via&lt;span class="w"&gt; &lt;/span&gt;GitHub&lt;span class="w"&gt; &lt;/span&gt;Pages&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;just&lt;span class="w"&gt; &lt;/span&gt;keep&lt;span class="w"&gt; &lt;/span&gt;them&lt;span class="w"&gt; &lt;/span&gt;visible.

Here’s&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;minimal&lt;span class="w"&gt; &lt;/span&gt;GitHub&lt;span class="w"&gt; &lt;/span&gt;Actions&lt;span class="w"&gt; &lt;/span&gt;workflow&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;make&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;happen:

__OBSIDIAN_CODEBLOCK_2__

The&lt;span class="w"&gt; &lt;/span&gt;important&lt;span class="w"&gt; &lt;/span&gt;detail:&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;checkout&lt;span class="w"&gt; &lt;/span&gt;--orphan&lt;span class="w"&gt; &lt;/span&gt;gh-pages&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;creates&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;clean&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;inside&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;same&lt;span class="w"&gt; &lt;/span&gt;repo,&lt;span class="w"&gt;  &lt;/span&gt;
and&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;push&lt;span class="w"&gt; &lt;/span&gt;--force&lt;span class="w"&gt; &lt;/span&gt;origin&lt;span class="w"&gt; &lt;/span&gt;gh-pages&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;overwrites&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;each&lt;span class="w"&gt; &lt;/span&gt;time,&lt;span class="w"&gt; &lt;/span&gt;keeping&lt;span class="w"&gt; &lt;/span&gt;only&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;latest&lt;span class="w"&gt; &lt;/span&gt;reports.

You&lt;span class="w"&gt; &lt;/span&gt;could&lt;span class="w"&gt; &lt;/span&gt;later&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;enable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;**GitHub&lt;span class="w"&gt; &lt;/span&gt;Pages**&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;serve&lt;span class="w"&gt; &lt;/span&gt;that&lt;span class="w"&gt; &lt;/span&gt;branch,&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;performance&lt;span class="w"&gt; &lt;/span&gt;dashboard&lt;span class="w"&gt; &lt;/span&gt;would&lt;span class="w"&gt; &lt;/span&gt;live&lt;span class="w"&gt; &lt;/span&gt;at&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="sb"&gt;`&lt;/span&gt;https://&amp;lt;username&amp;gt;.github.io/&amp;lt;repo&amp;gt;/&lt;span class="sb"&gt;`&lt;/span&gt;.


&lt;span class="c1"&gt;## When this pattern makes sense&lt;/span&gt;

This&lt;span class="w"&gt; &lt;/span&gt;approach&lt;span class="w"&gt; &lt;/span&gt;shines&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;want&lt;span class="w"&gt; &lt;/span&gt;to:

-&lt;span class="w"&gt; &lt;/span&gt;Publish&lt;span class="w"&gt; &lt;/span&gt;generated&lt;span class="w"&gt; &lt;/span&gt;files&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;reports,&lt;span class="w"&gt; &lt;/span&gt;docs,&lt;span class="w"&gt; &lt;/span&gt;dashboards&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;without&lt;span class="w"&gt; &lt;/span&gt;polluting&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;branch.

-&lt;span class="w"&gt; &lt;/span&gt;Keep&lt;span class="w"&gt; &lt;/span&gt;automation&lt;span class="w"&gt; &lt;/span&gt;output&lt;span class="w"&gt; &lt;/span&gt;under&lt;span class="w"&gt; &lt;/span&gt;version&lt;span class="w"&gt; &lt;/span&gt;control&lt;span class="w"&gt; &lt;/span&gt;but&lt;span class="w"&gt; &lt;/span&gt;separated&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;code.

-&lt;span class="w"&gt; &lt;/span&gt;Host&lt;span class="w"&gt; &lt;/span&gt;static&lt;span class="w"&gt; &lt;/span&gt;files&lt;span class="w"&gt; &lt;/span&gt;directly&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;repository&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;GitHub&lt;span class="w"&gt; &lt;/span&gt;Pages.


It’s&lt;span class="w"&gt; &lt;/span&gt;less&lt;span class="w"&gt; &lt;/span&gt;suited&lt;span class="w"&gt; &lt;/span&gt;when:

-&lt;span class="w"&gt; &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;need&lt;span class="w"&gt; &lt;/span&gt;historical&lt;span class="w"&gt; &lt;/span&gt;diffs&lt;span class="w"&gt; &lt;/span&gt;between&lt;span class="w"&gt; &lt;/span&gt;runs&lt;span class="w"&gt; &lt;/span&gt;—&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--force&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;push&lt;span class="w"&gt; &lt;/span&gt;wipes&lt;span class="w"&gt; &lt;/span&gt;history.

-&lt;span class="w"&gt; &lt;/span&gt;Reports&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;huge&lt;span class="w"&gt; &lt;/span&gt;—&lt;span class="w"&gt; &lt;/span&gt;pushing&lt;span class="w"&gt; &lt;/span&gt;megabytes&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;HTML&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;JSON&lt;span class="w"&gt; &lt;/span&gt;repeatedly&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;slow.

-&lt;span class="w"&gt; &lt;/span&gt;The&lt;span class="w"&gt; &lt;/span&gt;repo&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;private&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;GitHub&lt;span class="w"&gt; &lt;/span&gt;Pages&lt;span class="w"&gt; &lt;/span&gt;isn’t&lt;span class="w"&gt; &lt;/span&gt;enabled&lt;span class="w"&gt; &lt;/span&gt;—&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;might&lt;span class="w"&gt; &lt;/span&gt;prefer&lt;span class="w"&gt; &lt;/span&gt;an&lt;span class="w"&gt; &lt;/span&gt;artifact&lt;span class="w"&gt; &lt;/span&gt;store&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;S3.


For&lt;span class="w"&gt; &lt;/span&gt;long-term&lt;span class="w"&gt; &lt;/span&gt;tracking,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;might&lt;span class="w"&gt; &lt;/span&gt;instead&lt;span class="w"&gt; &lt;/span&gt;append&lt;span class="w"&gt; &lt;/span&gt;results&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;dedicated&lt;span class="w"&gt; &lt;/span&gt;folder&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;store&lt;span class="w"&gt; &lt;/span&gt;them&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;database.

&lt;span class="c1"&gt;## Subtle details and common pitfalls&lt;/span&gt;

A&lt;span class="w"&gt; &lt;/span&gt;few&lt;span class="w"&gt; &lt;/span&gt;things&lt;span class="w"&gt; &lt;/span&gt;can&lt;span class="w"&gt; &lt;/span&gt;bite&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;when&lt;span class="w"&gt; &lt;/span&gt;using&lt;span class="w"&gt; &lt;/span&gt;orphan&lt;span class="w"&gt; &lt;/span&gt;branches&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;automation:

&lt;span class="m"&gt;1&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;**Forgetting&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;clean&lt;span class="w"&gt; &lt;/span&gt;up.**&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;After&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;checkout&lt;span class="w"&gt; &lt;/span&gt;--orphan&lt;span class="sb"&gt;`&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;working&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;still&lt;span class="w"&gt; &lt;/span&gt;has&lt;span class="w"&gt; &lt;/span&gt;files&lt;span class="w"&gt; &lt;/span&gt;from&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;previous&lt;span class="w"&gt; &lt;/span&gt;branch.&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;If&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;forget&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;rm&lt;span class="w"&gt; &lt;/span&gt;-rf&lt;span class="w"&gt; &lt;/span&gt;.&lt;span class="sb"&gt;`&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;those&lt;span class="w"&gt; &lt;/span&gt;files&lt;span class="w"&gt; &lt;/span&gt;will&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;re-committed.

&lt;span class="m"&gt;2&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;**Forcing&lt;span class="w"&gt; &lt;/span&gt;overwrites.**&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;Each&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;should&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;clean&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--force&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;push.&lt;span class="w"&gt; &lt;/span&gt;Without&lt;span class="w"&gt; &lt;/span&gt;that,&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;may&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;“non-fast-forward”&lt;span class="w"&gt; &lt;/span&gt;errors&lt;span class="w"&gt; &lt;/span&gt;because&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;histories&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;unrelated.

&lt;span class="m"&gt;3&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;**Detached&lt;span class="w"&gt; &lt;/span&gt;history.**&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;You&lt;span class="w"&gt; &lt;/span&gt;can’t&lt;span class="w"&gt; &lt;/span&gt;merge&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;diff&lt;span class="w"&gt; &lt;/span&gt;easily&lt;span class="w"&gt; &lt;/span&gt;between&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;gh-pages&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;main&lt;span class="sb"&gt;`&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;This&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;by&lt;span class="w"&gt; &lt;/span&gt;design,&lt;span class="w"&gt; &lt;/span&gt;but&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;means&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;can’t&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;diff&lt;span class="w"&gt; &lt;/span&gt;main...gh-pages&lt;span class="sb"&gt;`&lt;/span&gt;.

&lt;span class="m"&gt;4&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;**Keeping&lt;span class="w"&gt; &lt;/span&gt;GitHub&lt;span class="w"&gt; &lt;/span&gt;Pages&lt;span class="w"&gt; &lt;/span&gt;clean.**&lt;span class="w"&gt;  &lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;If&lt;span class="w"&gt; &lt;/span&gt;you&lt;span class="w"&gt; &lt;/span&gt;only&lt;span class="w"&gt; &lt;/span&gt;need&lt;span class="w"&gt; &lt;/span&gt;artifacts&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;internal&lt;span class="w"&gt; &lt;/span&gt;review,&lt;span class="w"&gt; &lt;/span&gt;consider&lt;span class="w"&gt; &lt;/span&gt;naming&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;branch&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;reports&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;artifacts&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;instead&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;using&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;gh-pages&lt;span class="sb"&gt;`&lt;/span&gt;.


Here’s&lt;span class="w"&gt; &lt;/span&gt;what&lt;span class="w"&gt; &lt;/span&gt;_not_&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;:

&lt;span class="sb"&gt;```&lt;/span&gt;bash
&lt;span class="c1"&gt;# Wrong: this will mix commits from main&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;checkout&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;gh-pages&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Bad, keeps full history&lt;/span&gt;

&lt;span class="c1"&gt;# Correct&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;checkout&lt;span class="w"&gt; &lt;/span&gt;--orphan&lt;span class="w"&gt; &lt;/span&gt;gh-pages
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Orphan branches are like disposable notebooks: use them for output, not for code - and they’ll serve you well.&lt;/p&gt;</content><category term="note"/><category term="git"/><category term="github-actions"/><category term="ci"/><category term="deployment"/><category term="performance"/></entry><entry><title>Understanding the Language Server Protocol through a Minimal Working Example</title><link href="https://www.safjan.com/language-server-protocol-minimal-example/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-16T00:00:00+02:00</published><updated>2025-10-16T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-16:/language-server-protocol-minimal-example/</id><summary type="html">&lt;p&gt;Learn how the Language Server Protocol standardizes communication between code editors and language servers, enabling editors like VS Code to request features such as diagnostics and completions from a single server, simplifying development workflows.&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-mental-model-what-is-the-language-server-protocol"&gt;The Mental Model: What Is the Language Server Protocol&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;Language Server Protocol (LSP)&lt;/strong&gt; is one of those invisible technologies that quietly revolutionized the developer experience. It defines a &lt;strong&gt;standard way for code editors (clients)&lt;/strong&gt; to communicate with &lt;strong&gt;language-specific analysis engines (servers)&lt;/strong&gt;.  &lt;/p&gt;
&lt;p&gt;Before LSP, every editor (VS Code, Vim, Sublime, Atom, etc.) had to build their own support for each programming language. That meant dozens of editors and dozens of languages — an explosion of duplicated work.  &lt;/p&gt;
&lt;p&gt;LSP fixed that by saying: “Let’s agree on how editors talk to language tools.”  &lt;/p&gt;
&lt;p&gt;Now, editors can speak a &lt;strong&gt;common protocol&lt;/strong&gt; over JSON-RPC (a lightweight request-response format over stdin/stdout). A single server can work with many editors.&lt;/p&gt;
&lt;p&gt;You can think of it like this:&lt;/p&gt;
&lt;pre class="mermaid"&gt;
graph TD
    A[VS Code Editor] --&gt;|JSON-RPC: initialize, textDocument/didOpen| B[Language Server]
    B --&gt;|diagnostics, hover info, completions| A
&lt;/pre&gt;

&lt;p&gt;In short:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;editor&lt;/strong&gt; (client) sends notifications and requests: &lt;em&gt;“File opened”, “User typed this”, “What symbols are here?”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;server&lt;/strong&gt; responds with diagnostics, completions, definitions, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once you get this model, the magic of modern editor intelligence starts to feel less mysterious.&lt;/p&gt;
&lt;script type="module"&gt;
    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
    mermaid.initialize({ startOnLoad: true });
&lt;/script&gt;</content><category term="note"/><category term="language-server-protocol"/><category term="vscode"/><category term="lsp"/><category term="developer-tools"/><category term="editors"/><category term="programming"/></entry><entry><title>Using CSS Variables for Dynamic and Reusable Styling</title><link href="https://www.safjan.com/css-variables-dynamic-reusable-styling/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-14T00:00:00+02:00</published><updated>2025-10-14T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-14:/css-variables-dynamic-reusable-styling/</id><summary type="html">&lt;p&gt;Learn how to use CSS variables for dynamic and reusable styling, enabling features like interactive UIs, easy theming, and design consistency across components. Discover practical usage examples and best practices while understanding limitations and browser support issues.&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-core-idea"&gt;The Core Idea&lt;/h2&gt;
&lt;p&gt;CSS variables (custom properties) let you store reusable values directly in CSS and update them at runtime.&lt;br&gt;
They use the &lt;code&gt;--name&lt;/code&gt; syntax and are accessed with &lt;code&gt;var()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nd"&gt;root&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;--bg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#222&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;--text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#eee&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nt"&gt;body&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;--bg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;--text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="err"&gt;````&lt;/span&gt;

&lt;span class="nt"&gt;Variables&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;defined&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nd"&gt;root&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;are&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;global&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;but&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;can&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;be&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;overridden&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;specific&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;scopes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;components&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="err"&gt;##&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Practical&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Usage&lt;/span&gt;

&lt;span class="err"&gt;###&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Where&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;It&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Shines&lt;/span&gt;

&lt;span class="nt"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nt"&gt;Dynamic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;theming&lt;/span&gt;&lt;span class="o"&gt;:**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;dark&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nt"&gt;light&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;high-contrast&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;themes&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;seasonal&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;designs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;
&lt;span class="nt"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nt"&gt;Design&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;consistency&lt;/span&gt;&lt;span class="o"&gt;:**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;same&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;color&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;spacing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;reused&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;across&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;multiple&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;components&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="nt"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nt"&gt;Runtime&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;manipulation&lt;/span&gt;&lt;span class="o"&gt;:**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;interactive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;UIs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;where&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;JS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;can&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;tweak&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;design&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;parameters&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;instantly&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="nt"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nt"&gt;Design&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;systems&lt;/span&gt;&lt;span class="o"&gt;:**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;variables&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;act&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;shared&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;language&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;between&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;designers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;developers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="err"&gt;###&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;When&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Not&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Use&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;It&lt;/span&gt;

&lt;span class="nt"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nt"&gt;Heavy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;preprocessing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;needs&lt;/span&gt;&lt;span class="o"&gt;:**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;you&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;rely&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;conditionals&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;functions&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;like&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="nt"&gt;darken&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;&lt;span class="err"&gt;`—&lt;/span&gt;&lt;span class="nt"&gt;Sass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;still&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;wins&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;there&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;

&lt;span class="nt"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nt"&gt;Older&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;browser&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;support&lt;/span&gt;&lt;span class="o"&gt;:**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;IE11&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;doesn&lt;/span&gt;&lt;span class="err"&gt;’&lt;/span&gt;&lt;span class="nt"&gt;t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;support&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;CSS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;variables&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nt"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;never&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;will&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;

&lt;span class="err"&gt;###&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;Example&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;color&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;picker&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;changes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;page&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;color&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;live&lt;/span&gt;

&lt;span class="nt"&gt;An&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;example&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;simple&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;color&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;picker&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;changes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;page&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;color&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;live&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="err"&gt;```&lt;/span&gt;&lt;span class="nt"&gt;js&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;input&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;color&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;picker&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;picker&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;getElementById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;picker&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;picker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;addEventListener&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;input&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;e&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="err"&gt;document.documentElement.style.setProperty(&amp;#39;--main-bg&amp;#39;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;e.target.value)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="example-theming"&gt;Example: Theming&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nd"&gt;root&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;--bg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#fff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;--text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;dark&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;--bg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#121212&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;--text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#fff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Switch themes easily:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;classList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;toggle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;dark&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="fallbacks-and-responsiveness"&gt;Fallbacks and Responsiveness&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;h1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;--title-color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;#444&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;/* fallback if missing */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="k"&gt;media&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nt"&gt;min-width&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;800px&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nd"&gt;root&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nv"&gt;--gap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="kt"&gt;rem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="things-to-watch-out-for"&gt;Things to Watch Out For&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Scoped variables&lt;/strong&gt; don’t leak outside their selector&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IE11&lt;/strong&gt; doesn’t support them&lt;/li&gt;
&lt;li&gt;Changing variables with JS triggers &lt;strong&gt;repaints&lt;/strong&gt;, so avoid in animations&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;CSS variables behave like live, inheritable properties—perfect for theming and dynamic UI tweaks without recompiling styles.&lt;/p&gt;</content><category term="note"/><category term="css"/><category term="web-development"/><category term="front-end"/></entry><entry><title>Bare Asterisk in Python Function Signatures - Keyword Only Arguments</title><link href="https://www.safjan.com/bare-asterisk-in-python-function-signatures/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-10T00:00:00+02:00</published><updated>2025-10-10T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-10:/bare-asterisk-in-python-function-signatures/</id><summary type="html">&lt;p&gt;Learn how to use the bare asterisk in Python function signatures to enforce keyword-only arguments, enhancing clarity and preventing argument order bugs in your code.&lt;/p&gt;</summary><content type="html">&lt;h2 id="core-principle"&gt;Core Principle&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;*&lt;/code&gt; by itself in a function signature forces everything after it to be keyword-only arguments. It's a syntax barrier - arguments before the asterisk can be positional, arguments after it must be named.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# works&lt;/span&gt;
&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# breaks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;History:&lt;/strong&gt; Introduced in Python 3.0 via PEP 3102. This was part of the Python 3 overhaul, so it's never been available in Python 2.&lt;/p&gt;
&lt;h2 id="useful-extensions"&gt;Useful Extensions&lt;/h2&gt;
&lt;p&gt;You can mix this with other parameter types in ways that make sense for your API:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;With default values:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required_arg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optional&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;*_Combined with _args:__&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;separator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;, &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# items catches unlimited positional args&lt;/span&gt;
    &lt;span class="c1"&gt;# separator and prefix must be keyword-only&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;All keyword-only (rare but valid):&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Everything must be named, nothing positional&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="specific-use-cases"&gt;Specific Use Cases&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Library APIs where clarity matters&lt;/strong&gt; - When you have optional parameters that could be confused if passed positionally. The httpx example I saw does this: &lt;code&gt;def __init__(self, message: str, request: httpx.Request, *, body: object | None)&lt;/code&gt; - the body parameter is important enough to deserve an explicit name.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Future-proofing&lt;/strong&gt; - If you might add more required positional parameters later, putting optional ones after &lt;code&gt;*&lt;/code&gt; means you won't break existing calls. Old code keeps working even as your signature evolves.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Preventing argument order bugs&lt;/strong&gt; - When you have multiple parameters of the same type, forcing keywords prevents &lt;code&gt;foo(timeout, retries)&lt;/code&gt; vs &lt;code&gt;foo(retries, timeout)&lt;/code&gt; mixups.&lt;/p&gt;
&lt;h2 id="nuances"&gt;Nuances&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The bare asterisk doesn't capture anything&lt;/strong&gt; - it's purely a delimiter that says "keyword-only from here on." This is different from &lt;code&gt;*args&lt;/code&gt;, which actually captures excess positional arguments into a tuple.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bare asterisk - just marks the boundary&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# works&lt;/span&gt;
&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# TypeError: too many positional arguments&lt;/span&gt;
&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# TypeError: too many positional arguments&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The function above only accepts one positional argument (&lt;code&gt;a&lt;/code&gt;). The &lt;code&gt;*&lt;/code&gt; doesn't "consume" anything - it just blocks additional positional arguments from being accepted.&lt;/p&gt;
&lt;p&gt;*_When you use _args with keyword-only parameters__ - you put a name on the asterisk, and now it captures all the excess positional arguments, but you can still have keyword-only parameters after it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;a=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, args=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, b=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;              &lt;span class="c1"&gt;# a=1, args=(), b=2&lt;/span&gt;
&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# a=1, args=(2, 3), b=4&lt;/span&gt;
&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# TypeError: missing keyword-only argument &amp;#39;b&amp;#39;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here &lt;code&gt;*args&lt;/code&gt; is greedy - it captures all positional arguments after &lt;code&gt;a&lt;/code&gt;. But &lt;code&gt;b&lt;/code&gt; still must be passed by keyword because it comes after the &lt;code&gt;*args&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The key distinction:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;def func(a, *, b):&lt;/code&gt; - accepts exactly one positional arg, &lt;code&gt;b&lt;/code&gt; must be keyword&lt;/li&gt;
&lt;li&gt;&lt;code&gt;def func(a, *args, b):&lt;/code&gt; - accepts unlimited positional args into &lt;code&gt;args&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt; must be keyword&lt;/li&gt;
&lt;li&gt;&lt;code&gt;def func(a, *args)&lt;/code&gt; - accepts unlimited positional args, no keyword-only parameters&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This matters when designing APIs. Use bare &lt;code&gt;*&lt;/code&gt; when you want to restrict positional arguments. Use &lt;code&gt;*args&lt;/code&gt; when you want to accept a variable number of them but still have some parameters that must be named.&lt;/p&gt;
&lt;h2 id="references"&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://peps.python.org/pep-3102/"&gt;PEP 3102 – Keyword-Only Arguments | peps.python.org&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="note"/><category term="python"/><category term="api"/><category term="pep"/><category term="keyword-arguments"/><category term="future-proofing"/></entry><entry><title>Six Weeks, Real Progress - Exploring Shape Up for Product Work</title><link href="https://www.safjan.com/six-weeks-real-progress-exploring-shape-up-for-product-work/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-10-06T00:00:00+02:00</published><updated>2026-02-07T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-10-06:/six-weeks-real-progress-exploring-shape-up-for-product-work/</id><summary type="html">&lt;p&gt;Shape Up replaces two-week sprints with six-week cycles, kills the backlog, and lets small teams decide how to build things. Here is when it works, when it doesn't, and what I think about it after digging in.&lt;/p&gt;</summary><content type="html">&lt;ul&gt;
&lt;li&gt;&lt;a href="#what-makes-shape-up-different"&gt;What makes Shape Up different?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#when-shape-up-works-well"&gt;When Shape Up works well&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#when-it-falls-short"&gt;When it falls short&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#so-what-should-you-actually-do"&gt;So what should you actually do?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#further-reading"&gt;Further reading&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I keep coming back to the same frustration with how we build software. Not the code, the &lt;em&gt;process&lt;/em&gt; around it. Sprint planning that eats three hours. Standups that turn into debugging sessions. A backlog so long nobody even scrolls to the bottom anymore.&lt;/p&gt;
&lt;p&gt;Agile was supposed to fix all this. Scrum, Kanban, SAFe, whatever. And those frameworks helped, I won't deny that. But at some point things drifted. Two-week sprints started fragmenting work into artificial chunks. Context switching never stopped. There was this constant pressure to estimate story points for work we barely understood yet, and I started noticing we were spending more time managing the process than actually building things.&lt;/p&gt;
&lt;p&gt;Then I found &lt;a href="https://basecamp.com/shapeup"&gt;Shape Up&lt;/a&gt;, the approach from Basecamp (now &lt;a href="https://37signals.com/"&gt;37signals&lt;/a&gt;). It described exactly what was bothering me, better than I could.&lt;/p&gt;
&lt;p&gt;&lt;a id="what-makes-shape-up-different"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="what-makes-shape-up-different"&gt;What makes Shape Up different?&lt;/h2&gt;
&lt;p&gt;Shape Up is not Agile with different labels. The structure is different enough that it changes how you think about planning.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Work in six-week cycles. Not two weeks. Six. Long enough to build something real, short enough that you can't hide. "Plus it gives you about eight chances a year to recalibrate and decide what to work on next." (see: &lt;a href="https://37signals.com/06"&gt;37signals&lt;/a&gt;). Between cycles, there's a two-week cooldown for bugs, tech debt, or just breathing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Before a cycle starts, senior people do what Basecamp calls "&lt;a href="https://basecamp.com/shapeup/0.3-chapter-01#shaping-the-work"&gt;shaping&lt;/a&gt;." They take raw ideas and turn them into pitches with clear limits. Not detailed specs, but something with clear boundaries and a sense of what you're &lt;em&gt;not&lt;/em&gt; going to build. That last part is important.&lt;/p&gt;
&lt;p&gt;Then small teams, usually a designer and one or two programmers, take the shaped project and run with it. No daily standups. No one hovering for status updates. The time is fixed at six weeks, but the scope flexes. If something is taking too long, you cut scope, not time.&lt;/p&gt;
&lt;p&gt;There's no backlog. Ideas that don't make it into a cycle just disappear. If they matter, they'll resurface. If they don't, you saved yourself from a list of tickets nobody will ever touch.&lt;/p&gt;
&lt;p&gt;&lt;a id="when-shape-up-works-well"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="when-shape-up-works-well"&gt;When Shape Up works well&lt;/h2&gt;
&lt;p&gt;I've seen it work for product teams that are tired of the sprint treadmill. It fits when you're building features that need real design thinking, the kind of work that breaks apart when you force it into two-week slices.&lt;/p&gt;
&lt;p&gt;It works when your team is experienced enough to work without someone checking on them every day. When people can make decisions without approval for every small thing, and when they'll actually say something when they're stuck instead of waiting for a standup.&lt;/p&gt;
&lt;p&gt;Six weeks lets you sit with a problem long enough to solve it properly. A couple people I've talked to said they felt less scattered, and one mentioned actually having time to think about edge cases instead of just filing them as follow-up tickets. I've had the same feeling on longer projects.&lt;/p&gt;
&lt;p&gt;The no-backlog thing is better than it sounds. You stop maintaining a guilt list of things you'll never get to. You stop feeling bad about the tickets that have been sitting there for a year and a half. Every cycle is a clean slate.&lt;/p&gt;
&lt;p&gt;&lt;a id="when-it-falls-short"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="when-it-falls-short"&gt;When it falls short&lt;/h2&gt;
&lt;p&gt;Shape Up is not a universal fix.&lt;/p&gt;
&lt;p&gt;If most of your work is small customer requests and bug fixes, the six-week cycle feels heavy. Sometimes you just need to knock out twenty little improvements, and trying to shape each one into a project is forced. The cooldown weeks help some, but if 80% of your work looks like this, you're probably fighting the framework.&lt;/p&gt;
&lt;p&gt;Shaping requires people who understand both the business and the technology. If you don't have senior folks who can do this well, you end up with vague projects that either blow up in scope or leave teams stuck. Good shaping is hard. It takes practice.&lt;/p&gt;
&lt;p&gt;Teams that need predictability may struggle too. "We'll ship something good in six weeks" is a very different promise than "We'll deliver these fourteen story points by the end of Sprint 23." If your stakeholders need firm dates on specific features, the flexible scope will create problems.&lt;/p&gt;
&lt;p&gt;It's also wrong for true discovery mode. Six weeks is too much commitment when you need to pivot every few days based on user feedback. Shape Up assumes you have &lt;em&gt;some&lt;/em&gt; idea of what you're building.&lt;/p&gt;
&lt;p&gt;And if your organization is deeply invested in Agile tooling, velocity charts, burndown graphs, story points, the transition is a hard sell. Some teams aren't ready to give those up, and pushing Shape Up into that environment just creates a different kind of process overhead.&lt;/p&gt;
&lt;p&gt;&lt;a id="so-what-should-you-actually-do"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="so-what-should-you-actually-do"&gt;So what should you actually do?&lt;/h2&gt;
&lt;p&gt;The useful question isn't "is Shape Up better than Agile?" It's "what's actually broken in how we work right now?"&lt;/p&gt;
&lt;p&gt;If your problem is fragmented work and constant context switching, Shape Up is worth trying. If you're drowning in process overhead, read the book and see what makes sense for you. But if you mostly need to ship small changes on a predictable schedule, you probably don't need to change anything.&lt;/p&gt;
&lt;p&gt;I don't really care if Shape Up is "the right framework." What I care about is whether teams are honest about why they work the way they do. Most of us inherited our process from a previous team or a blog post from 2015. Maybe it's time to revisit that.&lt;/p&gt;
&lt;p&gt;&lt;a id="further-reading"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="further-reading"&gt;Further reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;overview article on shape up method - &lt;a href="https://agilefirst.io/what-is-shape-up/"&gt;Shape Up: a complete guide to this new development methodology (2024)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;free book on shape up - &lt;a href="https://basecamp.com/shapeup"&gt;Shape Up: Stop Running in Circles and Ship Work that Matters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://37signals.com/"&gt;37signals&lt;/a&gt; - the company behind Basecamp, Ruby on Rails, and several opinionated books about how to run a software business&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Edits:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;2026-02-07: Added table of contents with anchor links&lt;/li&gt;
&lt;/ul&gt;</content><category term="Software Development"/><category term="agile"/><category term="scrum"/><category term="shape-up"/><category term="basecamp"/><category term="37-signals"/><category term="shape-up-method"/><category term="six-week-cycles"/><category term="product-development-process"/><category term="agile-alternatives"/><category term="autonomous-teams"/><category term="shaping-phase"/><category term="delivery-rhythm"/><category term="software-process-design"/><category term="software-development"/></entry><entry><title>Simpler Parallelism with concurrent.futures</title><link href="https://www.safjan.com/concurrent-futures-simpler-parallelism/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-09-21T00:00:00+02:00</published><updated>2025-09-21T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-09-21:/concurrent-futures-simpler-parallelism/</id><summary type="html">&lt;p&gt;Learn how to simplify parallel and concurrent programming in Python using &lt;code&gt;concurrent.futures&lt;/code&gt;, including executors for managing threads and processes, and futures for handling task results.&lt;/p&gt;</summary><content type="html">&lt;h2 id="the-high-level-approach"&gt;The High-Level Approach&lt;/h2&gt;
&lt;p&gt;Introduced in Python 3.2 via PEP 3148, &lt;code&gt;concurrent.futures&lt;/code&gt; gives you a unified interface for running code in parallel. Instead of wrestling with threads and processes directly, you get executors that handle the messy details. You submit tasks, get back futures, and collect results when they're ready.&lt;/p&gt;
&lt;p&gt;The module provides two main executors: &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; for I/O-bound work and &lt;code&gt;ProcessPoolExecutor&lt;/code&gt; for CPU-bound tasks. Both share the same API, which means you can swap them out with minimal code changes.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;https://example.com&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;https://python.org&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;https://github.com&amp;#39;&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Context manager handles cleanup automatically&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fetch_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; bytes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The executor manages a pool of workers for you. You don't create threads manually or worry about joining them. The context manager ensures everything gets cleaned up properly, even if exceptions occur.&lt;/p&gt;
&lt;h2 id="working-with-futures"&gt;Working with Futures&lt;/h2&gt;
&lt;p&gt;The real power shows up when you need more control than &lt;code&gt;map()&lt;/code&gt; provides. The &lt;code&gt;submit()&lt;/code&gt; method returns a Future object immediately, letting you track individual tasks and handle them independently.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_completed&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;delay&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delay&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delay&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;value&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;delay&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Submit all tasks, get futures back&lt;/span&gt;
    &lt;span class="n"&gt;future_to_item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Process results as they complete&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;future_to_item&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future_to_item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Item &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result_value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Item &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;as_completed()&lt;/code&gt; function is particularly useful because it yields futures as soon as they finish, rather than in submission order. This means you can start processing early results while slower tasks are still running.&lt;/p&gt;
&lt;p&gt;You can also wait for specific conditions using &lt;code&gt;wait()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FIRST_COMPLETED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ALL_COMPLETED&lt;/span&gt;

&lt;span class="c1"&gt;# Submit multiple tasks&lt;/span&gt;
&lt;span class="n"&gt;futures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;slow_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="c1"&gt;# Wait for the first one to finish&lt;/span&gt;
&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_when&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FIRST_COMPLETED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fastest_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;iter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Cancel the rest if you only needed one result&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The Future objects themselves provide several useful methods. You can check if a task is done with &lt;code&gt;.done()&lt;/code&gt;, cancel pending tasks with &lt;code&gt;.cancel()&lt;/code&gt;, and attach callbacks with &lt;code&gt;.add_done_callback()&lt;/code&gt; that fire when the task completes.&lt;/p&gt;
&lt;h2 id="when-each-executor-makes-sense"&gt;When Each Executor Makes Sense&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;ThreadPoolExecutor&lt;/code&gt; works best for I/O-bound operations where your code spends time waiting. Network requests, file I/O, database queries—these are all good candidates. Python's Global Interpreter Lock (GIL) doesn't hurt you here because threads release the GIL during I/O operations.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sqlite3&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="n"&gt;databases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;users.db&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;orders.db&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;inventory.db&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;SELECT COUNT(*) FROM main_table&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
        &lt;span class="n"&gt;databases&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;ProcessPoolExecutor&lt;/code&gt; is your choice for CPU-intensive work like data processing, image manipulation, or mathematical computations. Each process gets its own Python interpreter and memory space, bypassing the GIL completely.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hash_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filepath&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;hasher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filepath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;rb&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;iter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;hasher&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;filepath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hasher&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;large_file1.bin&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;large_file2.bin&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;large_file3.bin&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;filepath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digest&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hash_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filepath&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;digest&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Don't use &lt;code&gt;ProcessPoolExecutor&lt;/code&gt; for quick tasks or when you're passing large amounts of data. Spawning processes and serializing data between them has significant overhead. If your tasks take less than 0.1 seconds, the overhead probably exceeds the benefit.&lt;/p&gt;
&lt;p&gt;Avoid threads for pure CPU-bound work. The GIL means only one thread executes Python bytecode at a time, so you won't get parallel execution. You might even see slower performance due to context switching overhead.&lt;/p&gt;
&lt;h2 id="the-subtle-bits"&gt;The Subtle Bits&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;max_workers&lt;/code&gt; parameter matters more than you might think. Too few workers and you're not utilizing available resources. Too many and you waste memory while adding context-switching overhead. For I/O-bound work, you can often use more workers than CPU cores. For CPU-bound work, using more processes than cores typically doesn't help.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;

&lt;span class="c1"&gt;# Good default for CPU-bound work&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu_count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cpu_intensive_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# For I/O-bound work, you might go higher&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fetch_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When using &lt;code&gt;ProcessPoolExecutor&lt;/code&gt;, remember that arguments and return values must be picklable. This means you can't pass lambdas, local functions, or objects with unpicklable attributes. If you need to share configuration, consider using &lt;code&gt;functools.partial()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_with_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Use config dict to guide processing&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;multiplier&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;multiplier&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Wrong - lambdas aren&amp;#39;t picklable&lt;/span&gt;
&lt;span class="c1"&gt;# with ProcessPoolExecutor() as executor:&lt;/span&gt;
&lt;span class="c1"&gt;#     results = executor.map(lambda x: process_with_config(x, config), data)&lt;/span&gt;

&lt;span class="c1"&gt;# Right - use partial to bind the config argument&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;process_func&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_with_config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Exception handling requires attention because exceptions happen in worker threads or processes, not your main thread. Always wrap &lt;code&gt;.result()&lt;/code&gt; calls in try-except blocks. If you use &lt;code&gt;map()&lt;/code&gt;, exceptions won't raise until you iterate over the results.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;might_fail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Negative values not allowed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;might_fail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Exception raises here, not during map()&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The executors don't automatically time out. If a task hangs, it'll block forever unless you specify a timeout:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;potentially_slow_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Wait max 5 seconds&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Function took too long&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Won&amp;#39;t stop already-running tasks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;One common mistake is thinking &lt;code&gt;.cancel()&lt;/code&gt; will stop running tasks. It only prevents pending tasks from starting. Once a task begins execution, cancellation doesn't interrupt it. If you need interruptible tasks, you'll need to implement that logic yourself, typically using threading events or multiprocessing shared values.&lt;/p&gt;
&lt;p&gt;The module handles resource cleanup well through context managers, but if you don't use them, call &lt;code&gt;.shutdown(wait=True)&lt;/code&gt; explicitly. This ensures all pending tasks complete and resources get released. Forgetting this can leave threads or processes hanging around.&lt;/p&gt;</content><category term="note"/><category term="python"/><category term="concurrency"/><category term="parallelism"/><category term="threading"/><category term="multiprocessing"/><category term="performance"/></entry><entry><title>Threading vs Multiprocessing in Python - GIL Implications and Choosing the Right Tool</title><link href="https://www.safjan.com/threading-vs-multiprocessing-gil-implications/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-09-10T00:00:00+02:00</published><updated>2025-09-10T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-09-10:/threading-vs-multiprocessing-gil-implications/</id><summary type="html">&lt;p&gt;Learn about the Global Interpreter Lock (GIL) and how threading and multiprocessing in Python differ, with examples showing that multiprocessing is better for CPU-bound tasks due to GIL limitations.&lt;/p&gt;</summary><content type="html">&lt;h2 id="core-principle"&gt;Core Principle&lt;/h2&gt;
&lt;p&gt;Python has two built-in ways to run code concurrently: &lt;strong&gt;threading&lt;/strong&gt; and &lt;strong&gt;multiprocessing&lt;/strong&gt;. The critical difference comes down to the &lt;strong&gt;Global Interpreter Lock (GIL)&lt;/strong&gt; - a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecode at once. This means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Threading&lt;/strong&gt; - multiple threads in one process, but only one thread executes Python code at a time due to the GIL. Good for I/O-bound tasks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multiprocessing&lt;/strong&gt; - separate processes with separate Python interpreters, each with its own GIL. True parallelism for CPU-bound tasks.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;threading&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;multiprocessing&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cpu_bound_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Heavy computation&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Threading - doesn&amp;#39;t help with CPU work&lt;/span&gt;
&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cpu_bound_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10_000_000&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Threading: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.2f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;s&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ~4 seconds on 4-core machine&lt;/span&gt;

&lt;span class="c1"&gt;# Multiprocessing - achieves true parallelism&lt;/span&gt;
&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;processes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;multiprocessing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cpu_bound_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10_000_000&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;processes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;processes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Multiprocessing: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.2f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;s&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ~1 second on 4-core machine&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;History:&lt;/strong&gt; The GIL has been in CPython since the beginning. PEP 703 (approved in 2023) outlines a path to making the GIL optional in Python 3.13+, but it's still fundamental to understand for current Python versions.&lt;/p&gt;
&lt;h2 id="useful-extensions"&gt;Useful Extensions&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Threading with ThreadPoolExecutor (simpler interface):&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://example.com&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;

&lt;span class="c1"&gt;# Old way - manual thread management&lt;/span&gt;
&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fetch_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Better way - thread pool handles everything&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fetch_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Multiprocessing with ProcessPoolExecutor:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;expensive_calculation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;numbers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10_000_000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expensive_calculation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numbers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Sharing data between processes:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;multiprocessing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Array&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;ctypes&lt;/span&gt;

&lt;span class="c1"&gt;# Queue - safe inter-process communication&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Result from worker&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Shared memory with Value and Array&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment_counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;

&lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;i&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# shared integer&lt;/span&gt;
&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;i&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# shared array&lt;/span&gt;

&lt;span class="n"&gt;processes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;increment_counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;processes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;processes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Counter: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 5&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Array: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# [50, 1, 2]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Thread-safe operations with Lock:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;threading&lt;/span&gt;

&lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100_000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# Only one thread can execute this block at a time&lt;/span&gt;
            &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 1,000,000 (correct with lock, random without)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="specific-use-cases"&gt;Specific Use Cases&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Use threading for I/O-bound tasks:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Making multiple HTTP requests (web scraping, API calls)&lt;/li&gt;
&lt;li&gt;Reading/writing multiple files&lt;/li&gt;
&lt;li&gt;Database queries where you're waiting for responses&lt;/li&gt;
&lt;li&gt;Network operations (socket communication)&lt;/li&gt;
&lt;li&gt;Any operation where you spend time waiting for external resources&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The GIL doesn't matter here because threads release it during I/O operations. While one thread waits for network/disk, others can run.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Use multiprocessing for CPU-bound tasks:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Image/video processing&lt;/li&gt;
&lt;li&gt;Data analysis and numerical computations&lt;/li&gt;
&lt;li&gt;Encryption/decryption&lt;/li&gt;
&lt;li&gt;Machine learning model training&lt;/li&gt;
&lt;li&gt;Parsing large files&lt;/li&gt;
&lt;li&gt;Any computation-heavy work where the CPU is the bottleneck&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You need separate processes to get around the GIL and use multiple CPU cores effectively.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-world example - web scraper:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Process the page&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;extract_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://example.com/page1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;http://example.com/page2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Threading is perfect here - lots of waiting for network responses&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scrape_page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Real-world example - image processing:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# CPU-intensive operations&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ImageFilter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SHARPEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;processed_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;

&lt;span class="n"&gt;images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;img1.jpg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;img2.jpg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;img3.jpg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Multiprocessing is needed - CPU-intensive work&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;images&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="nuances"&gt;Nuances&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The GIL releases during I/O operations&lt;/strong&gt; - this is why threading works for I/O-bound tasks. When a thread calls a blocking I/O function (like &lt;code&gt;requests.get()&lt;/code&gt; or &lt;code&gt;file.read()&lt;/code&gt;), it releases the GIL so other threads can run. The GIL only prevents multiple threads from executing Python bytecode simultaneously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Multiprocessing has overhead&lt;/strong&gt; - creating processes is expensive (memory and startup time). Each process needs its own Python interpreter and memory space. For small tasks, this overhead can outweigh the benefits of parallelism:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad - overhead dominates&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Good - task is substantial enough to justify processes&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expensive_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;large_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Data serialization between processes&lt;/strong&gt; - when you pass data to a process or get results back, Python uses &lt;code&gt;pickle&lt;/code&gt; to serialize it. Large objects or objects that can't be pickled cause problems:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# This won&amp;#39;t work - lambda functions can&amp;#39;t be pickled&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numbers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Error&lt;/span&gt;

&lt;span class="c1"&gt;# This works - regular functions can be pickled&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;multiply_by_two&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;multiply_by_two&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numbers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Works&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Threading has less isolation&lt;/strong&gt; - all threads share the same memory space, which means bugs in one thread (like accessing shared data without locks) can corrupt data across the entire program. Processes are isolated - a crash in one process doesn't affect others.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When neither helps&lt;/strong&gt; - if your program is both CPU and I/O bound, you might need a hybrid approach: processes for CPU work, each using threads for I/O. Or consider &lt;code&gt;asyncio&lt;/code&gt; for I/O operations instead of threads if you're doing lots of concurrent I/O.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Process pool size guidelines&lt;/strong&gt; - for CPU-bound work, use &lt;code&gt;os.cpu_count()&lt;/code&gt; workers (one per CPU core). For I/O-bound work with threads, you can use many more (tens or hundreds) since threads spend most time waiting. Experiment to find the sweet spot.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;

&lt;span class="c1"&gt;# CPU-bound - match core count&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu_count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cpu_intensive_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# I/O-bound - can use many more&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io_intensive_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The simple decision tree:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Waiting for network/disk/external services? Use &lt;strong&gt;threading&lt;/strong&gt; (or &lt;code&gt;asyncio&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Doing heavy calculations/data processing? Use &lt;strong&gt;multiprocessing&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Doing simple sequential work? Use &lt;strong&gt;neither&lt;/strong&gt; - regular code is simpler&lt;/li&gt;
&lt;/ul&gt;</content><category term="note"/><category term="python"/><category term="threading"/><category term="multiprocessing"/><category term="gil"/><category term="concurrency"/><category term="parallelism"/><category term="performance"/></entry><entry><title>asyncio Basics - async/await and When to Actually Use Them</title><link href="https://www.safjan.com/asyncio-basics-async-await-when-to-use/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-09-09T00:00:00+02:00</published><updated>2025-09-09T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-09-09:/asyncio-basics-async-await-when-to-use/</id><summary type="html">&lt;p&gt;Learn how &lt;code&gt;async&lt;/code&gt;/&lt;code&gt;await&lt;/code&gt; enables efficient concurrent programming by handling I/O waits without blocking, and discover various ways to run tasks concurrently, manage context managers, and handle timeouts.&lt;/p&gt;</summary><content type="html">&lt;h2 id="core-principle"&gt;Core Principle&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;async&lt;/code&gt;/&lt;code&gt;await&lt;/code&gt; is Python's way of writing concurrent code that can handle multiple I/O operations without blocking. The key insight: while one task is waiting for something (network response, file read, database query), other tasks can run. This is &lt;strong&gt;cooperative multitasking&lt;/strong&gt; - tasks voluntarily yield control during waits.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# This function can be paused and resumed&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Simulating network delay&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Data from &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Run three fetches concurrently&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;api.example.com/1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;api.example.com/2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;api.example.com/3&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run the async code&lt;/span&gt;
&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This takes roughly 1 second total, not 3, because the waits happen concurrently.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;History:&lt;/strong&gt; &lt;code&gt;async&lt;/code&gt;/&lt;code&gt;await&lt;/code&gt; syntax was introduced in Python 3.5 via PEP 492. Earlier async code used &lt;code&gt;@asyncio.coroutine&lt;/code&gt; and &lt;code&gt;yield from&lt;/code&gt;, but &lt;code&gt;async&lt;/code&gt;/&lt;code&gt;await&lt;/code&gt; is cleaner and now standard.&lt;/p&gt;
&lt;h2 id="useful-extensions"&gt;Useful Extensions&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Multiple ways to run concurrent tasks:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# gather - run multiple tasks, collect all results&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;task3&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# create_task - start a task in the background&lt;/span&gt;
&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;long_running_operation&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="c1"&gt;# Do other stuff&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;  &lt;span class="c1"&gt;# Wait for it when you need the result&lt;/span&gt;

&lt;span class="c1"&gt;# as_completed - process results as they finish&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;coro&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;task3&lt;/span&gt;&lt;span class="p"&gt;()]):&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;coro&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Got result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Async context managers and iterators:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Async context manager&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Async iterator&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;async_generator&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Timeouts and cancellation:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wait_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;slow_operation&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Operation took too long&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Cancel a running task&lt;/span&gt;
&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="specific-use-cases"&gt;Specific Use Cases&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;When async actually helps&lt;/strong&gt; - I/O-bound operations where you're waiting for external resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;HTTP requests to APIs (using &lt;code&gt;aiohttp&lt;/code&gt; or &lt;code&gt;httpx&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Database queries (using &lt;code&gt;asyncpg&lt;/code&gt;, &lt;code&gt;motor&lt;/code&gt; for MongoDB)&lt;/li&gt;
&lt;li&gt;Reading/writing files (using &lt;code&gt;aiofiles&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;WebSocket connections&lt;/li&gt;
&lt;li&gt;Microservice communication&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;When async doesn't help&lt;/strong&gt; - CPU-bound work like calculations, data processing, or image manipulation. Async doesn't give you parallelism for compute work - for that you need &lt;code&gt;multiprocessing&lt;/code&gt; or threads (and threads don't help much due to the GIL). Async is about doing other things while waiting, not doing multiple CPU-intensive things simultaneously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-world example where async shines:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Without async: 10 API calls taking 1 second each = 10 seconds total&lt;/span&gt;
&lt;span class="c1"&gt;# With async: 10 API calls taking 1 second each = ~1 second total&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_user_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_ids&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fetch_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;uid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;user_ids&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;api.example.com/users/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="nuances"&gt;Nuances&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;You can't mix sync and async freely&lt;/strong&gt; - once you go async, you need an async ecosystem. You can't &lt;code&gt;await&lt;/code&gt; in a regular function, and you can't call regular blocking functions in async code without consequences:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bad_example&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# This blocks the entire event loop!&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Wrong - use await asyncio.sleep(5)&lt;/span&gt;

    &lt;span class="c1"&gt;# This also blocks everything&lt;/span&gt;
    &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Wrong - use aiohttp or httpx async client&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;good_example&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Non-blocking sleep&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Non-blocking HTTP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Running blocking code when necessary&lt;/strong&gt; - sometimes you have to use a blocking library. Use &lt;code&gt;run_in_executor&lt;/code&gt; to run it in a thread pool:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;use_blocking_library&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_event_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# Run blocking code in a thread&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_in_executor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Use default executor&lt;/span&gt;
        &lt;span class="n"&gt;blocking_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;arg1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg2&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The event loop is the engine&lt;/strong&gt; - &lt;code&gt;asyncio.run()&lt;/code&gt; creates an event loop, runs your main coroutine, and cleans up. In older code you'll see manual loop management with &lt;code&gt;loop.run_until_complete()&lt;/code&gt;, but &lt;code&gt;asyncio.run()&lt;/code&gt; (added in Python 3.7) is simpler.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Common mistake - forgetting await:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;data&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Wrong! This is a coroutine object, not the result&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Prints: &amp;lt;coroutine object fetch_data at 0x...&amp;gt;&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;fetch_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Correct&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Prints: data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;When NOT to use async&lt;/strong&gt; - if you're only doing one I/O operation at a time, regular synchronous code is simpler and just as fast. Async adds complexity (mental overhead, debugging difficulty, library compatibility) that's only worth it when you're doing multiple I/O operations concurrently. A script that makes one API call doesn't benefit from async. A web scraper hitting 100 URLs does.&lt;/p&gt;</content><category term="note"/><category term="python"/><category term="asyncio"/><category term="concurrency"/><category term="async-await"/><category term="io-bound"/><category term="performance"/></entry><entry><title>Replacing Makefile with Invoke for Cross-Platform Python Tasks</title><link href="https://www.safjan.com/replacing-makefile-with-invoke-for-crossplatform-python-tasks/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-08-14T00:00:00+02:00</published><updated>2025-08-14T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-08-14:/replacing-makefile-with-invoke-for-crossplatform-python-tasks/</id><summary type="html">&lt;p&gt;Learn how switching from Make to Invoke improves cross-platform compatibility for Python project tasks, ensuring consistent behavior across macOS, Linux, and Windows.&lt;/p&gt;</summary><content type="html">&lt;p&gt;I’ve always liked Make. It’s quick to type, powerful, and honestly kind of fun once you know the quirks. On macOS and Linux, it just works.&lt;/p&gt;
&lt;p&gt;Then my teammate on Windows ran &lt;code&gt;make test-unit&lt;/code&gt;. Boom. Red text. Paths broke. Shell flags disappeared. Instead of testing code, we were testing our patience.&lt;/p&gt;
&lt;p&gt;After a couple of these moments, I realized I had two choices:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Keep duct-taping Windows support into the Makefile.&lt;/li&gt;
&lt;li&gt;Switch to a tool that doesn’t care which OS you’re on.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I chose option 2.&lt;/p&gt;
&lt;h2 id="from-my-old-makefile"&gt;From my old Makefile&lt;/h2&gt;
&lt;p&gt;It looked like this — perfectly fine for macOS/Linux, but brittle on Windows:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nv"&gt;SRC_FILES&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;src
&lt;span class="nv"&gt;SRC_AND_TEST_FILES&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;src&lt;span class="w"&gt; &lt;/span&gt;tests
&lt;span class="nv"&gt;R_PYPROJECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;requirements-pyproject.txt

&lt;span class="nf"&gt;test-unit&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;## Run the unit tests with pytest.&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@echo&lt;span class="w"&gt; &lt;/span&gt;-e&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;COLOR_CYAN&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;Running unit tests...&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;COLOR_RESET&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;pytest&lt;span class="w"&gt; &lt;/span&gt;--log-cli-level&lt;span class="o"&gt;=&lt;/span&gt;INFO&lt;span class="w"&gt; &lt;/span&gt;-rA&lt;span class="w"&gt; &lt;/span&gt;tests/unit/

&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;## Running code formatter: black and isort&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@echo&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;(isort) Ordering imports...&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@isort&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;SRC_AND_TEST_FILES&lt;span class="k"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@echo&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;(black) Formatting codebase...&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@black&lt;span class="w"&gt; &lt;/span&gt;--config&lt;span class="w"&gt; &lt;/span&gt;pyproject.toml&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;SRC_AND_TEST_FILES&lt;span class="k"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@echo&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;(ruff) Running fix only...&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@ruff&lt;span class="w"&gt; &lt;/span&gt;check&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;SRC_AND_TEST_FILES&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--fix-only

&lt;span class="nf"&gt;lint&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;## Run the linter (ruff) to check the code style.&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;@echo&lt;span class="w"&gt; &lt;/span&gt;-e&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;COLOR_CYAN&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;Checking code style with ruff...&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;COLOR_RESET&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;ruff&lt;span class="w"&gt; &lt;/span&gt;check&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;SRC_AND_TEST_FILES&lt;span class="k"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;changelog&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;## Generate a changelog using git-cliff&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;git-cliff&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;CHANGELOG.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;It’s not bad code. It’s just not friendly to every shell.&lt;br&gt;
&lt;code&gt;echo -e&lt;/code&gt; works in Bash, not in cmd.exe. Color codes are ignored in PowerShell. Paths and quoting rules differ. Small stuff, but it adds up.&lt;/p&gt;
&lt;h2 id="how-i-rewrote-it-to-work-everywhere"&gt;How I rewrote it to work everywhere&lt;/h2&gt;
&lt;p&gt;I moved the build logic into Python with &lt;a href="https://www.pyinvoke.org/"&gt;Invoke&lt;/a&gt;. The syntax is simple, it runs anywhere Python runs, and I can call the exact same commands on macOS, Linux, and Windows.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# tasks.py — Invoke version&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;invoke&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;

&lt;span class="n"&gt;SRC_FILES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;src&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;SRC_AND_TEST_FILES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;src tests&amp;quot;&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_unit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Running unit tests...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;python -m pytest --log-cli-level=INFO -rA tests/unit/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;(isort) Ordering imports...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;python -m isort &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;SRC_AND_TEST_FILES&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;(black) Formatting codebase...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;python -m black --config pyproject.toml &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;SRC_AND_TEST_FILES&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;(ruff) Running fix only...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;python -m ruff check &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;SRC_AND_TEST_FILES&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; --fix-only&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Checking code style with ruff...&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;python -m ruff check &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;SRC_AND_TEST_FILES&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;changelog&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;git-cliff -o CHANGELOG.md&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now my teammate runs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;invoke&lt;span class="w"&gt; &lt;/span&gt;test-unit
invoke&lt;span class="w"&gt; &lt;/span&gt;format
invoke&lt;span class="w"&gt; &lt;/span&gt;lint
invoke&lt;span class="w"&gt; &lt;/span&gt;changelog
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;No “works on my machine” syndrome. No branching logic for OS detection. It just works.&lt;/p&gt;
&lt;h2 id="why-this-change-paid-off"&gt;Why this change paid off&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;I write a command once — it runs everywhere.&lt;/li&gt;
&lt;li&gt;No one needs to know shell quirks to contribute.&lt;/li&gt;
&lt;li&gt;Commands are short and consistent across the team.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Make is still a great tool. But for a Python project with a mixed-OS team, moving the build brain into Python was a quiet productivity win.&lt;/p&gt;</content><category term="note"/><category term="makefile"/><category term="cross-platform-build"/><category term="python"/><category term="invoke"/><category term="task-automation"/><category term="developer-experience"/><category term="windows"/><category term="macos"/><category term="linux"/><category term="team-collaboration"/><category term="build-tools"/><category term="portable-scripts"/></entry><entry><title>Using OpenAI Python SDK with Local Ollama Models (and When to Opt for Alternatives)</title><link href="https://www.safjan.com/openai-python-sdk-with-local-ollama-models-and-alternatives/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-07-29T00:00:00+02:00</published><updated>2025-07-29T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-07-29:/openai-python-sdk-with-local-ollama-models-and-alternatives/</id><summary type="html">&lt;p&gt;Learn how to use the official &lt;code&gt;openai&lt;/code&gt; Python package with local Ollama models and when it's better to opt for LiteLLM as a more unified alternative.&lt;/p&gt;</summary><content type="html">&lt;p&gt;I've been diving into how to use the official &lt;code&gt;openai&lt;/code&gt; Python package to &lt;strong&gt;talk to local Ollama models&lt;/strong&gt;—and when it makes sense to bring in abstraction layers like &lt;strong&gt;LiteLLM&lt;/strong&gt;. Let me walk you through what I learned.&lt;/p&gt;
&lt;h3 id="1-can-i-use-the-openai-python-package-for-local-ollama-models"&gt;1. Can I use the OpenAI Python package for local Ollama models?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Yes!&lt;/strong&gt; Since early February 2024, Ollama supports the &lt;strong&gt;OpenAI Chat Completions API&lt;/strong&gt;, exposing compatible endpoints locally. You can simply point the OpenAI client at &lt;code&gt;"http://localhost:11434/v1"&lt;/code&gt;, pass a dummy API key, and call completions just like you would to OpenAI’s hosted API (see &lt;a href="https://ollama.com/blog/openai-compatibility?utm_source=chatgpt.com" title="OpenAI compatibility · Ollama Blog"&gt;Ollama blog&lt;/a&gt;).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://localhost:11434/v1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;unused-key&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;llama2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;system&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;You are a helpful assistant.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What’s the capital of France?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can also do embeddings similarly:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;llama2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Hello world!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;So for fairly simple local projects, the OpenAI SDK works perfectly with Ollama.&lt;/p&gt;
&lt;h3 id="2-when-should-i-use-litellm-instead"&gt;2. When should I use LiteLLM instead?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;LiteLLM&lt;/strong&gt; is a lightweight Python SDK (and optional proxy server) that provides a &lt;strong&gt;unified API&lt;/strong&gt; for over 100 LLM providers—including OpenAI, Anthropic, HuggingFace—and crucially, &lt;strong&gt;Ollama/local models&lt;/strong&gt; ( nice example with minimal Flask app - poem generator &lt;a href="https://notes.kodekloud.com/docs/Running-Local-LLMs-With-Ollama/Building-AI-Applications/Demo-Creating-an-App-Using-Ollama-OpenAI-Python-Client?utm_source=chatgpt.com" title="Demo Creating an App Using Ollama OpenAI Python Client"&gt;KodeKloud Notes&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Here are some benefits of using LiteLLM:
- It standardizes completions, embeddings, streaming, retries, and fallback logic&lt;br&gt;
- You can swap providers (e.g. &lt;code&gt;openai/gpt‑4&lt;/code&gt;, &lt;code&gt;anthropic/claude&lt;/code&gt;, &lt;code&gt;ollama/llama2&lt;/code&gt;) with no code changes&lt;br&gt;
- Proxy server mode offers observability, logging, rate limiting, and cost tracking across providers (see LiteLLM documentation: &lt;a href="https://docs.litellm.ai/?utm_source=chatgpt.com" title="LiteLLM - Getting Started | liteLLM"&gt;LiteLLM&lt;/a&gt;)&lt;/p&gt;
&lt;h3 id="3-example-using-litellm-python-sdk"&gt;3. Example: using LiteLLM Python SDK&lt;/h3&gt;
&lt;p&gt;First install:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;litellm
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then in Python:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;litellm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;

&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;OPENAI_API_KEY&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;dummy&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;LITELLM_OLLAMA_BASE&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;http://localhost:11434/v1&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# Completion via Ollama&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ollama/llama2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Hello!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}])&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;choices&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;message&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Embeddings via Ollama&lt;/span&gt;
&lt;span class="n"&gt;emb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ollama/llama2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Hello world&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;emb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;data&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;embedding&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Later you can just switch to &lt;code&gt;openai/gpt-4o&lt;/code&gt; or another provider. You keep the same &lt;code&gt;completion(...)&lt;/code&gt; call. No branching logic in your app (&lt;a href="https://langfuse.com/integrations/frameworks/litellm-sdk?utm_source=chatgpt.com" title="Open Source Observability for LiteLLM SDK - Langfuse"&gt;Langfuse&lt;/a&gt;, &lt;a href="https://docs.litellm.ai/?utm_source=chatgpt.com" title="LiteLLM - Getting Started | liteLLM"&gt;LiteLLM&lt;/a&gt;, &lt;a href="https://docs.litellm.ai/docs/?utm_source=chatgpt.com" title="LiteLLM - Getting Started"&gt;LiteLLM&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id="5-alternatives-to-litellm"&gt;5. Alternatives to LiteLLM&lt;/h3&gt;
&lt;p&gt;There are several other frameworks you may consider:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Langchain&lt;/strong&gt;, &lt;strong&gt;Llama‑Index&lt;/strong&gt;, &lt;strong&gt;Guidance&lt;/strong&gt;, &lt;strong&gt;instructor&lt;/strong&gt; – great for structured output, chaining, tool-use, agents, prompt templating. Read more in these sources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://medium.com/towardsdev/4-proven-ways-to-use-ollama-locally-openai-apis-in-python-fast-flexible-and-scalable-216dca893b1c"&gt;4 Proven Ways to Use Ollama Locally &amp;amp; OpenAI APIs in Python: Fast, Flexible, and Scalable | by Brain Glitch | Jun, 2025 | Towards Dev&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/%40hajraali730/unlocking-the-power-of-litellm-a-lightweight-unified-interface-for-llms-5dc09cece265"&gt;Unlocking the Power of LiteLLM: A Lightweight, Unified Interface for LLMs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.truefoundry.com/blog/litellm-alternatives"&gt;Top 5 LiteLLM alternatives of 2025&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simmering.dev/blog/structured_output/"&gt;The best library for structured LLM output – Paul Simmering&lt;/a&gt; - gives different recommendations for four use cases.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;TrueFoundry&lt;/strong&gt; – a more enterprise‑ready orchestration layer with observability, scaling, and deployment support, but heavier than LiteLLM (&lt;a href="https://www.truefoundry.com/blog/litellm-alternatives?utm_source=chatgpt.com" title="Top 5 LiteLLM alternatives of 2025 - TrueFoundry"&gt;truefoundry.com&lt;/a&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="summary"&gt;Summary&lt;/h3&gt;
&lt;p&gt;I like using the &lt;strong&gt;OpenAI Python SDK&lt;/strong&gt; with Ollama—it’s quick, reliable, and simple for local use cases. But as soon as I need to add other providers, handle retries/fallbacks, use embeddings, or manage observability and switching logic, &lt;strong&gt;LiteLLM&lt;/strong&gt; becomes more convenient. And if I’m building complex agent pipelines or need structure, then libraries like &lt;strong&gt;Langchain&lt;/strong&gt; or &lt;strong&gt;TrueFoundry&lt;/strong&gt; fit right in.&lt;/p&gt;</content><category term="note"/><category term="openai"/><category term="openai-sdk"/><category term="ollama"/><category term="litellm"/><category term="langchain"/><category term="llama-index"/><category term="guidance"/><category term="instructor"/></entry><entry><title>Building a Multi-Notebook Report with Quarto</title><link href="https://www.safjan.com/building-a-multinotebook-report-with-quarto/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-07-23T00:00:00+02:00</published><updated>2025-07-23T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-07-23:/building-a-multinotebook-report-with-quarto/</id><summary type="html">&lt;p&gt;Learn how to split a large Jupyter notebook into multiple notebooks and combine them into a cohesive report using Quarto's book project functionality for HTML, PDF, or EPUB formats.&lt;/p&gt;</summary><content type="html">&lt;p&gt;Over the past few months, I’ve been using Jupyter notebooks to explore and document various data analysis tasks. At some point, I realized I wanted to share the results in a more polished way, so I used quarto to generate report. Over the time analysis notebook has grown to the monster size. I wanted to split it into multiple notebooks, each focusing on a specific aspect of the analysis. But I also wanted to combine them into a single cohesive report.
Something that reads like a real report. I didn't want to just dump a bunch of disconnected notebooks on someone.&lt;/p&gt;
&lt;p&gt;That’s where &lt;strong&gt;Quarto&lt;/strong&gt; with book project functionality came in. In this post, I’ll show you two practical ways to combine multiple notebooks into a single, unified report using Quarto:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Option 1:&lt;/strong&gt; Turn your notebooks into a structured book (like GitBook)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Option 2:&lt;/strong&gt; Embed notebooks into a single &lt;code&gt;.qmd&lt;/code&gt; file&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both methods let you keep your notebooks clean and modular while generating a polished report that looks great in HTML, PDF, or even EPUB formats.&lt;/p&gt;
&lt;h2 id="option-1-create-a-book-like-report-with-quarto"&gt;Option 1: Create a Book-Like Report with Quarto&lt;/h2&gt;
&lt;p&gt;This is the approach I went with for a larger project. It gives you a navigation sidebar, automatic chapter numbering, and multiple output formats. Think of it as building a small website or PDF book.&lt;/p&gt;
&lt;h3 id="step-1-install-quarto"&gt;Step 1: Install Quarto&lt;/h3&gt;
&lt;p&gt;If you haven’t installed Quarto yet:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# On macOS/Linux&lt;/span&gt;
brew&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;quarto

&lt;span class="c1"&gt;# Or download manually&lt;/span&gt;
https://quarto.org/docs/get-started/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Make sure Jupyter is also installed and working.&lt;/p&gt;
&lt;h3 id="step-2-create-a-new-book-project"&gt;Step 2: Create a New Book Project&lt;/h3&gt;
&lt;p&gt;Let’s set up the project:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;quarto&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;project&lt;span class="w"&gt; &lt;/span&gt;book&lt;span class="w"&gt; &lt;/span&gt;my-report
&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;my-report
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You’ll now see a few files and folders:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;my-report/
├── _quarto.yml
├── index.qmd
├── intro.qmd
└── references.qmd
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is your basic structure. You can remove &lt;code&gt;intro.qmd&lt;/code&gt; or rename it. For now, we’ll leave it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: you can use vscode quarto extensions and start project from vscode as described in Quick Start in &lt;a href="https://quarto.org/docs/books/"&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="step-3-add-your-notebooks"&gt;Step 3: Add Your Notebooks&lt;/h3&gt;
&lt;p&gt;Drop your notebooks into the folder. For example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;my-report/
├── chapter1.ipynb
├── chapter2.ipynb
├── conclusion.ipynb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can keep working in Jupyter as usual.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: I encourage you to read about structuring your "book" in the official docs &lt;a href="https://quarto.org/docs/books/book-structure.html"&gt;here&lt;/a&gt;. There are many other options than chapter: book parts, appendices &lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="step-4-configure-_quartoyml"&gt;Step 4: Configure &lt;code&gt;_quarto.yml&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;This is where the magic happens. You define the structure of your report here:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;book&lt;/span&gt;

&lt;span class="nt"&gt;book&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;My&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Report&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Your&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Name&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;chapters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;index.qmd&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;chapter1.ipynb&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;chapter2.ipynb&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;conclusion.ipynb&lt;/span&gt;

&lt;span class="nt"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;toc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;toc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;documentclass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;book&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Make sure the filenames match your actual notebooks. You can mix &lt;code&gt;.qmd&lt;/code&gt; and &lt;code&gt;.ipynb&lt;/code&gt; files freely.&lt;/p&gt;
&lt;p&gt;Each "chapter" notebook can have its own quarto front matter (YAML header) if you want to customize titles, authors, or other metadata.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: The official documentation describes many options for customizing style, layout, controls, social media sharing options. Check it out &lt;a href="https://quarto.org/docs/books/book-output.html"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="step-5-preview-your-report"&gt;Step 5: Preview Your Report&lt;/h3&gt;
&lt;p&gt;To preview as you work:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;quarto&lt;span class="w"&gt; &lt;/span&gt;preview
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This starts a live-reloading web server at &lt;code&gt;http://localhost:4200&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When you’re ready to render:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;quarto&lt;span class="w"&gt; &lt;/span&gt;render
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This generates the output in &lt;code&gt;_book/&lt;/code&gt; (for HTML) and optionally &lt;code&gt;.pdf&lt;/code&gt; or &lt;code&gt;.epub&lt;/code&gt; depending on your formats.&lt;/p&gt;
&lt;h2 id="option-2-one-qmd-file-that-embeds-notebooks"&gt;Option 2: One &lt;code&gt;.qmd&lt;/code&gt; File That Embeds Notebooks&lt;/h2&gt;
&lt;p&gt;Sometimes you want to pick and choose what parts of your notebooks go into a report. Or maybe you want to glue them together with a lot of custom narrative. That’s where embedding comes in.&lt;/p&gt;
&lt;p&gt;Instead of building a whole book, you write one &lt;code&gt;.qmd&lt;/code&gt; file and embed specific cells or whole notebooks into it.&lt;/p&gt;
&lt;h3 id="step-1-create-a-new-quarto-project"&gt;Step 1: Create a New Quarto Project&lt;/h3&gt;
&lt;p&gt;This time we’ll use a regular document project:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;quarto&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;project&lt;span class="w"&gt; &lt;/span&gt;article&lt;span class="w"&gt; &lt;/span&gt;embedded-report
&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;embedded-report
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Your structure will look like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;embedded-report/
├── _quarto.yml
└── report.qmd
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can drop your notebooks here too:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;embedded-report/
├── notebook_a.ipynb
├── notebook_b.ipynb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="step-2-write-the-report-with-embeds"&gt;Step 2: Write the Report with Embeds&lt;/h3&gt;
&lt;p&gt;Edit &lt;code&gt;report.qmd&lt;/code&gt; like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---
title: &amp;quot;Combined Report&amp;quot;
format: html
execute:
&lt;span class="gu"&gt;  enabled: false&lt;/span&gt;
&lt;span class="gu"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Introduction&lt;/span&gt;

This is a combined report from two notebooks.

&lt;span class="gh"&gt;# Results from Notebook A&lt;/span&gt;

```{=embed}
notebook_a.ipynb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h1 id="specific-figure-from-notebook-b"&gt;Specific Figure from Notebook B&lt;/h1&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;notebook_b.ipynb#cell-3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h1 id="full-notebook-b"&gt;Full Notebook B&lt;/h1&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;notebook_b.ipynb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can embed the whole notebook or reference individual cells by their labels or indices. If you want to embed only specific cells, give them labels in Jupyter using &lt;code&gt;#| label: my-cell&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id="step-3-render-it"&gt;Step 3: Render It&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;quarto&lt;span class="w"&gt; &lt;/span&gt;render
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You’ll get a clean report with selected content pulled from your notebooks. You can even combine this approach with more Markdown narrative, custom styling, and conditional content.&lt;/p&gt;
&lt;h2 id="summary"&gt;Summary&lt;/h2&gt;
&lt;p&gt;Both methods have their strengths:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Quarto Book&lt;/strong&gt; approach is great when you want a navigable multi-chapter report or site. It’s structured, scalable, and looks professional out of the box.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;embedding approach&lt;/strong&gt; gives you more flexibility when curating content or writing more detailed commentary around selected outputs.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I personally use the book format for larger projects, and embedding for quick curated reports. With both approaches, notebooks stay clean and modular — and reports look great with just a few lines of configuration.&lt;/p&gt;
&lt;h2 id="alternative-tools"&gt;Alternative tools&lt;/h2&gt;
&lt;h2 id="additional-tools-optional"&gt;Additional tools (optional)&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Mercury&lt;/strong&gt;: converts notebooks into interactive web apps, with widgets, hideable code, etc. More suited for dashboards rather than structured multi-notebook books. &lt;a href="https://quarto.org/docs/books/?utm_source=chatgpt.com"&gt;Quarto+1Reddit+1&lt;/a&gt;&lt;a href="https://www.reddit.com/r/Python/comments/s7ngj0?utm_source=chatgpt.com"&gt;Reddit+15Reddit+15Quarto+15&lt;/a&gt;&lt;a href="https://csoneson.github.io/ReproduciblePublishing2024/IntroToQuarto/quarto.html?utm_source=chatgpt.com"&gt;Quarto+6csoneson.github.io+6jumpingrivers.com+6&lt;/a&gt;&lt;a href="https://blog.adyog.com/2025/02/15/quarto-convert-jupyter-notebooks-into-professional-reports-websites-and-dashboards/?utm_source=chatgpt.com"&gt;blog.adyog.com&lt;/a&gt;&lt;a href="https://www.jumpingrivers.com/blog/reproducible-reports-jupyter-quarto-python/?utm_source=chatgpt.com"&gt;Reddit+2jumpingrivers.com+2Medium+2&lt;/a&gt;&lt;a href="https://github.com/RobertsLab/resources/discussions/1719?utm_source=chatgpt.com"&gt;GitHub+1Quarto+1&lt;/a&gt;&lt;a href="https://www.reddit.com/r/Python/comments/11tp5fa?utm_source=chatgpt.com"&gt;Reddit+4Reddit+4Reddit+4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="-pretty-jupyter-a-tool-for-styling-notebook-outputs-into-elegant-selfcontained-htmlless-structured-than-quarto-books-but-quick-and-visually-appealing-reddit2reddit2reddit2"&gt;- &lt;strong&gt;Pretty Jupyter&lt;/strong&gt;: a tool for styling notebook outputs into elegant self‑contained HTML—less structured than Quarto “books”, but quick and visually appealing. &lt;a href="https://www.reddit.com/r/MachineLearning/comments/w9ec2e?utm_source=chatgpt.com"&gt;Reddit+2Reddit+2Reddit+2&lt;/a&gt;&lt;/h2&gt;
&lt;h2 id="references"&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://quarto.org/docs/books/"&gt;Creating a Book – Quarto&lt;/a&gt; - official documentation for quarto books&lt;/li&gt;
&lt;li&gt;&lt;a href="https://wesmckinney.com/book/"&gt;Python for Data Analysis, 3E&lt;/a&gt; - exemplary book created with Quarto&lt;/li&gt;
&lt;/ul&gt;</content><category term="note"/><category term="quarto"/><category term="jupyter-notebooks"/><category term="nbconvert"/><category term="reporting"/><category term="experiment-management"/><category term="research"/><category term="research-report"/></entry><entry><title>Downgrade or Upgrade Your Python Version with uv</title><link href="https://www.safjan.com/downgrade-or-upgrade-your-python-version-with-uv/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-06-26T00:00:00+02:00</published><updated>2025-06-26T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-06-26:/downgrade-or-upgrade-your-python-version-with-uv/</id><summary type="html">&lt;p&gt;Learn how to downgrade or upgrade your project’s Python version using the &lt;code&gt;uv&lt;/code&gt; tool, including steps for installing, pinning, and recreating your virtual environment.&lt;/p&gt;</summary><content type="html">&lt;p&gt;To downgrade your project’s virtual environment e.g. from Python 3.11 to 3.10 using &lt;strong&gt;uv&lt;/strong&gt;, here’s a step‑by‑step process:&lt;/p&gt;
&lt;h3 id="1-install-the-desired-python-version"&gt;1. Install the desired Python version&lt;/h3&gt;
&lt;p&gt;Run:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;uv&lt;span class="w"&gt; &lt;/span&gt;python&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;.10
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This downloads and manages Python 3.10 in &lt;code&gt;~/.local/share/uv/python/...&lt;/code&gt; (or the equivalent on your OS) (&lt;a href="https://docs.astral.sh/uv/guides/projects/?utm_source=chatgpt.com" title="Working on projects | uv - Astral Docs"&gt;docs.astral.sh&lt;/a&gt;, &lt;a href="https://docs.astral.sh/uv/concepts/python-versions/?utm_source=chatgpt.com" title="Python versions | uv - Astral Docs"&gt;docs.astral.sh&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id="2-pin-your-project-to-that-version"&gt;2. Pin your project to that version&lt;/h3&gt;
&lt;p&gt;Within your project directory:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;uv&lt;span class="w"&gt; &lt;/span&gt;python&lt;span class="w"&gt; &lt;/span&gt;pin&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;.10
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This writes &lt;code&gt;3.10&lt;/code&gt; to &lt;code&gt;.python-version&lt;/code&gt;, ensuring that future commands use that interpreter (&lt;a href="https://docs.astral.sh/uv/?utm_source=chatgpt.com" title="uv - Astral Docs"&gt;docs.astral.sh&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;This might not work in case if your &lt;code&gt;pyproject.toml&lt;/code&gt; file has a python requirement that prevents upgrade - edit this first, e.g. change:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;requires-python = &amp;quot;&amp;gt;=3.11&amp;quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;to&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;requires-python = &amp;quot;&amp;gt;=3.10&amp;quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can also edit &lt;code&gt;.python-version&lt;/code&gt; file to have it consistent with the rest of the project.&lt;/p&gt;
&lt;h3 id="3-recreate-the-virtual-environment"&gt;3. Recreate the virtual environment&lt;/h3&gt;
&lt;p&gt;The simplest and clean method:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;rm&lt;span class="w"&gt; &lt;/span&gt;-rf&lt;span class="w"&gt; &lt;/span&gt;.venv
uv&lt;span class="w"&gt; &lt;/span&gt;venv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This creates a fresh venv using Python 3.10, respecting the pin.&lt;/p&gt;
&lt;p&gt;Alternatively, if you're managing dependencies via &lt;code&gt;pyproject.toml&lt;/code&gt; + &lt;code&gt;uv.lock&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;uv&lt;span class="w"&gt; &lt;/span&gt;sync
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This will recreate the environment from locked specs using the pinned Python version (&lt;a href="https://news.ycombinator.com/item?id=43903914&amp;amp;utm_source=chatgpt.com" title="uv is the way. https://docs.astral.sh/uv/ Sadly it appears that people ..."&gt;news.ycombinator.com&lt;/a&gt;, &lt;a href="https://docs.astral.sh/uv/getting-started/features/?utm_source=chatgpt.com" title="Features | uv - Astral Docs"&gt;docs.astral.sh&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id="optional-verify-interpreter-version"&gt;Optional: Verify interpreter version&lt;/h3&gt;
&lt;p&gt;Run:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;.&lt;span class="w"&gt; &lt;/span&gt;.venv/bin/activate
python&lt;span class="w"&gt; &lt;/span&gt;--version&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# should show Python 3.10.x&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</content><category term="note"/><category term="uv"/><category term="python-version"/><category term="python-dependencies"/><category term="python-upgrade"/><category term="python-downgrade"/><category term="python-version"/></entry><entry><title>Beyond Coverage - Building Truly Complete Test Suites with GitHub Copilot</title><link href="https://www.safjan.com/beyond-coverage-building-truly-complete-test-suites-with-github-copilot/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-06-15T00:00:00+02:00</published><updated>2026-02-07T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-06-15:/beyond-coverage-building-truly-complete-test-suites-with-github-copilot/</id><summary type="html">&lt;p&gt;This article explores how to move beyond simplistic code coverage metrics to build truly comprehensive test suites using GitHub Copilot. Drawing from practical experience, I demonstrate how AI-assisted testing can identify behavioral gaps, validate API contracts, generate maintainable tests, and address flaky tests - ultimately creating a sustainable quality assurance strategy focused on behaviors rather than coverage percentages. Learn specific techniques for behavioral auditing, integration testing, and continuous quality monitoring that have transformed our approach to software reliability.&lt;/p&gt;</summary><content type="html">&lt;ul&gt;
&lt;li&gt;&lt;a href="#introduction"&gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-coverage-trap"&gt;The Coverage Trap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#beyond-line-coverage-behavioral-auditing"&gt;Beyond Line Coverage: Behavioral Auditing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-api-first-testing-strategy"&gt;The API-First Testing Strategy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#automating-the-tedious-parts"&gt;Automating the Tedious Parts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-flaky-test-problem"&gt;The Flaky Test Problem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#test-quality-as-a-first-class-concern"&gt;Test Quality as a First-Class Concern&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#integration-and-end-to-end-validation"&gt;Integration and End-to-End Validation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#looking-forward"&gt;Looking Forward&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a id="introduction"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Over the past year, I've found myself increasingly dissapointed with the traditional approach to test coverage. Sure, hitting 90% line coverage feels good, but I've watched too many "well-tested" codebases crumble under the weight of production bugs that somehow slipped through. The problem isn't just that we're measuring the wrong things - it's that we're treating testing as a checkbox exercise rather than a comprehensive quality assurance strategy.&lt;/p&gt;
&lt;p&gt;That's when I started experimenting with GitHub Copilot's Agent Mode, not just as a code completion tool, but as a systematic approach to building truly complete test suites. What I discovered was a fundamentally different way of thinking about testing - one that goes beyond coverage percentages to focus on behavioral completeness, integration reliability, and long-term maintainability.&lt;/p&gt;
&lt;p&gt;&lt;a id="the-coverage-trap"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="the-coverage-trap"&gt;The Coverage Trap&lt;/h2&gt;
&lt;p&gt;Let me start with a confession: I used to be obsessed with coverage numbers. There's something deeply satisfying about seeing that green 95% coverage badge, but recent research finds "disconcerting trends for maintainability" when we rely too heavily on automated tools without proper oversight. The truth is, coverage metrics can lull you into a false sense of security.&lt;/p&gt;
&lt;p&gt;I learned this the hard way when our system failed spectacularly in production due to a race condition in our retry logic. The lines were covered, but the behavior wasn't tested. That's when I realized we needed to &lt;strong&gt;shift from measuring what code runs to validating what the code actually does&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a id="beyond-line-coverage-behavioral-auditing"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="beyond-line-coverage-behavioral-auditing"&gt;Beyond Line Coverage: Behavioral Auditing&lt;/h2&gt;
&lt;p&gt;The first breakthrough came when I started using Copilot to audit our entire codebase for behavioral gaps. Instead of focusing on lines of code, I began asking it to identify untested public functions and methods. This simple prompt changed everything:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"Scan the codebase identify all public functions and methods, then report which of them lack any direct test invocation. Group them by module."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(NOTE: you should include your code base as context in Agent mode with e.g. &lt;code&gt;#codebase&lt;/code&gt; or specific dir &lt;code&gt;#file:src&lt;/code&gt; )&lt;/p&gt;
&lt;p&gt;This might look similar to coverage testing but instead of covered lines you are getting information about functions that are not called directly by any of the tests.&lt;/p&gt;
&lt;p&gt;What emerged was startling. We had entire utility functions, error handling routines, and data transformation methods that had never been directly tested. They were covered by higher level tests, but their specific behaviors - especially edge cases remained completely unvalidated.&lt;/p&gt;
&lt;p&gt;This behavioral audit approach revealed gaps that traditional coverage tools simply can't detect. When you're validating input spaces rather than code paths, you uncover scenarios like empty inputs, malformed data, and maximum size payloads that can break your system in ways that line coverage never anticipates.&lt;/p&gt;
&lt;p&gt;&lt;a id="the-api-first-testing-strategy"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="the-api-first-testing-strategy"&gt;The API-First Testing Strategy&lt;/h2&gt;
&lt;p&gt;One of the most valuable insights from this journey has been the importance of API surface auditing. Every Flask endpoint, every REST API, every public interface represents a contract with the outside world. Breaking these contracts doesn't just cause bugs - it breaks trust with users and downstream systems.&lt;/p&gt;
&lt;p&gt;I started having Copilot systematically inventory all our endpoints and cross-reference them with our integration tests. The results were eye-opening: critical user journeys like password reset and account verification had comprehensive unit tests but no end-to-end validation. Copilot did the work of finding relevant files, extracting the relevant styles and patterns, and applying those forward to the new test suite that it generated, creating coherent integration tests that followed our established patterns.&lt;/p&gt;
&lt;p&gt;This approach catches issues that unit tests simply can't see serialization problems, authentication flows, error response formats, and the subtle ways that components interact when they're wired together in a real system.&lt;/p&gt;
&lt;p&gt;&lt;a id="automating-the-tedious-parts"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="automating-the-tedious-parts"&gt;Automating the Tedious Parts&lt;/h2&gt;
&lt;p&gt;Once I had a clear picture of what needed testing, the next challenge was actually writing all those tests. This is where GitHub Copilot's ability to generate tests becomes invaluable - you can select the code you want to test, right-click in your IDE and select Copilot -&amp;gt; Generate Tests, or use slash commands to quickly scaffold test suites.&lt;/p&gt;
&lt;p&gt;But I discovered that the real power isn't in generating individual tests - it's in systematically working through entire modules. I'd point Copilot at a file like &lt;code&gt;payment_processor.py&lt;/code&gt; and ask it to generate pytest tests covering valid payments, negative amounts, and simulated network failures using mocks. The agent would create the test file, inject proper fixtures, write assertions, and even run the tests to check for immediate failures.&lt;/p&gt;
&lt;p&gt;More importantly, Copilot excels at converting repetitive test patterns into parameterized tests. Instead of five nearly identical tests for different input values, I could ask it to consolidate them into a single &lt;code&gt;@pytest.mark.parametrize&lt;/code&gt; block. This not only reduces maintenance overhead but makes it trivial to add new edge cases as you discover them.&lt;/p&gt;
&lt;p&gt;&lt;a id="the-flaky-test-problem"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="the-flaky-test-problem"&gt;The Flaky Test Problem&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Fleaky tests - the tests that pass sometimes and fail other times are the bane of every CI pipeline. They waste developer time, obscure real issues, and erode trust in your test suite.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;No discussion of comprehensive testing is complete without addressing the elephant in the room: flaky tests. There are two main types of flaky tests: those that are flaky due to some external conditions, such as network issues, machine crashes, power outages, and those that are flaky due to test design issues.&lt;/p&gt;
&lt;p&gt;To spot flaky tests, you need to compare test results from multiple test runs. This analysis would be a time consuming process to perform manually, but fortunately, many CI servers detect flaky tests automatically. The key insight is that Copilot can go beyond just detection to root cause analysis and remediation.&lt;/p&gt;
&lt;p&gt;For timing related flakiness, it suggests explicit waits or better synchronization. For external dependency issues, it recommends proper mocking. For shared state problems, it proposes better isolation techniques. The goal isn't to eliminate all flakiness - that's impossible, but to make your test suite reliable enough that failures actually mean something.&lt;/p&gt;
&lt;p&gt;&lt;a id="test-quality-as-a-first-class-concern"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="test-quality-as-a-first-class-concern"&gt;Test Quality as a First-Class Concern&lt;/h2&gt;
&lt;p&gt;As our test suite grew, I realized that test quality itself needed to become a first-class concern. Bad tests are worse than no tests - they give you false confidence while slowing down development. This is where Copilot's analytical capabilities really shine.&lt;/p&gt;
&lt;p&gt;I started having it audit our test directory for common anti-patterns: empty test functions, duplicated assertions, magic constants, and tests that rely on implicit ordering. The agent would flag these issues and suggest refactors-converting magic numbers to named constants, extracting common setup into fixtures, and consolidating duplicate logic.&lt;/p&gt;
&lt;p&gt;But the most valuable insight was learning to cross-reference coverage reports with module criticality. Not all code is equally important, and not all untested code represents the same level of risk. By having Copilot map coverage data against business-critical modules like payment processing and authentication, I could focus our testing efforts where they would have the most impact.&lt;/p&gt;
&lt;p&gt;&lt;a id="integration-and-end-to-end-validation"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="integration-and-end-to-end-validation"&gt;Integration and End-to-End Validation&lt;/h2&gt;
&lt;p&gt;Unit tests form the foundation, but they can't catch the subtle ways that components interact in production. This is where integration and end-to-end testing become crucial, and where Copilot's ability to understand entire workflows becomes invaluable.&lt;/p&gt;
&lt;p&gt;I've had great success asking Copilot to generate integration tests that exercise entire user journeys - from account creation through data processing to final output. These tests use in-memory databases for speed but validate the complete data flow including serialization, authentication, and error handling.&lt;/p&gt;
&lt;p&gt;The key is to focus on critical user paths rather than trying to test every possible integration. A single end-to-end test that uploads a CSV file, triggers data ingestion, and verifies the resulting database entries can catch a surprising number of issues that unit tests miss entirely.&lt;/p&gt;
&lt;p&gt;&lt;a id="looking-forward"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="looking-forward"&gt;Looking Forward&lt;/h2&gt;
&lt;p&gt;After a year of experimenting with this approach, I've come to believe that comprehensive testing isn't about reaching some magical coverage percentage - it's about building systems that give you confidence in your code's behavior. Copilot has been instrumental in making this transition from coverage-focused to behavior-focused testing.&lt;/p&gt;
&lt;p&gt;The techniques I've described here - behavioral auditing, API surface validation, automated test generation, flaky test management, and continuous quality monitoring - work together to create a testing strategy that's both comprehensive and maintainable. Each element addresses a different aspect of the testing challenge, from initial coverage gaps to long-term sustainability.&lt;/p&gt;
&lt;p&gt;What excites me most is that this is just the beginning. As AI tools become more sophisticated, I expect we'll see even more powerful approaches to test analysis and generation. The key is to remember that these tools are amplifiers of human insight, not replacements for it. The goal is to spend less time on mechanical test-writing and more time on the kinds of deep, thoughtful testing that actually prevents bugs.&lt;/p&gt;
&lt;p&gt;The future of testing isn't about perfect coverage - it's about perfect understanding of what your code actually does, and having the confidence that comes from knowing you've validated the behaviors that matter most.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edits:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;2026-02-07: Added table of contents with anchor links&lt;/li&gt;
&lt;/ul&gt;</content><category term="Software Development"/><category term="github-copilot"/><category term="test-coverage"/><category term="test-automation"/><category term="code-quality"/><category term="automated-testing"/><category term="integration-testing"/><category term="flaky-tests"/><category term="ci-cd"/><category term="quality-assurance"/><category term="behavior-driven-testing"/></entry><entry><title>Understanding Python's `copy` vs `deepcopy` - When to Use Each</title><link href="https://www.safjan.com/understanding-pythons-copy-vs-deepcopy-when-to-use-each/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-03-20T00:00:00+01:00</published><updated>2025-03-20T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-03-20:/understanding-pythons-copy-vs-deepcopy-when-to-use-each/</id><summary type="html">&lt;p&gt;Learn when to use &lt;code&gt;copy.copy()&lt;/code&gt; for shallow copying and &lt;code&gt;copy.deepcopy()&lt;/code&gt; for deep copying in Python, understanding their differences and typical use cases.&lt;/p&gt;</summary><content type="html">&lt;p&gt;When working with Python objects, understanding how to properly copy them is crucial for avoiding unexpected behaviors in your code. Python provides two main approaches for copying objects: &lt;code&gt;copy&lt;/code&gt; and &lt;code&gt;deepcopy&lt;/code&gt;. Let's explore the differences, use cases, and potential pitfalls of each.&lt;/p&gt;
&lt;h2 id="the-basics-shallow-vs-deep-copying"&gt;The Basics: Shallow vs. Deep Copying&lt;/h2&gt;
&lt;p&gt;Python's &lt;code&gt;copy&lt;/code&gt; module provides two primary functions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;copy.copy()&lt;/code&gt; - Creates a shallow copy&lt;/li&gt;
&lt;li&gt;&lt;code&gt;copy.deepcopy()&lt;/code&gt; - Creates a deep copy&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The difference lies in how they handle nested objects.&lt;/p&gt;
&lt;h3 id="shallow-copy-copycopy"&gt;Shallow Copy (&lt;code&gt;copy.copy()&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;A shallow copy creates a new object, but doesn't create copies of nested objects - it just copies references to them. This means changes to nested objects in the copy will affect the original, and vice versa.&lt;/p&gt;
&lt;h3 id="deep-copy-copydeepcopy"&gt;Deep Copy (&lt;code&gt;copy.deepcopy()&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;A deep copy creates a completely independent clone - it recursively copies all nested objects, creating a fully separate hierarchy of objects.&lt;/p&gt;
&lt;h2 id="lets-see-it-in-action"&gt;Let's See It In Action&lt;/h2&gt;
&lt;p&gt;Here's a practical example showing the difference:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;copy&lt;/span&gt;

&lt;span class="c1"&gt;# Let&amp;#39;s create a nested list&lt;/span&gt;
&lt;span class="n"&gt;original&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="c1"&gt;# Create a shallow copy&lt;/span&gt;
&lt;span class="n"&gt;shallow_copied&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create a deep copy&lt;/span&gt;
&lt;span class="n"&gt;deep_copied&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Let&amp;#39;s modify the nested list in the original&lt;/span&gt;
&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;changed!&amp;#39;&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Original:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Shallow copy:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shallow_copied&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Deep copy:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deep_copied&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Original: [1, 2, [&amp;#39;changed!&amp;#39;, 4]]
Shallow copy: [1, 2, [&amp;#39;changed!&amp;#39;, 4]]
Deep copy: [1, 2, [3, 4]]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice how changing the nested list in the original affected the shallow copy but not the deep copy!&lt;/p&gt;
&lt;h2 id="typical-use-cases"&gt;Typical Use Cases&lt;/h2&gt;
&lt;h3 id="when-to-use-shallow-copy-copy"&gt;When to Use Shallow Copy (&lt;code&gt;copy&lt;/code&gt;)&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Performance Matters&lt;/strong&gt;: When dealing with large objects where a deep copy would be expensive, and you're confident you won't modify nested objects.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simple Data Structures&lt;/strong&gt;: When your object contains only immutable values like numbers, strings, or tuples.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;copy&lt;/span&gt;

&lt;span class="c1"&gt;# Dictionary with simple values&lt;/span&gt;
&lt;span class="n"&gt;user_settings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;theme&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;dark&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;notifications&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;volume&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;75&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Safe to use shallow copy here&lt;/span&gt;
&lt;span class="n"&gt;backup_settings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_settings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="when-to-use-deep-copy-deepcopy"&gt;When to Use Deep Copy (&lt;code&gt;deepcopy&lt;/code&gt;)&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Complex Nested Objects&lt;/strong&gt;: When working with objects that contain mutable objects like lists, dictionaries, or custom classes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;When Independence is Critical&lt;/strong&gt;: When you need to ensure modifications to the copy don't affect the original at all.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;copy&lt;/span&gt;

&lt;span class="c1"&gt;# Complex nested structure representing a user profile&lt;/span&gt;
&lt;span class="n"&gt;user_profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Alex&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;preferences&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;theme&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;dark&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;notifications&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;email&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;push&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;friends&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Taylor&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;status&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;online&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Jordan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;status&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;offline&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# We need a true independent copy to modify&lt;/span&gt;
&lt;span class="n"&gt;new_profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_profile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Now we can safely modify nested lists and dictionaries&lt;/span&gt;
&lt;span class="n"&gt;new_profile&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;friends&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;status&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;away&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;new_profile&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;preferences&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;notifications&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;sms&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Original friend status:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_profile&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;friends&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;status&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# Still &amp;quot;online&amp;quot;&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Copy friend status:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_profile&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;friends&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;status&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# &amp;quot;away&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="common-gotchas-and-pitfalls"&gt;Common Gotchas and Pitfalls&lt;/h2&gt;
&lt;h3 id="1-forgetting-the-import"&gt;1. Forgetting the import&lt;/h3&gt;
&lt;p&gt;Remember to &lt;code&gt;import copy&lt;/code&gt; before using these functions!&lt;/p&gt;
&lt;h3 id="2-assignment-is-not-copying"&gt;2. Assignment Is Not Copying&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# This is NOT a copy - it&amp;#39;s just another reference to the same object&lt;/span&gt;
&lt;span class="n"&gt;list2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;list1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="3-list-slicing-creates-shallow-copies"&gt;3. List Slicing Creates Shallow Copies&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;original&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;sliced_copy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;[:]&lt;/span&gt;  &lt;span class="c1"&gt;# Equivalent to copy.copy()&lt;/span&gt;

&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;changed!&amp;quot;&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sliced_copy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Will show [1, 2, [&amp;#39;changed!&amp;#39;, 4]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="4-performance-considerations"&gt;4. Performance Considerations&lt;/h3&gt;
&lt;p&gt;Deep copying can be resource-intensive for large nested structures:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;copy&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;

&lt;span class="c1"&gt;# Create a large nested structure&lt;/span&gt;
&lt;span class="n"&gt;large_structure&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="c1"&gt;# Time shallow copy&lt;/span&gt;
&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;shallow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;large_structure&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Shallow copy took: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.6f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; seconds&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Time deep copy&lt;/span&gt;
&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;deep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;large_structure&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Deep copy took: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.6f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; seconds&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3 id="5-circular-references"&gt;5. Circular References&lt;/h3&gt;
&lt;p&gt;Be careful with circular references when using &lt;code&gt;deepcopy&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;copy&lt;/span&gt;

&lt;span class="c1"&gt;# Create a circular reference&lt;/span&gt;
&lt;span class="n"&gt;circular&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;circular&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;circular&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# The list contains itself!&lt;/span&gt;

&lt;span class="c1"&gt;# This works fine, handling the circular reference properly&lt;/span&gt;
&lt;span class="n"&gt;deep_circular&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;circular&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="custom-copying-behavior"&gt;Custom Copying Behavior&lt;/h2&gt;
&lt;p&gt;You can customize how your objects are copied by implementing &lt;code&gt;__copy__&lt;/code&gt; and &lt;code&gt;__deepcopy__&lt;/code&gt; methods:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;copy&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__copy__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Create a new instance and copy attributes&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__deepcopy__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memo&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Create a new instance with deeply copied attributes&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memo&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__repr__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Person(name=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, address=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage&lt;/span&gt;
&lt;span class="n"&gt;person&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;city&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;New York&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;zip&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;10001&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;person_copy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;person_deepcopy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Modify the address in the original&lt;/span&gt;
&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;city&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Boston&amp;quot;&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Shows updated city&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;person_copy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Also shows updated city (shallow copy)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;person_deepcopy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Still shows &amp;quot;New York&amp;quot; (deep copy)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="related-topics-to-explore"&gt;Related Topics to Explore&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Immutable vs. Mutable Objects&lt;/strong&gt;: Understanding this fundamental Python concept helps clarify when copying is necessary.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Object References in Python&lt;/strong&gt;: Diving deeper into how Python handles object references.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Memory Management&lt;/strong&gt;: Learning how Python allocates and deallocates memory can help you make better choices about copying.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The &lt;code&gt;pickle&lt;/code&gt; Module&lt;/strong&gt;: For serializing and deserializing Python objects, another approach to object copying.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Performance Optimization&lt;/strong&gt;: Strategies for efficient copying when working with large data structures.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="references"&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.python.org/3/library/copy.html"&gt;Python's Official Documentation on the copy module&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.python.org/3/reference/datamodel.html#object.__copy__"&gt;Python Data Model - Object Customization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://realpython.com/copying-python-objects/"&gt;Real Python's Guide to Shallow vs Deep Copying&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="key-takeaways"&gt;Key takeaways&lt;/h2&gt;
&lt;p&gt;Understanding the distinction between shallow and deep copying is essential for writing robust Python code that behaves as expected. Choose &lt;code&gt;copy()&lt;/code&gt; when you need a quick, lightweight duplication of simple structures, and &lt;code&gt;deepcopy()&lt;/code&gt; when you need complete independence between the original and copied objects.&lt;/p&gt;</content><category term="note"/><category term="python"/><category term="python-copy"/><category term="deepcopy"/><category term="mutable"/><category term="immutable"/><category term="nested-objects"/></entry><entry><title>Tracking Down zsh Alias Plugin Sources</title><link href="https://www.safjan.com/tracking-down-zsh-alias-plugin-sources/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-02-28T00:00:00+01:00</published><updated>2025-02-28T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-02-28:/tracking-down-zsh-alias-plugin-sources/</id><summary type="html">&lt;p&gt;Learn how to trace and identify the source of zsh aliases defined by plugins using verbose tracing and grep, enabling you to pinpoint exactly where custom aliases are created.&lt;/p&gt;</summary><content type="html">&lt;p&gt;When hunting for the origin of mysteriously defined by unknown plugin zsh aliases:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;zsh&lt;span class="w"&gt; &lt;/span&gt;-xv&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;grep&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;alias_name&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This works by:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Starting zsh with &lt;code&gt;-x&lt;/code&gt; (trace) and &lt;code&gt;-v&lt;/code&gt; (verbose) flags&lt;/li&gt;
&lt;li&gt;Redirecting both stdout and stderr (&lt;code&gt;2&amp;gt;&amp;amp;1&lt;/code&gt;) to capture all output&lt;/li&gt;
&lt;li&gt;Filtering with &lt;code&gt;grep&lt;/code&gt; to find when your alias is defined&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For a more targeted approach with less output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;zsh&lt;span class="w"&gt; &lt;/span&gt;-xv&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;grep&lt;span class="w"&gt; &lt;/span&gt;-A&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;source.*plugin&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;grep&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;alias_name&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This helps identify which plugin file is sourcing your alias.&lt;/p&gt;</content><category term="note"/><category term="til"/><category term="zsh"/><category term="zsh-alias"/><category term="alias"/></entry><entry><title>Guide to Managing VS Code Keyboard Shortcuts</title><link href="https://www.safjan.com/managing-vs-code-keyboard-shortcuts-a-complete-guide/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-02-11T00:00:00+01:00</published><updated>2025-02-11T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-02-11:/managing-vs-code-keyboard-shortcuts-a-complete-guide/</id><summary type="html">&lt;p&gt;Learn how VS Code handles context-dependent keyboard shortcuts, resolve conflicts using the Keyboard Shortcuts editor, and customize them for efficient coding.&lt;/p&gt;</summary><content type="html">&lt;p&gt;I've been using VS Code for years, and keyboard shortcuts can definitely get messy, especially when you have multiple extensions installed. Let me break this down for you.&lt;/p&gt;
&lt;p&gt;The interesting thing about VS Code shortcuts is that they can indeed be context-dependent. This means the same keyboard combination might do different things depending on whether you're in a Python file, Jupyter notebook, or even the integrated terminal.&lt;/p&gt;
&lt;h2 id="how-vs-code-handles-shortcuts"&gt;How VS Code Handles Shortcuts&lt;/h2&gt;
&lt;p&gt;VS Code uses a system of "when clauses" to determine which shortcut should fire in a given context. Think of it like a priority system where the more specific context wins. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A shortcut that works only in Python files (when: "editorLangId == 'python'")&lt;/li&gt;
&lt;li&gt;A shortcut that works in any text editor (when: "editorFocus")&lt;/li&gt;
&lt;li&gt;A global shortcut that works everywhere&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="finding-and-resolving-conflicts"&gt;Finding and Resolving Conflicts&lt;/h2&gt;
&lt;p&gt;Here's how you can investigate and fix shortcut conflicts:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Open the Keyboard Shortcuts editor:&lt;/li&gt;
&lt;li&gt;Press Cmd+K Cmd+S (Mac) or Ctrl+K Ctrl+S (Windows/Linux)&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Or go to Code &amp;gt; Preferences &amp;gt; Keyboard Shortcuts&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Type Cmd+L (or whatever shortcut you're investigating) in the search bar. This will show you all commands using that combination.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Look for overlapping shortcuts. If you see multiple entries, that's your conflict!&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="fixing-conflicts"&gt;Fixing Conflicts&lt;/h2&gt;
&lt;p&gt;You have several options:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Change the shortcut for the command you use less frequently&lt;/li&gt;
&lt;li&gt;Add a more specific "when" clause to make the shortcuts context-dependent&lt;/li&gt;
&lt;li&gt;Disable one of the conflicting shortcuts&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To modify a shortcut:
1. Find it in the Keyboard Shortcuts editor
2. Click the pencil icon
3. Press your new desired key combination
4. Press Enter to save&lt;/p&gt;
&lt;h2 id="context-matters"&gt;Context Matters&lt;/h2&gt;
&lt;p&gt;Different shortcuts can be active in different contexts like:
- &lt;code&gt;editorFocus&lt;/code&gt; (when text editor has focus)
- &lt;code&gt;terminalFocus&lt;/code&gt; (when terminal is focused)
- &lt;code&gt;notebookEditorFocus&lt;/code&gt; (in Jupyter notebooks)
- &lt;code&gt;editorLangId == 'python'&lt;/code&gt; (in Python files specifically)&lt;/p&gt;
&lt;p&gt;Want to see what context you're in? Try the "Developer: Toggle Keyboard Shortcuts Troubleshooting" command. It'll show you active contexts as you work.&lt;/p&gt;
&lt;h2 id="pro-tips"&gt;Pro Tips&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Back up your custom shortcuts! They're stored in keybindings.json, which you can access through the Keyboard Shortcuts editor.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use the "Show Conflicts" option in the Keyboard Shortcuts editor to quickly spot problematic bindings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Extensions can add their own keybindings. Check their documentation or the Keyboard Shortcuts editor to see what they've added.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="references"&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://code.visualstudio.com/docs/getstarted/keybindings"&gt;Keyboard shortcuts for Visual Studio Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://toxigon.com/customizing-vs-code-keybindings"&gt;How to Customize VS Code Keybindings: A Comprehensive Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://trycatchdebug.net/news/1472749/vscode-keybindings-guide"&gt;trycatchdebug.net | 522: Connection timed out&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="note"/><category term="vscode"/><category term="keybindings"/><category term="personalization"/><category term="customization"/></entry><entry><title>Simple In-Memory Knowledge Graphs for Quick Graph Querying</title><link href="https://www.safjan.com/simple-inmemory-knowledge-graphs-for-quick-graph-querying/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2025-01-16T00:00:00+01:00</published><updated>2026-02-07T00:00:00+01:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2025-01-16:/simple-inmemory-knowledge-graphs-for-quick-graph-querying/</id><summary type="html">&lt;p&gt;As developers, we often reach for full-scale graph databases when simpler solutions would suffice. When your knowledge graph is modest in size, keeping it in memory can be both efficient and practical. Let's explore some powerful tools that make this approach work beautifully.&lt;/p&gt;</summary><content type="html">&lt;ul&gt;
&lt;li&gt;&lt;a href="#networkx---the-python-swiss-army-knife"&gt;NetworkX - The Python Swiss Army Knife&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#rdflib---when-you-need-semantic-power"&gt;RDFLib - When You Need Semantic Power&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#pygraphviz---visualization-with-query-capabilities"&gt;PyGraphviz - Visualization with Query Capabilities&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#diy-solution---custom-graph-structure"&gt;DIY Solution - Custom Graph Structure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#making-the-right-choice"&gt;Making the Right Choice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#performance-considerations"&gt;Performance Considerations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#beyond-simple-solutions"&gt;Beyond Simple Solutions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-note-on-automated-graph-construction"&gt;A Note on Automated Graph Construction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#wrapping-up"&gt;Wrapping Up&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#further-reading-references"&gt;Further Reading, References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Working with knowledge graphs doesn't always require &lt;a href="https://neo4j.com/"&gt;Neo4j&lt;/a&gt; or other heavyweight solutions. Sometimes you need a lightweight way to represent and query graph data right in memory. Let me share some approachable solutions I've found particularly useful.&lt;/p&gt;
&lt;p&gt;&lt;a id="networkx---the-python-swiss-army-knife"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="networkx-the-python-swiss-army-knife"&gt;NetworkX - The Python Swiss Army Knife&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://github.com/networkx/networkx"&gt;NetworkX&lt;/a&gt; has been my reliable companion for simple graph operations. It's incredibly intuitive and perfect for smaller knowledge graphs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;networkx&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;nx&lt;/span&gt;

&lt;span class="c1"&gt;# Create a knowledge graph&lt;/span&gt;
&lt;span class="n"&gt;G&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Graph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Add some knowledge&lt;/span&gt;
&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;knows&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Bob&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Bob&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;works_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;TechCorp&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Simple queries&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_connections&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can test it with this example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Test NetworkX&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;find_connections&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;find_connections&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Bob&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The output is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;NetworkX&lt;/span&gt; &lt;span class="n"&gt;Example&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Bob&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relationship&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;knows&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;})]&lt;/span&gt;
&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Alice&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relationship&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;knows&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;TechCorp&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relationship&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;works_at&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;})]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="rdflib---when-you-need-semantic-power"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="rdflib-when-you-need-semantic-power"&gt;RDFLib - When You Need Semantic Power&lt;/h2&gt;
&lt;p&gt;If you're dealing with semantic data and need &lt;a href="https://en.wikipedia.org/wiki/SPARQL"&gt;SPARQL&lt;/a&gt;-like querying, &lt;a href="https://rdflib.readthedocs.io/en/stable/index.html"&gt;RDFLib&lt;/a&gt; provides a perfect middle ground:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;rdflib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Graph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RDF&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;URIRef&lt;/span&gt;

&lt;span class="c1"&gt;# Create an in-memory graph&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Graph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Add triples&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;URIRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Alice&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;URIRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;knows&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;URIRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Bob&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="c1"&gt;# Query using SPARQL&lt;/span&gt;
&lt;span class="n"&gt;qres&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;SELECT ?s ?o&lt;/span&gt;
&lt;span class="sd"&gt;       WHERE {&lt;/span&gt;
&lt;span class="sd"&gt;          ?s knows ?o .&lt;/span&gt;
&lt;span class="sd"&gt;       }&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;qres&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; -&amp;gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The output is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;RDFLib&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Example&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="n"&gt;Bob&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;TechCorp&lt;/span&gt;
&lt;span class="n"&gt;Alice&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Bob&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="pygraphviz---visualization-with-query-capabilities"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="pygraphviz-visualization-with-query-capabilities"&gt;PyGraphviz - Visualization with Query Capabilities&lt;/h2&gt;
&lt;p&gt;When you need both visualization and querying use &lt;a href="https://github.com/pygraphviz/pygraphviz"&gt;pygraphviz&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pygraphviz&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pgv&lt;/span&gt;

&lt;span class="n"&gt;G&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pgv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AGraph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Bob&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;relationship&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;knows&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_relationships&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: There might be an problem when installing pygraphviz in Google Colab, you can use matlplotlib + networkx instead&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a id="diy-solution---custom-graph-structure"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="diy-solution-custom-graph-structure"&gt;DIY Solution - Custom Graph Structure&lt;/h2&gt;
&lt;p&gt;Sometimes, a custom solution fits best:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SimpleKG&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_by_subject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_connected_nodes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;connected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;objects&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;connected&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;connected&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_paths&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;

        &lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;])]&lt;/span&gt;
        &lt;span class="n"&gt;paths&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vertex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;connected_nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_connected_nodes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vertex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;next_node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;connected_nodes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;next_node&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;paths&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;next_node&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;next_node&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;visited&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;next_node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;next_node&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;paths&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_by_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predicates&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;predicates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;predicates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_connected_through_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here is simple example how you can test it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Test the implementation&lt;/span&gt;
&lt;span class="n"&gt;kg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SimpleKG&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;knows&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Bob&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Bob&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;works_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;TechCorp&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;TechCorp&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;located_in&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;San Francisco&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Query by subject &amp;#39;Alice&amp;#39;:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query_by_subject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Find paths from Alice to TechCorp:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_paths&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Alice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;TechCorp&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The output is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Query by subject &amp;#39;Alice&amp;#39;: {&amp;#39;knows&amp;#39;: [&amp;#39;Bob&amp;#39;]}
Find paths from Alice to TechCorp: &amp;#39;Alice&amp;#39;, &amp;#39;Bob&amp;#39;, &amp;#39;TechCorp&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;More advanced example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Create sample data&lt;/span&gt;
&lt;span class="n"&gt;kg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SimpleKG&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Movies data&lt;/span&gt;
&lt;span class="n"&gt;movies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Inception&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;The Dark Knight&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Interstellar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Dunkirk&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Memento&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;The Prestige&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Tenet&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Fight Club&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Se7en&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;The Social Network&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Gone Girl&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Panic Room&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;directors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;David Fincher&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Martin Scorsese&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Quentin Tarantino&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Steven Spielberg&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;actors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Leonardo DiCaprio&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christian Bale&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Matthew McConaughey&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Brad Pitt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Tom Hardy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Marion Cotillard&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Michael Caine&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Anne Hathaway&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Cillian Murphy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Joseph Gordon-Levitt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Ellen Page&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Jesse Eisenberg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Ben Affleck&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Rosamund Pike&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Add relationships&lt;/span&gt;
&lt;span class="c1"&gt;# Directors directed movies&lt;/span&gt;
&lt;span class="n"&gt;movie_director&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Inception&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;The Dark Knight&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Interstellar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Dunkirk&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Memento&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;The Prestige&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Tenet&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Fight Club&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;David Fincher&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Se7en&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;David Fincher&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;The Social Network&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;David Fincher&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Gone Girl&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;David Fincher&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Panic Room&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;David Fincher&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;movie_director&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;directed_by&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;directed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Add actors to movies (random assignment for demonstration)&lt;/span&gt;
&lt;span class="n"&gt;movie_actors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Inception&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Leonardo DiCaprio&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Tom Hardy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Marion Cotillard&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Michael Caine&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Ellen Page&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Joseph Gordon-Levitt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;The Dark Knight&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Christian Bale&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Michael Caine&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Cillian Murphy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Interstellar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Matthew McConaughey&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Anne Hathaway&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Michael Caine&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Fight Club&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Brad Pitt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;The Social Network&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Jesse Eisenberg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Gone Girl&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Ben Affleck&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Rosamund Pike&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cast&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;movie_actors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;actor&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cast&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;stars&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;acted_in&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Add some awards&lt;/span&gt;
&lt;span class="n"&gt;awards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Oscar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Golden Globe&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;BAFTA&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;directors&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;award&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;awards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;awards&lt;/span&gt;&lt;span class="p"&gt;))):&lt;/span&gt;
        &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;won&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;award&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;actor&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actors&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;award&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;awards&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;awards&lt;/span&gt;&lt;span class="p"&gt;))):&lt;/span&gt;
        &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;won&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;award&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Example queries&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;1. Find all movies directed by Christopher Nolan:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;nolan_movies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_connected_through_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;directed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nolan_movies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;2. Find actors who worked with Christopher Nolan (through any movie):&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;nolan_actors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nolan_movies&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;actors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_connected_through_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;stars&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;nolan_actors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nolan_actors&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;3. Find path between Leonardo DiCaprio and Christopher Nolan:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;paths&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_paths&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Leonardo DiCaprio&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Christopher Nolan&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Found paths:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot; -&amp;gt; &amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;4. Find Oscar winners:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;oscar_winners&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_by_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;won&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;winner&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;winner&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;oscar_winners&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;winner&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Oscar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;5. Find common movies between Michael Caine and Leonardo DiCaprio:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;caine_movies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_connected_through_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Michael Caine&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;acted_in&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;dicaprio_movies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_connected_through_predicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Leonardo DiCaprio&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;acted_in&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;caine_movies&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;dicaprio_movies&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;1.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Find&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;movies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;directed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Christopher&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Nolan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="err"&gt;[&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Inception&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Dark&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Knight&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="nb"&gt;Int&lt;/span&gt;&lt;span class="n"&gt;erstellar&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Dunkirk&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Memento&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Prestige&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Tenet&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;

&lt;span class="mf"&gt;2.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Find&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;actors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;who&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;worked&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Christopher&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Nolan&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;through&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="err"&gt;[&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Christian&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Bale&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Michael&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Caine&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Joseph&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kr"&gt;Go&lt;/span&gt;&lt;span class="n"&gt;rdon&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;Levitt&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Cillian&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Murphy&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Anne&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Hathaway&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Matthew&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;McConaughey&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Marion&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Cotillard&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Ellen&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Page&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="kr"&gt;To&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Hardy&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;

&lt;span class="mf"&gt;3.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Find&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;between&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Christopher&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Nolan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="n"&gt;Found&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Inception&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Christopher&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Nolan&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Inception&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Michael&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Caine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Dark&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Knight&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Christopher&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Nolan&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Inception&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Michael&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Caine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Int&lt;/span&gt;&lt;span class="n"&gt;erstellar&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Christopher&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Nolan&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;

&lt;span class="mf"&gt;5.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Find&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Oscar&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;winners&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="err"&gt;[&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="kr"&gt;To&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Hardy&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Marion&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Cotillard&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Christian&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Bale&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Matthew&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;McConaughey&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Brad&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Pitt&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Martin&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Scorsese&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;

&lt;span class="mf"&gt;5.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Find&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;common&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;movies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;between&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Michael&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Caine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Leonardo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DiCaprio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="err"&gt;[&amp;#39;&lt;/span&gt;&lt;span class="n"&gt;Inception&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="making-the-right-choice"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="making-the-right-choice"&gt;Making the Right Choice&lt;/h2&gt;
&lt;p&gt;The best solution depends on your specific needs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use NetworkX for general graph operations and algorithms&lt;/li&gt;
&lt;li&gt;Choose RDFLib when working with semantic data and SPARQL&lt;/li&gt;
&lt;li&gt;Go with PyGraphviz when visualization is important&lt;/li&gt;
&lt;li&gt;Consider a custom solution for specialized query patterns&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a id="performance-considerations"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="performance-considerations"&gt;Performance Considerations&lt;/h2&gt;
&lt;p&gt;These solutions work well for graphs with thousands of nodes and edges. The key is keeping everything in memory and optimizing your query patterns. For NetworkX and RDFLib, using their built-in query methods is usually faster than writing custom traversal code.&lt;/p&gt;
&lt;p&gt;&lt;a id="beyond-simple-solutions"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="beyond-simple-solutions"&gt;Beyond Simple Solutions&lt;/h2&gt;
&lt;p&gt;When your knowledge graph grows beyond memory constraints or you need more complex querying capabilities, it might be time to consider solutions like &lt;a href="https://neo4j.com/"&gt;Neo4j&lt;/a&gt; or &lt;a href="https://aws.amazon.com/neptune/"&gt;Amazon Neptune&lt;/a&gt;. However, for many use cases, these in-memory solutions provide the perfect balance of simplicity and functionality.&lt;/p&gt;
&lt;p&gt;&lt;a id="a-note-on-automated-graph-construction"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="a-note-on-automated-graph-construction"&gt;A Note on Automated Graph Construction&lt;/h2&gt;
&lt;p&gt;Building knowledge graphs by hand, as shown in our examples, is straightforward. However, automatically constructing them from documents or unstructured data is a complex challenge worthy of its own article. Here are some key challenges you'll face:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Entity recognition&lt;/strong&gt; and &lt;strong&gt;disambiguation&lt;/strong&gt; is perhaps the trickiest part - determining whether "Apple" refers to the fruit or the company, or whether two mentions of "John Smith" refer to the same person. You'll need to handle coreference resolution (understanding that "he" refers to "John" mentioned earlier) and deal with variations in how entities are written ("NYC" vs "New York City").&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Relationship extraction&lt;/strong&gt; comes with its own set of problems. Natural language is complex and often implicit - extracting clear, structured relationships from sentences like "After years at Microsoft, Sarah brought her expertise to the startup" requires sophisticated NLP techniques.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data quality&lt;/strong&gt; and &lt;strong&gt;consistency&lt;/strong&gt; are also major concerns. Sources might conflict with each other, contain outdated information, or present opinions as facts. You'll need strategies for handling uncertainty and conflicting information in your graph.&lt;/p&gt;
&lt;p&gt;If you're interested in automatic graph construction, I'd recommend starting with established NLP libraries and knowledge graph toolkits rather than building everything from scratch. But that's a topic for another deep dive!&lt;/p&gt;
&lt;p&gt;&lt;a id="wrapping-up"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="wrapping-up"&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;Don't jump to complex graph databases when simpler solutions might suffice. These in-memory approaches can handle surprisingly complex tasks while keeping your codebase clean and maintainable. Plus, they're perfect for prototyping before committing to a full-scale graph database solution.&lt;/p&gt;
&lt;p&gt;&lt;a id="further-reading-references"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="further-reading-references"&gt;Further Reading, References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;paper &lt;a href="https://arxiv.org/abs/2305.14485"&gt;[2305.14485] Knowledge Graphs Querying&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;similarar attempt as in this article to build and query knowledge graph: &lt;a href="https://medium.com/analytics-vidhya/querying-using-simple-knowledge-graphs-abeb13d05e48"&gt;Querying using simple knowledge graphs | by Vishnu Nandakumar | Analytics Vidhya | Medium&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://memgraph.com/docs/ai-ecosystem/graph-rag"&gt;GraphRAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cognipy.org/"&gt;CogniPy for Pandas - In-memory Graph Database and Knowledge Graph with Natural Language Interface - CogniPy 1.0.0 documentation&lt;/a&gt; - In-memory Graph Database and Knowledge Graph with Natural Language Interface&lt;/li&gt;
&lt;li&gt;not necessarily small and simple solutions: &lt;a href="https://www.puppygraph.com/blog/best-graph-databases"&gt;Title Unavailable | Site Unreachable&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Edits:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;2026-02-07: Added table of contents with anchor links&lt;/li&gt;
&lt;/ul&gt;</content><category term="Howto"/><category term="graph"/><category term="knowledge-graph"/><category term="query"/><category term="networkx"/><category term="neo4j"/><category term="rdflib"/><category term="SPARQL"/></entry><entry><title>Quick Ways to Disable GitHub Actions Workflows Without Deletion</title><link href="https://www.safjan.com/quick-ways-to-disable-github-actions-workflows-without-deletion/?utm_source=rss&amp;utm_medium=feed&amp;utm_campaign=safjan-blog" rel="alternate"/><published>2024-10-03T00:00:00+02:00</published><updated>2024-10-03T00:00:00+02:00</updated><author><name>Krystian Safjan</name></author><id>tag:www.safjan.com,2024-10-03:/quick-ways-to-disable-github-actions-workflows-without-deletion/</id><summary type="html">&lt;p&gt;Learn three quick methods to temporarily disable GitHub Actions workflows without deleting them, including commenting out code, using manual triggers, and adding conditional logic.&lt;/p&gt;</summary><content type="html">&lt;p&gt;GitHub Actions workflows are powerful automation tools, but sometimes you need to temporarily disable them. Here are three simple methods to pause a workflow without deleting its YAML file:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Comment out the file: Add '#' at the start of each line in the workflow file.&lt;/li&gt;
&lt;li&gt;Use manual triggers: Replace existing triggers with &lt;code&gt;on: workflow_dispatch&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Add a false condition: Insert &lt;code&gt;if: false&lt;/code&gt; under the &lt;code&gt;jobs&lt;/code&gt; key.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="example-of-method-3"&gt;Example of method 3&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;My Workflow&lt;/span&gt;

&lt;span class="nt"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;branches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p p-Indicator"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="p p-Indicator"&gt;]&lt;/span&gt;

&lt;span class="nt"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;if&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# This disables the entire workflow&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;build&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;runs-on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ubuntu-latest&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;actions/checkout@v2&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# ... rest of the job steps&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</content><category term="note"/><category term="github"/><category term="github-actions"/><category term="workflow"/><category term="ci"/><category term="yaml"/></entry></feed>