<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Phi on Strathweb. A free flowing tech monologue.</title>
    <link>https://www.strathweb.com/categories/phi/</link>
    <description>Recent content in Phi on Strathweb. A free flowing tech monologue.</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Mon, 23 Feb 2026 07:06:14 +0000</lastBuildDate><atom:link href="https://www.strathweb.com/categories/phi/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Fine-tuning Phi-4 with Azure ML</title>
      <link>https://www.strathweb.com/2026/02/fine-tuning-phi-4-with-azure-ml/</link>
      <pubDate>Mon, 23 Feb 2026 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2026/02/fine-tuning-phi-4-with-azure-ml/</guid>
      <description>&lt;p&gt;Recently, I dedicated quite a lot of room &lt;a href=&#34;https://www.strathweb.com/categories/phi&#34;&gt;on this blog&lt;/a&gt; to the topic of running Phi locally. This time, I want to focus on a different aspect of adopting small language models like Phi - fine-tuning them. I already covered &lt;a href=&#34;https://www.strathweb.com/2025/01/fine-tuning-phi-models-with-mlx&#34;&gt;local fine-tuning in the past&lt;/a&gt;, so today we are going to do this with &lt;a href=&#34;https://learn.microsoft.com/en-us/azure/machine-learning/overview-what-is-azure-machine-learning?view=azureml-api-2&#34;&gt;Azure Machine Learning (Azure ML)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Azure ML is a comprehensive cloud service for accelerating and managing the machine learning project lifecycle. While local fine-tuning is great, moving to Azure ML makes a lot of sense when you need to scale, and/or when you want access to Nvidia GPUs without investing in hardware.&lt;/p&gt;
&lt;p&gt;We are going to do &lt;a href=&#34;https://arxiv.org/abs/2106.09685&#34;&gt;LoRA&lt;/a&gt; fine-tuning of a Phi-4 model, and then deploy it to a managed batch endpoint for inference.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>SLM-default, LLM-fallback pattern with Agent Framework and Azure AI Foundry</title>
      <link>https://www.strathweb.com/2025/12/slm-default-llm-fallback-pattern-with-agent-framework-and-azure-ai-foundry/</link>
      <pubDate>Fri, 05 Dec 2025 08:00:00 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2025/12/slm-default-llm-fallback-pattern-with-agent-framework-and-azure-ai-foundry/</guid>
      <description>&lt;p&gt;When building AI workflows, we often face a choice: do we use a massive, expensive cloud model for everything (to ensure best reasoning capabilities), or do we cut costs with a smaller local model (and risk hallucinations)? In this post, we&amp;rsquo;ll explore a &amp;ldquo;best of both worlds&amp;rdquo; architecture, as described in the recent survey &amp;ldquo;Small Language Models for Agentic Systems&amp;rdquo; &lt;a href=&#34;https://arxiv.org/abs/2510.03847&#34;&gt;Sharma &amp;amp; Mehta, 2025&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We call this the &amp;ldquo;SLM-default, LLM-fallback&amp;rdquo; pattern. The premise is simple: route all queries to a fast, private, on-device Small Language Model (SLM) first. Only if that model cannot confidently answer the query do we escalate the request to a paid cloud model (LLM).&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>LLM and SLM collaboration using the Minions pattern (with Phi-4-mini and Azure OpenAI)</title>
      <link>https://www.strathweb.com/2025/10/llm-and-slm-collaboration-using-the-minions-pattern/</link>
      <pubDate>Fri, 24 Oct 2025 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2025/10/llm-and-slm-collaboration-using-the-minions-pattern/</guid>
      <description>&lt;p&gt;In this post, we&amp;rsquo;ll explore a novel approach to optimizing AI workflows by strategically combining large language models (LLMs) with small language models (SLMs) using the &amp;ldquo;Minions pattern.&amp;rdquo; This technique, described in the research paper &lt;a href=&#34;https://arxiv.org/abs/2502.15964&#34;&gt;&amp;ldquo;Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models&amp;rdquo;&lt;/a&gt; by Narayan et al., addresses one of the most pressing challenges in AI application development - the cost of processing large amounts of data with expensive, cloud-based language models. If you&amp;rsquo;ve ever built an AI system that needs to analyze extensive documents or datasets, you&amp;rsquo;ve probably felt the frustration of watching your API costs skyrocket as you process more and more content.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Using Phi Silica in Windows App SDK on a Copilot Plus PC</title>
      <link>https://www.strathweb.com/2025/04/using-phi-silica-in-windows-app-sdk-on-copilot-plus-pc/</link>
      <pubDate>Fri, 25 Apr 2025 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2025/04/using-phi-silica-in-windows-app-sdk-on-copilot-plus-pc/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://blogs.windows.com/windowsexperience/2024/12/06/phi-silica-small-but-mighty-on-device-slm/&#34;&gt;Last year&lt;/a&gt;, Microsoft announced the Copilot Plus PC, a new class of devices that are designed to run AI workloads locally. The flagship device of the line is of course the &lt;a href=&#34;https://www.microsoft.com/en-us/surface/devices/surface-pro-11th-edition&#34;&gt;Surface Pro 11&lt;/a&gt;, which is powered by the Qualcomm Snapdragon X Elite ARM processor. Unfortunately, since the launch, the AI capabilities have been more than underwhelming, as few applications and workloads are able to take advantage of the integrated NPU hardware.&lt;/p&gt;
&lt;p&gt;One of the milestones in this direction is the &lt;a href=&#34;https://www.microsoft.com/en/windows/business/devices/copilot-plus-pcs&#34;&gt;Phi Silica&lt;/a&gt; model, which is a small but powerful ONNX-Runtime-based on-device SLM (Small Language Model) that is designed to run on the Copilot Plus PC &lt;a href=&#34;https://learn.microsoft.com/en-us/windows/ai/npu-devices/&#34;&gt;NPU&lt;/a&gt;, and that is built into the Windows Copilot Runtime. This removes a lot of the friction that developers face when trying to run models on-device, as they can now simply use the Windows App SDK to access the NPU and invoke the model just like any other system API.&lt;/p&gt;
&lt;p&gt;Today we will have a look at how to use the Phi Silica model in a Windows App SDK application.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Running Phi models on iOS with Apple MLX Framework</title>
      <link>https://www.strathweb.com/2025/03/running-phi-models-on-ios-with-apple-mlx-framework/</link>
      <pubDate>Mon, 10 Mar 2025 08:30:12 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2025/03/running-phi-models-on-ios-with-apple-mlx-framework/</guid>
      <description>&lt;p&gt;As I previously blogged a few times, I have been working on the &lt;a href=&#34;https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere/&#34;&gt;Strathweb Phi Engine&lt;/a&gt;, a cross-platform library for running Phi model inference via a simple, high-level API, from a number of high-level languages: C#, Swift, Kotlin and Python. This of course includes the capability of running Phi models on iOS devices, and the sample repo contains a &lt;a href=&#34;https://github.com/filipw/strathweb-phi-engine/tree/main/samples/ios/phi.engine.sample&#34;&gt;demo SwiftUI application&lt;/a&gt; that demonstrates how to do this.&lt;/p&gt;
&lt;p&gt;Today I wanted to show an alternative way of running Phi models on iOS devices, using Apple&amp;rsquo;s &lt;a href=&#34;https://opensource.apple.com/projects/mlx/&#34;&gt;MLX framework&lt;/a&gt;. I previously &lt;a href=&#34;https://www.strathweb.com/2025/01/fine-tuning-phi-models-with-mlx&#34;&gt;blogged&lt;/a&gt; about fine-tuning Phi models on iOS using MLX, so that post is a good read if you want to learn more about the MLX framework and how to use it.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Strathweb Phi Engine - now with Phi-4 support</title>
      <link>https://www.strathweb.com/2025/02/strathweb-phi-engine-now-with-phi-4-support/</link>
      <pubDate>Mon, 24 Feb 2025 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2025/02/strathweb-phi-engine-now-with-phi-4-support/</guid>
      <description>&lt;p&gt;Last summer, I launched &lt;a href=&#34;https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere&#34;&gt;Strathweb Phi Engine&lt;/a&gt; — a cross-platform library for running Phi model inference via a simple, high-level API, from a number of high-level languages: C#, Swift, Kotlin and Python.&lt;/p&gt;
&lt;p&gt;Today I am happy to announce support for Phi-4, the latest model in the Phi family, which Microsoft AI &lt;a href=&#34;https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft%E2%80%99s-newest-small-language-model-specializing-in-comple/4357090&#34;&gt;released&lt;/a&gt; in December 2024.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Fine tuning Phi models with MLX</title>
      <link>https://www.strathweb.com/2025/01/fine-tuning-phi-models-with-mlx/</link>
      <pubDate>Fri, 17 Jan 2025 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2025/01/fine-tuning-phi-models-with-mlx/</guid>
      <description>&lt;p&gt;Recently, I dedicated quite a lot of room &lt;a href=&#34;https://www.strathweb.com/categories/phi/&#34;&gt;on this blog&lt;/a&gt; to the topic of running Phi locally with the &lt;a href=&#34;https://github.com/filipw/strathweb-phi-engine&#34;&gt;Strathweb Phi Engine&lt;/a&gt;. This time, I want to focus on a different aspect of adopting small language models like Phi - fine-tuning them. We are going to do this with Apple&amp;rsquo;s &lt;a href=&#34;https://opensource.apple.com/projects/mlx/&#34;&gt;MLX&lt;/a&gt; library, which offers excellent performance for ML-related tasks on Apple Silicon.&lt;/p&gt;
&lt;p&gt;We are going to do &lt;a href=&#34;https://huggingface.co/docs/peft/main/en/conceptual_guides/lora&#34;&gt;LoRA&lt;/a&gt; fine tuning of a Phi model, and then invoke it using Strathweb Phi Engine.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Running Phi Inference in .NET Applications with Strathweb Phi Engine</title>
      <link>https://www.strathweb.com/2024/12/running-phi-inference-in-net-applications-with-strathweb-phi-engine/</link>
      <pubDate>Fri, 20 Dec 2024 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2024/12/running-phi-inference-in-net-applications-with-strathweb-phi-engine/</guid>
      <description>&lt;p&gt;Local AI inference has become increasingly important for developers seeking to build robust, privacy-preserving applications. In this deep dive, I&amp;rsquo;ll show you how to leverage &lt;a href=&#34;https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere&#34;&gt;Strathweb Phi Engine&lt;/a&gt; multi-platform library to run Microsoft&amp;rsquo;s Phi-family models directly in your .NET applications, exploring both basic integration patterns and advanced features that make Phi inference more accessible than ever.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Strathweb Phi Engine - now with Safe Tensors support</title>
      <link>https://www.strathweb.com/2024/11/strathweb-phi-engine-now-with-safe-tensors-support/</link>
      <pubDate>Fri, 15 Nov 2024 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2024/11/strathweb-phi-engine-now-with-safe-tensors-support/</guid>
      <description>&lt;p&gt;This summer, I announced the &lt;a href=&#34;https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere&#34;&gt;Strathweb Phi Engine&lt;/a&gt; — a cross-platform library for running Phi inference anywhere. Up until now, the library only supported models in the quantized GGUF format. Today, I&amp;rsquo;m excited to share that the library now also supports the Safe Tensor model format.&lt;/p&gt;
&lt;p&gt;This enhancement significantly expands the scope of use cases and interoperability for the Strathweb Phi Engine. With Safe Tensor support, you can now load and execute models in a format that is not only performant but also prioritizes security and memory safety. Notably, all the Phi models published by Microsoft use the Safe Tensor format by default.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Using Local Phi-3 Models in AutoGen with Strathweb Phi Engine</title>
      <link>https://www.strathweb.com/2024/09/using-local-phi-3-models-in-autogen-with-strathweb-phi-engine/</link>
      <pubDate>Fri, 06 Sep 2024 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2024/09/using-local-phi-3-models-in-autogen-with-strathweb-phi-engine/</guid>
      <description>&lt;p&gt;I recently announced &lt;a href=&#34;https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere&#34;&gt;Strathweb Phi Engine&lt;/a&gt;, a cross-platform library/toolset for conveniently running Phi-3 (almost) anywhere. Today I would like to show how to integrate a local Phi-3 model, orchestrated by Strathweb Phi Engine, into an agentic workflow built with &lt;a href=&#34;https://github.com/microsoft/autogen&#34;&gt;AutoGen&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Announcing Strathweb Phi Engine - a cross-platform library for running Phi-3 anywhere</title>
      <link>https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere/</link>
      <pubDate>Thu, 25 Jul 2024 04:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2024/07/announcing-strathweb-phi-engine-a-cross-platform-library-for-running-phi-3-anywhere/</guid>
      <description>&lt;p&gt;I &lt;a href=&#34;https://www.strathweb.com/2024/05/running-microsoft-phi-3-model-in-an-ios-app-with-rust&#34;&gt;recently&lt;/a&gt; wrote a blog post about using Rust to run Phi-3 model on iOS. The post received an overwhelmingly positive response, and I got a lot of questions about running Phi-3 using similar approach on other platforms, such as Android, Windows, macOS or Linux. Today, I&amp;rsquo;m excited to announce the project I have been working on recently - Strathweb Phi Engine, a cross-platform library for running Phi-3 (almost) anywhere.&lt;/p&gt;</description>
    </item>
    
    <item>
      <title>Running Microsoft&#39;s Phi-3 Model in an iOS app with Rust</title>
      <link>https://www.strathweb.com/2024/05/running-microsoft-phi-3-model-in-an-ios-app-with-rust/</link>
      <pubDate>Thu, 09 May 2024 07:06:14 +0000</pubDate>
      
      <guid>https://www.strathweb.com/2024/05/running-microsoft-phi-3-model-in-an-ios-app-with-rust/</guid>
      <description>&lt;p&gt;Last month, &lt;a href=&#34;https://azure.microsoft.com/en-us/blog/introducing-phi-3-redefining-whats-possible-with-slms/&#34;&gt;Microsoft released&lt;/a&gt; the exciting new minimal AI model, Phi-3 mini. It&amp;rsquo;s a 3.8B model that can outperform many other larger models, while still being small enough to run on a phone. In this post, we&amp;rsquo;ll explore how to run the Phi-3 model inside a SwiftUI iOS application using the minimalist ML framework for Rust, called &lt;a href=&#34;https://github.com/huggingface/candle&#34;&gt;candle&lt;/a&gt;, and built by the nice folks at HuggingFace.&lt;/p&gt;</description>
    </item>
    
  </channel>
</rss>
