OpenAI on Strathweb. A free flowing tech monologue.

Using o-series Reasoning Models in PromptFlow

Mon, 24 Mar 2025 07:06:14 +0000

If you have tried to use the OpenAI o-series reasoning models, such as o1 or o3, with PromptFlow recently, you have almost certainly run into a nasty surprise. While PromptFlow supports a wide range of models and providers, the o-series models are not among them. This is quite a shame, especially if you’d like to benchmark or evaluate your flows against those models.

In this short post, we will look at a workaround.

How GPT-4o-mini can be simultaneously 20x cheaper and 2x more expensive than GPT-4o

Fri, 25 Oct 2024 07:06:14 +0000

GPT-4o-mini is the small, cost-effective version of the GPT-4o model. It is a great default choice for developers who want a very capable and fast model, but don’t need the full power of the GPT-4o model. However, there are some important things to keep in mind when using GPT-4o-mini, especially when it comes to pricing - some of which is rather contradictory!

Speech-based retrieval augmented generation (RAG) with GPT-4o Realtime API

Mon, 14 Oct 2024 07:06:14 +0000

On October 1st, OpenAI and Microsoft (Azure OpenAI) announced the availability of the GPT-4o Realtime API for speech and audio. It is a new, innovative way of interacting with the GPT-4o model family that provides a “speech in, speech out” conversational interface. In contrast to traditional text-based APIs, the Realtime API allows sending the audio input directly to the model and receiving the audio output back. This is a significant improvement over existing approaches to voice-enabled assistants, which required converting the audio to text first, and then converting the text back to audio. The Realtime API is currently in preview, and the SDKs for various languages have mixed levels of support for it, but it is already possible to build exciting new applications with it.

The low-latency speech-based interface also poses some challenges to established AI architectural patterns, such as Retrieval-Augmented Generation (RAG) - and today we will tackle just that, and have a look at a small sample realtime-voice RAG app in .NET.

Tool Calling with Azure OpenAI - Part 2: Using the tools directly via the SDK

Fri, 19 Apr 2024 07:06:14 +0000

Last time around, we discussed how Large Language Models can select the appropriate tool and its required parameters out of freely flowing conversation text. We also introduced the formal concept of those tools, which are structurally described using an OpenAPI schema.

In this part 2 of the series, we are going to build two different .NET command line assistant applications, both taking advantage of the tool calling integration. We will orchestrate everything by hand - that is, we will only use the Azure OpenAI Service API directly (or rather using the .NET SDK for Azure OpenAI) - without any additional AI frameworks.

Tool Calling with Azure OpenAI - Part 1: The Basics

Thu, 04 Apr 2024 07:06:14 +0000

One of the fantastic capabilities of Large Language Models is their ability to choose (based on a predefined set of tool definitions) the appropriate tool and its required parameters out of freely flowing conversation text. With that, they can act as facilitators of workflow orchestration, instructing applications to invoke specific tools with a specific set of arguments.

OpenAI announced the built-in capability called function calling in the summer of last year, and by now it is an integral part of working with and building applications on top of the GPT models. The functionality was later renamed in the API to “tools”, to better express their broad scope and nature.

Today I am starting a new multi-post Azure OpenAI blog series focusing specifically on the tool capabilities. We will build a client application with .NET, and explore tool integration from different angles - using the Azure OpenAI .NET SDK directly, using the Assistants SDK and finally leveraging various orchestration frameworks such as Semantic Kernel and AutoGen. In today’s part one, we are going to introduce the basic concepts behind tool calling.
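To make the idea concrete before diving into the series, here is a minimal sketch of the application side of tool calling. It is written in Python for brevity (the series itself uses .NET) and makes no network calls. The `get_weather` tool, its parameters and the simulated model output are all hypothetical and only illustrate the shape of the interaction: the application describes a tool, the model answers with a tool name plus JSON arguments, and the application invokes the matching local function.

```python
import json

# Hypothetical tool definition, following the JSON shape used for "tools"
# in chat completion requests. The tool itself is made up for illustration.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def get_weather(city: str, unit: str = "celsius") -> str:
    # Stand-in implementation; a real tool would query a weather service.
    return f"Weather in {city}: 21 degrees {unit}"


# Maps tool names (as advertised to the model) to local functions.
LOCAL_TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Invoke the local function the model selected, with the arguments it produced."""
    if name not in LOCAL_TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return LOCAL_TOOLS[name](**arguments)


# Simulated model response: the tool name and a JSON string of arguments.
result = dispatch_tool_call("get_weather", '{"city": "Zurich"}')
print(result)
```

In a real application the tool name and the arguments string would come from the model's response rather than being hard-coded, and the result would be sent back to the model as a follow-up message.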

Combining Azure OpenAI with Azure AI Speech

Fri, 08 Mar 2024 07:06:14 +0000

In my recent posts, I’ve been exploring various facets of the Azure OpenAI Service, discussing how it can power up our applications with AI. Today, I’m taking a slightly different angle - I want to dive into how we can enhance our projects further by integrating Azure OpenAI Service with Azure AI Speech. Let’s explore what this integration means and how it could lead to exciting, AI-powered applications.

Using your own data with GPT models in Azure OpenAI - Part 4: Adding vector search

Fri, 23 Feb 2024 07:06:14 +0000

For our Retrieval-Augmented Generation (RAG) application, we set up AI Search in part 1; however, so far we have only used it with basic keyword search.

In this part 4 of the series about bringing your own data to Azure OpenAI Service, we will go ahead and integrate vector search, as a more sophisticated way of performing the search across the Azure AI Search index within our RAG-pattern system.

I already covered vectorization and embeddings using the OpenAI embedding model on this blog, and we will be relying on the same principles here. I recommend reading through that article before continuing if you are not yet familiar with the concept of embeddings.
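As a quick refresher on the principle, the sketch below ranks documents by cosine similarity between a query vector and pre-computed document vectors. It is a toy illustration in plain Python, not how Azure AI Search implements vector search: the three-dimensional vectors and document ids are made up, whereas real embeddings come from an embedding model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written "embeddings" standing in for model output.
DOCUMENTS = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.0],
}


def vector_search(query_vector, documents, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in documents.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# The query vector points mostly along the first axis, so doc-a ranks first.
for doc_id, score in vector_search([0.9, 0.1, 0.0], DOCUMENTS):
    print(doc_id, round(score, 3))
```

Unlike keyword search, nothing here depends on the query and the documents sharing literal terms; only the closeness of their vectors matters.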

Using your own data with GPT models in Azure OpenAI - Part 3: Calling Azure OpenAI Service via .NET SDK

Mon, 18 Dec 2023 07:06:14 +0000

In the last post of this series we set up a demo .NET client application that was able to call and utilize a GPT model hosted in Azure OpenAI Service, which in turn was integrated with our own custom data via Azure AI Search. We did this using the bare-bones REST API - and in part three, it’s time to shift gears and explore how to accomplish a similar task using the .NET SDK, which offers a more streamlined and less ceremonious approach than calling the HTTP endpoints directly.

Using your own data with GPT models in Azure OpenAI - Part 2: Calling Azure OpenAI Service via REST API

Fri, 24 Nov 2023 07:06:14 +0000

In the previous part of this series, we successfully set up Azure AI Search, to have it ready for integration with Azure OpenAI Service. The ultimate goal is to take advantage of the retrieval-augmented generation pattern, and enhance our interactions with the GPT model with our own custom data.

Let’s continue building this today.

Using your own data with GPT models in Azure OpenAI - Part 1: Setting up Azure AI Search

Fri, 10 Nov 2023 07:06:14 +0000

There is no question that the emergence of generative AI is going to significantly alter various aspects of our daily lives. At the same time, most of the large language models (LLMs) are designed as general-purpose black boxes and their utility is initially confined to the data they were trained on. However, it is possible to extend their functionality and reasoning to any custom data set, be it private or public, even without the massive effort that would be needed to retrain or even fine-tune them.

We are going to start exploring that concept today with a multi-part post series on “bringing your own data” to Azure OpenAI. In part one today, we will set up the necessary Azure resources and prepare the stage for a client application integration, which will follow in parts two and further.

Using embeddings model with Azure OpenAI Service

Wed, 13 Sep 2023 07:00:14 +0000

I recently blogged about building GPT-powered applications with Azure OpenAI Service. In that post, we looked at using the text-davinci-003 model to provide classification capabilities for natural text - more specifically, we categorized and rated scientific papers based on the interest area (note that the recommended model for this task is now gpt-35-turbo).

In today’s post we are going to continue exploring Azure OpenAI Service, this time looking at the embeddings model, text-embedding-ada-002.

Building GPT powered applications with Azure OpenAI Service

Wed, 26 Apr 2023 10:06:14 +0000

In this post we will have a look at how we can utilize Azure OpenAI Service to build applications using various OpenAI models. At a high level, Azure OpenAI allows accessing GPT-4, GPT-3, Codex and Embeddings models within the security boundary of Azure, while ensuring data privacy and residency and conforming to other common enterprise requirements such as private networking. In other words, it addresses one of the biggest worries of integrating AI services into your own applications - the data is never shared with OpenAI.