C#•2y ago

❔ Handling OpenAI/GPT API Functions with strong types

This is a fun design problem and I really have no idea how to approach it, so any thoughts are welcome. I have one solution that I've hacked my way towards, which I'll share, but I want to make it feel less hacky.

For ChatGPT, you can define Functions in the JSON, which it can choose to 'execute' instead of responding to a query. 'Executing' a function just means it responds with a message (Role=Assistant) containing a function call with Name={FunctionName} and Arguments={JsonArguments}, where the arguments follow the schema you defined. When a function is executed, GPT expects you to return a message with Role=Function, Name={FunctionName}, Content={FunctionOutput}, and let it respond again. It might choose to call another function, or it might respond to the user and basically end the chain.

The data it sends for a 'function call' matches the definition you gave it for the function, which looks like:

{
    "name": "get_n_day_weather_forecast",
    "description": "Get an N-day weather forecast",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            },
            "format": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The temperature unit to use. Infer this from the user's location."
            },
            "num_days": {
                "type": "integer",
                "description": "The number of days to forecast"
            }
        },
        "required": ["location", "format", "num_days"]
    }
}

I'd like to set up something to strongly type all of this. I want to define a C# method with strongly typed inputs and outputs, maybe register it at startup with DI, and have reflection or similar create everything needed to 'register' it. I'd also like a simple way to execute those functions when requested.
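
As a starting point for the "strongly typed inputs" half of that goal, here is a minimal sketch of binding the arguments JSON for the weather function above to a C# type. It assumes System.Text.Json rather than the Newtonsoft.Json used later in the thread, and the names WeatherForecastArguments, TemperatureFormat and FunctionCallParsing are hypothetical, not part of any library.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical enum mirroring the "enum" list in the schema.
public enum TemperatureFormat { Celsius, Fahrenheit }

// Hypothetical argument type mirroring the schema's "properties".
public sealed class WeatherForecastArguments
{
    [JsonPropertyName("location")]
    public string Location { get; set; } = "";

    [JsonPropertyName("format")]
    public TemperatureFormat Format { get; set; }

    [JsonPropertyName("num_days")]
    public int NumDays { get; set; }
}

public static class FunctionCallParsing
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Reads enum values from strings, ignoring case ("celsius" -> Celsius).
        Converters = { new JsonStringEnumConverter() }
    };

    // Turns the raw arguments string of a function call into a typed object.
    public static WeatherForecastArguments Parse(string argumentsJson) =>
        JsonSerializer.Deserialize<WeatherForecastArguments>(argumentsJson, Options)
        ?? throw new JsonException("Arguments payload was null.");
}
```

A handler can then take WeatherForecastArguments directly and never touch raw JSON.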

7 Replies

D.MentiaOP•2y ago

My current implementation kinda works, but it feels messy and suboptimal. I have:

public interface IGptFunction
{
    public string Name { get; set; }
    public string? Description { get; set; }
    public Dictionary<string, Gpt35Parameter> Properties { get; set; }
    public string Type { get; set; }
    public List<string> Required { get; set; }
    public Task<string> DoCallbackAsync(string input);
}


public interface IGptFunction<TInput, TOutput> : IGptFunction where TInput : new()
{
    [JsonProperty("parameters")]
    public Gpt35ParameterCollection<TInput> Parameters { get; set; }
    
}

This is basically the model we serialize, so it can't change much. The non-generic interface is the 'generic' version of a function: the kind of thing we can store in a dictionary and execute without caring what type it is, and without having to serialize the Parameters (because we don't know what type they are, either).

But then it feels like it gets weird. Gpt35FunctionBase is just a barebones implementation of that, but then I have...

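public abstract class Gpt35FunctionBase<TInput, TOutput> : Gpt35FunctionBase, IGptFunction<TInput, TOutput> where TInput : new()
{
    public Gpt35ParameterCollection<TInput> Parameters { get; set; }

    // The callback is runtime behaviour, not part of the serialized definition.
    [JsonIgnore]
    public Func<TInput, Task<TOutput>> Callback { get; set; }

    // Only subclasses can call the constructor of an abstract class, so protected is enough.
    protected Gpt35FunctionBase(Func<TInput, Task<TOutput>> callback) : base()
    {
        Callback = callback;
        // ... other setup TODO
    }

    public override async Task<string> DoCallbackAsync(string input)
    {
        // DeserializeObject returns null for a "null" or empty payload; fall back to a default instance.
        var arguments = JsonConvert.DeserializeObject<TInput>(input) ?? new TInput();
        var result = await Callback(arguments);
        // Strings go back to GPT verbatim; anything else is serialized as JSON.
        if (result is string s)
            return s;
        return JsonConvert.SerializeObject(result);
    }
}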
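
One alternative to storing a Func callback, sketched here under assumptions, is to make the handler an abstract method that each function class overrides. The type names (IChatFunction, ChatFunction, EchoFunction, EchoArguments) are hypothetical, and System.Text.Json stands in for Newtonsoft.Json so the sketch has no package dependency.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading.Tasks;

// Non-generic view: all that a dictionary of functions needs to see.
public interface IChatFunction
{
    string Name { get; }
    Task<string> InvokeAsync(string argumentsJson);
}

// Generic base: subclasses override ExecuteAsync instead of passing a Func.
public abstract class ChatFunction<TInput, TOutput> : IChatFunction
    where TInput : class
{
    public abstract string Name { get; }

    protected abstract Task<TOutput> ExecuteAsync(TInput input);

    public async Task<string> InvokeAsync(string argumentsJson)
    {
        TInput input = JsonSerializer.Deserialize<TInput>(argumentsJson)
            ?? throw new JsonException($"No arguments supplied for '{Name}'.");
        TOutput result = await ExecuteAsync(input);
        // Strings are returned verbatim; anything else is serialized as JSON.
        return result is string s ? s : JsonSerializer.Serialize(result);
    }
}

// Hypothetical example function.
public sealed class EchoArguments
{
    [JsonPropertyName("text")]
    public string Text { get; set; } = "";
}

public sealed class EchoFunction : ChatFunction<EchoArguments, string>
{
    public override string Name => "echo";

    protected override Task<string> ExecuteAsync(EchoArguments input) =>
        Task.FromResult(input.Text.ToUpperInvariant());
}
```

The trade-off is one class per function. The callback version keeps the logic wherever the delegate lives, which suits handlers that need state from the calling class.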

I'm not sure a callback is the best idea in that base class; what other options might I have? I don't see a lot of callbacks in most code, except as events, but I don't want async voids...

As for building the properties, it gets even weirder, I think. I ended up making a static constructor for the base which does it via reflection:

static Gpt35FunctionBase()
{
    // Use reflection to set up and cache Properties.
    // Flags must be combined with |; Public & Instance evaluates to 0 and matches no properties.
    var properties = typeof(TInput).GetProperties(System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.Instance);
    foreach (var p in properties)
    {
        var name = p.Name.ToLowerInvariant();
        var description = p.CustomAttributes
            .FirstOrDefault(a => a.AttributeType == typeof(GptFunctionDescriptionAttribute))
            ?.ConstructorArguments.FirstOrDefault().Value as string;
        cachedProperties[name] = new Gpt35Parameter(p.PropertyType.GetJsonType(), description);

        if (p.PropertyType.IsEnum)
            cachedProperties[name].Enum = Enum.GetNames(p.PropertyType);

        // Only Nullable<T> value types count as optional here; reference types such as string are always required.
        if (Nullable.GetUnderlyingType(p.PropertyType) == null)
            cachedRequired.Add(name);
    }
}
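
The reflection step can be tried in isolation. The sketch below is a standalone approximation, not the code from this thread: SchemaBuilder, ParameterSketch, ForecastArguments and the ToJsonType mapping are all hypothetical stand-ins (the original relies on a GetJsonType extension that is not shown). It follows the original's choice of lowercasing property names and treating only Nullable<T> properties as optional.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public enum Units { Celsius, Fahrenheit }

// Hypothetical argument type used only for this illustration.
public sealed class ForecastArguments
{
    public string Location { get; set; } = "";
    public Units Format { get; set; }
    public int? NumDays { get; set; } // Nullable<int>, so treated as optional
}

// Hypothetical stand-in for the thread's Gpt35Parameter.
public sealed record ParameterSketch(string JsonType, IReadOnlyList<string> EnumValues);

public static class SchemaBuilder
{
    public static (Dictionary<string, ParameterSketch> Properties, List<string> Required) Describe<TInput>()
    {
        var properties = new Dictionary<string, ParameterSketch>();
        var required = new List<string>();

        // Flags are combined with |. Public & Instance is 0, which matches nothing.
        foreach (PropertyInfo p in typeof(TInput).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            var underlying = Nullable.GetUnderlyingType(p.PropertyType);
            var effective = underlying ?? p.PropertyType;
            string name = p.Name.ToLowerInvariant();

            properties[name] = new ParameterSketch(
                ToJsonType(effective),
                effective.IsEnum ? Enum.GetNames(effective) : Array.Empty<string>());

            // Not Nullable<T>, so the property is listed as required.
            if (underlying == null)
                required.Add(name);
        }
        return (properties, required);
    }

    // Assumed mapping from CLR types to JSON Schema type names.
    private static string ToJsonType(Type t)
    {
        if (t == typeof(string) || t.IsEnum) return "string";
        if (t == typeof(bool)) return "boolean";
        if (t == typeof(int) || t == typeof(long)) return "integer";
        if (t == typeof(float) || t == typeof(double) || t == typeof(decimal)) return "number";
        return "object";
    }
}
```

Unwrapping Nullable<T> before checking IsEnum also lets a nullable enum property keep its list of allowed values.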

The static constructor also feels bad, but it means each base constructor can fill in its own values from the static members of its class, without needing anything specific to each function.

And finally, the definition of an actual function class, which does at least seem to have come out reasonably neat:

public class GptSummaryFunction : Gpt35FunctionBase<GptSummarizeFunctionArguments, string>
{
    public static string FunctionName { get; } = "summarize";
    public static string FunctionDescription { get; } = "Summarize the entire previous conversation as completely as possible";

    public GptSummaryFunction(Func<GptSummarizeFunctionArguments, Task<string>> callback) : base(callback)
    {
    }
}

Created like:

functions = new Dictionary<string, IGptFunction>
        {
            { GptRespondFunction.FunctionName, new GptRespondFunction(SpeakResponseFunction) },
            { GptSummaryFunction.FunctionName, new GptSummaryFunction(ExecuteSummaryFunction) }
        };

And used like:

if (functions.TryGetValue(summaryResponse.Choices[0].Message.FunctionCall.Name, out var function))
        {
            var summaryContent = await function.DoCallbackAsync(summaryResponse.Choices[0].Message.FunctionCall.Arguments); // strongly typed result
            //...

So basically... what I have kinda works, but it feels like I went around in circles and have way too many classes and it just all feels kinda awkward. Any thoughts on a better way to handle any part of this?

JakenVeina•2y ago

Hmmm, okay, so just going by that JSON: is that the JSON that is sent TO you, in order for YOU to execute the function? Looks like that's a "no". That's the JSON you send to ChatGPT to describe what functions you support, and it's going to send you a function call object based on that "schema".

Which bit are you trying to model here, the function definition or the function call? It looks like you're trying to unify the two in some way. That's not a terrible idea, though I probably wouldn't. Or, no... you're using DoCallbackAsync() as part of the deserialization scheme? Definitely gonna say "don't do that".

So, this ought to cover both bases:

JakenVeina•2y ago

https://paste.mod.gg/svqtpnqgblwg/0

JakenVeina•2y ago

You get a GptFunction.Descriptor class that you subclass for each function you want to implement. That object is directly serializable to create the definition JSON to send, and it has an ExecuteAsync() method that you can call into after deserializing a GptFunction.Call<T>.

Deserializing is also straightforward, except you inject one chunk of logic to choose the TArgumentSet to deserialize "arguments" into, after you've read "name". Once you have a GptFunction.Call<TArgumentSet>, you can use TArgumentSet to look up the correct Descriptor<TArgumentSet> and call .ExecuteAsync() on that. Slap all the right [JsonProperty("")] annotations on everything, and this can all be done with System.Text.Json.

If you want, you can have GptFunction.ParameterSetDescriptor be generated automatically with reflection, based on TArgumentSet, but I probably wouldn't unless you have a really high number of unique function calls you want to support. For just a couple of functions, you'd probably write more lines of code to do all the reflection than it would take to just hard-code a constructor tree.
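
The "read name first, then pick the type for arguments" step could look roughly like the sketch below. It is not the contents of the linked paste: FunctionCallReader, LoadUrlArguments and SummarizeArguments are hypothetical, and a plain name-to-Type dictionary stands in for the Descriptor lookup. It relies on the API delivering "arguments" as a JSON-encoded string, which is why there are two parsing passes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical argument types for two functions.
public sealed class LoadUrlArguments { public string Url { get; set; } = ""; }
public sealed class SummarizeArguments { public int MaxWords { get; set; } }

public static class FunctionCallReader
{
    // The injected logic: function name -> type to deserialize "arguments" into.
    private static readonly Dictionary<string, Type> ArgumentTypes = new()
    {
        ["loadUrl"] = typeof(LoadUrlArguments),
        ["summarize"] = typeof(SummarizeArguments),
    };

    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true
    };

    public static (string Name, object Arguments) Read(string functionCallJson)
    {
        using JsonDocument doc = JsonDocument.Parse(functionCallJson);

        // Pass 1: read "name" so the argument type can be chosen.
        string name = doc.RootElement.GetProperty("name").GetString()
            ?? throw new JsonException("Function call has no name.");

        if (!ArgumentTypes.TryGetValue(name, out var argumentType))
            throw new NotSupportedException($"Unknown function '{name}'.");

        // Pass 2: "arguments" is itself a JSON string, so it is parsed again.
        string argumentsJson = doc.RootElement.GetProperty("arguments").GetString() ?? "{}";
        object arguments = JsonSerializer.Deserialize(argumentsJson, argumentType, Options)
            ?? throw new JsonException($"No arguments supplied for '{name}'.");

        return (name, arguments);
    }
}
```

The caller can then switch on the runtime type of Arguments, or use it as a key to find the matching descriptor.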

D.MentiaOP•2y ago

The main idea is that I just want one place where I define the function's name, description, input type, and callback. Your implementation does look a lot like what I'm doing though, so I can't complain. I just added the reflection stuff, but it's good to see that I seem to be on basically the right path.

I decided to skip the idea of a class GetWeatherForecastFunction entirely, because I never actually need that class for anything; it's always treated generically. I just need a strong input type to make the callbacks easy to write and work with, so I end up just declaring them at runtime:

functionManager.AddFunction<GptLoadUrlFunctionArguments>("loadUrl", "Load the contents of a URL, summarized", LoadUrlFunction);

I'll have to look closer at it tomorrow though
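
For reference, a registry with that AddFunction shape might be sketched as follows. Only the call signature comes from the line above; the internals, the InvokeAsync method and the LoadUrlArguments type are assumptions, and System.Text.Json is used in place of Newtonsoft.Json.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical argument type for the example registration.
public sealed class LoadUrlArguments { public string Url { get; set; } = ""; }

public sealed class FunctionManager
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true
    };

    // Type-erased entries: each delegate remembers its own TArgs.
    private readonly Dictionary<string, (string Description, Func<string, Task<string>> Invoke)> _functions = new();

    public void AddFunction<TArgs>(string name, string description, Func<TArgs, Task<string>> handler)
        where TArgs : class
    {
        // Wrap the typed handler so callers only ever deal with JSON strings.
        Func<string, Task<string>> invoke = async argumentsJson =>
        {
            TArgs args = JsonSerializer.Deserialize<TArgs>(argumentsJson, Options)
                ?? throw new JsonException($"No arguments supplied for '{name}'.");
            return await handler(args);
        };
        _functions[name] = (description, invoke);
    }

    // Looks up the function GPT asked for and runs it with the raw arguments.
    public Task<string> InvokeAsync(string name, string argumentsJson) =>
        _functions.TryGetValue(name, out var entry)
            ? entry.Invoke(argumentsJson)
            : throw new KeyNotFoundException($"Unknown function '{name}'.");
}
```

Because the generic parameter is captured inside the delegate, no per-function class is needed, which matches the decision to drop GetWeatherForecastFunction-style classes.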

JakenVeina•2y ago

yeah, that's fair
