
API ergonomics enhancement: expose types as enums #545

Description

@lightbody

Right now, the way the Java SDK works, you have to do things like:

if (event.isOutputTextDelta()) {
  // do stuff with event.asOutputTextDelta()
} else if (event.isCompleted()) {
  // do stuff with event.asCompleted()
} // etc

Since Java 14, switch expressions have made it really convenient to switch over enums, and it would be nice if we could do something like:

switch (event.type()) {
  case OUTPUT_TEXT_DELTA -> // do stuff with event.asOutputTextDelta()
  case COMPLETED -> // do stuff with event.asCompleted()
}

And since Java 21, switch pattern matching is also available. Unfortunately, that doesn't work here, because the style the OpenAI SDK uses does not give classes like ResponseTextDeltaEvent or ResponseCompletedEvent a common base type such as ResponseStreamEvent. If it did, we could do:

switch (event) {
  case ResponseTextDeltaEvent e -> // do stuff with e
  case ResponseCompletedEvent e -> // do stuff with e
}
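For what it's worth, here is a minimal, self-contained sketch of what that would enable, assuming the SDK modeled events as a sealed hierarchy; the record shapes and the delta() accessor are invented for illustration and do not match the SDK's actual fields:

```java
// Hypothetical sketch: if ResponseStreamEvent were a sealed interface,
// Java 21 pattern matching over it would compile and be checked for
// exhaustiveness. Names mirror the SDK, but this hierarchy is invented.
sealed interface ResponseStreamEvent
        permits ResponseTextDeltaEvent, ResponseCompletedEvent {}

record ResponseTextDeltaEvent(String delta) implements ResponseStreamEvent {}

record ResponseCompletedEvent() implements ResponseStreamEvent {}

class PatternMatchDemo {
    static String handle(ResponseStreamEvent event) {
        // No default branch needed: the sealed hierarchy makes the
        // switch exhaustive, so adding an event type becomes a compile error
        // at every call site that forgets to handle it.
        return switch (event) {
            case ResponseTextDeltaEvent e -> "delta: " + e.delta();
            case ResponseCompletedEvent e -> "completed";
        };
    }
}
```

A nice side effect is that the compiler, not a runtime UNKNOWN branch, catches unhandled event types.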

From a dev-ergonomics standpoint, it would be nice if support were added for one or both of these :) Thank you for considering it!

PS: the workaround is easy enough; we just maintain something like this:

public enum ResponseStreamEventType {
    OUTPUT_TEXT_DELTA,
    COMPLETED,
    ERROR,
    UNKNOWN;
    
    // TODO: Add the 40+ additional event types such as...
    // SESSION_CREATED, SESSION_UPDATED, CONVERSATION_ITEM_CREATED, CONVERSATION_ITEM_INPUT_AUDIO_TRANSCRIPTION_COMPLETED,
    // CONVERSATION_ITEM_INPUT_AUDIO_TRANSCRIPTION_FAILED, CONVERSATION_ITEM_TRUNCATED, CONVERSATION_ITEM_DELETED,
    // INPUT_AUDIO_BUFFER_COMMITTED, INPUT_AUDIO_BUFFER_CLEARED, INPUT_AUDIO_BUFFER_SPEECH_STARTED, INPUT_AUDIO_BUFFER_SPEECH_STOPPED,
    // RESPONSE_CREATED, RESPONSE_DONE, RESPONSE_OUTPUT_ITEM_ADDED, RESPONSE_OUTPUT_ITEM_DONE, RESPONSE_CONTENT_PART_ADDED,
    // RESPONSE_CONTENT_PART_DONE, RESPONSE_TEXT_DELTA, RESPONSE_TEXT_DONE, RESPONSE_AUDIO_TRANSCRIPT_DELTA,
    // RESPONSE_AUDIO_TRANSCRIPT_DONE, RESPONSE_AUDIO_DELTA, RESPONSE_AUDIO_DONE, RESPONSE_FUNCTION_CALL_ARGUMENTS_DELTA,
    // RESPONSE_FUNCTION_CALL_ARGUMENTS_DONE, RATE_LIMITS_UPDATED, etc.
    // See: https://platform.openai.com/docs/api-reference/responses-streaming

    public static ResponseStreamEventType classify(ResponseStreamEvent event) {
        if (event.isOutputTextDelta()) {
            return OUTPUT_TEXT_DELTA;
        } else if (event.isCompleted()) {
            return COMPLETED;
        } else if (event.isError()) {
            return ERROR;
        } else {
            return UNKNOWN;
        }
    }
}

And then we can switch on the enum that classify(event) returns.
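For example, the call site can collapse the if/else-if chain into one enhanced switch. This is a self-contained sketch: the ResponseStreamEvent stub below is invented so the example compiles on its own, standing in for the real SDK type, which exposes the same is* accessors:

```java
// Stub standing in for the SDK's ResponseStreamEvent; invented for
// illustration so this sketch compiles without the SDK on the classpath.
final class ResponseStreamEvent {
    private final String kind;
    ResponseStreamEvent(String kind) { this.kind = kind; }
    boolean isOutputTextDelta() { return "output_text.delta".equals(kind); }
    boolean isCompleted()       { return "completed".equals(kind); }
    boolean isError()           { return "error".equals(kind); }
}

enum ResponseStreamEventType {
    OUTPUT_TEXT_DELTA, COMPLETED, ERROR, UNKNOWN;

    static ResponseStreamEventType classify(ResponseStreamEvent event) {
        if (event.isOutputTextDelta()) return OUTPUT_TEXT_DELTA;
        if (event.isCompleted())       return COMPLETED;
        if (event.isError())           return ERROR;
        return UNKNOWN;
    }
}

class ClassifyDemo {
    // One switch expression replaces the is*/as* if/else-if chain.
    static String handle(ResponseStreamEvent event) {
        return switch (ResponseStreamEventType.classify(event)) {
            case OUTPUT_TEXT_DELTA -> "delta";
            case COMPLETED -> "completed";
            case ERROR -> "error";
            case UNKNOWN -> "unknown";
        };
    }
}
```

The UNKNOWN arm is the cost of maintaining the enum by hand: events the classifier doesn't know about fall through silently instead of failing to compile.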
