Fix Error Propagation During Chat Streaming #285

Open · gilljon opened this issue Nov 6, 2024 · 6 comments
Labels: enhancement (New feature or request)

Comments

@gilljon

gilljon commented Nov 6, 2024

If stream = True, we are noticing that a 400 shows up (in the server logs) as:

StreamError("Invalid status code: 400 Bad Request")

On the client side (consuming via the Python client), we see:

httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

If stream = False, we still get the 400, but with much richer information:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    out = client.chat.completions.create(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python3.12/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "python3.12/site-packages/openai/resources/chat/completions.py", line 829, in create
    return self._post(
           ^^^^^^^^^^^
  File "python3.12/site-packages/openai/_base_client.py", line 1277, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python3.12/site-packages/openai/_base_client.py", line 954, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "python3.12/site-packages/openai/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: After the optional system message, conversation roles must alternate user/assistant/user/assistant/...

What would be great is being able to retrieve the enriched error information even when streaming is enabled. That is already the behavior of the Python client.
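
For reference, on the async-openai side the desired behavior would look roughly like this. This is only a hedged sketch, not code from this issue: the model name and message are placeholders, and it assumes the OpenAIError::ApiError / OpenAIError::StreamError variants from async-openai around 0.25.

use async_openai::{
    error::OpenAIError,
    types::{ChatCompletionRequestUserMessageArgs, CreateChatCompletionRequestArgs},
    Client,
};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new();

    // Placeholder request; the point is only how errors surface from the stream.
    let request = CreateChatCompletionRequestArgs::default()
        .model("gpt-4o-mini") // placeholder model name
        .messages([ChatCompletionRequestUserMessageArgs::default()
            .content("hello")
            .build()?
            .into()])
        .build()?;

    let mut stream = client.chat().create_stream(request).await?;

    while let Some(result) = stream.next().await {
        match result {
            Ok(chunk) => println!("{:?}", chunk.choices),
            // Today a 400 during streaming arrives here as
            // StreamError("Invalid status code: 400 Bad Request"),
            // with the response body (message/type/param/code) discarded.
            Err(OpenAIError::StreamError(msg)) => eprintln!("stream error: {msg}"),
            // What this issue asks for: the same ApiError the non-streaming
            // path returns, so the enriched message reaches the caller.
            Err(OpenAIError::ApiError(api_err)) => eprintln!("api error: {api_err:?}"),
            Err(e) => eprintln!("other error: {e}"),
        }
    }
    Ok(())
}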

@64bit added the enhancement label on Nov 12, 2024
@SeseMueller

Chasing it down a bit, the actual response is contained in the error, so a fix shouldn't be that hard.

Basically, the error type that later gets used is defined in error.rs of reqwest-eventsource-0.6.0. The InvalidStatusCode variant on line 43 contains both the StatusCode and the Response. However, it is annotated with the error macro from the thiserror crate (see line 42), so only the StatusCode makes it into the message and the Response is discarded.

That is, as long as the error message generated by that macro is what actually gets used.

In src/client.rs on line 441, async-openai (0.25.0, sorry) simply calls e.to_string() on the error it may have gotten from the EventSource, discarding all other information contained in e, such as the Response.

Changing this to something that properly captures the Response, or at least appends it to the error string, would drastically improve stream debugging, as discussed in this issue.

I might work on it later.
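
For concreteness, a rough sketch of the kind of change described above. The helper name is made up and this is not the actual patch; InvalidStatusCode and InvalidContentType are the reqwest-eventsource 0.6 variants that still carry the reqwest::Response.

use reqwest_eventsource::Error as EventSourceError;

// Sketch: instead of OpenAIError::StreamError(e.to_string()), pull the body
// out of the variants that still hold the reqwest::Response.
async fn stream_error_message(e: EventSourceError) -> String {
    match e {
        EventSourceError::InvalidStatusCode(status, resp) => {
            let body = resp.text().await.unwrap_or_default();
            format!("Invalid status code: {status}\n{body}")
        }
        EventSourceError::InvalidContentType(content_type, resp) => {
            let body = resp.text().await.unwrap_or_default();
            format!("Invalid content type: {content_type:?}\n{body}")
        }
        other => other.to_string(),
    }
}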

SeseMueller pushed a commit to SeseMueller/async-openai that referenced this issue Feb 2, 2025
@SeseMueller

I added a simple block so that, if an invalid status code or content type is encountered, a custom error String that also contains the full Response text is used instead.

I didn't fully understand the testing suite, or whether this is already covered by existing tests; sorry about that.

I'll quickly test it in my local environment and report back.

@SeseMueller

SeseMueller commented Feb 2, 2025

It does work; I got it to output a more useful error when calling o3-mini.
Comparison:

Before:

StreamError("Invalid status code: 400 Bad Request")

Now:

StreamError("Invalid status code: 400 Bad Request\n{\n  \"error\": {\n    \"message\": \"Unsupported parameter: 'parallel_tool_calls' is not supported with this model.\",\n    \"type\": \"invalid_request_error\",\n    \"param\": \"parallel_tool_calls\",\n    \"code\": \"unsupported_parameter\"\n  }\n}")

Note that the error is quite ugly, but it is much more informative.
I'm not sure how the errors could reasonably be pretty-printed in a way that still keeps all the useful information.

Edit:
This actually helped a lot when working with the new reasoning models, as they have a few parameters they don't want set. The error messages directly told me which parameters I needed to turn off.

@64bit
Owner

64bit commented Feb 2, 2025

Thanks for debugging and sharing. It seems like the underlying response.text().await can be deserialized into WrappedError and eventually returned as the OpenAIError::ApiError variant. When deserialization of response.text().await into WrappedError fails, fall back to OpenAIError::StreamError(response.text().await.<unwrap-with-default>).
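
Something along these lines, I assume. This is only a sketch: WrappedError here mirrors async-openai's internal { error: ApiError } wrapper, and map_stream_error is a made-up helper name.

use async_openai::error::{ApiError, OpenAIError};
use serde::Deserialize;

// Mirrors the shape of the OpenAI error payload: { "error": { ... } }.
#[derive(Deserialize)]
struct WrappedError {
    error: ApiError,
}

// Hypothetical helper: turn a failed event-source response into the same
// OpenAIError::ApiError the non-streaming path returns, falling back to
// StreamError with the raw body when the JSON doesn't match.
async fn map_stream_error(resp: reqwest::Response) -> OpenAIError {
    let text = resp.text().await.unwrap_or_default();
    match serde_json::from_str::<WrappedError>(&text) {
        Ok(wrapped) => OpenAIError::ApiError(wrapped.error),
        Err(_) => OpenAIError::StreamError(text),
    }
}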

@gilljon
Author

gilljon commented Feb 5, 2025

Could someone share a code snippet of what exactly needs to change in order to support better error handling?

@SeseMueller

SeseMueller commented Feb 5, 2025

Could someone share a code snippet of what exactly needs to change in order to support better error handling?

I did a minimal solution on my fork: SeseMueller@4b82e9a
