From 2d3d6540727be8a74712b7360e695e9c431c48a6 Mon Sep 17 00:00:00 2001
From: Game_Time <108236317+RayBytes@users.noreply.github.com>
Date: Sat, 16 Aug 2025 00:42:38 +0500
Subject: [PATCH] Update README.md
---
README.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index d06d83f..2505f07 100644
--- a/README.md
+++ b/README.md
@@ -81,12 +81,12 @@ curl http://127.0.0.1:8000/v1/chat/completions \
### Thinking effort
-- `--reasoning-effort` (choice of low,medium,high)
+- `--reasoning-effort` (choice of low,medium,high)
GPT-5 has a configurable amount of "effort" it can put into thinking, which may cause it to take more time for a response to return, but may overall give a smarter answer. Applying this parameter after `serve` forces the server to use this reasoning effort by default, unless overridden by an API request that sets a different effort. The default reasoning effort without setting this parameter is `medium`.
### Thinking summaries
-- `--reasoning-summary` (choice of auto,concise,detailed,none)
+- `--reasoning-summary` (choice of auto,concise,detailed,none)
Models like GPT-5 do not return raw thinking content, but instead return thinking summaries. These can also be customised by you.
## Notes