Update README.md

2025-08-16 00:41:42 +05:00
parent 43da12410b
commit aebe879619
1 changed file with 8 additions and 6 deletions
--- a/README.md
+++ b/README.md
@@ -79,13 +79,15 @@ curl http://127.0.0.1:8000/v1/chat/completions \
 # Customisation / Configuration
- Thinking effort
+### Thinking effort
  GPT-5 has a configurable amount of "effort" it can put into thinking, which may cause it to take more time for a response to return, but may overall give a smarter answer. Applying this parameter after `serve` forces the server to use this reasoning effort by default, unless overridden by an API request that sets a different effort. The default reasoning effort without setting this parameter is `medium`.
  - `--reasoning-effort` (choice of low,medium,high)
- Thinking summaries
+GPT-5 has a configurable amount of "effort" it can put into thinking, which may cause it to take more time for a response to return, but may overall give a smarter answer. Applying this parameter after `serve` forces the server to use this reasoning effort by default, unless overridden by an API request that sets a different effort. The default reasoning effort without setting this parameter is `medium`.
-  Models like GPT-5 do not return raw thinking content, but instead return thinking summaries. These can also be customised by you.
+`--reasoning-effort` (choice of low,medium,high)
-  - `--reasoning-summary` (choice of auto,concise,detailed,none)
+
 ### Thinking summaries
 Models like GPT-5 do not return raw thinking content, but instead return thinking summaries. These can also be customised by you.
 `--reasoning-summary` (choice of auto,concise,detailed,none)
 ## Notes
 If you wish to have the fastest responses, I'd recommend setting `--reasoning-effort` to low, and `--reasoning-summary` to none.
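
Reviewer sketch, not part of the commit: the two flags described in this hunk could be assembled and validated as below. The helper name `serve_args` is hypothetical, and the program name that precedes `serve` is not shown in this diff, so it is deliberately left out. Only the flag names and their choices are taken from the README text.

```python
import shlex

# Allowed values, copied from the README text in this diff.
EFFORT_CHOICES = ("low", "medium", "high")
SUMMARY_CHOICES = ("auto", "concise", "detailed", "none")


def serve_args(effort=None, summary=None):
    """Build the arguments that follow the program name (hypothetical helper).

    A flag is only emitted when a value is given, so the server's own
    defaults apply otherwise (the README states `medium` for effort).
    """
    args = ["serve"]
    if effort is not None:
        if effort not in EFFORT_CHOICES:
            raise ValueError(f"effort must be one of {EFFORT_CHOICES}")
        args += ["--reasoning-effort", effort]
    if summary is not None:
        if summary not in SUMMARY_CHOICES:
            raise ValueError(f"summary must be one of {SUMMARY_CHOICES}")
        args += ["--reasoning-summary", summary]
    return args


if __name__ == "__main__":
    # The "fastest responses" combination recommended in the Notes section.
    print(shlex.join(serve_args(effort="low", summary="none")))
```

Rejecting unknown values before launching keeps a typo such as `--reasoning-effort hihg` from reaching the server at all.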