Update README.md
This commit is contained in:
@@ -81,13 +81,13 @@ curl http://127.0.0.1:8000/v1/chat/completions \
|
|||||||
|
|
||||||
### Thinking effort
|
### Thinking effort
|
||||||
|
|
||||||
|
- `--reasoning-effort` (choice of low,medium,high)
|
||||||
GPT-5 has a configurable amount of "effort" it can put into thinking, which may cause it to take more time for a response to return, but may overall give a smarter answer. Applying this parameter after `serve` forces the server to use this reasoning effort by default, unless overrided by the API request with a different effort set. The default reasoning effort without setting this parameter is `medium`.
|
GPT-5 has a configurable amount of "effort" it can put into thinking, which may cause it to take more time for a response to return, but may overall give a smarter answer. Applying this parameter after `serve` forces the server to use this reasoning effort by default, unless overrided by the API request with a different effort set. The default reasoning effort without setting this parameter is `medium`.
|
||||||
`--reasoning-effort` (choice of low,medium,high)
|
|
||||||
|
|
||||||
### Thinking summaries
|
### Thinking summaries
|
||||||
|
|
||||||
|
- `--reasoning-summary` (choice of auto,concise,detailed,none)
|
||||||
Models like GPT-5 do not return raw thinking content, but instead return thinking summaries. These can also be customised by you.
|
Models like GPT-5 do not return raw thinking content, but instead return thinking summaries. These can also be customised by you.
|
||||||
`--reasoning-summary` (choice of auto,concise,detailed,none)
|
|
||||||
|
|
||||||
## Notes
|
## Notes
|
||||||
If you wish to have the fastest responses, I'd recommend setting `--reasoning-effort` to low, and `--reasoning-summary` to none.
|
If you wish to have the fastest responses, I'd recommend setting `--reasoning-effort` to low, and `--reasoning-summary` to none.
|
||||||
|
|||||||
Reference in New Issue
Block a user