Description
From @houtanb
my plan for this repo (we need buy-in for this) is to have another folder, something like `forecaster`. That would have a default forecaster that is the ForecastBench zero shot forecaster, with
- a complete list of models ever run on FB (like you've started here)
- the parameters they were run with
- the zero shot prompt
so that FRI staff could easily use a FB forecaster in their code.
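As a rough illustration of the three items above (model list, run parameters, zero shot prompt), here is a minimal sketch. Everything in it is hypothetical: `ModelRun`, `REGISTRY`, `register`, `build_request`, the model name, and the placeholder prompt are made up for illustration and are not the actual FB code or prompt.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelRun:
    """One model as it was run on FB (hypothetical field names)."""
    model_id: str
    params: dict = field(default_factory=dict)


# Placeholder only; the real FB zero shot prompt is not reproduced here.
ZERO_SHOT_PROMPT = "Question: {question}\nGive a probability between 0 and 1."

# Hypothetical registry mapping model id -> how it was run.
REGISTRY: dict[str, ModelRun] = {}


def register(run: ModelRun) -> None:
    """Record a model and the parameters it was run with."""
    REGISTRY[run.model_id] = run


def build_request(model_id: str, question: str) -> dict:
    """Combine the stored parameters with the zero shot prompt."""
    run = REGISTRY[model_id]
    return {
        "model": run.model_id,
        "prompt": ZERO_SHOT_PROMPT.format(question=question),
        **run.params,
    }


# "example-model" is a made-up name, not a model run on FB.
register(ModelRun("example-model", {"temperature": 0.0}))
```

With something like this, a caller would only need `build_request("example-model", "Will it rain?")` and would not have to know which parameters a given model was run with.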
But that `forecaster` library would also:
- need to obtain structured output to get forecasts in a consistent way or parse LLM output to provide this (as is currently done on FB)
- handle forecasts on binary questions as with FB, but also quantile forecasts, point forecasts, multiple choice, ...
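To make the "parse LLM output" option concrete, here is a minimal sketch for the binary case only. It is an assumption about what such a parser could look like, not the parsing FB currently does; `parse_binary_forecast` is a hypothetical name, and quantile, point, and multiple choice forecasts would each need their own handling.

```python
import re
from typing import Optional


def parse_binary_forecast(text: str) -> Optional[float]:
    """Return the last probability in [0, 1] found in free-form text.

    Accepts decimals such as "0.73" and percentages such as "73%".
    Returns None when no usable number is present.
    """
    result = None
    # Each match is (number, "%" or "").
    for number, percent in re.findall(r"(\d+(?:\.\d+)?)\s*(%?)", text):
        value = float(number)
        if percent:
            value /= 100
        # Skip anything that cannot be a probability, e.g. a year.
        if 0.0 <= value <= 1.0:
            result = value
    return result
```

Taking the last valid number is a design choice made here on the assumption that models tend to state their final answer at the end; structured output from the API would avoid this guesswork entirely.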
This `llm` library is a step on which the `forecaster` library can be built. But, as is, it's already useful to people at FRI, as you've created this great abstraction over all these different APIs such that anyone at FRI can query any model they want without having to look into a specific API.
Until we get that `forecaster` library, the list [in llm/model_regsitry.py] would need to be maintained for not much benefit. Also, the list is not complete, as many models have been run that are not present.
FYI `MODELS_TO_RUN` on FB is usually updated at least once every two weeks with new models or pulling in models we've previously run. Just updated today in forecastingresearch/forecastbench@cdc0f9c and will update again when we have access to GPT 5.1 via the API.