Design forecaster package #10

@elsehow

From @houtanb

My plan for this repo (we need buy-in for this) is to have another folder, something like forecaster. It would contain a default forecaster, the ForecastBench zero-shot forecaster, with

  • a complete list of models ever run on FB (like you've started here)
  • the parameters they were run with
  • the zero-shot prompt

so that FRI staff could easily use a FB forecaster in their code.
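
One possible shape for such a default forecaster, as a rough sketch only: every name here (`ModelRun`, `ZeroShotForecaster`, `DEFAULT_FORECASTER`), the model IDs, the parameters, and the prompt text are hypothetical placeholders, not the actual ForecastBench values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ModelRun:
    """One model together with the parameters it was run with (hypothetical)."""
    model_id: str
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ZeroShotForecaster:
    """Bundles a prompt template with the list of models it has been run on."""
    prompt_template: str
    models: Tuple[ModelRun, ...]

    def build_prompt(self, question: str) -> str:
        # Fill the question text into the zero-shot prompt template.
        return self.prompt_template.format(question=question)

    def get_model(self, model_id: str) -> ModelRun:
        # Look up a model and its run parameters by ID.
        for model in self.models:
            if model.model_id == model_id:
                return model
        raise KeyError(model_id)


# Placeholder values: NOT the real FB prompt, models, or parameters.
DEFAULT_FORECASTER = ZeroShotForecaster(
    prompt_template="Question: {question}\nGive a probability between 0 and 1.",
    models=(
        ModelRun("example-model-a", {"temperature": 0.0}),
        ModelRun("example-model-b", {"temperature": 0.0, "max_tokens": 256}),
    ),
)
```

With something like this, a caller could write `DEFAULT_FORECASTER.build_prompt("Will X happen?")` and pass the result, along with the chosen model's `params`, to the llm library.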

But, that forecaster library would also:

  • need to obtain structured output so forecasts come back in a consistent format, or parse LLM output to provide this (as is currently done on FB)
  • handle forecasts on binary questions as with FB, but also quantile forecasts, point forecasts, multiple choice, ...
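
To make the "structured output or parse" requirement concrete, here is a minimal sketch for the binary case only. It assumes a hypothetical JSON shape (`{"probability": ...}`) and a simple last-number fallback for free text; it is not the parsing logic FB actually uses, and quantile, point, and multiple-choice forecasts would each need their own parser.

```python
import json
import re
from typing import Optional


def parse_binary_forecast(text: str) -> Optional[float]:
    """Return a probability in [0, 1] parsed from model output, or None."""
    # 1. Structured path: a JSON object with a "probability" key (assumed shape).
    try:
        data = json.loads(text)
        if isinstance(data, dict) and "probability" in data:
            value = float(data["probability"])
            return value if 0.0 <= value <= 1.0 else None
    except (ValueError, TypeError):
        pass  # Not valid JSON, or the value is not numeric: fall through.

    # 2. Fallback path: take the last number in the free text,
    #    treating a trailing "%" as a percentage.
    matches = re.findall(r"(\d+(?:\.\d+)?|\.\d+)\s*(%?)", text)
    if not matches:
        return None
    number, percent = matches[-1]
    value = float(number) / 100 if percent else float(number)
    return value if 0.0 <= value <= 1.0 else None
```

Returning `None` rather than raising lets the caller decide whether to retry the query or drop the forecast.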

This llm library is a step on which the forecaster library can be built. But, as is, it's already useful to people at FRI, as you've created this great abstraction over all these different APIs, so that anyone at FRI can query any model they want without having to look into a specific API.

Until we get that forecaster library, the list [in llm/model_regsitry.py] would need to be maintained for not much benefit. Also, the list is not complete, as many models have been run that are not present in it.

FYI, MODELS_TO_RUN on FB is usually updated at least once every two weeks, either to add new models or to pull in models we've previously run. Just updated today in forecastingresearch/forecastbench@cdc0f9c and will update again when we have access to GPT 5.1 via the API.