diff --git a/.github/workflows/pkgdown.yaml b/.github/workflows/pkgdown.yaml
new file mode 100644
index 000000000..663e29ce9
--- /dev/null
+++ b/.github/workflows/pkgdown.yaml
@@ -0,0 +1,56 @@
+# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
+# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
+on:
+  # build dev site on merged pushes
+  push:
+    branches: [main, master]
+  # build full site on releases
+  release:
+    types: [published]
+  workflow_dispatch:
+
+name: pkgdown.yaml
+
+jobs:
+  pkgdown:
+    runs-on: ubuntu-latest
+    # Only restrict concurrency for non-PR jobs
+    concurrency:
+      group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }}
+      cancel-in-progress: true
+    env:
+      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
+      NOT_CRAN: "true"
+    permissions:
+      contents: write
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: r-lib/actions/setup-pandoc@v2
+
+      - uses: r-lib/actions/setup-r@v2
+        with:
+          use-public-rspm: true
+
+      - uses: r-lib/actions/setup-r-dependencies@v2
+        with:
+          extra-packages: any::pkgdown, local::., stan-dev/pkgdown-config
+
+      - name: Install CmdStan
+        run: Rscript -e "cmdstanr::install_cmdstan()"
+
+      - name: Build site
+        run: |
+          pkgdown::build_site_github_pages(
+            lazy = FALSE, # change to TRUE if runner times out.
+            run_dont_run = TRUE,
+            new_process = TRUE
+          )
+        shell: Rscript {0}
+
+      - name: Deploy to GitHub pages 🚀
+        uses: JamesIves/github-pages-deploy-action@v4
+        with:
+          clean: false
+          branch: gh-pages
+          folder: docs
\ No newline at end of file
diff --git a/_pkgdown.yml b/_pkgdown.yml
index 8e28d1fc0..123d2191a 100644
--- a/_pkgdown.yml
+++ b/_pkgdown.yml
@@ -1,57 +1,40 @@
 url: https://mc-stan.org/cmdstanr
 
-destination: docs
+destination: "."
 
 development:
-  mode: release
+  mode: auto
 
 template:
-  params:
-    bootswatch: cosmo
+  package: pkgdownconfig
 
 navbar:
   title: "cmdstanr"
-  left:
-    - icon: fa-home fa-lg
-      href: index.html
-    - text: "Vignettes"
-      href: articles/index.html
-    - text: "Functions"
-      href: reference/index.html
-    - text: "News"
-      href: news/index.html
-    - text: "Other Packages"
+
+  structure:
+    left: [home, vignettes, functions, news, pkgs, stan]
+    right: [search, bluesky, forum, github, lightswitch]
+
+  components:
+    pkgs:
+      text: Other Packages
       menu:
-        - text: "rstan"
-          href: https://mc-stan.org/rstan
-        - text: "rstanarm"
-          href: https://mc-stan.org/rstanarm
-        - text: "bayesplot"
+        - text: bayesplot
           href: https://mc-stan.org/bayesplot
-        - text: "shinystan"
-          href: https://mc-stan.org/shinystan
         - text: "loo"
           href: https://mc-stan.org/loo
-        - text: "projpred"
+        - text: posterior
+          href: https://mc-stan.org/posterior
+        - text: projpred
           href: https://mc-stan.org/projpred
-        - text: "rstantools"
+        - text: rstan
+          href: https://mc-stan.org/rstan
+        - text: rstanarm
+          href: https://mc-stan.org/rstanarm
+        - text: rstantools
           href: https://mc-stan.org/rstantools
-        - text: "posterior"
-          href: https://mc-stan.org/posterior
-    - text: "Stan"
-      href: https://mc-stan.org
-  right:
-    - icon: fa-twitter
-      href: https://twitter.com/mcmc_stan
-    - icon: fa-github
-      href: https://github.com/stan-dev/cmdstanr
-    - icon: fa-users
-      href: https://discourse.mc-stan.org/
-
-home:
-  links:
-  - text: Ask a question
-    href: https://discourse.mc-stan.org/
+        - text: shinystan
+          href: https://mc-stan.org/shinystan
 
 toc:
   depth: 4
@@ -120,4 +103,3 @@ reference:
   contents:
   - register_knitr_engine
   - eng_cmdstan
-
diff --git a/docs/404.html b/docs/404.html
deleted file mode 100644
index 692fbba97..000000000
--- a/docs/404.html
+++ /dev/null
@@ -1,167 +0,0 @@
-This outlines how to propose a change to cmdstanr and is based on similar instructions for tidyverse packages, including the contributing guidelines generated by usethis::use_tidy_contributing().
You can fix typos, spelling mistakes, or grammatical errors in the documentation directly using the GitHub web interface, as long as the changes are made in the source file. This generally means you’ll need to edit roxygen2 comments in an .R, not a .Rd file. You can find the .R file that generates the .Rd by reading the comment in the first line.
If you want to make a bigger change, it’s a good idea to first file an issue and make sure someone from the team agrees that it’s needed. If you’ve found a bug, please file an issue that illustrates the bug with a minimal reproducible example (see e.g. the tidyverse reprex instructions). The tidyverse guide on how to create a great issue has more advice.
-If you are new to creating pull requests, here are some tips. Using the functions from the usethis package is not required but can be helpful if this process is new to you.
Fork the package and clone onto your computer. If you haven’t done this before, we recommend using usethis::create_from_github("stan-dev/cmdstanr", fork = TRUE).
Install all development dependencies with devtools::install_dev_deps(), and then make sure the package passes R CMD check by running devtools::check(). If R CMD check doesn’t pass cleanly, it’s a good idea to ask for help before continuing.
Create a Git branch for your pull request (PR). We recommend using usethis::pr_init("brief-description-of-change").
Make your changes, commit them to Git, and then create a PR by running usethis::pr_push() and following the prompts in your browser. The title of your PR should briefly describe the change. The body of your PR should contain Fixes #issue-number.
For user-facing changes, add a bullet to the top of NEWS.md (i.e. just below the first header). Follow the style already used in NEWS.md.
New code should attempt to follow the style used in the package. When in doubt, follow the tidyverse style guide.
We use roxygen2, with Markdown syntax, for documentation.
We use testthat for unit tests. Contributions with test cases included are easier to accept.
Please note that the cmdstanr project follows the Stan project’s Code of Conduct. By contributing to this project you agree to abide by its terms.
-YEAR: 2019
-COPYRIGHT HOLDER: Stan Developers and their Assignees
Copyright (c) 2019, Stan Developers and their Assignees All rights reserved.
-Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
-Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-vignettes/articles-online-only/opencl.Rmd
-This vignette demonstrates how to use the OpenCL capabilities of
-CmdStan with CmdStanR. The functionality described in this vignette
-requires CmdStan 2.26.1 or newer.
-As of version 2.26.1, users can expect speedups with OpenCL when
-using vectorized probability distribution functions (functions with the
-_lpdf or _lpmf suffix) and when the input
-variables contain at least 20,000 elements.
The actual speedup for a model will depend on the particular
-lpdf/lpmf functions used and whether the
-lpdf/lpmf functions are the bottlenecks of the model. The
-more computationally complex the function is, the larger the expected
-speedup. The biggest speedups are expected when using the specialized
-GLM functions.
To identify the bottlenecks in your model, we recommend
-using profiling, which was introduced in Stan version 2.26.0.
-OpenCL is supported on most modern CPUs and GPUs. In order to use
-OpenCL in CmdStanR, an OpenCL runtime for the target device must be
-installed. A guide for the most common devices is available in the
-CmdStan manual’s chapter on parallelization.
-On Windows, CmdStan requires the
-OpenCL.lib file to compile the model. If you experience issues
-compiling the model with OpenCL, run the script below and set
-path_to_opencl_lib to the directory that contains the
-OpenCL.lib file on your system. If you are using CUDA, the
-path should be similar to the one listed here.
-path_to_opencl_lib <- "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.5/lib/x64"
-cpp_options = list(
- paste0("LDFLAGS+= -L\"",path_to_opencl_lib,"\" -lOpenCL")
-)
-
-cmdstanr::cmdstan_make_local(cpp_options = cpp_options)
-cmdstanr::rebuild_cmdstan()
-By default, models in CmdStanR are compiled without OpenCL
-support. Once OpenCL support is enabled, a CmdStan model will make use
-of OpenCL if the functions in the model support it. Technically no
-changes to a model are required to support OpenCL since the choice of
-using OpenCL is handled by the compiler, but it can still be useful to
-rewrite a model to be more OpenCL-friendly by using vectorization as
-much as possible when using probability distributions.
-Consider a simple logistic regression with parameters
-alpha and beta, covariates X, and
-outcome y.
data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- array[n] int y;
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- target += std_normal_lpdf(beta);
- target += std_normal_lpdf(alpha);
- target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
-}
-Some fake data will be useful to run this model:
-
-library(cmdstanr)
-
-# Generate some fake data
-n <- 250000
-k <- 20
-X <- matrix(rnorm(n * k), ncol = k)
-y <- rbinom(n, size = 1, prob = plogis(3 * X[,1] - 2 * X[,2] + 1))
-mdata <- list(k = k, n = n, y = y, X = X)
-In this model, most of the computation will be handled by the
-bernoulli_logit_glm_lpmf function. Because this is a
-supported GPU function, it should be possible to accelerate it with
-OpenCL. Check here
-for a list of functions with OpenCL support.
To build the model with OpenCL support, add
-cpp_options = list(stan_opencl = TRUE) at the compilation
-step.
-# Compile the model with STAN_OPENCL=TRUE
-mod_cl <- cmdstan_model("opencl-files/bernoulli_logit_glm.stan",
-                        cpp_options = list(stan_opencl = TRUE))
-Running models with OpenCL requires specifying the OpenCL platform
-and device on which to run the model (there can be multiple). If the
-system has one GPU and no OpenCL CPU runtime, the platform and device
-IDs of the GPU are typically both 0, but the
-clinfo tool can be used to figure out for sure which
-devices are available.
On an Ubuntu system with both CPU and GPU OpenCL support,
-clinfo -l outputs:
Platform #0: AMD Accelerated Parallel Processing
- `-- Device #0: gfx906+sram-ecc
-Platform #1: Intel(R) CPU Runtime for OpenCL(TM) Applications
- `-- Device #0: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
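The listing above can be turned into platform and device ID pairs with a few lines of base R. This is only a sketch: the helper name `parse_clinfo_list` is hypothetical (it is not part of cmdstanr), and it assumes the `clinfo -l` layout shown above, which may differ between clinfo versions.

```r
# Hypothetical helper: parse the text printed by `clinfo -l` into a data
# frame with one row per device. Assumes the "Platform #i" / "Device #j"
# layout shown above.
parse_clinfo_list <- function(lines) {
  platform <- NA_integer_
  rows <- list()
  for (line in lines) {
    if (grepl("^Platform #[0-9]+:", line)) {
      # Remember the platform that the following device lines belong to
      platform <- as.integer(sub("^Platform #([0-9]+):.*$", "\\1", line))
    } else if (grepl("Device #[0-9]+:", line)) {
      device <- as.integer(sub("^.*Device #([0-9]+):.*$", "\\1", line))
      name <- trimws(sub("^.*Device #[0-9]+:", "", line))
      rows[[length(rows) + 1]] <- data.frame(
        platform_id = platform,
        device_id = device,
        name = name,
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}

# The example output from this vignette, one element per printed line
clinfo_output <- c(
  "Platform #0: AMD Accelerated Parallel Processing",
  " `-- Device #0: gfx906+sram-ecc",
  "Platform #1: Intel(R) CPU Runtime for OpenCL(TM) Applications",
  " `-- Device #0: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz"
)

devices <- parse_clinfo_list(clinfo_output)

# Pick the GPU row and build the length-2 vector (platform ID, device ID)
gpu <- devices[devices$platform_id == 0 & devices$device_id == 0, ]
opencl_ids <- c(gpu$platform_id, gpu$device_id)
```

On a live system the input could come from `system("clinfo -l", intern = TRUE)`, assuming the `clinfo` tool is installed and on the path.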
-On this system the GPU is platform ID 0 and device ID 0, while the
-CPU is platform ID 1, device ID 0. These can be specified with the
-opencl_ids argument when running a model. The
-opencl_ids argument is supplied as a vector of length 2, where the
-first element is the platform ID and the second element is the device
-ID.
-fit_cl <- mod_cl$sample(data = mdata, chains = 4, parallel_chains = 4,
-                        opencl_ids = c(0, 0), refresh = 0)
-We’ll also run a version without OpenCL and compare the run
-times.
-
-# no OpenCL version
-mod <- cmdstan_model("opencl-files/bernoulli_logit_glm.stan", force_recompile = TRUE)
-fit_cpu <- mod$sample(data = mdata, chains = 4, parallel_chains = 4, refresh = 0)
-The speedup of the OpenCL model is:
-
-fit_cpu$time()$total / fit_cl$time()$total
-This speedup will be determined by the particular GPU/CPU used, the
-input problem sizes (data as well as parameters), and whether the model uses
-functions that can be run on the GPU or other OpenCL devices.
-vignettes/children/settings-knitr.Rmd
-vignettes/cmdstanr-internals.Rmd
-This vignette is intended to be read after the Getting
-started with CmdStanR vignette. Please read that first for
-important background. In this document we provide additional details
-about compiling models, passing in data, and how CmdStan output is saved
-and read back into R.
-We will only use the $sample() method in examples, but
-all model fitting methods work in a similar way under the hood.
-library(cmdstanr)
-check_cmdstan_toolchain(fix = TRUE, quiet = TRUE)
-The cmdstan_model() function creates a new
-CmdStanModel object. The CmdStanModel object
-stores the path to a Stan program as well as the path to a compiled
-executable.
-stan_file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")
-mod <- cmdstan_model(stan_file)
-mod$print()data {
- int<lower=0> N;
- array[N] int<lower=0, upper=1> y;
-}
-parameters {
- real<lower=0, upper=1> theta;
-}
-model {
- theta ~ beta(1, 1); // uniform prior on interval 0,1
- y ~ bernoulli(theta);
-}
-
-mod$stan_file()[1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0/examples/bernoulli/bernoulli.stan"
-
-mod$exe_file()[1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0/examples/bernoulli/bernoulli"
-Subsequently, if you create a CmdStanModel object from
-the same Stan file then compilation will be skipped (assuming the file
-hasn’t changed).
-mod <- cmdstan_model(stan_file)
-Internally, cmdstan_model() first creates the
-CmdStanModel object from just the Stan file and then calls
-its $compile()
-method. Optional arguments to the $compile() method can be
-passed via ....
-mod <- cmdstan_model(
- stan_file,
- force_recompile = TRUE,
- include_paths = "paths/to/directories/with/included/files",
- cpp_options = list(stan_threads = TRUE, STANC2 = TRUE)
-)
-It is also possible to delay compilation when creating the
-CmdStanModel object by specifying
-compile=FALSE and then later calling the
-$compile() method directly.
-unlink(mod$exe_file())
-mod <- cmdstan_model(stan_file, compile = FALSE)
-mod$exe_file() # not yet createdcharacter(0)
-
-mod$compile()
-mod$exe_file()[1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0/examples/bernoulli/bernoulli"
-If you are using CmdStan version 2.24 or later and CmdStanR version -0.2.1 or later, you can run a pedantic check for your model. CmdStanR -will always check that your Stan program does not contain any invalid -syntax but with pedantic mode enabled the check will also warn you about -other potential issues in your model, for example:
-~ symbols).For the latest information on the checks performed in pedantic mode -see the Pedantic -mode chapter in the Stan Reference Manual.
-Pedantic mode is available when compiling the model or when using the
-separate $check_syntax() method of a
-CmdStanModel object. Internally this corresponds to setting
-the stanc (Stan transpiler) option
-warn-pedantic. Here we demonstrate pedantic mode with a
-Stan program that is syntactically correct but is missing a lower bound
-and a prior for a parameter.
-stan_file_pedantic <- write_stan_file("
-data {
- int N;
- array[N] int y;
-}
-parameters {
- // should have <lower=0> but omitting to demonstrate pedantic mode
- real lambda;
-}
-model {
- y ~ poisson(lambda);
-}
-")To turn on pedantic mode at compile time you can set
-pedantic=TRUE in the call to cmdstan_model()
-(or when calling the $compile() method directly if using
-the delayed compilation approach described above).
mod_pedantic <- cmdstan_model(stan_file_pedantic, pedantic = TRUE)
-Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b6650db73c.stan', line 11, column 14: A
- poisson distribution is given parameter lambda as a rate parameter
- (argument 1), but lambda was not constrained to be strictly positive.
-Warning: The parameter lambda has no priors. This means either no prior is
- provided, or the prior(s) depend on data variables. In the later case,
- this may be a false positive.To turn on pedantic mode separately from compilation use the
-pedantic argument to the $check_syntax()
-method.
mod_pedantic$check_syntax(pedantic = TRUE)
-Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model_febb1e69c7387a0e64cf13583e078104.stan', line 11, column 14: A
- poisson distribution is given parameter lambda as a rate parameter
- (argument 1), but lambda was not constrained to be strictly positive.
-Warning: The parameter lambda has no priors. This means either no prior is
- provided, or the prior(s) depend on data variables. In the later case,
- this may be a false positive.
-Stan program is syntactically correct
-Using pedantic=TRUE via the $check_syntax()
-method also has the advantage that it can be used even if the model
-hasn’t been compiled yet. This can be helpful because the pedantic and
-syntax checks themselves are much faster than compilation.
file.remove(mod_pedantic$exe_file()) # delete compiled executable
-[1] TRUE
-rm(mod_pedantic)
-
-mod_pedantic <- cmdstan_model(stan_file_pedantic, compile = FALSE)
-mod_pedantic$check_syntax(pedantic = TRUE)
-Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model_febb1e69c7387a0e64cf13583e078104.stan', line 11, column 14: A
- poisson distribution is given parameter lambda as a rate parameter
- (argument 1), but lambda was not constrained to be strictly positive.
-Warning: The parameter lambda has no priors. This means either no prior is
- provided, or the prior(s) depend on data variables. In the later case,
- this may be a false positive.
-Stan program is syntactically correct
-If using CmdStan 2.27 or newer, you can obtain the names, types and
-dimensions of the data, parameters, transformed parameters and generated
-quantities variables of a Stan model using the $variables()
-method of the CmdStanModel object.
-stan_file_variables <- write_stan_file("
-data {
- int<lower=1> J;
- vector<lower=0>[J] sigma;
- vector[J] y;
-}
-parameters {
- real mu;
- real<lower=0> tau;
- vector[J] theta_raw;
-}
-transformed parameters {
- vector[J] theta = mu + tau * theta_raw;
-}
-model {
- target += normal_lpdf(tau | 0, 10);
- target += normal_lpdf(mu | 0, 10);
- target += normal_lpdf(theta_raw | 0, 1);
- target += normal_lpdf(y | theta, sigma);
-}
-")
-mod_v <- cmdstan_model(stan_file_variables)
-variables <- mod_v$variables()
-The $variables() method returns a list with
-data, parameters,
-transformed_parameters and
-generated_quantities elements, each corresponding to
-variables in their respective block of the program. Transformed data
-variables are not listed as they are not used in the model’s input or
-output.
-names(variables)[1] "parameters" "included_files" "data"
-[4] "transformed_parameters" "generated_quantities"
-
-names(variables$data)[1] "J" "sigma" "y"
-
-names(variables$parameters)[1] "mu" "tau" "theta_raw"
-
-names(variables$transformed_parameters)[1] "theta"
-
-names(variables$generated_quantities)character(0)
-Each variable is represented as a list containing the type
-information (currently limited to real or int)
-and the number of dimensions.
-variables$data$J$type
-[1] "int"
-
-$dimensions
-[1] 0
-
-variables$data$sigma$type
-[1] "real"
-
-$dimensions
-[1] 1
-
-variables$parameters$tau$type
-[1] "real"
-
-$dimensions
-[1] 0
-
-variables$transformed_parameters$theta$type
-[1] "real"
-
-$dimensions
-[1] 1
-By default, the executable is created in the same directory as the
-file containing the Stan program. You can also specify a different
-location with the dir argument.
-mod <- cmdstan_model(stan_file, dir = "path/to/directory/for/executable")
-There are three data formats that CmdStanR allows when fitting a
-model: a named list of R objects, a JSON file, and an R dump file.
-Like the RStan interface, CmdStanR accepts a named list of R objects
-where the names correspond to variables declared in the data block of
-the Stan program. In the Bernoulli model the data is N, the
-number of data points, and y an integer array of
-observations.
-mod$print()data {
- int<lower=0> N;
- array[N] int<lower=0, upper=1> y;
-}
-parameters {
- real<lower=0, upper=1> theta;
-}
-model {
- theta ~ beta(1, 1); // uniform prior on interval 0,1
- y ~ bernoulli(theta);
-}
-
-# data block has 'N' and 'y'
-data_list <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-fit <- mod$sample(data = data_list)
-Because CmdStan doesn’t accept lists of R objects, CmdStanR will
-first write the data to a temporary JSON file using
-write_stan_json(). This happens internally, but it is also
-possible to call write_stan_json() directly.
-data_list <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-json_file <- tempfile(fileext = ".json")
-write_stan_json(data_list, json_file)
-cat(readLines(json_file), sep = "\n"){
- "N": 10,
- "y": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
-}
-If you already have your data in a JSON file you can just pass that
-file directly to CmdStanR instead of using a list of R objects. For
-example, we could pass in the JSON file we created above using
-write_stan_json():
-fit <- mod$sample(data = json_file)
-Finally, it is also possible to use the R dump file format. This is
-not recommended because CmdStan can process JSON faster than R
-dump, but CmdStanR allows it because CmdStan will accept files created
-by rstan::stan_rdump():
When fitting a model, the default behavior is to write the output -from CmdStan to CSV files in a temporary directory.
-
-fit$output_files()[1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-202503310850-1-5c7cee.csv"
-[2] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-202503310850-2-5c7cee.csv"
-[3] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-202503310850-3-5c7cee.csv"
-[4] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-202503310850-4-5c7cee.csv"
-These files will be lost if you end your R session or if you remove
-the fit object and force (or wait for) garbage
-collection.
-files <- fit$output_files()
-file.exists(files)[1] TRUE TRUE TRUE TRUE
-
- used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
-Ncells 1161565 62.1 2340494 125 NA 1595651 85.3
-Vcells 2051945 15.7 8388608 64 32768 4957304 37.9
-
-file.exists(files)[1] FALSE FALSE FALSE FALSE
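One way this kind of cleanup can be implemented in R is with a finalizer that deletes the temporary file when its owning object is garbage collected. The sketch below uses only base R and is an illustration of that general mechanism, not a description of how cmdstanr itself is written. The `tracker` object is a hypothetical stand-in for a fit object.

```r
# Hypothetical stand-in for a fit object that owns one temporary CSV file
tracker <- new.env()
tracker$path <- tempfile(fileext = ".csv")
writeLines("lp__,theta", tracker$path)

# Register a finalizer: when `tracker` is garbage collected, delete its file
reg.finalizer(tracker, function(e) unlink(e$path))

path <- tracker$path            # keep only the path, not the object
exists_before <- file.exists(path)

rm(tracker)                     # drop the last reference to the object
invisible(gc())                 # force garbage collection, which runs the finalizer

exists_after <- file.exists(path)
```

Here `exists_before` is `TRUE` and `exists_after` is `FALSE`, mirroring the behavior of the output files shown above.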
-To save these files to a non-temporary location there are two
-options. You can either specify the output_dir argument to
-mod$sample() or use fit$save_output_files()
-after fitting the model.
-# see ?save_output_files for info on optional arguments
-fit$save_output_files(dir = "path/to/directory")
-fit <- mod$sample(
- data = data_list,
- output_dir = "path/to/directory"
-)
-With the exception of some diagnostic information, the CSV files are
-not read into R until their contents are requested by calling a method
-that requires them (e.g., fit$draws(),
-fit$summary(), etc.). If we examine the structure of the
-fit object, notice how the Private slot
-draws_ is NULL, indicating that the CSV files
-haven’t yet been read into R.
-str(fit)Classes 'CmdStanMCMC', 'CmdStanFit', 'R6' <CmdStanMCMC>
- Inherits from: <CmdStanFit>
- Public:
- clone: function (deep = FALSE)
- cmdstan_diagnose: function ()
- cmdstan_summary: function (flags = NULL)
- code: function ()
- config_files: function (include_failed = FALSE)
- constrain_variables: function (unconstrained_variables, transformed_parameters = TRUE,
- data_file: function ()
- diagnostic_summary: function (diagnostics = c("divergences", "treedepth", "ebfmi"),
- draws: function (variables = NULL, inc_warmup = FALSE, format = getOption("cmdstanr_draws_format",
- expose_functions: function (global = FALSE, verbose = FALSE)
- functions: environment
- grad_log_prob: function (unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
- hessian: function (unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
- init: function ()
- init_model_methods: function (seed = 1, verbose = FALSE, hessian = FALSE)
- initialize: function (runset)
- inv_metric: function (matrix = TRUE)
- latent_dynamics_files: function (include_failed = FALSE)
- log_prob: function (unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
- loo: function (variables = "log_lik", r_eff = TRUE, moment_match = FALSE,
- lp: function ()
- metadata: function ()
- metric_files: function (include_failed = FALSE)
- num_chains: function ()
- num_procs: function ()
- output: function (id = NULL)
- output_files: function (include_failed = FALSE)
- print: function (variables = NULL, ..., digits = 2, max_rows = getOption("cmdstanr_max_rows",
- profile_files: function (include_failed = FALSE)
- profiles: function ()
- return_codes: function ()
- runset: CmdStanRun, R6
- sampler_diagnostics: function (inc_warmup = FALSE, format = getOption("cmdstanr_draws_format",
- save_config_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_data_file: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_latent_dynamics_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_metric_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_object: function (file, ...)
- save_output_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_profile_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- summary: function (variables = NULL, ...)
- time: function ()
- unconstrain_draws: function (files = NULL, draws = NULL, format = getOption("cmdstanr_draws_format",
- unconstrain_variables: function (variables)
- variable_skeleton: function (transformed_parameters = TRUE, generated_quantities = TRUE)
- Private:
- draws_: NULL
- init_: NULL
- inv_metric_: list
- metadata_: list
- model_methods_env_: environment
- profiles_: NULL
- read_csv_: function (variables = NULL, sampler_diagnostics = NULL, format = getOption("cmdstanr_draws_format",
- return_codes_: 0 0 0 0
- sampler_diagnostics_: 2 1 1 2 2 1 2 2 1 2 1 1 1 2 2 1 2 1 1 1 1 2 2 2 1 2 1 1 ...
- warmup_draws_: NULL
- warmup_sampler_diagnostics_: NULL
-After we call a method that requires the draws, reexamining
-the structure of the object shows that the draws_
-slot in Private is no longer empty.
-draws <- fit$draws() # force CSVs to be read into R
-str(fit)Classes 'CmdStanMCMC', 'CmdStanFit', 'R6' <CmdStanMCMC>
- Inherits from: <CmdStanFit>
- Public:
- clone: function (deep = FALSE)
- cmdstan_diagnose: function ()
- cmdstan_summary: function (flags = NULL)
- code: function ()
- config_files: function (include_failed = FALSE)
- constrain_variables: function (unconstrained_variables, transformed_parameters = TRUE,
- data_file: function ()
- diagnostic_summary: function (diagnostics = c("divergences", "treedepth", "ebfmi"),
- draws: function (variables = NULL, inc_warmup = FALSE, format = getOption("cmdstanr_draws_format",
- expose_functions: function (global = FALSE, verbose = FALSE)
- functions: environment
- grad_log_prob: function (unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
- hessian: function (unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
- init: function ()
- init_model_methods: function (seed = 1, verbose = FALSE, hessian = FALSE)
- initialize: function (runset)
- inv_metric: function (matrix = TRUE)
- latent_dynamics_files: function (include_failed = FALSE)
- log_prob: function (unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
- loo: function (variables = "log_lik", r_eff = TRUE, moment_match = FALSE,
- lp: function ()
- metadata: function ()
- metric_files: function (include_failed = FALSE)
- num_chains: function ()
- num_procs: function ()
- output: function (id = NULL)
- output_files: function (include_failed = FALSE)
- print: function (variables = NULL, ..., digits = 2, max_rows = getOption("cmdstanr_max_rows",
- profile_files: function (include_failed = FALSE)
- profiles: function ()
- return_codes: function ()
- runset: CmdStanRun, R6
- sampler_diagnostics: function (inc_warmup = FALSE, format = getOption("cmdstanr_draws_format",
- save_config_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_data_file: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_latent_dynamics_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_metric_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_object: function (file, ...)
- save_output_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- save_profile_files: function (dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
- summary: function (variables = NULL, ...)
- time: function ()
- unconstrain_draws: function (files = NULL, draws = NULL, format = getOption("cmdstanr_draws_format",
- unconstrain_variables: function (variables)
- variable_skeleton: function (transformed_parameters = TRUE, generated_quantities = TRUE)
- Private:
- draws_: -7.45006 -7.1701 -7.11437 -6.74834 -7.04353 -6.79703 -7. ...
- init_: NULL
- inv_metric_: list
- metadata_: list
- model_methods_env_: environment
- profiles_: NULL
- read_csv_: function (variables = NULL, sampler_diagnostics = NULL, format = getOption("cmdstanr_draws_format",
- return_codes_: 0 0 0 0
- sampler_diagnostics_: 2 1 1 2 2 1 2 2 1 2 1 1 1 2 2 1 2 1 1 1 1 2 2 2 1 2 1 1 ...
- warmup_draws_: NULL
- warmup_sampler_diagnostics_: NULL
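The read-on-first-request pattern visible in the `draws_` slot can be sketched in base R with a closure that caches its result. This is a simplified illustration under stated assumptions: `make_lazy_reader` is a hypothetical name, the CSV holds toy values, and the real CmdStanR objects are R6 classes with more involved logic.

```r
# Hypothetical lazy reader: the CSV is parsed on the first request only,
# and later requests are served from the cached copy.
make_lazy_reader <- function(csv_file) {
  draws_ <- NULL   # plays the role of the private draws_ slot
  n_reads <- 0     # counts how often the file is actually read
  list(
    draws = function() {
      if (is.null(draws_)) {
        draws_ <<- read.csv(csv_file)
        n_reads <<- n_reads + 1
      }
      draws_
    },
    is_loaded = function() !is.null(draws_),
    n_reads = function() n_reads
  )
}

# Toy CSV standing in for a CmdStan output file
csv_file <- tempfile(fileext = ".csv")
write.csv(
  data.frame(lp__ = c(-7.45, -7.17), theta = c(0.2, 0.3)),
  csv_file,
  row.names = FALSE
)

reader <- make_lazy_reader(csv_file)
loaded_before <- reader$is_loaded()  # nothing has been read yet
first <- reader$draws()              # triggers the read
second <- reader$draws()             # served from the cache
```

After both calls, `reader$n_reads()` is 1, which is the point of the design: the cost of parsing the file is paid once, and only if the draws are actually requested.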
-For models with many parameters, transformed parameters, or generated
-quantities, if only some are requested (e.g., by specifying the
-variables argument to fit$draws()) then
-CmdStanR will only read in the requested variables (unless they have
-already been read in).
Internally, the read_cmdstan_csv() function is used to
-read the CmdStan CSV files into R. This function is exposed to users, so
-you can also call it directly.
-# see ?read_cmdstan_csv for info on optional arguments controlling
-# what information is read in
-csv_contents <- read_cmdstan_csv(fit$output_files())
-str(csv_contents)List of 8
- $ metadata :List of 42
- ..$ stan_version_major : num 2
- ..$ stan_version_minor : num 36
- ..$ stan_version_patch : num 0
- ..$ start_datetime : chr "2025-03-31 14:50:13 UTC"
- ..$ method : chr "sample"
- ..$ save_warmup : int 0
- ..$ thin : num 1
- ..$ gamma : num 0.05
- ..$ kappa : num 0.75
- ..$ t0 : num 10
- ..$ init_buffer : num 75
- ..$ term_buffer : num 50
- ..$ window : num 25
- ..$ save_metric : int 0
- ..$ algorithm : chr "hmc"
- ..$ engine : chr "nuts"
- ..$ metric : chr "diag_e"
- ..$ stepsize_jitter : num 0
- ..$ num_chains : num 1
- ..$ id : num [1:4] 1 2 3 4
- ..$ init : num [1:4] 2 2 2 2
- ..$ seed : num 31749990
- ..$ refresh : num 100
- ..$ sig_figs : num -1
- ..$ profile_file : chr "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-profile-202503310850-1-2d08be.csv"
- ..$ save_cmdstan_config : int 0
- ..$ stanc_version : chr "stanc3 v2.36.0"
- ..$ sampler_diagnostics : chr [1:6] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__" ...
- ..$ variables : chr [1:2] "lp__" "theta"
- ..$ step_size_adaptation: num [1:4] 0.973 0.888 0.942 1.142
- ..$ model_name : chr "bernoulli_model"
- ..$ adapt_engaged : int 1
- ..$ adapt_delta : num 0.8
- ..$ max_treedepth : num 10
- ..$ step_size : num [1:4] 1 1 1 1
- ..$ iter_warmup : num 1000
- ..$ iter_sampling : num 1000
- ..$ threads_per_chain : num 1
- ..$ time :'data.frame': 4 obs. of 4 variables:
- .. ..$ chain_id: num [1:4] 1 2 3 4
- .. ..$ warmup : num [1:4] 0.004 0.004 0.004 0.005
- .. ..$ sampling: num [1:4] 0.014 0.014 0.012 0.013
- .. ..$ total : num [1:4] 0.018 0.018 0.016 0.018
- ..$ stan_variable_sizes :List of 2
- .. ..$ lp__ : num 1
- .. ..$ theta: num 1
- ..$ stan_variables : chr [1:2] "lp__" "theta"
- ..$ model_params : chr [1:2] "lp__" "theta"
- $ time :List of 2
- ..$ total : int NA
- ..$ chains:'data.frame': 4 obs. of 4 variables:
- .. ..$ chain_id: num [1:4] 1 2 3 4
- .. ..$ warmup : num [1:4] 0.004 0.004 0.004 0.005
- .. ..$ sampling: num [1:4] 0.014 0.014 0.012 0.013
- .. ..$ total : num [1:4] 0.018 0.018 0.016 0.018
- $ inv_metric :List of 4
- ..$ 1: num 0.489
- ..$ 2: num 0.45
- ..$ 3: num 0.577
- ..$ 4: num 0.541
- $ step_size :List of 4
- ..$ 1: num 0.973
- ..$ 2: num 0.888
- ..$ 3: num 0.942
- ..$ 4: num 1.14
- $ warmup_draws : NULL
- $ post_warmup_draws : 'draws_array' num [1:1000, 1:4, 1:2] -7.45 -7.17 -7.11 -6.75 -7.04 ...
- ..- attr(*, "dimnames")=List of 3
- .. ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
- .. ..$ chain : chr [1:4] "1" "2" "3" "4"
- .. ..$ variable : chr [1:2] "lp__" "theta"
- $ warmup_sampler_diagnostics : NULL
- $ post_warmup_sampler_diagnostics: 'draws_array' num [1:1000, 1:4, 1:6] 0.9 1 1 0.911 0.801 ...
- ..- attr(*, "dimnames")=List of 3
- .. ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
- .. ..$ chain : chr [1:4] "1" "2" "3" "4"
- .. ..$ variable : chr [1:6] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__" ...
-If you need to manually create fitted model objects from CmdStan CSV
-files use as_cmdstan_fit().
-fit2 <- as_cmdstan_fit(fit$output_files())
-This is pointless in our case since we have the original
-fit object, but this function can be used to create fitted
-model objects (CmdStanMCMC, CmdStanMLE, etc.)
-from any CmdStan CSV files.
If save_latent_dynamics is set to TRUE when
-running the $sample() method then additional CSV files are
-created (one per chain) that provide access to quantities used under the
-hood by Stan’s implementation of dynamic Hamiltonian Monte Carlo.
CmdStanR does not yet provide a special method for processing these files but they can be read into R using R’s standard CSV reading functions.
-
-fit <- mod$sample(data = data_list, save_latent_dynamics = TRUE)
-fit$latent_dynamics_files()
-[1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-diagnostic-202503310850-1-060b43.csv"
-[2] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-diagnostic-202503310850-2-060b43.csv"
-[3] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-diagnostic-202503310850-3-060b43.csv"
-[4] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-diagnostic-202503310850-4-060b43.csv"
-
-# read one of the files in
-x <- utils::read.csv(fit$latent_dynamics_files()[1], comment.char = "#")
-head(x)
- lp__ accept_stat__ stepsize__ treedepth__ n_leapfrog__ divergent__
-1 -7.00899 0.935119 1.06923 1 3 0
-2 -7.29900 0.886772 1.06923 1 1 0
-3 -6.82892 1.000000 1.06923 1 1 0
-4 -6.82025 1.000000 1.06923 1 1 0
-5 -6.75311 0.993504 1.06923 2 3 0
-6 -6.89721 0.960772 1.06923 1 3 0
- energy__ theta p_theta g_theta
-1 7.05933 -0.633893 0.440038 1.159540
-2 7.38011 -0.432323 -0.558539 1.722860
-3 7.12220 -0.836018 -1.062060 0.628489
-4 6.85073 -0.850225 0.342380 0.592629
-5 6.85434 -1.166260 0.623978 -0.149621
-6 6.92007 -0.744453 0.296526 0.864374
-The column lp__ is also provided via
-fit$draws(), and the columns accept_stat__,
-stepsize__, treedepth__,
-n_leapfrog__, divergent__, and
-energy__ are also provided by
-fit$sampler_diagnostics(), but there are several columns
-unique to the latent dynamics file.
theta p_theta g_theta
-1 -0.633893 0.440038 1.159540
-2 -0.432323 -0.558539 1.722860
-3 -0.836018 -1.062060 0.628489
-4 -0.850225 0.342380 0.592629
-5 -1.166260 0.623978 -0.149621
-6 -0.744453 0.296526 0.864374
-Our model has a single parameter theta and the three
-columns above correspond to theta in the
-unconstrained space (theta on the constrained
-space is accessed via fit$draws()), the auxiliary momentum
-p_theta, and the gradient g_theta. In general,
-each of these three columns will exist for every parameter in
-the model.
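As a minimal sketch of how these columns relate, the snippet below writes a tiny CmdStan-style CSV to a temporary file, reads it back with `utils::read.csv()`, and maps the unconstrained `theta` to the constrained scale. It assumes the `(0, 1)` bounds on `theta` are handled by a logit transform, so `plogis()` recovers the constrained value. The numbers are copied from the first rows shown above; the `#` comment line is illustrative and stands in for CmdStan's real header.

```r
# Write a small CmdStan-style CSV: comment lines start with "#".
csv_lines <- c(
  "# illustrative comment line, skipped by read.csv",
  "lp__,theta,p_theta,g_theta",
  "-7.00899,-0.633893,0.440038,1.159540",
  "-7.29900,-0.432323,-0.558539,1.722860",
  "-6.82892,-0.836018,-1.062060,0.628489"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# comment.char = "#" makes read.csv skip the comment line.
x <- utils::read.csv(tmp, comment.char = "#")

# theta is on the unconstrained (logit) scale; map it back to (0, 1).
x$theta_constrained <- stats::plogis(x$theta)
print(x[, c("theta", "theta_constrained")])
```

The constrained values computed this way should correspond to what `fit$draws()` reports for `theta`.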
CmdStanR can of course be used for developing other packages that require compiling and running Stan models as well as using new or custom Stan features available through CmdStan.
-You may compile a Stan model at runtime (e.g. just before sampling), or you may compile all the models inside the package file system in advance at installation time. The latter avoids compilations at runtime, which matters in centrally managed R installations where users should not compile their own software.
-To pre-compile all the models in a package, you may create top-level
-scripts configure and configure.win which run
-cmdstan_model() with compile = TRUE and save
-the compiled executables somewhere inside the inst/ folder
-of the package source. The instantiate
-package helps developers configure packages this way, and it documents
-other topics such as submitting to CRAN and administering CmdStan. Kevin
-Ushey’s configure
-package helps create and manage package configuration files in
-general.
When developing or testing new features it might be useful to have
-more information on how CmdStan is called internally and to see more
-information printed when compiling or running models. This can be
-enabled for an entire R session by setting the option
-"cmdstanr_verbose" to TRUE.
-options("cmdstanr_verbose"=TRUE)
-
-mod <- cmdstan_model(stan_file, force_recompile = TRUE)
-Running make \
- /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19 \
- "STANCFLAGS += --name='bernoulli_model'"
-
---- Translating Stan model to C++ code ---
-bin/stanc --name='bernoulli_model' --o=/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.hpp /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.stan
-
---- Compiling C++ code ---
-clang++ -O3 -march=native -mtune=native -Wno-deprecated-declarations -std=c++17 -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes -I stan/lib/stan_math/lib/tbb_2020.3/include -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.84.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials -DBOOST_DISABLE_ASSERTS -c -include-pch stan/src/stan/model/model_header.hpp.gch/model_header_16_0.hpp.gch -x c++ -o /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.o /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.hpp
-
---- Linking model ---
-clang++ -O3 -march=native -mtune=native -Wno-deprecated-declarations -std=c++17 -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes -I stan/lib/stan_math/lib/tbb_2020.3/include -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.84.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials -DBOOST_DISABLE_ASSERTS -Wl,-L,"/Users/jgabry/.cmdstan/cmdstan-2.36.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/jgabry/.cmdstan/cmdstan-2.36.0/stan/lib/stan_math/lib/tbb" /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.o src/cmdstan/main.o -ltbb stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_idas.a stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_kinsol.a stan/lib/stan_math/lib/tbb/libtbb.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc_proxy.dylib -o /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19
-rm /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.o /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/model-d1b65860dc19.hpp
-
-fit <- mod$sample(
- data = data_list,
- chains = 1,
- iter_warmup = 100,
- iter_sampling = 100
-)
-Running MCMC with 1 chain...
-
-Running ./bernoulli 'id=1' random 'seed=1376020223' data \
- 'file=/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/standata-d1b61e77387.json' \
- output \
- 'file=/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-202503310850-1-1ead10.csv' \
- 'profile_file=/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-profile-202503310850-1-3e64eb.csv' \
- 'method=sample' 'num_samples=100' 'num_warmup=100' 'save_warmup=0' \
- 'algorithm=hmc' 'engine=nuts' adapt 'engaged=1'
-Chain 1 method = sample (Default)
-Chain 1 sample
-Chain 1 num_samples = 100
-Chain 1 num_warmup = 100
-Chain 1 save_warmup = false (Default)
-Chain 1 thin = 1 (Default)
-Chain 1 adapt
-Chain 1 engaged = true (Default)
-Chain 1 gamma = 0.05 (Default)
-Chain 1 delta = 0.8 (Default)
-Chain 1 kappa = 0.75 (Default)
-Chain 1 t0 = 10 (Default)
-Chain 1 init_buffer = 75 (Default)
-Chain 1 term_buffer = 50 (Default)
-Chain 1 window = 25 (Default)
-Chain 1 save_metric = false (Default)
-Chain 1 algorithm = hmc (Default)
-Chain 1 hmc
-Chain 1 engine = nuts (Default)
-Chain 1 nuts
-Chain 1 max_depth = 10 (Default)
-Chain 1 metric = diag_e (Default)
-Chain 1 metric_file = (Default)
-Chain 1 stepsize = 1 (Default)
-Chain 1 stepsize_jitter = 0 (Default)
-Chain 1 num_chains = 1 (Default)
-Chain 1 id = 1 (Default)
-Chain 1 data
-Chain 1 file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/standata-d1b61e77387.json
-Chain 1 init = 2 (Default)
-Chain 1 random
-Chain 1 seed = 1376020223
-Chain 1 output
-Chain 1 file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-202503310850-1-1ead10.csv
-Chain 1 diagnostic_file = (Default)
-Chain 1 refresh = 100 (Default)
-Chain 1 sig_figs = -1 (Default)
-Chain 1 profile_file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpI5U8HG/bernoulli-profile-202503310850-1-3e64eb.csv
-Chain 1 save_cmdstan_config = false (Default)
-Chain 1 num_threads = 1 (Default)
-Chain 1 Gradient evaluation took 1e-05 seconds
-Chain 1 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
-Chain 1 Adjust your expectations accordingly!
-Chain 1 WARNING: There aren't enough warmup iterations to fit the
-Chain 1 three stages of adaptation as currently configured.
-Chain 1 Reducing each adaptation stage to 15%/75%/10% of
-Chain 1 the given number of warmup iterations:
-Chain 1 init_buffer = 15
-Chain 1 adapt_window = 75
-Chain 1 term_buffer = 10
-Chain 1 Iteration: 1 / 200 [ 0%] (Warmup)
-Chain 1 Iteration: 100 / 200 [ 50%] (Warmup)
-Chain 1 Iteration: 101 / 200 [ 50%] (Sampling)
-Chain 1 Iteration: 200 / 200 [100%] (Sampling)
-Chain 1 Elapsed Time: 0 seconds (Warm-up)
-Chain 1 0.001 seconds (Sampling)
-Chain 1 0.001 seconds (Total)
-Chain 1 finished in 0.0 seconds.
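Because the option applies to the whole R session, it can be convenient to switch it on only around the code being debugged. The sketch below uses base R's `options()` return value to restore the previous setting; only the option name `cmdstanr_verbose` comes from the text above, and the rest is ordinary base R.

```r
# Turn verbose mode on and remember the previous setting.
old <- options(cmdstanr_verbose = TRUE)
verbose_now <- isTRUE(getOption("cmdstanr_verbose"))

# ... compile or run models here while verbose output is wanted ...

# Restore whatever the setting was before (unset options come back as NULL).
options(old)
```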
-vignettes/cmdstanr.Rmd
-CmdStanR (Command Stan R) is a lightweight interface to Stan for R users that provides an alternative to the traditional RStan interface. See the Comparison with RStan section later in this vignette for more details on how the two interfaces differ.
-Using CmdStanR requires installing the cmdstanr R package and also CmdStan, the command line interface to Stan. First we install the R package by running the following command in R.
-
-# we recommend running this in a fresh R session or restarting your current session
-install.packages("cmdstanr", repos = c('https://stan-dev.r-universe.dev', getOption("repos")))
-We can now load the package like any other R package. We’ll also load the bayesplot and posterior packages to use later in examples.
-
-CmdStanR requires a working installation of CmdStan, the shell interface to Stan. If you don’t have CmdStan installed then CmdStanR can install it for you, assuming you have a suitable C++ toolchain. The requirements are described in the CmdStan Guide:
-
-To double check that your toolchain is set up properly you can call
-the check_cmdstan_toolchain() function:
The C++ toolchain required for CmdStan is setup properly!
-If your toolchain is configured correctly then CmdStan can be
-installed by calling the install_cmdstan()
-function:
-install_cmdstan(cores = 2)
-Before CmdStanR can be used it needs to know where the CmdStan installation is located. When the package is loaded it tries to help automate this to avoid having to manually set the path every session:
-If the environment variable "CMDSTAN" exists at load
-time then its value will be automatically set as the default path to
-CmdStan for the R session. This is useful if your CmdStan installation
-is not located in the default directory that would have been used by
-install_cmdstan() (see #2).
If no environment variable is found when loaded but any directory
-in the form ".cmdstan/cmdstan-[version]", for example
-".cmdstan/cmdstan-2.23.0", exists in the user’s home
-directory (Sys.getenv("HOME"), not the current
-working directory) then the path to the CmdStan with the largest version
-number will be set as the path to CmdStan for the R session. This is the
-same as the default directory that install_cmdstan() uses
-to install the latest version of CmdStan, so if that’s how you installed
-CmdStan you shouldn’t need to manually set the path to CmdStan when
-loading CmdStanR.
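The lookup order described in the two items above can be sketched in base R. `find_cmdstan_path()` is a hypothetical helper written for illustration, not cmdstanr's actual implementation: it checks the `CMDSTAN` environment variable first, then picks the `cmdstan-[version]` directory with the largest version under `.cmdstan` in the home directory. Versions are compared with `numeric_version()` so that 2.36.0 ranks above 2.9.0.

```r
# Hypothetical helper, not part of cmdstanr: mimics the lookup order
# described above (CMDSTAN env var first, then ~/.cmdstan/cmdstan-[version]).
find_cmdstan_path <- function(env_value = Sys.getenv("CMDSTAN"),
                              home = Sys.getenv("HOME")) {
  if (nzchar(env_value)) {
    return(env_value)
  }
  dirs <- list.dirs(file.path(home, ".cmdstan"),
                    full.names = TRUE, recursive = FALSE)
  # Keep only plain release directories such as cmdstan-2.36.0.
  dirs <- dirs[grepl("^cmdstan-[0-9]+\\.[0-9]+\\.[0-9]+$", basename(dirs))]
  if (length(dirs) == 0) {
    return(NULL)
  }
  versions <- numeric_version(sub("^cmdstan-", "", basename(dirs)))
  # Compare as versions rather than strings.
  dirs[order(versions, decreasing = TRUE)[1]]
}

# Demo in a throwaway "home" directory with two fake installations.
fake_home <- tempfile("home")
dir.create(file.path(fake_home, ".cmdstan", "cmdstan-2.9.0"), recursive = TRUE)
dir.create(file.path(fake_home, ".cmdstan", "cmdstan-2.36.0"), recursive = TRUE)
picked <- find_cmdstan_path(env_value = "", home = fake_home)
print(basename(picked))
```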
If neither of these applies (or you want to subsequently change the
-path) you can use the set_cmdstan_path() function:
-set_cmdstan_path(PATH_TO_CMDSTAN)
-To check the path to the CmdStan installation and the CmdStan version
-number you can use cmdstan_path() and
-cmdstan_version():
-cmdstan_path()
-[1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0"
-
-cmdstan_version()
-[1] "2.36.0"
-The cmdstan_model() function creates a new CmdStanModel
-object from a file containing a Stan program. Under the hood, CmdStan is
-called to translate a Stan program to C++ and create a compiled
-executable. Here we’ll use the example Stan program that comes with the
-CmdStan installation:
-file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")
-mod <- cmdstan_model(file)
-The object mod is an R6 reference object of class CmdStanModel
-and behaves similarly to R’s reference class objects and those in object
-oriented programming languages. Methods are accessed using the
-$ operator. This design choice allows for CmdStanR and CmdStanPy to provide a
-similar user experience and share many implementation details.
The Stan program can be printed using the $print()
-method:
-mod$print()
-data {
- int<lower=0> N;
- array[N] int<lower=0, upper=1> y;
-}
-parameters {
- real<lower=0, upper=1> theta;
-}
-model {
- theta ~ beta(1, 1); // uniform prior on interval 0,1
- y ~ bernoulli(theta);
-}
-The path to the compiled executable is returned by the
-$exe_file() method:
-mod$exe_file()
-[1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0/examples/bernoulli/bernoulli"
-The $sample()
-method for CmdStanModel
-objects runs Stan’s default MCMC algorithm. The data
-argument accepts a named list of R objects (like for RStan) or a path to
-a data file compatible with CmdStan (JSON or R dump).
-# names correspond to the data block in the Stan program
-data_list <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-fit <- mod$sample(
- data = data_list,
- seed = 123,
- chains = 4,
- parallel_chains = 4,
- refresh = 500 # print update every 500 iters
-)
-Running MCMC with 4 parallel chains...
-
-Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-Chain 3 Iteration: 1 / 2000 [ 0%] (Warmup)
-Chain 3 Iteration: 500 / 2000 [ 25%] (Warmup)
-Chain 3 Iteration: 1000 / 2000 [ 50%] (Warmup)
-Chain 3 Iteration: 1001 / 2000 [ 50%] (Sampling)
-Chain 3 Iteration: 1500 / 2000 [ 75%] (Sampling)
-Chain 3 Iteration: 2000 / 2000 [100%] (Sampling)
-Chain 4 Iteration: 1 / 2000 [ 0%] (Warmup)
-Chain 4 Iteration: 500 / 2000 [ 25%] (Warmup)
-Chain 4 Iteration: 1000 / 2000 [ 50%] (Warmup)
-Chain 4 Iteration: 1001 / 2000 [ 50%] (Sampling)
-Chain 4 Iteration: 1500 / 2000 [ 75%] (Sampling)
-Chain 4 Iteration: 2000 / 2000 [100%] (Sampling)
-Chain 1 finished in 0.0 seconds.
-Chain 2 finished in 0.0 seconds.
-Chain 3 finished in 0.0 seconds.
-Chain 4 finished in 0.0 seconds.
-
-All 4 chains finished successfully.
-Mean chain execution time: 0.0 seconds.
-Total execution time: 0.4 seconds.
-There are many more arguments that can be passed to the
-$sample() method. For details follow this link to its
-separate documentation page:
$sample()
The $sample() method creates R6 CmdStanMCMC objects,
-which have many associated methods. Below we will demonstrate some of
-the most important methods. For a full list, follow this link to the
-CmdStanMCMC documentation:
CmdStanMCMC
The $summary()
-method calls summarise_draws() from the
-posterior package. The first argument specifies the
-variables to summarize and any arguments after that are passed on to
-posterior::summarise_draws() to specify which summaries to
-compute, whether to use multiple cores, etc.
-fit$summary()
-fit$summary(variables = c("theta", "lp__"), "mean", "sd")
-
-# use a formula to summarize arbitrary functions, e.g. Pr(theta <= 0.5)
-fit$summary("theta", pr_lt_half = ~ mean(. <= 0.5))
-
-# summarise all variables with default and additional summary measures
-fit$summary(
- variables = NULL,
- posterior::default_summary_measures(),
- extra_quantiles = ~posterior::quantile2(., probs = c(.0275, .975))
-)
- variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-1 lp__ -7.30 -6.99 0.79 0.33 -8.927 -6.75 1 1769 1938
-2 theta 0.25 0.24 0.12 0.12 0.078 0.48 1 1228 1521
- variable mean sd
-1 theta 0.25 0.12
-2 lp__ -7.30 0.79
- variable pr_lt_half
-1 theta 0.96
- variable mean median sd mad q5 q95 q2.75 q97.5
-1 lp__ -7.30 -6.99 0.79 0.33 -8.927 -6.75 -9.429 -6.75
-2 theta 0.25 0.24 0.12 0.12 0.078 0.48 0.062 0.54
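Since `$summary()` passes its extra arguments on to `posterior::summarise_draws()`, the same formula syntax can be tried without a fitted model. The sketch below uses the example draws that ship with the posterior package in place of `fit$draws()`; the summary name `pr_pos` is made up for the illustration.

```r
library(posterior)

# Example draws shipped with posterior, standing in for fit$draws().
draws <- example_draws()

# Named summaries plus a formula: "." is the vector of draws for one variable.
s <- summarise_draws(
  draws,
  "mean",
  "sd",
  pr_pos = ~ mean(. > 0)
)
print(s)
```

Each row of `s` is one variable and each column after `variable` is one requested summary, which is the same layout `fit$summary()` returns.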
-The $draws()
-method can be used to extract the posterior draws in formats provided by
-the posterior
-package. Here we demonstrate only the draws_array and
-draws_df formats, but the posterior
-package supports other useful formats as well.
-# default is a 3-D draws_array object from the posterior package
-# iterations x chains x variables
-draws_arr <- fit$draws() # or format="array"
-str(draws_arr)
- 'draws_array' num [1:1000, 1:4, 1:2] -7.01 -7.89 -7.41 -6.75 -6.91 ...
- - attr(*, "dimnames")=List of 3
- ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
- ..$ chain : chr [1:4] "1" "2" "3" "4"
- ..$ variable : chr [1:2] "lp__" "theta"
-
-# draws x variables data frame
-draws_df <- fit$draws(format = "df")
-str(draws_df)
-draws_df [4,000 × 5] (S3: draws_df/draws/tbl_df/tbl/data.frame)
- $ lp__ : num [1:4000] -7.01 -7.89 -7.41 -6.75 -6.91 ...
- $ theta : num [1:4000] 0.168 0.461 0.409 0.249 0.185 ...
- $ .chain : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
- $ .iteration: int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
- $ .draw : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
-
-print(draws_df)
-# A draws_df: 1000 iterations, 4 chains, and 2 variables
- lp__ theta
-1 -7.0 0.17
-2 -7.9 0.46
-3 -7.4 0.41
-4 -6.7 0.25
-5 -6.9 0.18
-6 -6.9 0.33
-7 -7.2 0.15
-8 -6.8 0.29
-9 -6.8 0.24
-10 -6.8 0.24
-# ... with 3990 more draws
-# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-To convert an existing draws object to a different format use the
-posterior::as_draws_*() functions.
-# this should be identical to draws_df created via draws(format = "df")
-draws_df_2 <- as_draws_df(draws_arr)
-identical(draws_df, draws_df_2)
-[1] TRUE
-In general, converting to a different draws format in this way will
-be slower than just setting the appropriate format initially in the call
-to the $draws() method, but in most cases the speed
-difference will be minor.
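The relationship between the array and data frame formats can be checked on the example draws shipped with posterior, used here as a stand-in for `fit$draws()`. This is only a sketch of the conversion: the data frame has one row per draw, so its row count equals iterations times chains.

```r
library(posterior)

arr <- example_draws()        # draws_array: iterations x chains x variables
df  <- as_draws_df(arr)       # draws_df: one row per draw

n_iter   <- niterations(arr)
n_chains <- nchains(arr)

# One row per (iteration, chain) pair.
print(c(iterations = n_iter, chains = n_chains, rows = nrow(df)))
```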
The vignette Working with
-Posteriors has more details on posterior draws, including how to
-reproduce the structured output RStan users are accustomed to getting
-from rstan::extract().
The $sampler_diagnostics()
-method extracts the values of the sampler parameters
-(treedepth__, divergent__, etc.) in formats
-supported by the posterior package. The default is as a
-3-D array (iteration x chain x variable).
-# this is a draws_array object from the posterior package
-str(fit$sampler_diagnostics())
- 'draws_array' num [1:1000, 1:4, 1:6] 2 1 2 2 2 1 2 1 2 1 ...
- - attr(*, "dimnames")=List of 3
- ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
- ..$ chain : chr [1:4] "1" "2" "3" "4"
- ..$ variable : chr [1:6] "treedepth__" "divergent__" "energy__" "accept_stat__" ...
-
-# this is a draws_df object from the posterior package
-str(fit$sampler_diagnostics(format = "df"))
-draws_df [4,000 × 9] (S3: draws_df/draws/tbl_df/tbl/data.frame)
- $ treedepth__ : num [1:4000] 2 1 2 2 2 1 2 1 2 1 ...
- $ divergent__ : num [1:4000] 0 0 0 0 0 0 0 0 0 0 ...
- $ energy__ : num [1:4000] 8.95 8.77 7.87 7.64 6.93 ...
- $ accept_stat__: num [1:4000] 0.688 0.811 1 0.966 0.976 ...
- $ stepsize__ : num [1:4000] 0.905 0.905 0.905 0.905 0.905 ...
- $ n_leapfrog__ : num [1:4000] 3 3 3 3 3 3 3 3 3 3 ...
- $ .chain : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
- $ .iteration : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
- $ .draw : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
-The $diagnostic_summary() method will display any
-sampler diagnostic warnings and return a summary of diagnostics for each
-chain.
-fit$diagnostic_summary()
-$num_divergent
-[1] 0 0 0 0
-
-$num_max_treedepth
-[1] 0 0 0 0
-
-$ebfmi
-[1] 1.11 0.76 1.19 1.08
-We see the number of divergences for each of the four chains, the number of times the maximum treedepth was hit for each chain, and the E-BFMI for each chain.
-In this case there were no warnings, so in order to demonstrate the warning messages we’ll use one of the CmdStanR example models that suffers from divergences.
-
-fit_with_warning <- cmdstanr_example("schools")
-Warning: 143 of 4000 (4.0%) transitions ended with a divergence.
-See https://mc-stan.org/misc/warnings for details.
-Warning: 1 of 4 chains had an E-BFMI less than 0.3.
-See https://mc-stan.org/misc/warnings for details.
-After fitting there is a warning about divergences. We can also
-regenerate this warning message later using
-fit$diagnostic_summary().
-diagnostics <- fit_with_warning$diagnostic_summary()
-Warning: 143 of 4000 (4.0%) transitions ended with a divergence.
-See https://mc-stan.org/misc/warnings for details.
-Warning: 1 of 4 chains had an E-BFMI less than 0.3.
-See https://mc-stan.org/misc/warnings for details.
-
-print(diagnostics)
-$num_divergent
-[1] 1 37 75 30
-
-$num_max_treedepth
-[1] 0 0 0 0
-
-$ebfmi
-[1] 0.17 0.35 0.34 0.42
-
-# number of divergences reported in warning is the sum of the per chain values
-sum(diagnostics$num_divergent)
-[1] 143
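The returned list can be post-processed like any other R list. The sketch below rebuilds the list by hand from the values printed above, so it runs without a fitted model, and flags the chain behind the E-BFMI warning. The 0.3 threshold is taken from the warning message.

```r
# Values copied from the printed diagnostic summary above.
diagnostics <- list(
  num_divergent     = c(1, 37, 75, 30),
  num_max_treedepth = c(0, 0, 0, 0),
  ebfmi             = c(0.17, 0.35, 0.34, 0.42)
)

# Total divergences across chains.
total_divergent <- sum(diagnostics$num_divergent)

# Chains whose E-BFMI falls below the 0.3 threshold used in the warning.
low_ebfmi_chains <- which(diagnostics$ebfmi < 0.3)

print(total_divergent)
print(low_ebfmi_chains)
```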
-CmdStanR also supports running Stan’s optimization algorithms and its
-algorithms for variational approximation of full Bayesian inference.
-These are run via the $optimize(), $laplace(),
-$variational(), and $pathfinder() methods,
-which are called in a similar way to the $sample() method
-demonstrated above.
We can find the (penalized) maximum likelihood estimate (MLE) using
-$optimize().
-fit_mle <- mod$optimize(data = data_list, seed = 123)
-Initial log joint probability = -16.144
- Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
- 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-Optimization terminated normally:
- Convergence detected: relative gradient magnitude is below tolerance
-Finished in 0.2 seconds.
-
-fit_mle$print() # includes lp__ (log prob calculated by Stan program)
- variable estimate
- lp__ -5.00
- theta 0.20
-
-fit_mle$mle("theta")
-theta
- 0.2
-Here’s a plot comparing the penalized MLE to the posterior
-distribution of theta.

For optimization, by default the mode is calculated without the
-Jacobian adjustment for constrained variables, which shifts the mode due
-to the change of variables. To include the Jacobian adjustment and
-obtain a maximum a posteriori (MAP) estimate set
-jacobian=TRUE. See the Maximum
-Likelihood Estimation section of the CmdStan User’s Guide for more
-details.
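For this Bernoulli example the effect of the Jacobian term can be reproduced in base R. The sketch below is not how CmdStan computes the mode: it hand-codes the log density and assumes the `(0, 1)` constraint on `theta` uses a logit transform, whose log Jacobian is `log(theta) + log(1 - theta)`. The data are the same `y` used above.

```r
y <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)

# Log density of the Stan program as a function of constrained theta.
log_dens <- function(theta) {
  sum(dbinom(y, size = 1, prob = theta, log = TRUE)) +
    dbeta(theta, 1, 1, log = TRUE)
}

# Same density with the log Jacobian of the logit transform added.
log_dens_jacobian <- function(theta) {
  log_dens(theta) + log(theta) + log1p(-theta)
}

# Maximize both over (0, 1); optimize() is a 1-D base R optimizer.
opt_no_jac <- optimize(log_dens, interval = c(0.001, 0.999), maximum = TRUE)
opt_jac    <- optimize(log_dens_jacobian, interval = c(0.001, 0.999), maximum = TRUE)

print(c(mode_no_jacobian = opt_no_jac$maximum, mode_jacobian = opt_jac$maximum))
print(c(lp_no_jacobian = opt_no_jac$objective, lp_jacobian = opt_jac$objective))
```

Analytically the two modes are 2/10 and 3/12, and the maximized log densities should be close to the `log prob` values that `$optimize()` prints with and without `jacobian = TRUE`.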
-fit_map <- mod$optimize(
- data = data_list,
- jacobian = TRUE,
- seed = 123
-)
-Initial log joint probability = -18.2733
- Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
- 5 -6.74802 0.000708195 1.43227e-05 1 1 8
-Optimization terminated normally:
- Convergence detected: relative gradient magnitude is below tolerance
-Finished in 0.1 seconds.
-The $laplace()
-method produces a sample from a normal approximation centered at the
-mode of a distribution in the unconstrained space. If the mode is a MAP
-estimate, the samples provide an estimate of the mean and standard
-deviation of the posterior distribution. If the mode is the MLE, the
-sample provides an estimate of the standard error of the likelihood.
-Whether the mode is the MAP or MLE depends on the value of the
-jacobian argument when running optimization. See the Laplace
-Sampling chapter of the CmdStan User’s Guide for more details.
Here we pass in the fit_map object from above as the
-mode argument. If mode is omitted then
-optimization will be run internally before taking draws from the normal
-approximation.
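The idea behind the method can be sketched by hand for this one-parameter model. This illustrates the technique and is not Stan's implementation: it assumes the logit transform for `theta`, uses the closed-form mode and curvature of the Jacobian-adjusted log density (exponents 3 and 9 for `theta` and `1 - theta`: 2 successes, 8 failures, a flat prior, and the Jacobian term), draws from the normal approximation on the unconstrained scale, and transforms the draws back.

```r
set.seed(123)

# Jacobian-adjusted log density in theta: a * log(theta) + b * log(1 - theta).
a <- 3
b <- 9

# Mode on the constrained and unconstrained (logit) scales.
theta_mode <- a / (a + b)
y_mode <- qlogis(theta_mode)

# Negative second derivative with respect to y = logit(theta) at the mode.
neg_hessian <- (a + b) * theta_mode * (1 - theta_mode)

# Normal approximation in the unconstrained space, then transform back.
y_draws <- rnorm(4000, mean = y_mode, sd = sqrt(1 / neg_hessian))
theta_draws <- plogis(y_draws)

print(c(mean = mean(theta_draws), sd = sd(theta_draws)))
```

The resulting mean and standard deviation should be in the same range as the `theta` summary printed by `fit_laplace$print("theta")`.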
-fit_laplace <- mod$laplace(
- mode = fit_map,
- draws = 4000,
- data = data_list,
- seed = 123,
- refresh = 1000
- )
-Calculating Hessian
-Calculating inverse of Cholesky factor
-Generating draws
-iteration: 0
-iteration: 1000
-iteration: 2000
-iteration: 3000
-Finished in 0.1 seconds.
-
-fit_laplace$print("theta")
- variable mean median sd mad q5 q95
- theta 0.27 0.25 0.12 0.12 0.10 0.51
-
-mcmc_hist(fit_laplace$draws("theta"), binwidth = 0.025)
We can run Stan’s experimental Automatic Differentiation Variational
-Inference (ADVI) using the $variational()
-method. For details on the ADVI algorithm see the CmdStan
-User’s Guide.
-fit_vb <- mod$variational(
- data = data_list,
- seed = 123,
- draws = 4000
-)
-------------------------------------------------------------
-EXPERIMENTAL ALGORITHM:
- This procedure has not been thoroughly tested and may be unstable
- or buggy. The interface is subject to change.
-------------------------------------------------------------
-Gradient evaluation took 9e-06 seconds
-1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
-Adjust your expectations accordingly!
-Begin eta adaptation.
-Iteration: 1 / 250 [ 0%] (Adaptation)
-Iteration: 50 / 250 [ 20%] (Adaptation)
-Iteration: 100 / 250 [ 40%] (Adaptation)
-Iteration: 150 / 250 [ 60%] (Adaptation)
-Iteration: 200 / 250 [ 80%] (Adaptation)
-Success! Found best value [eta = 1] earlier than expected.
-Begin stochastic gradient ascent.
- iter ELBO delta_ELBO_mean delta_ELBO_med notes
- 100 -6.164 1.000 1.000
- 200 -6.225 0.505 1.000
- 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-Drawing a sample of size 4000 from the approximate posterior...
-COMPLETED.
-Finished in 0.1 seconds.
-
-fit_vb$print("theta")
- variable mean median sd mad q5 q95
- theta 0.26 0.24 0.11 0.11 0.11 0.46
-
-mcmc_hist(fit_vb$draws("theta"), binwidth = 0.025)
Stan version 2.33 introduced a new variational method called
-Pathfinder, which is intended to be faster and more stable than ADVI.
-For details on how Pathfinder works see the section in the CmdStan
-User’s Guide. Pathfinder is run using the $pathfinder()
-method.
-fit_pf <- mod$pathfinder(
- data = data_list,
- seed = 123,
- draws = 4000
-)
-Path [1] :Initial log joint density = -18.273334
-Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
- 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-Path [2] :Initial log joint density = -19.192715
-Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
- 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-Path [3] :Initial log joint density = -6.774820
-Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
- 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-Path [4] :Initial log joint density = -7.949193
-Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
- 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-Total log probability function evaluations:4379
-Finished in 0.1 seconds.
-
-fit_pf$print("theta")
- variable mean median sd mad q5 q95
- theta 0.25 0.24 0.12 0.12 0.08 0.47
-Let’s extract the draws, make the same plot we made after running the other algorithms, and compare them all. In this simple example the distributions are quite similar, but this will not always be the case for more challenging problems.
-
-mcmc_hist(fit_pf$draws("theta"), binwidth = 0.025) +
- ggplot2::labs(subtitle = "Approximate posterior from pathfinder") +
- ggplot2::xlim(0, 1)
-mcmc_hist(fit_vb$draws("theta"), binwidth = 0.025) +
- ggplot2::labs(subtitle = "Approximate posterior from variational") +
- ggplot2::xlim(0, 1)
-mcmc_hist(fit_laplace$draws("theta"), binwidth = 0.025) +
- ggplot2::labs(subtitle = "Approximate posterior from Laplace") +
- ggplot2::xlim(0, 1)
-mcmc_hist(fit$draws("theta"), binwidth = 0.025) +
- ggplot2::labs(subtitle = "Posterior from MCMC") +
- ggplot2::xlim(0, 1)
For more details on the $optimize(),
-$laplace(), $variational(), and
-$pathfinder() methods, follow these links to their
-documentation pages.
The $save_object()
-method provided by CmdStanR is the most convenient way to save a fitted
-model object to disk and ensure that all of the contents are available
-when reading the object back into R.
-fit$save_object(file = "fit.RDS")
-
-# can be read back in using readRDS
-fit2 <- readRDS("fit.RDS")
-But if your model object is large, then $save_object()
-could take a long time. $save_object()
-reads the CmdStan results files into memory, stores them in the model
-object, and saves the object with saveRDS(). To speed up
-the process, you can emulate $save_object()
-and replace saveRDS with the much faster
-qsave() function from the qs package.
-# Load CmdStan output files into the fitted model object.
-fit$draws() # Load posterior draws into the object.
-try(fit$sampler_diagnostics(), silent = TRUE) # Load sampler diagnostics.
-try(fit$init(), silent = TRUE) # Load user-defined initial values.
-try(fit$profiles(), silent = TRUE) # Load profiling samples.
-
-# Save the object to a file.
-qs::qsave(x = fit, file = "fit.qs")
-
-# Read the object.
-fit2 <- qs::qread("fit.qs")
-Storage is even faster if you discard results you do not need to save. The following example saves only posterior draws and discards sampler diagnostics, user-specified initial values, and profiling data.
-
-# Load posterior draws into the fitted model object and omit other output.
-fit$draws()
-
-# Save the object to a file.
-qs::qsave(x = fit, file = "fit.qs")
-
-# Read the object.
-fit2 <- qs::qread("fit.qs")
-See the vignette How does CmdStanR work? for more information about the composition of CmdStanR objects.
-The RStan interface (rstan package) is an in-memory interface to Stan and relies on R packages like Rcpp and inline to call C++ code from R. On the other hand, the CmdStanR interface does not directly call any C++ code from R, instead relying on the CmdStan interface behind the scenes for compilation, running algorithms, and writing results to output files.
-Allows other developers to distribute R packages with pre-compiled Stan programs (like rstanarm) on CRAN. (Note: As of 2023, this can mostly be achieved with CmdStanR as well. See Developing using CmdStanR.)
Avoids use of R6 classes, which may result in more familiar syntax for many R users.
CRAN binaries available for Mac and Windows.
Compatible with latest versions of Stan. Keeping up with Stan
-releases is complicated for RStan, often requiring non-trivial changes
-to the rstan package and new CRAN releases of both
-rstan and StanHeaders. With CmdStanR
-the latest improvements in Stan will be available from R immediately
-after updating CmdStan using
-cmdstanr::install_cmdstan().
Running Stan via external processes results in fewer unexpected -crashes, especially in RStudio.
Less memory overhead.
More permissive license. RStan uses the GPL-3 license while the -license for CmdStanR is BSD-3, which is a bit more permissive and is the -same license used for CmdStan and the Stan C++ source code.
There are additional vignettes available that discuss other aspects -of using CmdStanR. These can be found online at the CmdStanR -website:
- -To ask a question please post on the Stan forums:
- -To report a bug, suggest a feature (including additions to these -vignettes), or to start contributing to CmdStanR development (new -contributors welcome!) please open an issue on GitHub:
- -vignettes/deprecations.Rmd
-This vignette demonstrates how to handle cases where your Stan
-program contains deprecated features resulting in deprecation warnings.
-In most cases, the Stan-to-C++ compiler can be used to automatically
-update your code to a non-deprecated feature that replaces the
-deprecated one. This vignette showcases how that automatic conversion
-can be done using CmdStanR.
-The automatic conversion of deprecated features to non-deprecated -features is done using the so-called “canonicalizer”, which is part of -the Stan-to-C++ compiler. We recommend using CmdStan 2.29.2 or later -when using the canonicalizer and this vignette. The minimum CmdStanR -version to run the code in the vignette is 0.5.0.
-
-library(cmdstanr)
-check_cmdstan_toolchain(fix = TRUE, quiet = TRUE)
-The following logistic regression model uses several deprecated
-language features, resulting in several warnings during compilation.
-
-stan_file <- write_stan_file("
-data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- int y[n];
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- # priors
- target += std_normal_log(beta);
- alpha ~ std_normal();
-
- y ~ bernoulli_logit(X * beta + alpha);
-}
-")
-mod <- cmdstan_model(stan_file)
-Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpMBUSHs/model-17e6a34f96f68.stan', line 6, column 2: Declaration
- of arrays by placing brackets after a variable name is deprecated and
- will be removed in Stan 2.33.0. Instead use the array keyword before the
- type. This can be changed automatically using the auto-format flag to
- stanc
-Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpMBUSHs/model-17e6a34f96f68.stan', line 13, column 2: Comments
- beginning with # are deprecated and this syntax will be removed in Stan
- 2.33.0. Use // to begin line comments; this can be done automatically
- using the auto-format flag to stanc
-Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpMBUSHs/model-17e6a34f96f68.stan', line 14, column 12: std_normal_log
- is deprecated and will be removed in Stan 2.33.0. Use std_normal_lpdf
- instead. This can be automatically changed using the canonicalize flag
- for stanc
-The first warning is about using the deprecated array syntax
-int y[n];
-which should be replaced with the new syntax using the
-array keyword:
array[n] int y;
-The second warning is about using the deprecated commenting symbol
-#, which should be replaced by //.
The last warning is about the use of the deprecated _log
-suffix for probability density and mass functions. In this case the
-_log suffix should be replaced with _lpdf. For
-probability mass functions the suffix _lpmf is used.
We can go and fix these issues manually or use the canonicalizer as -outlined in the next section.
-The canonicalizer is available through the canonicalize
-argument of the $format() method of the
-CmdStanModel class. The argument accepts TRUE
-and FALSE values, in which case all or none of the features
-of the canonicalizer are used. It can also accept a list of character
-vectors that determine which features of the canonicalizer to use.
The canonicalizer in CmdStan 2.29.2 supports four features:
-parentheses, braces, includes and
-deprecations. The parentheses and
-braces features clean up the use of parentheses and braces,
-while includes will replace #include
-statements with the code from the included files. See the canonicalizer
-section of the Stan User’s Guide for more details.
In this vignette we will be using the deprecations
-feature that replaces deprecated Stan model features with non-deprecated
-ones if possible.
-mod$format(canonicalize = list("deprecations"))
-data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- array[n] int y;
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- // priors
- target += std_normal_lpdf(beta);
- alpha ~ std_normal();
-
- y ~ bernoulli_logit(X * beta + alpha);
-}
-By default, the format function will print the resulting model code.
-We can see that all three issues were resolved. y is now
-defined using the new array keyword, the comment uses //
-and the std_normal_log() is replaced with
-std_normal_lpdf().
You can also use the $format() method to write the
-updated version of the model directly to the Stan model file. That can
-be enabled by setting overwrite_file = TRUE. The previous
-version of the file will automatically be backed up to a file with the
-.stan.bak suffix. If that is not desired or you are using a
-version control system and making a backup is redundant, you can disable it by
-setting backup = FALSE.
-mod$format(
- canonicalize = list("deprecations"),
- overwrite_file = TRUE,
- backup = FALSE
-)
-mod$print()
-data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- array[n] int y;
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- // priors
- target += std_normal_lpdf(beta);
- alpha ~ std_normal();
-
- y ~ bernoulli_logit(X * beta + alpha);
-}
-Installing CmdStan, fitting models, and accessing results.
- -More information about compilation, passing in data, how CmdStan output is written to CSV and read back into R, profiling Stan programs, running Stan on GPUs, and using CmdStanR in R Markdown documents.
- -We can easily customize the summary statistics reported by
-$summary() and $print().
-fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
-Warning: 145 of 4000 (4.0%) transitions ended with a divergence.
-See https://mc-stan.org/misc/warnings for details.
-
-fit$summary()
-   variable mean median sd mad q5 q95
-1 lp__ -58.189615 -58.233950 4.983507 5.199033 -66.09672000 -49.48569
-2 mu 6.770868 6.720615 4.160694 4.104430 0.05392642 13.50031
-3 tau 5.360230 4.524270 3.536662 3.128657 1.41195350 12.01772
-4 theta[1] 9.538150 8.703025 6.992732 6.139536 -0.46183645 21.76395
-5 theta[2] 7.037750 7.084940 5.672420 5.463211 -1.89471350 16.19477
-6 theta[3] 5.865357 5.940265 6.567071 5.598943 -4.94042600 15.87788
-7 theta[4] 7.096898 7.021110 5.846662 5.507303 -2.11556450 16.63653
-8 theta[5] 4.905340 5.104415 5.567945 5.102183 -4.69825650 13.42720
-9 theta[6] 5.743860 5.979900 5.965230 5.476702 -4.29255000 15.06765
-10 theta[7] 9.332145 8.785955 5.911919 5.517199 0.73314065 20.11738
-11 theta[8] 7.283320 7.100935 6.510593 5.937568 -3.18226350 18.23657
- rhat ess_bulk ess_tail
-1 1.029985 154.8234 166.9326
-2 1.004450 669.4002 1293.6704
-3 1.029747 147.4551 128.0359
-4 1.006213 1068.8761 2337.7828
-5 1.002627 1067.0310 2241.5417
-6 1.004902 1217.1050 1912.7529
-7 1.002033 1030.8586 1665.2932
-8 1.009740 637.5400 1725.4378
-9 1.003866 1129.6591 2064.6751
-10 1.003642 989.5674 1661.8675
-11 1.005570 1163.3998 1970.5629
-By default all variables are summarized with the following functions:
-
-posterior::default_summary_measures()
-[1] "mean" "median" "sd" "mad" "quantile2"
-To change the variables summarized, we use the variables argument
-
-fit$summary(variables = c("mu", "tau"))
-  variable mean median sd mad q5 q95 rhat
-1 mu 6.770868 6.720615 4.160694 4.104430 0.05392642 13.50031 1.004450
-2 tau 5.360230 4.524270 3.536662 3.128657 1.41195350 12.01772 1.029747
- ess_bulk ess_tail
-1 669.4002 1293.6704
-2 147.4551 128.0359
-We can additionally change which functions are used
-
-fit$summary(variables = c("mu", "tau"), mean, sd)
-  variable mean sd
-1 mu 6.770868 4.160694
-2 tau 5.360230 3.536662
-To summarize all variables with non-default functions, it is
-necessary to explicitly set the variables argument, either to
-NULL or the full vector of variable names.
-fit$metadata()$model_params
- [1] "lp__" "mu" "tau" "theta[1]" "theta[2]" "theta[3]"
- [7] "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
-
-fit$summary(variables = NULL, "mean", "median")
-   variable mean median
-1 lp__ -58.189615 -58.233950
-2 mu 6.770868 6.720615
-3 tau 5.360230 4.524270
-4 theta[1] 9.538150 8.703025
-5 theta[2] 7.037750 7.084940
-6 theta[3] 5.865357 5.940265
-7 theta[4] 7.096898 7.021110
-8 theta[5] 4.905340 5.104415
-9 theta[6] 5.743860 5.979900
-10 theta[7] 9.332145 8.785955
-11 theta[8] 7.283320 7.100935
-Summary functions can be specified by character string, function, or
-using a formula (or anything else supported by
-rlang::as_function()). If these arguments are named, those
-names will be used in the tibble output. If the summary results are
-named they will take precedence.
-my_sd <- function(x) c(My_SD = sd(x))
-fit$summary(
- c("mu", "tau"),
- MEAN = mean,
- "median",
- my_sd,
- ~quantile(.x, probs = c(0.1, 0.9)),
- Minimum = function(x) min(x)
-)
-  variable MEAN median My_SD 10% 90% Minimum
-1 mu 6.770868 6.720615 4.160694 1.577812 12.27689 -7.905940
-2 tau 5.360230 4.524270 3.536662 1.751609 10.20606 0.836362
-Arguments to all summary functions can also be specified with
-.args.
variable 2.5% 5% 95% 97.5%
-1 mu -1.320617 0.05392642 13.50031 14.99084
-2 tau 1.176680 1.41195350 12.01772 13.99329
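The call that produced the quantile table above is not shown. A minimal sketch of the same pattern follows, assuming the posterior package is installed and using its built-in example draws as a stand-in for `fit$draws()` (the object name `res_quantiles` is illustrative). Because `$summary()` calls `posterior::summarise_draws()`, the same `.args` list should work when passed to `fit$summary()`.

```r
library(posterior)

# Stand-in for fit$draws(): example draws shipped with the posterior package.
draws <- subset_draws(example_draws("eight_schools"), variable = c("mu", "tau"))

# `.args` is forwarded to every summary function, here to quantile().
res_quantiles <- summarise_draws(
  draws,
  quantile,
  .args = list(probs = c(0.025, 0.05, 0.95, 0.975))
)
print(res_quantiles)
```

The result has one column per requested probability, named by `quantile()` itself, as in the table above.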
-The summary functions are applied to the array of sample values, with
-dimension iter_sampling x chains.
-fit$summary(variables = NULL, dim, colMeans)
-   variable dim.1 dim.2 1 2 3 4
-1 lp__ 1000 4 -59.512825 -57.501953 -58.519416 -57.224267
-2 mu 1000 4 6.687580 7.323996 6.485391 6.586506
-3 tau 1000 4 6.113202 4.991316 5.617574 4.718827
-4 theta[1] 1000 4 9.845526 10.063756 9.480842 8.762475
-5 theta[2] 1000 4 7.109487 7.340212 6.950864 6.750437
-6 theta[3] 1000 4 5.614556 6.465821 5.368069 6.012981
-7 theta[4] 1000 4 7.095362 7.575608 6.862193 6.854430
-8 theta[5] 1000 4 4.335314 5.915169 4.347114 5.023764
-9 theta[6] 1000 4 5.495814 6.275650 5.270746 5.933231
-10 theta[7] 1000 4 9.739179 9.685921 9.144607 8.758873
-11 theta[8] 1000 4 7.154340 7.740947 7.134879 7.103115
-For this reason users may have unexpected results if they use
-stats::var() directly, as it will return a covariance
-matrix. An alternative is the distributional::variance()
-function, which can also be accessed via
-posterior::variance().
variable posterior::variance ~var(as.vector(.x))
-1 mu 17.31137 17.31137
-2 tau 12.50798 12.50798
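The code behind the variance table above is missing as well. The sketch below shows one way to get a pooled variance per variable, again assuming the posterior package and its example draws in place of `fit$draws()`; the column names `variance` and `pooled_var` are chosen here for illustration.

```r
library(posterior)

# Stand-in for fit$draws(): example draws shipped with the posterior package.
draws <- subset_draws(example_draws("eight_schools"), variable = c("mu", "tau"))

# Each summary function receives an iterations x chains matrix per variable,
# so stats::var() alone would return a chains x chains covariance matrix.
# Flattening the matrix first gives a single pooled variance.
res_var <- summarise_draws(
  draws,
  variance = posterior::variance,
  pooled_var = ~ var(as.vector(.x))
)
print(res_var)
```

Both columns should agree, which matches the identical columns in the table above.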
-Summary functions need not be numeric, but these won’t work with
-$print().
-strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
-fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
-   variable Strictly Positive
-1 lp__ no
-2 mu no
-3 tau yes
-4 theta[1] no
-5 theta[2] no
-6 theta[3] no
-7 theta[4] no
-8 theta[5] no
-9 theta[6] no
-10 theta[7] no
-11 theta[8] no
-
-# fit$print(variables = NULL, "Strictly Positive" = strict_pos)
-For more information, see posterior::summarise_draws(),
-which is called by $summary().
The $draws()
-method can be used to extract the posterior draws in formats provided by
-the posterior
-package. Here we demonstrate only the draws_array and
-draws_df formats, but the posterior
-package supports other useful formats as well.
-# default is a 3-D draws_array object from the posterior package
-# iterations x chains x variables
-draws_arr <- fit$draws() # or format="array"
-str(draws_arr)
- 'draws_array' num [1:1000, 1:4, 1:11] -60.3 -61 -59.5 -58.6 -64.2 ...
- - attr(*, "dimnames")=List of 3
- ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
- ..$ chain : chr [1:4] "1" "2" "3" "4"
- ..$ variable : chr [1:11] "lp__" "mu" "tau" "theta[1]" ...
-
-# draws x variables data frame
-draws_df <- fit$draws(format = "df")
-str(draws_df)
-draws_df [4,000 × 14] (S3: draws_df/draws/tbl_df/tbl/data.frame)
- $ lp__ : num [1:4000] -60.3 -61 -59.5 -58.6 -64.2 ...
- $ mu : num [1:4000] 13.44 3.8 5.99 5.51 10.76 ...
- $ tau : num [1:4000] 5.55 7.85 6.33 5.48 8.88 ...
- $ theta[1] : num [1:4000] 18.07 3.52 12.12 4.46 13.9 ...
- $ theta[2] : num [1:4000] 16.476 -0.445 4.075 3.544 4.102 ...
- $ theta[3] : num [1:4000] 11.7279 5.8634 -0.0525 -0.6551 3.4664 ...
- $ theta[4] : num [1:4000] 14.84 3.94 -1.12 6.6 -2.9 ...
- $ theta[5] : num [1:4000] 12.108 0.652 -2.931 7.275 -6.486 ...
- $ theta[6] : num [1:4000] 7.16 9.55 9.69 3.59 6.77 ...
- $ theta[7] : num [1:4000] 11.31 11.43 4.64 11.3 5.55 ...
- $ theta[8] : num [1:4000] 24.13 -7.81 9.13 -3.8 20.05 ...
- $ .chain : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
- $ .iteration: int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
- $ .draw : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
-
-print(draws_df)
-# A draws_df: 1000 iterations, 4 chains, and 11 variables
- lp__ mu tau theta[1] theta[2] theta[3] theta[4] theta[5]
-1 -60 13.4 5.6 18.1 16.48 11.728 14.8 12.11
-2 -61 3.8 7.9 3.5 -0.45 5.863 3.9 0.65
-3 -59 6.0 6.3 12.1 4.08 -0.053 -1.1 -2.93
-4 -59 5.5 5.5 4.5 3.54 -0.655 6.6 7.28
-5 -64 10.8 8.9 13.9 4.10 3.466 -2.9 -6.49
-6 -61 6.6 11.3 19.3 10.54 6.469 5.6 -3.64
-7 -49 7.2 2.1 5.6 6.76 6.771 7.5 7.51
-8 -53 6.8 1.5 6.1 6.97 7.150 3.8 8.49
-9 -53 6.8 1.5 6.1 6.97 7.150 3.8 8.49
-10 -56 6.5 3.8 4.8 3.31 9.891 8.5 4.73
-# ... with 3990 more draws, and 3 more variables
-# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-To convert an existing draws object to a different format use the
-posterior::as_draws_*() functions.
To manipulate the draws objects use the various methods
-described in the posterior package vignettes
-and documentation.
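As a short illustration of the conversion functions mentioned above, the sketch below converts one draws object into several other formats. It assumes the posterior package is installed and uses its example draws rather than a CmdStanR fit.

```r
library(posterior)

# Example draws_array shipped with posterior (stand-in for fit$draws()).
draws_arr <- example_draws("eight_schools")

# Convert the same draws to other formats; the content is unchanged.
draws_df   <- as_draws_df(draws_arr)      # draws x variables data frame
draws_mat  <- as_draws_matrix(draws_arr)  # draws x variables matrix
draws_list <- as_draws_list(draws_arr)    # one list of variables per chain

# All formats report the same number of draws and variables.
c(ndraws(draws_arr), ndraws(draws_df), ndraws(draws_mat), ndraws(draws_list))
```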
rstan::extract()
-The posterior package’s rvar format
-provides a multidimensional, sample-based representation of random
-variables. See https://mc-stan.org/posterior/articles/rvar.html for
-details. In addition to being useful in its own right, this format also
-allows CmdStanR users to obtain draws in a similar format to
-rstan::extract().
Suppose we have a parameter matrix[2,3] x. The
-rvar format lets you interact with x as if
-it’s a 2 x 3 matrix and automatically applies operations
-over the many posterior draws of x. To instead directly
-access the draws of x while maintaining the structure of
-the matrix use posterior::draws_of(). For example:
-draws <- posterior::as_draws_rvars(fit$draws())
-x_rvar <- draws$x
-x_array <- posterior::draws_of(draws$x)
-The object x_rvar will be an rvar that can
-be used like a 2 x 3 matrix, with the draws handled behind
-the scenes. The object x_array will be a
-4000 x 2 x 3 array (assuming 4000 posterior
-draws), which is the same as it would be after being extracted from the
-list returned by rstan::extract().
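The shapes described above can be checked without fitting a model. The following sketch builds a hypothetical 2 x 3 matrix-valued `rvar` from simulated draws (the draws are random numbers, not output from Stan) and assumes only the posterior package.

```r
library(posterior)

set.seed(1)

# Hypothetical draws: 4000 draws of a 2 x 3 matrix-valued quantity.
# rvar() treats the first array dimension as the draws dimension.
x_rvar <- rvar(array(rnorm(4000 * 2 * 3), dim = c(4000, 2, 3)))

dim(x_rvar)          # behaves like a 2 x 3 matrix

# draws_of() exposes the underlying array with draws as the first dimension.
x_array <- draws_of(x_rvar)
dim(x_array)
```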
vignettes/profiling.Rmd
-This vignette demonstrates how to use the new profiling functionality
-introduced in CmdStan 2.26.0.
-Profiling identifies which parts of a Stan program are taking the -longest time to run and is therefore a useful guide when working on -optimizing the performance of a model.
-However, be aware that the statistical assumptions that go into a -model are the most important factors in overall model performance. It is -often not possible to make up for model problems with just brute force -computation. For ideas on how to address performance of your model from -a statistical perspective, see Gelman (2020).
-
-library(cmdstanr)
-check_cmdstan_toolchain(fix = TRUE, quiet = TRUE)
-Consider a simple logistic regression with parameters
-alpha and beta, covariates X, and
-outcome y.
data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- array[n] int y;
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- beta ~ std_normal();
- alpha ~ std_normal();
-
- y ~ bernoulli_logit(X * beta + alpha);
-}
-A simple question is how much time do the prior calculations take
-compared against the likelihood? To answer this we surround the prior
-and likelihood calculations with profile statements.
profile("priors") {
- target += std_normal_lpdf(beta);
- target += std_normal_lpdf(alpha);
-}
-profile("likelihood") {
- target += bernoulli_logit_lpmf(y | X * beta + alpha);
-}
-In general we recommend using a separate .stan file, but
-for convenience in this vignette we’ll write the Stan program as a
-string and use write_stan_file() to write it to a temporary
-file.
-profiling_bernoulli_logit <- write_stan_file('
-data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- array[n] int y;
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- profile("priors") {
- target += std_normal_lpdf(beta);
- target += std_normal_lpdf(alpha);
- }
- profile("likelihood") {
- target += bernoulli_logit_lpmf(y | X * beta + alpha);
- }
-}
-')
-We can then run the model as usual and Stan will collect the
-profiling information for any sections with profile
-statements.
-# Compile the model
-model <- cmdstan_model(profiling_bernoulli_logit)
-
-# Generate some fake data
-n <- 1000
-k <- 20
-X <- matrix(rnorm(n * k), ncol = k)
-
-y <- 3 * X[,1] - 2 * X[,2] + 1
-p <- runif(n)
-y <- ifelse(p < (1 / (1 + exp(-y))), 1, 0)
-stan_data <- list(k = ncol(X), n = nrow(X), y = y, X = X)
-
-# Run one chain of the model
-fit <- model$sample(data = stan_data, chains = 1)
-The raw profiling information can then be accessed with the
-$profiles() method, which returns a list containing one
-data frame per chain (profiles across multiple chains are not
-automatically aggregated). Details on the column names are available in
-the CmdStan
-documentation.
-fit$profiles()[[1]]
- name thread_id total_time forward_time reverse_time chain_stack
-1 likelihood 0x7ff85af4eb00 0.60053000 0.49063200 0.10989800 52272
-2 priors 0x7ff85af4eb00 0.00611123 0.00426232 0.00184891 34848
- no_chain_stack autodiff_calls no_autodiff_calls
-1 34865424 17424 1
-2 34848 17424 1
-The total_time column is the total time spent inside a
-given profile statement. It is clear that the vast majority of time is
-spent in the likelihood function.
Stan’s specialized glm functions can be used to make models like this -faster. In this case the likelihood can be replaced with
-target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
-We’ll keep the same profile() statements so that the
-profiling information for the new model is collected automatically just
-like for the previous one.
-profiling_bernoulli_logit_glm <- write_stan_file('
-data {
- int<lower=1> k;
- int<lower=0> n;
- matrix[n, k] X;
- array[n] int y;
-}
-parameters {
- vector[k] beta;
- real alpha;
-}
-model {
- profile("priors") {
- target += std_normal_lpdf(beta);
- target += std_normal_lpdf(alpha);
- }
- profile("likelihood") {
- target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
- }
-}
-')
-model_glm <- cmdstan_model(profiling_bernoulli_logit_glm)
-fit_glm <- model_glm$sample(data = stan_data, chains = 1)
-fit_glm$profiles()[[1]]
- name thread_id total_time forward_time reverse_time chain_stack
-1 likelihood 0x7ff85af4eb00 0.30010000 0.29864700 0.00145329 51729
-2 priors 0x7ff85af4eb00 0.00520466 0.00393537 0.00126928 34486
- no_chain_stack autodiff_calls no_autodiff_calls
-1 17243 17243 1
-2 34486 17243 1
-We can see from the total_time column that this is much
-faster than the previous model.
The other columns of the profiling output are documented in the CmdStan -documentation.
-The timing numbers are broken down by forward pass and reverse pass,
-and the chain_stack and no_chain_stack columns
-contain information about how many autodiff variables were saved in the
-process of performing a calculation.
These numbers are all totals – times are the total times over the
-whole calculation, and chain_stack counts are similarly the
-total counts of autodiff variables used over the whole calculation. It
-is often convenient to have per-gradient calculations (which will be
-more stable across runs with different seeds). To compute these, use the
-autodiff_calls column.
-profile_chain_1 <- fit$profiles()[[1]]
-per_gradient_timing <- profile_chain_1$total_time/profile_chain_1$autodiff_calls
-print(per_gradient_timing) # two elements for the two profile statements in the model
-[1] 3.446568e-05 3.507363e-07
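The same per-gradient calculation can be used to compare the two likelihood implementations. The sketch below does not call CmdStan; it copies the `total_time` and `autodiff_calls` values of the `likelihood` rows from the two `$profiles()` outputs shown earlier into a small data frame (the data frame and its column names are illustrative).

```r
# Values copied from the "likelihood" rows of the two profiling tables above.
likelihood_profiles <- data.frame(
  model = c("bernoulli_logit", "bernoulli_logit_glm"),
  total_time = c(0.60053, 0.3001),
  autodiff_calls = c(17424, 17243)
)

# Time per gradient evaluation for each model.
likelihood_profiles$per_gradient <-
  likelihood_profiles$total_time / likelihood_profiles$autodiff_calls

# Ratio of per-gradient likelihood time: original model vs. GLM version.
speedup <- likelihood_profiles$per_gradient[1] / likelihood_profiles$per_gradient[2]
print(likelihood_profiles)
print(speedup)
```

For these particular runs the GLM likelihood takes roughly half the time per gradient evaluation; timings will differ between machines and runs.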
-After sampling (or optimization or variational inference) finishes,
-CmdStan stores the profiling data in CSV files in a temporary location.
-The paths of the profiling CSV files can be retrieved using
-$profile_files().
-fit$profile_files()
-[1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/Rtmp2a6FE1/model_6580008f67848265f3dfd0e7ae3b0600-profile-202503310851-1-810271.csv"
-These can be saved to a more permanent location with the
-$save_profile_files() method.
-# see ?save_profile_files for info on optional arguments
-fit$save_profile_files(dir = "path/to/directory")
-Gelman, Andrew, Aki Vehtari, Daniel Simpson, Charles C. Margossian,
-Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian
-Bürkner, and Martin Modrák. 2020. “Bayesian Workflow.” https://arxiv.org/abs/2011.01808.
-vignettes/r-markdown.Rmd
-R Markdown supports a variety of languages through the use of knitr
-language engines. Where users wish to write Stan programs as chunks
-directly in R Markdown documents there are three options:
-Behind the scenes in each option, the engine compiles the model code
-in each chunk and creates an object that provides methods to run the
-model: a stanmodel if RStan is being used, or a
-CmdStanModel in the CmdStanR case. This model object is
-assigned to a variable with the name given by the
-output.var chunk option.
This is the default option. In that case we can write, for -example:
- -If CmdStanR is being used a replacement engine needs to be registered -along the following lines:
-
-library(cmdstanr)
-register_knitr_engine(override = TRUE)
-This overrides knitr’s built-in stan engine so that all
-stan chunks are processed with CmdStanR, not RStan. Of
-course, this also means that the variable specified by
-output.var will no longer be a stanmodel
-object, but instead a CmdStanModel object, so the example
-code above would look like this:
// This stan chunk results in a CmdStanModel object called "ex1"
-parameters {
- array[2] real y;
-}
-model {
- y[1] ~ normal(0, 1);
- y[2] ~ double_exponential(0, 2);
-}
-ex1$print()
-#> // This stan chunk results in a CmdStanModel object called "ex1"
-#> parameters {
-#> array[2] real y;
-#> }
-#> model {
-#> y[1] ~ normal(0, 1);
-#> y[2] ~ double_exponential(0, 2);
-#> }
-fit <- ex1$sample(
- refresh = 0,
- seed = 42L
-)
-#> Running MCMC with 4 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.7 seconds.
-
-print(fit)
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> lp__ -1.49 -1.15 1.25 0.98 -4.04 -0.17 1.00 1284 1327
-#> y[1] -0.03 -0.02 1.01 1.02 -1.68 1.62 1.00 2086 1813
-#> y[2] -0.01 -0.05 2.78 1.99 -4.55 4.62 1.00 2015 1281
-While the default behavior is to override the built-in
-stan engine because the assumption is that the user is
-probably not using both RStan and CmdStanR in the same document or
-project, the option to use both exists. When registering CmdStanR’s
-knitr engine, set override = FALSE to register the engine
-as a cmdstan engine:
-register_knitr_engine(override = FALSE)
-This will cause stan chunks to be processed by knitr’s
-built-in, RStan-based engine and only use CmdStanR’s knitr engine for
-cmdstan chunks:
Use cache=TRUE chunk option to avoid re-compiling the
-Stan model code every time the R Markdown is knit/rendered.
You can find the Stan model file and the compiled executable in the -document’s cache directory.
-When running chunks interactively in RStudio (e.g. when using R
-Notebooks), it has been observed that the built-in, RStan-based
-engine is used for stan chunks even when CmdStanR’s engine
-has been registered in the session as the engine for stan.
-As a workaround, when running chunks interactively, it is
-recommended to use the override = FALSE option and change
-stan chunks to be cmdstan chunks.
Do not worry: if the template you use supports syntax highlighting
-for the Stan language, that syntax highlighting will be applied to
-cmdstan chunks when the document is knit/rendered.
vignettes/r_markdown.Rmd
-R Markdown supports a variety of languages through the use of knitr language engines. One such engine is the stan engine, which allows users to write Stan programs directly in their R Markdown documents by setting the language of the chunk to stan.
Behind the scenes, the engine relies on RStan to compile the model code into an in-memory stanmodel, which is assigned to a variable with the name given by the output.var chunk option. For example:
```{stan, output.var="model"}
-// Stan model code
-```
-
-```{r}
-rstan::sampling(model)
-```
-CmdStanR provides a replacement engine, which can be registered as follows:
-library(cmdstanr)
-
-register_knitr_engine()
By default, this overrides knitr’s built-in stan engine so that all stan chunks are processed with CmdStanR, not RStan. Of course, this also means that the variable specified by output.var will no longer be a stanmodel object, but instead a CmdStanModel object, so the code above would look like this:
```{stan, output.var="model"}
-// Stan model code
-```
-
-```{r}
-model$sample()
-```
-// This stan chunk results in a CmdStanModel object called "ex1"
-parameters {
- real y[2];
-}
-model {
- y[1] ~ normal(0, 1);
- y[2] ~ double_exponential(0, 2);
-}
-ex1$print()
-#> // This stan chunk results in a CmdStanModel object called "ex1"
-#> parameters {
-#>   real y[2];
-#> }
-#> model {
-#>   y[1] ~ normal(0, 1);
-#>   y[2] ~ double_exponential(0, 2);
-#> }
fit <- ex1$sample(
-  refresh = 0,
-  seed = 42L
-)
-#> Running MCMC with 4 sequential chains...
-#>
-#> Chain 1 finished in 0.3 seconds.
-#> Chain 2 finished in 0.1 seconds.
-#> Chain 3 finished in 0.1 seconds.
-#> Chain 4 finished in 0.1 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.2 seconds.
-#> Total execution time: 0.7 seconds.
-
-print(fit)
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> lp__ -1.52 -1.20 1.26 0.98 -4.04 -0.17 1.00 1584 1603
-#> y[1] -0.01 0.01 1.02 1.03 -1.69 1.66 1.00 1863 2109
-#> y[2] 0.04 0.03 2.86 2.01 -4.61 4.86 1.00 2201 1625
Use cache=TRUE chunk option to avoid re-compiling the Stan model code every time the R Markdown is knit/rendered.
You can find the Stan model file and the compiled executable in the document’s cache directory.
-While the default behavior is to override the built-in stan engine because the assumption is that the user is probably not using both RStan and CmdStanR in the same document or project, the option to use both exists. When registering CmdStanR’s knitr engine, set override = FALSE to register the engine as a cmdstan engine:
register_knitr_engine(override = FALSE)
This will cause stan chunks to be processed by knitr’s built-in, RStan-based engine and only use CmdStanR’s knitr engine for cmdstan chunks:
```{stan, output.var="model_obj1"}
-// Results in a stanmodel object
-```
-
-```{r}
-rstan::sampling(model_obj1)
-```
-
-```{cmdstan, output.var="model_obj2"}
-// Results in a CmdStanModel object
-```
-
-```{r}
-model_obj2$sample()
-```
-CmdStanR is a lightweight interface to Stan for R users (see CmdStanPy for Python).
-If you are new to CmdStanR we recommend starting with these vignettes:
- -A clean interface to Stan services so that CmdStanR can keep up with Stan releases.
R code that doesn’t interface directly with C++, only calls compiled executables.
Modularity: CmdStanR runs Stan’s algorithms and lets downstream modules do the analysis.
Flexible BSD-3 license.
You can install the latest beta release of the cmdstanr R package with
-
-# we recommend running this in a fresh R session or restarting your current session
-install.packages("cmdstanr", repos = c('https://stan-dev.r-universe.dev', getOption("repos")))
-This does not install the vignettes, which take a long time to build, but they are always available online at https://mc-stan.org/cmdstanr/articles/.
-To instead install the latest development version of the package from GitHub use
-
-# install.packages("remotes")
-remotes::install_github("stan-dev/cmdstanr")
-If you don’t already have CmdStan installed then, in addition to installing the R package, it is also necessary to install CmdStan using CmdStanR’s install_cmdstan() function. A suitable C++ toolchain is also required. Instructions are provided in the Getting started with CmdStanR vignette.
There is a lot of work still to be done and we welcome contributions from anyone! If you are interested in contributing please comment on an open issue or open a new one if none are applicable.
-CmdStanR, like CmdStan and the core Stan C++ code, is licensed under the following licenses:
-NEWS.md
- sampler_diagnostics() with fixed_param=TRUE
-loo method (#1057)loo method (#1015)make (#1036)untar to fix installation errors (#1034)make/local when exposing functions or model methods (#1003)optimize and loo methods (#1060)rstan::read_stan_csv due to incompatibility with newer CmdStan outputs (#1018)cmdstanr_print_line_numbers for printing line numbers (#1017)CMDSTANR_USE_RTOOLS environment variable to force stock RTools on Windows by @andrjohns in #980inc_warmup argument to $unconstrain_draws() by @andrjohns in #985$unconstrain_draws() returning incorrect assumptions in some cases by @andrjohns in #983CmdStanFit objects as initial values by @SteveBronder in #937show_messages and show_exceptions arguments to all methods for controlling output by @andrjohns in #897unconstrain_draws() method to specify draws format of return by @andrjohns in #886cmdstanr EBFMI diagnostic threshold with CmdStan by @andrjohns in #892cmdstanr_print_line_numbers to add line number to model printing by @sbfnk in #967save_metric and save_cmdstan_config by @venpopov in #932rstan::extract() using a combination of cmdstanr and posterior by @jgabry in #955psis_resample and calculate_lp arguments added to Pathfinder method by @SteveBronder in #903cmdstanr_warn_inits added to disable warnings about partially specified initial values by @jgabry in #913output_dir documentation by @jgabry in #929num_paths by @andrjohns in #964compile_stanalone=TRUE but no functions are found by @jgabry in #956inv_metric argument with only 1 parameter by @venpopov in #935laplace method by @jgabry in #800pathfinder method by @SteveBronder in #848jacobian_adjustment argument to jacobian by @jgabry in #879expose_functions() method to expose Stan functions to R by @andrjohns in #702. See ?expose_functions.?init_model_methods.wsl=TRUE in install_cmdstan() to install CmdStan for use with WSL. This can offer significant speedups compared to native Windows execution. 
(#677, @andrjohns)In cmdstan_default_path() we now ignore directories inside .cmdstan that don’t start with "cmdstan-". (#651)
Fixed Windows issue related to not locating grep.exe or when it is located in a path with spaces. (@weshinsley, #661, #663)
Fixed a bug with diagnostic checks when ebfmi is NaN.
Fixed a bug that caused issues when using ~ or . in paths supplied to the cmdstanr_write_stan_file_dir global option.
Fixed a bug that caused the time() method to fail when some of the chains failed to finish successfully.
Refactored toolchain installation and checks for R 4.x on Windows and added support for Rtools42. (#645)
Expanded the use of CMDSTAN environment variable to point to CmdStan installation or directory containing CmdStan installations. (#643)
New vignette on how to handle deprecations using the $format() method. (#644)
format="draws_rvars" in the $draws() method due to a bug. Until this is fixed users can make use of posterior::as_draws_rvars() to convert draws from CmdStanR to the draws_rvars format. (#640)
Default directory changed to .cmdstan instead of .cmdstanr so that CmdStanPy and CmdStanR can use the same CmdStan installations. Using .cmdstanr will continue to be supported until version 1.0 but install_cmdstan() will now default to .cmdstan and CmdStanR will first look for .cmdstan before falling back on .cmdstanr. (#454)
New method diagnose() for CmdstanModel objects exposes CmdStan’s diagnose method for comparing Stan’s gradient computations to gradients computed via finite differences. (#485)
New method $variables() for CmdstanModel objects that returns a list of variables in the Stan model, their types and number of dimensions. Does not require the model to be compiled. (#519)
New method $format() for auto-formatting and canonicalizing the Stan models. (#625)
Added the option to create CmdStanModel from the executable only with the exe_file argument. (#564)
Added a convenience argument user_header to $compile() and cmdstan_model() that simplifies the use of an external .hpp file to compile with the model.
Added the cmdstanr_force_recompile global option that is used for forcing recompilation of Stan models. (#580)
New method $code() for all fitted model objects that returns the Stan code associated with the fitted model. (#575)
New method $diagnostic_summary() for CmdStanMCMC objects that summarizes the sampler diagnostics (divergences, treedepth, ebfmi) and can regenerate the related warning messages. (#205)
New diagnostics argument for the $sample() method to specify which diagnostics are checked after sampling. Replaces validate_csv argument. (#205)
Added E-BFMI checks that run automatically post sampling. (#500, @jsocolar)
New methods for posterior::as_draws() for CmdStanR fitted model objects. These are just wrappers around the $draws() method provided for convenience. (#532)
write_stan_file() now chooses file names deterministically based on the code so that models do not get unnecessarily recompiled when calling the function multiple times with the same code. (#495, @martinmodrak)
The dir argument for write_stan_file() can now be set with a global option. (#537)
write_stan_json() now handles data of class "table". Tables are converted to vector, matrix, or array depending on the dimensions of the table. (#528)
Improved processing of named lists supplied to the data argument to JSON data files: checking whether the list includes all required elements/Stan variables; improved differentiating arrays/vectors of length 1 and scalars when generating JSON data files; generating floating point numbers with decimal points to fix issue with parsing large numbers. (#538)
install_cmdstan() now automatically installs the Linux ARM CmdStan when Linux distributions running on ARM CPUs are detected. (#531)
New function as_mcmc.list() for converting CmdStanMCMC objects to mcmc.list objects from the coda package. (#584, @MatsuuraKentaro)
New function as_cmdstan_fit() that creates CmdStanMCMC/MLE/VB objects directly from CmdStan CSV files. (#412)
read_cmdstan_csv() now also returns chain run times for MCMC sampling CSV files. (#414)
Faster CSV reading for multiple chains. (#419)
New $profiles() method for fitted model objects accesses profiling information from R if profiling used in the Stan program. Support for profiling Stan programs requires CmdStan >= 2.26. (#434)
New vignette on profiling Stan programs. (#435)
New vignette on running Stan on the GPU with OpenCL. OpenCL device ids can now also be specified at runtime. (#439)
New check for invalid parameter names when supplying init values. (#452, @mike-lawrence)
Suppressing compilation messages when not in interactive mode. (#462, @wlandau)
New error_on_NA argument for cmdstan_version() to optionally return NULL (instead of erroring) if the CmdStan path is not found (#467, @wlandau).
Global option cmdstanr_max_rows can be set as an alternative to specifying max_rows argument to the $print() method. (#470)
New output_basename argument for the model fitting methods. Can be used in conjunction with output_dir to get completely predictable output CSV file paths. (#471)
New format argument for $draws(), $sampler_diagnostics(), read_cmdstan_csv(), and as_cmdstan_fit(). This controls the format of the draws returned or stored in the object. Changing the format can improve speed and memory usage for large models. (#482)
Added $sample_mpi() for MCMC sampling with MPI. (#350)
Added informative messages on compile errors caused by precompiled headers (PCH). (#384)
Added the cmdstanr_verbose option for verbose mode. Intended for troubleshooting, debugging and development. See end of How does CmdStanR work? vignette for details. (#392)
New $loo() method for CmdStanMCMC objects. Requires computing pointwise log-likelihood in Stan program. (#366)
The fitted_params argument to the $generate_quantities() method now also accepts CmdStanVB, posterior::draws_array, and posterior::draws_matrix objects. (#390)
The $optimize() method now supports all of CmdStan’s tolerance-related arguments for (L)BFGS. (#398)
The documentation for the R6 methods now uses @param, which allows package developers to import the CmdStanR documentation using roxygen2’s @inheritParams. (#408)
Fixed bug with processing stanc_options in check_syntax(). (#345)
compile() and check_syntax() methods gain argument pedantic for turning on pedantic mode, which warns about issues with the model beyond syntax errors. (#361)
Fix potential indexing error if using read_cmdstan_csv() with CSV files created by CmdStan without CmdStanR. (#291, #292, @johnlees)
Fix error when returning draws or sampler diagnostics for a fit with only warmup and no samples. (#288, #293)
Fix trailing slashes issue for dir in cmdstan_model() and output_dir in fitting methods. (#281, #294)
Fix dimensions error when processing a list of matrices passed in as data. (#296, #302)
Fix reporting of time after using fixed_param method. (#303, #307)
With refresh = 0, no output other than error messages is printed with $optimize() and $variational(). (#324)
Fix issue where names of generated files could clash. (#326, #328)
Fix missing include_paths in $syntax_check(). (#335, @mike-lawrence)
CSV reading is now faster by using data.table::fread(). (#318)
install_cmdstan() gains argument version for specifying which version of CmdStan to install. (#300, #308)
New function check_cmdstan_toolchain() that checks if the appropriate toolchains are available. (#289)
$sample() method for CmdStanModel objects gains argument chain_ids for specifying custom chain IDs. (#319)
Added support for the sig_figs argument in CmdStan versions 2.25 and above. (#327)
Added checks if the user has the necessary permissions in the RTools and temporary folders. (#343)
User is notified by message at load time if a new release of CmdStan is available. (#265, #273)
write_stan_file() replaces write_stan_tempfile(), which is now deprecated. With the addition of the dir argument, the file written is not necessarily temporary. (#267, #272)
eng_cmdstan() and function register_knitr_engine() that allow Stan chunks in R markdown documents to be processed using CmdStanR instead of RStan. The new vignette R Markdown CmdStan Engine provides a demonstration. (#261, #264, @bearloga)
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): INSERT COPYRIGHT HOLDER HERE
-By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
-A CmdStanDiagnose object is the object returned by the
-$diagnose() method of a CmdStanModel object.
CmdStanDiagnose objects have the following associated
-methods:
| Method | Description |
| --- | --- |
| $gradients() | Return gradients from diagnostic mode. |
| $lp() | Return the total log probability density (target). |
| $init() | Return user-specified initial values. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanGQ,
-CmdStanLaplace,
-CmdStanMCMC,
-CmdStanMLE,
-CmdStanPathfinder,
-CmdStanVB
# \dontrun{
-test <- cmdstanr_example("logistic", method = "diagnose")
-
-# retrieve the gradients
-test$gradients()
-#> param_idx value model finite_diff error
-#> 1 0 1.025820 -9.58652 -9.58652 1.06300e-08
-#> 2 1 -1.330520 7.42547 7.42547 3.40819e-08
-#> 3 2 -1.187440 13.23460 13.23460 7.67016e-09
-#> 4 3 0.699258 3.50023 3.50023 6.99257e-09
-# }
-
-CmdStanFit-method-save_output_files.Rd
All fitted model objects have methods save_output_files() and
-save_data_file(). These methods move csv output files and R dump or json
-data files from the CmdStanR temporary directory to a user-specified
-location. By default the suffix '-<run_id>_<timestamp>' is added to the
-file name(s), where run_id is the chain number if applicable (MCMC only)
-and 1 otherwise. If files with the specified names already exist they are
-overwritten, but this shouldn't occur unless the timestamp argument has
-been intentionally set to FALSE.
$save_output_files(dir = ".", basename = NULL, timestamp = TRUE)
-$save_data_file(dir = ".", basename = NULL, timestamp = TRUE)
save_output_files() and save_data_file() have the
-same arguments:
dir: (string) Path to directory where the files should be saved.
basename: (string) Base filename to use.
timestamp: (logical) Should a timestamp be added to the file name(s)?
-Defaults to TRUE. The timestamp is preceded by an underscore and is of
-the form
The paths to the new files or NA for any that couldn't be
-copied.
A CmdStanGQ object is the fitted model object returned by the
-$generate_quantities() method of a
-CmdStanModel object.
CmdStanGQ objects have the following associated methods,
-all of which have their own (linked) documentation pages.
| Method | Description |
| --- | --- |
| $draws() | Return the generated quantities as a draws_array. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $code() | Return Stan code as a character vector. |

| Method | Description |
| --- | --- |
| $summary() | Run posterior::summarise_draws(). |

| Method | Description |
| --- | --- |
| $save_object() | Save fitted model object to a file. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |

| Method | Description |
| --- | --- |
| $time() | Report the total run time. |
| $output() | Return the stdout and stderr of all chains or pretty print the output for a single chain. |
| $return_codes() | Return the return codes from the CmdStan runs. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanDiagnose,
-CmdStanLaplace,
-CmdStanMCMC,
-CmdStanMLE,
-CmdStanPathfinder,
-CmdStanVB
# \dontrun{
-# first fit a model using MCMC
-mcmc_program <- write_stan_file(
- "data {
- int<lower=0> N;
- array[N] int<lower=0,upper=1> y;
- }
- parameters {
- real<lower=0,upper=1> theta;
- }
- model {
- y ~ bernoulli(theta);
- }"
-)
-mod_mcmc <- cmdstan_model(mcmc_program)
-
-data <- list(N = 10, y = c(1,1,0,0,0,1,0,1,0,0))
-fit_mcmc <- mod_mcmc$sample(data = data, seed = 123, refresh = 0)
-#> Running MCMC with 4 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.7 seconds.
-#>
-
-# stan program for standalone generated quantities
-# (could keep model block, but not necessary so removing it)
-gq_program <- write_stan_file(
- "data {
- int<lower=0> N;
- array[N] int<lower=0,upper=1> y;
- }
- parameters {
- real<lower=0,upper=1> theta;
- }
- generated quantities {
- array[N] int y_rep = bernoulli_rng(rep_vector(theta, N));
- }"
-)
-
-mod_gq <- cmdstan_model(gq_program)
-fit_gq <- mod_gq$generate_quantities(fit_mcmc, data = data, seed = 123)
-#> Running standalone generated quantities after 4 MCMC chains, 1 chain at a time ...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.5 seconds.
-str(fit_gq$draws())
-#> 'draws_array' int [1:1000, 1:4, 1:10] 0 0 0 1 1 0 1 1 0 1 ...
-#> - attr(*, "dimnames")=List of 3
-#> ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> ..$ variable : chr [1:10] "y_rep[1]" "y_rep[2]" "y_rep[3]" "y_rep[4]" ...
-
-library(posterior)
-#> This is posterior version 1.6.1
-#>
-#> Attaching package: ‘posterior’
-#> The following objects are masked from ‘package:stats’:
-#>
-#> mad, sd, var
-#> The following objects are masked from ‘package:base’:
-#>
-#> %in%, match
-as_draws_df(fit_gq$draws())
-#> # A draws_df: 1000 iterations, 4 chains, and 10 variables
-#> y_rep[1] y_rep[2] y_rep[3] y_rep[4] y_rep[5] y_rep[6] y_rep[7] y_rep[8]
-#> 1 0 0 0 0 0 1 1 1
-#> 2 0 0 0 0 1 1 0 0
-#> 3 0 0 0 1 0 0 1 1
-#> 4 1 1 0 0 0 0 1 0
-#> 5 1 0 1 0 1 0 1 0
-#> 6 0 0 0 1 1 0 0 0
-#> 7 1 1 0 1 1 1 0 0
-#> 8 1 1 1 1 1 0 1 1
-#> 9 0 1 0 1 0 1 1 0
-#> 10 1 1 1 1 1 1 1 1
-#> # ... with 3990 more draws, and 2 more variables
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-# }
-
-A CmdStanLaplace object is the fitted model object returned by the
-$laplace() method of a
-CmdStanModel object.
CmdStanLaplace objects have the following associated methods,
-all of which have their own (linked) documentation pages.
| Method | Description |
| --- | --- |
| $draws() | Return approximate posterior draws as a draws_matrix. |
| $mode() | Return the mode as a CmdStanMLE object. |
| $lp() | Return the total log probability density (target) computed in the model block of the Stan program. |
| $lp_approx() | Return the log density of the approximation to the posterior. |
| $init() | Return user-specified initial values. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $code() | Return Stan code as a character vector. |

| Method | Description |
| --- | --- |
| $summary() | Run posterior::summarise_draws(). |

| Method | Description |
| --- | --- |
| $save_object() | Save fitted model object to a file. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |
| $save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

| Method | Description |
| --- | --- |
| $time() | Report the run time of the Laplace sampling step. |
| $output() | Pretty print the output that was printed to the console. |
| $return_codes() | Return the return codes from the CmdStan runs. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanDiagnose,
-CmdStanGQ,
-CmdStanMCMC,
-CmdStanMLE,
-CmdStanPathfinder,
-CmdStanVB
A CmdStanMCMC object is the fitted model object returned by
-the $sample() method of a CmdStanModel object.
-Like CmdStanModel objects, CmdStanMCMC objects are R6
-objects.
CmdStanMCMC objects have the following associated
-methods, all of which have their own (linked) documentation pages.
| Method | Description |
| --- | --- |
| $draws() | Return posterior draws using formats from the posterior package. |
| $sampler_diagnostics() | Return sampler diagnostics as a draws_array. |
| $lp() | Return the total log probability density (target). |
| $inv_metric() | Return the inverse metric for each chain. |
| $init() | Return user-specified initial values. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $num_chains() | Return the number of MCMC chains. |
| $code() | Return Stan code as a character vector. |

| Method | Description |
| --- | --- |
| $print() | Run posterior::summarise_draws(). |
| $summary() | Run posterior::summarise_draws(). |
| $diagnostic_summary() | Get summaries of sampler diagnostics and warning messages. |
| $cmdstan_summary() | Run and print CmdStan's bin/stansummary. |
| $cmdstan_diagnose() | Run and print CmdStan's bin/diagnose. |
| $loo() | Run loo::loo.array() for approximate LOO-CV. |

| Method | Description |
| --- | --- |
| $save_object() | Save fitted model object to a file. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |
| $save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

| Method | Description |
| --- | --- |
| $output() | Return the stdout and stderr of all chains or pretty print the output for a single chain. |
| $time() | Report total and chain-specific run times. |
| $return_codes() | Return the return codes from the CmdStan runs. |

| Method | Description |
| --- | --- |
| $expose_functions() | Expose Stan functions for use in R. |
| $init_model_methods() | Expose methods for log-probability, gradients, parameter constraining and unconstraining. |
| $log_prob() | Calculate log-prob. |
| $grad_log_prob() | Calculate log-prob and gradient. |
| $hessian() | Calculate log-prob, gradient, and hessian. |
| $constrain_variables() | Transform a set of unconstrained parameter values to the constrained scale. |
| $unconstrain_variables() | Transform a set of parameter values to the unconstrained scale. |
| $unconstrain_draws() | Transform all parameter draws to the unconstrained scale. |
| $variable_skeleton() | Helper function to re-structure a vector of constrained parameter values. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanDiagnose,
-CmdStanGQ,
-CmdStanLaplace,
-CmdStanMLE,
-CmdStanPathfinder,
-CmdStanVB
A CmdStanMLE object is the fitted model object returned by the
-$optimize() method of a CmdStanModel object.
-This object will either contain a penalized maximum likelihood estimate
-(MLE) or a maximum a posteriori estimate (MAP), depending on the value of
-the jacobian argument when the model is fit (and whether the model has
-constrained parameters). See $optimize() and the
-CmdStan User's Guide for more details.
CmdStanMLE objects have the following associated methods,
-all of which have their own (linked) documentation pages.
| Method | Description |
| --- | --- |
| $draws() | Return the point estimate as a 1-row draws_matrix. |
| $mle() | Return the point estimate as a numeric vector. |
| $lp() | Return the total log probability density (target). |
| $init() | Return user-specified initial values. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $code() | Return Stan code as a character vector. |

| Method | Description |
| --- | --- |
| $summary() | Run posterior::summarise_draws(). |

| Method | Description |
| --- | --- |
| $save_object() | Save fitted model object to a file. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |

| Method | Description |
| --- | --- |
| $time() | Report the total run time. |
| $output() | Pretty print the output that was printed to the console. |
| $return_codes() | Return the return codes from the CmdStan runs. |

| Method | Description |
| --- | --- |
| $expose_functions() | Expose Stan functions for use in R. |
| $init_model_methods() | Expose methods for log-probability, gradients, parameter constraining and unconstraining. |
| $log_prob() | Calculate log-prob. |
| $grad_log_prob() | Calculate log-prob and gradient. |
| $hessian() | Calculate log-prob, gradient, and hessian. |
| $constrain_variables() | Transform a set of unconstrained parameter values to the constrained scale. |
| $unconstrain_variables() | Transform a set of parameter values to the unconstrained scale. |
| $unconstrain_draws() | Transform all parameter draws to the unconstrained scale. |
| $variable_skeleton() | Helper function to re-structure a vector of constrained parameter values. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanDiagnose,
-CmdStanGQ,
-CmdStanLaplace,
-CmdStanMCMC,
-CmdStanPathfinder,
-CmdStanVB
CmdStanModel-method-compile.Rd
The compile method of a CmdStanModel object calls CmdStan
-to translate a Stan program to C++ and call the C++ compiler. The resulting
-files are placed in the same directory as the Stan program.
$compile()
The compile method returns the CmdStanModel object
-invisibly.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan doc (html or pdf): mc-stan.org/users/documentation/
CmdStan doc (pdf): (github.com/stan-dev/cmdstan/releases/).
Other CmdStanModel methods: CmdStanModel-method-optimize,
- CmdStanModel-method-sample,
- CmdStanModel-method-variational
-# \dontrun{
-# Set path to cmdstan
-# Note: if you installed CmdStan via install_cmdstan() with default settings
-# then default below should work. Otherwise use the `path` argument to
-# specify the location of your CmdStan installation.
-
-set_cmdstan_path(path = NULL)
-#>
-# Create a CmdStan model object from a Stan program,
-# here using the example model that comes with CmdStan
-stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(stan_program)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> int<lower=0,upper=1> y[N];
-#> }
-#> parameters {
-#> real<lower=0,upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1,1);
-#> for (n in 1:N)
-#> y[n] ~ bernoulli(theta);
-#> }
-# Compile to create executable
-mod$compile()
-#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli
-#> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.
-# Run sample method (MCMC via Stan's dynamic HMC/NUTS),
-# specifying data as a named list (like RStan)
-standata <- list(N = 10, y =c(0,1,0,0,0,0,0,0,0,1))
-fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)
-#> method = sample (Default)
-#> sample
-#> num_samples = 1000 (Default)
-#> num_warmup = 1000 (Default)
-#> save_warmup = 0 (Default)
-#> thin = 1 (Default)
-#> adapt
-#> engaged = 1 (Default)
-#> gamma = 0.050000000000000003 (Default)
-#> delta = 0.80000000000000004 (Default)
-#> kappa = 0.75 (Default)
-#> t0 = 10 (Default)
-#> init_buffer = 75 (Default)
-#> term_buffer = 50 (Default)
-#> window = 25 (Default)
-#> algorithm = hmc (Default)
-#> hmc
-#> engine = nuts (Default)
-#> nuts
-#> max_depth = 10 (Default)
-#> metric = diag_e (Default)
-#> metric_file = (Default)
-#> stepsize = 1 (Default)
-#> stepsize_jitter = 0 (Default)
-#> id = 1
-#> data
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b027a7b04df.data.R
-#> init = 2 (Default)
-#> random
-#> seed = 123
-#> output
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#>
-#>
-#> Gradient evaluation took 1.8e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
-#> Adjust your expectations accordingly!
-#>
-#>
-#> Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Iteration: 2000 / 2000 [100%] (Sampling)
-#>
-#> Elapsed Time: 0.01344 seconds (Warm-up)
-#> 0.021608 seconds (Sampling)
-#> 0.035048 seconds (Total)
-#>
-#> method = sample (Default)
-#> sample
-#> num_samples = 1000 (Default)
-#> num_warmup = 1000 (Default)
-#> save_warmup = 0 (Default)
-#> thin = 1 (Default)
-#> adapt
-#> engaged = 1 (Default)
-#> gamma = 0.050000000000000003 (Default)
-#> delta = 0.80000000000000004 (Default)
-#> kappa = 0.75 (Default)
-#> t0 = 10 (Default)
-#> init_buffer = 75 (Default)
-#> term_buffer = 50 (Default)
-#> window = 25 (Default)
-#> algorithm = hmc (Default)
-#> hmc
-#> engine = nuts (Default)
-#> nuts
-#> max_depth = 10 (Default)
-#> metric = diag_e (Default)
-#> metric_file = (Default)
-#> stepsize = 1 (Default)
-#> stepsize_jitter = 0 (Default)
-#> id = 2
-#> data
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b027a7b04df.data.R
-#> init = 2 (Default)
-#> random
-#> seed = 124
-#> output
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#>
-#>
-#> Gradient evaluation took 1.9e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
-#> Adjust your expectations accordingly!
-#>
-#>
-#> Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Iteration: 2000 / 2000 [100%] (Sampling)
-#>
-#> Elapsed Time: 0.013296 seconds (Warm-up)
-#> 0.023346 seconds (Sampling)
-#> 0.036642 seconds (Total)
-#>
-#> Running bin/stansummary \
-#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \
-#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv
-#> Inference for Stan model: bernoulli_model
-#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved.
-#>
-#> Warmup took (0.013, 0.013) seconds, 0.027 seconds total
-#> Sampling took (0.022, 0.023) seconds, 0.045 seconds total
-#>
-#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
-#> lp__ -7.3 2.7e-02 7.3e-01 -8.9 -7.0 -6.8 738 16427 1.0e+00
-#> accept_stat__ 0.92 3.1e-03 1.3e-01 0.64 0.97 1.0 1670 37139 1.0e+00
-#> stepsize__ 0.92 1.7e-03 1.7e-03 0.92 0.92 0.92 1.0 22 1.4e+12
-#> treedepth__ 1.3 1.1e-02 4.7e-01 1.0 1.0 2.0 1968 43773 1.0e+00
-#> n_leapfrog__ 2.4 2.5e-02 1.0e+00 1.0 3.0 3.0 1640 36485 1.0e+00
-#> divergent__ 0.00 0.0e+00 0.0e+00 0.00 0.00 0.00 1000 22245 nan
-#> energy__ 7.8 4.0e-02 1.0e+00 6.8 7.5 9.8 647 14387 1.0e+00
-#> theta 0.24 4.6e-03 1.2e-01 0.077 0.22 0.47 720 16019 1.0e+00
-#>
-#> Samples were drawn using hmc with nuts.
-#> For each parameter, N_Eff is a crude measure of effective sample size,
-#> and R_hat is the potential scale reduction factor on split chains (at
-#> convergence, R_hat=1).
-#>
-# Run optimization method (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file (readable by CmdStan)
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Warning: Optimization method is experimental and the structure of returned object may change.
-#> method = optimize
-#> optimize
-#> algorithm = lbfgs (Default)
-#> lbfgs
-#> init_alpha = 0.001 (Default)
-#> tol_obj = 9.9999999999999998e-13 (Default)
-#> tol_rel_obj = 10000 (Default)
-#> tol_grad = 1e-08 (Default)
-#> tol_rel_grad = 10000000 (Default)
-#> tol_param = 1e-08 (Default)
-#> history_size = 5 (Default)
-#> iter = 2000 (Default)
-#> save_iterations = 0 (Default)
-#> id = 1
-#> data
-#> file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R
-#> init = 2 (Default)
-#> random
-#> seed = 123
-#> output
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#>
-#> Initial log joint probability = -9.51104
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000103557 2.55661e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Estimates from optimization:
-#> theta lp__
-#> 0.20000 -5.00402
-# Run variational Bayes method (default is meanfield ADVI)
-fit_vb <- mod$variational(data = standata, seed = 123)
-#> Warning: Variational inference method is experimental and the structure of returned object may change.
-#> method = variational
-#> variational
-#> algorithm = meanfield (Default)
-#> meanfield
-#> iter = 10000 (Default)
-#> grad_samples = 1 (Default)
-#> elbo_samples = 100 (Default)
-#> eta = 1 (Default)
-#> adapt
-#> engaged = 1 (Default)
-#> iter = 50 (Default)
-#> tol_rel_obj = 0.01 (Default)
-#> eval_elbo = 100 (Default)
-#> output_samples = 1000 (Default)
-#> id = 1
-#> data
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02227d0b4b.data.R
-#> init = 2 (Default)
-#> random
-#> seed = 123
-#> output
-#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#>
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#>
-#>
-#>
-#> Gradient evaluation took 2e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
-#> Adjust your expectations accordingly!
-#>
-#>
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#>
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.258 1.000 1.000
-#> 200 -6.475 0.517 1.000
-#> 300 -6.228 0.358 0.040
-#> 400 -6.220 0.269 0.040
-#> 500 -6.379 0.220 0.034
-#> 600 -6.195 0.188 0.034
-#> 700 -6.262 0.163 0.030
-#> 800 -6.345 0.144 0.030
-#> 900 -6.201 0.131 0.025
-#> 1000 -6.307 0.119 0.025
-#> 1100 -6.290 0.020 0.023
-#> 1200 -6.238 0.017 0.017
-#> 1300 -6.182 0.014 0.013
-#> 1400 -6.167 0.014 0.013
-#> 1500 -6.219 0.012 0.011
-#> 1600 -6.164 0.010 0.009 MEDIAN ELBO CONVERGED
-#>
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Running bin/stansummary \
-#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv
-#> Warning: non-fatal error reading adapation data
-#> Inference for Stan model: bernoulli_model
-#> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved.
-#>
-#> Warmup took (0.00) seconds, 0.00 seconds total
-#> Sampling took (0.00) seconds, 0.00 seconds total
-#>
-#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
-#> lp__ 0.00 0.0e+00 0.00 0.00 0.00 0.0e+00 500 inf nan
-#> log_p__ -7.2 2.5e-02 0.72 -8.6 -7.0 -6.8e+00 789 inf 1.0e+00
-#> log_g__ -0.54 2.9e-02 0.76 -2.1 -0.27 -1.5e-03 679 inf 1.0e+00
-#> theta 0.26 4.2e-03 0.12 0.091 0.23 4.9e-01 823 inf 1.0e+00
-#>
-#> Samples were drawn using meanfield with .
-#> For each parameter, N_Eff is a crude measure of effective sample size,
-#> and R_hat is the potential scale reduction factor on split chains (at
-#> convergence, R_hat=1).
-#>
-# For models fit using MCMC, if you like working with RStan's stanfit objects
-# then you can create one with rstan::read_stan_csv()
-if (require(rstan, quietly = TRUE)) {
- stanfit <- rstan::read_stan_csv(fit_mcmc$output_files())
- print(stanfit)
-}
-#>
-#>
-#>
-#>
-#>
-#> Inference for Stan model: bernoulli-stan-sample-1.
-#> 2 chains, each with iter=2000; warmup=1000; thin=1;
-#> post-warmup draws per chain=1000, total post-warmup draws=2000.
-#>
-#> mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
-#> theta 0.24 0.00 0.12 0.06 0.14 0.22 0.32 0.52 720 1
-#> lp__ -7.32 0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75 737 1
-#>
-#> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:29 2019.
-#> For each parameter, n_eff is a crude measure of effective sample size,
-#> and Rhat is the potential scale reduction factor on split chains (at
-#> convergence, Rhat=1).
-# }
-
CmdStanModel-method-optimize.RdThe optimize method of a CmdStanModel object runs Stan's
-optimizer.
CmdStan can find the posterior mode (assuming there is one). If the -posterior is not convex, there is no guarantee Stan will be able to find -the global mode as opposed to a local optimum of log probability. For -optimization, the mode is calculated without the Jacobian adjustment for -constrained variables, which shifts the mode due to the change of -variables. Thus modes correspond to modes of the model as written.
--- CmdStan Interface User's Guide
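As a worked illustration of the Jacobian remark, take the Bernoulli example used on this page (10 observations, 2 successes, `beta(1,1)` prior). The logit map is assumed here as the unconstraining transform for a variable bounded on (0,1), and the symbol u is introduced only for this sketch.

```latex
\begin{aligned}
\log p(\theta \mid y) &= 2\log\theta + 8\log(1-\theta) + \text{const}
  &&\Rightarrow\quad \hat\theta = \tfrac{2}{10} = 0.2, \\
u = \operatorname{logit}(\theta), \qquad
\log p(u \mid y) &= \log p(\theta \mid y) + \log\bigl[\theta(1-\theta)\bigr]
  &&\Rightarrow\quad \hat\theta = \tfrac{3}{12} = 0.25 .
\end{aligned}
```

The first line is the mode of the model as written. It agrees with the `theta = 0.20000` estimate printed in the Examples, and 2 log(0.2) + 8 log(0.8) ≈ -5.00402 matches the reported `lp__`. The second line shows how including the Jacobian term θ(1-θ) would move the mode.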
-$optimize( - data = NULL, - seed = NULL, - refresh = NULL, - init = NULL, - algorithm = NULL, - init_alpha = NULL, - iter = NULL -) -- -
The following arguments can
-be specified for any of the fitting methods (sample, optimize,
-variational). Arguments left at NULL default to the default used by the
-installed version of CmdStan.
data (multiple options): The data to use:
A named list of R objects like for RStan;
A path to a data file compatible with CmdStan (R dump or JSON). See -the appendices in the CmdStan manual for details on using these -formats.
seed: (positive integer) A seed for the (P)RNG to pass to CmdStan.
refresh: (non-negative integer) The number of iterations between
-screen updates.
init: (multiple options) The initialization method:
A real number x>0 initializes randomly between [-x,x] (on the
-unconstrained parameter space);
0 initializes to 0;
A character vector of data file paths (one per chain) to -initialization files.
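The `init` options above can be pictured with a small base R sketch for a single parameter declared as `real<lower=0,upper=1>`. It does not call CmdStan; the logistic back-transform is an assumption about how a (0,1) bound is unconstrained, and all variable names are hypothetical.

```r
# Sketch only: mimic init = x for one parameter bounded on (0, 1).
# Assumption: the unconstrained scale is the logit scale.
set.seed(123)
x <- 2                                   # the value shown as "init = 2 (Default)" in the logs
u <- runif(5, min = -x, max = x)         # random starting points on the unconstrained scale
theta_init <- plogis(u)                  # the same points on the constrained (0, 1) scale
theta_zero <- plogis(0)                  # init = 0 on the unconstrained scale
range_limits <- plogis(c(-x, x))         # bounds on theta implied by x
print(round(range_limits, 3))
```

Under that assumption, `init = 2` starts `theta` somewhere between roughly 0.12 and 0.88, and `init = 0` starts it at 0.5.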
optimize methodIn addition to the
-arguments above, the optimize method also has its own set of arguments.
-These arguments are described briefly here and in greater detail in the
-CmdStan manual. Arguments left at NULL default to the default used by the
-installed version of CmdStan.
algorithm: (string) The optimization algorithm. One of
-"lbfgs", "bfgs", or "newton".
iter: (positive integer) The number of iterations.
init_alpha: (non-negative real) The line search step size for the first
-iteration. Not applicable if algorithm="newton".
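A minimal sketch of how these arguments might be combined, assuming `mod` and `standata` are the objects created in the Examples section and that CmdStan is installed. The specific values are arbitrary and chosen only for illustration.

```r
# Sketch only: `mod` and `standata` come from the Examples section of this page.
fit_bfgs <- mod$optimize(
  data = standata,
  seed = 123,
  algorithm = "bfgs",   # one of "lbfgs", "bfgs", or "newton"
  iter = 500,           # number of iterations (arbitrary value)
  init_alpha = 0.01     # first-iteration line search step size; not used by "newton"
)
```

Any argument left out keeps its `NULL` default, so CmdStan's own default applies.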
The optimize method returns a CmdStanMLE object.
The CmdStanR website (mc-stan.org/cmdstanr) -for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan doc (html or pdf): mc-stan.org/users/documentation/
CmdStan doc (pdf): (github.com/stan-dev/cmdstan/releases/).
Other CmdStanModel methods: CmdStanModel-method-compile,
- CmdStanModel-method-sample,
- CmdStanModel-method-variational
-# \dontrun{ -# Set path to cmdstan -# Note: if you installed CmdStan via install_cmdstan() with default settings -# then default below should work. Otherwise use the `path` argument to -# specify the location of your CmdStan installation. - -set_cmdstan_path(path = NULL)#>-# Create a CmdStan model object from a Stan program, -# here using the example model that comes with CmdStan -stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan") -mod <- cmdstan_model(stan_program) -mod$print()#> data { -#> int<lower=0> N; -#> int<lower=0,upper=1> y[N]; -#> } -#> parameters { -#> real<lower=0,upper=1> theta; -#> } -#> model { -#> theta ~ beta(1,1); -#> for (n in 1:N) -#> y[n] ~ bernoulli(theta); -#> }-# Compile to create executable -mod$compile()#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli -#> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.-# Run sample method (MCMC via Stan's dynamic HMC/NUTS), -# specifying data as a named list (like RStan) -standata <- list(N = 10, y =c(0,1,0,0,0,0,0,0,0,1)) -fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)#> method = sample (Default) -#> sample -#> num_samples = 1000 (Default) -#> num_warmup = 1000 (Default) -#> save_warmup = 0 (Default) -#> thin = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> gamma = 0.050000000000000003 (Default) -#> delta = 0.80000000000000004 (Default) -#> kappa = 0.75 (Default) -#> t0 = 10 (Default) -#> init_buffer = 75 (Default) -#> term_buffer = 50 (Default) -#> window = 25 (Default) -#> algorithm = hmc (Default) -#> hmc -#> engine = nuts (Default) -#> nuts -#> max_depth = 10 (Default) -#> metric = diag_e (Default) -#> metric_file = (Default) -#> stepsize = 1 (Default) -#> stepsize_jitter = 0 (Default) -#> id = 1 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02e4d2c66.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = 
/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> -#> Gradient evaluation took 2e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Iteration: 1 / 2000 [ 0%] (Warmup) -#> Iteration: 100 / 2000 [ 5%] (Warmup) -#> Iteration: 200 / 2000 [ 10%] (Warmup) -#> Iteration: 300 / 2000 [ 15%] (Warmup) -#> Iteration: 400 / 2000 [ 20%] (Warmup) -#> Iteration: 500 / 2000 [ 25%] (Warmup) -#> Iteration: 600 / 2000 [ 30%] (Warmup) -#> Iteration: 700 / 2000 [ 35%] (Warmup) -#> Iteration: 800 / 2000 [ 40%] (Warmup) -#> Iteration: 900 / 2000 [ 45%] (Warmup) -#> Iteration: 1000 / 2000 [ 50%] (Warmup) -#> Iteration: 1001 / 2000 [ 50%] (Sampling) -#> Iteration: 1100 / 2000 [ 55%] (Sampling) -#> Iteration: 1200 / 2000 [ 60%] (Sampling) -#> Iteration: 1300 / 2000 [ 65%] (Sampling) -#> Iteration: 1400 / 2000 [ 70%] (Sampling) -#> Iteration: 1500 / 2000 [ 75%] (Sampling) -#> Iteration: 1600 / 2000 [ 80%] (Sampling) -#> Iteration: 1700 / 2000 [ 85%] (Sampling) -#> Iteration: 1800 / 2000 [ 90%] (Sampling) -#> Iteration: 1900 / 2000 [ 95%] (Sampling) -#> Iteration: 2000 / 2000 [100%] (Sampling) -#> -#> Elapsed Time: 0.012707 seconds (Warm-up) -#> 0.0214 seconds (Sampling) -#> 0.034107 seconds (Total) -#> -#> method = sample (Default) -#> sample -#> num_samples = 1000 (Default) -#> num_warmup = 1000 (Default) -#> save_warmup = 0 (Default) -#> thin = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> gamma = 0.050000000000000003 (Default) -#> delta = 0.80000000000000004 (Default) -#> kappa = 0.75 (Default) -#> t0 = 10 (Default) -#> init_buffer = 75 (Default) -#> term_buffer = 50 (Default) -#> window = 25 (Default) -#> algorithm = hmc (Default) -#> hmc -#> engine = nuts (Default) -#> nuts -#> max_depth = 10 (Default) -#> metric = diag_e (Default) -#> metric_file = (Default) -#> stepsize = 1 
(Default) -#> stepsize_jitter = 0 (Default) -#> id = 2 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02e4d2c66.data.R -#> init = 2 (Default) -#> random -#> seed = 124 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> -#> Gradient evaluation took 2.1e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Iteration: 1 / 2000 [ 0%] (Warmup) -#> Iteration: 100 / 2000 [ 5%] (Warmup) -#> Iteration: 200 / 2000 [ 10%] (Warmup) -#> Iteration: 300 / 2000 [ 15%] (Warmup) -#> Iteration: 400 / 2000 [ 20%] (Warmup) -#> Iteration: 500 / 2000 [ 25%] (Warmup) -#> Iteration: 600 / 2000 [ 30%] (Warmup) -#> Iteration: 700 / 2000 [ 35%] (Warmup) -#> Iteration: 800 / 2000 [ 40%] (Warmup) -#> Iteration: 900 / 2000 [ 45%] (Warmup) -#> Iteration: 1000 / 2000 [ 50%] (Warmup) -#> Iteration: 1001 / 2000 [ 50%] (Sampling) -#> Iteration: 1100 / 2000 [ 55%] (Sampling) -#> Iteration: 1200 / 2000 [ 60%] (Sampling) -#> Iteration: 1300 / 2000 [ 65%] (Sampling) -#> Iteration: 1400 / 2000 [ 70%] (Sampling) -#> Iteration: 1500 / 2000 [ 75%] (Sampling) -#> Iteration: 1600 / 2000 [ 80%] (Sampling) -#> Iteration: 1700 / 2000 [ 85%] (Sampling) -#> Iteration: 1800 / 2000 [ 90%] (Sampling) -#> Iteration: 1900 / 2000 [ 95%] (Sampling) -#> Iteration: 2000 / 2000 [100%] (Sampling) -#> -#> Elapsed Time: 0.012695 seconds (Warm-up) -#> 0.019917 seconds (Sampling) -#> 0.032612 seconds (Total) -#>#> Running bin/stansummary \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv -#> Inference for Stan model: bernoulli_model -#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved. 
-#> -#> Warmup took (0.013, 0.013) seconds, 0.025 seconds total -#> Sampling took (0.021, 0.020) seconds, 0.041 seconds total -#> -#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat -#> lp__ -7.3 2.7e-02 7.3e-01 -8.9 -7.0 -6.8 738 17873 1.0e+00 -#> accept_stat__ 0.92 3.1e-03 1.3e-01 0.64 0.97 1.0 1670 40408 1.0e+00 -#> stepsize__ 0.92 1.7e-03 1.7e-03 0.92 0.92 0.92 1.0 24 1.4e+12 -#> treedepth__ 1.3 1.1e-02 4.7e-01 1.0 1.0 2.0 1968 47626 1.0e+00 -#> n_leapfrog__ 2.4 2.5e-02 1.0e+00 1.0 3.0 3.0 1640 39697 1.0e+00 -#> divergent__ 0.00 0.0e+00 0.0e+00 0.00 0.00 0.00 1000 24203 nan -#> energy__ 7.8 4.0e-02 1.0e+00 6.8 7.5 9.8 647 15654 1.0e+00 -#> theta 0.24 4.6e-03 1.2e-01 0.077 0.22 0.47 720 17429 1.0e+00 -#> -#> Samples were drawn using hmc with nuts. -#> For each parameter, N_Eff is a crude measure of effective sample size, -#> and R_hat is the potential scale reduction factor on split chains (at -#> convergence, R_hat=1). -#>-# Run optimization method (default is Stan's LBFGS algorithm) -# and also demonstrate specifying data as a path to a file (readable by CmdStan) -my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R") -fit_optim <- mod$optimize(data = my_data_file, seed = 123)#> Warning: Optimization method is experimental and the structure of returned object may change.#> method = optimize -#> optimize -#> algorithm = lbfgs (Default) -#> lbfgs -#> init_alpha = 0.001 (Default) -#> tol_obj = 9.9999999999999998e-13 (Default) -#> tol_rel_obj = 10000 (Default) -#> tol_grad = 1e-08 (Default) -#> tol_rel_grad = 10000000 (Default) -#> tol_param = 1e-08 (Default) -#> history_size = 5 (Default) -#> iter = 2000 (Default) -#> save_iterations = 0 (Default) -#> id = 1 -#> data -#> file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv -#> diagnostic_file = (Default) 
-#> refresh = 100 (Default) -#> -#> Initial log joint probability = -9.51104 -#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes -#> 6 -5.00402 0.000103557 2.55661e-07 1 1 9 -#> Optimization terminated normally: -#> Convergence detected: relative gradient magnitude is below tolerance#> Estimates from optimization:#> theta lp__ -#> 0.20000 -5.00402-# Run variational Bayes method (default is meanfield ADVI) -fit_vb <- mod$variational(data = standata, seed = 123)#> Warning: Variational inference method is experimental and the structure of returned object may change.#> method = variational -#> variational -#> algorithm = meanfield (Default) -#> meanfield -#> iter = 10000 (Default) -#> grad_samples = 1 (Default) -#> elbo_samples = 100 (Default) -#> eta = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> iter = 50 (Default) -#> tol_rel_obj = 0.01 (Default) -#> eval_elbo = 100 (Default) -#> output_samples = 1000 (Default) -#> id = 1 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b026c9de3df.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> ------------------------------------------------------------ -#> EXPERIMENTAL ALGORITHM: -#> This procedure has not been thoroughly tested and may be unstable -#> or buggy. The interface is subject to change. -#> ------------------------------------------------------------ -#> -#> -#> -#> Gradient evaluation took 2.1e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Begin eta adaptation. 
-#> Iteration: 1 / 250 [ 0%] (Adaptation) -#> Iteration: 50 / 250 [ 20%] (Adaptation) -#> Iteration: 100 / 250 [ 40%] (Adaptation) -#> Iteration: 150 / 250 [ 60%] (Adaptation) -#> Iteration: 200 / 250 [ 80%] (Adaptation) -#> Success! Found best value [eta = 1] earlier than expected. -#> -#> Begin stochastic gradient ascent. -#> iter ELBO delta_ELBO_mean delta_ELBO_med notes -#> 100 -6.258 1.000 1.000 -#> 200 -6.475 0.517 1.000 -#> 300 -6.228 0.358 0.040 -#> 400 -6.220 0.269 0.040 -#> 500 -6.379 0.220 0.034 -#> 600 -6.195 0.188 0.034 -#> 700 -6.262 0.163 0.030 -#> 800 -6.345 0.144 0.030 -#> 900 -6.201 0.131 0.025 -#> 1000 -6.307 0.119 0.025 -#> 1100 -6.290 0.020 0.023 -#> 1200 -6.238 0.017 0.017 -#> 1300 -6.182 0.014 0.013 -#> 1400 -6.167 0.014 0.013 -#> 1500 -6.219 0.012 0.011 -#> 1600 -6.164 0.010 0.009 MEDIAN ELBO CONVERGED -#> -#> Drawing a sample of size 1000 from the approximate posterior... -#> COMPLETED.#> Running bin/stansummary \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv -#> Warning: non-fatal error reading adapation data -#> Inference for Stan model: bernoulli_model -#> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved. -#> -#> Warmup took (0.00) seconds, 0.00 seconds total -#> Sampling took (0.00) seconds, 0.00 seconds total -#> -#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat -#> lp__ 0.00 0.0e+00 0.00 0.00 0.00 0.0e+00 500 inf nan -#> log_p__ -7.2 2.5e-02 0.72 -8.6 -7.0 -6.8e+00 789 inf 1.0e+00 -#> log_g__ -0.54 2.9e-02 0.76 -2.1 -0.27 -1.5e-03 679 inf 1.0e+00 -#> theta 0.26 4.2e-03 0.12 0.091 0.23 4.9e-01 823 inf 1.0e+00 -#> -#> Samples were drawn using meanfield with . -#> For each parameter, N_Eff is a crude measure of effective sample size, -#> and R_hat is the potential scale reduction factor on split chains (at -#> convergence, R_hat=1). 
-#>-# For models fit using MCMC, if you like working with RStan's stanfit objects -# then you can create one with rstan::read_stan_csv() -if (require(rstan, quietly = TRUE)) { - stanfit <- rstan::read_stan_csv(fit_mcmc$output_files()) - print(stanfit) -}#> Inference for Stan model: bernoulli-stan-sample-1. -#> 2 chains, each with iter=2000; warmup=1000; thin=1; -#> post-warmup draws per chain=1000, total post-warmup draws=2000. -#> -#> mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat -#> theta 0.24 0.00 0.12 0.06 0.14 0.22 0.32 0.52 720 1 -#> lp__ -7.32 0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75 737 1 -#> -#> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:30 2019. -#> For each parameter, n_eff is a crude measure of effective sample size, -#> and Rhat is the potential scale reduction factor on split chains (at -#> convergence, Rhat=1).-# } - -
CmdStanModel-method-sample.RdThe sample method of a CmdStanModel object runs the default
-MCMC algorithm in CmdStan (algorithm=hmc engine=nuts), to produce a set
-of draws from the posterior distribution of a model conditioned on some
-data.
$sample( - num_chains = 1, -# num_cores = NULL, # not yet available - data = NULL, - num_warmup = NULL, - num_samples = NULL, - save_warmup = FALSE, - thin = NULL, - refresh = NULL, - init = NULL, - seed = NULL, - max_depth = NULL, - metric = NULL, - stepsize = NULL, - adapt_engaged = NULL, - adapt_delta = NULL -) -- -
The following arguments can
-be specified for any of the fitting methods (sample, optimize,
-variational). Arguments left at NULL default to the default used by the
-installed version of CmdStan.
data (multiple options): The data to use:
A named list of R objects like for RStan;
A path to a data file compatible with CmdStan (R dump or JSON). See -the appendices in the CmdStan manual for details on using these -formats.
seed: (positive integer) A seed for the (P)RNG to pass to CmdStan.
refresh: (non-negative integer) The number of iterations between
-screen updates.
init: (multiple options) The initialization method:
A real number x>0 initializes randomly between [-x,x] (on the
-unconstrained parameter space);
0 initializes to 0;
A character vector of data file paths (one per chain) to -initialization files.
sample methodIn addition to the
-arguments above, the sample method also has its own set of arguments.
-These arguments are described briefly here and in greater detail in the
-CmdStan manual. Arguments left at NULL default to the default used by the
-installed version of CmdStan.
num_samples: (positive integer) The number of sampling iterations.
num_warmup: (positive integer) The number of warmup iterations.
save_warmup: (logical) Should warmup iterations also be streamed
-to the output?
thin: (positive integer) The period between saved samples. This should
-typically be left at its default (no thinning).
adapt_engaged: (logical) Do warmup adaptation?
adapt_delta: (real in (0,1)) The adaptation target acceptance
-statistic.
stepsize: (positive real) The initial step size for the discrete
-approximation to continuous Hamiltonian dynamics. This is further tuned
-during warmup.
metric: (character) The geometry of the base manifold. One of the
-following:
A single string from among "diag_e", "dense_e", "unit_e";
A character vector containing paths to files (one per chain)
-compatible with CmdStan that contain precomputed metrics.
-Each path must be to a JSON or Rdump file that contains an entry
-inv_metric whose value is either the diagonal vector or the full
-covariance matrix.
If you want to turn off adaptation when using a precomputed metric, set
-adapt_engaged=FALSE; otherwise it will use the precomputed metric just
-as an initial guess during adaptation. See the Euclidean Metric section
-of the CmdStan manual for more details on these options.
max_depth: (positive integer) The maximum allowed tree depth. See the
-Tree Depth section of the CmdStan manual for more details.
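The sampler arguments above might be combined as in the following sketch. It assumes `mod` and `standata` are the objects created in the Examples section and that CmdStan is installed; the values are arbitrary and not recommendations.

```r
# Sketch only: `mod` and `standata` come from the Examples section of this page.
fit_tuned <- mod$sample(
  data = standata,
  seed = 123,
  num_chains = 2,
  num_warmup = 500,      # warmup iterations per chain
  num_samples = 500,     # sampling iterations per chain
  adapt_delta = 0.95,    # target acceptance statistic, must lie in (0, 1)
  max_depth = 12,        # maximum allowed tree depth
  metric = "dense_e"     # one of "diag_e", "dense_e", "unit_e"
)
```

A higher `adapt_delta` generally leads to a smaller adapted step size, which is why it is often raised together with `max_depth`.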
The sample method returns a CmdStanMCMC object.
The CmdStanR website (mc-stan.org/cmdstanr) -for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan doc (html or pdf): mc-stan.org/users/documentation/
CmdStan doc (pdf): (github.com/stan-dev/cmdstan/releases/).
Other CmdStanModel methods: CmdStanModel-method-compile,
- CmdStanModel-method-optimize,
- CmdStanModel-method-variational
-# \dontrun{ -# Set path to cmdstan -# Note: if you installed CmdStan via install_cmdstan() with default settings -# then default below should work. Otherwise use the `path` argument to -# specify the location of your CmdStan installation. - -set_cmdstan_path(path = NULL)#>-# Create a CmdStan model object from a Stan program, -# here using the example model that comes with CmdStan -stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan") -mod <- cmdstan_model(stan_program) -mod$print()#> data { -#> int<lower=0> N; -#> int<lower=0,upper=1> y[N]; -#> } -#> parameters { -#> real<lower=0,upper=1> theta; -#> } -#> model { -#> theta ~ beta(1,1); -#> for (n in 1:N) -#> y[n] ~ bernoulli(theta); -#> }-# Compile to create executable -mod$compile()#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli -#> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.-# Run sample method (MCMC via Stan's dynamic HMC/NUTS), -# specifying data as a named list (like RStan) -standata <- list(N = 10, y =c(0,1,0,0,0,0,0,0,0,1)) -fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)#> method = sample (Default) -#> sample -#> num_samples = 1000 (Default) -#> num_warmup = 1000 (Default) -#> save_warmup = 0 (Default) -#> thin = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> gamma = 0.050000000000000003 (Default) -#> delta = 0.80000000000000004 (Default) -#> kappa = 0.75 (Default) -#> t0 = 10 (Default) -#> init_buffer = 75 (Default) -#> term_buffer = 50 (Default) -#> window = 25 (Default) -#> algorithm = hmc (Default) -#> hmc -#> engine = nuts (Default) -#> nuts -#> max_depth = 10 (Default) -#> metric = diag_e (Default) -#> metric_file = (Default) -#> stepsize = 1 (Default) -#> stepsize_jitter = 0 (Default) -#> id = 1 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02199f3176.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = 
/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> -#> Gradient evaluation took 1.7e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Iteration: 1 / 2000 [ 0%] (Warmup) -#> Iteration: 100 / 2000 [ 5%] (Warmup) -#> Iteration: 200 / 2000 [ 10%] (Warmup) -#> Iteration: 300 / 2000 [ 15%] (Warmup) -#> Iteration: 400 / 2000 [ 20%] (Warmup) -#> Iteration: 500 / 2000 [ 25%] (Warmup) -#> Iteration: 600 / 2000 [ 30%] (Warmup) -#> Iteration: 700 / 2000 [ 35%] (Warmup) -#> Iteration: 800 / 2000 [ 40%] (Warmup) -#> Iteration: 900 / 2000 [ 45%] (Warmup) -#> Iteration: 1000 / 2000 [ 50%] (Warmup) -#> Iteration: 1001 / 2000 [ 50%] (Sampling) -#> Iteration: 1100 / 2000 [ 55%] (Sampling) -#> Iteration: 1200 / 2000 [ 60%] (Sampling) -#> Iteration: 1300 / 2000 [ 65%] (Sampling) -#> Iteration: 1400 / 2000 [ 70%] (Sampling) -#> Iteration: 1500 / 2000 [ 75%] (Sampling) -#> Iteration: 1600 / 2000 [ 80%] (Sampling) -#> Iteration: 1700 / 2000 [ 85%] (Sampling) -#> Iteration: 1800 / 2000 [ 90%] (Sampling) -#> Iteration: 1900 / 2000 [ 95%] (Sampling) -#> Iteration: 2000 / 2000 [100%] (Sampling) -#> -#> Elapsed Time: 0.012149 seconds (Warm-up) -#> 0.018859 seconds (Sampling) -#> 0.031008 seconds (Total) -#> -#> method = sample (Default) -#> sample -#> num_samples = 1000 (Default) -#> num_warmup = 1000 (Default) -#> save_warmup = 0 (Default) -#> thin = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> gamma = 0.050000000000000003 (Default) -#> delta = 0.80000000000000004 (Default) -#> kappa = 0.75 (Default) -#> t0 = 10 (Default) -#> init_buffer = 75 (Default) -#> term_buffer = 50 (Default) -#> window = 25 (Default) -#> algorithm = hmc (Default) -#> hmc -#> engine = nuts (Default) -#> nuts -#> max_depth = 10 (Default) -#> metric = diag_e (Default) -#> metric_file = (Default) -#> stepsize = 
1 (Default) -#> stepsize_jitter = 0 (Default) -#> id = 2 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02199f3176.data.R -#> init = 2 (Default) -#> random -#> seed = 124 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> -#> Gradient evaluation took 1.8e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Iteration: 1 / 2000 [ 0%] (Warmup) -#> Iteration: 100 / 2000 [ 5%] (Warmup) -#> Iteration: 200 / 2000 [ 10%] (Warmup) -#> Iteration: 300 / 2000 [ 15%] (Warmup) -#> Iteration: 400 / 2000 [ 20%] (Warmup) -#> Iteration: 500 / 2000 [ 25%] (Warmup) -#> Iteration: 600 / 2000 [ 30%] (Warmup) -#> Iteration: 700 / 2000 [ 35%] (Warmup) -#> Iteration: 800 / 2000 [ 40%] (Warmup) -#> Iteration: 900 / 2000 [ 45%] (Warmup) -#> Iteration: 1000 / 2000 [ 50%] (Warmup) -#> Iteration: 1001 / 2000 [ 50%] (Sampling) -#> Iteration: 1100 / 2000 [ 55%] (Sampling) -#> Iteration: 1200 / 2000 [ 60%] (Sampling) -#> Iteration: 1300 / 2000 [ 65%] (Sampling) -#> Iteration: 1400 / 2000 [ 70%] (Sampling) -#> Iteration: 1500 / 2000 [ 75%] (Sampling) -#> Iteration: 1600 / 2000 [ 80%] (Sampling) -#> Iteration: 1700 / 2000 [ 85%] (Sampling) -#> Iteration: 1800 / 2000 [ 90%] (Sampling) -#> Iteration: 1900 / 2000 [ 95%] (Sampling) -#> Iteration: 2000 / 2000 [100%] (Sampling) -#> -#> Elapsed Time: 0.011733 seconds (Warm-up) -#> 0.019533 seconds (Sampling) -#> 0.031266 seconds (Total) -#>#> Running bin/stansummary \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv -#> Inference for Stan model: bernoulli_model -#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved. 
-#> -#> Warmup took (0.012, 0.012) seconds, 0.024 seconds total -#> Sampling took (0.019, 0.020) seconds, 0.038 seconds total -#> -#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat -#> lp__ -7.3 2.7e-02 7.3e-01 -8.9 -7.0 -6.8 738 19235 1.0e+00 -#> accept_stat__ 0.92 3.1e-03 1.3e-01 0.64 0.97 1.0 1670 43487 1.0e+00 -#> stepsize__ 0.92 1.7e-03 1.7e-03 0.92 0.92 0.92 1.0 26 1.4e+12 -#> treedepth__ 1.3 1.1e-02 4.7e-01 1.0 1.0 2.0 1968 51255 1.0e+00 -#> n_leapfrog__ 2.4 2.5e-02 1.0e+00 1.0 3.0 3.0 1640 42721 1.0e+00 -#> divergent__ 0.00 0.0e+00 0.0e+00 0.00 0.00 0.00 1000 26047 nan -#> energy__ 7.8 4.0e-02 1.0e+00 6.8 7.5 9.8 647 16846 1.0e+00 -#> theta 0.24 4.6e-03 1.2e-01 0.077 0.22 0.47 720 18757 1.0e+00 -#> -#> Samples were drawn using hmc with nuts. -#> For each parameter, N_Eff is a crude measure of effective sample size, -#> and R_hat is the potential scale reduction factor on split chains (at -#> convergence, R_hat=1). -#>-# Run optimization method (default is Stan's LBFGS algorithm) -# and also demonstrate specifying data as a path to a file (readable by CmdStan) -my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R") -fit_optim <- mod$optimize(data = my_data_file, seed = 123)#> Warning: Optimization method is experimental and the structure of returned object may change.#> method = optimize -#> optimize -#> algorithm = lbfgs (Default) -#> lbfgs -#> init_alpha = 0.001 (Default) -#> tol_obj = 9.9999999999999998e-13 (Default) -#> tol_rel_obj = 10000 (Default) -#> tol_grad = 1e-08 (Default) -#> tol_rel_grad = 10000000 (Default) -#> tol_param = 1e-08 (Default) -#> history_size = 5 (Default) -#> iter = 2000 (Default) -#> save_iterations = 0 (Default) -#> id = 1 -#> data -#> file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv -#> diagnostic_file = (Default) 
-#> refresh = 100 (Default) -#> -#> Initial log joint probability = -9.51104 -#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes -#> 6 -5.00402 0.000103557 2.55661e-07 1 1 9 -#> Optimization terminated normally: -#> Convergence detected: relative gradient magnitude is below tolerance#> Estimates from optimization:#> theta lp__ -#> 0.20000 -5.00402-# Run variational Bayes method (default is meanfield ADVI) -fit_vb <- mod$variational(data = standata, seed = 123)#> Warning: Variational inference method is experimental and the structure of returned object may change.#> method = variational -#> variational -#> algorithm = meanfield (Default) -#> meanfield -#> iter = 10000 (Default) -#> grad_samples = 1 (Default) -#> elbo_samples = 100 (Default) -#> eta = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> iter = 50 (Default) -#> tol_rel_obj = 0.01 (Default) -#> eval_elbo = 100 (Default) -#> output_samples = 1000 (Default) -#> id = 1 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022268471e.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> ------------------------------------------------------------ -#> EXPERIMENTAL ALGORITHM: -#> This procedure has not been thoroughly tested and may be unstable -#> or buggy. The interface is subject to change. -#> ------------------------------------------------------------ -#> -#> -#> -#> Gradient evaluation took 2.1e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Begin eta adaptation. 
-#> Iteration: 1 / 250 [ 0%] (Adaptation) -#> Iteration: 50 / 250 [ 20%] (Adaptation) -#> Iteration: 100 / 250 [ 40%] (Adaptation) -#> Iteration: 150 / 250 [ 60%] (Adaptation) -#> Iteration: 200 / 250 [ 80%] (Adaptation) -#> Success! Found best value [eta = 1] earlier than expected. -#> -#> Begin stochastic gradient ascent. -#> iter ELBO delta_ELBO_mean delta_ELBO_med notes -#> 100 -6.258 1.000 1.000 -#> 200 -6.475 0.517 1.000 -#> 300 -6.228 0.358 0.040 -#> 400 -6.220 0.269 0.040 -#> 500 -6.379 0.220 0.034 -#> 600 -6.195 0.188 0.034 -#> 700 -6.262 0.163 0.030 -#> 800 -6.345 0.144 0.030 -#> 900 -6.201 0.131 0.025 -#> 1000 -6.307 0.119 0.025 -#> 1100 -6.290 0.020 0.023 -#> 1200 -6.238 0.017 0.017 -#> 1300 -6.182 0.014 0.013 -#> 1400 -6.167 0.014 0.013 -#> 1500 -6.219 0.012 0.011 -#> 1600 -6.164 0.010 0.009 MEDIAN ELBO CONVERGED -#> -#> Drawing a sample of size 1000 from the approximate posterior... -#> COMPLETED.#> Running bin/stansummary \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv -#> Warning: non-fatal error reading adapation data -#> Inference for Stan model: bernoulli_model -#> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved. -#> -#> Warmup took (0.00) seconds, 0.00 seconds total -#> Sampling took (0.00) seconds, 0.00 seconds total -#> -#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat -#> lp__ 0.00 0.0e+00 0.00 0.00 0.00 0.0e+00 500 inf nan -#> log_p__ -7.2 2.5e-02 0.72 -8.6 -7.0 -6.8e+00 789 inf 1.0e+00 -#> log_g__ -0.54 2.9e-02 0.76 -2.1 -0.27 -1.5e-03 679 inf 1.0e+00 -#> theta 0.26 4.2e-03 0.12 0.091 0.23 4.9e-01 823 inf 1.0e+00 -#> -#> Samples were drawn using meanfield with . -#> For each parameter, N_Eff is a crude measure of effective sample size, -#> and R_hat is the potential scale reduction factor on split chains (at -#> convergence, R_hat=1). 
-#>-# For models fit using MCMC, if you like working with RStan's stanfit objects -# then you can create one with rstan::read_stan_csv() -if (require(rstan, quietly = TRUE)) { - stanfit <- rstan::read_stan_csv(fit_mcmc$output_files()) - print(stanfit) -}#> Inference for Stan model: bernoulli-stan-sample-1. -#> 2 chains, each with iter=2000; warmup=1000; thin=1; -#> post-warmup draws per chain=1000, total post-warmup draws=2000. -#> -#> mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat -#> theta 0.24 0.00 0.12 0.06 0.14 0.22 0.32 0.52 720 1 -#> lp__ -7.32 0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75 737 1 -#> -#> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:31 2019. -#> For each parameter, n_eff is a crude measure of effective sample size, -#> and Rhat is the potential scale reduction factor on split chains (at -#> convergence, Rhat=1).-# } - -
CmdStanModel-method-variational.Rd
The variational method of a CmdStanModel object runs
-Stan's variational Bayes (ADVI) algorithms.
CmdStan can fit a variational approximation to the posterior. The
-approximation is a Gaussian in the unconstrained variable space. Stan
-implements two variational algorithms. The algorithm="meanfield" option
-uses a fully factorized Gaussian for the approximation. The
-algorithm="fullrank" option uses a Gaussian with a full-rank covariance
-matrix for the approximation.
-- CmdStan Interface User's Guide
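The difference between the two Gaussian families quoted above can be sketched in base R. This is only an illustration on a hypothetical two-dimensional unconstrained space: the means, scales, and covariance below are made-up values, and none of the variable names come from cmdstanr or CmdStan. It is not how Stan fits the approximation, only what the two fitted families look like once their parameters are known.

```r
# Illustrative sketch only: all values and names here are assumptions.
set.seed(123)
n_draws <- 1000
mu <- c(0, 1)                       # assumed mean of the approximation

# Standard normal draws shared by both families
z <- matrix(rnorm(n_draws * 2), ncol = 2)

# "meanfield": fully factorized Gaussian, i.e. a diagonal covariance matrix
sd_mf <- c(1, 0.5)                  # assumed marginal standard deviations
draws_meanfield <- sweep(sweep(z, 2, sd_mf, `*`), 2, mu, `+`)

# "fullrank": Gaussian with a dense covariance matrix,
# sampled through its lower-triangular Cholesky factor
Sigma <- matrix(c(1, 0.4, 0.4, 0.25), nrow = 2)  # assumed covariance
L <- t(chol(Sigma))                 # Sigma = L %*% t(L)
draws_fullrank <- sweep(z %*% t(L), 2, mu, `+`)

# The factorized family cannot represent correlation between dimensions;
# the full-rank family can.
cor(draws_meanfield)[1, 2]
cor(draws_fullrank)[1, 2]
```

The full-rank family needs on the order of d^2 covariance parameters for d unconstrained dimensions, whereas the factorized family needs only d scales, which is the usual reason to start with `algorithm = "meanfield"`.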
$variational(
-  data = NULL,
-  seed = NULL,
-  refresh = NULL,
-  init = NULL,
-  algorithm = NULL,
-  iter = NULL,
-  grad_samples = NULL,
-  elbo_samples = NULL,
-  eta = NULL,
-  adapt_engaged = NULL,
-  adapt_iter = NULL,
-  tol_rel_obj = NULL,
-  eval_elbo = NULL,
-  output_samples = NULL
-)
-
The following arguments can
-be specified for any of the fitting methods (sample, optimize,
-variational). Arguments left at NULL default to the default used by the
-installed version of CmdStan.
data (multiple options): The data to use:
A named list of R objects like for RStan;
A path to a data file compatible with CmdStan (R dump or JSON). See
-the appendices in the CmdStan manual for details on using these
-formats.
seed: (positive integer) A seed for the (P)RNG to pass to CmdStan.
refresh: (non-negative integer) The number of iterations between
-screen updates.
init: (multiple options) The initialization method:
A real number x>0 initializes randomly between [-x,x] (on the
-unconstrained parameter space);
0 initializes to 0;
A character vector of data file paths (one per chain) to
-initialization files.
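As a rough sketch of what the numeric `init` options above mean for the Bernoulli example's `theta` (constrained to the interval (0, 1)), the base R snippet below draws a value uniformly on [-x, x] on the unconstrained scale and maps it back with the inverse logit. The helper `init_theta()` is hypothetical and is not part of cmdstanr or CmdStan; it assumes the logit transform that Stan applies to a parameter with bounds 0 and 1.

```r
# Hypothetical helper for illustration; not a cmdstanr function.
# x > 0: draw uniformly on [-x, x] on the unconstrained scale.
# x = 0: initialize the unconstrained value to exactly 0.
init_theta <- function(x) {
  stopifnot(is.numeric(x), length(x) == 1, x >= 0)
  u <- if (x == 0) 0 else runif(1, min = -x, max = x)
  plogis(u)  # inverse logit maps the real line back to (0, 1)
}

set.seed(1)
init_theta(2)  # a random value between plogis(-2) and plogis(2)
init_theta(0)  # 0.5, since plogis(0) is 0.5
```

On the constrained scale, `init = 2` therefore starts `theta` somewhere between roughly 0.12 and 0.88, while `init = 0` starts it at 0.5.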
variational method
In addition to the
-arguments above, the variational method also has its own set of
-arguments. These arguments are described briefly here and in greater detail
-in the CmdStan manual. Arguments left at NULL default to the default used
-by the installed version of CmdStan.
algorithm: (string) The algorithm. Either "meanfield" or "fullrank".
iter: (positive integer) The maximum number of iterations.
grad_samples: (positive integer) The number of samples for Monte Carlo
-estimate of gradients.
elbo_samples: (positive integer) The number of samples for Monte Carlo
-estimate of ELBO (objective function).
eta: (positive real) The stepsize weighting parameter for adaptive
-stepsize sequence.
adapt_engaged: (logical) Do warmup adaptation?
adapt_iter: (positive integer) The maximum number of adaptation
-iterations.
tol_rel_obj: (positive real) Convergence tolerance on the relative norm
-of the objective.
eval_elbo: (positive integer) Evaluate ELBO every Nth iteration.
output_samples: (positive integer) Number of posterior samples to
-draw and save.
The variational method returns a CmdStanVB object.
The CmdStanR website (mc-stan.org/cmdstanr)
-for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan doc (html or pdf): mc-stan.org/users/documentation/
CmdStan doc (pdf): (github.com/stan-dev/cmdstan/releases/).
Other CmdStanModel methods: CmdStanModel-method-compile,
- CmdStanModel-method-optimize,
- CmdStanModel-method-sample
-# \dontrun{ -# Set path to cmdstan -# Note: if you installed CmdStan via install_cmdstan() with default settings -# then default below should work. Otherwise use the `path` argument to -# specify the location of your CmdStan installation. - -set_cmdstan_path(path = NULL)#>-# Create a CmdStan model object from a Stan program, -# here using the example model that comes with CmdStan -stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan") -mod <- cmdstan_model(stan_program) -mod$print()#> data { -#> int<lower=0> N; -#> int<lower=0,upper=1> y[N]; -#> } -#> parameters { -#> real<lower=0,upper=1> theta; -#> } -#> model { -#> theta ~ beta(1,1); -#> for (n in 1:N) -#> y[n] ~ bernoulli(theta); -#> }-# Compile to create executable -mod$compile()#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli -#> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.-# Run sample method (MCMC via Stan's dynamic HMC/NUTS), -# specifying data as a named list (like RStan) -standata <- list(N = 10, y =c(0,1,0,0,0,0,0,0,0,1)) -fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)#> method = sample (Default) -#> sample -#> num_samples = 1000 (Default) -#> num_warmup = 1000 (Default) -#> save_warmup = 0 (Default) -#> thin = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> gamma = 0.050000000000000003 (Default) -#> delta = 0.80000000000000004 (Default) -#> kappa = 0.75 (Default) -#> t0 = 10 (Default) -#> init_buffer = 75 (Default) -#> term_buffer = 50 (Default) -#> window = 25 (Default) -#> algorithm = hmc (Default) -#> hmc -#> engine = nuts (Default) -#> nuts -#> max_depth = 10 (Default) -#> metric = diag_e (Default) -#> metric_file = (Default) -#> stepsize = 1 (Default) -#> stepsize_jitter = 0 (Default) -#> id = 1 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022add5243.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = 
/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> -#> Gradient evaluation took 2e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Iteration: 1 / 2000 [ 0%] (Warmup) -#> Iteration: 100 / 2000 [ 5%] (Warmup) -#> Iteration: 200 / 2000 [ 10%] (Warmup) -#> Iteration: 300 / 2000 [ 15%] (Warmup) -#> Iteration: 400 / 2000 [ 20%] (Warmup) -#> Iteration: 500 / 2000 [ 25%] (Warmup) -#> Iteration: 600 / 2000 [ 30%] (Warmup) -#> Iteration: 700 / 2000 [ 35%] (Warmup) -#> Iteration: 800 / 2000 [ 40%] (Warmup) -#> Iteration: 900 / 2000 [ 45%] (Warmup) -#> Iteration: 1000 / 2000 [ 50%] (Warmup) -#> Iteration: 1001 / 2000 [ 50%] (Sampling) -#> Iteration: 1100 / 2000 [ 55%] (Sampling) -#> Iteration: 1200 / 2000 [ 60%] (Sampling) -#> Iteration: 1300 / 2000 [ 65%] (Sampling) -#> Iteration: 1400 / 2000 [ 70%] (Sampling) -#> Iteration: 1500 / 2000 [ 75%] (Sampling) -#> Iteration: 1600 / 2000 [ 80%] (Sampling) -#> Iteration: 1700 / 2000 [ 85%] (Sampling) -#> Iteration: 1800 / 2000 [ 90%] (Sampling) -#> Iteration: 1900 / 2000 [ 95%] (Sampling) -#> Iteration: 2000 / 2000 [100%] (Sampling) -#> -#> Elapsed Time: 0.014348 seconds (Warm-up) -#> 0.021322 seconds (Sampling) -#> 0.03567 seconds (Total) -#> -#> method = sample (Default) -#> sample -#> num_samples = 1000 (Default) -#> num_warmup = 1000 (Default) -#> save_warmup = 0 (Default) -#> thin = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> gamma = 0.050000000000000003 (Default) -#> delta = 0.80000000000000004 (Default) -#> kappa = 0.75 (Default) -#> t0 = 10 (Default) -#> init_buffer = 75 (Default) -#> term_buffer = 50 (Default) -#> window = 25 (Default) -#> algorithm = hmc (Default) -#> hmc -#> engine = nuts (Default) -#> nuts -#> max_depth = 10 (Default) -#> metric = diag_e (Default) -#> metric_file = (Default) -#> stepsize = 1 
(Default) -#> stepsize_jitter = 0 (Default) -#> id = 2 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022add5243.data.R -#> init = 2 (Default) -#> random -#> seed = 124 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> -#> Gradient evaluation took 1.9e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Iteration: 1 / 2000 [ 0%] (Warmup) -#> Iteration: 100 / 2000 [ 5%] (Warmup) -#> Iteration: 200 / 2000 [ 10%] (Warmup) -#> Iteration: 300 / 2000 [ 15%] (Warmup) -#> Iteration: 400 / 2000 [ 20%] (Warmup) -#> Iteration: 500 / 2000 [ 25%] (Warmup) -#> Iteration: 600 / 2000 [ 30%] (Warmup) -#> Iteration: 700 / 2000 [ 35%] (Warmup) -#> Iteration: 800 / 2000 [ 40%] (Warmup) -#> Iteration: 900 / 2000 [ 45%] (Warmup) -#> Iteration: 1000 / 2000 [ 50%] (Warmup) -#> Iteration: 1001 / 2000 [ 50%] (Sampling) -#> Iteration: 1100 / 2000 [ 55%] (Sampling) -#> Iteration: 1200 / 2000 [ 60%] (Sampling) -#> Iteration: 1300 / 2000 [ 65%] (Sampling) -#> Iteration: 1400 / 2000 [ 70%] (Sampling) -#> Iteration: 1500 / 2000 [ 75%] (Sampling) -#> Iteration: 1600 / 2000 [ 80%] (Sampling) -#> Iteration: 1700 / 2000 [ 85%] (Sampling) -#> Iteration: 1800 / 2000 [ 90%] (Sampling) -#> Iteration: 1900 / 2000 [ 95%] (Sampling) -#> Iteration: 2000 / 2000 [100%] (Sampling) -#> -#> Elapsed Time: 0.01225 seconds (Warm-up) -#> 0.019663 seconds (Sampling) -#> 0.031913 seconds (Total) -#>#> Running bin/stansummary \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv -#> Inference for Stan model: bernoulli_model -#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved. 
-#> -#> Warmup took (0.014, 0.012) seconds, 0.027 seconds total -#> Sampling took (0.021, 0.020) seconds, 0.041 seconds total -#> -#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat -#> lp__ -7.3 2.7e-02 7.3e-01 -8.9 -7.0 -6.8 738 18018 1.0e+00 -#> accept_stat__ 0.92 3.1e-03 1.3e-01 0.64 0.97 1.0 1670 40735 1.0e+00 -#> stepsize__ 0.92 1.7e-03 1.7e-03 0.92 0.92 0.92 1.0 24 1.4e+12 -#> treedepth__ 1.3 1.1e-02 4.7e-01 1.0 1.0 2.0 1968 48012 1.0e+00 -#> n_leapfrog__ 2.4 2.5e-02 1.0e+00 1.0 3.0 3.0 1640 40018 1.0e+00 -#> divergent__ 0.00 0.0e+00 0.0e+00 0.00 0.00 0.00 1000 24399 nan -#> energy__ 7.8 4.0e-02 1.0e+00 6.8 7.5 9.8 647 15780 1.0e+00 -#> theta 0.24 4.6e-03 1.2e-01 0.077 0.22 0.47 720 17570 1.0e+00 -#> -#> Samples were drawn using hmc with nuts. -#> For each parameter, N_Eff is a crude measure of effective sample size, -#> and R_hat is the potential scale reduction factor on split chains (at -#> convergence, R_hat=1). -#>-# Run optimization method (default is Stan's LBFGS algorithm) -# and also demonstrate specifying data as a path to a file (readable by CmdStan) -my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R") -fit_optim <- mod$optimize(data = my_data_file, seed = 123)#> Warning: Optimization method is experimental and the structure of returned object may change.#> method = optimize -#> optimize -#> algorithm = lbfgs (Default) -#> lbfgs -#> init_alpha = 0.001 (Default) -#> tol_obj = 9.9999999999999998e-13 (Default) -#> tol_rel_obj = 10000 (Default) -#> tol_grad = 1e-08 (Default) -#> tol_rel_grad = 10000000 (Default) -#> tol_param = 1e-08 (Default) -#> history_size = 5 (Default) -#> iter = 2000 (Default) -#> save_iterations = 0 (Default) -#> id = 1 -#> data -#> file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv -#> diagnostic_file = (Default) 
-#> refresh = 100 (Default) -#> -#> Initial log joint probability = -9.51104 -#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes -#> 6 -5.00402 0.000103557 2.55661e-07 1 1 9 -#> Optimization terminated normally: -#> Convergence detected: relative gradient magnitude is below tolerance#> Estimates from optimization:#> theta lp__ -#> 0.20000 -5.00402-# Run variational Bayes method (default is meanfield ADVI) -fit_vb <- mod$variational(data = standata, seed = 123)#> Warning: Variational inference method is experimental and the structure of returned object may change.#> method = variational -#> variational -#> algorithm = meanfield (Default) -#> meanfield -#> iter = 10000 (Default) -#> grad_samples = 1 (Default) -#> elbo_samples = 100 (Default) -#> eta = 1 (Default) -#> adapt -#> engaged = 1 (Default) -#> iter = 50 (Default) -#> tol_rel_obj = 0.01 (Default) -#> eval_elbo = 100 (Default) -#> output_samples = 1000 (Default) -#> id = 1 -#> data -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022843c2b1.data.R -#> init = 2 (Default) -#> random -#> seed = 123 -#> output -#> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv -#> diagnostic_file = (Default) -#> refresh = 100 (Default) -#> -#> ------------------------------------------------------------ -#> EXPERIMENTAL ALGORITHM: -#> This procedure has not been thoroughly tested and may be unstable -#> or buggy. The interface is subject to change. -#> ------------------------------------------------------------ -#> -#> -#> -#> Gradient evaluation took 2.1e-05 seconds -#> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds. -#> Adjust your expectations accordingly! -#> -#> -#> Begin eta adaptation. 
-#> Iteration: 1 / 250 [ 0%] (Adaptation) -#> Iteration: 50 / 250 [ 20%] (Adaptation) -#> Iteration: 100 / 250 [ 40%] (Adaptation) -#> Iteration: 150 / 250 [ 60%] (Adaptation) -#> Iteration: 200 / 250 [ 80%] (Adaptation) -#> Success! Found best value [eta = 1] earlier than expected. -#> -#> Begin stochastic gradient ascent. -#> iter ELBO delta_ELBO_mean delta_ELBO_med notes -#> 100 -6.258 1.000 1.000 -#> 200 -6.475 0.517 1.000 -#> 300 -6.228 0.358 0.040 -#> 400 -6.220 0.269 0.040 -#> 500 -6.379 0.220 0.034 -#> 600 -6.195 0.188 0.034 -#> 700 -6.262 0.163 0.030 -#> 800 -6.345 0.144 0.030 -#> 900 -6.201 0.131 0.025 -#> 1000 -6.307 0.119 0.025 -#> 1100 -6.290 0.020 0.023 -#> 1200 -6.238 0.017 0.017 -#> 1300 -6.182 0.014 0.013 -#> 1400 -6.167 0.014 0.013 -#> 1500 -6.219 0.012 0.011 -#> 1600 -6.164 0.010 0.009 MEDIAN ELBO CONVERGED -#> -#> Drawing a sample of size 1000 from the approximate posterior... -#> COMPLETED.#> Running bin/stansummary \ -#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv -#> Warning: non-fatal error reading adapation data -#> Inference for Stan model: bernoulli_model -#> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved. -#> -#> Warmup took (0.00) seconds, 0.00 seconds total -#> Sampling took (0.00) seconds, 0.00 seconds total -#> -#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat -#> lp__ 0.00 0.0e+00 0.00 0.00 0.00 0.0e+00 500 inf nan -#> log_p__ -7.2 2.5e-02 0.72 -8.6 -7.0 -6.8e+00 789 inf 1.0e+00 -#> log_g__ -0.54 2.9e-02 0.76 -2.1 -0.27 -1.5e-03 679 inf 1.0e+00 -#> theta 0.26 4.2e-03 0.12 0.091 0.23 4.9e-01 823 inf 1.0e+00 -#> -#> Samples were drawn using meanfield with . -#> For each parameter, N_Eff is a crude measure of effective sample size, -#> and R_hat is the potential scale reduction factor on split chains (at -#> convergence, R_hat=1). 
-#>-# For models fit using MCMC, if you like working with RStan's stanfit objects -# then you can create one with rstan::read_stan_csv() -if (require(rstan, quietly = TRUE)) { - stanfit <- rstan::read_stan_csv(fit_mcmc$output_files()) - print(stanfit) -}#> Inference for Stan model: bernoulli-stan-sample-1. -#> 2 chains, each with iter=2000; warmup=1000; thin=1; -#> post-warmup draws per chain=1000, total post-warmup draws=2000. -#> -#> mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat -#> theta 0.24 0.00 0.12 0.06 0.14 0.22 0.32 0.52 720 1 -#> lp__ -7.32 0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75 737 1 -#> -#> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:32 2019. -#> For each parameter, n_eff is a crude measure of effective sample size, -#> and Rhat is the potential scale reduction factor on split chains (at -#> convergence, Rhat=1).-# } - -
A CmdStanModel object is an R6 object created
-by the cmdstan_model() function. The object stores the path to a Stan
-program and compiled executable (once created), and provides methods for
-fitting the model using Stan's algorithms.
CmdStanModel objects have the following associated
-methods, many of which have their own (linked) documentation pages:
| Method | Description |
$stan_file() | Return the file path to the Stan program. |
$code() | Return Stan program as a character vector. |
$print() | Print readable version of Stan program. |
$check_syntax() | Check Stan syntax without having to compile. |
$format() | Format and canonicalize the Stan model code. |
| Method | Description |
$compile() | Compile Stan program. |
$exe_file() | Return the file path to the compiled executable. |
$hpp_file() | Return the file path to the .hpp file containing the generated C++ code. |
$save_hpp_file() | Save the .hpp file containing the generated C++ code. |
$expose_functions() | Expose Stan functions for use in R. |
| Method | Description |
$diagnose() | Run CmdStan's "diagnose" method to test gradients, return CmdStanDiagnose object. |
| Method | Description |
$sample() | Run CmdStan's "sample" method, return CmdStanMCMC object. |
$sample_mpi() | Run CmdStan's "sample" method with MPI, return CmdStanMCMC object. |
$optimize() | Run CmdStan's "optimize" method, return CmdStanMLE object. |
$variational() | Run CmdStan's "variational" method, return CmdStanVB object. |
$pathfinder() | Run CmdStan's "pathfinder" method, return CmdStanPathfinder object. |
$generate_quantities() | Run CmdStan's "generate quantities" method, return CmdStanGQ object. |
The CmdStanR website
-(mc-stan.org/cmdstanr) for online
-documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-#> This is bayesplot version 1.11.1.9000
-#> - Online documentation and vignettes at mc-stan.org/bayesplot
-#> - bayesplot theme set to bayesplot::theme_default()
-#> * Does _not_ affect other ggplot2 plots
-#> * See ?bayesplot_theme_set for details on theme setting
-#>
-#> Attaching package: ‘bayesplot’
-#> The following object is masked from ‘package:posterior’:
-#>
-#> rhat
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -6.93289
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 4 -6.74802 0.00149466 1.90231e-05 1 1 7
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.1 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.24 -6.96 0.738 0.292 -8.68 -6.75
-#> 2 lp_approx__ -0.494 -0.213 0.716 0.294 -2.00 -0.00155
-#> 3 theta 0.268 0.245 0.124 0.116 0.104 0.509
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 1.1e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.1 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.1 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -7.264402
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.293e-04 4.141e-07 1.000e+00 1.000e+00 126 -6.296e+00 -6.296e+00
-#> Path [1] :Best Iter: [4] ELBO (-6.235406) evaluations: (126)
-#> Path [2] :Initial log joint density = -11.117345
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 9.672e-04 1.459e-05 1.000e+00 1.000e+00 126 -6.259e+00 -6.259e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.180823) evaluations: (126)
-#> Path [3] :Initial log joint density = -7.495731
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.114e-04 4.492e-07 1.000e+00 1.000e+00 126 -6.259e+00 -6.259e+00
-#> Path [3] :Best Iter: [2] ELBO (-6.245533) evaluations: (126)
-#> Path [4] :Initial log joint density = -7.770449
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.750e-04 9.341e-07 1.000e+00 1.000e+00 126 -6.225e+00 -6.225e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.225361) evaluations: (126)
-#> Path [5] :Initial log joint density = -14.218076
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.997e-03 5.573e-05 1.000e+00 1.000e+00 126 -6.216e+00 -6.216e+00
-#> Path [5] :Best Iter: [5] ELBO (-6.216169) evaluations: (126)
-#> Path [6] :Initial log joint density = -7.472192
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.059e-04 4.145e-07 1.000e+00 1.000e+00 126 -6.137e+00 -6.137e+00
-#> Path [6] :Best Iter: [5] ELBO (-6.137426) evaluations: (126)
-#> Path [7] :Initial log joint density = -8.723559
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.317e-04 2.693e-06 1.000e+00 1.000e+00 126 -6.210e+00 -6.210e+00
-#> Path [7] :Best Iter: [3] ELBO (-6.175877) evaluations: (126)
-#> Path [8] :Initial log joint density = -9.460464
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 4.598e-04 4.575e-06 1.000e+00 1.000e+00 126 -6.241e+00 -6.241e+00
-#> Path [8] :Best Iter: [2] ELBO (-6.228285) evaluations: (126)
-#> Path [9] :Initial log joint density = -18.781825
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 4.081e-04 6.355e-06 1.000e+00 1.000e+00 126 -6.256e+00 -6.256e+00
-#> Path [9] :Best Iter: [2] ELBO (-6.178423) evaluations: (126)
-#> Path [10] :Initial log joint density = -7.492978
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 8.830e-04 2.382e-06 1.000e+00 1.000e+00 126 -6.191e+00 -6.191e+00
-#> Path [10] :Best Iter: [5] ELBO (-6.191391) evaluations: (126)
-#> Total log probability function evaluations:1410
-#> Finished in 0.1 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
-A CmdStanPathfinder object is the fitted model object returned by the
-$pathfinder() method of a
-CmdStanModel object.
CmdStanPathfinder objects have the following associated methods,
-all of which have their own (linked) documentation pages.
| Method | Description |
| $draws() | Return approximate posterior draws as a draws_matrix. |
| $lp() | Return the total log probability density (target) computed in the model block of the Stan program. |
| $lp_approx() | Return the log density of the approximation to the posterior. |
| $init() | Return user-specified initial values. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $code() | Return Stan code as a character vector. |

| Method | Description |
| $summary() | Run posterior::summarise_draws(). |
| $cmdstan_summary() | Run and print CmdStan's bin/stansummary. |

| Method | Description |
| $save_object() | Save fitted model object to a file. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |
| $save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

| Method | Description |
| $time() | Report the total run time. |
| $output() | Pretty print the output that was printed to the console. |
| $return_codes() | Return the return codes from the CmdStan runs. |
The CmdStanR website -(mc-stan.org/cmdstanr) for online -documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanDiagnose,
-CmdStanGQ,
-CmdStanLaplace,
-CmdStanMCMC,
-CmdStanMLE,
-CmdStanVB
A CmdStanVB object is the fitted model object returned by the
-$variational() method of a
-CmdStanModel object.
CmdStanVB objects have the following associated methods,
-all of which have their own (linked) documentation pages.
| Method | Description |
| $draws() | Return approximate posterior draws as a draws_matrix. |
| $lp() | Return the total log probability density (target) computed in the model block of the Stan program. |
| $lp_approx() | Return the log density of the variational approximation to the posterior. |
| $init() | Return user-specified initial values. |
| $metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
| $code() | Return Stan code as a character vector. |

| Method | Description |
| $summary() | Run posterior::summarise_draws(). |
| $cmdstan_summary() | Run and print CmdStan's bin/stansummary. |

| Method | Description |
| $save_object() | Save fitted model object to a file. |
| $save_output_files() | Save output CSV files to a specified location. |
| $save_data_file() | Save JSON data file to a specified location. |
| $save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

| Method | Description |
| $time() | Report the total run time. |
| $output() | Pretty print the output that was printed to the console. |
| $return_codes() | Return the return codes from the CmdStan runs. |

| Method | Description |
| $expose_functions() | Expose Stan functions for use in R. |
| $init_model_methods() | Expose methods for log-probability, gradients, parameter constraining and unconstraining. |
| $log_prob() | Calculate log-prob. |
| $grad_log_prob() | Calculate log-prob and gradient. |
| $hessian() | Calculate log-prob, gradient, and hessian. |
| $constrain_variables() | Transform a set of unconstrained parameter values to the constrained scale. |
| $unconstrain_variables() | Transform a set of parameter values to the unconstrained scale. |
| $unconstrain_draws() | Transform all parameter draws to the unconstrained scale. |
| $variable_skeleton() | Helper function to re-structure a vector of constrained parameter values. |
The CmdStanR website -(mc-stan.org/cmdstanr) for online -documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects:
-CmdStanDiagnose,
-CmdStanGQ,
-CmdStanLaplace,
-CmdStanMCMC,
-CmdStanMLE,
-CmdStanPathfinder
draws object from a CmdStanR fitted model object
R/fit.R
- as_draws.CmdStanMCMC.Rd
Create a draws object supported by the posterior package. These
-methods are just wrappers around CmdStanR's $draws()
-method provided for convenience.
# S3 method for CmdStanMCMC
-as_draws(x, ...)
-
-# S3 method for CmdStanMLE
-as_draws(x, ...)
-
-# S3 method for CmdStanLaplace
-as_draws(x, ...)
-
-# S3 method for CmdStanVB
-as_draws(x, ...)
-
-# S3 method for CmdStanGQ
-as_draws(x, ...)
-
-# S3 method for CmdStanPathfinder
-as_draws(x, ...)
A CmdStanR fitted model object.
Optional arguments passed to the $draws()
-method (e.g., variables, inc_warmup, etc.).
To subset iterations, chains, or draws, use the
-posterior::subset_draws() method after creating the draws object.
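As a minimal sketch of that subsetting step, the following uses the example data bundled with posterior (`example_draws()`) in place of `as_draws(fit)`, so it runs without a CmdStan installation. The particular variables, chains, and iterations are arbitrary choices made only for illustration.

```r
library(posterior)

# Stand-in for as_draws(fit): a draws_array shipped with the posterior package
x <- example_draws("eight_schools")

# Keep two variables, the first two chains, and the first 50 iterations
sub <- subset_draws(
  x,
  variable = c("mu", "tau"),
  chain = 1:2,
  iteration = 1:50
)

niterations(sub)
nchains(sub)
nvariables(sub)
```

The same call should apply to the object returned by `as_draws()` on a fitted model, since that is also a posterior draws object.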
# \dontrun{
-fit <- cmdstanr_example()
-as_draws(fit)
-#> # A draws_array: 1000 iterations, 4 chains, and 105 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 -66 -65 -65 -64
-#> 2 -67 -67 -65 -68
-#> 3 -66 -65 -65 -66
-#> 4 -65 -66 -64 -66
-#> 5 -64 -67 -65 -66
-#>
-#> , , variable = alpha
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 0.792 0.30 0.38 0.38
-#> 2 0.061 0.54 0.49 0.14
-#> 3 0.538 0.23 0.47 0.29
-#> 4 0.446 0.10 0.37 0.41
-#> 5 0.253 0.68 0.46 0.56
-#>
-#> , , variable = beta[1]
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 -0.93 -0.46 -0.99 -0.75
-#> 2 -0.60 -0.87 -0.93 -0.84
-#> 3 -0.67 -0.51 -0.45 -0.41
-#> 4 -0.90 -0.78 -0.75 -0.19
-#> 5 -0.63 -0.57 -0.45 -0.62
-#>
-#> , , variable = beta[2]
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 -0.26 0.031 -0.15 -0.195
-#> 2 0.12 -0.226 -0.36 -0.386
-#> 3 -0.02 -0.199 -0.15 -0.548
-#> 4 -0.31 -0.013 -0.20 -0.084
-#> 5 -0.12 -0.615 -0.09 -0.610
-#>
-#> # ... with 995 more iterations, and 101 more variables
-
-# posterior's as_draws_*() methods will also work
-posterior::as_draws_rvars(fit)
-#> # A draws_rvars: 1000 iterations, 4 chains, and 4 variables
-#> $lp__: rvar<1000,4>[1] mean ± sd:
-#> [1] -66 ± 1.5
-#>
-#> $alpha: rvar<1000,4>[1] mean ± sd:
-#> [1] 0.37 ± 0.22
-#>
-#> $beta: rvar<1000,4>[3] mean ± sd:
-#> [1] -0.67 ± 0.25 -0.28 ± 0.23 0.69 ± 0.27
-#>
-#> $log_lik: rvar<1000,4>[100] mean ± sd:
-#> [1] -0.518 ± 0.101 -0.398 ± 0.148 -0.500 ± 0.222 -0.445 ± 0.153
-#> [5] -1.183 ± 0.289 -0.593 ± 0.195 -0.637 ± 0.127 -0.278 ± 0.133
-#> [9] -0.697 ± 0.172 -0.743 ± 0.237 -0.280 ± 0.126 -0.492 ± 0.240
-#> [13] -0.655 ± 0.213 -0.361 ± 0.174 -0.279 ± 0.109 -0.275 ± 0.087
-#> [17] -1.594 ± 0.288 -0.481 ± 0.109 -0.232 ± 0.075 -0.112 ± 0.078
-#> [21] -0.211 ± 0.087 -0.571 ± 0.151 -0.330 ± 0.138 -0.135 ± 0.065
-#> [25] -0.451 ± 0.123 -1.520 ± 0.340 -0.307 ± 0.123 -0.446 ± 0.086
-#> [29] -0.722 ± 0.230 -0.699 ± 0.193 -0.489 ± 0.165 -0.425 ± 0.111
-#> [33] -0.406 ± 0.128 -0.062 ± 0.048 -0.583 ± 0.188 -0.325 ± 0.130
-#> [37] -0.702 ± 0.230 -0.309 ± 0.149 -0.178 ± 0.110 -0.684 ± 0.132
-#> [41] -1.131 ± 0.261 -0.937 ± 0.201 -0.413 ± 0.266 -1.175 ± 0.188
-#> [45] -0.359 ± 0.118 -0.578 ± 0.131 -0.301 ± 0.128 -0.324 ± 0.083
-#> [49] -0.319 ± 0.081 -1.288 ± 0.331 -0.288 ± 0.094 -0.832 ± 0.146
-#> [53] -0.402 ± 0.132 -0.371 ± 0.142 -0.381 ± 0.136 -0.320 ± 0.188
-#> [57] -0.660 ± 0.121 -0.954 ± 0.356 -1.371 ± 0.345 -0.976 ± 0.161
-#> [61] -0.543 ± 0.100 -0.872 ± 0.317 -0.115 ± 0.071 -0.899 ± 0.250
-#> [65] -2.024 ± 0.609 -0.509 ± 0.139 -0.276 ± 0.081 -1.059 ± 0.239
-#> [69] -0.437 ± 0.086 -0.642 ± 0.235 -0.608 ± 0.213 -0.460 ± 0.173
-#> [73] -1.496 ± 0.368 -0.947 ± 0.199 -1.139 ± 0.392 -0.373 ± 0.140
-#> [77] -0.876 ± 0.143 -0.490 ± 0.174 -0.767 ± 0.193 -0.537 ± 0.197
-#> [81] -0.160 ± 0.100 -0.220 ± 0.138 -0.344 ± 0.082 -0.275 ± 0.092
-#> [85] -0.129 ± 0.074 -1.136 ± 0.323 -0.821 ± 0.130 -0.776 ± 0.248
-#> [89] -1.289 ± 0.322 -0.258 ± 0.136 -0.383 ± 0.131 -1.501 ± 0.351
-#> [93] -0.736 ± 0.220 -0.318 ± 0.088 -0.389 ± 0.113 -1.575 ± 0.284
-#> [97] -0.432 ± 0.101 -1.058 ± 0.374 -0.690 ± 0.144 -0.392 ± 0.098
-#>
-posterior::as_draws_list(fit)
-#> # A draws_list: 1000 iterations, 4 chains, and 105 variables
-#>
-#> [chain = 1]
-#> $lp__
-#> [1] -66 -67 -66 -65 -64 -65 -65 -66 -64 -66
-#>
-#> $alpha
-#> [1] 0.792 0.061 0.538 0.446 0.253 0.343 0.387 0.827 0.506 0.713
-#>
-#> $`beta[1]`
-#> [1] -0.93 -0.60 -0.67 -0.90 -0.63 -0.71 -0.64 -0.60 -0.73 -0.96
-#>
-#> $`beta[2]`
-#> [1] -0.26 0.12 -0.02 -0.31 -0.12 -0.46 -0.44 -0.39 -0.32 -0.22
-#>
-#>
-#> [chain = 2]
-#> $lp__
-#> [1] -65 -67 -65 -66 -67 -67 -67 -67 -67 -72
-#>
-#> $alpha
-#> [1] 0.303 0.538 0.227 0.104 0.678 0.596 -0.041 0.769 0.340 0.948
-#>
-#> $`beta[1]`
-#> [1] -0.46 -0.87 -0.51 -0.78 -0.57 -0.50 -0.57 -0.67 -0.60 -0.47
-#>
-#> $`beta[2]`
-#> [1] 0.03060 -0.22607 -0.19885 -0.01303 -0.61503 -0.63211 0.00036 -0.41151
-#> [9] -0.46209 -0.88087
-#>
-#> # ... with 990 more iterations, and 2 more chains, and 101 more variables
-# }
-
-This function converts a CmdStanMCMC object to an mcmc.list object
-compatible with the coda package. This is primarily intended for users
-of Stan coming from BUGS/JAGS who are used to coda for plotting and
-diagnostics. In general we recommend the more recent MCMC diagnostics in
-posterior and the ggplot2-based plotting functions in
-bayesplot, but for users who prefer coda this function provides
-compatibility.
as_mcmc.list(x)
A CmdStanMCMC object.
An mcmc.list object compatible with the coda package.
# \dontrun{
-fit <- cmdstanr_example()
-x <- as_mcmc.list(fit)
-# }
-
-These are generic functions intended primarily for developers of -packages that interface with CmdStanR. Developers can define methods on -top of these generics to coerce objects into CmdStanR's fitted model objects.
-as.CmdStanMCMC(object, ...)
-
-as.CmdStanMLE(object, ...)
-
-as.CmdStanLaplace(object, ...)
-
-as.CmdStanVB(object, ...)
-
-as.CmdStanPathfinder(object, ...)
-
-as.CmdStanGQ(object, ...)
-
-as.CmdStanDiagnose(object, ...)
The object to be coerced.
Additional arguments to pass to methods.
Path to where install_cmdstan() with default settings installs CmdStan.
cmdstan_default_install_path(old = FALSE, wsl = FALSE)
Should the old default path (.cmdstanr) be used instead of the new
-one (.cmdstan)? Defaults to FALSE and may be removed in a future release.
Return the directory for WSL installations?
The installation path.
-Returns the path to the installation of CmdStan with the most recent release -version.
-cmdstan_default_path(old = FALSE, dir = NULL)
Path to a custom install folder with CmdStan installations.
Path to the CmdStan installation with the most recent release
-version, or NULL if no installation found.
For Windows systems with WSL CmdStan installs, if there are side-by-side WSL -and native installs with the same version then the WSL install is preferred. -Otherwise, the most recent release is chosen, regardless of whether it is -native or WSL.
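The "most recent release version" rule can be illustrated with a small base R sketch. The helper `pick_latest_cmdstan()` and the directory names are hypothetical and are not part of CmdStanR; the sketch assumes installs live in folders named `cmdstan-<major>.<minor>.<patch>` and does not model the WSL preference described above.

```r
# Hypothetical helper (not CmdStanR's implementation): choose the install
# directory with the highest release version encoded in its folder name.
pick_latest_cmdstan <- function(dirs) {
  # Keep only folders named like "cmdstan-2.36.0"
  is_release <- grepl("^cmdstan-[0-9]+\\.[0-9]+\\.[0-9]+$", basename(dirs))
  dirs <- dirs[is_release]
  if (length(dirs) == 0) return(NULL)

  # Compare as versions, not strings, so that 2.36.0 ranks above 2.9.0
  versions <- numeric_version(sub("^cmdstan-", "", basename(dirs)))
  dirs[versions == max(versions)][1]
}

installed <- c(
  "/home/user/.cmdstan/cmdstan-2.9.0",
  "/home/user/.cmdstan/cmdstan-2.36.0",
  "/home/user/.cmdstan/scratch"
)
latest <- pick_latest_cmdstan(installed)
```

Comparing with `numeric_version()` rather than sorting the folder names as text matters here, because a plain string sort would place `cmdstan-2.9.0` after `cmdstan-2.36.0`.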
-
-Create a new CmdStanModel object from a file containing a Stan program
-or from an existing Stan executable. The CmdStanModel object stores the
-path to a Stan program and compiled executable (once created), and provides
-methods for fitting the model using Stan's algorithms.
See the compile and ... arguments for control over whether and how
-compilation happens.
cmdstan_model(stan_file = NULL, exe_file = NULL, compile = TRUE, ...)
(string) The path to a .stan file containing a Stan
-program. The helper function write_stan_file() is provided for cases when
-it is more convenient to specify the Stan program as a string. If
-stan_file is not specified then exe_file must be specified.
(string) The path to an existing Stan model executable. Can
-be provided instead of or in addition to stan_file (if stan_file is
-omitted some CmdStanModel methods like $code() and $print() will not
-work). This argument can only be used with CmdStan 2.27+.
(logical) Do compilation? The default is TRUE. If FALSE
-compilation can be done later via the $compile()
-method.
Optionally, additional arguments to pass to the
-$compile() method if compile=TRUE. These
-options include specifying the directory for saving the executable, turning
-on pedantic mode, specifying include paths, configuring C++ options, and
-more. See $compile() for details.
A CmdStanModel object.
install_cmdstan(), $compile(),
-$check_syntax()
The CmdStanR website -(mc-stan.org/cmdstanr) for online -documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -7.19041
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 5 -6.74802 0.000219546 2.02164e-07 1 1 8
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.1 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.23 -6.98 0.662 0.312 -8.57 -6.75
-#> 2 lp_approx__ -0.490 -0.230 0.676 0.315 -1.92 -0.00147
-#> 3 theta 0.269 0.251 0.121 0.123 0.101 0.490
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 9e-06 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.1 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.1 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -19.520956
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.226e-02 1.762e-04 1.000e+00 1.000e+00 101 -6.253e+00 -6.253e+00
-#> Path [1] :Best Iter: [2] ELBO (-6.195755) evaluations: (101)
-#> Path [2] :Initial log joint density = -7.254863
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.135e-04 3.796e-07 1.000e+00 1.000e+00 126 -6.215e+00 -6.215e+00
-#> Path [2] :Best Iter: [4] ELBO (-6.093719) evaluations: (126)
-#> Path [3] :Initial log joint density = -9.357651
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 4.272e-04 4.080e-06 1.000e+00 1.000e+00 126 -6.228e+00 -6.228e+00
-#> Path [3] :Best Iter: [3] ELBO (-6.197602) evaluations: (126)
-#> Path [4] :Initial log joint density = -15.654513
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.955e-03 5.901e-05 1.000e+00 1.000e+00 126 -6.224e+00 -6.224e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.224416) evaluations: (126)
-#> Path [5] :Initial log joint density = -6.893187
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.780e-04 2.911e-05 9.344e-01 9.344e-01 101 -6.239e+00 -6.239e+00
-#> Path [5] :Best Iter: [3] ELBO (-6.215896) evaluations: (101)
-#> Path [6] :Initial log joint density = -10.329090
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.385e-04 9.583e-06 1.000e+00 1.000e+00 126 -6.222e+00 -6.222e+00
-#> Path [6] :Best Iter: [3] ELBO (-6.212303) evaluations: (126)
-#> Path [7] :Initial log joint density = -15.872224
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.903e-03 5.733e-05 1.000e+00 1.000e+00 126 -6.254e+00 -6.254e+00
-#> Path [7] :Best Iter: [4] ELBO (-6.195869) evaluations: (126)
-#> Path [8] :Initial log joint density = -10.958418
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 9.246e-04 1.360e-05 1.000e+00 1.000e+00 126 -6.214e+00 -6.214e+00
-#> Path [8] :Best Iter: [2] ELBO (-6.186854) evaluations: (126)
-#> Path [9] :Initial log joint density = -12.861849
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.621e-03 3.641e-05 1.000e+00 1.000e+00 126 -6.213e+00 -6.213e+00
-#> Path [9] :Best Iter: [5] ELBO (-6.213458) evaluations: (126)
-#> Path [10] :Initial log joint density = -16.360860
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.745e-03 5.153e-05 1.000e+00 1.000e+00 126 -6.169e+00 -6.169e+00
-#> Path [10] :Best Iter: [5] ELBO (-6.169190) evaluations: (126)
-#> Total log probability function evaluations:1360
-#> Finished in 0.1 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
- cmdstan_path.Rd
-cmdstan_path() returns the full path to the CmdStan installation. The path
-can be set using the set_cmdstan_path() function. See Details.
cmdstan_path()
-
-set_cmdstan_path(path = NULL)
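| Argument | Description |
|---|---|
| path | The full file path to the CmdStan installation as a string. If NULL (the default) then the path is set to the default path used by install_cmdstan() if it exists. |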
The file path to the CmdStan installation.
-Before the package can be used it needs to know where the CmdStan installation is located. When the package is loaded it tries to help automate this to avoid having to manually set the path every session:
If the environment variable "CMDSTAN" exists at load time
-then its value will be automatically set as the default path to CmdStan for
-the R session.
If no environment variable is found when loaded but the directory
-".cmdstanr/cmdstan" exists in the user's home directory
-(Sys.getenv("HOME"), not the current working directory) then it will
-be set as the path to CmdStan for the R session. This is the same as the
-default directory that install_cmdstan() would use to install the latest
-version of CmdStan.
It is always possible to change the path after loading the package using
-set_cmdstan_path(path).
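The lookup order described above can be sketched in base R. This is only an illustration under stated assumptions: `resolve_cmdstan_path()` is a hypothetical helper, not a cmdstanr function, and the package's actual load-time logic may differ in its details.

```r
# Hypothetical helper mirroring the documented lookup order; not part of cmdstanr.
resolve_cmdstan_path <- function(env = Sys.getenv("CMDSTAN"),
                                 home = Sys.getenv("HOME")) {
  # 1. A non-empty CMDSTAN environment variable takes precedence.
  if (nzchar(env)) {
    return(env)
  }
  # 2. Otherwise fall back to ".cmdstanr/cmdstan" in the home directory, if present.
  default_dir <- file.path(home, ".cmdstanr", "cmdstan")
  if (dir.exists(default_dir)) {
    return(default_dir)
  }
  # 3. Nothing found: the path has to be set manually with set_cmdstan_path().
  NULL
}
```

Calling `resolve_cmdstan_path()` with no arguments reads the current environment; passing `env` and `home` explicitly makes the order easy to check without touching the real installation.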
Stan Development Team
-CmdStanR: the R interface to CmdStan.
-CmdStanR (cmdstanr package) is an interface to Stan (mc-stan.org) for R users. It provides the necessary objects and functions to compile a Stan program and run Stan's algorithms from R via CmdStan, the shell interface to Stan (mc-stan.org/users/interfaces/cmdstan).
-The RStan interface (rstan package) is an in-memory interface to Stan and relies on R packages like Rcpp and inline to call C++ code from R. On the other hand, the CmdStanR interface does not directly call any C++ code from R, instead relying on the CmdStan interface behind the scenes for compilation, running algorithms, and writing results to output files.
Advantages of RStan:
-Allows other developers to distribute R packages with pre-compiled Stan programs (like rstanarm) on CRAN. (Note: As of 2023, this can mostly be achieved with CmdStanR as well. See Developing using CmdStanR.)
Avoids use of R6 classes, which may result in more familiar syntax for many R users.
CRAN binaries available for Mac and Windows.
Advantages of CmdStanR:
Compatible with latest versions of Stan. Keeping up with Stan releases
-is complicated for RStan, often requiring non-trivial changes to the
-rstan package and new CRAN releases of both rstan and
-StanHeaders. With CmdStanR the latest improvements in Stan will be
-available from R immediately after updating CmdStan using
-cmdstanr::install_cmdstan().
Running Stan via external processes results in fewer unexpected crashes, especially in RStudio.
Less memory overhead.
More permissive license. RStan uses the GPL-3 license while the license for CmdStanR is BSD-3, which is a bit more permissive and is the same license used for CmdStan and the Stan C++ source code.
CmdStanR requires a working version of CmdStan. If
-you already have CmdStan installed see cmdstan_model() to get started,
-otherwise see install_cmdstan() to install CmdStan. The vignette
-Getting started with CmdStanR
-demonstrates the basic functionality of the package.
For a list of global options see cmdstanr_global_options.
-The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Useful links:
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -6.80195
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 4 -6.74802 0.00029907 1.30133e-06 1 1 7
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.1 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.22 -6.96 0.660 0.293 -8.56 -6.75
-#> 2 lp_approx__ -0.479 -0.220 0.648 0.296 -1.85 -0.00230
-#> 3 theta 0.271 0.251 0.122 0.119 0.101 0.501
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 1e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.1 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.1 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -8.288264
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.788e-04 2.003e-06 1.000e+00 1.000e+00 126 -6.242e+00 -6.242e+00
-#> Path [1] :Best Iter: [2] ELBO (-6.189407) evaluations: (126)
-#> Path [2] :Initial log joint density = -7.122559
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 3.446e-03 7.646e-05 1.000e+00 1.000e+00 101 -6.267e+00 -6.267e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.217814) evaluations: (101)
-#> Path [3] :Initial log joint density = -6.904847
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.217e-03 1.351e-05 1.000e+00 1.000e+00 101 -6.256e+00 -6.256e+00
-#> Path [3] :Best Iter: [3] ELBO (-6.226443) evaluations: (101)
-#> Path [4] :Initial log joint density = -15.810600
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.919e-03 5.787e-05 1.000e+00 1.000e+00 126 -6.202e+00 -6.202e+00
-#> Path [4] :Best Iter: [4] ELBO (-6.144907) evaluations: (126)
-#> Path [5] :Initial log joint density = -7.535814
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.206e-04 5.112e-07 1.000e+00 1.000e+00 126 -6.233e+00 -6.233e+00
-#> Path [5] :Best Iter: [4] ELBO (-6.165244) evaluations: (126)
-#> Path [6] :Initial log joint density = -12.317092
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.394e-03 2.744e-05 1.000e+00 1.000e+00 126 -6.247e+00 -6.247e+00
-#> Path [6] :Best Iter: [3] ELBO (-6.207884) evaluations: (126)
-#> Path [7] :Initial log joint density = -6.970206
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 3.919e-04 8.523e-05 9.151e-01 9.151e-01 101 -6.187e+00 -6.187e+00
-#> Path [7] :Best Iter: [4] ELBO (-6.187348) evaluations: (101)
-#> Path [8] :Initial log joint density = -6.960470
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.773e-03 2.529e-05 1.000e+00 1.000e+00 101 -6.148e+00 -6.148e+00
-#> Path [8] :Best Iter: [4] ELBO (-6.147740) evaluations: (101)
-#> Path [9] :Initial log joint density = -6.771981
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 2.126e-03 4.338e-07 9.709e-01 9.709e-01 76 -6.239e+00 -6.239e+00
-#> Path [9] :Best Iter: [2] ELBO (-6.211440) evaluations: (76)
-#> Path [10] :Initial log joint density = -7.099410
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.211e-04 7.078e-08 1.000e+00 1.000e+00 126 -6.227e+00 -6.227e+00
-#> Path [10] :Best Iter: [2] ELBO (-6.196731) evaluations: (126)
-#> Total log probability function evaluations:1260
-#> Finished in 0.1 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
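To see what the function-valued `init` examples above amount to, the following base R sketch evaluates such a function once per chain. It assumes, purely for illustration, that a function with a `chain_id` argument receives the chain index while a zero-argument function is simply called once per chain. `expand_init()` is a hypothetical helper, not a cmdstanr function.

```r
# Hypothetical helper: turn an init function into a list of per-chain inits.
expand_init <- function(init, chains) {
  lapply(seq_len(chains), function(id) {
    # Pass the chain index only if the function declares a chain_id argument.
    if ("chain_id" %in% names(formals(init))) init(chain_id = id) else init()
  })
}

inits <- expand_init(function(chain_id) list(theta = 1 / (chain_id + 1)), chains = 2)
str(inits)
```

The resulting list has one element per chain, which is the same shape as the list-of-lists form of `init` used in the later examples.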
-Fit models for use in examples
-cmdstanr_example(
- example = c("logistic", "schools", "schools_ncp"),
- method = c("sample", "optimize", "laplace", "variational", "pathfinder", "diagnose"),
- ...,
- quiet = TRUE,
- force_recompile = getOption("cmdstanr_force_recompile", default = FALSE)
-)
-
-print_example_program(example = c("logistic", "schools", "schools_ncp"))
(string) The name of the example. The currently available examples are
"logistic": logistic regression with intercept and 3 predictors.
"schools": the so-called "eight schools" model, a hierarchical
-meta-analysis. Fitting this model will result in warnings about
-divergences.
"schools_ncp": non-centered parameterization of the "eight schools"
-model that fixes the problem with divergences.
To print the Stan code for a given example use
-print_example_program(example).
(string) Which fitting method should be used? The default is
-the "sample" method (MCMC).
Arguments passed to the chosen method. See the help pages for
-the individual methods for details.
(logical) If TRUE (the default) then fitting the model is
-wrapped in utils::capture.output().
Passed to the $compile() method.
The fitted model object returned by the selected method.
# \dontrun{
-print_example_program("logistic")
-#> data {
-#> int<lower=0> N;
-#> int<lower=0> K;
-#> array[N] int<lower=0, upper=1> y;
-#> matrix[N, K] X;
-#> }
-#> parameters {
-#> real alpha;
-#> vector[K] beta;
-#> }
-#> model {
-#> target += normal_lpdf(alpha | 0, 1);
-#> target += normal_lpdf(beta | 0, 1);
-#> target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
-#> }
-#> generated quantities {
-#> vector[N] log_lik;
-#> for (n in 1 : N) {
-#> log_lik[n] = bernoulli_logit_lpmf(y[n] | alpha + X[n] * beta);
-#> }
-#> }
-fit_logistic_mcmc <- cmdstanr_example("logistic", chains = 2)
-fit_logistic_mcmc$summary()
-#> # A tibble: 105 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -66.0 -65.6 1.43 1.22 -68.8 -64.3 1.00 963.
-#> 2 alpha 0.376 0.373 0.220 0.221 0.0226 0.745 1.00 2237.
-#> 3 beta[1] -0.671 -0.663 0.250 0.252 -1.10 -0.258 1.00 1930.
-#> 4 beta[2] -0.262 -0.261 0.222 0.229 -0.629 0.0934 1.00 1801.
-#> 5 beta[3] 0.674 0.681 0.264 0.263 0.242 1.10 1.00 1979.
-#> 6 log_lik[1] -0.515 -0.509 0.0991 0.0964 -0.681 -0.363 1.00 2100.
-#> 7 log_lik[2] -0.401 -0.381 0.147 0.138 -0.671 -0.197 1.00 2010.
-#> 8 log_lik[3] -0.489 -0.456 0.215 0.202 -0.891 -0.207 1.00 1914.
-#> 9 log_lik[4] -0.455 -0.435 0.154 0.152 -0.726 -0.242 1.00 1918.
-#> 10 log_lik[5] -1.18 -1.16 0.283 0.282 -1.69 -0.749 1.00 2470.
-#> # ℹ 95 more rows
-#> # ℹ 1 more variable: ess_tail <dbl>
-
-fit_logistic_optim <- cmdstanr_example("logistic", method = "optimize")
-fit_logistic_optim$summary()
-#> # A tibble: 105 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -63.9
-#> 2 alpha 0.364
-#> 3 beta[1] -0.632
-#> 4 beta[2] -0.259
-#> 5 beta[3] 0.648
-#> 6 log_lik[1] -0.515
-#> 7 log_lik[2] -0.394
-#> 8 log_lik[3] -0.469
-#> 9 log_lik[4] -0.442
-#> 10 log_lik[5] -1.14
-#> # ℹ 95 more rows
-
-fit_logistic_vb <- cmdstanr_example("logistic", method = "variational")
-fit_logistic_vb$summary()
-#> # A tibble: 106 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -66.4 -65.9 1.84 1.52 -69.9 -64.3
-#> 2 lp_approx__ -1.98 -1.65 1.42 1.25 -4.64 -0.331
-#> 3 alpha 0.377 0.370 0.296 0.307 -0.116 0.869
-#> 4 beta[1] -0.646 -0.648 0.241 0.234 -1.04 -0.241
-#> 5 beta[2] -0.252 -0.257 0.201 0.191 -0.579 0.0845
-#> 6 beta[3] 0.702 0.695 0.280 0.269 0.236 1.16
-#> 7 log_lik[1] -0.523 -0.518 0.128 0.130 -0.747 -0.331
-#> 8 log_lik[2] -0.398 -0.369 0.168 0.155 -0.716 -0.174
-#> 9 log_lik[3] -0.480 -0.450 0.204 0.193 -0.859 -0.211
-#> 10 log_lik[4] -0.455 -0.431 0.159 0.160 -0.739 -0.235
-#> # ℹ 96 more rows
-
-print_example_program("schools")
-#> data {
-#> int<lower=1> J;
-#> vector<lower=0>[J] sigma;
-#> vector[J] y;
-#> }
-#> parameters {
-#> real mu;
-#> real<lower=0> tau;
-#> vector[J] theta;
-#> }
-#> model {
-#> target += normal_lpdf(tau | 0, 10);
-#> target += normal_lpdf(mu | 0, 10);
-#> target += normal_lpdf(theta | mu, tau);
-#> target += normal_lpdf(y | theta, sigma);
-#> }
-fit_schools_mcmc <- cmdstanr_example("schools")
-#> Warning: 260 of 4000 (6.0%) transitions ended with a divergence.
-#> See https://mc-stan.org/misc/warnings for details.
-#> Warning: 1 of 4 chains had an E-BFMI less than 0.3.
-#> See https://mc-stan.org/misc/warnings for details.
-fit_schools_mcmc$summary()
-#> # A tibble: 11 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -58.0 -58.4 5.27 5.30 -66.3 -48.5 1.07 43.0 27.8
-#> 2 mu 6.38 6.13 4.15 3.80 -0.237 13.5 1.01 627. 964.
-#> 3 tau 5.31 4.49 3.57 3.28 1.10 12.2 1.07 37.6 21.2
-#> 4 theta[1] 9.14 8.16 7.00 6.00 -0.685 21.7 1.01 966. 1427.
-#> 5 theta[2] 6.73 6.51 5.64 4.88 -2.30 16.4 1.02 1127. 1732.
-#> 6 theta[3] 5.08 5.36 6.58 5.64 -6.24 15.5 1.02 792. 1549.
-#> 7 theta[4] 6.57 6.30 5.92 5.19 -2.79 16.1 1.01 1328. 1869.
-#> 8 theta[5] 4.56 4.87 5.47 5.08 -5.00 13.1 1.02 707. 1473.
-#> 9 theta[6] 5.31 5.53 6.01 5.11 -5.18 14.7 1.01 834. 1685.
-#> 10 theta[7] 9.04 8.41 5.90 5.30 0.408 19.3 1.01 701. 1390.
-#> 11 theta[8] 6.88 6.54 6.69 5.37 -4.06 18.2 1.02 1238. 2106.
-
-print_example_program("schools_ncp")
-#> data {
-#> int<lower=1> J;
-#> vector<lower=0>[J] sigma;
-#> vector[J] y;
-#> }
-#> parameters {
-#> real mu;
-#> real<lower=0> tau;
-#> vector[J] theta_raw;
-#> }
-#> transformed parameters {
-#> vector[J] theta = mu + tau * theta_raw;
-#> }
-#> model {
-#> target += normal_lpdf(tau | 0, 10);
-#> target += normal_lpdf(mu | 0, 10);
-#> target += normal_lpdf(theta_raw | 0, 1);
-#> target += normal_lpdf(y | theta, sigma);
-#> }
-fit_schools_ncp_mcmc <- cmdstanr_example("schools_ncp")
-fit_schools_ncp_mcmc$summary()
-#> # A tibble: 19 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -46.9 -4.66e+1 2.40 2.30 -51.2 -43.5 1.00 1569. 2358.
-#> 2 mu 6.51 6.55e+0 4.22 4.14 -0.355 13.3 1.00 3009. 2465.
-#> 3 tau 4.86 4.06e+0 3.67 3.43 0.444 12.0 1.00 1792. 1620.
-#> 4 theta_r… 0.348 3.50e-1 0.956 0.960 -1.26 1.93 1.00 3527. 2677.
-#> 5 theta_r… 0.0410 5.49e-2 0.889 0.880 -1.47 1.46 1.00 4110. 2991.
-#> 6 theta_r… -0.142 -1.47e-1 0.953 0.945 -1.69 1.43 1.00 4686. 2441.
-#> 7 theta_r… -0.0107 -9.40e-3 0.937 0.935 -1.53 1.53 1.00 4262. 2854.
-#> 8 theta_r… -0.296 -2.93e-1 0.931 0.915 -1.83 1.24 1.00 3286. 2516.
-#> 9 theta_r… -0.172 -1.84e-1 0.911 0.881 -1.69 1.36 1.00 3801. 2532.
-#> 10 theta_r… 0.360 3.84e-1 0.933 0.904 -1.21 1.83 1.00 3900. 2825.
-#> 11 theta_r… 0.0773 7.71e-2 0.982 1.00 -1.54 1.67 1.00 4239. 2766.
-#> 12 theta[1] 9.01 8.06e+0 6.94 5.65 -0.546 22.1 1.00 3731. 3102.
-#> 13 theta[2] 6.85 6.80e+0 5.61 5.13 -2.12 16.3 1.00 4538. 3046.
-#> 14 theta[3] 5.47 5.82e+0 6.48 5.46 -6.06 15.3 1.00 4254. 3124.
-#> 15 theta[4] 6.52 6.47e+0 5.74 5.18 -2.72 15.9 1.00 4625. 3465.
-#> 16 theta[5] 4.79 5.10e+0 5.61 5.15 -4.87 13.3 1.00 4852. 3150.
-#> 17 theta[6] 5.54 5.73e+0 5.60 5.11 -3.94 14.1 1.00 4021. 3062.
-#> 18 theta[7] 8.95 8.27e+0 6.02 5.41 0.317 19.7 1.00 4087. 3495.
-#> 19 theta[8] 7.00 7.00e+0 6.58 5.65 -3.39 17.8 1.00 3992. 3246.
-
-# optimization fails for hierarchical model
-cmdstanr_example("schools", "optimize", quiet = FALSE)
-#> Initial log joint probability = -57.1999
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 99 137.364 0.389882 2.12196e+10 0.1758 0.3216 199
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 175 252.319 0.0285374 7.72538e+16 1e-12 0.001 386 LS failed, Hessian reset
-#> Chain 1 Optimization terminated with error:
-#> Chain 1 Line search failed to achieve a sufficient decrease, no more progress can be made
-#> Warning: Fitting finished unexpectedly! Use the $output() method for more information.
-#> Finished in 0.2 seconds.
-#> Error: Fitting failed. Unable to print.
-# }
-
-These options can be set via options() for an entire R session.
cmdstanr_draws_format: Which format provided by the posterior
-package should be used when returning the posterior or approximate posterior
-draws? The default depends on the model fitting method. See
-draws for more details.
cmdstanr_force_recompile: Should the default be to recompile models
-even if there were no Stan code changes since last compiled? See
-compile for more details. The default is FALSE.
cmdstanr_max_rows: The maximum number of rows of output to print when
-using the $print() method. The default is 10.
cmdstanr_print_line_numbers: Should line numbers be included when
-printing a Stan program? The default is FALSE.
cmdstanr_no_ver_check: Should the check for a more recent version of
-CmdStan be disabled? The default is FALSE.
cmdstanr_output_dir: The directory where CmdStan should write its output
-CSV files when fitting models. The default is a temporary directory. Files in
-a temporary directory are removed as part of R garbage collection, while
-files in an explicitly defined directory are not automatically deleted.
cmdstanr_verbose: Should more information be printed
-when compiling or running models, including showing how CmdStan was called
-internally? The default is FALSE.
cmdstanr_warn_inits: Should a warning be thrown if initial values are
-only provided for a subset of parameters? The default is TRUE.
cmdstanr_write_stan_file_dir: The directory where write_stan_file()
-should write Stan files. The default is a temporary directory. Files in
-a temporary directory are removed as part of R garbage collection, while
-files in an explicitly defined directory are not automatically deleted.
mc.cores: The number of cores to use for various parallelization tasks
-(e.g. running MCMC chains, installing CmdStan). The default depends on the
-use case and is documented with the methods that make use of mc.cores.
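As a brief illustration of setting these options for a session, the sketch below uses only base R's `options()` and `getOption()`. The specific values are arbitrary choices for the example, not recommendations.

```r
# Set several of the options listed above for the current R session.
old <- options(
  cmdstanr_max_rows = 20,              # print up to 20 rows in $print()
  cmdstanr_print_line_numbers = TRUE,  # show line numbers when printing Stan code
  cmdstanr_warn_inits = FALSE          # no warning for partially specified inits
)

# Query a single option.
getOption("cmdstanr_max_rows")

# Restore the previous values when done.
options(old)
```

Because `options()` returns the previous values invisibly, keeping them in `old` makes it easy to undo the changes at the end of a script.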
R/data.R
 - draws_to_csv.Rd
Write posterior draws objects to CSV files suitable for running standalone generated quantities with CmdStan.
-draws_to_csv(
- draws,
- sampler_diagnostics = NULL,
- dir = tempdir(),
- basename = "fittedParams"
-)
A posterior::draws_* object.
Either NULL or a posterior::draws_* object
-of sampler diagnostics.
(string) An optional path to the directory where the CSV files will be written. If not set, a temporary directory is used.
(string) If dir is specified, basename is used for naming
-the output CSV files. If not specified, the file names are randomly generated.
Paths to CSV files (one per chain).
-draws_to_csv() generates a CSV suitable for running standalone generated
-quantities with CmdStan. The CSV file contains a single comment #num_samples,
-which equals the number of iterations in the supplied draws object.
The comment is followed by the column names. The first column is the lp__ value,
-followed by sampler diagnostics and finally other variables of the draws object.
-If the draws object does not contain the lp__ or sampler diagnostics variables,
-columns with zeros are created in order to conform with the requirements of the
-standalone generated quantities method of CmdStan.
The column names line is finally followed by the values of the draws in the same order as the column names.
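The layout described above (a comment line, then column names, then one row of values per draw) can be mimicked with a few lines of base R. This is a rough sketch, not the code used by draws_to_csv(): the draws are made-up toy values, the sampler diagnostics columns are left out for brevity, and the exact wording of the comment line is an assumption.

```r
# Toy draws: 3 iterations of lp__ and one parameter (values are made up).
draws <- data.frame(lp__ = c(-7.0, -7.9, -7.4), theta = c(0.17, 0.46, 0.41))

csv_lines <- c(
  paste0("# num_samples = ", nrow(draws)),  # comment line (exact wording assumed)
  paste(names(draws), collapse = ","),      # column names, lp__ first
  apply(draws, 1, paste, collapse = ",")    # one line of values per draw
)

writeLines(csv_lines)
```

Passing a file path as the second argument to `writeLines()` would write the same lines to disk instead of printing them.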
-# \dontrun{
-draws <- posterior::example_draws()
-
-draws_csv_files <- draws_to_csv(draws)
-print(draws_csv_files)
-#> [1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T//RtmpWzIPg0/fittedParams-202503310842-1-4146ef.csv"
-#> [2] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T//RtmpWzIPg0/fittedParams-202503310842-2-4146ef.csv"
-#> [3] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T//RtmpWzIPg0/fittedParams-202503310842-3-4146ef.csv"
-#> [4] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T//RtmpWzIPg0/fittedParams-202503310842-4-4146ef.csv"
-
-# draws_csv_files <- draws_to_csv(draws,
-# sampler_diagnostics = sampler_diagnostics,
-# dir = "~/my_folder",
-# basename = "my-samples")
-# }
-
-This provides a knitr engine for Stan that renders Stan chunks and compiles
-the model code in each chunk to an executable with
-CmdStan. Use register_knitr_engine() to make this the default engine for
-stan chunks. See the vignette
-R Markdown CmdStan Engine
-for an example.
eng_cmdstan(options)
options: (named list) Chunk options, as provided by knitr during
-chunk execution.
# \dontrun{
-knitr::knit_engines$set(stan = cmdstanr::eng_cmdstan)
-# }
-stansummary and diagnose utilities
R/fit.R
- fit-method-cmdstan_summary.Rd
Run CmdStan's stansummary and diagnose utilities. These are
-documented in the CmdStan Guide:
https://mc-stan.org/docs/cmdstan-guide/stansummary.html
https://mc-stan.org/docs/cmdstan-guide/diagnose.html
Although these methods can be used for models fit using the
-$variational() method, much of the output is
-currently only relevant for models fit using the
-$sample() method.
See the $summary() method for computing similar summaries in
-R rather than calling CmdStan's utilities.
-cmdstan_summary(flags = NULL)
-
-cmdstan_diagnose()
flags: An optional character vector of flags (e.g.
-flags = c("--sig_figs=1")).
# \dontrun{
-fit <- cmdstanr_example("logistic")
-fit$cmdstan_diagnose()
-#> Checking sampler transitions treedepth.
-#> Treedepth satisfactory for all transitions.
-#>
-#> Checking sampler transitions for divergences.
-#> No divergent transitions found.
-#>
-#> Checking E-BFMI - sampler transitions HMC potential energy.
-#> E-BFMI satisfactory.
-#>
-#> Rank-normalized split effective sample size satisfactory for all parameters.
-#>
-#> Rank-normalized split R-hat values satisfactory for all parameters.
-#>
-#> Processing complete, no problems detected.
-fit$cmdstan_summary()
-#> Inference for Stan model: logistic_model
-#> 4 chains: each with iter=1000; warmup=1000; thin=1; 1000 iterations saved.
-#>
-#> Warmup took (0.023, 0.023, 0.022, 0.021) seconds, 0.089 seconds total
-#> Sampling took (0.078, 0.073, 0.074, 0.072) seconds, 0.30 seconds total
-#>
-#> Mean MCSE StdDev MAD 5% 50% 95% ESS_bulk ESS_tail R_hat
-#>
-#> lp__ -6.6e+01 3.1e-02 1.4 1.2 -69 -6.6e+01 -6.4e+01 2077 2863 1.0
-#> accept_stat__ 0.91 1.4e-03 0.10 0.072 0.71 0.95 1.0 5781 4282 1.0
-#> stepsize__ 0.72 nan 0.040 0.039 0.66 0.73 0.77 nan nan nan
-#> treedepth__ 2.4 1.5e-02 0.53 0.00 2.0 2.0 3.0 1630 1360 1.0
-#> n_leapfrog__ 5.3 9.5e-02 2.0 0.00 3.0 7.0 7.0 3307 884 1.0
-#> divergent__ 0.00 nan 0.00 0.00 0.00 0.00 0.00 nan nan nan
-#> energy__ 68 4.7e-02 2.0 1.8 65 68 72 1774 2711 1.0
-#>
-#> alpha 3.7e-01 3.4e-03 0.22 0.22 0.028 3.7e-01 7.3e-01 4093 2683 1.0
-#> beta[1] -6.6e-01 3.9e-03 0.25 0.25 -1.1 -6.6e-01 -2.6e-01 4257 3329 1.0
-#> beta[2] -2.7e-01 3.6e-03 0.22 0.23 -0.64 -2.6e-01 9.8e-02 3945 3224 1.0
-#> beta[3] 6.7e-01 4.3e-03 0.27 0.27 0.25 6.6e-01 1.1e+00 3898 2767 1.00
-#> log_lik[1] -5.2e-01 1.5e-03 0.098 0.098 -0.69 -5.1e-01 -3.7e-01 4210 2902 1.0
-#> log_lik[2] -4.0e-01 2.2e-03 0.14 0.14 -0.67 -3.9e-01 -2.0e-01 4291 2935 1.0
-#> log_lik[3] -5.0e-01 3.3e-03 0.21 0.20 -0.90 -4.6e-01 -2.1e-01 4240 2834 1.0
-#> log_lik[4] -4.5e-01 2.5e-03 0.16 0.15 -0.73 -4.3e-01 -2.4e-01 3795 3034 1.0
-#> log_lik[5] -1.2e+00 4.4e-03 0.28 0.28 -1.7 -1.2e+00 -7.5e-01 4168 2944 1.0
-#> log_lik[6] -5.9e-01 3.0e-03 0.19 0.19 -0.93 -5.8e-01 -3.3e-01 3850 2768 1.00
-#> log_lik[7] -6.4e-01 1.9e-03 0.13 0.12 -0.87 -6.3e-01 -4.5e-01 4232 3044 1.0
-#> log_lik[8] -2.8e-01 2.2e-03 0.13 0.12 -0.53 -2.6e-01 -1.1e-01 3612 3023 1.0
-#> log_lik[9] -6.9e-01 2.6e-03 0.16 0.16 -0.98 -6.8e-01 -4.5e-01 4078 3164 1.0
-#> log_lik[10] -7.4e-01 3.7e-03 0.23 0.23 -1.2 -7.1e-01 -4.0e-01 4026 2967 1.0
-#> log_lik[11] -2.8e-01 2.1e-03 0.13 0.12 -0.52 -2.6e-01 -1.2e-01 3419 2597 1.0
-#> log_lik[12] -5.0e-01 3.6e-03 0.24 0.22 -0.94 -4.7e-01 -1.9e-01 4359 3260 1.0
-#> log_lik[13] -6.5e-01 3.3e-03 0.21 0.21 -1.0 -6.3e-01 -3.6e-01 4071 2663 1.0
-#> log_lik[14] -3.6e-01 2.7e-03 0.17 0.16 -0.68 -3.3e-01 -1.4e-01 4196 2966 1.0
-#> log_lik[15] -2.8e-01 1.7e-03 0.11 0.10 -0.47 -2.6e-01 -1.4e-01 4062 2457 1.0
-#> log_lik[16] -2.8e-01 1.5e-03 0.087 0.085 -0.44 -2.7e-01 -1.5e-01 3304 2815 1.0
-#> log_lik[17] -1.6e+00 4.8e-03 0.29 0.29 -2.1 -1.6e+00 -1.1e+00 3624 2964 1.0
-#> log_lik[18] -4.8e-01 1.7e-03 0.10 0.10 -0.66 -4.8e-01 -3.2e-01 3932 2824 1.0
-#> log_lik[19] -2.4e-01 1.3e-03 0.075 0.074 -0.37 -2.3e-01 -1.3e-01 3554 3085 1.0
-#> log_lik[20] -1.1e-01 1.3e-03 0.079 0.061 -0.27 -9.5e-02 -3.0e-02 4170 3008 1.0
-#> log_lik[21] -2.2e-01 1.5e-03 0.089 0.084 -0.38 -2.0e-01 -9.7e-02 3302 2740 1.0
-#> log_lik[22] -5.7e-01 2.4e-03 0.14 0.14 -0.83 -5.6e-01 -3.6e-01 3779 3160 1.0
-#> log_lik[23] -3.3e-01 2.2e-03 0.14 0.13 -0.58 -3.1e-01 -1.5e-01 3934 3316 1.0
-#> log_lik[24] -1.4e-01 1.1e-03 0.067 0.061 -0.27 -1.3e-01 -5.4e-02 3656 3128 1.0
-#> log_lik[25] -4.6e-01 1.9e-03 0.12 0.12 -0.68 -4.4e-01 -2.8e-01 4029 2992 1.0
-#> log_lik[26] -1.5e+00 5.2e-03 0.34 0.33 -2.1 -1.5e+00 -9.9e-01 4325 3387 1.0
-#> log_lik[27] -3.1e-01 2.1e-03 0.12 0.12 -0.54 -2.9e-01 -1.5e-01 3375 2563 1.0
-#> log_lik[28] -4.5e-01 1.3e-03 0.082 0.082 -0.59 -4.4e-01 -3.2e-01 3776 2975 1.0
-#> log_lik[29] -7.3e-01 3.3e-03 0.23 0.23 -1.1 -7.0e-01 -3.9e-01 4688 3192 1.0
-#> log_lik[30] -7.0e-01 2.9e-03 0.18 0.18 -1.0 -6.8e-01 -4.2e-01 4197 3162 1.0
-#> log_lik[31] -4.9e-01 2.7e-03 0.16 0.16 -0.79 -4.7e-01 -2.6e-01 3604 2908 1.0
-#> log_lik[32] -4.3e-01 1.7e-03 0.11 0.11 -0.62 -4.2e-01 -2.7e-01 3793 2685 1.0
-#> log_lik[33] -4.1e-01 2.0e-03 0.13 0.12 -0.65 -3.9e-01 -2.3e-01 4149 2968 1.0
-#> log_lik[34] -6.6e-02 8.7e-04 0.052 0.038 -0.16 -5.2e-02 -1.3e-02 3571 2878 1.0
-#> log_lik[35] -5.9e-01 2.8e-03 0.19 0.19 -0.92 -5.6e-01 -3.2e-01 4485 3033 1.0
-#> log_lik[36] -3.3e-01 2.0e-03 0.13 0.12 -0.56 -3.1e-01 -1.5e-01 4361 3167 1.00
-#> log_lik[37] -7.0e-01 3.4e-03 0.23 0.22 -1.1 -6.7e-01 -3.7e-01 4430 3268 1.0
-#> log_lik[38] -3.2e-01 2.5e-03 0.15 0.14 -0.61 -2.9e-01 -1.2e-01 3889 3062 1.0
-#> log_lik[39] -1.8e-01 1.7e-03 0.11 0.089 -0.38 -1.6e-01 -5.5e-02 4256 2773 1.0
-#> log_lik[40] -6.8e-01 2.0e-03 0.12 0.12 -0.90 -6.7e-01 -4.9e-01 4119 3108 1.0
-#> log_lik[41] -1.1e+00 4.3e-03 0.25 0.25 -1.6 -1.1e+00 -7.5e-01 3475 2646 1.0
-#> log_lik[42] -9.3e-01 3.0e-03 0.19 0.19 -1.3 -9.2e-01 -6.3e-01 4203 3085 1.0
-#> log_lik[43] -4.1e-01 3.9e-03 0.26 0.22 -0.91 -3.5e-01 -1.1e-01 4971 3223 1.00
-#> log_lik[44] -1.2e+00 2.9e-03 0.18 0.18 -1.5 -1.2e+00 -8.9e-01 3992 2815 1.0
-#> log_lik[45] -3.6e-01 1.8e-03 0.12 0.11 -0.57 -3.4e-01 -1.9e-01 4297 3077 1.00
-#> log_lik[46] -5.8e-01 1.9e-03 0.13 0.12 -0.81 -5.7e-01 -3.9e-01 4228 3113 1.0
-#> log_lik[47] -3.1e-01 2.1e-03 0.13 0.12 -0.55 -2.9e-01 -1.4e-01 3831 2919 1.0
-#> log_lik[48] -3.3e-01 1.3e-03 0.082 0.081 -0.47 -3.2e-01 -2.1e-01 3891 3283 1.0
-#> log_lik[49] -3.2e-01 1.3e-03 0.079 0.078 -0.46 -3.2e-01 -2.0e-01 3475 2682 1.0
-#> log_lik[50] -1.3e+00 4.9e-03 0.32 0.32 -1.8 -1.3e+00 -7.9e-01 4610 3388 1.0
-#> log_lik[51] -2.9e-01 1.4e-03 0.093 0.090 -0.46 -2.8e-01 -1.6e-01 4207 3215 1.0
-#> log_lik[52] -8.3e-01 2.2e-03 0.14 0.14 -1.1 -8.3e-01 -6.1e-01 4205 3039 1.0
-#> log_lik[53] -4.1e-01 2.2e-03 0.13 0.12 -0.64 -3.9e-01 -2.2e-01 3463 2821 1.0
-#> log_lik[54] -3.7e-01 2.2e-03 0.14 0.13 -0.63 -3.5e-01 -1.8e-01 4357 3109 1.00
-#> log_lik[55] -3.9e-01 2.1e-03 0.13 0.13 -0.63 -3.7e-01 -2.0e-01 4184 2870 1.0
-#> log_lik[56] -3.2e-01 2.8e-03 0.19 0.17 -0.67 -2.8e-01 -9.4e-02 4725 2892 1.0
-#> log_lik[57] -6.6e-01 1.8e-03 0.12 0.12 -0.86 -6.5e-01 -4.8e-01 4204 3006 1.0
-#> log_lik[58] -9.5e-01 5.4e-03 0.36 0.35 -1.6 -9.0e-01 -4.5e-01 4724 2916 1.0
-#> log_lik[59] -1.4e+00 5.4e-03 0.34 0.33 -2.0 -1.3e+00 -8.4e-01 4126 2944 1.0
-#> log_lik[60] -9.8e-01 2.4e-03 0.16 0.16 -1.3 -9.7e-01 -7.3e-01 4237 3028 1.0
-#> log_lik[61] -5.4e-01 1.5e-03 0.097 0.096 -0.71 -5.4e-01 -3.9e-01 4260 3064 1.0
-#> log_lik[62] -8.8e-01 4.8e-03 0.31 0.31 -1.4 -8.5e-01 -4.4e-01 4420 3420 1.0
-#> log_lik[63] -1.2e-01 1.3e-03 0.075 0.063 -0.26 -1.0e-01 -3.4e-02 3288 2867 1.0
-#> log_lik[64] -9.0e-01 3.6e-03 0.24 0.23 -1.3 -8.7e-01 -5.5e-01 4286 3272 1.0
-#> log_lik[65] -2.0e+00 1.0e-02 0.59 0.61 -3.0 -2.0e+00 -1.1e+00 3390 2645 1.0
-#> log_lik[66] -5.1e-01 2.1e-03 0.14 0.13 -0.75 -5.0e-01 -3.1e-01 4289 2934 1.0
-#> log_lik[67] -2.8e-01 1.3e-03 0.081 0.080 -0.42 -2.7e-01 -1.6e-01 3650 3029 1.0
-#> log_lik[68] -1.1e+00 3.7e-03 0.23 0.22 -1.5 -1.0e+00 -7.1e-01 3844 3244 1.0
-#> log_lik[69] -4.4e-01 1.4e-03 0.083 0.084 -0.58 -4.3e-01 -3.1e-01 3763 2707 1.0
-#> log_lik[70] -6.4e-01 3.4e-03 0.23 0.21 -1.1 -6.1e-01 -3.2e-01 4692 3113 1.0
-#> log_lik[71] -6.1e-01 3.3e-03 0.21 0.21 -0.99 -5.8e-01 -3.1e-01 4276 3089 1.0
-#> log_lik[72] -4.6e-01 2.6e-03 0.17 0.16 -0.78 -4.4e-01 -2.3e-01 4326 3218 1.0
-#> log_lik[73] -1.5e+00 6.0e-03 0.37 0.37 -2.1 -1.4e+00 -9.1e-01 3893 3087 1.0
-#> log_lik[74] -9.5e-01 2.9e-03 0.19 0.20 -1.3 -9.3e-01 -6.5e-01 4399 2879 1.0
-#> log_lik[75] -1.1e+00 5.8e-03 0.38 0.38 -1.8 -1.1e+00 -5.8e-01 4471 3388 1.0
-#> log_lik[76] -3.7e-01 2.1e-03 0.14 0.13 -0.62 -3.5e-01 -1.8e-01 4223 2860 1.00
-#> log_lik[77] -8.8e-01 2.2e-03 0.14 0.14 -1.1 -8.7e-01 -6.7e-01 3901 2655 1.0
-#> log_lik[78] -4.9e-01 2.6e-03 0.17 0.16 -0.80 -4.6e-01 -2.5e-01 4623 3544 1.00
-#> log_lik[79] -7.6e-01 3.0e-03 0.19 0.19 -1.1 -7.5e-01 -4.8e-01 4156 3125 1.0
-#> log_lik[80] -5.4e-01 2.8e-03 0.19 0.18 -0.88 -5.2e-01 -2.7e-01 4708 3185 1.0
-#> log_lik[81] -1.7e-01 1.7e-03 0.10 0.084 -0.37 -1.4e-01 -4.9e-02 4002 3125 1.0
-#> log_lik[82] -2.2e-01 2.1e-03 0.14 0.11 -0.48 -1.9e-01 -6.6e-02 4234 2703 1.0
-#> log_lik[83] -3.5e-01 1.3e-03 0.080 0.079 -0.49 -3.4e-01 -2.2e-01 3872 2864 1.00
-#> log_lik[84] -2.8e-01 1.5e-03 0.091 0.087 -0.44 -2.7e-01 -1.5e-01 4033 2778 1.0
-#> log_lik[85] -1.3e-01 1.2e-03 0.075 0.063 -0.28 -1.2e-01 -4.3e-02 4008 3071 1.0
-#> log_lik[86] -1.1e+00 4.8e-03 0.31 0.30 -1.7 -1.1e+00 -6.7e-01 4424 2829 1.0
-#> log_lik[87] -8.2e-01 1.9e-03 0.13 0.12 -1.0 -8.1e-01 -6.3e-01 4199 3091 1.0
-#> log_lik[88] -7.7e-01 3.9e-03 0.24 0.24 -1.2 -7.4e-01 -4.3e-01 3979 2725 1.00
-#> log_lik[89] -1.3e+00 5.0e-03 0.32 0.31 -1.8 -1.2e+00 -8.0e-01 4105 2643 1.0
-#> log_lik[90] -2.7e-01 2.2e-03 0.14 0.12 -0.54 -2.4e-01 -9.4e-02 4059 3128 1.0
-#> log_lik[91] -3.9e-01 2.0e-03 0.13 0.12 -0.63 -3.7e-01 -2.0e-01 4231 3276 1.0
-#> log_lik[92] -1.5e+00 5.5e-03 0.34 0.34 -2.1 -1.5e+00 -9.7e-01 3905 2575 1.0
-#> log_lik[93] -7.4e-01 3.5e-03 0.22 0.22 -1.1 -7.2e-01 -4.2e-01 3995 2990 1.0
-#> log_lik[94] -3.2e-01 1.4e-03 0.087 0.084 -0.48 -3.1e-01 -1.9e-01 4054 2800 1.00
-#> log_lik[95] -3.9e-01 1.9e-03 0.11 0.11 -0.59 -3.8e-01 -2.3e-01 3454 2901 1.0
-#> log_lik[96] -1.6e+00 4.6e-03 0.28 0.28 -2.1 -1.5e+00 -1.1e+00 3721 3096 1.0
-#> log_lik[97] -4.3e-01 1.5e-03 0.098 0.095 -0.60 -4.2e-01 -2.8e-01 4401 3224 1.0
-#> log_lik[98] -1.0e+00 5.7e-03 0.38 0.37 -1.7 -1.0e+00 -5.2e-01 4630 2912 1.0
-#> log_lik[99] -6.9e-01 2.1e-03 0.14 0.13 -0.94 -6.8e-01 -4.8e-01 4146 3139 1.0
-#> log_lik[100] -3.9e-01 1.5e-03 0.096 0.096 -0.56 -3.8e-01 -2.5e-01 4330 2963 1.0
-#>
-#> Samples were drawn using hmc with nuts.
-#> For each parameter, ESS_bulk and ESS_tail measure the effective sample size for the entire sample (bulk) and for the .05 and .95 tails (tail),
-#> and R_hat measures the potential scale reduction on split chains. At convergence R_hat will be very close to 1.00.
-# }
-
-Return Stan code
-code()
A character vector with one element per line of code.
-
-# \dontrun{
-fit <- cmdstanr_example()
-fit$code() # character vector
-#> [1] "data {"
-#> [2] " int<lower=0> N;"
-#> [3] " int<lower=0> K;"
-#> [4] " array[N] int<lower=0, upper=1> y;"
-#> [5] " matrix[N, K] X;"
-#> [6] "}"
-#> [7] "parameters {"
-#> [8] " real alpha;"
-#> [9] " vector[K] beta;"
-#> [10] "}"
-#> [11] "model {"
-#> [12] " target += normal_lpdf(alpha | 0, 1);"
-#> [13] " target += normal_lpdf(beta | 0, 1);"
-#> [14] " target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);"
-#> [15] "}"
-#> [16] "generated quantities {"
-#> [17] " vector[N] log_lik;"
-#> [18] " for (n in 1 : N) {"
-#> [19] " log_lik[n] = bernoulli_logit_lpmf(y[n] | alpha + X[n] * beta);"
-#> [20] " }"
-#> [21] "}"
-cat(fit$code(), sep = "\n") # pretty print
-#> data {
-#> int<lower=0> N;
-#> int<lower=0> K;
-#> array[N] int<lower=0, upper=1> y;
-#> matrix[N, K] X;
-#> }
-#> parameters {
-#> real alpha;
-#> vector[K] beta;
-#> }
-#> model {
-#> target += normal_lpdf(alpha | 0, 1);
-#> target += normal_lpdf(beta | 0, 1);
-#> target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
-#> }
-#> generated quantities {
-#> vector[N] log_lik;
-#> for (n in 1 : N) {
-#> log_lik[n] = bernoulli_logit_lpmf(y[n] | alpha + X[n] * beta);
-#> }
-#> }
-# }
-
-R/fit.R
- fit-method-constrain_variables.Rd
The $constrain_variables() method transforms input parameters
-to the constrained scale.
constrain_variables(
- unconstrained_variables,
- transformed_parameters = TRUE,
- generated_quantities = TRUE
-)
unconstrained_variables: (numeric) A vector of unconstrained parameters
-to constrain.
transformed_parameters: (logical) Whether to return transformed
-parameters implied by newly-constrained parameters (defaults to TRUE).
generated_quantities: (logical) Whether to return generated quantities
-implied by newly-constrained parameters (defaults to TRUE).
log_prob(), grad_log_prob(), constrain_variables(),
-unconstrain_variables(), unconstrain_draws(), variable_skeleton(),
-hessian()
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-fit_mcmc$constrain_variables(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
-#> $alpha
-#> [1] 0.5
-#>
-#> $beta
-#> [1] 1.2 1.1 2.2
-#>
-#> $log_lik
-#> [1] -0.671996961 -0.076521408 -0.050097680 -0.859061944 -4.368940350
-#> [6] -0.026348601 -1.399709850 -1.607383090 -0.197580264 -0.161519876
-#> [11] -0.020289680 -0.005109469 -3.391453198 -0.438505774 -0.019831358
-#> [16] -0.320576250 -0.491982174 -0.985897601 -0.760711848 -0.135968222
-#> [21] -0.090365458 -0.289106971 -2.401995550 -0.741279075 -1.130539252
-#> [26] -0.051754355 -0.052810285 -0.592894549 -4.203884680 -1.627902707
-#> [31] -0.086396696 -0.055223416 -0.702223347 -1.437214411 -4.277397542
-#> [36] -3.430371053 -0.029399304 -2.582700752 -0.082096143 -0.613542693
-#> [41] -1.631715761 -0.780555873 -8.456425184 -1.129873174 -0.803760727
-#> [46] -0.375921603 -0.200697408 -1.179516083 -0.288812923 -0.028251618
-#> [51] -2.203830461 -0.542549041 -0.108332542 -1.410224678 -0.176267923
-#> [56] -6.137792581 -0.301999796 -0.002415079 -1.082073078 -0.661755204
-#> [61] -0.906463865 -0.022520669 -0.200599774 -0.088896080 -4.807633540
-#> [66] -0.027891348 -1.129060197 -0.378523990 -0.287530851 -4.083209231
-#> [71] -4.503703248 -2.056130808 -0.409881228 -2.435290917 -0.022311747
-#> [76] -0.310871780 -1.558477854 -3.416918272 -0.143495888 -0.017064658
-#> [81] -1.422948018 -0.019539438 -0.383208393 -0.083042642 -0.216009949
-#> [86] -3.175676338 -0.547416219 -3.767462435 -1.854701489 -3.595218847
-#> [91] -1.713321036 -4.146153592 -1.389571073 -0.344785402 -0.305319997
-#> [96] -1.233113298 -1.753232181 -0.002618034 -0.608156530 -1.329013367
-#>
-# }
-
-Warnings and summaries of sampler diagnostics. To instead get
-the underlying values of the sampler diagnostics for each iteration and
-chain use the $sampler_diagnostics()
-method.
Currently parameter-specific diagnostics like R-hat and effective sample
-size are not handled by this method. Those diagnostics are provided via
-the $summary() method (using
-posterior::summarize_draws()).
diagnostic_summary(
- diagnostics = c("divergences", "treedepth", "ebfmi"),
- quiet = FALSE
-)(character vector) One or more diagnostics to check. The
-currently supported diagnostics are "divergences", "treedepth", and
-"ebfmi". The default is to check all of them.
(logical) Should warning messages about the diagnostics be
-suppressed? The default is FALSE, in which case warning messages are
-printed in addition to returning the values of the diagnostics.
A list with as many named elements as diagnostics selected. The
-possible elements and their values are:
"num_divergent": A vector of the number of divergences per chain.
"num_max_treedepth": A vector of the number of times max_treedepth was hit per chain.
"ebfmi": A vector of E-BFMI values per chain.
CmdStanMCMC and the
-$sampler_diagnostics() method
# \dontrun{
-fit <- cmdstanr_example("schools")
-#> Warning: 191 of 4000 (5.0%) transitions ended with a divergence.
-#> See https://mc-stan.org/misc/warnings for details.
-#> Warning: 2 of 4 chains had an E-BFMI less than 0.3.
-#> See https://mc-stan.org/misc/warnings for details.
-fit$diagnostic_summary()
-#> Warning: 191 of 4000 (5.0%) transitions ended with a divergence.
-#> See https://mc-stan.org/misc/warnings for details.
-#> Warning: 2 of 4 chains had an E-BFMI less than 0.3.
-#> See https://mc-stan.org/misc/warnings for details.
-#> $num_divergent
-#> [1] 23 107 22 39
-#>
-#> $num_max_treedepth
-#> [1] 0 0 0 0
-#>
-#> $ebfmi
-#> [1] 0.4160269 0.2285873 0.3803260 0.2845859
-#>
-fit$diagnostic_summary(quiet = TRUE)
-#> $num_divergent
-#> [1] 23 107 22 39
-#>
-#> $num_max_treedepth
-#> [1] 0 0 0 0
-#>
-#> $ebfmi
-#> [1] 0.4160269 0.2285873 0.3803260 0.2845859
-#>
-# }
-
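A short base R sketch of how the returned list might be used programmatically. The values are copied from the example output for the "schools" fit; with a real fit the list would come from fit$diagnostic_summary(quiet = TRUE). The object names are illustrative, and the 0.3 cutoff is taken from the warning text rather than from any documented argument.

```r
# Values copied from the example output; a stand-in for
# fit$diagnostic_summary(quiet = TRUE)
diag <- list(
  num_divergent = c(23, 107, 22, 39),
  num_max_treedepth = c(0, 0, 0, 0),
  ebfmi = c(0.4160269, 0.2285873, 0.3803260, 0.2845859)
)

# Total number of divergent transitions across chains
total_divergent <- sum(diag$num_divergent)

# Chains whose E-BFMI falls below the 0.3 threshold named in the warning
low_ebfmi_chains <- which(diag$ebfmi < 0.3)

total_divergent
low_ebfmi_chains
```

The totals agree with the warnings printed by the example: 191 divergent transitions, and two chains (2 and 4) with E-BFMI below 0.3.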
Extract posterior draws after MCMC or approximate posterior
-draws after variational approximation using formats provided by the
-posterior package.
-The variables include the parameters, transformed parameters, and
-generated quantities from the Stan program as well as lp__, the total
-log probability (target) accumulated in the model block.
draws(
- variables = NULL,
- inc_warmup = FALSE,
- format = getOption("cmdstanr_draws_format")
-)
variables: (character vector) Optionally, the names of the variables
-(parameters, transformed parameters, and generated quantities) to read in.
If NULL (the default) then all variables are included.
If an empty string (variables="") then none are included.
For non-scalar variables all elements or specific elements can be selected:
variables = "theta" selects all elements of theta;
variables = c("theta[1]", "theta[3]") selects only the 1st and 3rd elements.
inc_warmup: (logical) Should warmup draws be included? Defaults to
-FALSE. Ignored except when used with CmdStanMCMC objects.
format: (string) The format of the returned draws or point estimates.
-Must be a valid format from the posterior package. The defaults
-are the following.
For sampling and generated quantities the default is
-"draws_array". This format keeps the chains
-separate. To combine the chains use any of the other formats (e.g.
-"draws_matrix").
For point estimates from optimization and approximate draws from
-variational inference the default is
-"draws_matrix".
To use a different format it can be specified as the full name of the
-format from the posterior package (e.g. format = "draws_df") or
-omitting the "draws_" prefix (e.g. format = "df").
Changing the default format: To change the default format for an entire
-R session use options(cmdstanr_draws_format = format), where format is
-the name (in quotes) of a valid format from the posterior package. For
-example options(cmdstanr_draws_format = "draws_df") will change the
-default to a data frame.
Note about efficiency: For models with a large number of parameters
-(20k+) we recommend using the "draws_list" format, which is the most
-efficient and memory-friendly when combining draws from multiple chains. If
-speed and memory are not constraints, we recommend selecting the format that
-best suits the coding style of the post-processing phase.
Depends on the value of format. The defaults are:
For MCMC, a 3-D
-draws_array object (iteration x chain x
-variable).
For standalone generated quantities, a
-3-D draws_array object (iteration x chain x
-variable).
For variational inference, a 2-D
-draws_matrix object (draw x variable) because
-there are no chains. An additional variable lp_approx__ is also included,
-which is the log density of the variational approximation to the posterior
-evaluated at each of the draws.
For optimization, a 1-row
-draws_matrix with one column per variable. These
-are not actually draws, just point estimates stored in the draws_matrix
-format. See $mle() to extract them as a numeric vector.
# \dontrun{
-# logistic regression with intercept alpha and coefficients beta
-fit <- cmdstanr_example("logistic", method = "sample")
-
-# returned as 3-D array (see ?posterior::draws_array)
-draws <- fit$draws()
-dim(draws)
-#> [1] 1000 4 105
-str(draws)
-#> 'draws_array' num [1:1000, 1:4, 1:105] -64.4 -64.5 -65.8 -66.7 -66 ...
-#> - attr(*, "dimnames")=List of 3
-#> ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> ..$ variable : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-
-# can easily convert to other formats (data frame, matrix, list)
-# using the posterior package
-head(posterior::as_draws_matrix(draws))
-#> # A draws_matrix: 6 iterations, 1 chains, and 105 variables
-#> variable
-#> draw lp__ alpha beta[1] beta[2] beta[3] log_lik[1] log_lik[2] log_lik[3]
-#> 1 -64 0.37 -0.73 -0.28 0.46 -0.49 -0.47 -0.49
-#> 2 -65 0.37 -0.54 -0.24 0.89 -0.54 -0.31 -0.44
-#> 3 -66 0.44 -1.06 -0.43 0.84 -0.46 -0.30 -0.44
-#> 4 -67 0.49 -0.81 0.15 0.92 -0.42 -0.24 -0.16
-#> 5 -66 0.34 -0.51 -0.53 0.30 -0.55 -0.67 -0.87
-#> 6 -66 0.38 -0.42 -0.44 0.34 -0.54 -0.66 -0.76
-#> # ... with 97 more variables
-
-# or can specify 'format' argument to avoid manual conversion
-# matrix format combines all chains
-draws <- fit$draws(format = "matrix")
-head(draws)
-#> # A draws_matrix: 6 iterations, 1 chains, and 105 variables
-#> variable
-#> draw lp__ alpha beta[1] beta[2] beta[3] log_lik[1] log_lik[2] log_lik[3]
-#> 1 -64 0.37 -0.73 -0.28 0.46 -0.49 -0.47 -0.49
-#> 2 -65 0.37 -0.54 -0.24 0.89 -0.54 -0.31 -0.44
-#> 3 -66 0.44 -1.06 -0.43 0.84 -0.46 -0.30 -0.44
-#> 4 -67 0.49 -0.81 0.15 0.92 -0.42 -0.24 -0.16
-#> 5 -66 0.34 -0.51 -0.53 0.30 -0.55 -0.67 -0.87
-#> 6 -66 0.38 -0.42 -0.44 0.34 -0.54 -0.66 -0.76
-#> # ... with 97 more variables
-
-# can select specific parameters
-fit$draws("alpha")
-#> # A draws_array: 1000 iterations, 4 chains, and 1 variables
-#> , , variable = alpha
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 0.37 0.152 0.089 0.26
-#> 2 0.37 0.301 0.658 0.22
-#> 3 0.44 0.685 0.554 0.52
-#> 4 0.49 0.049 0.734 0.51
-#> 5 0.34 0.171 0.797 0.22
-#>
-#> # ... with 995 more iterations
-fit$draws("beta") # selects entire vector beta
-#> # A draws_array: 1000 iterations, 4 chains, and 3 variables
-#> , , variable = beta[1]
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 -0.73 -0.57 -0.63 -0.95
-#> 2 -0.54 -0.63 -0.67 -0.64
-#> 3 -1.06 -0.85 -1.14 -0.59
-#> 4 -0.81 -0.58 -0.70 -0.67
-#> 5 -0.51 -0.60 -0.68 -1.30
-#>
-#> , , variable = beta[2]
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 -0.28 -0.615 -0.432 -0.21
-#> 2 -0.24 -0.304 -0.049 -0.14
-#> 3 -0.43 -0.431 -0.395 0.06
-#> 4 0.15 -0.192 -0.265 -0.13
-#> 5 -0.53 -0.018 -0.289 -0.19
-#>
-#> , , variable = beta[3]
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 0.46 0.87 0.53 0.50
-#> 2 0.89 0.84 0.89 0.22
-#> 3 0.84 0.83 0.47 0.35
-#> 4 0.92 0.71 1.36 0.67
-#> 5 0.30 0.42 1.40 0.59
-#>
-#> # ... with 995 more iterations
-fit$draws(c("alpha", "beta[2]"))
-#> # A draws_array: 1000 iterations, 4 chains, and 2 variables
-#> , , variable = alpha
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 0.37 0.152 0.089 0.26
-#> 2 0.37 0.301 0.658 0.22
-#> 3 0.44 0.685 0.554 0.52
-#> 4 0.49 0.049 0.734 0.51
-#> 5 0.34 0.171 0.797 0.22
-#>
-#> , , variable = beta[2]
-#>
-#> chain
-#> iteration 1 2 3 4
-#> 1 -0.28 -0.615 -0.432 -0.21
-#> 2 -0.24 -0.304 -0.049 -0.14
-#> 3 -0.43 -0.431 -0.395 0.06
-#> 4 0.15 -0.192 -0.265 -0.13
-#> 5 -0.53 -0.018 -0.289 -0.19
-#>
-#> # ... with 995 more iterations
-
-# can be passed directly to bayesplot plotting functions
-bayesplot::color_scheme_set("brightblue")
-bayesplot::mcmc_dens(fit$draws(c("alpha", "beta")))
-
-bayesplot::mcmc_scatter(fit$draws(c("beta[1]", "beta[2]")), alpha = 0.3)
-
-
-
-# example using variational inference
-fit <- cmdstanr_example("logistic", method = "variational")
-head(fit$draws("beta")) # a matrix by default
-#> # A draws_matrix: 6 iterations, 1 chains, and 3 variables
-#> variable
-#> draw beta[1] beta[2] beta[3]
-#> 1 -0.70 -0.15 0.690
-#> 2 -0.69 -0.54 0.897
-#> 3 -0.41 -0.67 0.323
-#> 4 -0.93 -0.51 0.460
-#> 5 -0.77 -0.39 0.031
-#> 6 -0.64 -0.16 0.711
-head(fit$draws("beta", format = "df"))
-#> # A draws_df: 6 iterations, 1 chains, and 3 variables
-#> beta[1] beta[2] beta[3]
-#> 1 -0.70 -0.15 0.690
-#> 2 -0.69 -0.54 0.897
-#> 3 -0.41 -0.67 0.323
-#> 4 -0.93 -0.51 0.460
-#> 5 -0.77 -0.39 0.031
-#> 6 -0.64 -0.16 0.711
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-# }
-
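As a rough illustration of how the draws formats relate, the sketch below converts the example draws shipped with the posterior package instead of a CmdStan fit, so it runs without CmdStan installed. It assumes the posterior package is available; the object names `x` and `m` are illustrative.

```r
library(posterior)

# Example draws bundled with posterior
# (a draws_array: iteration x chain x variable)
x <- example_draws()

# Converting to a matrix gives one row per draw, pooling all chains
m <- as_draws_matrix(x)

# The number of draws and variables is preserved across formats
ndraws(m) == niterations(x) * nchains(x)
nvariables(m) == nvariables(x)
```

The same conversions apply to the object returned by fit$draws(), since it is a posterior draws object.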
-R/fit.R
- fit-method-grad_log_prob.Rd
The $grad_log_prob() method provides access to the Stan
-model's log_prob function and its derivative.
grad_log_prob(
- unconstrained_variables,
- jacobian = TRUE,
- jacobian_adjustment = NULL
-)
unconstrained_variables: (numeric) A vector of unconstrained parameters.
jacobian: (logical) Whether to include the log-density adjustments from
-un/constraining variables.
jacobian_adjustment: Deprecated. Please use jacobian instead.
log_prob(), grad_log_prob(), constrain_variables(),
-unconstrain_variables(), unconstrain_draws(), variable_skeleton(),
-hessian()
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-fit_mcmc$grad_log_prob(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
-#> [1] 1.462151 -26.619534 -25.528776 -14.286822
-#> attr(,"log_prob")
-#> [1] -130.2141
-# }
-
Return the data frame containing the gradients for all
-parameters.
-gradients()
A data frame. See Examples.
-# \dontrun{
-test <- cmdstanr_example("logistic", method = "diagnose")
-
-# retrieve the gradients
-test$gradients()
-#> param_idx value model finite_diff error
-#> 1 0 -1.956090 42.51440 42.51440 -1.09401e-08
-#> 2 1 -0.130427 -13.33070 -13.33070 -1.40895e-08
-#> 3 2 1.228760 -24.25270 -24.25270 -4.45662e-08
-#> 4 3 0.625050 3.12825 3.12825 -3.33963e-09
-# }
-
-R/fit.R
- fit-method-hessian.Rd
The $hessian() method provides access to the Stan model's
-log_prob, its derivative, and its Hessian.
hessian(unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
unconstrained_variables: (numeric) A vector of unconstrained parameters.
jacobian: (logical) Whether to include the log-density adjustments from
-un/constraining variables.
jacobian_adjustment: Deprecated. Please use jacobian instead.
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-# fit_mcmc$init_model_methods(hessian = TRUE)
-# fit_mcmc$hessian(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
-# }
-
-Return user-specified initial values. If the user provided
-initial values files or R objects (list of lists or function) via the
-init argument when fitting the model then these are returned (always in
-the list of lists format). Currently it is not possible to extract initial
-values generated automatically by CmdStan, although CmdStan may support
-this in the future.
init()
A list of lists. See Examples.
-# \dontrun{
-init_fun <- function() list(alpha = rnorm(1), beta = rnorm(3))
-fit <- cmdstanr_example("logistic", init = init_fun, chains = 2)
-str(fit$init())
-#> List of 2
-#> $ :List of 2
-#> ..$ alpha: num -1.19
-#> ..$ beta : num [1:3] -0.0532 0.2552 1.706
-#> $ :List of 2
-#> ..$ alpha: num 1
-#> ..$ beta : num [1:3] -0.496 0.356 -1.135
-
-# partial inits (only specifying for a subset of parameters)
-init_list <- list(
- list(mu = 10, tau = 2),
- list(mu = -10, tau = 1)
-)
-fit <- cmdstanr_example("schools_ncp", init = init_list, chains = 2, adapt_delta = 0.9)
-#> Init values were only set for a subset of parameters.
-#> Missing init values for the following parameters:
-#> - chain 1: theta_raw
-#> - chain 2: theta_raw
-#>
-#> To disable this message use options(cmdstanr_warn_inits = FALSE).
-
-# only user-specified inits returned
-str(fit$init())
-#> List of 2
-#> $ :List of 2
-#> ..$ mu : int 10
-#> ..$ tau: int 2
-#> $ :List of 2
-#> ..$ mu : int -10
-#> ..$ tau: int 1
-# }
-
-R/fit.R
- fit-method-init_model_methods.Rd
The $init_model_methods() method compiles and initializes the
-log_prob, grad_log_prob, constrain_variables, unconstrain_variables
-and unconstrain_draws functions. These are then available as methods of
-the fitted model object. This requires the additional Rcpp package,
-which is not required for fitting models using
-CmdStanR.
Note: there may be many compiler warnings emitted during compilation but
-these can be ignored so long as they are warnings and not errors.
-init_model_methods(seed = 1, verbose = FALSE, hessian = FALSE)
seed: (integer) The random seed to use when initializing the model.
verbose: (logical) Whether to show verbose logging during compilation.
hessian: (logical) Whether to expose the (experimental) hessian method.
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-# }
-Extract the inverse metric (mass matrix) for each MCMC chain.
-inv_metric(matrix = TRUE)
matrix: (logical) If a diagonal metric was used, setting matrix = FALSE returns a list containing just the diagonals of the matrices instead
-of the full matrices. Setting matrix = FALSE has no effect for dense
-metrics.
A list of length equal to the number of MCMC chains. See the matrix
argument for details.
-# \dontrun{
-fit <- cmdstanr_example("logistic")
-fit$inv_metric()
-#> $`1`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0399675 0.0000000 0.000000 0.0000000
-#> [2,] 0.0000000 0.0581864 0.000000 0.0000000
-#> [3,] 0.0000000 0.0000000 0.045564 0.0000000
-#> [4,] 0.0000000 0.0000000 0.000000 0.0657117
-#>
-#> $`2`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.042812 0.0000000 0.0000000 0.0000000
-#> [2,] 0.000000 0.0677379 0.0000000 0.0000000
-#> [3,] 0.000000 0.0000000 0.0505126 0.0000000
-#> [4,] 0.000000 0.0000000 0.0000000 0.0732512
-#>
-#> $`3`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0421817 0.000000 0.0000000 0.0000000
-#> [2,] 0.0000000 0.062998 0.0000000 0.0000000
-#> [3,] 0.0000000 0.000000 0.0483093 0.0000000
-#> [4,] 0.0000000 0.000000 0.0000000 0.0806989
-#>
-#> $`4`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0458222 0.0000000 0.0000000 0.0000000
-#> [2,] 0.0000000 0.0724218 0.0000000 0.0000000
-#> [3,] 0.0000000 0.0000000 0.0524932 0.0000000
-#> [4,] 0.0000000 0.0000000 0.0000000 0.0726335
-#>
-fit$inv_metric(matrix=FALSE)
-#> $`1`
-#> [1] 0.0399675 0.0581864 0.0455640 0.0657117
-#>
-#> $`2`
-#> [1] 0.0428120 0.0677379 0.0505126 0.0732512
-#>
-#> $`3`
-#> [1] 0.0421817 0.0629980 0.0483093 0.0806989
-#>
-#> $`4`
-#> [1] 0.0458222 0.0724218 0.0524932 0.0726335
-#>
-
-fit <- cmdstanr_example("logistic", metric = "dense_e")
-fit$inv_metric()
-#> $`1`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.048423300 0.001823680 0.00659113 -0.000666867
-#> [2,] 0.001823680 0.055877400 -0.00378891 0.000328776
-#> [3,] 0.006591130 -0.003788910 0.04607360 -0.010493400
-#> [4,] -0.000666867 0.000328776 -0.01049340 0.061314400
-#>
-#> $`2`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.055615500 -0.004688890 0.000366103 0.00657168
-#> [2,] -0.004688890 0.070634600 0.000161933 -0.00900785
-#> [3,] 0.000366103 0.000161933 0.043863800 -0.01242910
-#> [4,] 0.006571680 -0.009007850 -0.012429100 0.06169190
-#>
-#> $`3`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.046092000 0.000277152 -0.000196182 0.00653545
-#> [2,] 0.000277152 0.052935700 -0.001316380 -0.00135849
-#> [3,] -0.000196182 -0.001316380 0.052723400 -0.00738233
-#> [4,] 0.006535450 -0.001358490 -0.007382330 0.06829300
-#>
-#> $`4`
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.05362400 -0.00182030 0.00998512 -0.00383164
-#> [2,] -0.00182030 0.06051710 0.00414984 -0.01575160
-#> [3,] 0.00998512 0.00414984 0.04282750 -0.01094520
-#> [4,] -0.00383164 -0.01575160 -0.01094520 0.07697140
-#>
-# }
-
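For a diagonal metric, the relationship between the two return shapes can be sketched in base R: the vectors returned with matrix = FALSE are the diagonals of the full matrices. The values below are copied from chains 1 and 2 of the example output; the object names are illustrative and no fit is required.

```r
# Full (diagonal) inverse metrics for two chains, values from the example output
inv_metric_full <- list(
  `1` = diag(c(0.0399675, 0.0581864, 0.0455640, 0.0657117)),
  `2` = diag(c(0.0428120, 0.0677379, 0.0505126, 0.0732512))
)

# Extract only the diagonals, mirroring the shape returned with matrix = FALSE
inv_metric_diag <- lapply(inv_metric_full, diag)

str(inv_metric_diag)
```

For a dense metric the off-diagonal entries are nonzero, so this reduction would discard information; that is consistent with matrix = FALSE having no effect in that case.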
-Extract the inverse metric (mass matrix) for each chain.
-$inverse_metric(matrix = TRUE)
matrix: (logical) Should a list of matrices be returned? By default a
-list of matrices is always returned, even if a diagonal metric was used when
-fitting the model. If a diagonal metric was used then setting matrix=FALSE
-will return a list of vectors instead, which uses less memory.
A list of length equal to the number of MCMC chains. See the matrix
-argument for details.
-#>
-fit$inverse_metric()
-#> [[1]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0489704 0.0000000 0.0000000 0.0000000
-#> [2,] 0.0000000 0.0565619 0.0000000 0.0000000
-#> [3,] 0.0000000 0.0000000 0.0505927 0.0000000
-#> [4,] 0.0000000 0.0000000 0.0000000 0.0765553
-#>
-#> [[2]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0464694 0.0000000 0.0000000 0.0000000
-#> [2,] 0.0000000 0.0502202 0.0000000 0.0000000
-#> [3,] 0.0000000 0.0000000 0.0505695 0.0000000
-#> [4,] 0.0000000 0.0000000 0.0000000 0.0764519
-#>
-#> [[3]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0508438 0.0000000 0.0000000 0.0000000
-#> [2,] 0.0000000 0.0673253 0.0000000 0.0000000
-#> [3,] 0.0000000 0.0000000 0.0560788 0.0000000
-#> [4,] 0.0000000 0.0000000 0.0000000 0.0788151
-#>
-#> [[4]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.0448417 0.0000000 0.0000000 0.0000000
-#> [2,] 0.0000000 0.0581256 0.0000000 0.0000000
-#> [3,] 0.0000000 0.0000000 0.0553186 0.0000000
-#> [4,] 0.0000000 0.0000000 0.0000000 0.0710309
-#>
-fit$inverse_metric(matrix=FALSE)
-#> [[1]]
-#> [1] 0.0489704 0.0565619 0.0505927 0.0765553
-#>
-#> [[2]]
-#> [1] 0.0464694 0.0502202 0.0505695 0.0764519
-#>
-#> [[3]]
-#> [1] 0.0508438 0.0673253 0.0560788 0.0788151
-#>
-#> [[4]]
-#> [1] 0.0448417 0.0581256 0.0553186 0.0710309
-#>
-#>
-fit$inverse_metric()
-#> [[1]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.04622490 -0.00200878 -0.00532990 0.00912209
-#> [2,] -0.00200878 0.05868030 -0.00111311 -0.00606391
-#> [3,] -0.00532990 -0.00111311 0.04705130 -0.01343920
-#> [4,] 0.00912209 -0.00606391 -0.01343920 0.06636260
-#>
-#> [[2]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.048247300 0.000662222 0.00412900 0.00605341
-#> [2,] 0.000662222 0.066143300 -0.01073320 0.00407476
-#> [3,] 0.004129000 -0.010733200 0.05326770 -0.00932996
-#> [4,] 0.006053410 0.004074760 -0.00932996 0.07355610
-#>
-#> [[3]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.04912670 -0.00264697 0.00630654 -0.00385016
-#> [2,] -0.00264697 0.06148540 -0.01042830 -0.01020800
-#> [3,] 0.00630654 -0.01042830 0.06169950 -0.01587110
-#> [4,] -0.00385016 -0.01020800 -0.01587110 0.07391140
-#>
-#> [[4]]
-#> [,1] [,2] [,3] [,4]
-#> [1,] 0.040504300 0.000998829 -0.000784530 0.00366177
-#> [2,] 0.000998829 0.051848800 0.000382584 -0.00376364
-#> [3,] -0.000784530 0.000382584 0.044784800 -0.01134560
-#> [4,] 0.003661770 -0.003763640 -0.011345600 0.06060850
-#>
-# }
-
R/fit.R
- fit-method-log_prob.Rd
-The $log_prob() method provides access to the Stan model's
-log_prob function.
log_prob(unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
unconstrained_variables: (numeric) A vector of unconstrained parameters.
jacobian: (logical) Whether to include the log-density adjustments from
-un/constraining variables.
jacobian_adjustment: Deprecated. Please use jacobian instead.
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-fit_mcmc$log_prob(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
-#> [1] -130.2141
-# }
-
-The $loo() method computes approximate LOO-CV using the
-loo package. In order to use this method you must compute and save
-the pointwise log-likelihood in your Stan program. See loo::loo.array()
-and the loo package vignettes
-for details.
loo(variables = "log_lik", r_eff = TRUE, moment_match = FALSE, ...)
variables: (string) The name of the variable in the Stan program
-containing the pointwise log-likelihood. The default is to look for
-"log_lik". This argument is passed to the $draws()
-method.
r_eff: (multiple options) How to handle the r_eff argument for loo():
TRUE (the default) will automatically call loo::relative_eff.array()
-to compute the r_eff argument to pass to loo::loo.array().
FALSE or NULL will avoid computing r_eff (which can sometimes be slow),
-but the reported ESS and MCSE estimates can be over-optimistic if the
-posterior draws are not (near) independent.
If r_eff is anything else, that object will be passed as the r_eff
-argument to loo::loo.array().
moment_match: (logical) Whether to use a
-moment-matching correction for problematic
-observations. The default is FALSE. Using moment_match=TRUE will result
-in compiling the additional methods described in
-fit-method-init_model_methods. This allows CmdStanR to automatically
-supply the functions for the log_lik_i, unconstrain_pars,
-log_prob_upars, and log_lik_i_upars arguments to
-loo::loo_moment_match().
...: Other arguments (e.g., cores, save_psis) passed to
-loo::loo.array() or loo::loo_moment_match.default()
-(if moment_match = TRUE is set).
The object returned by loo::loo.array() or
-loo::loo_moment_match.default().
The loo package website with
-documentation and
-vignettes.
-# \dontrun{
-# the "logistic" example model has "log_lik" in generated quantities
-fit <- cmdstanr_example("logistic")
-loo_result <- fit$loo(cores = 2)
-print(loo_result)
-#>
-#> Computed from 4000 by 100 log-likelihood matrix.
-#>
-#> Estimate SE
-#> elpd_loo -63.7 4.1
-#> p_loo 3.9 0.5
-#> looic 127.4 8.3
-#> ------
-#> MCSE of elpd_loo is 0.0.
-#> MCSE and ESS estimates assume MCMC draws (r_eff in [0.9, 1.4]).
-#>
-#> All Pareto k estimates are good (k < 0.7).
-#> See help('pareto-k-diagnostic') for details.
-# }
-
-The $lp() method extracts lp__, the total log probability
-(target) accumulated in the model block of the Stan program. For
-variational inference the log density of the variational approximation to
-the posterior is available via the $lp_approx() method. For
-Laplace approximation the unnormalized density of the approximation to
-the posterior is available via the $lp_approx() method.
See the Increment log density and Distribution Statements
-sections of the Stan Reference Manual for details on when normalizing
-constants are dropped from log probability calculations.
-lp()
-
-lp_approx()
-
-lp_approx()
A numeric vector with length equal to the number of (post-warmup)
-draws or length equal to 1 for optimization.
lp__ is the unnormalized log density on Stan's unconstrained space.
-This will in general be different than the unnormalized model log density
-evaluated at a posterior draw (which is on the constrained space). lp__ is
-intended to diagnose sampling efficiency and evaluate approximations.
For variational inference lp_approx__ is the log density of the variational
-approximation to lp__ (also on the unconstrained space). It is exposed in
-the variational method for performing the checks described in Yao et al.
-(2018) and implemented in the loo package.
For Laplace approximation lp_approx__ is the unnormalized density of the
-Laplace approximation. It can be used to perform the same checks as in the
-case of the variational method described in Yao et al. (2018).
Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018). Yes, but did it
-work?: Evaluating variational inference. Proceedings of the 35th
-International Conference on Machine Learning, PMLR 80:5581–5590.
-# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic")
-head(fit_mcmc$lp())
-#> [1] -65.7069 -64.7742 -64.7082 -65.3051 -65.2394 -65.4373
-
-fit_mle <- cmdstanr_example("logistic", method = "optimize")
-fit_mle$lp()
-#> [1] -63.9218
-
-fit_vb <- cmdstanr_example("logistic", method = "variational")
-plot(fit_vb$lp(), fit_vb$lp_approx())
-
-# }
-
-The $metadata() method returns a list of information gathered
-from the CSV output files, including the CmdStan configuration used when
-fitting the model. See Examples and read_cmdstan_csv().
metadata()
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample")
-str(fit_mcmc$metadata())
-#> List of 42
-#> $ stan_version_major : num 2
-#> $ stan_version_minor : num 36
-#> $ stan_version_patch : num 0
-#> $ start_datetime : chr "2025-03-31 14:45:33 UTC"
-#> $ method : chr "sample"
-#> $ save_warmup : int 0
-#> $ thin : num 1
-#> $ gamma : num 0.05
-#> $ kappa : num 0.75
-#> $ t0 : num 10
-#> $ init_buffer : num 75
-#> $ term_buffer : num 50
-#> $ window : num 25
-#> $ save_metric : int 0
-#> $ algorithm : chr "hmc"
-#> $ engine : chr "nuts"
-#> $ metric : chr "diag_e"
-#> $ stepsize_jitter : num 0
-#> $ num_chains : num 1
-#> $ id : num [1:4] 1 2 3 4
-#> $ init : num [1:4] 2 2 2 2
-#> $ seed : num 2.88e+08
-#> $ refresh : num 100
-#> $ sig_figs : num -1
-#> $ profile_file : chr "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310845-1-2f9a61.csv"
-#> $ save_cmdstan_config : int 0
-#> $ stanc_version : chr "stanc3 v2.36.0"
-#> $ sampler_diagnostics : chr [1:6] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__" ...
-#> $ variables : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-#> $ step_size_adaptation: num [1:4] 0.796 0.801 0.702 0.62
-#> $ model_name : chr "logistic_model"
-#> $ adapt_engaged : int 1
-#> $ adapt_delta : num 0.8
-#> $ max_treedepth : num 10
-#> $ step_size : num [1:4] 1 1 1 1
-#> $ iter_warmup : num 1000
-#> $ iter_sampling : num 1000
-#> $ threads_per_chain : num 1
-#> $ time :'data.frame': 4 obs. of 4 variables:
-#> ..$ chain_id: num [1:4] 1 2 3 4
-#> ..$ warmup : num [1:4] 0.024 0.023 0.022 0.022
-#> ..$ sampling: num [1:4] 0.073 0.071 0.076 0.075
-#> ..$ total : num [1:4] 0.097 0.094 0.098 0.097
-#> $ stan_variable_sizes :List of 4
-#> ..$ lp__ : num 1
-#> ..$ alpha : num 1
-#> ..$ beta : num 3
-#> ..$ log_lik: num 100
-#> $ stan_variables : chr [1:4] "lp__" "alpha" "beta" "log_lik"
-#> $ model_params : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-
-fit_mle <- cmdstanr_example("logistic", method = "optimize")
-str(fit_mle$metadata())
-#> List of 32
-#> $ stan_version_major : num 2
-#> $ stan_version_minor : num 36
-#> $ stan_version_patch : num 0
-#> $ start_datetime : chr "2025-03-31 14:45:35 UTC"
-#> $ method : chr "optimize"
-#> $ algorithm : chr "lbfgs"
-#> $ init_alpha : num 0.001
-#> $ tol_obj : num 1e-12
-#> $ tol_rel_obj : num 10000
-#> $ tol_grad : num 1e-08
-#> $ tol_rel_grad : num 1e+07
-#> $ tol_param : num 1e-08
-#> $ history_size : num 5
-#> $ jacobian : int 0
-#> $ iter : num 2000
-#> $ save_iterations : int 0
-#> $ id : num 1
-#> $ init : num 2
-#> $ seed : num 2.1e+09
-#> $ refresh : num 100
-#> $ sig_figs : num -1
-#> $ profile_file : chr "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310845-1-2604a8.csv"
-#> $ save_cmdstan_config: int 0
-#> $ stanc_version : chr "stanc3 v2.36.0"
-#> $ sampler_diagnostics: chr(0)
-#> $ variables : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-#> $ model_name : chr "logistic_model"
-#> $ threads : num 1
-#> $ time :'data.frame': 0 obs. of 0 variables
-#> $ stan_variable_sizes:List of 4
-#> ..$ lp__ : num 1
-#> ..$ alpha : num 1
-#> ..$ beta : num 3
-#> ..$ log_lik: num 100
-#> $ stan_variables : chr [1:4] "lp__" "alpha" "beta" "log_lik"
-#> $ model_params : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-
-fit_vb <- cmdstanr_example("logistic", method = "variational")
-str(fit_vb$metadata())
-#> List of 30
-#> $ stan_version_major : num 2
-#> $ stan_version_minor : num 36
-#> $ stan_version_patch : num 0
-#> $ start_datetime : chr "2025-03-31 14:45:35 UTC"
-#> $ method : chr "variational"
-#> $ algorithm : chr "meanfield"
-#> $ iter : num 50
-#> $ grad_samples : num 1
-#> $ elbo_samples : num 100
-#> $ eta : num 1
-#> $ tol_rel_obj : num 0.01
-#> $ eval_elbo : num 100
-#> $ output_samples : num 1000
-#> $ id : num 1
-#> $ init : num 2
-#> $ seed : num 9.33e+08
-#> $ refresh : num 100
-#> $ sig_figs : num -1
-#> $ profile_file : chr "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310845-1-4983c4.csv"
-#> $ save_cmdstan_config: int 0
-#> $ stanc_version : chr "stanc3 v2.36.0"
-#> $ sampler_diagnostics: chr(0)
-#> $ variables : chr [1:106] "lp__" "lp_approx__" "alpha" "beta[1]" ...
-#> $ model_name : chr "logistic_model"
-#> $ adapt_engaged : int 1
-#> $ threads : num 1
-#> $ time :'data.frame': 0 obs. of 0 variables
-#> $ stan_variable_sizes:List of 5
-#> ..$ lp__ : num 1
-#> ..$ lp_approx__: num 1
-#> ..$ alpha : num 1
-#> ..$ beta : num 3
-#> ..$ log_lik : num 100
-#> $ stan_variables : chr [1:5] "lp__" "lp_approx__" "alpha" "beta" ...
-#> $ model_params : chr [1:106] "lp__" "lp_approx__" "alpha" "beta[1]" ...
-# }
-
-The $mle() method is only available for CmdStanMLE
-objects. It returns the point estimate as a numeric vector with one element
-per variable. The returned vector does not include lp__, the total log
-probability (target) accumulated in the model block of the Stan program,
-which is available via the $lp() method and also
-included in the $draws() method.
For models with constrained parameters that are fit with jacobian=TRUE,
-the $mle() method actually returns the maximum a posteriori (MAP)
-estimate (posterior mode) rather than the MLE. See
-$optimize() and the CmdStan User's Guide for
-more details.
mle(variables = NULL)
variables: (character vector) The variables (parameters, transformed
-parameters, and generated quantities) to include. If NULL (the default)
-then all variables are included.
A numeric vector. See Examples.
-# \dontrun{
-fit <- cmdstanr_example("logistic", method = "optimize")
-fit$mle("alpha")
-#> alpha
-#> 0.364453
-fit$mle("beta")
-#> beta[1] beta[2] beta[3]
-#> -0.631550 -0.258968 0.648499
-fit$mle("beta[2]")
-#> beta[2]
-#> -0.258968
-# }
-
-The $num_chains() method returns the number of MCMC chains.
num_chains()
An integer.
-# \dontrun{
-fit_mcmc <- cmdstanr_example(chains = 2)
-fit_mcmc$num_chains()
-#> [1] 2
-# }
-
-For MCMC, the $output() method returns the stdout and stderr
-of all chains as a list of character vectors if id=NULL. If the id
-argument is specified it instead pretty prints the console output for a
-single chain.
For optimization and variational inference $output() just pretty prints
-the console output.
output(id = NULL)
id: (integer) The chain id. Ignored if the model was not fit using
-MCMC.
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample")
-fit_mcmc$output(1)
-#>
-#> method = sample (Default)
-#> sample
-#> num_samples = 1000 (Default)
-#> num_warmup = 1000 (Default)
-#> save_warmup = false (Default)
-#> thin = 1 (Default)
-#> adapt
-#> engaged = true (Default)
-#> gamma = 0.05 (Default)
-#> delta = 0.8 (Default)
-#> kappa = 0.75 (Default)
-#> t0 = 10 (Default)
-#> init_buffer = 75 (Default)
-#> term_buffer = 50 (Default)
-#> window = 25 (Default)
-#> save_metric = false (Default)
-#> algorithm = hmc (Default)
-#> hmc
-#> engine = nuts (Default)
-#> nuts
-#> max_depth = 10 (Default)
-#> metric = diag_e (Default)
-#> metric_file = (Default)
-#> stepsize = 1 (Default)
-#> stepsize_jitter = 0 (Default)
-#> num_chains = 1 (Default)
-#> id = 1 (Default)
-#> data
-#> file = /private/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/Rtmp0Jp8Zk/temp_libpathb838752e97c0/cmdstanr/logistic.data.json
-#> init = 2 (Default)
-#> random
-#> seed = 279613479
-#> output
-#> file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-1-28c1ee.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#> sig_figs = -1 (Default)
-#> profile_file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310845-1-0139e1.csv
-#> save_cmdstan_config = false (Default)
-#> num_threads = 1 (Default)
-#>
-#>
-#> Gradient evaluation took 2.7e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.27 seconds.
-#> Adjust your expectations accordingly!
-#>
-#>
-#> Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Iteration: 2000 / 2000 [100%] (Sampling)
-#>
-#> Elapsed Time: 0.022 seconds (Warm-up)
-#> 0.074 seconds (Sampling)
-#> 0.096 seconds (Total)
-out <- fit_mcmc$output()
-str(out)
-#> List of 4
-#> $ : chr [1:75] "" "method = sample (Default)" " sample" " num_samples = 1000 (Default)" ...
-#> $ : chr [1:75] "" "method = sample (Default)" " sample" " num_samples = 1000 (Default)" ...
-#> $ : chr [1:75] "" "method = sample (Default)" " sample" " num_samples = 1000 (Default)" ...
-#> $ : chr [1:75] "" "method = sample (Default)" " sample" " num_samples = 1000 (Default)" ...
-
-fit_mle <- cmdstanr_example("logistic", method = "optimize")
-fit_mle$output()
-#>
-#> method = optimize
-#> optimize
-#> algorithm = lbfgs (Default)
-#> lbfgs
-#> init_alpha = 0.001 (Default)
-#> tol_obj = 1e-12 (Default)
-#> tol_rel_obj = 10000 (Default)
-#> tol_grad = 1e-08 (Default)
-#> tol_rel_grad = 1e+07 (Default)
-#> tol_param = 1e-08 (Default)
-#> history_size = 5 (Default)
-#> jacobian = false (Default)
-#> iter = 2000 (Default)
-#> save_iterations = false (Default)
-#> id = 1 (Default)
-#> data
-#> file = /private/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/Rtmp0Jp8Zk/temp_libpathb838752e97c0/cmdstanr/logistic.data.json
-#> init = 2 (Default)
-#> random
-#> seed = 384468259
-#> output
-#> file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-1-819b35.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#> sig_figs = -1 (Default)
-#> profile_file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310845-1-053c3a.csv
-#> save_cmdstan_config = false (Default)
-#> num_threads = 1 (Default)
-#>
-#> Initial log joint probability = -105.906
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 8 -63.9218 6.03661e-05 0.000635963 0.6801 0.6801 10
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-
-fit_vb <- cmdstanr_example("logistic", method = "variational")
-fit_vb$output()
-#>
-#> method = variational
-#> variational
-#> algorithm = meanfield (Default)
-#> meanfield
-#> iter = 10000 (Default)
-#> grad_samples = 1 (Default)
-#> elbo_samples = 100 (Default)
-#> eta = 1 (Default)
-#> adapt
-#> engaged = true (Default)
-#> iter = 50 (Default)
-#> tol_rel_obj = 0.01 (Default)
-#> eval_elbo = 100 (Default)
-#> output_samples = 1000 (Default)
-#> id = 1 (Default)
-#> data
-#> file = /private/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/Rtmp0Jp8Zk/temp_libpathb838752e97c0/cmdstanr/logistic.data.json
-#> init = 2 (Default)
-#> random
-#> seed = 428491960
-#> output
-#> file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-1-1320f3.csv
-#> diagnostic_file = (Default)
-#> refresh = 100 (Default)
-#> sig_figs = -1 (Default)
-#> profile_file = /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310845-1-101778.csv
-#> save_cmdstan_config = false (Default)
-#> num_threads = 1 (Default)
-#>
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#>
-#>
-#>
-#> Gradient evaluation took 2.7e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.27 seconds.
-#> Adjust your expectations accordingly!
-#>
-#>
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#>
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -67.030 1.000 1.000
-#> 200 -66.410 0.505 1.000
-#> 300 -65.913 0.339 0.009 MEDIAN ELBO CONVERGED
-#>
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-# }
-
-The $profiles() method returns a list of data frames with
-profiling data if any profiling data was written to the profile CSV files.
-See save_profile_files() to control where the files are saved.
Support for profiling Stan programs is available with CmdStan >= 2.26 and
-requires adding profiling statements to the Stan program.
-profiles()
A list of data frames with profiling data if the profiling CSV files
-were created.
-
-# \dontrun{
-# first fit a model using MCMC
-mcmc_program <- write_stan_file(
- 'data {
- int<lower=0> N;
- array[N] int<lower=0,upper=1> y;
- }
- parameters {
- real<lower=0,upper=1> theta;
- }
- model {
- profile("likelihood") {
- y ~ bernoulli(theta);
- }
- }
- generated quantities {
- array[N] int y_rep;
- profile("gq") {
- y_rep = bernoulli_rng(rep_vector(theta, N));
- }
- }
-'
-)
-mod_mcmc <- cmdstan_model(mcmc_program)
-
-data <- list(N = 10, y = c(1,1,0,0,0,1,0,1,0,0))
-fit <- mod_mcmc$sample(data = data, seed = 123, refresh = 0)
-#> Running MCMC with 4 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.6 seconds.
-#>
-
-fit$profiles()
-#> [[1]]
-#> name thread_id total_time forward_time reverse_time chain_stack
-#> 1 likelihood 0x7ff85af4eb00 0.001158940 0.000848855 0.00031008 6721
-#> 2 gq 0x7ff85af4eb00 0.000444795 0.000444795 0.00000000 0
-#> no_chain_stack autodiff_calls no_autodiff_calls
-#> 1 6721 6721 1
-#> 2 0 0 1000
-#>
-#> [[2]]
-#> name thread_id total_time forward_time reverse_time chain_stack
-#> 1 likelihood 0x7ff85af4eb00 0.001207200 0.000885927 0.000321272 6792
-#> 2 gq 0x7ff85af4eb00 0.000453808 0.000453808 0.000000000 0
-#> no_chain_stack autodiff_calls no_autodiff_calls
-#> 1 6792 6792 1
-#> 2 0 0 1000
-#>
-#> [[3]]
-#> name thread_id total_time forward_time reverse_time chain_stack
-#> 1 likelihood 0x7ff85af4eb00 0.001181750 0.000860407 0.000321346 6835
-#> 2 gq 0x7ff85af4eb00 0.000418544 0.000418544 0.000000000 0
-#> no_chain_stack autodiff_calls no_autodiff_calls
-#> 1 6835 6835 1
-#> 2 0 0 1000
-#>
-#> [[4]]
-#> name thread_id total_time forward_time reverse_time chain_stack
-#> 1 likelihood 0x7ff85af4eb00 0.001272150 0.000927098 0.000345053 6955
-#> 2 gq 0x7ff85af4eb00 0.000416536 0.000416536 0.000000000 0
-#> no_chain_stack autodiff_calls no_autodiff_calls
-#> 1 6955 6955 1
-#> 2 0 0 1000
-#>
-# }
-
-The $return_codes() method returns a vector of return codes
-from the CmdStan run(s). A return code of 0 indicates a successful run.
return_codes()
An integer vector of return codes with length equal to the number of
-CmdStan runs (number of chains for MCMC and one otherwise).
-# \dontrun{
-# example with return codes all zero
-fit_mcmc <- cmdstanr_example("schools", method = "sample")
-#> Warning: 81 of 4000 (2.0%) transitions ended with a divergence.
-#> See https://mc-stan.org/misc/warnings for details.
-fit_mcmc$return_codes() # should be all zero
-#> [1] 0 0 0 0
-
-# example of non-zero return code (optimization fails for hierarchical model)
-fit_opt <- cmdstanr_example("schools", method = "optimize")
-#> Chain 1 Optimization terminated with error:
-#> Chain 1 Line search failed to achieve a sufficient decrease, no more progress can be made
-#> Warning: Fitting finished unexpectedly! Use the $output() method for more information.
-fit_opt$return_codes() # should be non-zero
-#> [1] 1
-# }
-
-Extract the values of sampler diagnostics for each iteration and
-chain of MCMC. To instead get summaries of these diagnostics and associated
-warning messages use the
-$diagnostic_summary() method.
sampler_diagnostics(
- inc_warmup = FALSE,
- format = getOption("cmdstanr_draws_format", "draws_array")
-)
inc_warmup: (logical) Should warmup draws be included? Defaults to FALSE.
format: (string) The draws format to return. See
-draws for details.
Depends on format, but the default is a 3-D
-draws_array object (iteration x chain x
-variable). The variables for Stan's default MCMC algorithm are
-"accept_stat__", "stepsize__", "treedepth__", "n_leapfrog__",
-"divergent__", "energy__".
# \dontrun{
-fit <- cmdstanr_example("logistic")
-sampler_diagnostics <- fit$sampler_diagnostics()
-str(sampler_diagnostics)
-#> 'draws_array' num [1:1000, 1:4, 1:6] 2 2 2 2 2 2 2 2 3 2 ...
-#> - attr(*, "dimnames")=List of 3
-#> ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> ..$ variable : chr [1:6] "treedepth__" "divergent__" "energy__" "accept_stat__" ...
-
-library(posterior)
-as_draws_df(sampler_diagnostics)
-#> # A draws_df: 1000 iterations, 4 chains, and 6 variables
-#> treedepth__ divergent__ energy__ accept_stat__ stepsize__ n_leapfrog__
-#> 1 2 0 69 0.87 0.79 3
-#> 2 2 0 67 0.96 0.79 7
-#> 3 2 0 68 0.80 0.79 3
-#> 4 2 0 69 0.72 0.79 3
-#> 5 2 0 68 0.83 0.79 3
-#> 6 2 0 69 0.94 0.79 3
-#> 7 2 0 66 0.99 0.79 3
-#> 8 2 0 68 0.62 0.79 3
-#> 9 3 0 67 1.00 0.79 7
-#> 10 2 0 67 0.99 0.79 7
-#> # ... with 3990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# or specify format to get a data frame instead of calling as_draws_df
-fit$sampler_diagnostics(format = "df")
-#> # A draws_df: 1000 iterations, 4 chains, and 6 variables
-#> treedepth__ divergent__ energy__ accept_stat__ stepsize__ n_leapfrog__
-#> 1 2 0 69 0.87 0.79 3
-#> 2 2 0 67 0.96 0.79 7
-#> 3 2 0 68 0.80 0.79 3
-#> 4 2 0 69 0.72 0.79 3
-#> 5 2 0 68 0.83 0.79 3
-#> 6 2 0 69 0.94 0.79 3
-#> 7 2 0 66 0.99 0.79 3
-#> 8 2 0 68 0.62 0.79 3
-#> 9 3 0 67 1.00 0.79 7
-#> 10 2 0 67 0.99 0.79 7
-#> # ... with 3990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-# }
-
-This method is a wrapper around base::saveRDS() that ensures
-that all posterior draws and diagnostics are saved when saving a fitted
-model object. Because the contents of the CmdStan output CSV files are only
-read into R lazily (i.e., as needed), the $save_object() method is the
-safest way to guarantee that everything has been read in before saving.
See the "Saving fitted model objects" section of the
-Getting started with CmdStanR
-vignette for some suggestions on faster model saving for large models.
-save_object(file, ...)
file: (string) Path where the file should be saved.
...: Other arguments to pass to base::saveRDS() besides object and file.
# \dontrun{
-fit <- cmdstanr_example("logistic")
-
-temp_rds_file <- tempfile(fileext = ".RDS")
-fit$save_object(file = temp_rds_file)
-rm(fit)
-
-fit <- readRDS(temp_rds_file)
-fit$summary()
-#> # A tibble: 105 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -65.9 -65.6 1.42 1.21 -68.6 -64.3 1.00 2184.
-#> 2 alpha 0.376 0.374 0.216 0.221 0.0219 0.730 1.00 4114.
-#> 3 beta[1] -0.661 -0.659 0.244 0.246 -1.08 -0.269 1.00 4410.
-#> 4 beta[2] -0.274 -0.272 0.227 0.228 -0.639 0.0940 1.00 3599.
-#> 5 beta[3] 0.675 0.668 0.265 0.261 0.256 1.11 1.00 3811.
-#> 6 log_lik[1] -0.517 -0.507 0.0995 0.0970 -0.692 -0.369 1.00 4022.
-#> 7 log_lik[2] -0.404 -0.387 0.144 0.134 -0.663 -0.199 1.00 4613.
-#> 8 log_lik[3] -0.501 -0.465 0.219 0.208 -0.909 -0.215 1.00 3919.
-#> 9 log_lik[4] -0.450 -0.432 0.152 0.148 -0.723 -0.233 1.00 3378.
-#> 10 log_lik[5] -1.18 -1.15 0.277 0.272 -1.68 -0.758 1.00 4267.
-#> # ℹ 95 more rows
-#> # ℹ 1 more variable: ess_tail <dbl>
-# }
-
-All fitted model objects have methods for saving (moving to a
-specified location) the files created by CmdStanR to hold CmdStan output
-csv files and input data files. These methods move the files from their
-current location (possibly the temporary directory) to a user-specified
-location. The paths stored in the fitted model object will also be
-updated to point to the new file locations.
-The versions without the save_ prefix (e.g., $output_files()) return
-the current file paths without moving any files.
save_output_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
-
-save_latent_dynamics_files(
- dir = ".",
- basename = NULL,
- timestamp = TRUE,
- random = TRUE
-)
-
-save_profile_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
-
-save_data_file(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
-
-save_config_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
-
-save_metric_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)
-
-output_files(include_failed = FALSE)
-
-profile_files(include_failed = FALSE)
-
-latent_dynamics_files(include_failed = FALSE)
-
-data_file()
-
-config_files(include_failed = FALSE)
-
-metric_files(include_failed = FALSE)
dir: (string) Path to directory where the files should be saved.
basename: (string) Base filename to use. See Details.
timestamp: (logical) Should a timestamp be added to the file name(s)?
-Defaults to TRUE. See Details.
random: (logical) Should random alphanumeric characters be added to the
-end of the file name(s)? Defaults to TRUE. See Details.
include_failed: (logical) Should CmdStan runs that failed also be
-included? The default is FALSE.
The $save_* methods print a message with the new file paths and (invisibly)
-return a character vector of the new paths (or NA for any that couldn't be
-copied). They also have the side effect of setting the internal paths in the
-fitted model object to the new paths.
The methods without the save_ prefix return character vectors of file
-paths without moving any files.
For $save_output_files() the files moved to dir will have names of
-the form basename-timestamp-id-random, where
basename is the user's provided basename argument;
timestamp is of the form format(Sys.time(), "%Y%m%d%H%M");
id is the MCMC chain id (or 1 for non-MCMC methods);
random contains six random alphanumeric characters.
For $save_latent_dynamics_files() everything is the same as for
-$save_output_files() except "-diagnostic-" is included in the new
-file name after basename.
For $save_profile_files() everything is the same as for
-$save_output_files() except "-profile-" is included in the new
-file name after basename.
For $save_metric_files() everything is the same as for
-$save_output_files() except "-metric-" is included in the new
-file name after basename.
For $save_config_files() everything is the same as for
-$save_output_files() except "-config-" is included in the new
-file name after basename.
For $save_data_file() no id is included in the file name because even
-with multiple MCMC chains the data file is the same.
# \dontrun{
-fit <- cmdstanr_example()
-fit$output_files()
-#> [1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-1-466e74.csv"
-#> [2] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-2-466e74.csv"
-#> [3] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-3-466e74.csv"
-#> [4] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310845-4-466e74.csv"
-fit$data_file()
-#> [1] "/private/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/Rtmp0Jp8Zk/temp_libpathb838752e97c0/cmdstanr/logistic.data.json"
-
-# just using tempdir for the example
-my_dir <- tempdir()
-fit$save_output_files(dir = my_dir, basename = "banana")
-#> Moved 4 files and set internal paths to new locations:
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/banana-202503310845-1-6af888.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/banana-202503310845-2-6af888.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/banana-202503310845-3-6af888.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/banana-202503310845-4-6af888.csv
-fit$save_output_files(dir = my_dir, basename = "tomato", timestamp = FALSE)
-#> Moved 4 files and set internal paths to new locations:
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/tomato-1-7df674.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/tomato-2-7df674.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/tomato-3-7df674.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/tomato-4-7df674.csv
-fit$save_output_files(dir = my_dir, basename = "lettuce", timestamp = FALSE, random = FALSE)
-#> Moved 4 files and set internal paths to new locations:
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/lettuce-1.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/lettuce-2.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/lettuce-3.csv
-#> - /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/lettuce-4.csv
-# }
-
-The $summary() method runs
-summarise_draws() from the posterior
-package and returns the output. For MCMC, only post-warmup draws are
-included in the summary.
There is also a $print() method that prints the same summary stats but
-removes the extra formatting used for printing tibbles and returns the
-fitted model object itself. The $print() method may also be faster than
-$summary() because it is designed to only compute the summary statistics
-for the variables that will actually fit in the printed output whereas
-$summary() will compute them for all of the specified variables in order
-to be able to return them to the user. See Examples.
summary(variables = NULL, ...)
variables: (character vector) The variables to include.
...: Optional arguments to pass to posterior::summarise_draws().
The $summary() method returns the tibble data frame created by
-posterior::summarise_draws().
The $print() method returns the fitted model object itself (invisibly),
-which is the standard behavior for print methods in R.
# \dontrun{
-fit <- cmdstanr_example("logistic")
-fit$summary()
-#> # A tibble: 105 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -66.0 -65.6 1.45 1.21 -68.7 -64.3 1.00 2233.
-#> 2 alpha 0.382 0.380 0.214 0.217 0.0308 0.730 1.00 4473.
-#> 3 beta[1] -0.667 -0.665 0.248 0.254 -1.08 -0.261 1.00 4454.
-#> 4 beta[2] -0.274 -0.271 0.233 0.232 -0.649 0.103 1.00 4166.
-#> 5 beta[3] 0.680 0.672 0.264 0.258 0.254 1.12 1.00 3555.
-#> 6 log_lik[1] -0.514 -0.506 0.0973 0.0977 -0.684 -0.368 1.00 4272.
-#> 7 log_lik[2] -0.403 -0.383 0.147 0.137 -0.677 -0.199 1.00 4474.
-#> 8 log_lik[3] -0.497 -0.465 0.217 0.204 -0.897 -0.202 1.00 4228.
-#> 9 log_lik[4] -0.450 -0.434 0.152 0.148 -0.717 -0.238 1.00 3834.
-#> 10 log_lik[5] -1.18 -1.16 0.278 0.275 -1.67 -0.760 1.00 4261.
-#> # ℹ 95 more rows
-#> # ℹ 1 more variable: ess_tail <dbl>
-fit$print()
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> lp__ -65.96 -65.62 1.45 1.21 -68.75 -64.29 1.00 2232 2919
-#> alpha 0.38 0.38 0.21 0.22 0.03 0.73 1.00 4473 3190
-#> beta[1] -0.67 -0.66 0.25 0.25 -1.08 -0.26 1.00 4454 3056
-#> beta[2] -0.27 -0.27 0.23 0.23 -0.65 0.10 1.00 4166 2823
-#> beta[3] 0.68 0.67 0.26 0.26 0.25 1.12 1.00 3555 3140
-#> log_lik[1] -0.51 -0.51 0.10 0.10 -0.68 -0.37 1.00 4272 3239
-#> log_lik[2] -0.40 -0.38 0.15 0.14 -0.68 -0.20 1.00 4474 3106
-#> log_lik[3] -0.50 -0.46 0.22 0.20 -0.90 -0.20 1.00 4228 3323
-#> log_lik[4] -0.45 -0.43 0.15 0.15 -0.72 -0.24 1.00 3833 2800
-#> log_lik[5] -1.18 -1.16 0.28 0.28 -1.67 -0.76 1.00 4260 3060
-#>
-#> # showing 10 of 105 rows (change via 'max_rows' argument or 'cmdstanr_max_rows' option)
-fit$print(max_rows = 2) # same as print(fit, max_rows = 2)
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> lp__ -65.96 -65.62 1.45 1.21 -68.75 -64.29 1.00 2232 2919
-#> alpha 0.38 0.38 0.21 0.22 0.03 0.73 1.00 4473 3190
-#>
-#> # showing 2 of 105 rows (change via 'max_rows' argument or 'cmdstanr_max_rows' option)
-
-# include only certain variables
-fit$summary("beta")
-#> # A tibble: 3 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 beta[1] -0.667 -0.665 0.248 0.254 -1.08 -0.261 1.00 4454. 3056.
-#> 2 beta[2] -0.274 -0.271 0.233 0.232 -0.649 0.103 1.00 4166. 2824.
-#> 3 beta[3] 0.680 0.672 0.264 0.258 0.254 1.12 1.00 3555. 3140.
-fit$print(c("alpha", "beta[2]"))
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> alpha 0.38 0.38 0.21 0.22 0.03 0.73 1.00 4473 3190
-#> beta[2] -0.27 -0.27 0.23 0.23 -0.65 0.10 1.00 4166 2823
-
-# include all variables but only certain summaries
-fit$summary(NULL, c("mean", "sd"))
-#> # A tibble: 105 × 3
-#> variable mean sd
-#> <chr> <dbl> <dbl>
-#> 1 lp__ -66.0 1.45
-#> 2 alpha 0.382 0.214
-#> 3 beta[1] -0.667 0.248
-#> 4 beta[2] -0.274 0.233
-#> 5 beta[3] 0.680 0.264
-#> 6 log_lik[1] -0.514 0.0973
-#> 7 log_lik[2] -0.403 0.147
-#> 8 log_lik[3] -0.497 0.217
-#> 9 log_lik[4] -0.450 0.152
-#> 10 log_lik[5] -1.18 0.278
-#> # ℹ 95 more rows
-
-# can use functions created from formulas
-# for example, calculate Pr(beta > 0)
-fit$summary("beta", prob_gt_0 = ~ mean(. > 0))
-#> # A tibble: 3 × 2
-#> variable prob_gt_0
-#> <chr> <dbl>
-#> 1 beta[1] 0.0015
-#> 2 beta[2] 0.118
-#> 3 beta[3] 0.994
-
-# can combine user-specified functions with
-# the default summary functions
-fit$summary(variables = c("alpha", "beta"),
- posterior::default_summary_measures()[1:4],
- quantiles = ~ quantile2(., probs = c(0.025, 0.975)),
- posterior::default_convergence_measures()
- )
-#> # A tibble: 4 × 10
-#> variable mean median sd mad q2.5 q97.5 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 alpha 0.382 0.380 0.214 0.217 -0.0460 0.804 1.00 4473. 3191.
-#> 2 beta[1] -0.667 -0.665 0.248 0.254 -1.16 -0.189 1.00 4454. 3056.
-#> 3 beta[2] -0.274 -0.271 0.233 0.232 -0.727 0.182 1.00 4166. 2824.
-#> 4 beta[3] 0.680 0.672 0.264 0.258 0.172 1.21 1.00 3555. 3140.
-
-# the functions need to calculate the appropriate
-# value for a matrix input
-fit$summary(variables = "alpha", dim)
-#> # A tibble: 1 × 3
-#> variable dim.1 dim.2
-#> <chr> <int> <int>
-#> 1 alpha 1000 4
-
-# the usual [stats::var()] is therefore not directly suitable as it
-# will produce a covariance matrix unless the data is converted to a vector
-fit$print(c("alpha", "beta"), var2 = ~var(as.vector(.x)))
-#> variable var2
-#> alpha 0.05
-#> beta[1] 0.06
-#> beta[2] 0.05
-#> beta[3] 0.07
-
-# }
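The point about matrix input can be checked in base R without a fitted model. The sketch below assumes, as the `dim` output above suggests, that each summary function receives a matrix with one row per iteration and one column per chain; the `draws` matrix here is simulated stand-in data, not CmdStan output.

```r
# Stand-in for one variable's draws: 1000 iterations by 4 chains
# (simulated values, not output from a fitted model).
set.seed(1)
draws <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

# Applied to a matrix, var() treats each column (chain) as a separate
# variable and returns a 4 x 4 covariance matrix.
cov_between_chains <- var(draws)
dim(cov_between_chains)

# Flattening first pools all draws and yields a single variance.
pooled_var <- var(as.vector(draws))
length(pooled_var)
```

This is why the example above wraps the input in `as.vector()` before calling `var()`.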
-
-Report the run time in seconds. For MCMC, additional information
-is provided about the run times of individual chains and the warmup and
-sampling phases. For Laplace approximation, the time only includes the time
-for drawing the approximate sample and does not include the time
-taken to run the $optimize() method.
time()
A list with elements:
total: (scalar) The total run time. For MCMC this may be different than
-the sum of the chain run times if parallelization was used.
chains: (data frame) For MCMC only, timing info for the individual
-chains. The data frame has columns "chain_id", "warmup", "sampling",
-and "total".
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample")
-fit_mcmc$time()
-#> $total
-#> [1] 0.938334
-#>
-#> $chains
-#> chain_id warmup sampling total
-#> 1 1 0.022 0.074 0.096
-#> 2 2 0.022 0.076 0.098
-#> 3 3 0.022 0.076 0.098
-#> 4 4 0.022 0.074 0.096
-#>
-
-fit_vb <- cmdstanr_example("logistic", method = "variational")
-fit_vb$time()
-#> $total
-#> [1] 0.1466939
-#>
-
-fit_mle <- cmdstanr_example("logistic", method = "optimize", jacobian = TRUE)
-fit_mle$time()
-#> $total
-#> [1] 0.146842
-#>
-
-# use fit_mle to draw samples from laplace approximation
-fit_laplace <- cmdstanr_example("logistic", method = "laplace", mode = fit_mle)
-fit_laplace$time() # just time for drawing sample not for running optimize
-#> $total
-#> [1] 0.1427319
-#>
-fit_laplace$time()$total + fit_mle$time()$total # total time
-#> [1] 0.2895739
-# }
-
-R/fit.R
- fit-method-unconstrain_draws.Rd
The $unconstrain_draws() method transforms all parameter draws
-to the unconstrained scale. The method returns a list for each chain,
-containing the parameter values from each iteration on the unconstrained
-scale. If called with no arguments, then the draws within the fit object
-are unconstrained. Alternatively, either an existing draws object or a
-character vector of paths to CSV files can be passed.
unconstrain_draws(
- files = NULL,
- draws = NULL,
- format = getOption("cmdstanr_draws_format", "draws_array"),
- inc_warmup = FALSE
-)
files: (character vector) The paths to the CmdStan CSV files. These can
-be files generated by running CmdStanR or running CmdStan directly.
draws: A posterior::draws_* object.
format: (string) The format of the returned draws. Must be a valid
-format from the posterior package.
inc_warmup: (logical) Should warmup draws be included? Defaults to
-FALSE.
log_prob(), grad_log_prob(), constrain_variables(),
-unconstrain_variables(), unconstrain_draws(), variable_skeleton(),
-hessian()
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-
-# Unconstrain all internal draws
-unconstrained_internal_draws <- fit_mcmc$unconstrain_draws()
-
-# Unconstrain external CmdStan CSV files
-unconstrained_csv <- fit_mcmc$unconstrain_draws(files = fit_mcmc$output_files())
-
-# Unconstrain existing draws object
-unconstrained_draws <- fit_mcmc$unconstrain_draws(draws = fit_mcmc$draws())
-# }
-
-R/fit.R
- fit-method-unconstrain_variables.Rd
The $unconstrain_variables() method transforms input
-parameters to the unconstrained scale.
unconstrain_variables(variables)
variables: (list) A list of parameter values to transform, in the same
-format as provided to the init argument of the $sample() method.
log_prob(), grad_log_prob(), constrain_variables(),
-unconstrain_variables(), unconstrain_draws(), variable_skeleton(),
-hessian()
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-fit_mcmc$unconstrain_variables(list(alpha = 0.5, beta = c(0.7, 1.1, 0.2)))
-#> [1] 0.5 0.7 1.1 0.2
-# }
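In the logistic example the parameters have no declared bounds, so the unconstrained values equal the inputs. For a bounded parameter the values would differ. The base R sketch below illustrates the transform Stan applies to a lower-bounded parameter; `sigma` is a hypothetical parameter that is not part of the logistic example, and the code reimplements the transform by hand rather than calling CmdStanR.

```r
# Hypothetical positive-constrained parameter, e.g. `real<lower=0> sigma;`
sigma <- 2
lower <- 0

# Stan unconstrains a lower-bounded parameter with log(x - lower)
sigma_unconstrained <- log(sigma - lower)

# The constraining transform is the inverse: exp(y) + lower
sigma_roundtrip <- exp(sigma_unconstrained) + lower
all.equal(sigma_roundtrip, sigma)
```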
-
-The $variable_skeleton() method returns the variable skeleton
-needed by utils::relist() to re-structure a vector of constrained
-parameter values to a named list.
variable_skeleton(transformed_parameters = TRUE, generated_quantities = TRUE)
transformed_parameters: (logical) Whether to include transformed
-parameters in the skeleton (defaults to TRUE).
generated_quantities: (logical) Whether to include generated quantities
-in the skeleton (defaults to TRUE).
log_prob(), grad_log_prob(), constrain_variables(),
-unconstrain_variables(), unconstrain_draws(), variable_skeleton(),
-hessian()
# \dontrun{
-fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
-fit_mcmc$variable_skeleton()
-#> $alpha
-#> [1] 0
-#>
-#> $beta
-#> [1] 0 0 0
-#>
-#> $log_lik
-#> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-#> [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-#> [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-#>
-# }
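A minimal sketch of how such a skeleton is used with utils::relist(). The skeleton is written by hand here to mirror the `alpha` and `beta` entries printed above (as if transformed parameters and generated quantities were excluded), and the parameter values are illustrative numbers rather than draws from a fit.

```r
# Hand-written skeleton mirroring the alpha and beta entries shown above;
# a real skeleton would come from fit$variable_skeleton().
skeleton <- list(alpha = 0, beta = rep(0, 3))

# Flat vector of constrained parameter values, in the same order as the
# skeleton entries (illustrative numbers).
flat_values <- c(0.5, 0.7, 1.1, 0.2)

# relist() fills the skeleton with the flat values, keeping names and sizes
params <- utils::relist(flat_values, skeleton)
str(params)
```

The result is a named list with a scalar `alpha` and a length-3 `beta`, which is the same shape accepted by the `init` argument and by `$unconstrain_variables()`.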
-
-
-Package overview and global options
-An overview of the package and how it differs from RStan.
-* CmdStanR: the R interface to CmdStan
-* CmdStanR global options
-
-Installing and setting the path to CmdStan
-Install CmdStan, assuming the necessary C++ toolchain.
-* Install CmdStan or clean and rebuild an existing installation
-* Get or set the file path to the CmdStan installation
-
-Running CmdStan from R
-Run CmdStan from R.
-* Create a new CmdStanModel object
-* CmdStanModel objects
-* Check syntax of a Stan program
-* Compile a Stan program
-* Run Stan's diagnose method
-* Expose Stan functions to R
-* Run stanc's auto-formatter on the model code.
-* Run Stan's standalone generated quantities method
-* Run Stan's Laplace algorithm
-* Run Stan's optimization algorithms
-* Run Stan's Pathfinder Variational Inference Algorithm
-* Run Stan's MCMC algorithms
-* Run Stan's MCMC algorithms with MPI
-* Input and output variables of a Stan program
-* Run Stan's variational approximation algorithms
-* Fit models for use in examples
-
-Fitted model objects and methods
-* CmdStanMCMC objects
-* CmdStanMLE objects
-* CmdStanLaplace objects
-* CmdStanVB objects
-* CmdStanPathfinder objects
-* CmdStanGQ objects
-* CmdStanDiagnose objects
-* Run CmdStan's
-* Return Stan code
-* Transform a set of unconstrained parameter values to the constrained scale
-* Sampler diagnostic summaries and warnings
-* Extract posterior draws
-* Calculate the log-probability and the gradient w.r.t. each input for a given vector of unconstrained parameters
-* Extract gradients after diagnostic mode
-* Calculate the log-probability, the gradient w.r.t. each input, and the hessian for a given vector of unconstrained parameters
-* Extract user-specified initial values
-* Compile additional methods for accessing the model log-probability function and parameter constraining and unconstraining.
-* Extract inverse metric (mass matrix) after MCMC
-* Calculate the log-probability given a provided vector of unconstrained parameters.
-* Leave-one-out cross-validation (LOO-CV)
-* Extract log probability (target)
-* Extract metadata from CmdStan CSV files
-* Extract point estimate after optimization
-* Extract number of chains after MCMC
-* Access console output
-* Return profiling data
-* Extract return codes from CmdStan
-* Extract sampler diagnostics after MCMC
-* Save fitted model object to a file
-* Save output and data files
-* Compute a summary table of estimates and diagnostics
-* Report timing of CmdStan runs
-* Transform all parameter draws to the unconstrained scale
-* Transform a set of parameter values to the unconstrained scale
-* Return the variable skeleton for
-* Expose Stan functions to R
-
-Other tools
-* Read CmdStan CSV files into R
-* Write data to a JSON file readable by CmdStan
-* Write Stan code to a file
-* Write posterior draws objects to CSV files suitable for running standalone generated quantities with CmdStan.
-* Convert
-* Create a
-* Coercion methods for CmdStan objects
-
-Using CmdStanR with knitr and R Markdown
-* Register CmdStanR's knitr engine for Stan
-* CmdStan knitr engine for Stan
R/install.R
- install_cmdstan.Rd
The install_cmdstan() function attempts to download and
-install the latest release of CmdStan.
-Installing a previous release or a new release candidate is also possible
-by specifying the version or release_url argument.
-See the first few sections of the CmdStan
-installation guide
-for details on the C++ toolchain required for installing CmdStan.
The rebuild_cmdstan() function cleans and rebuilds the CmdStan
-installation. Use this function in case of any issues when compiling models.
The cmdstan_make_local() function is used to read/write makefile flags
-and variables from/to the make/local file of a CmdStan installation.
-Writing to the make/local file can be used to permanently add makefile
-flags/variables to an installation. For example adding specific compiler
-switches, changing the C++ compiler, etc. A change to the make/local file
-should typically be followed by calling rebuild_cmdstan().
The check_cmdstan_toolchain() function attempts to check for the required
-C++ toolchain. It is called internally by install_cmdstan() but can also
-be called directly by the user. On Windows only, calling the function with
-the fix = TRUE argument will attempt to install the necessary toolchain
-components if they are not found. For Windows users with RTools and CmdStan
-versions >= 2.35 no additional toolchain configuration is required.
NOTE: When installing CmdStan on Windows with RTools and CmdStan versions
-prior to 2.35.0, the above additional toolchain configuration
-is still required. To enable this configuration, set the environment variable
-CMDSTANR_USE_MSYS_TOOLCHAIN to 'true' and call
-check_cmdstan_toolchain(fix = TRUE).
install_cmdstan(
- dir = NULL,
- cores = getOption("mc.cores", 2),
- quiet = FALSE,
- overwrite = FALSE,
- timeout = 1200,
- version = NULL,
- release_url = NULL,
- release_file = NULL,
- cpp_options = list(),
- check_toolchain = TRUE,
- wsl = FALSE
-)
-
-rebuild_cmdstan(
- dir = cmdstan_path(),
- cores = getOption("mc.cores", 2),
- quiet = FALSE,
- timeout = 600
-)
-
-cmdstan_make_local(dir = cmdstan_path(), cpp_options = NULL, append = TRUE)
-
-check_cmdstan_toolchain(fix = FALSE, quiet = FALSE)
dir: (string) The path to the directory in which to install CmdStan.
-The default is to install it in a directory called .cmdstan within the
-user's home directory (i.e., file.path(Sys.getenv("HOME"), ".cmdstan")).
cores: (integer) The number of CPU cores to use to parallelize building
-CmdStan and speed up installation. If cores is not specified then the
-default is to look for the option "mc.cores", which can be set for an
-entire R session by options(mc.cores=value). If the "mc.cores" option
-has not been set then the default is 2.
quiet: (logical) For install_cmdstan(), should the verbose output
-from the system processes be suppressed when building the CmdStan binaries?
-The default is FALSE. For check_cmdstan_toolchain(), should the
-function suppress printing informational messages? The default is FALSE.
-If TRUE only errors will be printed.
overwrite: (logical) Should CmdStan still be downloaded and installed
-even if an installation of the same version is found in dir? The default
-is FALSE, in which case an informative error is thrown instead of
-overwriting the user's installation.
timeout: (positive real) Timeout (in seconds) for the build stage of
-the installation.
version: (string) The CmdStan release version to install. The default
-is NULL, which downloads the latest stable release from
-https://github.com/stan-dev/cmdstan/releases.
release_url: (string) The URL for the specific CmdStan release or
-release candidate to install. See https://github.com/stan-dev/cmdstan/releases.
-The URL should point to the tarball (.tar.gz file) itself, e.g.,
-release_url="https://github.com/stan-dev/cmdstan/releases/download/v2.25.0/cmdstan-2.25.0.tar.gz".
-If both version and release_url are specified then version will be used.
release_file: (string) A file path to a CmdStan release tar.gz file
-downloaded from the releases page: https://github.com/stan-dev/cmdstan/releases.
-For example: release_file="./cmdstan-2.33.1.tar.gz". If release_file is
-specified then both release_url and version will be ignored.
cpp_options: (list) Any makefile flags/variables to be written to
-the make/local file. For example, list("CXX" = "clang++") will force
-the use of clang for compilation.
check_toolchain: (logical) Should install_cmdstan() attempt to check
-that the required toolchain is installed and properly configured? The
-default is TRUE.
wsl: (logical) Should CmdStan be installed and run through the Windows
-Subsystem for Linux (WSL)? The default is FALSE.
append: (logical) For cmdstan_make_local(), should the listed
-makefile flags be appended to the end of the existing make/local file?
-The default is TRUE. If FALSE the file is overwritten.
fix: For check_cmdstan_toolchain(), should CmdStanR attempt to fix
-any detected toolchain problems? Currently this option is only available on
-Windows. The default is FALSE, in which case problems are only reported
-along with suggested fixes.
For cmdstan_make_local(), if cpp_options=NULL then the existing
-contents of make/local are returned without writing anything, otherwise
-the updated contents are returned.
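The precedence among the `release_file`, `version`, and `release_url` arguments of install_cmdstan() described above can be summarized with a small sketch. The helper `resolve_release_source()` is hypothetical and is not part of CmdStanR; it only mirrors the documented rules and does not download anything. The URL is a placeholder.

```r
# Hypothetical helper (not part of CmdStanR) that mirrors the documented
# precedence: release_file first, then version, then release_url, and
# otherwise the latest stable release.
resolve_release_source <- function(version = NULL, release_url = NULL,
                                   release_file = NULL) {
  if (!is.null(release_file)) return("release_file")
  if (!is.null(version)) return("version")
  if (!is.null(release_url)) return("release_url")
  "latest"
}

# version takes priority over release_url when both are given
resolve_release_source(
  version = "2.36.0",
  release_url = "https://example.org/cmdstan.tar.gz"  # placeholder URL
)
```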
# \dontrun{
-check_cmdstan_toolchain()
-#> The C++ toolchain required for CmdStan is setup properly!
-
-# install_cmdstan(cores = 4)
-
-cpp_options <- list(
- "CXX" = "clang++",
- "CXXFLAGS+= -march=native",
- PRECOMPILED_HEADERS = TRUE
-)
-# cmdstan_make_local(cpp_options = cpp_options)
-# rebuild_cmdstan()
-# }
-
-The $check_syntax() method of a CmdStanModel object
-checks the Stan program for syntax errors and returns TRUE (invisibly) if
-parsing succeeds. If invalid syntax is found, an error is thrown.
check_syntax(
- pedantic = FALSE,
- include_paths = NULL,
- stanc_options = list(),
- quiet = FALSE
-)
pedantic: (logical) Should pedantic mode be turned on? The default is
-FALSE. Pedantic mode attempts to warn you about potential issues in your
-Stan program beyond syntax errors. For details see the Pedantic mode chapter in
-the Stan Reference Manual.
include_paths: (character vector) Paths to directories where Stan
-should look for files specified in #include directives in the Stan
-program.
stanc_options: (list) Any other Stan-to-C++ transpiler options to be
-used when compiling the model. See the documentation for the
-$compile() method for details.
quiet: (logical) Should informational messages be suppressed? The
-default is FALSE, which will print a message if the Stan program is valid
-or the compiler error message if there are syntax errors. If TRUE, only
-the error message will be printed.
The $check_syntax() method returns TRUE (invisibly) if the model
-is valid.
The CmdStanR website
-(mc-stan.org/cmdstanr) for online
-documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-file <- write_stan_file("
-data {
- int N;
- array[N] int y;
-}
-parameters {
- // should have <lower=0> but omitting to demonstrate pedantic mode
- real lambda;
-}
-model {
- y ~ poisson(lambda);
-}
-")
-mod <- cmdstan_model(file, compile = FALSE)
-
-# the program is syntactically correct, however...
-mod$check_syntax()
-#> Stan program is syntactically correct
-
-# pedantic mode will warn that lambda should be constrained to be positive
-# and that lambda has no prior distribution
-mod$check_syntax(pedantic = TRUE)
-#> Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/model_febb1e69c7387a0e64cf13583e078104.stan', line 11, column 14: A
-#> poisson distribution is given parameter lambda as a rate parameter
-#> (argument 1), but lambda was not constrained to be strictly positive.
-#> Warning: The parameter lambda has no priors. This means either no prior is
-#> provided, or the prior(s) depend on data variables. In the later case,
-#> this may be a false positive.
-#> Stan program is syntactically correct
-# }
-
-The $compile() method of a CmdStanModel object checks the
-syntax of the Stan program, translates the program to C++, and creates a
-compiled executable. To just check the syntax of a Stan program without
-compiling it use the $check_syntax() method
-instead.
In most cases the user does not need to explicitly call the $compile()
-method as compilation will occur when calling cmdstan_model(). However it
-is possible to set compile=FALSE in the call to cmdstan_model() and
-subsequently call the $compile() method directly.
After compilation, the paths to the executable and the .hpp file
-containing the generated C++ code are available via the $exe_file() and
-$hpp_file() methods. The default is to create the executable in the same
-directory as the Stan program and to write the generated C++ code in a
-temporary directory. To save the C++ code to a non-temporary location use
-$save_hpp_file(dir).
compile(
- quiet = TRUE,
- dir = NULL,
- pedantic = FALSE,
- include_paths = NULL,
- user_header = NULL,
- cpp_options = list(),
- stanc_options = list(),
- force_recompile = getOption("cmdstanr_force_recompile", default = FALSE),
- compile_model_methods = FALSE,
- compile_standalone = FALSE,
- dry_run = FALSE,
- compile_hessian_method = FALSE,
- threads = FALSE
-)
quiet: (logical) Should the verbose output from CmdStan during
-compilation be suppressed? The default is TRUE, but if you encounter an
-error we recommend trying again with quiet=FALSE to see more of the
-output.
dir: (string) The path to the directory in which to store the CmdStan
-executable (or .hpp file if using $save_hpp_file()). The default is the
-same location as the Stan program.
pedantic: (logical) Should pedantic mode be turned on? The default is
-FALSE. Pedantic mode attempts to warn you about potential issues in your
-Stan program beyond syntax errors. For details see the Pedantic mode section in
-the Stan Reference Manual. Note: to do a pedantic check for a model
-without compiling it or for a model that is already compiled the
-$check_syntax() method can be used instead.
include_paths: (character vector) Paths to directories where Stan
-should look for files specified in #include directives in the Stan
-program.
user_header: (string) The path to a C++ file (with a .hpp extension)
-to compile with the Stan model.
cpp_options: (list) Any makefile options to be used when compiling the
-model (STAN_THREADS, STAN_MPI, STAN_OPENCL, etc.). Anything you would
-otherwise write in the make/local file. For an example of using threading
-see the Stan case study
-Reduce Sum: A Minimal Example.
stanc_options: (list) Any Stan-to-C++ transpiler options to be used
-when compiling the model. See the Examples section below as well as the
-stanc chapter of the CmdStan Guide for more details on available options:
-https://mc-stan.org/docs/cmdstan-guide/stanc.html.
force_recompile: (logical) Should the model be recompiled even if it was
-not modified since it was last compiled? The default is FALSE. Can also be set
-via a global cmdstanr_force_recompile option.
compile_model_methods: (logical) Compile additional model methods
-(log_prob(), grad_log_prob(), constrain_variables(),
-unconstrain_variables()).
compile_standalone: (logical) Should functions in the Stan model be
-compiled for use in R? If TRUE the functions will be available via the
-functions field in the compiled model object. This can also be done after
-compilation using the
-$expose_functions() method.
dry_run: (logical) If TRUE, the code will do all checks before compilation,
-but skip the actual C++ compilation. Used to speed up tests.
compile_hessian_method: (logical) Should the (experimental) hessian() method
-be compiled with the model methods?
threads: Deprecated and will be removed in a future release. Please
-turn on threading via cpp_options = list(stan_threads = TRUE) instead.
The $compile() method is called for its side effect of creating the
-executable and adding its path to the CmdStanModel object, but it also
-returns the CmdStanModel object invisibly.
After compilation, the $exe_file(), $hpp_file(), and $save_hpp_file()
methods can be used and return file paths.
-The $check_syntax() method to check
-Stan syntax or enable pedantic mode without compiling.
The CmdStanR website
-(mc-stan.org/cmdstanr) for online
-documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-
-# by default compilation happens when cmdstan_model() is called.
-# to delay compilation until calling the $compile() method set compile=FALSE
-mod <- cmdstan_model(file, compile = FALSE)
-mod$compile()
-mod$exe_file()
-#> [1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0/examples/bernoulli/bernoulli"
-
-# turn on threading support (for using functions that support within-chain parallelization)
-mod$compile(force_recompile = TRUE, cpp_options = list(stan_threads = TRUE))
-mod$exe_file()
-#> [1] "/Users/jgabry/.cmdstan/cmdstan-2.36.0/examples/bernoulli/bernoulli"
-
-# turn on pedantic mode (new in Stan v2.24)
-file_pedantic <- write_stan_file("
-parameters {
- real sigma; // pedantic mode will warn about missing <lower=0>
-}
-model {
- sigma ~ exponential(1);
-}
-")
-mod <- cmdstan_model(file_pedantic, pedantic = TRUE)
-#> Warning in '/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/model-b9aa1e29138e.stan', line 6, column 2: Parameter
-#> sigma is given a exponential distribution, which has strictly positive
-#> support, but sigma was not constrained to be strictly positive.
-
-# }
-
-The $diagnose() method of a CmdStanModel object
-runs Stan's basic diagnostic feature that will calculate the gradients
-of the initial state and compare them with gradients calculated by
-finite differences. Discrepancies between the two indicate that there is
-a problem with the model or initial states or else there is a bug in Stan.
diagnose(
- data = NULL,
- seed = NULL,
- init = NULL,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- epsilon = NULL,
- error = NULL
-)
data: (multiple options) The data to use for the variables specified in
-the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the
-appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
seed: (positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
init: (multiple options) The initialization method to use for the
-variables declared in the parameters block of the Stan program. One of the
-following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space.
The number 0. This initializes all parameters to 0.
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For
-MCMC the list should contain a sublist for each chain. For other model
-fitting methods there should be just one sublist. The sublists should have
-named elements corresponding to the parameters for which you are specifying
-initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that are the
-same name and dimensions as the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB,
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-samples than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The fit object must have at least some parameters
-that are the same name and dimensions as the current Stan model.
output_dir: (string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
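The two output_dir behaviors described above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: it assumes a working CmdStan installation, uses the Bernoulli example program that ships with CmdStan, and the directory name "cmdstan-output" is hypothetical.

```r
library(cmdstanr)

# Bernoulli example program shipped with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
stan_data <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

# Hypothetical directory, chosen only for this sketch
out_dir <- file.path(tempdir(), "cmdstan-output")
dir.create(out_dir, showWarnings = FALSE)

# Case 1: a path is given, so the CSV files are created in out_dir
fit <- mod$sample(data = stan_data, output_dir = out_dir, refresh = 0)
fit$output_files()

# Case 2: output_dir = NULL, so the CSV files are temporary until
# one of the $save_* methods is called
fit_tmp <- mod$sample(data = stan_data, output_dir = NULL, refresh = 0)
fit_tmp$save_output_files(dir = out_dir)
```

Passing output_dir explicitly keeps the sketch independent of any session-wide cmdstanr_output_dir option.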
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, a timestamp, and
-5 random characters.
(positive real) The finite difference step size. Default value is 1e-6.
(positive real) The error threshold. Default value is 1e-6.
A CmdStanDiagnose object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-test <- cmdstanr_example("logistic", method = "diagnose")
-
-# retrieve the gradients
-test$gradients()
-#> param_idx value model finite_diff error
-#> 1 0 -1.760420 33.15260 33.15260 2.88651e-08
-#> 2 1 -1.814920 5.15400 5.15400 6.47713e-09
-#> 3 2 0.718975 -16.58770 -16.58770 2.24132e-08
-#> 4 3 1.697010 -5.70682 -5.70682 5.89145e-10
-# }
-
-The $expose_functions() method of a CmdStanModel object
-will compile the functions in the Stan program's functions block and
-expose them for use in R. This can also be specified via the
-compile_standalone argument to the $compile()
-method.
This method is also available for fitted model objects (CmdStanMCMC, CmdStanVB, etc.).
-See Examples.
Note: there may be many compiler warnings emitted during compilation but these can be ignored so long as they are warnings and not errors.
-expose_functions(global = FALSE, verbose = FALSE)
(logical) Should the functions be added to the Global
-Environment? The default is FALSE, in which case the functions are
-available via the functions field of the R6 object.
(logical) Should detailed information about generated code be
-printed to the console? Defaults to FALSE.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-stan_file <- write_stan_file(
- "
- functions {
- real a_plus_b(real a, real b) {
- return a + b;
- }
- }
- parameters {
- real x;
- }
- model {
- x ~ std_normal();
- }
- "
-)
-mod <- cmdstan_model(stan_file)
-mod$expose_functions()
-mod$functions$a_plus_b(1, 2)
-#> [1] 3
-
-fit <- mod$sample(refresh = 0)
-#> Running MCMC with 4 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.6 seconds.
-#>
-fit$expose_functions() # already compiled because of above but this would compile them otherwise
-#> Functions already compiled, nothing to do!
-fit$functions$a_plus_b(1, 2)
-#> [1] 3
-# }
-
-
-The $format() method of a CmdStanModel object
-runs stanc's auto-formatter on the model code. It either saves the formatted
-model directly back to the file or prints it for inspection.
format(
- overwrite_file = FALSE,
- canonicalize = FALSE,
- backup = TRUE,
- max_line_length = NULL,
- quiet = FALSE
-)
(logical) Should the formatted code be written back
-to the input model file? The default is FALSE.
(list or logical) Defines whether or not the compiler
-should 'canonicalize' the Stan model, removing things like deprecated syntax.
-Default is FALSE. If TRUE, all canonicalizations are run. You can also
-supply a list of strings which represent options. In that case the options
-are passed to stanc (new in Stan 2.29). See the User's guide section
-for available canonicalization options.
(logical) If TRUE, create stanfile.bak backups before
-writing to the file. Disable this option if you're sure you have other
-copies of the file or are using a version control system like Git. Defaults
-to TRUE. The value is ignored if overwrite_file = FALSE.
(integer) The maximum length of a line when formatting.
-The default is NULL, which defers to the default line length of stanc.
(logical) Should informational messages be suppressed? The
-default is FALSE.
The $format() method returns TRUE (invisibly) if the model
-is valid.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-
-# Example of removing unnecessary whitespace
-file <- write_stan_file("
-data {
- int N;
- array[N] int y;
-}
-parameters {
- real lambda;
-}
-model {
- target +=
- poisson_lpmf(y | lambda);
-}
-")
-
-# set compile=FALSE then call format to fix old syntax
-mod <- cmdstan_model(file, compile = FALSE)
-mod$format(canonicalize = list("deprecations"))
-#> data {
-#> int N;
-#> array[N] int y;
-#> }
-#> parameters {
-#> real lambda;
-#> }
-#> model {
-#> target += poisson_lpmf(y | lambda);
-#> }
-#>
-#>
-
-# overwrite the original file instead of just printing it
-mod$format(canonicalize = list("deprecations"), overwrite_file = TRUE)
-#> Old version of the model stored to /var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/model_39022cccc3fe5384fab5a52b791fead6.stan.bak-20250331084850.
-mod$compile()
-# }
-
-R/model.R
- model-method-generate-quantities.Rd
-The $generate_quantities() method of a CmdStanModel object
-runs Stan's standalone generated quantities to obtain generated quantities
-based on previously fitted parameters.
(multiple options) The parameter draws to use. One of the following:
A CmdStanMCMC or CmdStanVB fitted model object.
A posterior::draws_array (for MCMC) or posterior::draws_matrix (for
-VB) object returned by CmdStanR's $draws() method.
A character vector of paths to CmdStan CSV output files.
NOTE: if you plan on making many calls to $generate_quantities() then the
-most efficient option is to pass the paths of the CmdStan CSV output files
-(this avoids CmdStanR having to rewrite the draws contained in the fitted
-model object to CSV each time). If you no longer have the CSV files you can
-use draws_to_csv() once to write them and then pass the resulting file
-paths to $generate_quantities() as many times as needed.
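The note above can be illustrated with a short sketch. It assumes a working CmdStan installation and reuses the Bernoulli model from the Examples on this page; the object names (fit, mod_gq, csv_files) are chosen here for illustration only.

```r
library(cmdstanr)

# Fit the Bernoulli example that ships with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
stan_data <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))
fit <- mod$sample(data = stan_data, refresh = 0)

# Program used only for standalone generated quantities
gq_file <- write_stan_file("
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0, upper=1> theta;
}
generated quantities {
  array[N] int y_rep = bernoulli_rng(rep_vector(theta, N));
}
")
mod_gq <- cmdstan_model(gq_file)

# Reuse the existing CSV files across repeated calls, so the draws
# are not rewritten to CSV each time
csv_files <- fit$output_files()
for (s in 1:3) {
  gq <- mod_gq$generate_quantities(csv_files, data = stan_data, seed = s)
}

# If the CSV files are gone, write the draws out once and reuse the paths
csv_files2 <- draws_to_csv(fit$draws())
gq2 <- mod_gq$generate_quantities(csv_files2, data = stan_data, seed = 1)
```

The first argument is passed positionally, as in the Examples on this page.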
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, a timestamp, and
-5 random characters.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
(positive integer) The maximum number of MCMC chains
-to run in parallel. If parallel_chains is not specified then the default
-is to look for the option "mc.cores", which can be set for an entire R
-session by options(mc.cores=value). If the "mc.cores" option has not
-been set then the default is 1.
(positive integer) If the model was
-compiled with threading support, the number of
-threads to use in parallelized sections within an MCMC chain (e.g., when
-using the Stan functions reduce_sum() or map_rect()). This is in
-contrast with parallel_chains, which specifies the number of chains to
-run in parallel. The actual number of CPU cores used is
-parallel_chains*threads_per_chain. For an example of using threading see
-the Stan case study Reduce Sum: A Minimal Example.
(integer vector of length 2) The platform and device IDs of
-the OpenCL device to use for fitting. The model must be compiled with
-cpp_options = list(stan_opencl = TRUE) for this argument to have an
-effect.
A CmdStanGQ object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-# first fit a model using MCMC
-mcmc_program <- write_stan_file(
- "data {
- int<lower=0> N;
- array[N] int<lower=0,upper=1> y;
- }
- parameters {
- real<lower=0,upper=1> theta;
- }
- model {
- y ~ bernoulli(theta);
- }"
-)
-mod_mcmc <- cmdstan_model(mcmc_program)
-
-data <- list(N = 10, y = c(1,1,0,0,0,1,0,1,0,0))
-fit_mcmc <- mod_mcmc$sample(data = data, seed = 123, refresh = 0)
-#> Running MCMC with 4 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.6 seconds.
-#>
-
-# stan program for standalone generated quantities
-# (could keep model block, but not necessary so removing it)
-gq_program <- write_stan_file(
- "data {
- int<lower=0> N;
- array[N] int<lower=0,upper=1> y;
- }
- parameters {
- real<lower=0,upper=1> theta;
- }
- generated quantities {
- array[N] int y_rep = bernoulli_rng(rep_vector(theta, N));
- }"
-)
-
-mod_gq <- cmdstan_model(gq_program)
-fit_gq <- mod_gq$generate_quantities(fit_mcmc, data = data, seed = 123)
-#> Running standalone generated quantities after 4 MCMC chains, 1 chain at a time ...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#> Chain 3 finished in 0.0 seconds.
-#> Chain 4 finished in 0.0 seconds.
-#>
-#> All 4 chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.6 seconds.
-str(fit_gq$draws())
-#> 'draws_array' int [1:1000, 1:4, 1:10] 0 0 0 1 1 0 1 1 0 1 ...
-#> - attr(*, "dimnames")=List of 3
-#> ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> ..$ variable : chr [1:10] "y_rep[1]" "y_rep[2]" "y_rep[3]" "y_rep[4]" ...
-
-library(posterior)
-as_draws_df(fit_gq$draws())
-#> # A draws_df: 1000 iterations, 4 chains, and 10 variables
-#> y_rep[1] y_rep[2] y_rep[3] y_rep[4] y_rep[5] y_rep[6] y_rep[7] y_rep[8]
-#> 1 0 0 0 0 0 1 1 1
-#> 2 0 0 0 0 1 1 0 0
-#> 3 0 0 0 1 0 0 1 1
-#> 4 1 1 0 0 0 0 1 0
-#> 5 1 0 1 0 1 0 1 0
-#> 6 0 0 0 1 1 0 0 0
-#> 7 1 1 0 1 1 1 0 0
-#> 8 1 1 1 1 1 0 1 1
-#> 9 0 1 0 1 0 1 1 0
-#> 10 1 1 1 1 1 1 1 1
-#> # ... with 3990 more draws, and 2 more variables
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-# }
-
-The $laplace() method of a CmdStanModel object produces a
-sample from a normal approximation centered at the mode of a distribution
-in the unconstrained space. If the mode is a maximum a posteriori (MAP)
-estimate, the samples provide an estimate of the mean and standard
-deviation of the posterior distribution. If the mode is a maximum
-likelihood estimate (MLE), the sample provides an estimate of the standard
-error of the likelihood. Whether the mode is the MAP or MLE depends on
-the value of the jacobian argument when running optimization. See the
-CmdStan User’s Guide
-for more details.
Any argument left as NULL will default to the default value used by the
-installed version of CmdStan.
laplace(
- data = NULL,
- seed = NULL,
- refresh = NULL,
- init = NULL,
- save_latent_dynamics = FALSE,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- sig_figs = NULL,
- threads = NULL,
- opencl_ids = NULL,
- mode = NULL,
- opt_args = NULL,
- jacobian = TRUE,
- draws = NULL,
- show_messages = TRUE,
- show_exceptions = TRUE,
- save_cmdstan_config = NULL
-)
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(non-negative integer) The number of iterations between
-printed screen updates. If refresh = 0, only error messages will be
-printed.
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space;
The number 0. This initializes all parameters to 0;
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For MCMC the list should contain a sublist for each chain. For other model fitting methods there should be just one sublist. The sublists should have named elements corresponding to the parameters for which you are specifying initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that are the
-same name and dimensions as the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-samples than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The fit object must have at least some parameters
-that are the same name and dimensions as the current Stan model.
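A few of the init forms listed above, sketched with the Bernoulli example that ships with CmdStan. This assumes a working CmdStan installation; the initial values are arbitrary and only illustrate the shapes the argument accepts.

```r
library(cmdstanr)

file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
stan_data <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

# List of lists: one sublist per chain for MCMC
fit_list <- mod$sample(
  data = stan_data,
  chains = 2,
  init = list(list(theta = 0.25), list(theta = 0.75)),
  refresh = 0
)

# Function taking chain_id: called once per chain with ids 1, 2, ...
init_fun <- function(chain_id) list(theta = chain_id / 10)
fit_fun <- mod$sample(
  data = stan_data,
  chains = 2,
  init = init_fun,
  refresh = 0
)

# Non-MCMC methods take a single sublist
fit_opt <- mod$optimize(data = stan_data, init = list(list(theta = 0.5)))
```

The sublists name only the parameters being initialized; any parameter left out falls back to Stan's default initialization.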
Ignored for this method.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, a timestamp, and
-5 random characters.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
(positive integer) If the model was
-compiled with threading support, the number of
-threads to use in parallelized sections (e.g., when
-using the Stan functions reduce_sum() or map_rect()).
(integer vector of length 2) The platform and device IDs of
-the OpenCL device to use for fitting. The model must be compiled with
-cpp_options = list(stan_opencl = TRUE) for this argument to have an
-effect.
(multiple options) The mode to center the approximation at. One of the following:
A CmdStanMLE object from a previous run of $optimize().
The path to a CmdStan CSV file from running optimization.
NULL, in which case $optimize() will be run
-with jacobian=jacobian (see the jacobian argument below).
In all cases the total time reported by $time() will be
-the time of the Laplace sampling step only and does not include the time
-taken to run the $optimize() method.
(named list) A named list of optional arguments to pass to
-$optimize() if mode=NULL.
(logical) Whether or not to enable the Jacobian adjustment
-for constrained parameters. The default is TRUE. See the
-Laplace Sampling
-section of the CmdStan User's Guide for more details. If mode is not
-NULL then the value of jacobian must match the value used when
-optimization was originally run. If mode is NULL then the value of
-jacobian specified here is used when running optimization.
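The interaction between mode and jacobian can be sketched as follows. The sketch assumes a working CmdStan installation and the Bernoulli example that ships with CmdStan; the draws and iter values are arbitrary.

```r
library(cmdstanr)

file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
stan_data <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

# Mode supplied explicitly: jacobian must match the optimization run
fit_mle <- mod$optimize(data = stan_data, jacobian = FALSE)
fit_lap_mle <- mod$laplace(
  data = stan_data,
  mode = fit_mle,
  jacobian = FALSE,
  draws = 500
)

# mode = NULL: $optimize() is run internally with the jacobian given here,
# and opt_args forwards extra arguments to that internal run
fit_lap_map <- mod$laplace(
  data = stan_data,
  jacobian = TRUE,
  opt_args = list(iter = 2000),
  draws = 500
)
```

Supplying a mode from a run with a different jacobian setting is expected to fail, which is why the two values are kept in step in the first call.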
(positive integer) The number of draws to take.
(logical) When TRUE (the default), prints all output
-during the execution process, such as iteration numbers and elapsed times.
-If the output is silenced then the $output() method
-of the resulting fit object can be used to display the silenced messages.
(logical) When TRUE (the default), prints all
-informational messages, for example rejection of the current proposal.
-Disable if you wish to silence these messages, but this is not usually
-recommended unless you are very confident that the model is correct up to
-numerical error. If the messages are silenced then the
-$output() method of the resulting fit object can be
-used to display the silenced messages.
(logical) When TRUE (the default), call CmdStan
-with argument "output save_config=1" to save a json file which contains
-the argument tree and extra information (equivalent to the output CSV file
-header). This option is only available in CmdStan 2.34.0 and later.
A CmdStanLaplace object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-fit_mode <- mod$optimize(data = stan_data, jacobian = TRUE)
-#> Initial log joint probability = -7.39766
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 5 -6.74802 8.90809e-05 3.13416e-07 1 1 8
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_laplace <- mod$laplace(data = stan_data, mode = fit_mode)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> Finished in 0.2 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.26 -6.98 0.724 0.319 -8.80 -6.75
-#> 2 lp_approx__ -0.533 -0.245 0.740 0.330 -2.10 -0.00300
-#> 3 theta 0.262 0.240 0.125 0.120 0.0941 0.503
-
-# if mode isn't specified optimize is run internally first
-fit_laplace <- mod$laplace(data = stan_data)
-#> Initial log joint probability = -7.50846
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 5 -6.74802 0.000114292 4.68497e-07 1 1 8
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> Finished in 0.2 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.23 -6.96 0.773 0.291 -8.49 -6.75
-#> 2 lp_approx__ -0.485 -0.219 0.744 0.303 -1.81 -0.00163
-#> 3 theta 0.265 0.250 0.121 0.121 0.101 0.493
-
-# plot approximate posterior
-bayesplot::mcmc_hist(fit_laplace$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-# }
-
-
-The $optimize() method of a CmdStanModel object runs
-Stan's optimizer to obtain a (penalized) maximum likelihood estimate (MLE)
-or a maximum a posteriori estimate (MAP), depending on the value of the
-jacobian argument. For models with constrained parameters, when the
-Jacobian adjustment is not applied, the point estimate corresponds to a
-penalized MLE, and when the Jacobian adjustment is applied the point
-estimate corresponds to the MAP (posterior mode) of the model we would fit
-if we were instead doing MCMC sampling. The Jacobian adjustment has no
-effect if the model has only unconstrained parameters. See the
-CmdStan User's Guide
-for more details.
Any argument left as NULL will default to the default value used by the
-installed version of CmdStan. See the CmdStan User’s Guide for more details on the
-default arguments. The default values can also be obtained by checking the
-metadata of an example model, e.g.,
-cmdstanr_example(method="optimize")$metadata().
optimize(
- data = NULL,
- seed = NULL,
- refresh = NULL,
- init = NULL,
- save_latent_dynamics = FALSE,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- sig_figs = NULL,
- threads = NULL,
- opencl_ids = NULL,
- algorithm = NULL,
- jacobian = FALSE,
- init_alpha = NULL,
- iter = NULL,
- tol_obj = NULL,
- tol_rel_obj = NULL,
- tol_grad = NULL,
- tol_rel_grad = NULL,
- tol_param = NULL,
- history_size = NULL,
- show_messages = TRUE,
- show_exceptions = TRUE,
- save_cmdstan_config = NULL
-)
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(non-negative integer) The number of iterations between
-printed screen updates. If refresh = 0, only error messages will be
-printed.
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space;
The number 0. This initializes all parameters to 0;
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For MCMC the list should contain a sublist for each chain. For other model fitting methods there should be just one sublist. The sublists should have named elements corresponding to the parameters for which you are specifying initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that are the
-same name and dimensions as the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-samples than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The fit object must have at least some parameters
-that are the same name and dimensions as the current Stan model.
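The with/without replacement rule described for fit objects and draws objects can be sketched in base R. This only illustrates the rule as stated; it is not CmdStanR's internal code, and `pick_init_rows()`, `n_draws`, and `n_chains` are hypothetical names introduced for the example.

```r
# Hypothetical helper: pick one stored draw per requested chain/path.
pick_init_rows <- function(n_draws, n_chains) {
  # Sample with replacement only when there are fewer draws than chains/paths;
  # otherwise every chain gets a distinct draw.
  sample.int(n_draws, size = n_chains, replace = n_draws < n_chains)
}

set.seed(1)
few  <- pick_init_rows(n_draws = 2, n_chains = 4)     # must reuse some draws
many <- pick_init_rows(n_draws = 1000, n_chains = 4)  # four distinct draws
```

With only two draws available, four chains necessarily share initial values; with 1000 draws, each chain starts from a different draw.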
(logical) Should auxiliary diagnostic information
-about the latent dynamics be written to temporary diagnostic CSV files?
-This argument replaces CmdStan's diagnostic_file argument and the content
-written to CSV is controlled by the user's CmdStan installation and not
-CmdStanR (for some algorithms no content may be written). The default is
-FALSE, which is appropriate for almost every use case. To save the
-temporary files created when save_latent_dynamics=TRUE see the
-$save_latent_dynamics_files()
-method.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, a timestamp, and
-5 random characters.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
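The trade-off between precision and file size can be illustrated in base R. This sketch uses `sprintf()` as a stand-in for how a value is rendered as text; it does not call CmdStan, and the exact text CmdStan writes is not reproduced here.

```r
x <- 1 / 3

# Render the same value with 6 and with 18 significant figures.
six      <- sprintf("%.6g", x)
eighteen <- sprintf("%.18g", x)

# More significant figures means more characters per stored value.
chars_per_value <- c(six = nchar(six), eighteen = nchar(eighteen))

# Six figures do not reproduce the original double exactly.
lost <- abs(as.numeric(six) - x)
```

Multiplied over every parameter and every draw, the extra characters per value are what make the CSV files larger.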
(positive integer) If the model was
-compiled with threading support, the number of
-threads to use in parallelized sections (e.g., when
-using the Stan functions reduce_sum() or map_rect()).
(integer vector of length 2) The platform and device IDs of
-the OpenCL device to use for fitting. The model must be compiled with
-cpp_options = list(stan_opencl = TRUE) for this argument to have an
-effect.
(string) The optimization algorithm. One of "lbfgs",
-"bfgs", or "newton". The control parameters below are only available
-for "lbfgs" and "bfgs". For their default values and more details see
-the CmdStan User's Guide. The default values can also be obtained by
-running cmdstanr_example(method="optimize")$metadata().
(logical) Whether or not to use the Jacobian adjustment for
-constrained variables. For historical reasons, the default is FALSE, meaning optimization
-yields the (regularized) maximum likelihood estimate. Setting it to TRUE
-yields the maximum a posteriori estimate. See the
-Maximum Likelihood Estimation
-section of the CmdStan User's Guide for more details.
-For use later with $laplace() the jacobian
-argument should typically be set to TRUE.
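The effect of the Jacobian adjustment can be worked through by hand for the Bernoulli example model (uniform prior, `theta` constrained to the interval 0 to 1). The sketch below uses base R's `optimize()` rather than CmdStan, and assumes the constraint is handled with a logit transform; the function names are introduced only for this illustration.

```r
y <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)  # same data as in the Examples

# Log density on the constrained scale (the beta(1, 1) prior adds nothing).
log_lik <- function(theta) sum(dbinom(y, size = 1, prob = theta, log = TRUE))

# Without the Jacobian: maximize directly in theta.
mle <- optimize(log_lik, interval = c(0, 1), maximum = TRUE)

# With the Jacobian: maximize in u = logit(theta), adding
# log|d theta / d u| = log(theta) + log(1 - theta).
log_dens_u <- function(u) {
  theta <- plogis(u)
  log_lik(theta) + log(theta) + log1p(-theta)
}
mode_u    <- optimize(log_dens_u, interval = c(-10, 10), maximum = TRUE)
theta_jac <- plogis(mode_u$maximum)

round(c(no_jacobian = mle$maximum, jacobian = theta_jac), 2)
```

The two estimates come out near 0.2 and 0.25, with objective values near -5.004 and -6.748, which agree with the log probabilities printed by the two `$optimize()` runs in the Examples.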
(positive real) The initial step size parameter.
(positive integer) The maximum number of iterations.
(positive real) Convergence tolerance on changes in objective function value.
(positive real) Convergence tolerance on relative changes in objective function value.
(positive real) Convergence tolerance on the norm of the gradient.
(positive real) Convergence tolerance on the relative norm of the gradient.
(positive real) Convergence tolerance on changes in parameter value.
(positive integer) The size of the history used when -approximating the Hessian. Only available for L-BFGS.
(logical) When TRUE (the default), prints all output
-during the execution process, such as iteration numbers and elapsed times.
-If the output is silenced then the $output() method
-of the resulting fit object can be used to display the silenced messages.
(logical) When TRUE (the default), prints all
-informational messages, for example rejection of the current proposal.
-Disable if you wish to silence these messages, but this is not usually
-recommended unless you are very confident that the model is correct up to
-numerical error. If the messages are silenced then the
-$output() method of the resulting fit object can be
-used to display the silenced messages.
(logical) When TRUE (the default), call CmdStan
-with argument "output save_config=1" to save a JSON file which contains
-the argument tree and extra information (equivalent to the output CSV file
-header). This option is only available in CmdStan 2.34.0 and later.
A CmdStanMLE object.
The CmdStanR website -(mc-stan.org/cmdstanr) for online -documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -12.6782
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 5 -6.74802 0.00154762 3.33598e-05 1 1 8
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.1 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.23 -6.96 0.673 0.289 -8.53 -6.75
-#> 2 lp_approx__ -0.484 -0.215 0.665 0.294 -1.83 -0.00153
-#> 3 theta 0.270 0.253 0.121 0.119 0.0987 0.501
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 1.1e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.2 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.2 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -6.755726
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 9.605e-04 1.028e-05 9.884e-01 9.884e-01 76 -6.240e+00 -6.240e+00
-#> Path [1] :Best Iter: [2] ELBO (-6.221702) evaluations: (76)
-#> Path [2] :Initial log joint density = -9.051867
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.457e-04 2.924e-06 1.000e+00 1.000e+00 126 -6.190e+00 -6.190e+00
-#> Path [2] :Best Iter: [5] ELBO (-6.189740) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.775220
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 2.337e-03 3.532e-06 9.682e-01 9.682e-01 76 -6.298e+00 -6.298e+00
-#> Path [3] :Best Iter: [3] ELBO (-6.298399) evaluations: (76)
-#> Path [4] :Initial log joint density = -18.179237
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.662e-04 1.607e-05 1.000e+00 1.000e+00 126 -6.271e+00 -6.271e+00
-#> Path [4] :Best Iter: [4] ELBO (-6.175731) evaluations: (126)
-#> Path [5] :Initial log joint density = -17.845239
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 9.725e-04 2.274e-05 1.000e+00 1.000e+00 126 -6.275e+00 -6.275e+00
-#> Path [5] :Best Iter: [4] ELBO (-6.207498) evaluations: (126)
-#> Path [6] :Initial log joint density = -11.863276
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.187e-03 2.045e-05 1.000e+00 1.000e+00 126 -6.264e+00 -6.264e+00
-#> Path [6] :Best Iter: [2] ELBO (-6.141104) evaluations: (126)
-#> Path [7] :Initial log joint density = -6.807183
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 3.391e-04 1.605e-06 1.000e+00 1.000e+00 101 -6.264e+00 -6.264e+00
-#> Path [7] :Best Iter: [2] ELBO (-6.199985) evaluations: (101)
-#> Path [8] :Initial log joint density = -6.874334
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.370e-04 2.042e-05 9.398e-01 9.398e-01 101 -6.203e+00 -6.203e+00
-#> Path [8] :Best Iter: [4] ELBO (-6.203174) evaluations: (101)
-#> Path [9] :Initial log joint density = -16.813276
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.549e-03 4.398e-05 1.000e+00 1.000e+00 126 -6.174e+00 -6.174e+00
-#> Path [9] :Best Iter: [5] ELBO (-6.174126) evaluations: (126)
-#> Path [10] :Initial log joint density = -11.429555
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.045e-03 1.644e-05 1.000e+00 1.000e+00 126 -6.293e+00 -6.293e+00
-#> Path [10] :Best Iter: [3] ELBO (-6.173918) evaluations: (126)
-#> Total log probability function evaluations:1260
-#> Finished in 0.2 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.4 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
-R/model.R
- model-method-pathfinder.Rd
The $pathfinder() method of a CmdStanModel object runs
-Stan's Pathfinder algorithms. Pathfinder is a variational method for
-approximately sampling from differentiable log densities. Starting from a
-random initialization, Pathfinder locates normal approximations
-to the target density along a quasi-Newton optimization path in
-the unconstrained space, with local covariance estimated using
-the negative inverse Hessian estimates produced by the LBFGS
-optimizer. Pathfinder selects the normal approximation with the
-lowest estimated Kullback-Leibler (KL) divergence to the true
-posterior. Finally Pathfinder draws from that normal
-approximation and returns the draws transformed to the
-constrained scale. See the
-CmdStan User’s Guide
-for more details.
Any argument left as NULL will take the default value used by the
-installed version of CmdStan.
pathfinder(
- data = NULL,
- seed = NULL,
- refresh = NULL,
- init = NULL,
- save_latent_dynamics = FALSE,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- sig_figs = NULL,
- opencl_ids = NULL,
- num_threads = NULL,
- init_alpha = NULL,
- tol_obj = NULL,
- tol_rel_obj = NULL,
- tol_grad = NULL,
- tol_rel_grad = NULL,
- tol_param = NULL,
- history_size = NULL,
- single_path_draws = NULL,
- draws = NULL,
- num_paths = 4,
- max_lbfgs_iters = NULL,
- num_elbo_draws = NULL,
- save_single_paths = NULL,
- psis_resample = NULL,
- calculate_lp = NULL,
- show_messages = TRUE,
- show_exceptions = TRUE,
- save_cmdstan_config = NULL
-)
(multiple options) The data to use for the variables specified in
-the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the -appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(non-negative integer) The number of iterations between
-printed screen updates. If refresh = 0, only error messages will be
-printed.
(multiple options) The initialization method to use for the -variables declared in the parameters block of the Stan program. One of the -following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space.
The number 0. This initializes all parameters to 0.
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For -MCMC the list should contain a sublist for each chain. For other model -fitting methods there should be just one sublist. The sublists should have -named elements corresponding to the parameters for which you are specifying -initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that are the
-same name and dimensions as the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-draws than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The draws object must have at least some parameters
-that have the same name and dimensions as in the current Stan model.
(logical) Should auxiliary diagnostic information
-about the latent dynamics be written to temporary diagnostic CSV files?
-This argument replaces CmdStan's diagnostic_file argument and the content
-written to CSV is controlled by the user's CmdStan installation and not
-CmdStanR (for some algorithms no content may be written). The default is
-FALSE, which is appropriate for almost every use case. To save the
-temporary files created when save_latent_dynamics=TRUE see the
-$save_latent_dynamics_files()
-method.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, a timestamp, and
-5 random characters.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
(integer vector of length 2) The platform and device IDs of
-the OpenCL device to use for fitting. The model must be compiled with
-cpp_options = list(stan_opencl = TRUE) for this argument to have an
-effect.
(positive integer) If the model was
-compiled with threading support, the number of
-threads to use in parallelized sections (e.g., for multi-path pathfinder
-as well as reduce_sum).
(positive real) The initial step size parameter.
(positive real) Convergence tolerance on changes in objective function value.
(positive real) Convergence tolerance on relative changes in objective function value.
(positive real) Convergence tolerance on the norm of the gradient.
(positive real) Convergence tolerance on the relative norm of the gradient.
(positive real) Convergence tolerance on changes in parameter value.
(positive integer) The size of the history used when -approximating the Hessian.
(positive integer) Number of draws a single
-pathfinder should return. The number of draws PSIS sampling samples from
-will be equal to single_path_draws * num_paths.
(positive integer) Number of draws to return after performing
-Pareto smoothed importance sampling (PSIS). This should be smaller than
-single_path_draws * num_paths (future versions of CmdStan will throw a
-warning).
(positive integer) Number of single pathfinders to run.
(positive integer) The maximum number of iterations -for LBFGS.
(positive integer) Number of draws to make when -calculating the ELBO of the approximation at each iteration of LBFGS.
(logical) Whether to save the results of single -pathfinder runs in multi-pathfinder.
(logical) Whether to perform Pareto smoothed importance sampling.
-If TRUE, the number of draws returned will be equal to draws.
-If FALSE, the number of draws returned will be equal to single_path_draws * num_paths.
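The relationship between `single_path_draws`, `num_paths`, and `draws` comes down to simple arithmetic, sketched here in base R. The first two values mirror the second `$pathfinder()` call in the Examples; the value of `draws` is an assumed request of 1000, used only for illustration.

```r
num_paths         <- 10
single_path_draws <- 40
draws             <- 1000  # assumed number of requested PSIS draws

# Candidate draws that PSIS resampling can choose from.
pool_size <- single_path_draws * num_paths

# Requesting more draws than the pool holds forces the same candidates
# to be selected more than once.
needs_reuse <- draws > pool_size

# Number of draws returned under each setting of psis_resample.
n_returned <- c(psis_resample_true = draws, psis_resample_false = pool_size)
```

Here the pool holds 400 candidates, so a request for 1000 draws re-samples from the same draws, which is the situation the warning in the Examples describes.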
(logical) Whether to calculate the log probability of the draws.
-If TRUE, the log probability will be calculated and given in the output.
-If FALSE, the log probability will only be returned for draws used to determine the
-ELBO in the pathfinder steps. All other draws will have a log probability of NA.
-A value of FALSE will also turn off Pareto smoothed importance sampling as the
-lp calculation is needed for PSIS.
(logical) When TRUE (the default), prints all output
-during the execution process, such as iteration numbers and elapsed times.
-If the output is silenced then the $output() method
-of the resulting fit object can be used to display the silenced messages.
(logical) When TRUE (the default), prints all
-informational messages, for example rejection of the current proposal.
-Disable if you wish to silence these messages, but this is not usually
-recommended unless you are very confident that the model is correct up to
-numerical error. If the messages are silenced then the
-$output() method of the resulting fit object can be
-used to display the silenced messages.
(logical) When TRUE (the default), call CmdStan
-with argument "output save_config=1" to save a json file which contains
-the argument tree and extra information (equivalent to the output CSV file
-header). This option is only available in CmdStan 2.34.0 and later.
A CmdStanPathfinder object.
The CmdStanR website
-(mc-stan.org/cmdstanr) for online
-documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -11.6636
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 5 -6.74802 0.00109631 1.77159e-05 1 1 8
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.2 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.26 -6.97 0.807 0.300 -8.67 -6.75
-#> 2 lp_approx__ -0.514 -0.222 0.798 0.305 -1.93 -0.00172
-#> 3 theta 0.270 0.254 0.125 0.122 0.0996 0.507
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 1.1e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.2 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.2 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -8.866534
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.406e-04 2.829e-06 1.000e+00 1.000e+00 126 -6.250e+00 -6.250e+00
-#> Path [1] :Best Iter: [3] ELBO (-6.238275) evaluations: (126)
-#> Path [2] :Initial log joint density = -12.781290
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.589e-03 3.507e-05 1.000e+00 1.000e+00 126 -6.162e+00 -6.162e+00
-#> Path [2] :Best Iter: [4] ELBO (-6.140443) evaluations: (126)
-#> Path [3] :Initial log joint density = -7.160474
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.830e-04 1.466e-07 1.000e+00 1.000e+00 126 -6.227e+00 -6.227e+00
-#> Path [3] :Best Iter: [2] ELBO (-6.155847) evaluations: (126)
-#> Path [4] :Initial log joint density = -15.099639
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.032e-03 6.062e-05 1.000e+00 1.000e+00 126 -6.233e+00 -6.233e+00
-#> Path [4] :Best Iter: [3] ELBO (-6.206690) evaluations: (126)
-#> Path [5] :Initial log joint density = -8.969584
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.443e-04 2.894e-06 1.000e+00 1.000e+00 126 -6.229e+00 -6.229e+00
-#> Path [5] :Best Iter: [4] ELBO (-6.164928) evaluations: (126)
-#> Path [6] :Initial log joint density = -7.603579
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.364e-04 6.236e-07 1.000e+00 1.000e+00 126 -6.247e+00 -6.247e+00
-#> Path [6] :Best Iter: [4] ELBO (-6.232661) evaluations: (126)
-#> Path [7] :Initial log joint density = -6.767406
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 1.821e-03 5.050e-06 9.750e-01 9.750e-01 76 -6.198e+00 -6.198e+00
-#> Path [7] :Best Iter: [3] ELBO (-6.197967) evaluations: (76)
-#> Path [8] :Initial log joint density = -7.131671
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.518e-04 1.054e-07 1.000e+00 1.000e+00 126 -6.299e+00 -6.299e+00
-#> Path [8] :Best Iter: [2] ELBO (-6.177966) evaluations: (126)
-#> Path [9] :Initial log joint density = -12.644619
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.534e-03 3.280e-05 1.000e+00 1.000e+00 126 -6.242e+00 -6.242e+00
-#> Path [9] :Best Iter: [3] ELBO (-6.212996) evaluations: (126)
-#> Path [10] :Initial log joint density = -9.270214
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.999e-04 3.679e-06 1.000e+00 1.000e+00 126 -6.217e+00 -6.217e+00
-#> Path [10] :Best Iter: [4] ELBO (-6.202150) evaluations: (126)
-#> Total log probability function evaluations:1360
-#> Finished in 0.2 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
-The $sample() method of a CmdStanModel object runs Stan's
-main Markov chain Monte Carlo algorithm.
Any argument left as NULL will default to the default value used by the
-installed version of CmdStan. See the
-CmdStan User’s Guide
-for more details.
After model fitting any diagnostics specified via the diagnostics
-argument will be checked and warnings will be printed if warranted.
sample(
- data = NULL,
- seed = NULL,
- refresh = NULL,
- init = NULL,
- save_latent_dynamics = FALSE,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- sig_figs = NULL,
- chains = 4,
- parallel_chains = getOption("mc.cores", 1),
- chain_ids = seq_len(chains),
- threads_per_chain = NULL,
- opencl_ids = NULL,
- iter_warmup = NULL,
- iter_sampling = NULL,
- save_warmup = FALSE,
- thin = NULL,
- max_treedepth = NULL,
- adapt_engaged = TRUE,
- adapt_delta = NULL,
- step_size = NULL,
- metric = NULL,
- metric_file = NULL,
- inv_metric = NULL,
- init_buffer = NULL,
- term_buffer = NULL,
- window = NULL,
- fixed_param = FALSE,
- show_messages = TRUE,
- show_exceptions = TRUE,
- diagnostics = c("divergences", "treedepth", "ebfmi"),
- save_metric = NULL,
- save_cmdstan_config = NULL,
- cores = NULL,
- num_cores = NULL,
- num_chains = NULL,
- num_warmup = NULL,
- num_samples = NULL,
- validate_csv = NULL,
- save_extra_diagnostics = NULL,
- max_depth = NULL,
- stepsize = NULL
-)
(multiple options) The data to use for the variables specified in
-the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the
-appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(non-negative integer) The number of iterations between
-printed screen updates. If refresh = 0, only error messages will be
-printed.
(multiple options) The initialization method to use for the
-variables declared in the parameters block of the Stan program. One of the
-following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space.
The number 0. This initializes all parameters to 0.
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For
-MCMC the list should contain a sublist for each chain. For other model
-fitting methods there should be just one sublist. The sublists should have
-named elements corresponding to the parameters for which you are specifying
-initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that are the
-same name and dimensions as the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-draws than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The draws object must have at least some parameters
-that are the same name and dimensions as the current Stan model.
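The replacement rule described above for fit objects and draws objects can be sketched in base R. The helper below is hypothetical (not part of CmdStanR) and assumes uniform weights; it does not model the Pareto smoothed weights used in the PSIS cases.

```r
# Hypothetical helper (not part of CmdStanR): choose which rows of a set of
# draws would seed each chain/path, following the replacement rule above.
pick_init_rows <- function(n_draws, n_chains) {
  # Fewer draws than chains: rows must be reused (with replacement).
  # Otherwise: each chain gets a distinct row (without replacement).
  sample.int(n_draws, size = n_chains, replace = n_draws < n_chains)
}

set.seed(1)
rows_few  <- pick_init_rows(n_draws = 2, n_chains = 4)     # rows repeat
rows_many <- pick_init_rows(n_draws = 1000, n_chains = 4)  # rows are distinct
```

With only 2 draws and 4 chains at least one row is necessarily reused, whereas with 1000 draws every chain starts from a different draw.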
(logical) Should auxiliary diagnostic information
-about the latent dynamics be written to temporary diagnostic CSV files?
-This argument replaces CmdStan's diagnostic_file argument and the content
-written to CSV is controlled by the user's CmdStan installation and not
-CmdStanR (for some algorithms no content may be written). The default is
-FALSE, which is appropriate for almost every use case. To save the
-temporary files created when save_latent_dynamics=TRUE see the
-$save_latent_dynamics_files()
-method.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, timestamp, and
-5 random characters.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
(positive integer) The number of Markov chains to run. The
-default is 4.
(positive integer) The maximum number of MCMC chains
-to run in parallel. If parallel_chains is not specified then the default
-is to look for the option "mc.cores", which can be set for an entire R
-session by options(mc.cores=value). If the "mc.cores" option has not
-been set then the default is 1.
(integer vector) A vector of chain IDs. Must contain as many
-unique positive integers as the number of chains. If not set, the default
-chain IDs are used (integers starting from 1).
(positive integer) If the model was
-compiled with threading support, the number of
-threads to use in parallelized sections within an MCMC chain (e.g., when
-using the Stan functions reduce_sum() or map_rect()). This is in
-contrast with parallel_chains, which specifies the number of chains to
-run in parallel. The actual number of CPU cores used is
-parallel_chains*threads_per_chain. For an example of using threading see
-the Stan case study Reduce Sum: A Minimal Example.
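As a rough illustration of the parallel_chains * threads_per_chain rule above, the following sketch checks a requested configuration against the number of detected cores. The numeric values are illustrative assumptions, and the commented-out calls assume that CmdStan is installed and that `file` and `stan_data` are defined as in the Examples.

```r
# Illustrative values (assumptions, not recommendations)
parallel_chains   <- 4
threads_per_chain <- 2
cores_needed <- parallel_chains * threads_per_chain

# parallel::detectCores() may return NA on some platforms
cores_available <- parallel::detectCores()
if (!is.na(cores_available) && cores_needed > cores_available) {
  message("Requested ", cores_needed, " cores but only ",
          cores_available, " detected.")
}

# With CmdStan installed, the corresponding calls would look like:
# mod <- cmdstan_model(file, cpp_options = list(stan_threads = TRUE))
# fit <- mod$sample(data = stan_data, chains = 4,
#                   parallel_chains = parallel_chains,
#                   threads_per_chain = threads_per_chain)
```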
(integer vector of length 2) The platform and device IDs of
-the OpenCL device to use for fitting. The model must be compiled with
-cpp_options = list(stan_opencl = TRUE) for this argument to have an
-effect.
(positive integer) The number of warmup iterations to run
-per chain. Note: in the CmdStan User's Guide this is referred to as
-num_warmup.
(positive integer) The number of post-warmup iterations
-to run per chain. Note: in the CmdStan User's Guide this is referred to as
-num_samples.
(logical) Should warmup iterations be saved? The default
-is FALSE.
(positive integer) The period between saved samples. This should
-typically be left at its default (no thinning) unless memory is a problem.
(positive integer) The maximum allowed tree depth for
-the NUTS engine. See the Tree Depth section of the CmdStan User's Guide
-for more details.
(logical) Do warmup adaptation? The default is TRUE.
-If a precomputed inverse metric is specified via the inv_metric argument
-(or metric_file) then, if adapt_engaged=TRUE, Stan will use the
-provided inverse metric just as an initial guess during adaptation. To turn
-off adaptation when using a precomputed inverse metric set
-adapt_engaged=FALSE.
(real in (0,1)) The adaptation target acceptance
-statistic.
(positive real) The initial step size for the discrete
-approximation to continuous Hamiltonian dynamics. This is further tuned
-during warmup.
(string) One of "diag_e", "dense_e", or "unit_e",
-specifying the geometry of the base manifold. See the Euclidean Metric
-section of the CmdStan User's Guide for more details. To specify a
-precomputed (inverse) metric, see the inv_metric argument below.
(character vector) The paths to JSON or Rdump files (one
-per chain) compatible with CmdStan that contain precomputed inverse
-metrics. The metric_file argument is inherited from CmdStan but is
-confusing in that the entry in JSON or Rdump file(s) must be named
-inv_metric, referring to the inverse metric. We recommend instead using
-CmdStanR's inv_metric argument (see below) to specify an inverse metric
-directly using a vector or matrix from your R session.
(vector, matrix) A vector (if metric='diag_e') or a
-matrix (if metric='dense_e') for initializing the inverse metric. This
-can be used as an alternative to the metric_file argument. A vector is
-interpreted as a diagonal metric. The inverse metric is usually set to an
-estimate of the posterior covariance. See the adapt_engaged argument
-above for details about (and control over) how specifying a precomputed
-inverse metric interacts with adaptation.
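Because the inverse metric is usually set to an estimate of the posterior covariance, one way to build an inv_metric value is from a matrix of earlier draws. The sketch below uses simulated numbers as a stand-in for real draws (an assumption for illustration only); in practice the draws should be on the unconstrained scale, where Stan's sampler operates. The commented-out call assumes `mod` and `stan_data` exist as in the Examples.

```r
# Stand-in for earlier draws: 500 draws of 3 parameters, simulated here
# with column standard deviations 0.5, 1, and 2.
set.seed(123)
draws_mat <- matrix(rnorm(500 * 3, sd = rep(c(0.5, 1, 2), each = 500)),
                    ncol = 3)

post_cov <- cov(draws_mat)
inv_metric_diag  <- diag(post_cov)  # vector, for metric = "diag_e"
inv_metric_dense <- post_cov        # matrix, for metric = "dense_e"

# With CmdStan installed, a precomputed metric could then be supplied as:
# fit <- mod$sample(data = stan_data, metric = "diag_e",
#                   inv_metric = inv_metric_diag, adapt_engaged = FALSE)
```

Setting adapt_engaged = FALSE here follows the note in the adapt_engaged description: with adaptation left on, the supplied inverse metric is only used as an initial guess.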
(nonnegative integer) Width of initial fast timestep
-adaptation interval during warmup.
(nonnegative integer) Width of final fast timestep
-adaptation interval during warmup.
(nonnegative integer) Initial width of slow timestep/metric
-adaptation interval.
(logical) When TRUE, call CmdStan with argument
-"algorithm=fixed_param". The default is FALSE. The fixed parameter
-sampler generates a new sample without changing the current state of the
-Markov chain; only generated quantities may change. This can be useful
-when, for example, trying to generate pseudo-data using the generated
-quantities block. If the parameters block is empty then using
-fixed_param=TRUE is mandatory. When fixed_param=TRUE the chains and
-parallel_chains arguments will be set to 1.
(logical) When TRUE (the default), prints all output
-during the execution process, such as iteration numbers and elapsed times.
-If the output is silenced then the $output() method
-of the resulting fit object can be used to display the silenced messages.
(logical) When TRUE (the default), prints all
-informational messages, for example rejection of the current proposal.
-Disable if you wish to silence these messages, but this is not usually
-recommended unless you are very confident that the model is correct up to
-numerical error. If the messages are silenced then the
-$output() method of the resulting fit object can be
-used to display the silenced messages.
(character vector) The diagnostics to automatically check
-and warn about after sampling. Setting this to an empty string "" or
-NULL can be used to prevent CmdStanR from automatically reading in the
-sampler diagnostics from CSV if you wish to manually read in the results
-and validate them yourself, for example using read_cmdstan_csv(). The
-currently available diagnostics are "divergences", "treedepth", and
-"ebfmi" (the default is to check all of them).
These diagnostics are also available after fitting. The
-$sampler_diagnostics() method provides
-access to the diagnostic values for each iteration and the
-$diagnostic_summary() method provides
-summaries of the diagnostics and can regenerate the warning messages.
Diagnostics like R-hat and effective sample size are not currently
-available via the diagnostics argument but can be checked after fitting
-using the $summary() method.
(logical) When TRUE, call CmdStan with argument
-"adaptation save_metric=1" to save the adapted metric in a separate JSON
-file with elements "stepsize", "metric_type" and "inv_metric". The default
-is TRUE. This option is only available in CmdStan 2.34.0 and later.
(logical) When TRUE (the default), call CmdStan
-with argument "output save_config=1" to save a json file which contains
-the argument tree and extra information (equivalent to the output CSV file
-header). This option is only available in CmdStan 2.34.0 and later.
Deprecated and will be removed in a future release.
A CmdStanMCMC object.
The CmdStanR website
-(mc-stan.org/cmdstanr) for online
-documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample_mpi,
-model-method-variables,
-model-method-variational
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -6.989
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 4 -6.74802 0.000455385 0.000104592 0.9108 0.9108 7
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.1 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.23 -6.97 0.700 0.304 -8.62 -6.75
-#> 2 lp_approx__ -0.482 -0.222 0.697 0.301 -1.89 -0.00177
-#> 3 theta 0.273 0.257 0.121 0.123 0.104 0.502
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 1.2e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.2 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.1 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -14.842013
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.041e-03 6.008e-05 1.000e+00 1.000e+00 126 -6.245e+00 -6.245e+00
-#> Path [1] :Best Iter: [4] ELBO (-6.197550) evaluations: (126)
-#> Path [2] :Initial log joint density = -13.719046
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.898e-03 4.974e-05 1.000e+00 1.000e+00 126 -6.222e+00 -6.222e+00
-#> Path [2] :Best Iter: [3] ELBO (-6.220670) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.815359
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 4.756e-03 9.139e-05 9.421e-01 9.421e-01 76 -6.278e+00 -6.278e+00
-#> Path [3] :Best Iter: [2] ELBO (-6.242127) evaluations: (76)
-#> Path [4] :Initial log joint density = -17.875212
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 9.541e-04 2.212e-05 1.000e+00 1.000e+00 126 -6.214e+00 -6.214e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.214309) evaluations: (126)
-#> Path [5] :Initial log joint density = -6.766198
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 2.751e-03 1.480e-04 1.000e+00 1.000e+00 76 -6.182e+00 -6.182e+00
-#> Path [5] :Best Iter: [3] ELBO (-6.182449) evaluations: (76)
-#> Path [6] :Initial log joint density = -17.054272
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.428e-03 3.932e-05 1.000e+00 1.000e+00 126 -6.264e+00 -6.264e+00
-#> Path [6] :Best Iter: [2] ELBO (-6.179165) evaluations: (126)
-#> Path [7] :Initial log joint density = -18.498779
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 5.714e-04 1.045e-05 1.000e+00 1.000e+00 126 -6.239e+00 -6.239e+00
-#> Path [7] :Best Iter: [2] ELBO (-6.173540) evaluations: (126)
-#> Path [8] :Initial log joint density = -9.008917
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.451e-04 2.910e-06 1.000e+00 1.000e+00 126 -6.200e+00 -6.200e+00
-#> Path [8] :Best Iter: [2] ELBO (-6.176862) evaluations: (126)
-#> Path [9] :Initial log joint density = -7.868746
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.971e-04 1.133e-06 1.000e+00 1.000e+00 126 -6.234e+00 -6.234e+00
-#> Path [9] :Best Iter: [4] ELBO (-6.211410) evaluations: (126)
-#> Path [10] :Initial log joint density = -6.753679
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 3 -6.748e+00 9.064e-04 2.757e-05 1.000e+00 1.000e+00 76 -6.257e+00 -6.257e+00
-#> Path [10] :Best Iter: [2] ELBO (-6.157033) evaluations: (76)
-#> Total log probability function evaluations:1260
-#> Finished in 0.2 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
-The $sample_mpi() method of a CmdStanModel object is
-identical to the $sample() method but with support for
-MPI (message passing interface). The target audience for MPI is
-users with large computer clusters. For other users, the
-$sample() method provides both parallelization of
-chains and threading support for within-chain parallelization.
In order to use MPI with Stan, an MPI implementation must be
-installed. For Unix systems the most commonly used implementations are
-MPICH and OpenMPI. The implementations provide an MPI C++ compiler wrapper
-(for example mpicxx), which is required to compile the model.
-An example of compiling with MPI:
-mpi_options = list(STAN_MPI=TRUE, CXX="mpicxx", TBB_CXX_TYPE="gcc")
-mod = cmdstan_model("model.stan", cpp_options = mpi_options)
The C++ options that must be supplied to the
-compile call are:
STAN_MPI: Enables the use of MPI with Stan if TRUE.
CXX: The name of the MPI C++ compiler wrapper. Typically "mpicxx".
TBB_CXX_TYPE: The C++ compiler the MPI wrapper wraps. Typically "gcc"
-on Linux and "clang" on macOS.
In the call to the $sample_mpi() method it is also possible to provide
-the name of the MPI launcher (mpi_cmd, defaulting to "mpiexec") and any
-other MPI launch arguments (mpi_args). In most cases, it is enough to
-only define the number of processes. To use n_procs processes specify
-mpi_args = list("n" = n_procs).
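As a rough illustration of how the launcher settings combine, the sketch below rebuilds the command line that the mpi_args argument description gives as an example (mpiexec -n 4 model_executable). The helper preview_mpi_command() is hypothetical and written only for this illustration. It is not part of CmdStanR, and it assumes that each named element of mpi_args becomes a single-dash flag followed by its value.

```r
# Hypothetical helper (not a CmdStanR function): previews the launch command,
# assuming each named element of mpi_args becomes "-name value".
preview_mpi_command <- function(mpi_cmd = "mpiexec", mpi_args = list(),
                                exe = "model_executable") {
  flags <- unlist(lapply(names(mpi_args), function(nm) {
    c(paste0("-", nm), as.character(mpi_args[[nm]]))
  }))
  paste(c(mpi_cmd, flags, exe), collapse = " ")
}

n_procs <- 4
mpi_cmd_preview <- preview_mpi_command(mpi_args = list("n" = n_procs))
print(mpi_cmd_preview)
#> [1] "mpiexec -n 4 model_executable"
```

In an actual run the same list would be passed as mpi_args to $sample_mpi(), and CmdStanR would append the CmdStan arguments for the model executable.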
sample_mpi(
- data = NULL,
- mpi_cmd = "mpiexec",
- mpi_args = NULL,
- seed = NULL,
- refresh = NULL,
- init = NULL,
- save_latent_dynamics = FALSE,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- chains = 1,
- chain_ids = seq_len(chains),
- iter_warmup = NULL,
- iter_sampling = NULL,
- save_warmup = FALSE,
- thin = NULL,
- max_treedepth = NULL,
- adapt_engaged = TRUE,
- adapt_delta = NULL,
- step_size = NULL,
- metric = NULL,
- metric_file = NULL,
- inv_metric = NULL,
- init_buffer = NULL,
- term_buffer = NULL,
- window = NULL,
- fixed_param = FALSE,
- sig_figs = NULL,
- show_messages = TRUE,
- show_exceptions = TRUE,
- diagnostics = c("divergences", "treedepth", "ebfmi"),
- save_cmdstan_config = NULL,
- validate_csv = TRUE
-)
(multiple options) The data to use for the variables specified in
-the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the
-appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(string) The MPI launcher used for launching MPI
-processes. The default launcher is "mpiexec".
(list) A list of arguments to use when launching MPI
-processes. For example, mpi_args = list("n" = 4) launches the executable
-as mpiexec -n 4 model_executable, followed by CmdStan arguments for the
-model executable.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(non-negative integer) The number of iterations between
-printed screen updates. If refresh = 0, only error messages will be
-printed.
(multiple options) The initialization method to use for the
-variables declared in the parameters block of the Stan program. One of the
-following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space.
The number 0. This initializes all parameters to 0.
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For
-MCMC the list should contain a sublist for each chain. For other model
-fitting methods there should be just one sublist. The sublists should have
-named elements corresponding to the parameters for which you are specifying
-initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that have the
-same name and dimensions as in the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-draws than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The draws object must have at least some parameters
-that have the same name and dimensions as in the current Stan model.
(logical) Should auxiliary diagnostic information
-about the latent dynamics be written to temporary diagnostic CSV files?
-This argument replaces CmdStan's diagnostic_file argument and the content
-written to CSV is controlled by the user's CmdStan installation and not
-CmdStanR (for some algorithms no content may be written). The default is
-FALSE, which is appropriate for almost every use case. To save the
-temporary files created when save_latent_dynamics=TRUE see the
-$save_latent_dynamics_files()
-method.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, timestamp, and
-5 random characters.
(positive integer) The number of Markov chains to run. Note that
-the usage above sets chains = 1 for $sample_mpi(); $sample() defaults to 4.
(integer vector) A vector of chain IDs. Must contain as many
-unique positive integers as the number of chains. If not set, the default
-chain IDs are used (integers starting from 1).
(positive integer) The number of warmup iterations to run
-per chain. Note: in the CmdStan User's Guide this is referred to as
-num_warmup.
(positive integer) The number of post-warmup iterations
-to run per chain. Note: in the CmdStan User's Guide this is referred to as
-num_samples.
(logical) Should warmup iterations be saved? The default
-is FALSE.
(positive integer) The period between saved samples. This should
-typically be left at its default (no thinning) unless memory is a problem.
(positive integer) The maximum allowed tree depth for
-the NUTS engine. See the Tree Depth section of the CmdStan User's Guide
-for more details.
(logical) Do warmup adaptation? The default is TRUE.
-If a precomputed inverse metric is specified via the inv_metric argument
-(or metric_file) then, if adapt_engaged=TRUE, Stan will use the
-provided inverse metric just as an initial guess during adaptation. To turn
-off adaptation when using a precomputed inverse metric set
-adapt_engaged=FALSE.
(real in (0,1)) The adaptation target acceptance
-statistic.
(positive real) The initial step size for the discrete
-approximation to continuous Hamiltonian dynamics. This is further tuned
-during warmup.
(string) One of "diag_e", "dense_e", or "unit_e",
-specifying the geometry of the base manifold. See the Euclidean Metric
-section of the CmdStan User's Guide for more details. To specify a
-precomputed (inverse) metric, see the inv_metric argument below.
(character vector) The paths to JSON or Rdump files (one
-per chain) compatible with CmdStan that contain precomputed inverse
-metrics. The metric_file argument is inherited from CmdStan but is
-confusing in that the entry in JSON or Rdump file(s) must be named
-inv_metric, referring to the inverse metric. We recommend instead using
-CmdStanR's inv_metric argument (see below) to specify an inverse metric
-directly using a vector or matrix from your R session.
(vector, matrix) A vector (if metric='diag_e') or a
-matrix (if metric='dense_e') for initializing the inverse metric. This
-can be used as an alternative to the metric_file argument. A vector is
-interpreted as a diagonal metric. The inverse metric is usually set to an
-estimate of the posterior covariance. See the adapt_engaged argument
-above for details about (and control over) how specifying a precomputed
-inverse metric interacts with adaptation.
(nonnegative integer) Width of initial fast timestep
-adaptation interval during warmup.
(nonnegative integer) Width of final fast timestep
-adaptation interval during warmup.
(nonnegative integer) Initial width of slow timestep/metric
-adaptation interval.
(logical) When TRUE, call CmdStan with argument
-"algorithm=fixed_param". The default is FALSE. The fixed parameter
-sampler generates a new sample without changing the current state of the
-Markov chain; only generated quantities may change. This can be useful
-when, for example, trying to generate pseudo-data using the generated
-quantities block. If the parameters block is empty then using
-fixed_param=TRUE is mandatory. When fixed_param=TRUE the chains and
-parallel_chains arguments will be set to 1.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
(logical) When TRUE (the default), prints all output
-during the execution process, such as iteration numbers and elapsed times.
-If the output is silenced then the $output() method
-of the resulting fit object can be used to display the silenced messages.
(logical) When TRUE (the default), prints all
-informational messages, for example rejection of the current proposal.
-Disable if you wish to silence these messages, but this is not usually
-recommended unless you are very confident that the model is correct up to
-numerical error. If the messages are silenced then the
-$output() method of the resulting fit object can be
-used to display the silenced messages.
(character vector) The diagnostics to automatically check
-and warn about after sampling. Setting this to an empty string "" or
-NULL can be used to prevent CmdStanR from automatically reading in the
-sampler diagnostics from CSV if you wish to manually read in the results
-and validate them yourself, for example using read_cmdstan_csv(). The
-currently available diagnostics are "divergences", "treedepth", and
-"ebfmi" (the default is to check all of them).
These diagnostics are also available after fitting. The
-$sampler_diagnostics() method provides
-access to the diagnostic values for each iteration and the
-$diagnostic_summary() method provides
-summaries of the diagnostics and can regenerate the warning messages.
Diagnostics like R-hat and effective sample size are not currently
-available via the diagnostics argument but can be checked after fitting
-using the $summary() method.
(logical) When TRUE (the default), call CmdStan
-with argument "output save_config=1" to save a JSON file which contains
-the argument tree and extra information (equivalent to the output CSV file
-header). This option is only available in CmdStan 2.34.0 and later.
Deprecated. Use diagnostics instead.
A CmdStanMCMC object.
The CmdStanR website
-(mc-stan.org/cmdstanr) for online
-documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
The Stan Math Library's documentation
-(mc-stan.org/math) for more
-details on MPI support in Stan.
-Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-variables,
-model-method-variational
# \dontrun{
-# mpi_options <- list(STAN_MPI=TRUE, CXX="mpicxx", TBB_CXX_TYPE="gcc")
-# mod <- cmdstan_model("model.stan", cpp_options = mpi_options)
-# fit <- mod$sample_mpi(..., mpi_args = list("n" = 4))
-# }
-
-The $variables() method of a CmdStanModel object returns
-a list, each element representing a Stan model block: data, parameters,
-transformed_parameters and generated_quantities.
Each element contains a list of variables, with each variable represented
-as a list with information on its scalar type (real or int) and
-number of dimensions.
transformed data is not included, as variables in that block are not
-part of the model's input or output.
variables()
The $variables() method returns a list with information on input and
-output variables for each of the Stan model blocks.
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variational
# \dontrun{
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-
-# create a `CmdStanModel` object, compiling the model is not required
-mod <- cmdstan_model(file, compile = FALSE)
-
-mod$variables()
-#> $parameters
-#> $parameters$theta
-#> $parameters$theta$type
-#> [1] "real"
-#>
-#> $parameters$theta$dimensions
-#> [1] 0
-#>
-#>
-#>
-#> $included_files
-#> list()
-#>
-#> $data
-#> $data$N
-#> $data$N$type
-#> [1] "int"
-#>
-#> $data$N$dimensions
-#> [1] 0
-#>
-#>
-#> $data$y
-#> $data$y$type
-#> [1] "int"
-#>
-#> $data$y$dimensions
-#> [1] 1
-#>
-#>
-#>
-#> $transformed_parameters
-#> named list()
-#>
-#> $generated_quantities
-#> named list()
-#>
-
-# }
-
-R/model.R
- model-method-variational.Rd The $variational() method of a CmdStanModel object runs
-Stan's Automatic Differentiation Variational Inference (ADVI) algorithms.
-The approximation is a Gaussian in the unconstrained variable space. Stan
-implements two ADVI algorithms: the algorithm="meanfield" option uses a
-fully factorized Gaussian for the approximation; the algorithm="fullrank"
-option uses a Gaussian with a full-rank covariance matrix for the
-approximation. See the
-CmdStan User’s Guide
-for more details.
Any argument left as NULL will default to the default value used by the
-installed version of CmdStan.
variational(
- data = NULL,
- seed = NULL,
- refresh = NULL,
- init = NULL,
- save_latent_dynamics = FALSE,
- output_dir = getOption("cmdstanr_output_dir"),
- output_basename = NULL,
- sig_figs = NULL,
- threads = NULL,
- opencl_ids = NULL,
- algorithm = NULL,
- iter = NULL,
- grad_samples = NULL,
- elbo_samples = NULL,
- eta = NULL,
- adapt_engaged = NULL,
- adapt_iter = NULL,
- tol_rel_obj = NULL,
- eval_elbo = NULL,
- output_samples = NULL,
- draws = NULL,
- show_messages = TRUE,
- show_exceptions = TRUE,
- save_cmdstan_config = NULL
-)(multiple options) The data to use for the variables specified in -the data block of the Stan program. One of the following:
A named list of R objects with the names corresponding to variables
-declared in the data block of the Stan program. Internally this list is
-then written to JSON for CmdStan using write_stan_json(). See
-write_stan_json() for details on the conversions performed on R objects
-before they are passed to Stan.
A path to a data file compatible with CmdStan (JSON or R dump). See the -appendices in the CmdStan guide for details on using these formats.
NULL or an empty list if the Stan program has no data block.
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
-In the case of multi-chain sampling the single seed will automatically be
-augmented by the run (chain) ID so that each chain uses a different
-seed. The exception is the transformed data block, which defaults to using
-the same seed for all chains so that the same data is generated for all chains
-if RNG functions are used. The only time seed should be specified as a
-vector (one element per chain) is if RNG functions are used in transformed
-data and the goal is to generate different data for each chain.
(non-negative integer) The number of iterations between
-printed screen updates. If refresh = 0, only error messages will be
-printed.
(multiple options) The initialization method to use for the -variables declared in the parameters block of the Stan program. One of the -following:
A real number x>0. This initializes all parameters randomly between
-[-x,x] on the unconstrained parameter space;
The number 0. This initializes all parameters to 0;
A character vector of paths (one per chain) to JSON or Rdump files
-containing initial values for all or some parameters. See
-write_stan_json() to write R objects to JSON files compatible with
-CmdStan.
A list of lists containing initial values for all or some parameters. For -MCMC the list should contain a sublist for each chain. For other model -fitting methods there should be just one sublist. The sublists should have -named elements corresponding to the parameters for which you are specifying -initial values. See Examples.
A function that returns a single list with names corresponding to the
-parameters for which you are specifying initial values. The function can
-take no arguments or a single argument chain_id. For MCMC, if the
-function has argument chain_id it will be supplied with the chain id
-(from 1 to number of chains) when called to generate the initial values.
-See
-Examples.
A CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder,
-or CmdStanLaplace fit object.
-If the fit object's parameters are only a subset of the model
-parameters then the other parameters will be drawn by Stan's default
-initialization. The fit object must have at least some parameters that are the
-same name and dimensions as the current Stan model. For the sample and
-pathfinder method, if the fit object has fewer draws than the requested
-number of chains/paths then the inits will be drawn using sampling with
-replacement. Otherwise sampling without replacement will be used.
-When a CmdStanPathfinder fit object is used as the init, if
-psis_resample was set to FALSE and calculate_lp was
-set to TRUE (default), then resampling without replacement with Pareto
-smoothed weights will be used. If psis_resample was set to TRUE or
-calculate_lp was set to FALSE then sampling without replacement with
-uniform weights will be used to select the draws.
-PSIS resampling is used to select the draws for CmdStanVB
-and CmdStanLaplace fit objects.
A type inheriting from posterior::draws. If the draws object has fewer
-samples than the number of requested chains/paths then the inits will be
-drawn using sampling with replacement. Otherwise sampling without
-replacement will be used. If the draws object's parameters are only a subset
-of the model parameters then the other parameters will be drawn by Stan's
-default initialization. The draws object must have at least some parameters
-that are the same name and dimensions as the current Stan model.
(logical) Should auxiliary diagnostic information
-about the latent dynamics be written to temporary diagnostic CSV files?
-This argument replaces CmdStan's diagnostic_file argument and the content
-written to CSV is controlled by the user's CmdStan installation and not
-CmdStanR (for some algorithms no content may be written). The default is
-FALSE, which is appropriate for almost every use case. To save the
-temporary files created when save_latent_dynamics=TRUE see the
-$save_latent_dynamics_files()
-method.
(string) A path to a directory where CmdStan should write
-its output CSV files. For MCMC there will be one file per chain; for other
-methods there will be a single file. For interactive use this can typically
-be left at NULL (temporary directory) since CmdStanR makes the CmdStan
-output (posterior draws and diagnostics) available in R via methods of the
-fitted model objects. This can be set for an entire R session using
-options(cmdstanr_output_dir). The behavior of output_dir is as follows:
If NULL (the default), then the CSV files are written to a temporary
-directory and only saved permanently if the user calls one of the $save_*
-methods of the fitted model object (e.g.,
-$save_output_files()). These temporary
-files are removed when the fitted model object is garbage collected (manually or automatically).
If a path, then the files are created in output_dir with names
-corresponding to the defaults used by $save_output_files().
(string) A string to use as a prefix for the names of
-the output CSV files of CmdStan. If NULL (the default), the basename of
-the output CSV files will be composed of the model name, timestamp, and
-5 random characters.
(positive integer) The number of significant figures used
-when storing the output values. By default, CmdStan represents the output
-values with 6 significant figures. The upper limit for sig_figs is 18.
-Increasing this value will result in larger output CSV files and thus an
-increased usage of disk space.
(positive integer) If the model was
-compiled with threading support, the number of
-threads to use in parallelized sections (e.g., when using the Stan
-functions reduce_sum() or map_rect()).
(integer vector of length 2) The platform and device IDs of
-the OpenCL device to use for fitting. The model must be compiled with
-cpp_options = list(stan_opencl = TRUE) for this argument to have an
-effect.
(string) The algorithm. Either "meanfield" or
-"fullrank".
(positive integer) The maximum number of iterations.
(positive integer) The number of samples for Monte Carlo -estimate of gradients.
(positive integer) The number of samples for Monte Carlo -estimate of ELBO (objective function).
(positive real) The step size weighting parameter for adaptive -step size sequence.
(logical) Do warmup adaptation?
(positive integer) The maximum number of adaptation -iterations.
(positive real) Convergence tolerance on the relative norm -of the objective.
(positive integer) Evaluate ELBO every Nth iteration.
(positive integer) Use draws argument instead.
-output_samples will be deprecated in the future.
(positive integer) Number of approximate posterior -samples to draw and save.
(logical) When TRUE (the default), prints all output
-during the execution process, such as iteration numbers and elapsed times.
-If the output is silenced then the $output() method
-of the resulting fit object can be used to display the silenced messages.
(logical) When TRUE (the default), prints all
-informational messages, for example rejection of the current proposal.
-Disable if you wish to silence these messages, but this is not usually
-recommended unless you are very confident that the model is correct up to
-numerical error. If the messages are silenced then the
-$output() method of the resulting fit object can be
-used to display the silenced messages.
(logical) When TRUE (the default), call CmdStan
-with argument "output save_config=1" to save a json file which contains
-the argument tree and extra information (equivalent to the output CSV file
-header). This option is only available in CmdStan 2.34.0 and later.
A CmdStanVB object.
The CmdStanR website -(mc-stan.org/cmdstanr) for online -documentation and tutorials.
-The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
-model-method-check_syntax,
-model-method-compile,
-model-method-diagnose,
-model-method-expose_functions,
-model-method-format,
-model-method-generate-quantities,
-model-method-laplace,
-model-method-optimize,
-model-method-pathfinder,
-model-method-sample,
-model-method-sample_mpi,
-model-method-variables
# \dontrun{
-library(cmdstanr)
-library(posterior)
-library(bayesplot)
-color_scheme_set("brightblue")
-
-# Set path to CmdStan
-# (Note: if you installed CmdStan via install_cmdstan() with default settings
-# then setting the path is unnecessary but the default below should still work.
-# Otherwise use the `path` argument to specify the location of your
-# CmdStan installation.)
-set_cmdstan_path(path = NULL)
-#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.36.0
-
-# Create a CmdStanModel object from a Stan program,
-# here using the example model that comes with CmdStan
-file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
-mod <- cmdstan_model(file)
-mod$print()
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0, upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0, upper=1> theta;
-#> }
-#> model {
-#> theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> y ~ bernoulli(theta);
-#> }
-# Print with line numbers. This can be set globally using the
-# `cmdstanr_print_line_numbers` option.
-mod$print(line_numbers = TRUE)
-#> 1: data {
-#> 2: int<lower=0> N;
-#> 3: array[N] int<lower=0, upper=1> y;
-#> 4: }
-#> 5: parameters {
-#> 6: real<lower=0, upper=1> theta;
-#> 7: }
-#> 8: model {
-#> 9: theta ~ beta(1, 1); // uniform prior on interval 0,1
-#> 10: y ~ bernoulli(theta);
-#> 11: }
-
-# Data as a named list (like RStan)
-stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
-
-# Run MCMC using the 'sample' method
-fit_mcmc <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- parallel_chains = 2
-)
-#> Running MCMC with 2 parallel chains...
-#>
-#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
-#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
-#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
-#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
-#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
-#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
-#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
-#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
-#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
-#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
-#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
-#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
-#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
-#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
-#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
-#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
-#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
-#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
-#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
-#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
-#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
-#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.2 seconds.
-#>
-
-# Use 'posterior' package for summaries
-fit_mcmc$summary()
-#> # A tibble: 2 × 10
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.35 -7.01 0.882 0.353 -9.14 -6.75 1.00 724. 896.
-#> 2 theta 0.254 0.239 0.129 0.126 0.0737 0.488 1.00 532. 657.
-
-# Check sampling diagnostics
-fit_mcmc$diagnostic_summary()
-#> $num_divergent
-#> [1] 0 0
-#>
-#> $num_max_treedepth
-#> [1] 0 0
-#>
-#> $ebfmi
-#> [1] 1.1148479 0.7568734
-#>
-
-# Get posterior draws
-draws <- fit_mcmc$draws()
-print(draws)
-#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
-#> , , variable = lp__
-#>
-#> chain
-#> iteration 1 2
-#> 1 -7.0 -8.1
-#> 2 -7.9 -7.9
-#> 3 -7.4 -7.0
-#> 4 -6.7 -6.8
-#> 5 -6.9 -6.8
-#>
-#> , , variable = theta
-#>
-#> chain
-#> iteration 1 2
-#> 1 0.17 0.088
-#> 2 0.46 0.097
-#> 3 0.41 0.167
-#> 4 0.25 0.292
-#> 5 0.18 0.238
-#>
-#> # ... with 995 more iterations
-
-# Convert to data frame using posterior::as_draws_df
-as_draws_df(draws)
-#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
-#> lp__ theta
-#> 1 -7.0 0.17
-#> 2 -7.9 0.46
-#> 3 -7.4 0.41
-#> 4 -6.7 0.25
-#> 5 -6.9 0.18
-#> 6 -6.9 0.33
-#> 7 -7.2 0.15
-#> 8 -6.8 0.29
-#> 9 -6.8 0.24
-#> 10 -6.8 0.24
-#> # ... with 1990 more draws
-#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
-
-# Plot posterior using bayesplot (ggplot2)
-mcmc_hist(fit_mcmc$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
-# and also demonstrate specifying data as a path to a file instead of a list
-my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
-fit_optim <- mod$optimize(data = my_data_file, seed = 123)
-#> Initial log joint probability = -16.144
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000246518 8.73164e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim$summary()
-#> # A tibble: 2 × 2
-#> variable estimate
-#> <chr> <dbl>
-#> 1 lp__ -5.00
-#> 2 theta 0.2
-
-# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
-# to the posterior
-fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
-#> Initial log joint probability = -19.2814
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 5 -6.74802 0.000163806 1.63613e-06 1 1 8
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.1 seconds.
-fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
-#> Calculating Hessian
-#> Calculating inverse of Cholesky factor
-#> Generating draws
-#> iteration: 0
-#> iteration: 100
-#> iteration: 200
-#> iteration: 300
-#> iteration: 400
-#> iteration: 500
-#> iteration: 600
-#> iteration: 700
-#> iteration: 800
-#> iteration: 900
-#> iteration: 1000
-#> iteration: 1100
-#> iteration: 1200
-#> iteration: 1300
-#> iteration: 1400
-#> iteration: 1500
-#> iteration: 1600
-#> iteration: 1700
-#> iteration: 1800
-#> iteration: 1900
-#> Finished in 0.2 seconds.
-fit_laplace$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.24 -6.96 0.752 0.292 -8.69 -6.75
-#> 2 lp_approx__ -0.496 -0.208 0.731 0.290 -1.95 -0.00172
-#> 3 theta 0.270 0.248 0.123 0.116 0.0971 0.507
-
-# Run 'variational' method to use ADVI to approximate posterior
-fit_vb <- mod$variational(data = stan_data, seed = 123)
-#> ------------------------------------------------------------
-#> EXPERIMENTAL ALGORITHM:
-#> This procedure has not been thoroughly tested and may be unstable
-#> or buggy. The interface is subject to change.
-#> ------------------------------------------------------------
-#> Gradient evaluation took 1.1e-05 seconds
-#> 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
-#> Adjust your expectations accordingly!
-#> Begin eta adaptation.
-#> Iteration: 1 / 250 [ 0%] (Adaptation)
-#> Iteration: 50 / 250 [ 20%] (Adaptation)
-#> Iteration: 100 / 250 [ 40%] (Adaptation)
-#> Iteration: 150 / 250 [ 60%] (Adaptation)
-#> Iteration: 200 / 250 [ 80%] (Adaptation)
-#> Success! Found best value [eta = 1] earlier than expected.
-#> Begin stochastic gradient ascent.
-#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
-#> 100 -6.164 1.000 1.000
-#> 200 -6.225 0.505 1.000
-#> 300 -6.186 0.339 0.010 MEDIAN ELBO CONVERGED
-#> Drawing a sample of size 1000 from the approximate posterior...
-#> COMPLETED.
-#> Finished in 0.2 seconds.
-fit_vb$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp__ -7.14 -6.93 0.528 0.247 -8.21 -6.75
-#> 2 lp_approx__ -0.520 -0.244 0.740 0.326 -1.90 -0.00227
-#> 3 theta 0.251 0.236 0.107 0.108 0.100 0.446
-mcmc_hist(fit_vb$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' method, a new alternative to the variational method
-fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
-#> Path [1] :Initial log joint density = -18.273334
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 7.082e-04 1.432e-05 1.000e+00 1.000e+00 126 -6.145e+00 -6.145e+00
-#> Path [1] :Best Iter: [5] ELBO (-6.145070) evaluations: (126)
-#> Path [2] :Initial log joint density = -19.192715
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.015e-04 2.228e-06 1.000e+00 1.000e+00 126 -6.223e+00 -6.223e+00
-#> Path [2] :Best Iter: [2] ELBO (-6.170358) evaluations: (126)
-#> Path [3] :Initial log joint density = -6.774820
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.137e-04 2.596e-07 1.000e+00 1.000e+00 101 -6.178e+00 -6.178e+00
-#> Path [3] :Best Iter: [4] ELBO (-6.177909) evaluations: (101)
-#> Path [4] :Initial log joint density = -7.949193
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.145e-04 1.301e-06 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [4] :Best Iter: [5] ELBO (-6.197118) evaluations: (126)
-#> Total log probability function evaluations:4379
-#> Finished in 0.2 seconds.
-fit_pf$summary()
-#> # A tibble: 3 × 7
-#> variable mean median sd mad q5 q95
-#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 lp_approx__ -1.07 -0.727 0.945 0.311 -2.91 -0.450
-#> 2 lp__ -7.25 -6.97 0.753 0.308 -8.78 -6.75
-#> 3 theta 0.256 0.245 0.119 0.123 0.0824 0.462
-mcmc_hist(fit_pf$draws("theta"))
-#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
-
-# Run 'pathfinder' again with more paths, fewer draws per path,
-# better covariance approximation, and fewer LBFGS iterations
-fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
- history_size=50, max_lbfgs_iters=100)
-#> Warning: Number of PSIS draws is larger than the total number of draws returned by the single Pathfinders. This is likely unintentional and leads to re-sampling from the same draws.
-#> Path [1] :Initial log joint density = -6.777948
-#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 1.327e-04 3.358e-07 1.000e+00 1.000e+00 101 -6.183e+00 -6.183e+00
-#> Path [1] :Best Iter: [4] ELBO (-6.183163) evaluations: (101)
-#> Path [2] :Initial log joint density = -8.072775
-#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 2.399e-04 1.562e-06 1.000e+00 1.000e+00 126 -6.271e+00 -6.271e+00
-#> Path [2] :Best Iter: [4] ELBO (-6.239963) evaluations: (126)
-#> Path [3] :Initial log joint density = -9.025342
-#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.454e-04 2.916e-06 1.000e+00 1.000e+00 126 -6.285e+00 -6.285e+00
-#> Path [3] :Best Iter: [2] ELBO (-6.207932) evaluations: (126)
-#> Path [4] :Initial log joint density = -9.983004
-#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 6.283e-04 7.448e-06 1.000e+00 1.000e+00 126 -6.267e+00 -6.267e+00
-#> Path [4] :Best Iter: [3] ELBO (-6.107202) evaluations: (126)
-#> Path [5] :Initial log joint density = -13.400879
-#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.809e-03 4.509e-05 1.000e+00 1.000e+00 126 -6.206e+00 -6.206e+00
-#> Path [5] :Best Iter: [4] ELBO (-6.199332) evaluations: (126)
-#> Path [6] :Initial log joint density = -7.627321
-#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.419e-04 6.650e-07 1.000e+00 1.000e+00 126 -6.197e+00 -6.197e+00
-#> Path [6] :Best Iter: [4] ELBO (-6.176730) evaluations: (126)
-#> Path [7] :Initial log joint density = -13.719529
-#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.898e-03 4.974e-05 1.000e+00 1.000e+00 126 -6.198e+00 -6.198e+00
-#> Path [7] :Best Iter: [5] ELBO (-6.198257) evaluations: (126)
-#> Path [8] :Initial log joint density = -8.734378
-#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 3.325e-04 2.705e-06 1.000e+00 1.000e+00 126 -6.215e+00 -6.215e+00
-#> Path [8] :Best Iter: [3] ELBO (-6.210584) evaluations: (126)
-#> Path [9] :Initial log joint density = -15.787917
-#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 5 -6.748e+00 1.925e-03 5.805e-05 1.000e+00 1.000e+00 126 -6.251e+00 -6.251e+00
-#> Path [9] :Best Iter: [3] ELBO (-6.246013) evaluations: (126)
-#> Path [10] :Initial log joint density = -7.311648
-#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
-#> 4 -6.748e+00 5.348e-03 1.585e-04 1.000e+00 1.000e+00 101 -6.229e+00 -6.229e+00
-#> Path [10] :Best Iter: [3] ELBO (-6.203261) evaluations: (101)
-#> Total log probability function evaluations:1360
-#> Pareto k value (0.78) is greater than 0.7. Importance resampling was not able to improve the approximation, which may indicate that the approximation itself is poor.
-#> Finished in 0.2 seconds.
-
-# Specifying initial values as a function
-fit_mcmc_w_init_fun <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function() list(theta = runif(1))
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.4 seconds.
-#>
-fit_mcmc_w_init_fun_2 <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = function(chain_id) {
- # silly but demonstrates optional use of chain_id
- list(theta = 1 / (chain_id + 1))
- }
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_mcmc_w_init_fun_2$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.5
-#>
-#>
-#> [[2]]
-#> [[2]]$theta
-#> [1] 0.3333333
-#>
-#>
-
-# Specifying initial values as a list of lists
-fit_mcmc_w_init_list <- mod$sample(
- data = stan_data,
- seed = 123,
- chains = 2,
- refresh = 0,
- init = list(
- list(theta = 0.75), # chain 1
- list(theta = 0.25) # chain 2
- )
-)
-#> Running MCMC with 2 sequential chains...
-#>
-#> Chain 1 finished in 0.0 seconds.
-#> Chain 2 finished in 0.0 seconds.
-#>
-#> Both chains finished successfully.
-#> Mean chain execution time: 0.0 seconds.
-#> Total execution time: 0.3 seconds.
-#>
-fit_optim_w_init_list <- mod$optimize(
- data = stan_data,
- seed = 123,
- init = list(
- list(theta = 0.75)
- )
-)
-#> Initial log joint probability = -11.6657
-#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
-#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
-#> Optimization terminated normally:
-#> Convergence detected: relative gradient magnitude is below tolerance
-#> Finished in 0.2 seconds.
-fit_optim_w_init_list$init()
-#> [[1]]
-#> [[1]]$theta
-#> [1] 0.75
-#>
-#>
-# }
-
-read_cmdstan_csv() is used internally by CmdStanR to read
-CmdStan's output CSV files into R. It can also be used by CmdStan users as
-a more flexible and efficient alternative to rstan::read_stan_csv(). See
-the Value section for details on the structure of the returned list.
It is also possible to create CmdStanR's fitted model objects directly from
-CmdStan CSV files using the as_cmdstan_fit() function.
(character vector) The paths to the CmdStan CSV files. These can -be files generated by running CmdStanR or running CmdStan directly.
(character vector) Optionally, the names of the variables -(parameters, transformed parameters, and generated quantities) to read in.
If NULL (the default) then all variables are included.
If an empty string (variables="") then none are included.
For non-scalar variables all elements or specific elements can be selected:
variables = "theta" selects all elements of theta;
variables = c("theta[1]", "theta[3]") selects only the 1st and 3rd elements.
(character vector) Works the same way as
-variables but for sampler diagnostic variables (e.g., "treedepth__",
-"accept_stat__", etc.). Ignored if the model was not fit using MCMC.
(string) The format for storing the draws or point estimates. -The default depends on the method used to fit the model. See -draws for details, in particular the note about speed -and memory for models with many parameters.
(logical) For models fit using MCMC, should
-diagnostic checks be performed after reading in the files? The default is
-TRUE but set to FALSE to avoid checking for problems with divergences
-and treedepth.
as_cmdstan_fit() returns a CmdStanMCMC, CmdStanMLE, CmdStanLaplace or
-CmdStanVB object. Some methods typically defined for those objects will not
-work (e.g. save_data_file()) but the important methods like $summary(),
-$draws(), $sampler_diagnostics() and others will work fine.
read_cmdstan_csv() returns a named list with the following components:
metadata: A list of the meta information from the run that produced the
-CSV file(s). See Examples below.
The other components in the returned list depend on the method that produced -the CSV file(s).
-For sampling the returned list also includes the following components:
time: Run time information for the individual chains. The returned object
-is the same as for the $time() method except the total run
-time can't be inferred from the CSV files (the chains may have been run in
-parallel) and is therefore NA.
inv_metric: A list (one element per chain) of inverse mass matrices
-or their diagonals, depending on the type of metric used.
step_size: A list (one element per chain) of the step sizes used.
warmup_draws: If save_warmup was TRUE when fitting the model then a
-draws_array (or different format if format is
-specified) of warmup draws.
post_warmup_draws: A draws_array (or
-different format if format is specified) of post-warmup draws.
warmup_sampler_diagnostics: If save_warmup was TRUE when fitting the
-model then a draws_array (or different format if
-format is specified) of warmup draws of the sampler diagnostic variables.
post_warmup_sampler_diagnostics: A
-draws_array (or different format if format is
-specified) of post-warmup draws of the sampler diagnostic variables.
For optimization the returned list also includes the -following components:
point_estimates: Point estimates for the model parameters.
For laplace and -variational inference the returned list also -includes the following components:
draws: A draws_matrix (or different format
-if format is specified) of draws from the approximate posterior
-distribution.
For standalone generated quantities the -returned list also includes the following components:
generated_quantities: A draws_array of
-the generated quantities.
# \dontrun{
-# Generate some CSV files to use for demonstration
-fit1 <- cmdstanr_example("logistic", method = "sample", save_warmup = TRUE)
-csv_files <- fit1$output_files()
-print(csv_files)
-#> [1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310849-1-4ed0d7.csv"
-#> [2] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310849-2-4ed0d7.csv"
-#> [3] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310849-3-4ed0d7.csv"
-#> [4] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-202503310849-4-4ed0d7.csv"
-
-# Creating fitted model objects
-
-# Create a CmdStanMCMC object from the CSV files
-fit2 <- as_cmdstan_fit(csv_files)
-fit2$print("beta")
-#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-#> beta[1] -0.66 -0.66 0.25 0.24 -1.07 -0.27 1.00 3828 2751
-#> beta[2] -0.27 -0.27 0.23 0.22 -0.65 0.09 1.00 4397 2844
-#> beta[3] 0.68 0.68 0.27 0.28 0.24 1.13 1.00 4288 2907
-
-# Using read_cmdstan_csv
-#
-# Read in everything
-x <- read_cmdstan_csv(csv_files)
-str(x)
-#> List of 8
-#> $ metadata :List of 42
-#> ..$ stan_version_major : num 2
-#> ..$ stan_version_minor : num 36
-#> ..$ stan_version_patch : num 0
-#> ..$ start_datetime : chr "2025-03-31 14:49:26 UTC"
-#> ..$ method : chr "sample"
-#> ..$ save_warmup : int 1
-#> ..$ thin : num 1
-#> ..$ gamma : num 0.05
-#> ..$ kappa : num 0.75
-#> ..$ t0 : num 10
-#> ..$ init_buffer : num 75
-#> ..$ term_buffer : num 50
-#> ..$ window : num 25
-#> ..$ save_metric : int 0
-#> ..$ algorithm : chr "hmc"
-#> ..$ engine : chr "nuts"
-#> ..$ metric : chr "diag_e"
-#> ..$ stepsize_jitter : num 0
-#> ..$ num_chains : num 1
-#> ..$ id : num [1:4] 1 2 3 4
-#> ..$ init : num [1:4] 2 2 2 2
-#> ..$ seed : num 1.72e+09
-#> ..$ refresh : num 100
-#> ..$ sig_figs : num -1
-#> ..$ profile_file : chr "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/logistic-profile-202503310849-1-480225.csv"
-#> ..$ save_cmdstan_config : int 0
-#> ..$ stanc_version : chr "stanc3 v2.36.0"
-#> ..$ sampler_diagnostics : chr [1:6] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__" ...
-#> ..$ variables : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-#> ..$ step_size_adaptation: num [1:4] 0.785 0.802 0.787 0.715
-#> ..$ model_name : chr "logistic_model"
-#> ..$ adapt_engaged : int 1
-#> ..$ adapt_delta : num 0.8
-#> ..$ max_treedepth : num 10
-#> ..$ step_size : num [1:4] 1 1 1 1
-#> ..$ iter_warmup : num 1000
-#> ..$ iter_sampling : num 1000
-#> ..$ threads_per_chain : num 1
-#> ..$ time :'data.frame': 4 obs. of 4 variables:
-#> .. ..$ chain_id: num [1:4] 1 2 3 4
-#> .. ..$ warmup : num [1:4] 0.077 0.077 0.076 0.078
-#> .. ..$ sampling: num [1:4] 0.071 0.074 0.073 0.077
-#> .. ..$ total : num [1:4] 0.148 0.151 0.149 0.155
-#> ..$ stan_variable_sizes :List of 4
-#> .. ..$ lp__ : num 1
-#> .. ..$ alpha : num 1
-#> .. ..$ beta : num 3
-#> .. ..$ log_lik: num 100
-#> ..$ stan_variables : chr [1:4] "lp__" "alpha" "beta" "log_lik"
-#> ..$ model_params : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-#> $ time :List of 2
-#> ..$ total : int NA
-#> ..$ chains:'data.frame': 4 obs. of 4 variables:
-#> .. ..$ chain_id: num [1:4] 1 2 3 4
-#> .. ..$ warmup : num [1:4] 0.077 0.077 0.076 0.078
-#> .. ..$ sampling: num [1:4] 0.071 0.074 0.073 0.077
-#> .. ..$ total : num [1:4] 0.148 0.151 0.149 0.155
-#> $ inv_metric :List of 4
-#> ..$ 1: num [1:4] 0.0414 0.0575 0.0508 0.0651
-#> ..$ 2: num [1:4] 0.0478 0.0564 0.0458 0.068
-#> ..$ 3: num [1:4] 0.0464 0.0556 0.0444 0.0718
-#> ..$ 4: num [1:4] 0.0418 0.0554 0.0495 0.078
-#> $ step_size :List of 4
-#> ..$ 1: num 0.785
-#> ..$ 2: num 0.802
-#> ..$ 3: num 0.787
-#> ..$ 4: num 0.715
-#> $ warmup_draws : 'draws_array' num [1:1000, 1:4, 1:105] -82.8 -82.8 -82.8 -70.9 -67.5 ...
-#> ..- attr(*, "dimnames")=List of 3
-#> .. ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> .. ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> .. ..$ variable : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-#> $ post_warmup_draws : 'draws_array' num [1:1000, 1:4, 1:105] -64.9 -64.9 -66.2 -65.8 -65 ...
-#> ..- attr(*, "dimnames")=List of 3
-#> .. ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> .. ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> .. ..$ variable : chr [1:105] "lp__" "alpha" "beta[1]" "beta[2]" ...
-#> $ warmup_sampler_diagnostics : 'draws_array' num [1:1000, 1:4, 1:6] 6.67e-01 0.00 1.24e-316 9.96e-01 1.00 ...
-#> ..- attr(*, "dimnames")=List of 3
-#> .. ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> .. ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> .. ..$ variable : chr [1:6] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__" ...
-#> $ post_warmup_sampler_diagnostics: 'draws_array' num [1:1000, 1:4, 1:6] 1 0.983 0.825 0.992 0.992 ...
-#> ..- attr(*, "dimnames")=List of 3
-#> .. ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
-#> .. ..$ chain : chr [1:4] "1" "2" "3" "4"
-#> .. ..$ variable : chr [1:6] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__" ...
-
-# Don't read in any of the sampler diagnostic variables
-x <- read_cmdstan_csv(csv_files, sampler_diagnostics = "")
-
-# Don't read in any of the parameters or generated quantities
-x <- read_cmdstan_csv(csv_files, variables = "")
-
-# Read in only specific parameters and sampler diagnostics
-x <- read_cmdstan_csv(
- csv_files,
- variables = c("alpha", "beta[2]"),
- sampler_diagnostics = c("n_leapfrog__", "accept_stat__")
-)
-
-# For non-scalar parameters all elements can be selected or only some elements,
-# e.g. all of the vector "beta" but only one element of the vector "log_lik"
-x <- read_cmdstan_csv(
- csv_files,
- variables = c("beta", "log_lik[3]")
-)
-# }
-
-Deprecated. Use read_cmdstan_csv() instead.
read_sample_csv(files, variables = NULL, sampler_diagnostics = NULL)
-Deprecated. Use read_cmdstan_csv() instead.
Registers CmdStanR's knitr engine eng_cmdstan() for processing Stan chunks.
-Refer to the vignette
-R Markdown CmdStan Engine
-for a demonstration.
register_knitr_engine(override = TRUE)
-override: (logical) Override knitr's built-in, RStan-based engine for
-Stan? The default is TRUE. See Details.
If override = TRUE (default), this registers CmdStanR's knitr engine as the
-engine for stan chunks, replacing knitr's built-in, RStan-based engine. If
-override = FALSE, this registers a cmdstan engine so that both engines
-may be used in the same R Markdown document. If the template supports syntax
-highlighting for the Stan language, the cmdstan chunks will have stan
-syntax highlighting applied to them.
See the vignette R Markdown CmdStan Engine for an example.
-Note: When running chunks interactively in RStudio (e.g. when using
-R Notebooks), it has
-been observed that the built-in, RStan-based engine is used for stan
-chunks even when CmdStanR's engine has been registered in the session. When
-the R Markdown document is knit/rendered, the correct engine is used. As a
-workaround, when running chunks interactively, it is recommended to use the
-override = FALSE option and change stan chunks to be cmdstan chunks.
If you would like to keep stan chunks as stan chunks, it is possible to
-specify engine = "cmdstan" in the chunk options after registering the
-cmdstan engine with override = FALSE.
Use the set_cmdstan_path() function to tell CmdStanR where the
-CmdStan installation is located. Once the path has been set,
-cmdstan_path() will return the full path to the CmdStan installation and
-cmdstan_version() will return the CmdStan version number. See Details
-for how to avoid manually setting the path in each R session.
set_cmdstan_path(path = NULL)
-
-cmdstan_path()
-
-cmdstan_version(error_on_NA = TRUE)
-path: (string) The full file path to the CmdStan installation. If
-NULL (the default) then the path is set to the default path used by
-install_cmdstan() if it exists.
error_on_NA: (logical) Should an error be thrown if CmdStan is not
-found? The default is TRUE. If FALSE, cmdstan_version() returns
-NULL.
A string. Either the file path to the CmdStan installation or the CmdStan version number.
-CmdStan version string if available. If CmdStan is not found and
-error_on_NA is FALSE, cmdstan_version() returns NULL.
Before the package can be used it needs to know where the CmdStan installation is located. When the package is loaded it tries to help automate this to avoid having to manually set the path every session:
If the environment variable "CMDSTAN" exists at load time
-then its value will be automatically set as the default path to CmdStan for
-the R session. If the environment variable "CMDSTAN" is set, but a valid
-CmdStan is not found in the supplied path, the path is treated as a top
-folder that contains CmdStan installations. In that case, the CmdStan
-installation with the largest version number will be set as the path to
-CmdStan for the R session.
If no environment variable is found at load time but any directory of the
-form ".cmdstan/cmdstan-[version]" (e.g., ".cmdstan/cmdstan-2.23.0")
-exists in the user's home directory (Sys.getenv("HOME"), not the current
-working directory) then the path to the CmdStan installation with the largest version
-number will be set as the path to CmdStan for the R session. This is the
-same as the default directory that install_cmdstan() would use to install
-the latest version of CmdStan.
It is always possible to change the path after loading the package using
-set_cmdstan_path(path).
R/utils.R
- stan_threads.Rd
-DEPRECATED. Please use the threads_per_chain argument when fitting the model.
num_threads()
-
-set_num_threads(num_threads)
-num_threads: (positive integer) The number of threads to set.
The value of the environment variable STAN_NUM_THREADS.
Convenience function for writing Stan code to a (possibly
-temporary) file with a .stan extension. By default, the
-file name is chosen deterministically based on a hash
-of the Stan code, and the file is not overwritten if it already has correct
-contents. This means that calling this function multiple times with the same
-Stan code will reuse the compiled model. However, this also means that the
-function is potentially not thread-safe. Using hash_salt = Sys.getpid()
-should ensure thread-safety in the rare cases when it is needed.
(character vector) The Stan code to write to the file. This can be a character vector of length one (a string) containing the entire Stan program or a character vector with each element containing one line of the Stan program.
(string) An optional path to the directory where the file will be
-written. If omitted, the global option cmdstanr_write_stan_file_dir is
-used. If the global option is not set, a temporary directory
-is used.
(string) If dir is specified, optionally the basename to
-use for the file created. If not specified a file name is generated
-from hashing the code.
(logical) If set to TRUE the file will always be
-overwritten and thus the resulting model will always be recompiled.
(string) Text to add to the model code prior to hashing to
-determine the file name if basename is not set.
The path to the file.
-# stan program as a single string
-stan_program <- "
-data {
- int<lower=0> N;
- array[N] int<lower=0,upper=1> y;
-}
-parameters {
- real<lower=0,upper=1> theta;
-}
-model {
- y ~ bernoulli(theta);
-}
-"
-
-f <- write_stan_file(stan_program)
-print(f)
-#> [1] "/var/folders/s0/zfzm55px2nd2v__zlw5xfj2h0000gn/T/RtmpWzIPg0/model_7f12fc190dd23b0e462f7d73040dd97e.stan"
-
-lines <- readLines(f)
-print(lines)
-#> [1] "" "data {"
-#> [3] " int<lower=0> N;" " array[N] int<lower=0,upper=1> y;"
-#> [5] "}" "parameters {"
-#> [7] " real<lower=0,upper=1> theta;" "}"
-#> [9] "model {" " y ~ bernoulli(theta);"
-#> [11] "}" ""
-cat(lines, sep = "\n")
-#>
-#> data {
-#> int<lower=0> N;
-#> array[N] int<lower=0,upper=1> y;
-#> }
-#> parameters {
-#> real<lower=0,upper=1> theta;
-#> }
-#> model {
-#> y ~ bernoulli(theta);
-#> }
-#>
-
-# stan program as character vector of lines
-f2 <- write_stan_file(lines)
-identical(readLines(f), readLines(f2))
-#> [1] TRUE
-
-Write data to a JSON file readable by CmdStan
-write_stan_json(data, file, always_decimal = FALSE)
-data: (list) A named list of R objects.
file: (string) The path to where the data file should be written.
always_decimal: (logical) Force non-integers to be written with decimal
-points to better distinguish between integers and floating point values.
-If TRUE all R objects in data intended for integers must be of integer
-type.
write_stan_json() performs several conversions before writing the JSON
-file:
logical -> integer (TRUE -> 1, FALSE -> 0)
data.frame -> matrix (via data.matrix())
list -> array
table -> vector, matrix, or array (depending on dimensions of table)
The list to array conversion is intended to make it easier to prepare
-the data for certain Stan declarations involving arrays:
vector[J] v[K] (or equivalently array[K] vector[J] v as of Stan 2.27)
-can be constructed in R as a list with K elements where each element is a
-vector of length J
matrix[I,J] m[K] (or equivalently array[K] matrix[I,J] m as of Stan
-2.27) can be constructed in R as a list with K elements where each element
-is an IxJ matrix
These can also be passed in from R as arrays instead of lists but the list
-option is provided for convenience. Unfortunately, for arrays with more than
-one dimension, e.g., vector[J] v[K,L] (or equivalently
-array[K,L] vector[J] v as of Stan 2.27) it is not possible to use an R
-list and an array must be used instead. For this example the array in R
-should have dimensions KxLxJ.
x <- matrix(rnorm(10), 5, 2)
-y <- rpois(nrow(x), lambda = 10)
-z <- c(TRUE, FALSE)
-data <- list(N = nrow(x), K = ncol(x), x = x, y = y, z = z)
-
-# write data to json file
-file <- tempfile(fileext = ".json")
-write_stan_json(data, file)
-
-# check the contents of the file
-cat(readLines(file), sep = "\n")
-#> {
-#> "N": 5,
-#> "K": 2,
-#> "x": [
-#> [0.594273774110513, 0.718888729854143],
-#> [0.0591351681787969, 0.251651069028968],
-#> [0.413398894737046, 1.35727443615177],
-#> [-1.09777217457042, 0.404468471278607],
-#> [0.711175257270441, 0.264364269837939]
-#> ],
-#> "y": [10, 11, 13, 11, 12],
-#> "z": [1, 0]
-#> }
-
-
-# demonstrating list to array conversion
-# suppose x is declared as `vector[3] x[2]` (or equivalently `array[2] vector[3] x`)
-# we can use a list of length 2 where each element is a vector of length 3
-data <- list(x = list(1:3, 4:6))
-file <- tempfile(fileext = ".json")
-write_stan_json(data, file)
-cat(readLines(file), sep = "\n")
-#> {
-#> "x": [
-#> [1, 2, 3],
-#> [4, 5, 6]
-#> ]
-#> }
-
-This function is deprecated. Please use write_stan_file() instead.
write_stan_tempfile(code, dir = tempdir())
-code: (character vector) The Stan code to write to the file. This can be a character vector of length one (a string) containing the entire Stan program or a character vector with each element containing one line of the Stan program.
dir: (string) An optional path to the directory where the file will be
-written. If omitted, the global option cmdstanr_write_stan_file_dir is
-used. If the global option is not set, a temporary directory
-is used.