Skip to content

Reimplement Nuisance to sample in an unconstrained space #1007

@sethaxen

Description

@sethaxen

Drawing from some of the tricks used in probabilistic programming packages, I'm proposing a reimplementation to IMP.isd.Nuisance in IMP that should enable better sampling. The beginnings of an implementation can be found here. How this changes things is detailed in the bullet points in the below exchanges with @benmwebb:

@sdaxen:

I've been reading about modern practices with Hamiltonian Monte Carlo (HMC), and one thing I've learned is that HMC typically has a hard time exploring the parameters of constrained variables near the bound of the constraint (Gibbs is even worse at this), which can result in oscillations between two highly biased regimes in the resulting samples, preventing an accurate sample from being drawn. The trick that's usually employed here is to transform the variables so that their support is the real line. The usual transformations are log for a variable with only an upper or lower bound and logit for a variable with both (with necessary shifting factors, etc). These transformations usually make exploration of the space much more efficient. Modifying all restraints to operate on log-scale parameters, etc is really cumbersome.

My idea is to modify the implementation of Nuisance so that instead of storing the nuisance value and enforcing a constraint, it stores only the transformed nuisance (log, logit, etc) and computes the untransformed nuisance upon request. Adding to the nuisance derivative would simply add to the derivative of the transformed nuisance multiplied by the Jacobian. Somehow the necessary Jacobian correction to the posterior will always be added to the score and derivative (once per transformed nuisance per evaluation). The bounds would be used to determine which transformation should be applied and to warn if the user manually sets the nuisance to a value outside of the constraints. An additional benefit is that all nuisances regardless of scale/bounds are on the same rough scale when transformed, and a nuisance mover can operate with some default parameters on the transformed nuisance value. The only change to how Nuisances behave is that the bounds will be exclusive instead of inclusive, but for parameters like Scale, we generally don't want them to ever be 0 anyways, and the fact they can be 0 often requires us to catch a resulting inf/nan in our restraints. What are your thoughts on this?

@benmwebb:

Seems reasonable, assuming it doesn't break existing code. Your
suggestion to make a branch to play with this sounds ideal.

@sdaxen:

Cool. I've implemented most of it, and I suspect it will break some tests that check result of scoring function evaluation or derivative of nuisance but in a well-defined way that we can check. Any sampled drawn from the posterior should be improved. There are a few other behaviors it changes (question below):

  • Bounds are now exclusive. If setting the nuisance outside of a bound, it sets the transformed nuisance such that the nuisance is as close to the bound as possible without being equal. This means some code that set nuisances to have tighter bounds than intended to avoid overflow is probably unnecessary.
    Related: Hard bounds should only be used if there's a physical/mathematical constraint there (e.g. Scale, Switching). To avoid unrealistically high values, a properly tuned prior should be used. Setting hard bounds that are reachable when sampling by the scoring function can really break inferences, and if they're unreachable, there's no need for a hard constraint (This is now documented). The crosslinking restraints in PMI are an example of where this is done and really should be updated accordingly.
  • Upper and lower bounds now can't be equal. This should go without saying, as the proper way to constrain a nuisance to a singular value is just to not optimize it. There are a couple tests where upper and lower bounds are carelessly set to be equal but the particle is set to be optimized.
  • NormalMover moves the transformed nuisance, not the nuisance. This is necessary to solve the biased sampling issue that's the whole point. Might be necessary to tweak a couple of the hardcoded nuisance moves in PMI so they behave similarly to a mover on the Nuisance.
  • Moving a Nuisance that is used for an upper or lower bound changes the value of the bounded nuisance, though its transformation is unchanged. I can't see any way to not do this without adding a decorator for NuisanceBound. I've not found anywhere in the IMP code where we bound a Nuisance with a Nuisance, so not certain how big of a deal this is.
  • The NuisanceScoreState is now used to update the derivative of the transformed nuisance with the derivative of the Jacobian-based modification of the prior due to change of variables. This happens after evaluation. I chose to do it here instead of adding it in the prior restraint because we want to maintain the default uniform prior on the untransformed nuisance if the user doesn't specify a prior.
  • Upon evaluation, the score should have the Jacobian-based modifications of all Nuisances added to it.
    The last point is the one thing that's missing (besides a couple Jacobian tests) before I go on hunts for failed tests. My idea is that it should be handled by the ScoringFunction base class upon evaluation. So an arbitrary ScoringFunction should have access to a list of all particles that have been decorated as nuisances in the model. Is that information accessible from the attribute table or would we need a new entry? Or is there a better way to do this? I'm trying to make it user-proof so that the user never needs to worry about setting up the transformation and its corrections properly can can just trust that the samples drawn using the scoring function are equivalent to draws from the untransformed scoring function that they defined.

@benmwebb:

I don't think it's a good idea to have the scoring function examine all
particles in the model looking for Nuisances (if I understand you
correctly). If nothing else, this will end up considering Nuisances that
aren't even used in the scoring function. The simplest way would be to
add a suitable restraint to the scoring function that gets given a list
of all the Nuisances and then calculates the necessary correction. A
more involved but "automatic" way might be to write a custom
ScoringFunction subclass that goes through all of its restraints, calls
get_particles() on each one, tries to cast to Nuisance and then
calculates the correction.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions