crash SLiM by adding lots of mutations #551

@bhaller

I was curious to see how much overhead callbacks produce in SLiM, so I wrote a simple model:

initialize() {
	defineConstant("L", 1000000);

	initializeMutationType("m1", 0.5, "f", 0.0);
	initializeGenomicElementType("g1", m1, 1.0);
	
	initializeChromosome(1, L, "H");
	initializeGenomicElement(g1);
	initializeMutationRate(1e-7);
	initializeRecombinationRate(1e-8);
}

1 late() {
	sim.addSubpop("p1", 1);
	target = p1.individuals.haplosomes;
	
	start = clock();
	for (pos in seqLen(sim.chromosome.length))
		target.addNewDrawnMutation(m1, pos);
	end = clock();
	
	catn("Mutation addition elapsed: " + (end - start));
	
	start = clock();
	sim.recalculateFitness();
	end = clock();
	
	catn("Fitness recalculation elapsed: " + (end - start));
	catn("Calculated fitness: " + p1.cachedFitness(0));
}

mutationEffect(m1) {
	return 1.0 + mut.position / 1.0e100;
}

The intention is that this makes a haploid chromosome of length L, adds a mutation at every position, and then times fitness recalculation. The fitness recalculation runs the mutationEffect() callback L times. The callback does a simple calculation, just as a proxy for whatever simple callback a user might implement. Seems straightforward enough. Boom; the process gets killed due to running out of memory, after spinning for a pretty long time. This is surprising, since one individual with 1 million mutations doesn't seem particularly unreasonable. So I tried smaller L, took some runtimes, and got this:

[Image: measured runtimes for smaller values of L]

So, clearly the runtime is roughly O(L^2), which should ideally not be the case, and the addition of the new mutations is the culprit. For L=200,000 the peak memory usage is ~76 GB; that's about the biggest my machine will let it get without killing it. :-> For L=100,000 the usage is ~19 GB, so doubling L quadruples the memory, and the memory usage also appears to be roughly O(L^2). That memory does appear to get freed; recycling the model makes it all go away. So it doesn't appear to be a leak.
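
As a rough check on that scaling claim, here is a minimal Python sketch that fits a power law through the two memory figures quoted above and extrapolates to the original L=1,000,000. The function names are made up for illustration, and the extrapolation assumes the same power law keeps holding at larger L, which has not been measured.

```python
import math

# Peak memory figures quoted in this issue: (L, peak memory in GB).
observations = [(100_000, 19.0), (200_000, 76.0)]


def scaling_exponent(p1, p2):
    """Exponent k of a power law y = c * L**k passing through two (L, y) points."""
    (l1, y1), (l2, y2) = p1, p2
    return math.log(y2 / y1) / math.log(l2 / l1)


def extrapolate(point, exponent, new_l):
    """Project y at new_l, assuming the same power law continues to hold."""
    l, y = point
    return y * (new_l / l) ** exponent


k = scaling_exponent(*observations)
projected_gb = extrapolate(observations[0], k, 1_000_000)

print(f"estimated exponent: {k:.2f}")
print(f"projected peak memory at L=1,000,000: {projected_gb:.0f} GB")
```

With only two data points this is an estimate and nothing more, but an exponent near 2 would put the L=1,000,000 run well beyond what a typical machine can allocate, which fits with the process getting killed.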

And by the way, for L=100,000 I get this output:

Mutation addition elapsed: 7.3478
Fitness recalculation elapsed: 0.008526

So it looks like SLiM can do about 10 million calls to this particular mutationEffect() callback per second. Maybe more like 20 million, since there is some overhead in fitness recalculation besides simply doing the callback execution.
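
The arithmetic behind that estimate, as a small Python sketch using the numbers from the output above. It assumes one mutationEffect() call per mutation (L calls in total) and treats the whole elapsed time as callback time, so it gives a lower bound on the callback rate.

```python
# Numbers from the L=100,000 run reported above.
callback_calls = 100_000      # one mutationEffect() call per mutation (assumed)
elapsed_seconds = 0.008526    # "Fitness recalculation elapsed"

# Treating all of the elapsed time as callback time gives a lower bound,
# since fitness recalculation has other overhead as well.
calls_per_second = callback_calls / elapsed_seconds

print(f"at least {calls_per_second / 1e6:.1f} million calls per second")
```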

So, I'd like to get rid of the O(L^2) overhead here in both runtime and memory, if possible; doing something like this should not be out of bounds. But I don't have time to delve into it right this moment, so I'm making an issue. :->
