Description
I was curious to see how much overhead callbacks produce in SLiM, so I wrote a simple model:
initialize() {
	defineConstant("L", 1000000);
	initializeMutationType("m1", 0.5, "f", 0.0);
	initializeGenomicElementType("g1", m1, 1.0);
	initializeChromosome(1, L, "H");
	initializeGenomicElement(g1);
	initializeMutationRate(1e-7);
	initializeRecombinationRate(1e-8);
}
1 late() {
	sim.addSubpop("p1", 1);
	target = p1.individuals.haplosomes;
	
	start = clock();
	for (pos in seqLen(sim.chromosome.length))
		target.addNewDrawnMutation(m1, pos);
	end = clock();
	catn("Mutation addition elapsed: " + (end - start));
	
	start = clock();
	sim.recalculateFitness();
	end = clock();
	catn("Fitness recalculation elapsed: " + (end - start));
	catn("Calculated fitness: " + p1.cachedFitness(0));
}
mutationEffect(m1) {
	return 1.0 + mut.position / 1.0e100;
}
The intention is that this makes a haploid chromosome of length L, adds a mutation at every position, and then times fitness recalculation. The fitness recalculation runs the mutationEffect() callback L times. The callback does a simple calculation, just as a proxy for whatever simple callback a user might implement. Seems straightforward enough. Boom; the process gets killed due to running out of memory, after spinning for a pretty long time. This is surprising, since one individual with 1 million mutations doesn't seem particularly unreasonable. So I tried smaller L, took some runtimes, and got this:
So, clearly the runtime is O(L^2) or something, which should ideally not be the case, and the addition of the new mutations is the culprit. For L=200,000 the peak memory usage is ~76 GB; that's about the biggest my machine will let it get without killing it. :-> For L=100,000 the usage is ~19 GB, so the memory usage also appears to be O(L^2) or something. That memory does appear to get freed; recycling the model makes it all go away. So it doesn't appear to be a leak.
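As a rough check on the O(L^2) reading, the two memory figures above can be turned into an empirical scaling exponent. This is a minimal Python sketch, separate from the SLiM model; the helper name `scaling_exponent` is hypothetical, and the extrapolation to L=1,000,000 assumes the same power law keeps holding, which has not been measured.

```python
import math

def scaling_exponent(l1, y1, l2, y2):
    """Exponent k such that y grows like L**k between two measurements."""
    return math.log(y2 / y1) / math.log(l2 / l1)

# Approximate peak memory figures reported above, in GB.
k = scaling_exponent(100_000, 19.0, 200_000, 76.0)
print(f"estimated memory scaling exponent: {k:.2f}")

# Extrapolate to the original L, ASSUMING the same power law continues.
projected_gb = 19.0 * (1_000_000 / 100_000) ** k
print(f"projected peak memory at L=1,000,000: {projected_gb:.0f} GB")
```

Doubling L quadruples the memory, so the exponent comes out at 2, and under that assumption the original L=1,000,000 run would need something on the order of 1,900 GB, which would explain why the process was killed.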
And by the way, for L=100,000 I get this output:
Mutation addition elapsed: 7.3478
Fitness recalculation elapsed: 0.008526
So it looks like SLiM can do about 10 million calls to this particular mutationEffect() callback per second. Maybe more like 20 million, since there is some overhead in fitness recalculation besides simply doing the callback execution.
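The arithmetic behind that estimate, as a small Python sketch using only the L=100,000 timings printed above. The variable names are illustrative, and the throughput figure is a lower bound on raw callback speed, since the fitness recalculation time includes overhead beyond the callback itself.

```python
L = 100_000                  # number of mutations, one callback call each
add_elapsed = 7.3478         # seconds, "Mutation addition elapsed"
fitness_elapsed = 0.008526   # seconds, "Fitness recalculation elapsed"

# Callback calls per second, counting all recalculation overhead.
calls_per_second = L / fitness_elapsed
print(f"callback calls per second: {calls_per_second / 1e6:.1f} million")

# How much slower mutation addition is than fitness recalculation.
slowdown = add_elapsed / fitness_elapsed
print(f"mutation addition / fitness recalculation: {slowdown:.0f}x")

# Average cost of adding one mutation, in microseconds.
per_mutation_us = add_elapsed / L * 1e6
print(f"average cost per added mutation: {per_mutation_us:.1f} us")
```

At this L, adding the mutations takes several hundred times longer than evaluating the callback for all of them, so the callback itself is not the bottleneck here.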
So, I'd like to get rid of the O(L^2) overhead here in both runtime and memory, if possible; doing something like this should not be out of bounds. But I don't have time to delve into it right this moment, so I'm making an issue. :->