The current method for outlier removal (e.g. removing the top x% of an ordered list) appears to produce a large number of false positives. Recommendation: replace this method with a NumPy-optimized version of the original mean/SD-based approach, and/or make the number of reads excluded a tunable parameter. Thanks to @jxmavs for raising this issue and for suggestions!
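A minimal sketch of what the suggested mean/SD-based approach with a tunable cutoff could look like; the function name, signature, and default threshold here are illustrative assumptions, not the project's actual API:

```python
import numpy as np

def remove_outliers(values, z_threshold=5.0):
    """Keep only values whose z-score is at or below z_threshold.

    Hypothetical sketch of the mean/SD-based approach: the z-score is
    each value's distance from the sample mean in units of the sample
    standard deviation. z_threshold is the tunable cutoff suggested in
    the issue; the default of 5.0 is arbitrary.
    """
    values = np.asarray(values, dtype=float)
    # Vectorized z-score computation over the whole array at once.
    z = np.abs(values - values.mean()) / values.std()
    return values[z <= z_threshold]
```

Compared with dropping a fixed top x%, this removes nothing from a clean sample (no value exceeds the cutoff), which is the behavior the fixed-percentage method gets wrong.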
Examples of false positives at a z-score of 14 (the falsely flagged read is in the center of each frame). Thanks to James for collecting these examples:
