background noise too high for my data?

Hello, thanks for the nice, user-friendly tool! I have ATAC-seq data (150 bp paired-end) from a small plant genome and have been trying HMMRATAC for a while. My problem is that I can't get the model right, even though I have tried many combinations of -l, -u, and even -z. One example is shown below.

```
Version:        1.2.10
Arguments Used:
-b      Sample.atac.forHMMRATAC.bam
-i     Sample.atac.forHMMRATAC.bam.bai
-g      Sample.genome
-o      Sample_60_25
--bedgraph      True
-u      60
-l      25
Fragment Expectation Maximum Done
Mean    50.0    StdDevs 20.0
Mean    190.63425024911237      StdDevs 62.502097145190604
Mean    400.60895549134216      StdDevs 50.93694387902569
Mean    729.022089951699        StdDevs 145.67453137113827
ScalingFactor   103.086173
Training Regions found and Zscore regions for exclusion found
Training Fragment Pileup completed
Kmeans Model:
HMM with 3 state(s)

State 0
  Pi: 0.3333333333333333
  Aij: 0.333 0.333 0.333
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.531 1.169 1.604 4.339 ]

State 1
  Pi: 0.3333333333333333
  Aij: 0.333 0.333 0.333
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 1.075 1.678 1.373 0.931 ]

State 2
  Pi: 0.3333333333333333
  Aij: 0.333 0.333 0.333
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 2.349 4.864 4.519 4.93 ]

Model created and refined. See Can_60_25.model
Model:
HMM with 3 state(s)

State 0
  Pi: 0.3333333333333333
  Aij: 0.979 0.015 0.006
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.501 0.855 0.831 1.056 ]

State 1
  Pi: 0.3333333333333333
  Aij: 0.012 0.985 0.003
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 1.495 2.338 1.724 0.847 ]

State 2
  Pi: 0.3333333333333333
  Aij: 0.015 0.009 0.976
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 1.065 1.737 2.032 2.792 ]

Genome split and subtracted masked regions
0 round viterbi done
37 round viterbi done
Total time (seconds)=   4519
```
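To compare runs with different -u and -l settings, the state means can be pulled out of a log like the one above and ranked by total signal. The following is only a minimal sketch, not part of HMMRATAC: the function names `parse_state_means` and `rank_states` are hypothetical, and treating the state with the lowest summed mean as the background-like state is an assumption for illustration, not something the tool reports.

```python
import re

# Excerpt of the refined "Model:" section copied from the log above.
LOG_EXCERPT = """\
Model:
HMM with 3 state(s)

State 0
  Pi: 0.3333333333333333
  Aij: 0.979 0.015 0.006
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.501 0.855 0.831 1.056 ]

State 1
  Pi: 0.3333333333333333
  Aij: 0.012 0.985 0.003
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 1.495 2.338 1.724 0.847 ]

State 2
  Pi: 0.3333333333333333
  Aij: 0.015 0.009 0.976
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 1.065 1.737 2.032 2.792 ]
"""


def parse_state_means(log_text):
    """Return {section: {state: [means]}} for 'Kmeans Model' and 'Model' sections."""
    models = {}
    section = None
    state = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped in ("Kmeans Model:", "Model:"):
            # A new model section starts; forget the previous state index.
            section = stripped.rstrip(":")
            models[section] = {}
            state = None
        elif section and re.fullmatch(r"State \d+", stripped):
            state = int(stripped.split()[1])
        elif section and state is not None and "Mean:" in stripped:
            # Keep only the numbers listed after "Mean:".
            numbers = re.findall(r"-?\d+(?:\.\d+)?", stripped.split("Mean:", 1)[1])
            models[section][state] = [float(n) for n in numbers]
    return models


def rank_states(means_by_state):
    """Order state indices from lowest to highest summed mean (assumed heuristic)."""
    return sorted(means_by_state, key=lambda s: sum(means_by_state[s]))


refined = parse_state_means(LOG_EXCERPT)["Model"]
for state in rank_states(refined):
    print(state, refined[state], round(sum(refined[state]), 3))
```

Running the same functions on the log of each -u/-l combination gives a quick way to see whether the lowest-ranked state moves closer to zero.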

I have tried at least ten different combinations of -l and -u, and none of them gives the ideal model described in your paper. After checking some of the log files posted in other issues, I realised that most people have mean values of 0 for state 0 (which is the background or starting model, if I understand correctly?). Does this mean that the background noise in my data is quite high?

I also attach an insert-size plot here (our data were generated through the purification of nuclei, and the insert sizes were calculated from the clean BAM after multiple filtering steps):
[Can.dedup.clean.bam.insertsize.hist.pdf](https://github.com/LiuLabUB/HMMRATAC/files/8699755/Can.dedup.clean.bam.insertsize.hist.pdf)
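For readers who want the insert-size distribution as numbers instead of a PDF, the fragment lengths can be tallied from the TLEN column (column 9) of SAM-formatted records. This is a minimal sketch under stated assumptions: the helper names `fragment_length_counts` and `fraction_below` are hypothetical, the two demo records are made up for illustration, and the 100 bp cutoff is an arbitrary example value, not an HMMRATAC setting.

```python
from collections import Counter


def fragment_length_counts(sam_lines):
    """Count fragment lengths from SAM text lines, one count per read pair."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        tlen = int(fields[8])  # column 9 of the SAM format is TLEN
        if tlen > 0:
            # The leftmost mate carries the positive TLEN, so each pair counts once.
            counts[tlen] += 1
    return counts


def fraction_below(counts, cutoff):
    """Fraction of fragments shorter than `cutoff` bp (cutoff chosen by the user)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for length, n in counts.items() if length < cutoff) / total


# Made-up records for illustration only: one proper pair with a 100 bp fragment.
demo = [
    "@HD\tVN:1.6",
    "\t".join(["r1", "99", "chr1", "100", "60", "50M", "=", "150", "100", "ACGT", "IIII"]),
    "\t".join(["r1", "147", "chr1", "150", "60", "50M", "=", "100", "-100", "ACGT", "IIII"]),
]
counts = fragment_length_counts(demo)
print(counts, fraction_below(counts, 100))
```

With real data, the lines could come from the text output of `samtools view` on the filtered BAM instead of the demo list.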

Could you please give me some advice on how I could solve this problem with my data? Thanks a lot in advance!



background noise too high for my data? #98
