24 changes: 21 additions & 3 deletions modules/nf-core/pbmarkdup/main.nf
@@ -22,7 +22,16 @@ process PBMARKDUP {
script:
def args = task.ext.args ?: ''
prefix = task.ext.prefix ?: "${meta.id}"
- suffix = input[0].getExtension() // To allow multiple input types
+ // To allow multiple input types/files: (compressed) fasta, fastq, bam; Determine suffix from input file names
+ suffix =
+ input.find {
+ it.name ==~ /.*\.(fasta|fa|fna)(\.gz)?$/ }?.with { f ->
+ f.name.tokenize('.').takeRight(f.name.endsWith('.gz') ? 2 : 1).join('.')
+ } ?:
+ input.find { it.name ==~ /.*\.(fastq|fq)(\.gz)?$/ }?.with { f ->
+ f.name.tokenize('.').takeRight(f.name.endsWith('.gz') ? 2 : 1).join('.')
+ } ?:
+ input[0].extension
Comment on lines +25 to +34
Copilot AI Jan 27, 2026

The new suffix resolution logic adds explicit handling for compressed FASTA/FASTQ inputs (e.g. .fasta.gz, .fastq.gz), but the updated pbmarkdup tests only cover uncompressed fasta/fastq and BAM inputs, so the .gz branch isn’t exercised. Given this module already has nf-test coverage, it would be worthwhile to add at least one test using a .fasta.gz or .fastq.gz input to verify that the output file name preserves the full compressed extension as intended.
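
To make the untested `.gz` branch concrete, here is a minimal Groovy sketch that runs the same suffix expression outside Nextflow. It is an illustration under stated assumptions, not the module's test: the closure name `resolveSuffix` and the plain maps standing in for Nextflow file objects (with only `name` and `extension` modelled) are hypothetical.

```groovy
// Hypothetical wrapper around the suffix expression from the diff.
// Each input is modelled as a Map with `name` and `extension` keys,
// standing in for the file objects Nextflow would pass to the process.
def resolveSuffix = { List input ->
    // Prefer a FASTA-like input, keeping ".gz" when present
    input.find { it.name ==~ /.*\.(fasta|fa|fna)(\.gz)?$/ }?.with { f ->
        f.name.tokenize('.').takeRight(f.name.endsWith('.gz') ? 2 : 1).join('.')
    } ?:
    // Otherwise fall back to a FASTQ-like input
    input.find { it.name ==~ /.*\.(fastq|fq)(\.gz)?$/ }?.with { f ->
        f.name.tokenize('.').takeRight(f.name.endsWith('.gz') ? 2 : 1).join('.')
    } ?:
    // Otherwise use the plain extension of the first input (e.g. bam)
    input[0].extension
}

def gzSuffix    = resolveSuffix([[name: 'reads.fasta.gz', extension: 'gz']])
def mixedSuffix = resolveSuffix([[name: 'a.fastq', extension: 'fastq'],
                                 [name: 'b.fasta', extension: 'fasta']])
def bamSuffix   = resolveSuffix([[name: 'x.bam', extension: 'bam']])
```

In this sketch a mixed FASTQ/FASTA input resolves to the FASTA suffix because the FASTA pattern is checked first, which is consistent with the `test.fasta` output name in the snapshot for the mixed-input test.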

dupfile_name = args.contains('--dup-file') ? (args =~ /--dup-file\s+(\S+)/)[0][1] : ''
def log_args = args.contains('--log-level') ? " > ${prefix}.pbmarkdup.log" : ''
def file_list = input.collect { it.getName() }.join(' ')
@@ -58,7 +67,16 @@ process PBMARKDUP {
stub:
def args = task.ext.args ?: ''
prefix = task.ext.prefix ?: "${meta.id}"
- suffix = input[0].getExtension() // To allow multiple input types
+ // To allow multiple input types/files: (compressed) fasta, fastq, bam; Determine suffix from input file names
+ suffix =
+ input.find {
+ it.name ==~ /.*\.(fasta|fa|fna)(\.gz)?$/ }?.with { f ->
+ f.name.tokenize('.').takeRight(f.name.endsWith('.gz') ? 2 : 1).join('.')
+ } ?:
+ input.find { it.name ==~ /.*\.(fastq|fq)(\.gz)?$/ }?.with { f ->
+ f.name.tokenize('.').takeRight(f.name.endsWith('.gz') ? 2 : 1).join('.')
+ } ?:
+ input[0].extension
dupfile_name = args.contains('--dup-file') ? (args =~ /--dup-file\s+(\S+)/)[0][1] : ''
def log_args = args.contains('--log-level') ? " > ${prefix}.pbmarkdup.log" : ''
def file_list = input.collect { it.getName() }.join(' ')
@@ -70,4 +88,4 @@ process PBMARKDUP {
pbmarkdup: \$(echo \$(pbmarkdup --version 2>&1) | awk 'BEFORE{FS=" "}{print \$2}')
END_VERSIONS
"""
- }
+ }
23 changes: 12 additions & 11 deletions modules/nf-core/pbmarkdup/tests/main.nf.test
@@ -40,7 +40,8 @@ nextflow_process {

}

- test("acropora cervicornis - bam - multiple tests with dupfile and log") {
+ test("acropora cervicornis - multiple input with dupfile logfile and remove duplicates") {
+
when {

params {
@@ -52,8 +53,9 @@ nextflow_process {
input[0] = Channel.of(
[
[ id:'test' ], // meta map
- [ file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.1.bam', checkIfExists: true),
- file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.2.bam', checkIfExists: true)
+ [
+ file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.1.bam', checkIfExists: true),
+ file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.2.bam', checkIfExists: true)
]
]
)
@@ -70,34 +72,33 @@ nextflow_process {

}

- test("acropora cervicornis - bam - multiple tests remove duplicates") {
+ test("homo sapiens - Multiple input types - with dupfile logfile and remove duplicates") {
Copilot AI Jan 27, 2026

The test description "homo sapiens - Multiple input types" doesn’t match the actual test data, which still uses genomics/eukaryotes/acropora_cervicornis/... paths, so the species label in the name is misleading. To keep tests self-describing and easier to interpret, consider either renaming the test to reference Acropora cervicornis or switching the input files to a Homo sapiens dataset.

Suggested change
- test("homo sapiens - Multiple input types - with dupfile logfile and remove duplicates") {
+ test("acropora cervicornis - Multiple input types - with dupfile logfile and remove duplicates") {

when {

params {
- pbmarkdup_args = "--clobber --rmdup"
+ pbmarkdup_args = "--clobber --dup-file ${prefix}.dup.bam --log-level INFO"
}
Comment on lines +78 to 79
Copilot AI Jan 27, 2026

Here pbmarkdup_args uses ${prefix} inside a Groovy double-quoted string, but prefix is not defined in this nf-test parameter scope, so it interpolates to the literal string null (reflected by the null.dup.bam filenames in the snapshot). To avoid this surprising behaviour and better reflect the intention of tying the duplicate file name to the module’s output prefix, consider either using an explicit fixed dup file name here or passing the desired prefix via task.ext.prefix instead of relying on ${prefix} in the params string.
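
The effect described in this comment can be reproduced in plain Groovy. The sketch below is illustrative only: it sets `prefix` to `null` explicitly to mimic an unresolved name in the nf-test params scope (an assumption about how that scope behaves), and `test.dup.bam` is a hypothetical fixed file name. The extraction expression is the one used in `main.nf`.

```groovy
// Assumption: `prefix` resolves to null where the params string is built.
def prefix = null

// A GString renders a null value as the text "null"
def args = "--clobber --dup-file ${prefix}.dup.bam --log-level INFO".toString()

// Same extraction as in main.nf: take the token following --dup-file
def dupfileName = args.contains('--dup-file') ? (args =~ /--dup-file\s+(\S+)/)[0][1] : ''

// Alternative suggested in the review: an explicit, fixed dup file name
// ('test.dup.bam' is a hypothetical value chosen for illustration)
def fixedArgs = '--clobber --dup-file test.dup.bam --log-level INFO'
def fixedName = (fixedArgs =~ /--dup-file\s+(\S+)/)[0][1]
```

Under these assumptions `dupfileName` comes out as `null.dup.bam`, matching the file names recorded in the snapshot, while the fixed-name variant yields a predictable output name.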


process {
"""
input[0] = Channel.of(
[
[ id:'test' ], // meta map
- [ file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.1.bam', checkIfExists: true),
- file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.2.bam', checkIfExists: true)
+ [
+ file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.1.fastq', checkIfExists: true),
+ file(params.modules_testdata_base_path + 'genomics/eukaryotes/acropora_cervicornis/m84093_241116_151316_s2.hifi_reads.bc2028.subset.1.fasta', checkIfExists: true)
]
]
)
"""
}
}


then {
assertAll(
{ assert process.success },
{ assert snapshot(process.out).match() }
)
}

}
}

test("deilephila porcellus - stub") {
128 changes: 74 additions & 54 deletions modules/nf-core/pbmarkdup/tests/main.nf.test.snap
@@ -1,5 +1,5 @@
{
- "acropora cervicornis - bam - multiple tests remove duplicates": {
+ "acropora cervicornis - multiple input with dupfile logfile and remove duplicates": {
"content": [
{
"0": [
@@ -11,19 +11,39 @@
]
],
"1": [
-
+ [
+ {
+ "id": "test"
+ },
+ "null.dup.bam:md5,3b74225ad5f7e9e1cbafc45132ad82fb"
+ ]
],
"2": [
-
+ [
+ {
+ "id": "test"
+ },
+ "test.pbmarkdup.log:md5,99987a1331d01b59aa3b5ccd1c787906"
+ ]
],
"3": [
"versions.yml:md5,832e36b56615fb29a94b16e4db32b8db"
],
"dupfile": [
-
+ [
+ {
+ "id": "test"
+ },
+ "null.dup.bam:md5,3b74225ad5f7e9e1cbafc45132ad82fb"
+ ]
],
"log": [
-
+ [
+ {
+ "id": "test"
+ },
+ "test.pbmarkdup.log:md5,99987a1331d01b59aa3b5ccd1c787906"
+ ]
],
"markduped": [
[
@@ -39,63 +59,43 @@
}
],
"meta": {
- "nf-test": "0.9.2",
- "nextflow": "25.04.2"
+ "nf-test": "0.9.3",
+ "nextflow": "25.10.0"
},
- "timestamp": "2025-11-27T22:25:53.428359"
+ "timestamp": "2026-01-26T16:07:38.506123115"
},
- "acropora cervicornis - bam - multiple tests with dupfile and log": {
+ "deilephila porcellus - stub": {
"content": [
{
"0": [
[
{
"id": "test"
},
- "test.bam:md5,86e22a794d904cc48cb3758a03883ba1"
+ "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
]
],
"1": [
- [
- {
- "id": "test"
- },
- "null.dup.bam:md5,3b74225ad5f7e9e1cbafc45132ad82fb"
- ]
+
],
"2": [
- [
- {
- "id": "test"
- },
- "test.pbmarkdup.log:md5,99987a1331d01b59aa3b5ccd1c787906"
- ]
+
],
"3": [
"versions.yml:md5,832e36b56615fb29a94b16e4db32b8db"
],
"dupfile": [
- [
- {
- "id": "test"
- },
- "null.dup.bam:md5,3b74225ad5f7e9e1cbafc45132ad82fb"
- ]
+
],
"log": [
- [
- {
- "id": "test"
- },
- "test.pbmarkdup.log:md5,99987a1331d01b59aa3b5ccd1c787906"
- ]
+
],
"markduped": [
[
{
"id": "test"
},
- "test.bam:md5,86e22a794d904cc48cb3758a03883ba1"
+ "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
]
],
"versions": [
@@ -104,20 +104,20 @@
}
],
"meta": {
- "nf-test": "0.9.2",
- "nextflow": "25.04.2"
+ "nf-test": "0.9.3",
+ "nextflow": "25.10.0"
},
- "timestamp": "2025-11-27T22:25:23.374664"
+ "timestamp": "2026-01-26T16:17:29.606905496"
},
- "deilephila porcellus - stub": {
+ "deilephila porcellus - fasta": {
"content": [
{
"0": [
[
{
"id": "test"
},
- "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
+ "test.fa:md5,087cee5291f8d728a62b91765b64af35"
]
],
"1": [
@@ -140,7 +140,7 @@
{
"id": "test"
},
- "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
+ "test.fa:md5,087cee5291f8d728a62b91765b64af35"
]
],
"versions": [
@@ -149,43 +149,63 @@
}
],
"meta": {
- "nf-test": "0.9.2",
- "nextflow": "25.04.2"
+ "nf-test": "0.9.3",
+ "nextflow": "25.10.0"
},
- "timestamp": "2025-11-27T22:26:16.491708"
+ "timestamp": "2026-01-26T16:16:09.847582484"
},
- "deilephila porcellus - fasta": {
+ "homo sapiens - Multiple input types - with dupfile logfile and remove duplicates": {
"content": [
{
"0": [
[
{
"id": "test"
},
- "test.fa:md5,087cee5291f8d728a62b91765b64af35"
+ "test.fasta:md5,c53086fa9bdb5cf0a4329da56c41b236"
]
],
"1": [
-
+ [
+ {
+ "id": "test"
+ },
+ "null.dup.bam:md5,7a7bec54f6519fc282826ce9f86202d7"
+ ]
],
"2": [
-
+ [
+ {
+ "id": "test"
+ },
+ "test.pbmarkdup.log:md5,0d14b38adfec8f46a47f954e15c88f7c"
+ ]
],
"3": [
"versions.yml:md5,832e36b56615fb29a94b16e4db32b8db"
],
"dupfile": [
-
+ [
+ {
+ "id": "test"
+ },
+ "null.dup.bam:md5,7a7bec54f6519fc282826ce9f86202d7"
+ ]
],
"log": [
-
+ [
+ {
+ "id": "test"
+ },
+ "test.pbmarkdup.log:md5,0d14b38adfec8f46a47f954e15c88f7c"
+ ]
],
"markduped": [
[
{
"id": "test"
},
- "test.fa:md5,087cee5291f8d728a62b91765b64af35"
+ "test.fasta:md5,c53086fa9bdb5cf0a4329da56c41b236"
]
],
"versions": [
@@ -194,9 +214,9 @@
}
],
"meta": {
- "nf-test": "0.9.2",
- "nextflow": "25.04.2"
+ "nf-test": "0.9.3",
+ "nextflow": "25.10.0"
},
- "timestamp": "2025-11-27T22:47:33.595865"
+ "timestamp": "2026-01-26T17:09:13.084330647"
}
}