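The proposed changes won't work if there are multiple barcodes for each element. In the example below, there are 2 elements ("eid1" and "eid2") with 5 barcodes each. Doing dna[eid,] and rna[eid,] will duplicate rows instead of doing the intended sorting. We included aggregation before log ratio computation even when aggregate=="none" because the only reason a user should pick aggregate=="none" is if there is only one barcode per EID or if the counts have already been aggregated across the multiple barcodes per EID. In these cases, aggregation doesn't do anything to the counts; it just sorts the count matrices by EID. We don't have an option for computing and returning log ratios at the barcode level because the package is meant to facilitate differential analysis at the EID level. If a user wants to compute barcode-level log ratios, they can just compute it with logr <- log2(rna + 1) - log2(dna + 1) from the rna and dna count matrices they supplied to the MPRASet() constructor.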
|
Thank you for looking into the proposed change and for your response.
________________________________
From: Leslie Myint ***@***.***>
Sent: Thursday, January 9, 2025 18:53
To: hansenlab/mpra ***@***.***>
Cc: Pia Keukeleire ***@***.***>; Author ***@***.***>
Subject: Re: [hansenlab/mpra] fix eid order (PR #12)
The proposed changes won't work if there are multiple barcodes for each element. In the example below, there are 2 elements ("eid1" and "eid2") with 5 barcodes each. Doing dna[eid,] and rna[eid,] will duplicate rows instead of doing the intended sorting.
mat <- matrix(1:30, nrow = 10, ncol = 3)
eids <- rep(paste0("eid", 1:2), each = 5)
rownames(mat) <- eids
eids
#>  [1] "eid1" "eid1" "eid1" "eid1" "eid1" "eid2" "eid2" "eid2" "eid2" "eid2"
mat
#>      [,1] [,2] [,3]
#> eid1    1   11   21
#> eid1    2   12   22
#> eid1    3   13   23
#> eid1    4   14   24
#> eid1    5   15   25
#> eid2    6   16   26
#> eid2    7   17   27
#> eid2    8   18   28
#> eid2    9   19   29
#> eid2   10   20   30
mat[eids,]
#>      [,1] [,2] [,3]
#> eid1    1   11   21
#> eid1    1   11   21
#> eid1    1   11   21
#> eid1    1   11   21
#> eid1    1   11   21
#> eid2    6   16   26
#> eid2    6   16   26
#> eid2    6   16   26
#> eid2    6   16   26
#> eid2    6   16   26
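For contrast, here is a minimal base-R sketch of sorting rows by EID without duplicating them. Indexing by position with order() is an assumption made for illustration; it is not necessarily what the package or the PR does, and the shuffled matrix is toy data.

```r
# Toy matrix: 2 elements with 5 barcodes each, as in the example above
mat <- matrix(1:30, nrow = 10, ncol = 3)
eids <- rep(paste0("eid", 1:2), each = 5)
rownames(mat) <- eids

# Shuffle the rows so there is something to sort
set.seed(1)
shuffled <- mat[sample(nrow(mat)), ]

# Indexing by name matches only the FIRST row carrying each name,
# so duplicated names return repeated copies of that one row
by_name <- shuffled[rownames(shuffled), ]

# Indexing by position via order() keeps every barcode row exactly once
by_order <- shuffled[order(rownames(shuffled)), ]
```

Here by_name has only two distinct rows, while by_order contains all ten barcode rows grouped by EID.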
We included aggregation before log ratio computation even when aggregate=="none" because the only reason a user should pick aggregate=="none" is if there is only one barcode per EID or if the counts have already been aggregated across the multiple barcodes per EID. In these cases, aggregation doesn't do anything to the counts--it just sorts the count matrices by EID.
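The point that aggregation only sorts when there is one barcode per EID can be sketched with base R's rowsum(). This is an illustration under assumptions: the count matrix is hypothetical, and the package's internal aggregation may be implemented differently.

```r
# Hypothetical counts: one barcode per EID, rows deliberately out of order
dna <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3,
              dimnames = list(c("eid3", "eid1", "eid2"), c("s1", "s2")))

# rowsum() sums rows that share a group label and, by default
# (reorder = TRUE), returns the groups in sorted order
agg <- rowsum(dna, group = rownames(dna))

# With a single row per EID the sums equal the original counts,
# so the only visible effect is the reordering by EID
sorted_eids <- sort(rownames(dna))
```

In this sketch agg holds the same values as dna, with rows in the order given by sorted_eids.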
We don't have an option for computing and returning log ratios at the barcode level because the package is meant to facilitate differential analysis at the EID level. If a user wants to compute barcode level log ratios, they can just compute it with logr <- log2(rna + 1) - log2(dna + 1) from the rna and dna count matrices they supplied to the MPRASet() constructor.
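As a small worked example of the barcode-level formula above, the following sketch applies it to hypothetical count matrices. The barcode names, sample names, and counts are made up for illustration; only the log ratio expression comes from the comment.

```r
# Hypothetical barcode-level counts (rows = barcodes, columns = samples)
dna <- matrix(c(7, 3, 15, 1), nrow = 2,
              dimnames = list(c("bc1", "bc2"), c("s1", "s2")))
rna <- matrix(c(15, 3, 63, 0), nrow = 2,
              dimnames = list(c("bc1", "bc2"), c("s1", "s2")))

# Element-wise log ratio with a pseudocount of 1, so zero counts stay finite
logr <- log2(rna + 1) - log2(dna + 1)
```

The result has one log ratio per barcode and sample, with the same dimensions and dimnames as the input matrices.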
|
Ordering the DNA and RNA count matrices by EID when creating the MPRASet solves the original EID ordering problem. Previously, this was handled by aggregating the counts when calculating the log ratios, even when aggregate=="none". With the proposed change, the log ratios can be calculated without aggregating the counts.