Description
Hi there
First of all, thank you for your work on publishing the code and pipeline.
I was wondering if you could share more details on how you processed your data.
I downloaded the data you generated for the COV413A cell line and processed it according to your pipeline. Some additional preprocessing steps were necessary, including splitting the interleaved FASTQ into separate per-mate FASTQ files and running STAR and StringTie.
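For reference, the de-interleaving step I mean is roughly the following. This is a minimal sketch and not part of your pipeline: it assumes strict four-line FASTQ records with mates alternating R1, R2, and the function names are my own.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TextIO


def read_fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four newline-terminated lines.

    Assumes strict four-line records (no wrapped sequences).
    """
    it = iter(lines)
    while True:
        record = [ln.rstrip("\n") + "\n" for ln in islice(it, 4)]
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record")
        yield record


def deinterleave(lines: Iterable[str], r1_out: TextIO, r2_out: TextIO) -> int:
    """Write alternating records to r1_out and r2_out; return the pair count.

    Assumes the input alternates mate 1, mate 2, mate 1, mate 2, ...
    """
    records = read_fastq_records(lines)
    pairs = 0
    for first in records:
        second = next(records, None)
        if second is None:
            raise ValueError("odd number of records: input is not interleaved pairs")
        r1_out.writelines(first)
        r2_out.writelines(second)
        pairs += 1
    return pairs
```

For gzipped input, the same function can be fed a handle from `gzip.open(path, "rt")`, with two output files opened in text mode.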
These are the candidate transcripts you recover (Supplementary Table 9 filtered on COV413A):
| Transcript ID | Class | Family | Subfam | Chr TE | Start TE | End TE | Location TE | Gene | Splice Target | Strand | Cell Line | CAGE TPM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TCONS_00027238 | DNA | hAT-Charlie | MER1B | chr12 | 130340312 | 130340636 | intron_1 | PIWIL1 | exon_2 | + | COV413A | 0,396505544 |
| TCONS_00034780 | LINE | L1 | L1PA2 | chr14 | 71842964 | 71848996 | Intergenic | RGS6 | exon_2 | + | COV413A | 3,105960093 |
| TCONS_00055478 | LINE | L1 | L1PA2 | chr18 | 34552378 | 34558395 | Intergenic | DTNA | exon_2 | + | COV413A | 0,660842573 |
| TCONS_00086600 | LINE | L1 | L1PA2 | chr3 | 58842154 | 58848179 | Intergenic | FAM3D | exon_2 | - | COV413A | 0,396505544 |
| TCONS_00098838 | LINE | L1 | L1PA2 | chr5 | 102671229 | 102677260 | Intergenic | SLCO6A1 | exon_2 | - | COV413A | 0,72692683 |
| TCONS_00103663 | LINE | L1 | L1PB1 | chr6 | 7347074 | 7349650 | intron_8 | CAGE1 | exon_9 | - | COV413A | 0,396505544 |
| TCONS_00107032 | LINE | L1 | L1HS | chr7 | 12497211 | 12500000 | Intergenic | AC005281.1 | exon_2 | + | COV413A | 0,72692683 |
| TCONS_00107035 | LINE | L1 | L1HS | chr7 | 12497211 | 12500000 | Intergenic | AC005281.1 | exon_5 | + | COV413A | 0,72692683 |
| TCONS_00107037 | LINE | L1 | L1HS | chr7 | 12497211 | 12500000 | Intergenic | SCIN | exon_2 | + | COV413A | 0,72692683 |
| TCONS_00116734 | LINE | L1 | L1PA2 | chr8 | 66949103 | 66955119 | intron_3 | TCF24 | exon_4 | - | COV413A | 0,660842573 |
| TCONS_00119408 | LINE | L1 | L1PA2 | chr9 | 94089082 | 94095103 | intron_4 | PTPDC1 | exon_5 | + | COV413A | 0,330421286 |
| TCONS_00070187 | LTR | ERV1 | LTR7 | chr2 | 38086114 | 38086512 | Intergenic | CYP1B1 | exon_2 | - | COV413A | 15,92630601 |
| TCONS_00074167 | LTR | ERV1 | LTR2B | chr20 | 15985767 | 15986246 | intron_13 | MACROD2 | exon_14 | + | COV413A | 0,396505544 |
| TCONS_00089490 | LTR | ERV1 | LTR2B | chr4 | 37546188 | 37546669 | intron_1 | C4orf19 | exon_2 | + | COV413A | 0,859095345 |
| TCONS_00105271 | LTR | ERVL | LTR18A | chr6 | 79313214 | 79313548 | Intergenic | HMGN3 | exon_1 | - | COV413A | 2,841623064 |
| TCONS_00016149 | SINE | Alu | AluY | chr10 | 101729855 | 101730163 | Intergenic | FBXW4 | exon_1 | - | COV413A | 0,991263859 |
| TCONS_00016150 | SINE | Alu | AluY | chr10 | 101729855 | 101730163 | Intergenic | FBXW4 | exon_2 | - | COV413A | 0,991263859 |
| TCONS_00030551 | SINE | Alu | AluJo | chr12 | 121847358 | 121847535 | intron_9 | HPD | exon_10 | - | COV413A | 0,330421286 |
| TCONS_00041268 | SINE | Alu | AluY | chr15 | 51603584 | 51603891 | intron_1 | DMXL2 | exon_2 | - | COV413A | 1,652106432 |
I recover these - sorry for the truncated output.

I used the hg38 reference genome and GTF, your reference data download, and your predefined arguments.txt.
I hope that together we can get to the bottom of why I don't recover any of the same TE chimeras as you.
Best regards
Nanna