Status: Open
Labels: enhancement (New feature or request)
Description
VCF file:
##fileformat=VCFv4.0
##reference=chrRCRS
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DP,Number=.,Type=Integer,Description="Reads covering the REF position">
##FORMAT=<ID=HF,Number=.,Type=Float,Description="Heteroplasmy Frequency of variant allele">
##FORMAT=<ID=CILOW,Number=.,Type=Float,Description="Value defining the lower limit of the confidence interval of the heteroplasmy fraction">
##FORMAT=<ID=CIUP,Number=.,Type=Float,Description="Value defining the upper limit of the confidence interval of the heteroplasmy fraction">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR043366 SRR043354
chrRCRS 263 0 A G,C 0 PASS AC=1;AN=2 GT:DP:HF:CILOW:CIUP 0/1:167:0.994:0.963:1.0 0/2:167:0.994:0.963:1.0
Idea for a CSV output:
SAMPLE CHROM POS ID REF ALT QUAL AC AN Locus FunctionalLocus CodonPosition AaChange HF CILOW CIUP …
SRR043366 chrRCRS 263 0 A G 0 2;-1 4 MT-DLOOP MT-HV2 (Hypervariable segment 2) . . 0.994 0.963 1 …
SRR043354 chrRCRS 263 0 A C 0 2;-1 4 MT-DLOOP MT-HV2 (Hypervariable segment 2) . . 0.853 0.79 0.9 …
So we have one row for each SAMPLE-ALT combination. This means that if a sample has more than one ALT allele, that sample will get as many rows as it has ALT alleles. This would make the table easy to use for downstream processing, e.g. wrangling and plotting with tidyverse packages, or creating dynamic (sortable/filterable) plots with htmlwidgets.
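To make the one-row-per-SAMPLE-ALT idea concrete, here is a minimal Python sketch of the reshaping step. It is not the project's implementation. The function names (`vcf_to_rows`, `rows_to_csv`) and the column list are hypothetical. The sketch assumes that the ALT alleles a sample carries are the non-zero indices in its GT field, and that HF, CILOW and CIUP hold one comma-separated value per carried ALT allele, in GT order. It splits fields on whitespace because that is how the record appears in this issue; a real VCF is tab-delimited. AC and AN are copied verbatim from INFO, and the annotation columns (Locus, FunctionalLocus, CodonPosition, AaChange) are left out because they need an external annotation source.

```python
import csv
import io

# The example record from this issue (whitespace-separated here; real VCF uses tabs).
VCF_TEXT = """\
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR043366 SRR043354
chrRCRS 263 0 A G,C 0 PASS AC=1;AN=2 GT:DP:HF:CILOW:CIUP 0/1:167:0.994:0.963:1.0 0/2:167:0.994:0.963:1.0
"""

# Hypothetical column order; annotation columns are intentionally omitted.
COLUMNS = ["SAMPLE", "CHROM", "POS", "ID", "REF", "ALT", "QUAL",
           "AC", "AN", "HF", "CILOW", "CIUP"]
PER_ALT_KEYS = ("HF", "CILOW", "CIUP")


def vcf_to_rows(vcf_text):
    """Yield one dict per SAMPLE-ALT combination found in the VCF text."""
    samples = []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split()
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            continue
        chrom, pos, vid, ref, alt, qual, _filter, info, fmt = fields[:9]
        alts = alt.split(",")
        info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
        keys = fmt.split(":")
        for sample, column in zip(samples, fields[9:]):
            call = dict(zip(keys, column.split(":")))
            # Collect the distinct non-reference allele indices, in GT order.
            alt_indices = []
            for allele in call.get("GT", ".").replace("|", "/").split("/"):
                if allele.isdigit() and 0 < int(allele) <= len(alts):
                    if int(allele) not in alt_indices:
                        alt_indices.append(int(allele))
            # Assumption: per-ALT values are comma-separated, in GT order.
            per_alt = {k: call.get(k, "").split(",") for k in PER_ALT_KEYS}
            for n, idx in enumerate(alt_indices):
                row = {"SAMPLE": sample, "CHROM": chrom, "POS": pos, "ID": vid,
                       "REF": ref, "ALT": alts[idx - 1], "QUAL": qual,
                       "AC": info_map.get("AC", "."),
                       "AN": info_map.get("AN", ".")}
                for key, values in per_alt.items():
                    has_value = n < len(values) and values[n] != ""
                    row[key] = values[n] if has_value else "."
                yield row


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(vcf_to_rows(VCF_TEXT)), end="")
```

A sample with genotype `1/2` would produce two rows under these assumptions, one for each ALT allele. Using `csv.DictWriter` also quotes values that contain commas, which matters once free-text annotation fields are added.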