Open
Description
Describe the bug
Cell [17] of the human cerebellum pycisTopic tutorial does not produce a TSS annotation BED file; instead it fails with "pyarrow.lib.ArrowTypeError: Expected bytes, got a 'int' object".
To Reproduce
Install scenicplus via conda following the official instructions, then run the pycisTopic tutorial up to cell [17].
Error output
- Get TSS annotation from Ensembl BioMart with the following settings:
  - biomart_name: "hsapiens_gene_ensembl"
  - biomart_host: "http://www.ensembl.org/"
  - transcript_type: ['protein_coding']
  - use_cache: True
/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/dataset.py:269: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
  result = pd.read_csv(StringIO(response.text), sep='\t')
Traceback (most recent call last):
  File "/home/arne/miniconda3/envs/scenicplus/bin/pycistopic", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/pycistopic.py", line 26, in main
    args.func(args)
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/tss.py", line 459, in run_tss_get_tss_annotation
    get_tss_annotation_bed_file(
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/tss.py", line 164, in get_tss_annotation_bed_file
    tss_annotation_bed_df_pl = ga.get_tss_annotation_from_ensembl(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/gene_annotation.py", line 172, in get_tss_annotation_from_ensembl
    ensembl_tss_annotation_bed_df_pl = pl.from_pandas(ensembl_tss_annotation).select(
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/polars/convert.py", line 719, in from_pandas
    return pl.DataFrame._from_pandas(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/polars/dataframe/frame.py", line 621, in _from_pandas
    pandas_to_pydf(
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/polars/utils/_construction.py", line 1837, in pandas_to_pydf
    arrow_dict[str(col)] = _pandas_series_to_arrow(
                           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arne/miniconda3/envs/scenicplus/lib/python3.11/site-packages/polars/utils/_construction.py", line 665, in _pandas_series_to_arrow
    return pa.array(values, pa.large_utf8(), from_pandas=nan_to_null)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/array.pxi", line 340, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 86, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Expected bytes, got a 'int' object
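A possible reading of the output above: the DtypeWarning reports mixed types in column 0 of the BioMart result, and the final frame shows pyarrow being asked to build a string array from values that include a Python int. The sketch below illustrates that situation with pandas only and shows one way to normalise such a column to strings before any conversion. It is an assumption-laden illustration, not the maintainers' fix: the column names, the sample values, and the helper `stringify_object_columns` are hypothetical, and whether column 0 is the chromosome name column has not been verified against the real BioMart response.

```python
import pandas as pd

# Hypothetical stand-in for the BioMart result: an object column holding
# both Python ints and strings, which is the situation the DtypeWarning
# ("Columns (0) have mixed types") describes. All values here are made up.
df = pd.DataFrame(
    {
        "Chromosome/scaffold name": pd.Series([1, 2, "X", "MT"], dtype=object),
        "Transcription start site (TSS)": [100, 200, 300, 400],
    }
)


def stringify_object_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which object-dtype columns hold only str values.

    Missing values are left untouched so they can still become nulls later.
    """
    out = frame.copy()
    for col in out.columns:
        if out[col].dtype == object:
            out[col] = out[col].map(lambda v: v if pd.isna(v) else str(v))
    return out


fixed = stringify_object_columns(df)

# Compare the Python types stored in the column before and after the cast.
types_before = sorted({type(v).__name__ for v in df["Chromosome/scaffold name"]})
types_after = sorted({type(v).__name__ for v in fixed["Chromosome/scaffold name"]})
print("before:", types_before)
print("after: ", types_after)
```

If this diagnosis is right, applying a cast like this to the pandas DataFrame before it reaches `pl.from_pandas`, or reading the response with an explicit string dtype for that column, might avoid the error. Neither has been tested against pycisTopic 2.0a0 here.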
Expected behavior
I expected the TSS annotation BED file to be written to "outs/qc/tss.bed".
Version (please complete the following information):
- Python 3.11.8
- pycisTopic 2.0a0