Open
Description
I'm trying to run PlasmidID via the Bioconda release and am running into an issue with pandas. Might be user error though!
CREATING SUMMARY REPORT (Thu Jun 30 01:24:20 UTC 2022)
An html report with miniatures of the images will be generate with useful statistics to determine the correct plasmids in the sample.
Namespace(group=False, input_folder='/home/robert_petit/temp/test/plasmid/NO_GROUP/SRX4563634')
Creating summary
You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat
Traceback (most recent call last):
File "/home/robert_petit/miniconda3/envs/test-plasmidid/bin/summary_report_pid.py", line 465, in <module>
main()
File "/home/robert_petit/miniconda3/envs/test-plasmidid/bin/summary_report_pid.py", line 457, in main
summary_df = complete_report_df(complete_file, len_description_df, percentage_df)
File "/home/robert_petit/miniconda3/envs/test-plasmidid/bin/summary_report_pid.py", line 116, in complete_report_df
df = len_description_df.merge(covered_df, on='id', how='left')
File "/home/robert_petit/miniconda3/envs/test-plasmidid/lib/python3.7/site-packages/pandas/core/frame.py", line 9203, in merge
validate=validate,
File "/home/robert_petit/miniconda3/envs/test-plasmidid/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 119, in merge
validate=validate,
File "/home/robert_petit/miniconda3/envs/test-plasmidid/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 703, in __init__
self._maybe_coerce_merge_keys()
File "/home/robert_petit/miniconda3/envs/test-plasmidid/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 1256, in _maybe_coerce_merge_keys
raise ValueError(msg)
ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat
---------------------------------------
ERROR in Script plasmidID on or near line 1089; exiting with status 1
MESSAGE:
See /home/robert_petit/temp/test/plasmid/logs/plasmidID.log for more information.
command:
summary_report_pid.py -i /home/robert_petit/temp/test/plasmid/NO_GROUP/SRX4563634 -g
---------------------------------------
Command Used
plasmidID -d plasmidFinder_01_26_2018.fsa -s SRX4563634 -c SRX4563634.fna -T 4
Here are the files used (added .txt so GitHub would allow upload)
plasmidFinder_01_26_2018.fsa.txt
SRX4563634.fna.txt
Update 1
After some digging, covered_df might be the issue: its id column is float64, while the merge error says the other side is object. It looks like this:
print(covered_df)
id len_covered
0 500039.4128 2363
print(covered_df.dtypes)
id float64
len_covered int64
dtype: object
Going to play around with this some more
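For anyone following along, here is a minimal sketch of the mismatch and one possible way around it. It assumes covered_df is read from a tab-separated file, which the issue does not confirm; the sample data, the `length` value, and the variable names `raw`, `inferred`, and `as_text` are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical two-column, tab-separated coverage data; the real file layout
# used by summary_report_pid.py may differ.
raw = "500039.4128\t2363\n"

# Default parsing infers float64 for an ID that looks like a decimal number.
inferred = pd.read_csv(io.StringIO(raw), sep="\t", names=["id", "len_covered"])

# Forcing the key column to str keeps the ID as text.
as_text = pd.read_csv(
    io.StringIO(raw), sep="\t", names=["id", "len_covered"], dtype={"id": str}
)

# Hypothetical stand-in for len_description_df, whose id column holds strings.
len_description_df = pd.DataFrame({"id": ["500039.4128"], "length": [2500]})

# Merging text keys against float keys raises the ValueError from the traceback.
try:
    len_description_df.merge(inferred, on="id", how="left")
    merge_error = None
except ValueError as err:
    merge_error = str(err)

# With both key columns holding text, the merge goes through.
merged = len_description_df.merge(as_text, on="id", how="left")
```

Reading the key as text at load time, rather than casting a float back to a string later, avoids losing trailing zeros in IDs such as 500039.4120.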
Update 2
I converted the ID to a string, and the script now gets past the merge but fails with this instead:
Columns must be same length as key
Traceback (most recent call last):
File "./summary_report_pid.py", line 470, in <module>
main()
File "./summary_report_pid.py", line 462, in main
summary_df = complete_report_df(complete_file, len_description_df, percentage_df)
File "./summary_report_pid.py", line 126, in complete_report_df
df['contig_name'] = df.apply(lambda x: set_to_list(x), axis=1)
File "/home/robert_petit/miniconda3/envs/test-plasmidid/lib/python3.7/site-packages/pandas/core/frame.py", line 3602, in __setitem__
self._set_item_frame_value(key, value)
File "/home/robert_petit/miniconda3/envs/test-plasmidid/lib/python3.7/site-packages/pandas/core/frame.py", line 3729, in _set_item_frame_value
raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
Update 3
Looks like the DataFrame is empty by the time it reaches the apply call:
print(df)
Empty DataFrame
Columns: [id, length, species, description, fraction_covered, contig_name]
Index: []
The relevant code, from complete_report_df(), is below:
del df['len_covered']
df = df.merge(contigs_df, on='id', how='left')
df = df.dropna()
print(df)
df['contig_name'] = df.apply(lambda x: set_to_list(x), axis=1)
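One way to check why dropna() leaves nothing behind is to ask the merge which keys failed to match. This is only a sketch with made-up frames: the IDs, the `length` value, and the names `checked`, `unmatched`, and `remaining` are hypothetical, and the real contigs_df may look different.

```python
import pandas as pd

# Hypothetical frames; only the column names come from the printed output above.
df = pd.DataFrame({"id": ["500039.4128"], "length": [2500]})
contigs_df = pd.DataFrame({"id": ["999999.1"], "contig_name": ["contig_1"]})

# indicator=True adds a '_merge' column saying where each row's key was found.
checked = df.merge(contigs_df, on="id", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

# Rows with no match carry NaN in the right-hand columns, so dropna() removes them.
remaining = checked.drop(columns="_merge").dropna()
```

If every row ends up in `unmatched`, the keys in the two frames never line up (for example because of differing ID formats), and the empty result comes from the merge rather than from the apply call.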
Not sure if it matters, but the percentage_file (e.g. *.coverage_adapted_clustered_percentage) does not exist.
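One plausible reading of the "Columns must be same length as key" error is that df.apply(..., axis=1) on an empty frame may hand back a DataFrame rather than a Series, which cannot be assigned to a single column. A guard along these lines would sidestep that. It is a sketch, not the project's fix: `join_contigs` is a hypothetical stand-in for set_to_list(), whose body is not shown here, and `add_contig_names` is an invented helper name.

```python
import pandas as pd


def join_contigs(row):
    # Hypothetical stand-in for set_to_list(); the real function is not shown
    # in this issue. Returns the unique contig names of one row, sorted.
    return sorted(set(row["contig_name"].split(",")))


def add_contig_names(df):
    """Fill contig_name row by row, skipping apply when no rows are left."""
    df = df.copy()
    if df.empty:
        # Nothing to compute; keep the column so later steps still find it.
        df["contig_name"] = pd.Series(dtype=object)
        return df
    df["contig_name"] = df.apply(join_contigs, axis=1)
    return df


# Usage with an empty frame and with a one-row frame (made-up values).
empty_out = add_contig_names(pd.DataFrame(columns=["id", "length", "contig_name"]))
full_out = add_contig_names(pd.DataFrame({"id": ["a"], "contig_name": ["c2,c1,c2"]}))
```

The guard only hides the crash. Whether an empty summary is the right outcome for this sample is a separate question.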