Conversation

@t0mdavid-m (Member) commented Dec 3, 2025

Handle cases where protein description values may be non-string types (e.g., NaN, float) by explicitly converting to string before truncation. This prevents "TypeError: object of type 'float' has no len()" errors when parsing FLASHTnT results.
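For reviewers who want to reproduce the failure outside the app, here is a minimal sketch. The pre-fix function body is an assumption (the original lambda is not quoted in this PR); the post-fix body mirrors the lambda described in the change. The names `truncate_old` and `truncate_new` are hypothetical and exist only for this illustration.

```python
def truncate_old(x):
    # Assumed pre-fix behavior: treats x as a string without checking its type.
    return x[:50] + '...' if len(x) > 50 else x


def truncate_new(x):
    # Behavior described in this PR: coerce to str before slicing and len().
    return str(x)[:50] + '...' if len(str(x)) > 50 else str(x)


# A missing description usually arrives as a float NaN after tabular parsing.
missing = float('nan')

try:
    truncate_old(missing)
    old_error = None
except TypeError as err:
    old_error = str(err)

long_desc = "A" * 60

print(old_error)                      # message of the TypeError raised pre-fix
print(repr(truncate_new(missing)))    # stringified NaN, no exception
print(len(truncate_new(long_desc)))   # 50 characters plus the '...' suffix
```

Note that with this fix a missing value is rendered as the text of `str(float('nan'))` rather than raising.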

Summary by CodeRabbit

  • Bug Fixes
    • Improved robustness of protein description handling to safely process and truncate descriptions regardless of input data type.

@coderabbitai (bot, Contributor) commented Dec 3, 2025

Walkthrough

A lambda function for truncating protein descriptions in src/parse/tnt.py is updated to coerce input values to strings before performing slicing and length checks, improving type safety and preventing potential failures when non-string types are present.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Description truncation type safety: src/parse/tnt.py | Lambda function updated to wrap input with str() before slicing and length validation: lambda x: str(x)[:50] + '...' if len(str(x)) > 50 else str(x) |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~5 minutes

  • Single file with a focused, straightforward logic tweak
  • Type coercion pattern is clear and direct
  • No structural or control flow changes to verify

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main functional change: fixing a TypeError caused by non-string protein descriptions. |
@coderabbitai (bot, Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/parse/tnt.py (1)

36-38: String coercion in description truncation correctly fixes the TypeError

Casting x to str before slicing/len is the right fix and will prevent the "object of type 'float' has no len()" error when descriptions are NaN/floats, while keeping the original truncation behavior. Looks good.

If you want to tighten this up a bit, you could (optionally) avoid repeated str(x) calls and make the NA handling explicit via a small helper:

-    protein_df['description'] = protein_df['description'].apply(
-        lambda x: str(x)[:50] + '...' if len(str(x)) > 50 else str(x)
-    )
+    def _truncate_description(x):
+        # Treat missing values as empty; everything else is stringified
+        if pd.isna(x):
+            s = ""
+        else:
+            s = str(x)
+        return s[:50] + "..." if len(s) > 50 else s
+
+    protein_df['description'] = protein_df['description'].apply(_truncate_description)

This keeps the bug fix, improves readability, and avoids rendering missing values as the literal strings "nan"/"None".
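To make the difference concrete, here is a small stdlib-only sketch comparing plain `str()` coercion with the helper suggested above. `is_missing` is a hypothetical stand-in for `pd.isna` on scalar values (it only covers `None` and float NaN), and both function names are illustrative rather than taken from src/parse/tnt.py.

```python
import math


def is_missing(x):
    # Hypothetical stand-in for pd.isna on scalars: None or a float NaN.
    return x is None or (isinstance(x, float) and math.isnan(x))


def truncate_str_only(x, limit=50):
    # Mirrors the merged lambda: stringify everything, then truncate.
    s = str(x)
    return s[:limit] + "..." if len(s) > limit else s


def truncate_description(x, limit=50):
    # Mirrors the suggested helper: missing values become an empty string.
    s = "" if is_missing(x) else str(x)
    return s[:limit] + "..." if len(s) > limit else s


for value in (float("nan"), None, 3.5, "Histone H4"):
    print(repr(truncate_str_only(value)), repr(truncate_description(value)))
```

The two functions agree on every non-missing input; they differ only in whether a missing description shows up as text or as an empty cell.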

If you adopt this, please re-run a small sample through parseTnT to confirm the UI/reporting still shows descriptions as expected, especially for NaN/missing descriptions.
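One way to do that check without eyeballing the UI is a small sanity pass over the truncated values. This is only a sketch: `check_descriptions` is a hypothetical helper, and it assumes the 50-character limit and '...' suffix from the lambda in this PR.

```python
def check_descriptions(values, limit=50, suffix="..."):
    """Return (index, value, problem) tuples for suspicious truncated descriptions."""
    problems = []
    for i, v in enumerate(values):
        if not isinstance(v, str):
            # Truncation should always produce a string.
            problems.append((i, v, "not a string"))
        elif len(v) > limit + len(suffix):
            # Longer than a truncated description can legitimately be.
            problems.append((i, v, "too long"))
        elif v in ("nan", "None"):
            # A missing value was stringified instead of being blanked.
            problems.append((i, v, "stringified missing value"))
    return problems


sample = ["Histone H4", "nan", "A" * 60, 1.0, ""]
for index, value, problem in check_descriptions(sample):
    print(index, repr(value), problem)
```

In practice the input would be the description column after parsing, converted to a plain list; an empty result means nothing looked off under these assumptions.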

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a74955 and 05aa44a.

📒 Files selected for processing (1)
  • src/parse/tnt.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-openms
  • GitHub Check: build-full-app

2 participants