@fede-kamel
Contributor

Summary

This PR addresses code organization and maintainability concerns in the langchain-oci package by moving duplicated authentication logic, shared utilities, and provider implementations into dedicated modules. As langchain-oracle matures into a production-grade library used by enterprise customers, keeping code quality high matters more for long-term sustainability.

Changes

1. Create shared common/ module

  • common/auth.py: Consolidates OCIAuthType enum and authentication logic that was duplicated across 4 files
  • common/utils.py: Extracts shared utility functions (OCIUtils) used by multiple modules
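As a rough illustration of what consolidating the auth logic can look like, the sketch below dispatches on an auth-type enum to build client keyword arguments in one place. It does not reproduce the code in common/auth.py: the enum member names, the signature of create_oci_client_kwargs(), and the placeholder builders are all assumptions, and no real OCI SDK calls are made.

```python
from enum import Enum
from typing import Any, Callable, Dict


class OCIAuthType(Enum):
    """Authentication modes (member names are assumed for this sketch)."""

    API_KEY = 1
    SECURITY_TOKEN = 2
    INSTANCE_PRINCIPAL = 3
    RESOURCE_PRINCIPAL = 4


def _api_key_kwargs(profile: str, config_file: str) -> Dict[str, Any]:
    # Placeholder: real code would load an OCI config file here.
    return {"config": {"profile": profile, "file": config_file}}


def _signer_kwargs(kind: str) -> Callable[[str, str], Dict[str, Any]]:
    # Placeholder: real code would construct an OCI signer object here.
    def build(profile: str, config_file: str) -> Dict[str, Any]:
        return {"config": {}, "signer": f"<{kind} signer>"}

    return build


# One dispatch table replaces the if/elif chain copied into each module.
_BUILDERS: Dict[OCIAuthType, Callable[[str, str], Dict[str, Any]]] = {
    OCIAuthType.API_KEY: _api_key_kwargs,
    OCIAuthType.SECURITY_TOKEN: _signer_kwargs("security_token"),
    OCIAuthType.INSTANCE_PRINCIPAL: _signer_kwargs("instance_principal"),
    OCIAuthType.RESOURCE_PRINCIPAL: _signer_kwargs("resource_principal"),
}


def create_oci_client_kwargs(
    auth_type: str,
    service_endpoint: str,
    profile: str = "DEFAULT",
    config_file: str = "~/.oci/config",
) -> Dict[str, Any]:
    """Build client kwargs for the given auth type (hypothetical signature)."""
    try:
        mode = OCIAuthType[auth_type]
    except KeyError:
        valid = ", ".join(member.name for member in OCIAuthType)
        raise ValueError(
            f"Unknown auth_type {auth_type!r}; expected one of: {valid}"
        ) from None
    kwargs = _BUILDERS[mode](profile, config_file)
    kwargs["service_endpoint"] = service_endpoint
    return kwargs
```

With a single helper like this, the llms, embeddings, and chat_models modules can each call the same function instead of carrying their own copy of the branching logic.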

2. Extract provider implementations into providers/ subpackage

  • providers/base.py: Abstract Provider base class defining the interface
  • providers/cohere.py: Cohere-specific implementation
  • providers/generic.py: Generic provider for Meta Llama, xAI Grok, OpenAI, Mistral
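The provider split can be pictured with a minimal sketch like the one below. It is not the package's interface: the two method names, the request and response shapes, the registry, and the vendor-prefix lookup are hypothetical and chosen only to show how an abstract base class lets each vendor's formatting live in its own file.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class Provider(ABC):
    """Abstract interface each provider must implement (methods are assumed)."""

    @abstractmethod
    def messages_to_request(self, messages: List[Dict[str, str]]) -> Dict[str, Any]:
        """Convert chat messages into a vendor-specific request payload."""

    @abstractmethod
    def response_to_text(self, response: Dict[str, Any]) -> str:
        """Extract the generated text from a vendor-specific response."""


class CohereProvider(Provider):
    # Illustrative shape: last message plus prior history.
    def messages_to_request(self, messages: List[Dict[str, str]]) -> Dict[str, Any]:
        *history, last = messages
        return {"message": last["content"], "chat_history": history}

    def response_to_text(self, response: Dict[str, Any]) -> str:
        return response["text"]


class GenericProvider(Provider):
    # Illustrative shape: a flat list of role/content messages.
    def messages_to_request(self, messages: List[Dict[str, str]]) -> Dict[str, Any]:
        return {"messages": list(messages)}

    def response_to_text(self, response: Dict[str, Any]) -> str:
        return response["choices"][0]["message"]["content"]


# Hypothetical registry keyed by the vendor prefix of a model id.
PROVIDERS: Dict[str, Type[Provider]] = {
    "cohere": CohereProvider,
    "meta": GenericProvider,
}


def provider_for(model_id: str) -> Provider:
    """Pick a provider from the model id prefix, defaulting to GenericProvider."""
    vendor = model_id.split(".", 1)[0]
    return PROVIDERS.get(vendor, GenericProvider)()
```

Under this kind of layout, supporting a new vendor means adding one subclass and one registry entry, and each subclass can be unit tested without the chat model class.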

3. Streamline main modules

  • chat_models/oci_generative_ai.py: Reduced from 1,738 to 692 lines (60% reduction)
  • llms/oci_generative_ai.py: Reduced from 402 to 352 lines
  • embeddings/oci_generative_ai.py: Reduced from 231 to 185 lines

Why This Matters

For a production-grade open source library, code quality directly impacts:

  • Maintainability: Smaller, focused files are easier to understand, review, and modify
  • Testability: Isolated components can be unit tested independently
  • Extensibility: Adding new providers (e.g., new model vendors) becomes straightforward
  • Onboarding: New contributors can understand and navigate the codebase more quickly
  • Bug Prevention: Single source of truth for shared logic prevents inconsistencies

Before/After

| Metric | Before | After |
| --- | --- | --- |
| oci_generative_ai.py lines | 1,738 | 692 |
| Duplicated auth code | ~300 lines across 4 files | Single 97-line module |
| Provider classes per file | 4 in one file | 1-2 per focused file |
| Max file size | 1,738 lines | 501 lines |

Test Plan

  • All 64 unit tests pass
  • Integration tests pass across all supported models:
    • Meta Llama (llama-4-maverick, llama-4-scout, llama-3.3-70b)
    • Cohere (command-a, command-r-plus, command-r)
    • xAI Grok (grok-4-fast, grok-3-fast, grok-3-mini-fast)
    • OpenAI (gpt-oss-20b, gpt-oss-120b)
  • Tool calling verified for both GenericProvider and CohereProvider
  • Streaming verified across all providers
  • Structured output verified
  • Backward compatibility maintained (no API changes)

Breaking Changes

None. This is an internal refactoring: the existing public API is unchanged, and the only addition is the OCIAuthType export from the package root.

Create langchain_oci/common/ package to consolidate duplicated code:

- common/auth.py: Single source of truth for OCIAuthType enum and
  create_oci_client_kwargs() function that was duplicated across
  llms/, embeddings/, and chat_models/ modules (~75 lines each)

- common/utils.py: Shared OCIUtils class with helper functions for
  tool call conversion, schema resolution, and type checking

This change eliminates approximately 300 lines of duplicated authentication
logic, improving maintainability and reducing the risk of divergent
implementations across modules.

Create langchain_oci/chat_models/providers/ to separate concerns and
improve code organization:

- providers/base.py: Abstract Provider base class defining the interface
  for all OCI GenAI providers (15 abstract methods)

- providers/cohere.py: CohereProvider implementation (~400 lines)
  handling Cohere-specific message formatting, tool calls, and responses

- providers/generic.py: GenericProvider and MetaProvider implementations
  (~500 lines) for Meta Llama, xAI Grok, OpenAI, and Mistral models

Previously, all provider logic was embedded in oci_generative_ai.py
(1,738 lines). This extraction:

- Enables isolated testing of each provider
- Makes it easier to add new providers
- Reduces cognitive load when reading individual files
- Follows the Single Responsibility Principle

Modify existing modules to leverage the new shared infrastructure:

chat_models/oci_generative_ai.py:
- Reduced from 1,738 lines to 692 lines (60% reduction)
- Import providers from new providers/ subpackage
- Import OCIUtils from common/utils

llms/oci_generative_ai.py:
- Replace duplicated OCIAuthType with import from common/auth
- Replace 50+ lines of auth logic with create_oci_client_kwargs()
- Reduced from 402 to 352 lines

embeddings/oci_generative_ai.py:
- Replace duplicated OCIAuthType with import from common/auth
- Replace 50+ lines of auth logic with create_oci_client_kwargs()
- Reduced from 231 to 185 lines

All existing functionality preserved with improved maintainability.

Add OCIAuthType to langchain_oci/__init__.py exports, allowing users
to import directly from the package root:

  from langchain_oci import OCIAuthType

This provides a cleaner API for users who need to reference the
authentication type enum without knowing the internal module structure.

@oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on Dec 19, 2025

- Remove unused OCIAuthType imports from llms and embeddings modules
- Fix line length violations (max 88 characters)
- Apply proper import formatting per ruff/isort standards
- Expand multiline imports for better readability

Use dict key access instead of .get() for required config values
to satisfy mypy type checking for open() function arguments.
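
A small sketch of why key access satisfies the type checker: on a Dict[str, str], .get() is typed as returning Optional[str], which open() does not accept, while indexing is typed as str and raises KeyError when the key is missing. The key_file key and the helper name are hypothetical, used here only for illustration.

```python
from typing import Dict


def read_key_file(config: Dict[str, str]) -> str:
    """Read the file named by a required config entry (hypothetical helper)."""
    # config.get("key_file") would be typed Optional[str], and mypy flags
    # passing a possibly-None value to open(). Indexing is typed as str and
    # fails fast with KeyError if the required entry is absent.
    with open(config["key_file"], encoding="utf-8") as handle:
        return handle.read()
```

For a value that is genuinely required, failing with KeyError at the lookup is arguably clearer than passing None along and failing later inside open().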
@fede-kamel
Contributor Author

No Logic Changes - Pure Structural Refactoring

This PR contains no logic changes apart from the single bug fix described under "Bug fix included" below. All other modifications are organizational only.

What was done:

  1. Extracted files - Code was moved from one large file to smaller focused files:

    • CohereProvider class → providers/cohere.py
    • GenericProvider class → providers/generic.py
    • Provider ABC → providers/base.py
    • OCIAuthType enum + auth logic → common/auth.py
    • OCIUtils class → common/utils.py
  2. Updated imports - Changed "from X import Y" statements to point to the new file locations

  3. Added re-exports - __init__.py files maintain the same public API

What was NOT done:

  • No algorithms changed
  • No conditionals added/removed
  • No function signatures modified
  • No return values altered
  • No error handling changed
  • No new features added
  • No behaviors modified

Bug fix included:

The only functional change was correcting convert_oci_tool_call_to_langchain to handle Cohere's parameters attribute. The previous code only worked for tool calls from Generic providers, not Cohere.
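
The kind of fix described above can be sketched as follows. This is not the actual convert_oci_tool_call_to_langchain: the function name, the assumption that Generic tool calls carry a JSON string in arguments while Cohere tool calls carry a dict in parameters, and the output dict layout are all assumptions made for illustration.

```python
import json
import uuid
from typing import Any, Dict


def convert_tool_call(tool_call: Any) -> Dict[str, Any]:
    """Normalize a provider tool call into a plain dict (hypothetical helper)."""
    if hasattr(tool_call, "parameters"):
        # Assumed Cohere-style shape: arguments already parsed into a dict.
        args = dict(tool_call.parameters or {})
    else:
        # Assumed Generic-style shape: arguments serialized as a JSON string.
        args = json.loads(tool_call.arguments or "{}")
    return {
        "name": tool_call.name,
        "args": args,
        # Fall back to a generated id when the provider does not supply one.
        "id": getattr(tool_call, "id", None) or uuid.uuid4().hex,
    }
```

Code that reads only the arguments attribute would raise AttributeError on an object shaped like the Cohere case here, which matches the symptom of a converter that worked for Generic providers only.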

Verification:

  • All 64 unit tests pass (Python 3.9, 3.12, 3.13)
  • All 11 integration models tested successfully (Meta Llama, Cohere, xAI Grok, OpenAI)
  • Ruff, isort, and mypy checks all pass
