Conversation

@flying-dragon-ai

Description

Related Issue

Fixes #(issue)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not
    work as expected)
  • Documentation update
  • Code refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Checklist

  • I have read the Contributing Guide
  • I have run formatting tools (pre-commit or manual)
  • I have run relevant unit tests and they pass
  • I have added tests for new functionality
  • I have updated documentation if needed
  • My branch is up to date with main
  • This PR introduces breaking changes (if yes, fill out details below)
  • If this PR changes documentation, I have built and previewed it locally with
    jb build docs
  • No critical issues raised by AI reviewers (/gemini review)

Breaking Change Details (if applicable):

Additional Context



- Add pymysql to the dependency list in pyproject.toml
- Prepare support for later database-related operations
- Keep the dependency list consistent and complete
- Add OceanBaseMetricsLogger class for metrics persistence
  - Database connection with environment variable support
  - Table creation with proper indexes
  - Metric insertion with error handling
  - Query examples for verification
- Add comprehensive quickstart guide
  - OceanBase introduction and Docker deployment
  - Connection configuration and troubleshooting
  - Two integration approaches (direct + custom)
  - Common SQL queries and performance optimization
- Add pymysql dependency to pyproject.toml
- Update README with tutorial link

Closes: OceanBase integration feature request
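
The "database connection with environment variable support" item above could look roughly like the following. This is a minimal sketch, not the code from the PR: the `OCEANBASE_*` variable names, the `load_db_config` helper, and the default values are all assumptions chosen for illustration.

```python
import os


def load_db_config(env=None):
    """Build connection settings from environment variables.

    The OCEANBASE_* names and the defaults below are hypothetical;
    the PR may use different names and values.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("OCEANBASE_HOST", "127.0.0.1"),
        "port": int(env.get("OCEANBASE_PORT", "2881")),
        "user": env.get("OCEANBASE_USER", "root"),
        "password": env.get("OCEANBASE_PASSWORD", ""),
        "database": env.get("OCEANBASE_DATABASE", "test"),
    }


# Passing a plain dict instead of os.environ keeps the helper easy to test.
config = load_db_config({"OCEANBASE_HOST": "db.internal", "OCEANBASE_PORT": "3306"})
```

A dictionary of this shape could then be unpacked into the logger, as in `OceanBaseMetricsLogger(**config)`, provided the constructor accepts these keyword names.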
@gemini-code-assist
Contributor

Summary of Changes

Hello @flying-dragon-ai, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

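This PR introduces a new integration with the OceanBase database that lets users persist and manage training metrics. It provides comprehensive documentation and a practical Python example that walk users through OceanBase setup, connection configuration, metric logging, and basic data queries. The enhancement aims to offer a robust way to store large-scale training data and experiment records.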

Highlights

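  • New OceanBase integration: Adds the ability to persist training metrics to an OceanBase database, providing a storage option for large-scale training data.
  • Detailed integration documentation: Adds docs/tutorial/oceanbase_quickstart.md, a guide covering OceanBase installation, connection configuration, usage examples, common queries, troubleshooting, and advanced configuration.
  • Python example code: Adds the examples/utils/oceanbase_example.py script, which demonstrates how to use the OceanBaseMetricsLogger class to connect to the database, create the training metrics table, and insert and query training metrics.
  • Dependency update: Adds the pymysql library to pyproject.toml as a new project dependency to support connecting to OceanBase from Python.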
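
The create, insert, and query flow described in these highlights can be sketched without a running OceanBase instance. The example below deliberately swaps in the standard-library `sqlite3` module as a stand-in for pymysql, so it is not the PR's implementation: the column types, the index name `idx_experiment_step`, and the helper functions are assumptions. Note that pymysql uses `%s` placeholders where `sqlite3` uses `?`.

```python
import sqlite3


def create_table(conn):
    """Create the metrics table and an index for per-experiment lookups."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS training_metrics (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            experiment_name TEXT NOT NULL,
            step INTEGER NOT NULL,
            loss REAL,
            reward REAL,
            timestamp TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_experiment_step "
        "ON training_metrics (experiment_name, step)"
    )


def insert_metric(conn, experiment_name, step, loss=None, reward=None):
    """Insert one row using a parameterized query (never string formatting)."""
    conn.execute(
        "INSERT INTO training_metrics (experiment_name, step, loss, reward) "
        "VALUES (?, ?, ?, ?)",
        (experiment_name, step, loss, reward),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access by column name
create_table(conn)
for step in range(1, 6):
    insert_metric(
        conn, "gsm8k_grpo_demo", step * 100,
        loss=1.5 - step * 0.2, reward=0.5 + step * 0.1,
    )

# Fetch the five highest steps for verification.
rows = conn.execute(
    "SELECT experiment_name, step, loss, reward FROM training_metrics "
    "ORDER BY step DESC LIMIT 5"
).fetchall()
```

Ordering by `step` rather than `timestamp` keeps the result deterministic here, since rows inserted within the same second share a timestamp.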

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

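This PR adds an example and documentation for persisting training metrics to OceanBase, and it is very well done. The documentation is thorough, covering installation, configuration, advanced usage, and troubleshooting. The example code also clearly shows how to interact with an OceanBase database. I have a suggestion, mainly about using a context manager (the with statement) in the example code oceanbase_example.py to simplify resource management and make the code more robust and readable. Overall, this is a high-quality contribution that will be very helpful to users who need to persist training metrics.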

Comment on lines +181 to +230
    metrics_logger = OceanBaseMetricsLogger(**config)

    try:
        # 1. Connect to the database
        metrics_logger.connect()

        # 2. Create the table
        metrics_logger.create_table()

        # 3. Insert sample data
        logger.info("Inserting sample training metrics...")
        for step in range(1, 6):
            metrics_logger.insert_metric(
                experiment_name="gsm8k_grpo_demo",
                step=step * 100,
                loss=1.5 - step * 0.2,
                reward=0.5 + step * 0.1,
            )

        logger.info("✓ Sample data inserted successfully")

        # 4. Query to verify
        logger.info("Querying the 5 most recent records...")
        with metrics_logger.connection.cursor() as cursor:
            cursor.execute(
                """
                SELECT experiment_name, step, loss, reward, timestamp
                FROM training_metrics
                ORDER BY timestamp DESC
                LIMIT 5
                """
            )
            results = cursor.fetchall()
            for row in results:
                logger.info(
                    f"  {row['experiment_name']} | "
                    f"step={row['step']} | "
                    f"loss={row['loss']:.3f} | "
                    f"reward={row['reward']:.3f} | "
                    f"time={row['timestamp']}"
                )

        logger.info("=== Example finished ===")

    except Exception as e:
        logger.error(f"Execution failed: {e}")
        raise
    finally:
        # 5. Close the connection
        metrics_logger.close()
Contributor

medium

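To make resource management safer and the code more concise, consider changing OceanBaseMetricsLogger to support the context manager protocol (the with statement). This ensures the database connection is closed correctly after use, whether or not an exception occurs.

First, add the __enter__ and __exit__ methods to the OceanBaseMetricsLogger class: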

    def __enter__(self):
        self.connect()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

Then the try...finally block in the main function can be refactored into a more concise with statement.

    try:
        with OceanBaseMetricsLogger(**config) as metrics_logger:
            # 1. Connect to the database (handled automatically by the with statement)
            # 2. Create the table
            metrics_logger.create_table()

            # 3. Insert sample data
            logger.info("Inserting sample training metrics...")
            for step in range(1, 6):
                metrics_logger.insert_metric(
                    experiment_name="gsm8k_grpo_demo",
                    step=step * 100,
                    loss=1.5 - step * 0.2,
                    reward=0.5 + step * 0.1,
                )

            logger.info("✓ Sample data inserted successfully")

            # 4. Query to verify
            logger.info("Querying the 5 most recent records...")
            with metrics_logger.connection.cursor() as cursor:
                cursor.execute(
                    """
                    SELECT experiment_name, step, loss, reward, timestamp
                    FROM training_metrics
                    ORDER BY timestamp DESC
                    LIMIT 5
                    """
                )
                results = cursor.fetchall()
                for row in results:
                    logger.info(
                        f"  {row['experiment_name']} | "
                        f"step={row['step']} | "
                        f"loss={row['loss']:.3f} | "
                        f"reward={row['reward']:.3f} | "
                        f"time={row['timestamp']}"
                    )

            logger.info("=== Example finished ===")

    except Exception as e:
        logger.error(f"Execution failed: {e}")
        raise
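
The claim that the connection is closed even when the body raises can be checked without a database. The sketch below is an illustration only: `FakeConnection` and `MetricsLoggerSketch` are hypothetical stand-ins for pymysql's connection and the PR's OceanBaseMetricsLogger, reduced to the parts that matter for the context manager protocol.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection; records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class MetricsLoggerSketch:
    """Hypothetical, trimmed-down logger showing only connect/close handling."""

    def __init__(self):
        self.connection = None

    def connect(self):
        self.connection = FakeConnection()

    def close(self):
        # Guard against close() being called before connect() succeeded.
        if self.connection is not None:
            self.connection.close()

    def __enter__(self):
        self.connect()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()
        return False  # a falsy return lets the exception propagate


def demo():
    """Raise inside the with block and report whether the connection was closed."""
    metrics_logger = MetricsLoggerSketch()
    try:
        with metrics_logger:
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    return metrics_logger.connection.closed
```

Returning a falsy value from `__exit__` matters here: the caller's own `except` block still sees the error, so the existing logging and re-raise behavior is preserved.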

@cafe3310

It looks like the check is failing; please resolve the failing check.

Replace `Optional[X]` with `X | None` syntax (Python 3.10+) in
oceanbase_example.py to comply with ruff UP045 rule.

Changes:
- Remove unused `typing.Optional` import
- Update connection type annotation
- Update insert_metric parameter annotations
- Import typing.Optional to replace union type annotations
- Change pymysql.Connection | None to Optional[pymysql.Connection]
- Change float | None parameters to Optional[float]
- Improve type consistency and readability of the code
@ZiyiTsang
Collaborator

ZiyiTsang commented Jan 31, 2026

I don't think we need so many CLAUDE.md files and Chinese comments in the code in this PR...

Collaborator

@rchardx rchardx left a comment

Please do not include your Claude code settings in this OceanBase example PR.
Please use English as the primary language for this repository, in both title and contents.

"pebble",
"timeout-decorator",
"prettytable",
"pymysql",
Collaborator

@rchardx rchardx Jan 31, 2026

pymysql is added as a core dependency but is only used by examples/utils/oceanbase_example.py. This forces all users to install pymysql even if they never use OceanBase. The examples/ directory is explicitly excluded from package distribution.
I believe OceanBase users will install this package in their own environment. Please do not add this package in AReaL.
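
One way an example script can rely on a package that is not a core dependency is to import it lazily and fail with an actionable message. The following is a sketch of that pattern under stated assumptions, not something the reviewer prescribed: `require_module` is a hypothetical helper and the install hint is only illustrative.

```python
import importlib


def require_module(name, install_hint):
    """Import an optional dependency, or raise an ImportError that says how to install it."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"'{name}' is required for this example but is not installed. "
            f"Install it with: {install_hint}"
        ) from exc


# In the example script this might be called as (hypothetical usage):
#     pymysql = require_module("pymysql", "pip install pymysql")
json_module = require_module("json", "part of the standard library")
```

With this approach, users who never run the OceanBase example are not forced to install pymysql, while those who do get a clear message instead of a bare traceback.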

4 participants