@wuaishare
Contributor

Tested, no problems: cover, desc, and hot were all fetched successfully.
"realtime": "热搜",
"novel": "小说",
"movie": "电影",
"teleplay": "电视剧",
"car": "汽车",
"game": "游戏"
I tested these categories as well, and they all fetch correctly.

@gemini-code-assist

Summary of Changes

Hello @wuaishare, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

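This pull request fixes data fetching for the Baidu endpoint. By updating the User-Agent, strengthening the data parsing logic, and extending the type definitions, it ensures that the various Baidu feeds (trending searches, novels, movies, and so on) are fetched and processed reliably and correctly. The changes make the data source more dependable and more tolerant of future changes to the response structure.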

Highlights

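  • Fix Baidu endpoint data fetching: Resolves cases where the Baidu endpoint failed to fetch data correctly, so that the cover, desc, and hot fields are retrieved successfully, and supports fetching multiple categories such as trending searches, novels, and movies.
  • More robust data parsing: Improves the logic that extracts s-data from the Baidu response, adds a null check on matchResult, and reworks how cardContent is parsed so that it handles different data structures and avoids potential parsing errors.
  • Updated User-Agent: Changes the User-Agent request header from a mobile device to desktop Chrome, which helps improve the stability and success rate of fetching.
  • Extended Baidu route type definitions: Adds optional properties such as title, imgInfo, url, and hotTag to the Baidu route type in src/router.types.d.ts to support more flexible and complete data mapping.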
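
The null check on matchResult mentioned in the highlights could look roughly like the sketch below. The helper name extractSData and the exact shape of the HTML comment marker are assumptions made for illustration; the PR's actual pattern is not shown in this conversation.

```typescript
// Hypothetical helper: pull a JSON payload out of an HTML comment marker.
// The "<!--s-data:...-->" marker shape is an assumption, not taken from the PR.
function extractSData(html: string): unknown {
  const matchResult = html.match(/<!--s-data:([\s\S]*?)-->/);
  // String.prototype.match returns null when nothing matches,
  // so guard before indexing into the result.
  if (!matchResult) return null;
  try {
    return JSON.parse(matchResult[1] ?? "");
  } catch (error) {
    // Malformed JSON is treated the same as "no data found".
    return null;
  }
}
```

Returning null in both failure cases lets the caller fall back to an empty list instead of failing the whole request.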

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

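This PR fixes the Baidu endpoint integration by updating the User-Agent and making the data parsing logic more robust against different response structures. Nicely done. The changes head in the right direction, but I found a few things to improve in error handling and code clarity, and have left specific review comments on the code.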

Comment on lines 54 to 60
const sData = JSON.parse(matchResult[1]);
const cardContent = sData.data?.cards?.[0]?.content ?? sData.cards?.[0]?.content;
const jsonObject = Array.isArray(cardContent)
  ? Array.isArray(cardContent[0]?.content)
    ? cardContent[0].content
    : cardContent
  : [];

high

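This code has two problems:

  1. Reliability: JSON.parse(matchResult[1]) throws if matchResult[1] is not a valid JSON string, which causes the whole request to fail. Consider adding try...catch to handle potential parse errors.
  2. Readability: The nested ternary operator that determines the value of jsonObject is compact, but it is hard to read and hard to maintain.

Consider refactoring this block to use try...catch and a clearer if/else structure to improve robustness and readability.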

  let jsonObject: any[] = [];
  try {
    const sData = JSON.parse(matchResult[1]);
    const cardContent = sData.data?.cards?.[0]?.content ?? sData.cards?.[0]?.content;
    if (Array.isArray(cardContent)) {
      if (cardContent.length > 0 && Array.isArray(cardContent[0]?.content)) {
        jsonObject = cardContent[0].content;
      } else {
        jsonObject = cardContent;
      }
    }
  } catch (error) {
    // Consider logging the error for debugging
    // logger.error("Failed to parse Baidu data", error);
  }
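
As a rough illustration of the two response shapes this logic has to handle, the selection step can be pulled into a small pure function and exercised with sample inputs. The name pickItems, the Item type, and the word field in the sample data are hypothetical and used only for this sketch.

```typescript
type Item = Record<string, unknown>;

// Hypothetical pure helper mirroring the if/else structure suggested above.
function pickItems(sData: any): Item[] {
  const cardContent = sData?.data?.cards?.[0]?.content ?? sData?.cards?.[0]?.content;
  if (!Array.isArray(cardContent)) return [];
  // Some responses wrap the real list one level deeper.
  const inner = cardContent[0]?.content;
  return Array.isArray(inner) ? inner : cardContent;
}

// Flat shape: items sit directly under data.cards[0].content.
const flatItems = pickItems({ data: { cards: [{ content: [{ word: "a" }] }] } });
// Nested shape: items sit under cards[0].content[0].content.
const nestedItems = pickItems({ cards: [{ content: [{ content: [{ word: "b" }] }] }] });
```

Keeping the selection free of I/O makes it straightforward to unit test each shape separately from the JSON.parse error handling.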

cover: v.img ?? v.imgInfo?.src ?? "",
author: v.show?.length ? v.show : "",
timestamp: 0,
hot: Number(v.hotScore ?? v.hotTag ?? 0),

medium

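When the hot value is converted with Number(), a non-numeric string in v.hotScore or v.hotTag (for example "新" or "热") makes Number() return NaN, which can cause problems in downstream data processing. Consider converting with parseInt(..., 10) || 0 instead, which is safer: it parses a string such as "123万" as 123 and falls back to 0 when parsing fails.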

        hot: parseInt(v.hotScore ?? v.hotTag, 10) || 0,
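
The difference between the two conversions can be checked with a small sketch. The helper name parseHot is hypothetical, and wrapping the value in String() is an addition here so that the function also accepts numbers and undefined under TypeScript's typing of parseInt.

```typescript
// Hypothetical helper wrapping the suggested conversion.
function parseHot(value: unknown): number {
  // parseInt reads leading digits and stops at the first non-digit;
  // a NaN result falls back to 0 through the || operator.
  return parseInt(String(value ?? ""), 10) || 0;
}

const fromUnitString = parseHot("123万"); // 123
const fromLabel = parseHot("新");          // 0
const fromNumber = parseHot(4567);         // 4567
const fromMissing = parseHot(undefined);   // 0
const viaNumber = Number("新");            // NaN
```

Note that parseInt discards the unit suffix, so "123万" becomes 123 rather than 1230000; whether that is acceptable depends on how the hot value is used downstream.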
