Skip to content

Conversation

@pboling
Copy link
Contributor

@pboling pboling commented Dec 19, 2025

Bug 1: Ruby Gemfile.lock resolver fails to detect licenses for local path dependencies

Description:

The current implementation of the Ruby Gemfile.lock resolver fails to correctly identify and resolve licenses for dependencies defined with a local path source (e.g., gem 'citrus', path: '.').

Symptoms:

  • Dependencies sourced locally appear with an "Unknown" license in the report.
  • In Gemfile.lock, these dependencies often appear with a ! suffix (e.g., citrus!) in the DEPENDENCIES section, which was not being handled.
  • The PATH block in Gemfile.lock, which contains the mapping to the local directory, was being ignored.

Reproduction:

Create a Ruby project with a Gemfile that references a local gem via path:.
Run license-eye header check.
The local gem is reported with an "Unknown" license, even if its .gemspec declares a valid license.

Solution:

  • Parser Update: Updated parseGemfileLock to correctly parse PATH blocks in Gemfile.lock and capture the local path (mapped from remote:) for relevant gems.
  • Dependency Name Handling: Updated the parser to strip the ! suffix from dependency names in the DEPENDENCIES section, which indicates a local source.
  • Local License Resolution: Implemented fetchLocalLicense to parse the license directly from the local .gemspec file when a local path is detected, bypassing the RubyGems API lookup.
  • Testing: Added a regression test case using the citrus gem structure to verify the fix.

Bug 2: GemspecResolver does not resolve transitive dependencies

Description:

When resolving dependencies from a .gemspec file, GemspecResolver only considers direct dependencies declared in the gemspec. It does not recursively resolve dependencies of those dependencies. This leads to incomplete dependency graphs and missing license checks for transitive dependencies.

Reproduction:

  • Create a gemspec A that depends on B. B depends on C.
  • Run license-eye on A.
  • C will be missing from the report.

Solution:

  • Added findInstalledGemspec helper to locate installed gemspecs in GEM_HOME.
  • Added parseGemspecDependencies helper to extract dependencies from a gemspec file.
  • Updated GemspecResolver.Resolve to traverse the dependency graph using BFS, ensuring all transitive dependencies are discovered and checked.
  • Added regression test in gemspec_test.go covering the toml-merge -> toml-rb -> citrus case.

Issue 3: Improve dependency resolution and license detection

  • Fix Gemfile.lock parsing: Reset currentRemotePath when entering GEM blocks to prevent incorrect LocalPath inheritance from preceding PATH blocks.
  • Improve Gemspec resolution: Implement recursive dependency resolution for gemspecs by traversing installed gemspecs to find transitive dependencies.
  • Improve License detection:
    • Validate gem name within gemspec content in fetchLocalLicense instead of relying solely on filename conventions.
    • Add parseGemspecInfo to extract both name and license from gemspecs.
  • Add regression tests for transitive dependencies (e.g., citrus in toml-merge case) and fix test data setup.

- Previously, the Ruby Gemfile.lock resolver failed to identify dependencies sourced from local paths, resulting in "Unknown" licenses. The parser ignored PATH blocks and did not handle the ! suffix on dependency names.
- Updates parseGemfileLock to capture local paths from PATH blocks.
- Implements fetchLocalLicense to extract licenses from local .gemspec files.
- Strips the ! suffix from dependency names in the lockfile.
- Adds a regression test for local path dependencies.
- Update GemspecResolver to recursively resolve dependencies by traversing installed gemspecs.
- This ensures that transitive dependencies are correctly identified and their licenses checked.
- Fixes issue where transitive dependencies (e.g. citrus in toml-merge) were missed.
Copilot AI review requested due to automatic review settings December 19, 2025 11:23
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes two bugs in Ruby dependency resolution: (1) support for local path dependencies in Gemfile.lock files, and (2) transitive dependency resolution for gemspec files. The implementation adds PATH block parsing to handle local gems, introduces license resolution from local gemspec files, and implements BFS-based recursive dependency traversal.

  • Added PATH block parsing in Gemfile.lock parser to capture local gem paths
  • Introduced GemspecResolver for direct gemspec dependency resolution with transitive dependency support
  • Implemented local and installed gemspec license extraction as fallback options before RubyGems API

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pkg/deps/ruby.go Adds LocalPath to gemSpec struct, updates parseGemfileLock to handle PATH blocks and "!" suffix, implements GemspecResolver with BFS transitive resolution, adds helper functions for local and installed gemspec license/dependency parsing
pkg/deps/resolve.go Registers the new GemspecResolver in the resolvers list
pkg/deps/ruby_test.go Adds test cases for citrus library (0 runtime deps) and local_dep app (local path gem with MIT license)
pkg/deps/gemspec_test.go New test file for GemspecResolver covering direct dependency detection and transitive dependency resolution via installed gems
pkg/deps/testdata/ruby/toml-merge/toml-merge.gemspec Test fixture gemspec with runtime dependencies (toml-rb, tree_haver, ast-merge, version_gem) and development dependencies
pkg/deps/testdata/ruby/local_dep/Gemfile.lock Test fixture Gemfile.lock with PATH block pointing to local citrus gem
pkg/deps/testdata/ruby/local_dep/citrus/citrus.gemspec Minimal gemspec for citrus with MIT license and development dependencies
pkg/deps/testdata/ruby/citrus/Gemfile.lock Test fixture Gemfile.lock for citrus library with PATH block using "." as remote
pkg/deps/testdata/ruby/citrus/citrus.gemspec Full citrus gemspec with MIT license and development dependencies for testing library project resolution

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pboling and others added 2 commits December 19, 2025 18:36
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@wu-sheng wu-sheng requested a review from kezhenxu94 December 19, 2025 12:02
- Fix `Gemfile.lock` parsing: Reset `currentRemotePath` when entering `GEM` blocks to prevent incorrect `LocalPath` inheritance from preceding `PATH` blocks.
- Improve Gemspec resolution: Implement recursive dependency resolution for gemspecs by traversing installed gemspecs to find transitive dependencies.
- Improve License detection:
  - Validate gem name within gemspec content in `fetchLocalLicense` instead of relying solely on filename conventions.
  - Add `parseGemspecInfo` to extract both name and license from gemspecs.
- Add regression tests for transitive dependencies (e.g., `citrus` in `toml-merge` case) and fix test data setup.
@pboling
Copy link
Contributor Author

pboling commented Dec 19, 2025

@kezhenxu94
NOTE: I have just discovered a Go tool for parsing Gemfile, Gemspec & Gemfile.lock.
https://github.com/contriboss/gemfile-go
I think we can simplify, and delegate the complexity of parsing these files to that library!
I'll submit another PR exploring that refactor. Now that we have pretty decent specs we should be able to assert that it provides the needed functionality.

- pkg/deps/gemspec_test.go:
  - Fixed goconst errors: Replaced repeated string literals "toml-rb", "citrus", and "MIT" with constants.
  - Fixed gocritic error: Updated the octal literal 0755 to the modern 0o755 syntax.
- pkg/deps/ruby.go:
  - Fixed funlen error: Refactored parseGemfileLock by extracting the parsing logic into a lockParserState struct and helper methods (processLine, processSpecs, processDeps), significantly reducing the function length.
  - Fixed revive error: Renamed the name parameter to targetName in fetchLocalLicense to avoid potential shadowing or unused parameter warnings.
  - Fixed unparam error: Updated fetchInstalledLicense to return only string, as the error return value was always nil.
  - Fixed nestingReduce error: Inverted the if condition inside the loop in fetchInstalledLicense (and used continue) to reduce nesting.
  - Updated Call Sites: Adjusted all calls to fetchInstalledLicense to match the new function signature.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pboling and others added 7 commits December 19, 2025 06:12
- When resolving local gemspecs by scanning the specifications directory, the previous logic relied solely on filename prefix matching. This could lead to false positives where a dependency name is a prefix of another installed gem (e.g., searching for foo matches foo-bar-1.0.gemspec).
- This change adds a verification step that parses the candidate gemspec file to ensure the name attribute inside the gemspec exactly matches the requested dependency name. This applies to both findInstalledGemspec and fetchInstalledLicense.
- Replace the simple `!strings.ContainsAny` check with a robust regex pattern `rubyVersionRe` to validate Ruby gem versions. This ensures that only valid version strings (including pre-releases) are treated as specific versions, avoiding potential issues with constraints or invalid version strings.
- The `parseGemspecLicense` function was unused and redundant, as it
simply wrapped `parseGemspecInfo`. Removing it reduces code duplication
and cleans up the codebase.
- Update the gemspec license parsing logic to support multiple licenses defined in an array (e.g., `s.licenses = ['MIT', 'Apache-2.0']`).
- Previously, only the first license was captured. Now, all licenses are extracted and joined with " OR ".
- Add documentation comments to `GemspecResolver` and its methods to explain that it handles `.gemspec` files by extracting runtime dependencies and recursively resolving their transitive dependencies from installed gems.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
kezhenxu94
kezhenxu94 previously approved these changes Dec 19, 2025
Copy link
Member

@kezhenxu94 kezhenxu94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some files are missing license header

@kezhenxu94
Copy link
Member

@kezhenxu94 NOTE: I have just discovered a Go tool for parsing Gemfile, Gemspec & Gemfile.lock. https://github.com/contriboss/gemfile-go I think we can simplify, and delegate the complexity of parsing these files to that library! I'll submit another PR exploring that refactor. Now that we have pretty decent specs we should be able to assert that it provides the needed functionality.

This is awesome! Less codes to maintain 😄

@pboling
Copy link
Contributor Author

pboling commented Dec 19, 2025

@kezhenxu94 Added missing license headers in 490aede

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Check scanner.Err() after the loop in parseGemspecInfo to ensure any parsing errors are propagated instead of being silently ignored.
- When parsing a gemspec file, if multiple licenses are detected, we currently join them with "OR". This change adds a warning log to inform the user about this assumption, as some gems might intend "AND".
- Explicitly reset currentRemotePath when entering a PATH block in Gemfile.lock parser to prevent state leakage from previous blocks.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pboling and others added 7 commits December 20, 2025 05:03
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- This adds a safety check to the BFS implementation for transitive dependency resolution in pkg/deps/ruby.go.
- It limits the queue size to 10,000 to prevent excessive resource consumption or infinite loops in pathological cases (e.g. circular dependencies).
…lock parser

- The variable currentRemotePath was misleading as it captures local filesystem paths in PATH blocks, not remote URLs. Renaming it to currentLocalPath improves clarity and accuracy.
- Explicitly verify that the 'citrus' local dependency is resolved with the expected 'MIT' license and ensure it is not present in the skipped dependencies list.
- This removes ambiguity in the test result and guarantees that local path resolution is functioning correctly.
- Explicitly document that version constraints in gemspec runtime dependencies are currently ignored.
- This clarifies that the resolver defaults to the first found installed version of a gem, which may lead to incorrect resolution if multiple versions are installed and the first one does not satisfy the constraint.
- Update a misleading comment in the gemspec resolver to accurately reflect that it checks for locally installed gems first before falling back to the RubyGems API for license information.
- Update the regex used to extract licenses from gemspec files to be more specific.
- The new pattern `\.licenses?\s*=\s*(\[[^\]]*\]|['"][^'"]*['"])` ensures that only array brackets or quoted strings are captured, preventing the accidental inclusion of comments or trailing characters that occurred with the previous overly permissive `([^#]*)` pattern.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Move `internal/logger` to `pkg/logger` to make it accessible to other packages.
- Update all import paths from `github.com/apache/skywalking-eyes/internal/logger` to `github.com/apache/skywalking-eyes/pkg/logger`.
- Replace direct usage of `github.com/sirupsen/logrus` in `pkg/deps/ruby.go` with the internal logger wrapper.
- Removes redundant checks for filename prefixes in `findInstalledGemspec` and `fetchInstalledLicense`. The prefix is already validated earlier in the loop, making the subsequent `strings.TrimPrefix` check always false (i.e., the prefix is always present and removed).
  - Remove dead code block in `findInstalledGemspec`
  - Remove dead code block in `fetchInstalledLicense`
- Enhance error message when dependency graph size limit is exceeded to include current size and potential causes.
- Fix queue size check to prevent exceeding the limit of 10000 nodes.
- Add debug logging for failures in finding installed gemspecs or parsing gemspec dependencies, instead of silently ignoring them.
- switch to gem.coop as official gem server
- Implement `getAllGemspecs` to cache gemspec paths, reducing I/O overhead during dependency resolution.
- Update `findInstalledGemspec` and `fetchInstalledLicense` to utilize the cached gemspec list instead of repeatedly scanning directories.
- Introduce `sync.Mutex` to ensure thread-safe access to the gemspec cache.
- Key the cache by `GEM_PATH` environment variable to support changes in the environment during runtime (e.g. in tests).
- Add unit tests for `fetchInstalledLicense` in `pkg/deps/ruby_test.go` covering:
  - Multiple versions of the same gem
  - Invalid version strings
  - Missing gemspec files
  - Gems with similar names
- Add unit tests for path traversal protection in `pkg/deps/ruby_test.go` covering:
  - Path traversal attempts in `Gemfile.lock`
- Fix `fetchInstalledLicense` in `pkg/deps/ruby.go` to return empty string for invalid version strings instead of falling back to any installed version.
@pboling
Copy link
Contributor Author

pboling commented Dec 20, 2025

@kezhenxu94 Aside from the question about logging above, this is done.

- When multiple licenses are specified in a gemspec, we now assume all licenses apply (AND) rather than alternatives (OR).
- This is a safer, more conservative approach. Users can override this in the configuration if the intention was OR.
- Added tests for multiple licenses (both non-conflicting and potentially conflicting) to verify they are joined with "AND".
@kezhenxu94 kezhenxu94 merged commit 13c0e5b into apache:main Dec 20, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants