Skip to content

Configure Gobra via JSON#895

Draft
ArquintL wants to merge 16 commits intomasterfrom
json-cli-inputs
Draft

Configure Gobra via JSON#895
ArquintL wants to merge 16 commits intomasterfrom
json-cli-inputs

Conversation

@ArquintL
Copy link
Member

@ArquintL ArquintL commented Mar 19, 2025

This PR pushes @jcp19's idea further to use a JSON file to configure Gobra by integrating the parsing of the JSON file into Gobra. The accepted JSON structure follows gobrago with the following differences:

  • the module-level and package-level JSON files are called gobra-mod.json and gobra.json, respectively
  • the module-level JSON file has two fields, installation_cfg and default_job_cfg
  • if a field is present in the package-level JSON (a so-called job configuration) then it takes precedence over the module-level field (the default job configuration). This also applies to options expressed in a other field. In particular, these options are parsed and merged with the other non-other fields, constituting a configuration after merging. I.e., this merging is happening before applying the precedence rule that overwrites module-level options.

In addition, the following design choices have been made:

  • --config <path> is used to configure Gobra
  • <path> must exist and can either be one of the following:
    • a path to a package-level JSON file in which case the Gobra files in the same folder as the JSON file are considered a package and are verified
    • a path to a directory in which case the Gobra files in this folder are considered a package and are verified. A package-level JSON file is optional
  • In this configuration mode, only --printConfig is additionally permitted to display the actually used configuration. All other options result in an error. If --printConfig is not provided, Gobra prints a short note stating with configuration files have been considered.
  • In this configuration mode, the module-level JSON file is mandatory, which is located by traversing the filesystem starting from <path> towards the filesystem's root directory

@ArquintL ArquintL marked this pull request as draft March 19, 2025 17:35
@ArquintL ArquintL requested a review from jcp19 March 19, 2025 17:35
@ArquintL ArquintL added the enhancement New feature or request label Mar 19, 2025
@jcp19
Copy link
Contributor

jcp19 commented Mar 27, 2025

@ArquintL this is marked as a draft PR but you asked for the review. Should I provide feedback on the implementation already? regarding the cli options, I would opt for one of the following:

  1. make it an error to provide additional flags when --config <path> is passed
  2. whatever flags passed on the CI would override those in the config (this is useful if you want to mostly reuse the config, but try out different options quickly), and I would produce a message saying that this is the case

@ArquintL
Copy link
Member Author

@jcp19 This PR is in draft mode as the design choices are more like suggestions. Code-wise, I do not have any additional changes planned so I'd appreciate a code review. We currently do not have a JSON field for every existing CLI option but if necessary, we can always add more (in separate PRs)

@jcp19
Copy link
Contributor

jcp19 commented Apr 22, 2025

High-level questions about the design:

  • is the module-level json always required, even when we do not use the --config option? My expectation is that it is not, otherwise it may make it harder for new users to try out Gobra. In addition, what should be the behaviour if a user does not pass the --config flag but still has a module level json, or what happens if they use the -p flag and have a package json there?
  • I wonder if the distinction between the fields of the module level and package level still make sense (why can't we have the same set of fields in both?)

@jcp19
Copy link
Contributor

jcp19 commented Apr 22, 2025

I think that the following design could also work:

  • there are two kinds of files which have the same format. Both are optional. One is called gobra-project.json and must be at the project root or at the module root, the other is called gobra.json and must be in the package folder.
  • by default, we always determine options based on the following hierarchy: default config (determined by the default options for all flags) < gobra-project.json (if present) < gobra.json (if present) < CLI options provided at the call site < file-level options (*) (the order of the last two is up for debate)
  • This chain of options would always be used by default, in all modes (-i, -p, and -r). A short message saying which config files were found and used in the environment should be shown so that they are aware of it and are not confused by the behaviour. If no json files are found, no message is displayed (this keeps the UI light in the likely case users are not using verifying a project, but rather trying out gobra on small files)
  • we might consider having a flag to ignore the json files in the surrounding environment, but I don't see when this would be useful

Under this design, the --config flag would not be necessary.

(*) While thinking of how all options relate to each other, and especially, how the configs of each package relate to each other, I was confronted again with the fact that right now, we read all configs of all files, regardless of their packages. I think this is a bad idea, which leads to unexpected results (e.g., if we import a package that was verified for overflow checks, it enables overflow checks for the current package). Maybe we should consider deprecating in-file configs when we introduce the json config, except for tests.

@ArquintL
Copy link
Member Author

is the module-level json always required, even when we do not use the --config option? My expectation is that it is not, otherwise it may make it harder for new users to try out Gobra. In addition, what should be the behaviour if a user does not pass the --config flag but still has a module level json, or what happens if they use the -p flag and have a package json there?

Sorry for the imprecision, it's only mandatory if --config is used. Otherwise, it's simply ignored (same holds for gobra.json files)

I wonder if the distinction between the fields of the module level and package level still make sense (why can't we have the same set of fields in both?)

One can argue whether we should allow installation_cfg on the module-level as well. Would you then rename default_job_cfg to something else such that it's uniform for module- and package-level config files (i.e., omit the "default" part as this becomes implicitly the case via our overwriting semantics for configs)?

there are two kinds of files which have the same format. Both are optional. One is called gobra-project.json and must be at the project root or at the module root, the other is called gobra.json and must be in the package folder.

If both files are optional? How would we identify which folder forms a module's root? By detecting go.mod?

This chain of options would always be used by default, in all modes (-i, -p, and -r)

Not sure whether this is less confusing but I think that this should work. The only weird case affects -r where the behavior is not that clear. I.e., if you invoke Gobra on a package but specify -r, will it iterate over all packages in the module or only over packages in subfolders contained in the current working directory?

@jcp19
Copy link
Contributor

jcp19 commented Apr 23, 2025

If both files are optional? How would we identify which folder forms a module's root? By detecting go.mod?

Either that or by traversing the directory tree, starting in the current package, and moving from parent to parent until a gobra-project.json is found.

Not sure whether this is less confusing but I think that this should work. The only weird case affects -r where the behavior is not that clear. I.e., if you invoke Gobra on a package but specify -r, will it iterate over all packages in the module or only over packages in subfolders contained in the current working directory?

IIRC, you do not pass a package in the recursive mode, but rather, the project root. So, I guess we would iterate through all packages in that directory. I think we need to think about what should be the roles of the project root and module root (should they be unified somehow?).

@jcp19
Copy link
Contributor

jcp19 commented Apr 23, 2025

@ArquintL this is marked as a draft PR but you asked for the review. Should I provide feedback on the implementation already? regarding the cli options, I would opt for one of the following:

  1. make it an error to provide additional flags when --config <path> is passed
  2. whatever flags passed on the CI would override those in the config (this is useful if you want to mostly reuse the config, but try out different options quickly), and I would produce a message saying that this is the case

After additional consideration, I am happy with @ArquintL general proposal, but I would rather choose one of the behaviours I proposed above for the case when we provide more CLI flags in addition to --config. In addition, we should probably show to the user which packages are being verified when the --config option is used, to avoid confusing the user (for example, because they passed the wrong file to a json file). I will proceed with the code review.

Copy link
Contributor

@jcp19 jcp19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I reviewed up to src/main/scala/viper/gobra/frontend/Config.scala. I will continue the review later

case None => config
case Some(p) => config.copy(includeDirs = config.includeDirs.map(
// it's important to convert includeDir to a string first as `path` might be a ZipPath and `includeDir` might not
includeDir => p.toAbsolutePath.resolve(includeDir.toString)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do I see it correctly that we are only resolving the include directories here? What about the other parameters that may contain paths? (e.g., --includePackages or -i)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now disallowing "-i" and "-d" in "other" as one can already express these using the appropriate field and I didn't wanted to bother about correctly resolving these path as these inputs are turned into a packageInfoInputMap

Comment on lines +84 to +98
object CliEnumConverter {
trait EnumCase {
def value: String
}

/** trait for an enum value to configure parsing and serialization of this enum */
trait CliEnum[E <: EnumCase] {
def values: List[E]
def convert(s: String): E = values.find(_.value == s).getOrElse(Violation.violation(s"Unexpected value: $s"))
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks fine to me. An alternative would be to use the builtin Enumeration type in Scala (https://www.scala-lang.org/api/2.13.15/scala/Enumeration.html) which may lead to a shorter implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. However, ViperBackend and MoreJoins carry additional fields which are useful but do not seem to be supported by Enumeration (at least according to Claude)

val logLevel: Level = ConfigDefaults.DefaultLogLevel
val debug: Boolean = Level.DEBUG.isGreaterOrEqual(logLevel)
BaseConfig(
gobraDirectory = ConfigDefaults.DefaultGobraDirectory,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few options that always get the default value (instead of being configurable). Is there a reason for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct. However, these options can still be changed via the other field in a JSON configuration. Parsing and merging in of options in an other field happens afterwards

printVpr = c.job_cfg.print_vpr.getOrElse(debug),
streamErrs = !ConfigDefaults.DefaultNoStreamErrors),
backend = c.job_cfg.backend.map(_.underlying),
isolate = ConfigDefaults.DefaultIsolate,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean that chopping never works if we provide a json config?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now made chop a job JSON config option too

Comment on lines 670 to 672
// at this point, we do not know anymore from which config file these arguments come from.
// Thus, we do not resolve potential paths.
Config.parseCliArgs(otherArgs, None).map(Some(_))
Copy link
Contributor

@jcp19 jcp19 Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things get a bit tricky here, but wouldn't it be easier to resolve all paths before merging?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we want to disallow passing options that take paths in "other"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the implementation accordingly

@ArquintL
Copy link
Member Author

ArquintL commented Feb 6, 2026

@jcp19 Sorry this got quite ugly in the end ^^
The problem was that overriding default job options by job-specific options requires knowing which job-specific options the user actually specified (as opposed to all default ones).
Hence, I introduce InputConfig that corresponds to a directly translation from ScallopGobraConfig, i.e., before we apply any of the complicated transformations to obtain Config. In InputConfig, all non-provided options are simply None and we can perform whatever merge or override operation we want (which btw is not possible on a ScallopGobraConfig as ScallopOptions are immutable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants