Skip to content

Conversation

@LoicRiegel
Copy link
Contributor

@LoicRiegel LoicRiegel commented May 4, 2025

Hi everyone,
There are a lot of new features in this PR. Sorry it will be a lot to review, it would have probably been better to make multiple smaller PRs, but I got excited, so I did way more than I expected in the first place.
Of course, don't hesitate to test it on your side too!

So thanks in advance to all reviewers.
Again, I'm fairly new to rust, so any suggestion to improve the code or make it more idiomatic are very welcome :)

Issue

No linked issues.

Description of changes

Reading configuration file now supported (was todo)

After those changes, nufmt will take the configuration this order:

  1. The one provided by the CLI, if --config is used
  2. Look for a file called "nufmt.nuon" in the current directory or any of the parent directory
  3. If nothing is found, use the default

New "check mode"

Inspired by the ruff formatter, you can now run nufmt with --check, and will just tell you if files need formatting, but will not write the fixes.

Possibility to exclude some files

You can now add some patterns to exclude in the configuration file and the file discovery is handled by the ignore crate, which uses walkdir.

Exit codes

Inspired by the ruff formatter again, the exit codes are

  • 0 if files were already good, or if all were successfully formatted
  • 1 only in check mode and if at least one file needs reformatting
  • 2 a failure occurred: config file can't be parsed, cannot write to file (not found...), or found garbage in any of the .nu files

Improved what was printed in stdout and stderr

See below for screenshots

Docs and tests

I did not cover 100%, but I added in:

  • README
  • docstrings
  • unit and integration tests

Performance improvement

Used rayon to format files in parallel.

Dependencies

  • upgraded nu to 0.104.0
  • Added quite some dependencies:
    • nuon to read the config file
    • ignore for file discovery
    • colored for terminal output
    • thiserror for error handling, but removed anyhow as it wasn't used
    • rayon for parallel processing
    • rstest and tmpfile for tests

How does it look?

Various use cases demo:
image

Demo of excluding files:
image

README.md Outdated

### Options

- `--check` if you just want to check files without formatting them. It cannot be combined with stdin.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--check is typically called --dry-run in cli's. i can live with either though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have a look at other code formatters, see what they use

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also like --dry-run, but it looks like a lot of formatters use --check:

  • Use --check: black, ruff, prettier, rustfmt
  • Use --dry-run: clang-format, gofmt

Both are fine by me, although maybe --dry-run could make the code easier.
What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AucaCoyan any preference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference at all! both are also used in other CLI tools (non-formatters) so both ways are good for me!

Whatever suits you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced --check by --dry-run. I had a slight preference for this one. Thanks for the suggestion @fdncred !

@fdncred
Copy link
Contributor

fdncred commented May 4, 2025

I skimmed through this and it's probably fine. It's just adding some tooling around the "meat" of the code which is the formatting. I pointed out a couple things like colored vs nu-ansi-term in a few places. The formatting needs some serious work. Hopefully we can get that fixed too at some point. Thanks!

@LoicRiegel
Copy link
Contributor Author

Yes this does not change anything in the formatting, except that if garbage is detected is one of the file, then we exit with failure.

But if you have some time to dig into the implementation, I would still appreciate some feedback:)

I wanted to do this first, but I'm up to improve the formatting next. But I just wasn't sure, there is a ticket tracking the support for each SyntaxShape, so this is something to be done? Cause' I also heard the new nu parsed mentioned, so I didn't know if we were waiting for it to be ready before moving on here... But if the new parser won't be ready before months, then I guess it does not make sense to wait...

@fdncred
Copy link
Contributor

fdncred commented May 4, 2025

The new-nu-parser won't be done for months unless someone decides to work on it regularly. I wouldn't even worry about that parser. If it ever gets done, we can try to make it fit here again. This repo is still kind of an experiment to see if someone could use the current parser as a formatter. The jury is still out on that one. Please feel free to continue working here.

@LoicRiegel
Copy link
Contributor Author

Okay, yeah then I'll continue here, it's a good playground:)

Does the current parser hold comments? If not, that could could be a serious blocker...

@AucaCoyan
Copy link
Contributor

Hi!

Does the current parser hold comments? If not, that could could be a serious blocker...

No, it doesn't. I wrote most the the current (awful) logic, and I added specific logic exactly because of the comments not being read by the parser.
In a vec of bytes, you have some slices which correspond to Spans, but if you add up all the spans, some bytes (characters) are left out. That's the way of finding a comment.
The new-nu-parser is written with having comments in mind, so we can parse them (for documentation, formatting, or anything).

In this line I say:

  • after you checkout the start of the span, if you have some bytes that are skipped. For example bytes starting at 10 and span starting at 25) check if in between 10 and 25 is a # sign. In a positive case, write the comment to the output. Otherwise, skip the contents.
    One possible case of bytes skipped is if you have 15 times \n new lines, for example. The formatter should leave just 1 new line

I hope I could explain myself clear enough! The code is an spaghetti mess 😢

@LoicRiegel
Copy link
Contributor Author

Okay waw, what a workaround 😅
Have you looked at other solutions? I'm thinking, can we add support for comments to the current parser? But maybe you.ve thought about that already...

Or, if your workaround is working, it'd be fine to keep it? Does it detect accurately all comments?

And even if we keep it, you're saying that we should improve this so it's less entangleed in the formatting logic, is that right?

@AucaCoyan
Copy link
Contributor

AucaCoyan commented May 4, 2025

Have you looked at other solutions?

Mmmm, I don't really remember looking out for other solutions. I believe I wrote the first one that worked and get along with it.

can we add support for comments to the current parser?

Sure! It can be added, but also a lot of improvements should be placed also in the parser, that's why the new nu parser was started way back ago. I think it is a long and hard task, so to get the most value it would be better to replace the current parser with a better parser, not just the one that keeps the comments. Take into consideration that what I say is from not a very deep perspective over the current parser. I have done very little work on both, both current and new parsers, so take this with a grain of salt.

if your workaround is working, it'd be fine to keep it? Does it detect accurately all comments?

Yeah! it works really good so far. I never encountered a sample that doesn't work. I can't promise is the most efficient and reliable though. For the formatter speed, I think it's enough 🤔.

you're saying that we should improve this so it's less entangleed in the formatting logic,

Correct. Maybe the comments detection and work is not really bad. What really smells is the match statement, I've done it with very little knowledge of Rust and data structures. It's hard to maintain and understand it just by reading it, That is why I say it's not my brightest moment 😅.

I can share a blogpost that I found really valuable in the day:
the hardest program I've ever written. It's from Bob Nystrom, he work on the Dart language and it's formatter, dart fmt

Shoot any further questions!

@LoicRiegel
Copy link
Contributor Author

I am not familiar with the current or the nu parser either! But from the comment of fdncred above, it looks like the new nu parser is far from ready, so I think we shouldn't rely on it here, and see what we can do with the current one.

My proposition is:

  1. We try to keep your workaround. Maybe we try to improve the structure or something if needed. It sure is fast enough, and if you say it's accurate, then it's the simplest solution. I was asking about the README says something like "warning: comment might get lost" ^^
  2. If we hit some limitations, and if those could be fixed or mitigated by improving the current parser, we should consider it.

Thanks for the article, I'll read it :)

@LoicRiegel
Copy link
Contributor Author

Should be good to merge

@fdncred fdncred merged commit e6b2531 into nushell:main May 8, 2025
5 checks passed
@fdncred
Copy link
Contributor

fdncred commented May 8, 2025

Thanks

@LoicRiegel LoicRiegel deleted the feat/cli-and-parallel branch May 8, 2025 22:35
132ikl pushed a commit that referenced this pull request May 23, 2025
- **flake: Update inputs**
- **fix(direnv): Specify shell w/shebang + use direnv stdlib + use
correct syntax**
- **flake: Simplify input dependencies**
- **refactor(flake): Add formatter for nix + tidy up flake**
- **fix(flake): Nix builds release + always specifies target triple**

## Description of changes
- Updated inputs used in nix flake (tool that defines the nix package of
nufmt)
- Added formatter for nix flake/nix code
- Simplified nix flake's inputs/dependencies (synchronized and
deduplicated)
- Tidied up/fixed .envrc file/direnv
- Fixed build broken by new tests added in #69

## Relevant Issues
None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants