Tickets #2325 & #4480: cd into directories with long names and/or embedded \n #4804
base: master
ceeb70e to d6b260d
There's a regression with tcsh: previously it could enter directories with a non-alphanumeric UTF-8 symbol (e.g. heart [...] Let me play around a little bit with tcsh to see if I can fix this. Let's hold off on this PR for now.
ossilator
left a comment
3rd commit msg: typo in 'platforms'
57e9ac6 to d267252
Back to tcsh: I think I can do either of these two things:
So it's either-or: invalid UTF-8 but no newlines, or newlines but not invalid UTF-8. And no, I'm not going to implement both and pick at runtime so that we only fail if a path contains both :) Which one do you guys vote for?
tcsh: I was wrong, command substitution isn't binary safe either. So, without terrible hacks, I can get newlines working but not invalid UTF-8. To get invalid UTF-8 working, I think this would do it:
and when a command completes and tcsh sends its working directory to mc, invoke the external [...] I'm not gonna do this, it's just not worth it.
My thinking is that if I had to choose, I'd take newlines. |
d267252 to eefcf8c
Yup, I'm going with newlines. New commit pushed to fix the regression with tcsh. How confident are we that placing unquoted bytes 128..255 in the command line is safe in every shell, i.e. that they don't have any special meaning to the shell?
eefcf8c to 013cd52
New ideas for fully fixing [...] But let's not do everything in this PR, let's leave that for another day. Pushed an unchanged version, just rebased. Please review.
zyv
left a comment
As far as I'm concerned, I'm satisfied with the current state. I have to admit that I have neither looked at the code too closely nor tested it on my machines, but I really like the new structure and the documentation around it.
Just as a general comment, if I leave line comments with just one statement (and not more than one sentence), it looks better to me to omit the final full stop. Which is of course not pursued consistently throughout the whole codebase due to age.
013cd52 to 051cf0f
/rebase
I myself am inconsistent with that, too. Also whether to begin with uppercase or lowercase. Fixed one such occurrence. In the last commit, there's a one-liner and a two-liner that I'd like to keep consistent, so I kept the trailing dot. |
|
/rebase |
051cf0f to 2ae16ca
As I've got the bot to rebase this one and had another look, I've noticed 2 [...]
As to the buffer fixme, my understanding is that our buffer is set to something like [...] In any case, I would remove the word "fixme" here and adjust the comment to that effect (saying that it's unlikely, because our buffer size is about this big). The comment itself is valuable if some problem arises and has to be debugged, but "fixme" somehow means to me that there is an immediate problem to take care of.
As far as the pipes are concerned, I tried to understand how bad that is. Apparently it isn't bad, as the default seems to be around 64K. Still, I think it would be nice to adjust the comment to say something like that; otherwise one has no feeling at all for how bad that is. And given the findings above, maybe say that the other limit will be hit earlier than the pipe buffer, and remove "fixme" on these grounds?
https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer
2ae16ca to 46bdf72
(the previous push was a rebase only) |
0898e82 to a2a4846
Softened both to "Note" and slightly expanded one of the comments. (By the way, for me "fixme" comments are exactly these long-term super-low-priority things. Anything more important or more urgent deserves a separate issue. But it's a matter of taste and project conventions.)
Added another commit to fix #4884 for sh and fish (not tcsh though).
Sorry guys that it took me too long to get back to this. @mc-worker @zyv could you please do another round of review?
/*
 * Fallback / POSIX sh: Construct a command like this:
 *
 * _mc_newdir_=`printf '%b_' 'ABC\0oooDEF\0oooXYZ'` # ooo are three octal digits
i wouldn't take bets that this works as a fallback for really old shells, as they were funky with quoting inside backticks.
how about mc_newdir=`cat <&$cd_pipe`? this would also significantly simplify and unify the code between shells. concept-wise, it would be quite complementary to the pwd query.
note that we can't cat the tty itself, as that would have the same cooked buffer size limitation, only without a way to express line continuations.
ah, well, whatever ...
I don't think we support the old Bourne shell, nor anything old-fashioned like that. We only support a fixed list of shells, something like ash(busybox) bash dash fish ksh mksh tcsh zsh, and I'm fairly certain I tested my change with all of them. (Also note that in precmd_fallback we already use $(...) rather than backticks, which as far as I know also didn't work in really old shells, and no one complained so far.)
I'm afraid that cat <&$cd_pipe wouldn't work with unnamed pipes: cat would wait for an EOF that would never arrive. It could work with named pipes or plain files, but I really don't want to refactor to using one of them in this PR.
Also, I wouldn't want to complicate the architecture, it's already complicated enough. Whenever mc needs to get some data from the subshell, it's understandable that we need side channels like a separate fd. Here it's the opposite direction: mc wants to send a command to the subshell, wants to tell it to change to a directory. It sounds ridiculous that it'd need a side channel for that, not being able to directly say so. If that's the case, I'd rather just limit support to decent-looking paths where the simplest cd '/over/there'-like commands work, or even claim that we don't support such shells.
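For readers who want to try the fallback command from the code comment above, here is a minimal sketch. Only the `printf '%b_'` assignment comes from the PR; the scratch directory, the example name "a\nb\n" (012 is the octal code for newline), the sentinel stripping with `${...%_}` and the final cd are assumptions about how the remainder of the command might look.

```shell
# Scratch area for the demonstration (hypothetical, not part of mc).
base=$(mktemp -d)

# Same shape as the command in the code comment: %b expands the \0ooo
# escapes, and the '_' sentinel keeps the trailing newline from being
# stripped by the command substitution.
_mc_newdir_=`printf '%b_' 'a\0012b\0012'`
# Assumed follow-up step: drop the sentinel again.
_mc_newdir_=${_mc_newdir_%_}

# Create the directory "a<newline>b<newline>" and enter it.
mkdir "$base/$_mc_newdir_"
cd "$base/$_mc_newdir_" && entered=$PWD

# Clean up the scratch area.
cd / && rm -rf "$base"
```

Because the escaped string contains only printable ASCII, nothing inside the backticks needs additional quoting beyond the single quotes.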
    escape_fmt = "\\0%03o";
}

const int quote_cmd_start_len = strlen (quote_cmd_start);
i wonder whether the compiler is smart enough to rewrite that as constant assignments inside the branches? probably not.
[...]
nope, despite being smart enough to use table lookups. both gcc and clang.
(i was just curious. ignore me.)
contrib/mc-wrapper.sh.in
Outdated
if test -r "$MC_PWD_FILE"; then
MC_PWD="`cat "$MC_PWD_FILE"`"
# Command substitution removes final '\n's, hence the added and later removed '_'.
s/final/trailing/
Can you please elaborate on the subtle details?
I can already see two final '\n's in that file, and also final byte, final whitespace elsewhere in mc. Several "how to remove final characters from string"-like web pages. Possibly not by native English speakers, I don't know. How bad is it to say "final character"?
This PR also adds final '\n' at three different places. At one of them changing it to trailing would introduce an ugly linebreak :)
Can you please elaborate on the subtle details?
to me, "final" sounds distinctly temporal, while we want a positional meaning, preferably using terminology that is conventional for strings.
This PR also adds final '\n' at three different places. At one of them changing it to trailing would introduce an ugly linebreak :)
i'm assuming that you are bright enough to extrapolate the conclusion. ;-)
i'm assuming that you are bright enough to extrapolate the conclusion. ;-)
What I forgot to add in my last commit is that possibly I chose the shorter word to fit the comment in one line :) I can't remember.
Anyway, changed to "trailing". Instead of the linebreak I decided it's pointless to link to the bug.
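The behaviour behind the comment discussed in this thread can be reproduced in isolation. This is a small sketch, not the actual mc-wrapper code: the temporary file and the variable names `plain` and `guarded` are made up for the demonstration.

```shell
# A file whose content ends in two newlines (hypothetical example data).
tmp=$(mktemp)
printf '/tmp/dir\n\n' > "$tmp"

# Plain command substitution strips all trailing newlines.
plain=$(cat "$tmp")

# Appending a sentinel '_' inside the substitution protects them...
guarded=$(cat "$tmp"; printf '_')
# ...and the sentinel is removed again afterwards.
guarded=${guarded%_}

rm -f "$tmp"
```

After this runs, `plain` holds the 8 characters of `/tmp/dir`, while `guarded` holds 10 characters because both newlines survived.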
a2a4846 to 2a706a6
… the subshell If the subshell writes the working directory slowly, previously we could read its beginning and stop there. Signed-off-by: Egmont Koblinger <egmont@gmail.com>
This piece of code was never live in mc. It would work around a BusyBox bug that was fixed in 2012. Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…ng a directory with special characters Handle trailing '\n' character in the directory name. Make sure to construct the cd command in physical lines no longer than 250 bytes so that we don't hit the small limit of the kernel's cooked mode tty buffer size on some platforms. tcsh still has problems entering directories with special characters (including invalid UTF-8) in their name. Other shells are now believed to handle any directory name properly. Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…ibyte UTF-8 Don't escape safe shell characters commonly used in paths, such as '/', '.', '-' and '_'. Don't escape multibyte UTF-8 characters. Escaping each byte separately in string assignments doesn't work in tcsh. The previous commit introduces a regression here: tcsh cannot enter directories whose name is valid UTF-8 but contains non-alphanumeric UTF-8 characters. It used to work because printf would glue them together correctly, but we no longer use printf and command substitution because that breaks newlines. Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…aracters Signed-off-by: Egmont Koblinger <egmont@gmail.com>
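The escaping rule described in the commit message above ("Don't escape safe shell characters [...] Don't escape multibyte UTF-8 characters") can be sketched in shell as follows. This is an illustration only, not mc's C implementation: `is_safe` and `escape_path` are hypothetical names, the exact safe set (ASCII alphanumerics plus '/', '.', '-', '_') is an assumption, and so is passing every byte in the range 128..255 through unchanged.

```shell
# is_safe and escape_path are hypothetical helper names, not taken from mc.

# Succeeds if the byte with decimal value $1 may be passed through unescaped.
is_safe() {
    [ "$1" -ge 128 ] && return 0                      # assumed: bytes 128..255 pass through raw
    [ "$1" -ge 48 ] && [ "$1" -le 57 ] && return 0    # 0-9
    [ "$1" -ge 65 ] && [ "$1" -le 90 ] && return 0    # A-Z
    [ "$1" -ge 97 ] && [ "$1" -le 122 ] && return 0   # a-z
    case $1 in
        45 | 46 | 47 | 95) return 0 ;;                # - . / _
    esac
    return 1
}

# Print $1 with every unsafe byte replaced by a \0ooo octal escape.
escape_path() {
    # od prints each byte of the input as a three-digit octal number.
    for oct in $(printf '%s' "$1" | od -An -v -t o1); do
        if is_safe "$((0$oct))"; then
            printf "\\$oct"          # emit the byte itself
        else
            printf '%s' "\\0$oct"    # emit the escape sequence literally
        fi
    done
}
```

The output is meant to be fed to `printf '%b'`, so `printf '%b' "$(escape_path 'x y')"` should give back `x y`.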
2a706a6 to 2de5125
Proposed changes
More robust reading of pwd, namely:
\n #4884
Checklist
👉 Our coding style can be found here: https://midnight-commander.org/coding-style/ 👈
git commit --amend -s
make indent && make check