Skip to content

Conversation

@simon-batardiere
Copy link

No description provided.

@damjan-we damjan-we requested a review from lkristan-plume June 5, 2020 08:36
Copy link

@zhichengqiu zhichengqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@damjan-we damjan-we requested a review from mitja-plume June 8, 2020 07:45
Copy link
Contributor

@mitja-plume mitja-plume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the field supposed to be accessed? Directly via the daemon_t structure?

If so, I suggest adding an accessor function. Also, the status must have a clear value representing that the process is still running (ie. the current status is not available because the process is still running). This being 0 is OK with me, but it needs to be documented in the header of the accessor function.

@simon-batardiere
Copy link
Author

simon-batardiere commented Jun 12, 2020

The previous commit add an API to get the exit status.
I don't find an easy way to know if the process is running or not. Because dn_enabled could be set to true in case of restart and doesn't really reflect if the process is running or not.

Copy link
Contributor

@lkristan-plume lkristan-plume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going in wrong direction. The idea of the daemon module is to take as much of the burden off of the caller as possible. The module itself should (and does) log all the details about the exit status.

Judging by the code in https://github.com/plume-design/opensync/pull/16/files#diff-053599008b20818e4dfcc63a39ad1221R73, the implementation of the 'atexit' callback merely tries to log something that was already logged.

As it is, the implementation of pwm_DHCP_option82_at_exit() should probably be:

    (void)daemon;
    LOGE("DHCP option82 process exited unexpectedly");
    return true;

... which means this PR (#15) is not needed at all.

AFAICS, the daemon module does not call the callback if the caller requests the daemon to stop. Therefore it makes sense to log an error whenever the callback is called. After all, even if the process exited gracefully, the caller expected it to be running, therefore this is an undesired situation.

But the main purpose of this callback is not logging, rather it is to take a suitable action (e.g. stop advertising a service that depends on this deamon).

@lkristan-plume
Copy link
Contributor

I see some room for improvement here:

... here:

        if (WEXITSTATUS(wstatus) != 0)
        {
            ls = LOG_SEVERITY_ERR;
        }
        else if (self->dn_enabled)
        {
            ls = LOG_SEVERITY_WARN;
        }

... and here:

        LOG(INFO, "Daemon %s failed to start (or exited unexpectedly).", self->dn_exec);

The level is intentionally set to INFO. That way if the atexit callback can log a better error, the log backtrace will only be dumped once.

Simon Batardiere added 2 commits June 15, 2020 19:03
Exit status of a daemon could be useful for managers to know if the daemon
executed sucessfully or not, and can be used then in some logic.
This commit makes the exit status of the daemon availble in the API.

Signed-off-by: Simon Batardiere <simon.batardiere@sagemcom.com>
Create a getter in the daemon API to hide the daemon structure
details outside the API.

Signed-off-by: Simon Batardiere <simon.batardiere@sagemcom.com>
@simon-batardiere
Copy link
Author

I think this is going in wrong direction. The idea of the daemon module is to take as much of the burden off of the caller as possible. The module itself should (and does) log all the details about the exit status.

Judging by the code in https://github.com/plume-design/opensync/pull/16/files#diff-053599008b20818e4dfcc63a39ad1221R73, the implementation of the 'atexit' callback merely tries to log something that was already logged.

As it is, the implementation of pwm_DHCP_option82_at_exit() should probably be:

    (void)daemon;
    LOGE("DHCP option82 process exited unexpectedly");
    return true;

... which means this PR (#15) is not needed at all.

AFAICS, the daemon module does not call the callback if the caller requests the daemon to stop. Therefore it makes sense to log an error whenever the callback is called. After all, even if the process exited gracefully, the caller expected it to be running, therefore this is an undesired situation.

But the main purpose of this callback is not logging, rather it is to take a suitable action (e.g. stop advertising a service that depends on this deamon).

Hi Lars, I see two usages of the daemon API:

  • Run a daemon
  • Run a process which takes a very long time to finish (for example, a curl command to download a big file).

I agree with you in the context of the daemon, and your comment on the PR for PWM is right. But in the case of a long command, the daemon API is the only valid option. If we use the other APIs, like cmd_log, we will block the event loop for a while, which is always a bad thing. So here we also need to use the daemon API. And the caller may be interested by the exit status of the long command (like the curl). This is why this PR is still useful.

@lkristan-plume
Copy link
Contributor

Hi Simon,

Regarding the two usages:

  • Your example of downloading a file will be invalidated in near future with the introduction of an API specifically for that purpose.
  • We are also considering to introduce a library for async process execution (which the daemon library could then use internally).

Considering the above, and without an immediate need for this change, I am reluctant to give it my approval.

Regardless, here are some additional comments:

1 - @ https://github.com/plume-design/opensync/pull/15/files#diff-6a0c177f1dce309d321ec6382d55b041R417

    self->dn_exit_status = WEXITSTATUS(wstatus);

Return value of WEXITSTATUS may be undefined here (e.g. if the process was signaled).

A possible solution would be to use:

    if (WIFSIGNALED(wstatus))
    {
        self->dn_exit_status = -(WTERMSIG(wstatus));

...

    else if (WIFEXITED(wstatus))
    {
        self->dn_exit_status = WEXITSTATUS(wstatus);

...

    else
    {
        self->dn_exit_status = wstatus;

2 - daemon_get_exit_status() could theoretically be called not only from within the callback, but at any time, asynchronously.
We could say this is not really supported, but it would still be nice to initialize dn_exit_status in daemon_init() (say to INT_MIN), and daemon_get_exit_status() could actually check that value and return false in that case.

Alternatively, one could use -1 for 'still running', and '-2' instead of -(WTERMSIG(wstatus)) above. Not super clean, but should suffice for typical use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants