Daemon exit status available in the API #15

simon-batardiere · 2020-06-04T23:04:44Z

No description provided.

zhichengqiu

LGTM

mitja-plume

How is the field supposed to be accessed? Directly via the daemon_t structure?

If so, I suggest adding an accessor function. Also, the status must have a clear value representing that the process is still running (ie. the current status is not available because the process is still running). This being 0 is OK with me, but it needs to be documented in the header of the accessor function.

simon-batardiere · 2020-06-12T23:11:29Z

The previous commit add an API to get the exit status.
I don't find an easy way to know if the process is running or not. Because dn_enabled could be set to true in case of restart and doesn't really reflect if the process is running or not.

lkristan-plume

I think this is going in wrong direction. The idea of the daemon module is to take as much of the burden off of the caller as possible. The module itself should (and does) log all the details about the exit status.

Judging by the code in https://github.com/plume-design/opensync/pull/16/files#diff-053599008b20818e4dfcc63a39ad1221R73, the implementation of the 'atexit' callback merely tries to log something that was already logged.

As it is, the implementation of pwm_DHCP_option82_at_exit() should probably be:

    (void)daemon;
    LOGE("DHCP option82 process exited unexpectedly");
    return true;

... which means this PR (#15) is not needed at all.

AFAICS, the daemon module does not call the callback if the caller requests the daemon to stop. Therefore it makes sense to log an error whenever the callback is called. After all, even if the process exited gracefully, the caller expected it to be running, therefore this is an undesired situation.

But the main purpose of this callback is not logging, rather it is to take a suitable action (e.g. stop advertising a service that depends on this deamon).

lkristan-plume · 2020-06-15T12:42:37Z

I see some room for improvement here:

https://github.com/plume-design/opensync/pull/15/files#diff-3ff719e6313ae538b8df107fa799caeeR58
... where the comment should reflect what the callback is really for.

... here:

https://github.com/plume-design/opensync/pull/15/files#diff-6a0c177f1dce309d321ec6382d55b041R443
... where we could use LOG_SEVERITY_WARN when a process exits gracefully, but was not actually asked to stop:

        if (WEXITSTATUS(wstatus) != 0)
        {
            ls = LOG_SEVERITY_ERR;
        }
        else if (self->dn_enabled)
        {
            ls = LOG_SEVERITY_WARN;
        }

... and here:

https://github.com/plume-design/opensync/pull/15/files#diff-6a0c177f1dce309d321ec6382d55b041R506
... where we could log something like

        LOG(INFO, "Daemon %s failed to start (or exited unexpectedly).", self->dn_exec);

The level is intentionally set to INFO. That way if the atexit callback can log a better error, the log backtrace will only be dumped once.

Exit status of a daemon could be useful for managers to know if the daemon executed sucessfully or not, and can be used then in some logic. This commit makes the exit status of the daemon availble in the API. Signed-off-by: Simon Batardiere <simon.batardiere@sagemcom.com>

Create a getter in the daemon API to hide the daemon structure details outside the API. Signed-off-by: Simon Batardiere <simon.batardiere@sagemcom.com>

simon-batardiere · 2020-06-16T23:03:46Z

I think this is going in wrong direction. The idea of the daemon module is to take as much of the burden off of the caller as possible. The module itself should (and does) log all the details about the exit status.

Judging by the code in https://github.com/plume-design/opensync/pull/16/files#diff-053599008b20818e4dfcc63a39ad1221R73, the implementation of the 'atexit' callback merely tries to log something that was already logged.

As it is, the implementation of pwm_DHCP_option82_at_exit() should probably be:
    (void)daemon;
    LOGE("DHCP option82 process exited unexpectedly");
    return true;
... which means this PR (#15) is not needed at all.

AFAICS, the daemon module does not call the callback if the caller requests the daemon to stop. Therefore it makes sense to log an error whenever the callback is called. After all, even if the process exited gracefully, the caller expected it to be running, therefore this is an undesired situation.

But the main purpose of this callback is not logging, rather it is to take a suitable action (e.g. stop advertising a service that depends on this deamon).

Hi Lars, I see two usages of the daemon API:

Run a daemon
Run a process which takes a very long time to finish (for example, a curl command to download a big file).

I agree with you in the context of the daemon, and your comment on the PR for PWM is right. But in the case of a long command, the daemon API is the only valid option. If we use the other APIs, like cmd_log, we will block the event loop for a while, which is always a bad thing. So here we also need to use the daemon API. And the caller may be interested by the exit status of the long command (like the curl). This is why this PR is still useful.

lkristan-plume · 2020-07-07T14:18:11Z

Hi Simon,

Regarding the two usages:

Your example of downloading a file will be invalidated in near future with the introduction of an API specifically for that purpose.
We are also considering to introduce a library for async process execution (which the daemon library could then use internally).

Considering the above, and without an immediate need for this change, I am reluctant to give it my approval.

Regardless, here are some additional comments:

1 - @ https://github.com/plume-design/opensync/pull/15/files#diff-6a0c177f1dce309d321ec6382d55b041R417

    self->dn_exit_status = WEXITSTATUS(wstatus);

Return value of WEXITSTATUS may be undefined here (e.g. if the process was signaled).

A possible solution would be to use:

    if (WIFSIGNALED(wstatus))
    {
        self->dn_exit_status = -(WTERMSIG(wstatus));

...

    else if (WIFEXITED(wstatus))
    {
        self->dn_exit_status = WEXITSTATUS(wstatus);

...

    else
    {
        self->dn_exit_status = wstatus;

2 - daemon_get_exit_status() could theoretically be called not only from within the callback, but at any time, asynchronously.
We could say this is not really supported, but it would still be nice to initialize dn_exit_status in daemon_init() (say to INT_MIN), and daemon_get_exit_status() could actually check that value and return false in that case.

Alternatively, one could use -1 for 'still running', and '-2' instead of -(WTERMSIG(wstatus)) above. Not super clean, but should suffice for typical use.

damjan-we requested a review from lkristan-plume June 5, 2020 08:36

zhichengqiu reviewed Jun 5, 2020

View reviewed changes

damjan-we requested a review from mitja-plume June 8, 2020 07:45

mitja-plume reviewed Jun 8, 2020

View reviewed changes

lkristan-plume suggested changes Jun 15, 2020

View reviewed changes

Simon Batardiere added 2 commits June 15, 2020 19:03

daemon: Daemon exit status available in the API

6d64c08

Create a getter in the daemon API to hide the daemon structure details outside the API. Signed-off-by: Simon Batardiere <simon.batardiere@sagemcom.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Daemon exit status available in the API #15

Daemon exit status available in the API #15

Uh oh!

simon-batardiere commented Jun 4, 2020

Uh oh!

zhichengqiu left a comment

Uh oh!

mitja-plume left a comment

Uh oh!

simon-batardiere commented Jun 12, 2020 •

edited

Loading

Uh oh!

lkristan-plume left a comment

Uh oh!

lkristan-plume commented Jun 15, 2020

Uh oh!

simon-batardiere commented Jun 16, 2020

Uh oh!

lkristan-plume commented Jul 7, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Daemon exit status available in the API #15

Are you sure you want to change the base?

Daemon exit status available in the API #15

Uh oh!

Conversation

simon-batardiere commented Jun 4, 2020

Uh oh!

zhichengqiu left a comment

Choose a reason for hiding this comment

Uh oh!

mitja-plume left a comment

Choose a reason for hiding this comment

Uh oh!

simon-batardiere commented Jun 12, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lkristan-plume left a comment

Choose a reason for hiding this comment

Uh oh!

lkristan-plume commented Jun 15, 2020

Uh oh!

simon-batardiere commented Jun 16, 2020

Uh oh!

lkristan-plume commented Jul 7, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

simon-batardiere commented Jun 12, 2020 •

edited

Loading