storage: the userns idmap copy layer code path still runs under the locks #506

@Luap99

When running a big container image with a custom userns, it takes a long time to create the ID-mapped copy layer, and all of that happens while holding layers.lock and images.lock, so it blocks many other commands from doing anything in the meantime.

$ podman pull ghcr.io/home-assistant/home-assistant:stable
$ podman run --rm --userns keep-id ghcr.io/home-assistant/home-assistant:stable true

Using my bpftrace script from #378 (comment) shows that this takes about 50s (see the lock_max values for layers.lock and images.lock at the end of the output):

@lock_duration[containers.lock]:
[0]                   24 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1]                    1 |@@                                                  |

@lock_duration[images.lock]:
[0]                  228 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1]                    1 |                                                    |
[2, 4)                 0 |                                                    |
[4, 8)                 0 |                                                    |
[8, 16)                0 |                                                    |
[16, 32)               0 |                                                    |
[32, 64)               0 |                                                    |
[64, 128)              0 |                                                    |
[128, 256)             0 |                                                    |
[256, 512)             0 |                                                    |
[512, 1K)              0 |                                                    |
[1K, 2K)               0 |                                                    |
[2K, 4K)               0 |                                                    |
[4K, 8K)               0 |                                                    |
[8K, 16K)              0 |                                                    |
[16K, 32K)             0 |                                                    |
[32K, 64K)             1 |                                                    |

@lock_duration[layers.lock]:
[0]                  549 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1]                    2 |                                                    |
[2, 4)                 0 |                                                    |
[4, 8)                 0 |                                                    |
[8, 16)                0 |                                                    |
[16, 32)               0 |                                                    |
[32, 64)               0 |                                                    |
[64, 128)              0 |                                                    |
[128, 256)             0 |                                                    |
[256, 512)             0 |                                                    |
[512, 1K)              0 |                                                    |
[1K, 2K)               0 |                                                    |
[2K, 4K)               0 |                                                    |
[4K, 8K)               0 |                                                    |
[8K, 16K)              0 |                                                    |
[16K, 32K)             0 |                                                    |
[32K, 64K)             1 |                                                    |

@lock_duration[storage.lock]:
[0]                  552 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1]                    1 |                                                    |

@lock_max[storage.lock]: 1
@lock_max[containers.lock]: 1
@lock_max[layers.lock]: 49194
@lock_max[images.lock]: 49194

#378 doesn't touch this part, so it didn't help here. I am not sure how much of the changes there can be reused for this id-map copy code path, but I think this is likely the next obvious bottleneck that needs fixing, because holding locks for such long durations is not good.

Labels: storage (Related to "storage" package)