This repository was archived by the owner on Feb 4, 2020. It is now read-only.
Resolve code duplication #225
Merged
Conversation
This centralises the code for creating a system-wide lock given some path. The function also respects the lock timeout environment variable. I factored this code out into a separate method since I plan to introduce more, finer-grained locks which will use much the same logic.
The plan is that these locks synchronise access to an individual section of the cache.
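The centralised lock creation could look roughly like the following. This is only a minimal, process-local sketch of the idea: the real lock is system-wide (inter-process), and the environment variable and function names here are assumptions, not taken from the actual code base.

```python
import os
import threading

DEFAULT_LOCK_TIMEOUT_MS = 10 * 1000

def lockTimeoutMs():
    # Respect a lock timeout environment variable, falling back to a
    # default. The variable name is illustrative.
    value = os.environ.get("CACHE_LOCK_TIMEOUT_MS")
    return int(value) if value is not None else DEFAULT_LOCK_TIMEOUT_MS

_locksByPath = {}

def createLockForPath(path):
    # One lock per path, so that all users of the same cache section
    # (identified by its path) share a single lock object. In the real
    # implementation this would be an inter-process lock derived from
    # the path rather than a threading.Lock.
    return _locksByPath.setdefault(path, threading.Lock())
```

Deriving the lock from the path is what makes introducing finer-grained locks cheap: each cache section gets its own lock simply by passing the section's path.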
This is a system-wide lock to be used for synchronising accesses to the statistics of a cache.
This reimplements the global 'Cache.lock' lock such that it's defined in terms of the individual section locks and the statistics lock. This means that acquiring and releasing Cache.lock acquires and releases the locks for all sections and for the statistics. This is slower than before (because it requires acquiring and releasing up to 513 locks) but it should only rarely be needed - mostly when cleaning the cache.
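The composite global lock could be sketched as follows. This is not the actual clcache code, just an illustration of a lock that acquires a fixed sequence of sub-locks in order and releases them in reverse:

```python
import threading

class CompositeLock:
    # Sketch: a lock defined in terms of a list of sub-locks. Acquiring
    # takes every sub-lock in the given (fixed) order; releasing walks
    # the list in reverse.
    def __init__(self, subLocks):
        self._subLocks = list(subLocks)

    def acquire(self):
        for lock in self._subLocks:
            lock.acquire()

    def release(self):
        for lock in reversed(self._subLocks):
            lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()

# Usage: in the real cache this would be e.g. 512 section locks plus the
# statistics lock, yielding up to 513 acquisitions for the global lock.
sectionLocks = [threading.Lock() for _ in range(3)]
statisticsLock = threading.Lock()
globalCacheLock = CompositeLock(sectionLocks + [statisticsLock])
with globalCacheLock:
    pass  # all sections and the statistics are locked here
```

The fixed ordering of the sub-lock list is what later guarantees a deadlock-free locking order across all processes.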
This (big) patch attempts to improve concurrency by avoiding the global cache lock. Instead, we lock only those manifest and compiler artifact sections which we actually deal with. The only case in which we need to synchronise all concurrent processes is when updating the statistics file, because there is only a single statistics file. At least for cache hits, this could be avoided by tracking the number of cache hits per section, too.

To avoid deadlocks, the locks have to be acquired in the same order on all execution paths (the order is defined in Cache.lock, i.e. first manifests, then artifacts, then statistics). Hence, locking of the artifact section had to be pulled out of the addObjectToCache() function, since that function was called with the statistics already locked - a violation of the locking order.

Furthermore, we can no longer perform cache.clean() in addObjectToCache(), because cache.clean() acquires the global lock, so e.g. this sequence of steps was possible in non-direct mode:

1. 'A' locks cache section 'x'
2. 'A' locks the global statistics lock
3. 'B' locks cache section 'y'
4. 'A' needs to lock all sections to do a cleanup

At this point, 'B' cannot proceed because 'A' still holds the statistics lock, but 'A' cannot proceed because 'B' still holds the lock on section 'y'. This issue is caused by -- from B's view -- the statistics lock being locked before a section lock. This must never happen.

At the point addObjectToCache() is called, we already have the statistics locked and we know that the cache size limit may just have been exceeded, so it's a good moment to determine that a cleanup is needed. It's not a good moment to *perform* the cleanup though. Instead, let the function return a flag which is propagated all the way back to processCompileRequest(). The flag indicates whether cleanup is needed, and if so, processCompileRequest() will acquire Cache.lock (which acquires all sections and statistics in the correct order) to do the cleanup.
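The "detect in addObjectToCache(), clean in processCompileRequest()" flag propagation can be sketched like this. The function names mirror those in the text, but the bodies and the statistics representation are simplified placeholders:

```python
def addObjectToCache(stats, objectSize):
    # Called with the statistics already locked: only *detect* that the
    # size limit was exceeded. Never clean up here, because cleaning
    # needs the global lock and taking it now would violate the fixed
    # locking order (sections before statistics).
    stats["currentSize"] += objectSize
    return stats["currentSize"] > stats["maximumSize"]  # cleanupRequired

def processCompileRequest(stats, objectSize):
    cleanupRequired = addObjectToCache(stats, objectSize)
    if cleanupRequired:
        # Here, no lock is held yet, so it is safe to acquire Cache.lock
        # (all sections, then statistics, in the fixed order) and clean.
        # The actual cleanup is stubbed out as a reset.
        stats["currentSize"] = 0
    return cleanupRequired
```

The key point is that the flag crosses the region in which the statistics are locked, so the expensive global acquisition only ever happens from a state where no lock is held.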
By checking for 'manifest == None' first and returning early, we can reduce the indentation depth of the larger 'manifest != None' branch.
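The guard-clause shape described above, in miniature (the branch bodies are stand-ins, not the real code):

```python
def processDirect(manifest):
    # Handle the small exceptional case first and return early ...
    if manifest is None:
        return "miss"
    # ... so the larger 'manifest is not None' branch needs no extra
    # indentation level.
    return "hit"
```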
It's really an exceptional issue, let's handle that on the caller side so that we get the 'forward to real compiler' logic for free.
It's just ManifestRepository.getIncludesContentHashForFiles() which can raise IncludeChangedException, so only that call needs to be covered by try/except. Moving the remaining code out of the 'try' reduces the indentation depth.
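A minimal sketch of the narrowed try/except; the raising function and the caller body are invented stand-ins for the real ones:

```python
class IncludeChangedException(Exception):
    pass

def getIncludesContentHashForFiles(files):
    # Stand-in for the only call that can actually raise the exception.
    if not files:
        raise IncludeChangedException()
    return hash(tuple(files))

def lookup(files):
    try:
        includesHash = getIncludesContentHashForFiles(files)
    except IncludeChangedException:
        return None
    # Everything from here on used to sit inside the 'try' as well.
    # Moving it out reduces indentation and, as a bonus, keeps the
    # except clause from accidentally masking unrelated exceptions.
    return includesHash
```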
These functions, just like postprocessObjectEvicted(), don't need to lock the cache section: they are already called while the cache section is locked (in processDirect()).
The majority of this code appears to be repeated in postprocessNoManifestMiss(). The only difference is the statistics field which is updated. So in order to consolidate this code, let's introduce a new postprocessHeaderChangedMiss() function which is parametrised on the statistics field to update.
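The parametrisation could look like this; the statistics-field names and the dict-based statistics are invented for illustration:

```python
def postprocessHeaderChangedMiss(stats, statisticsField):
    # The single statistics bump is the only difference between the two
    # original functions; everything after it is the shared miss
    # handling (stubbed out here).
    stats[statisticsField] += 1
    return "invoke-compiler-and-cache-result"

# Both former call sites now share one function, differing only in the
# field they pass.
stats = {"headerChangedMisses": 0, "noManifestMisses": 0}
postprocessHeaderChangedMiss(stats, "headerChangedMisses")
postprocessHeaderChangedMiss(stats, "noManifestMisses")
```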
…nifestMiss() Fixes some code duplication.
The logic for getting the includes used by a source file and invoking the compiler in case a manifest is not usable was duplicated in processDirect(). Centralize this in postprocessUnusableManifestMiss().
These two were really just forwarding to postprocessUnusableManifestMiss(); they didn't pull their own weight. So let's remove one level of indirection by inlining these functions.
Inlining this function shows that there is a strong similarity between processDirect() and processNoDirect(), a duplication I'd like to capture in a shared function in a subsequent commit.
As inlining processObjectEvictedMiss() shows, we already locked the artifact section here. We don't need to do it again.
...by enlarging the scope of the cleanupRequired variable in processDirect() and by renaming 'section' to 'artifactSection' in processNoDirect(). By making the code more similar, fixing the code duplication between these two should be safer to do, i.e. mistakes are hopefully easier to spot.
Not so minor actually, it makes processDirect() and processNoDirect() more similar.
The version in processNoDirect() could actually be simplified by adding two tuples. De-denting the return statement in processDirect() makes the code more similar to what processNoDirect() now does.
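The tuple simplification amounts to concatenating the compiler's result tuple with the extra flag instead of unpacking and re-listing every element. The values here are made up:

```python
# Hypothetical compiler invocation result: (exit code, stdout, stderr).
compilerResult = (0, "stdout output", "stderr output")
cleanupRequired = True

# Instead of
#   (compilerResult[0], compilerResult[1], compilerResult[2], cleanupRequired)
# simply add the two tuples:
combined = compilerResult + (cleanupRequired,)
```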
...by factoring it out into a getOrSetArtifacts() function (I'm not very happy with that name...). The function is much like processNoDirect() except that it expects to be given an artifact section cache key and a function which bumps the statistics in case of a miss. I suspect that this function can also be reused in processDirect().
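A hedged sketch of the getOrSetArtifacts() idea: look up the artifact section by key and, on a miss, bump a caller-supplied statistics counter and produce the artifacts. The dict-backed section and all parameter names are illustrative, not the real signatures:

```python
def getOrSetArtifacts(section, cacheKey, bumpMissStatistic, produceArtifacts):
    # Cache hit: return the stored artifacts unchanged.
    if cacheKey in section:
        return section[cacheKey]
    # Cache miss: the caller decides which statistics counter to bump
    # (this is the parametrisation that lets processDirect() and
    # processNoDirect() share the function).
    bumpMissStatistic()
    # Produce the artifacts, e.g. by invoking the real compiler.
    artifacts = produceArtifacts()
    section[cacheKey] = artifacts
    return artifacts

# Usage: only the first lookup for a key pays the miss cost.
section = {}
misses = []
getOrSetArtifacts(section, "key1", lambda: misses.append(1),
                  lambda: "object-file")
```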
Saves some code at the expense of a trace statement which is now lost. A small price to pay for the code savings, I think.
Current coverage is 88.53% (diff: 93.82%)

@@            master      #225    diff @@
==========================================
  Files            1         1
  Lines          969       968      -1
  Methods          0         0
  Messages         0         0
  Branches       160       154      -6
==========================================
+ Hits           853       857      +4
+ Misses          86        82      -4
+ Partials        30        29      -1
This PR is based on #217 and takes advantage of the refactored code to save about 40 lines of code (Git says '1 file changed, 27 insertions(+), 70 deletions(-)') in comparison to #217. This PR should be merged only after #217 has been merged.