InfluxDB::Writer::RememberingFileTailer: 2 race conditions fixed#8
Open
mephinet wants to merge 4 commits into domm:master from
Conversation
We are using `InfluxDB::Writer::RememberingFileTailer` on our production system under heavy load, with lots of short-lived parallel processes writing into the stats directory. In this setup, we've witnessed two different race conditions.

Scenario 1
The stats-writing process starts and immediately writes a few lines. Later, `IO::Async::File` sees the directory mtime change and calls `watch_dir`, which calls `setup_file_watcher` for the process' stats file. A new `IO::Async::FileStream` instance is created for this stats file, but because it calls `seek_to_last("\n")`, all lines written before that point are skipped. filetailer.t tests for this scenario. The solution is to always read new files from the beginning.

Scenario 2
When many stats-writing processes are created, the chances rise that the death of a process is recognized by `watch_dir`/`setup_file_watcher` before all data from that process' stats file has been read. In this case, the `IO::Async::FileStream` watcher is removed from the loop before it has had the chance to call `buffer_push` in its `on_read` handler. The solution is to `read_more` until EOF is reached.
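For illustration, here is a minimal sketch of both fixes. This is not the module's actual code: `setup_file_watcher_sketch`, `teardown_file_watcher_sketch`, and `handle_line` are hypothetical stand-ins (`handle_line` plays the role of the real `buffer_push` path), and the EOF drain is shown with a plain buffered `read` rather than the module's `read_more`.

```perl
use strict;
use warnings;
use IO::Async::Loop;
use IO::Async::FileStream;

my $loop = IO::Async::Loop->new;

# Hypothetical stand-in for the real buffer_push path.
sub handle_line { my ($line) = @_; print "$line\n" }

sub setup_file_watcher_sketch {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";

    my $stream = IO::Async::FileStream->new(
        read_handle => $fh,

        # Scenario 1: seeking to the last "\n" here would skip every line
        # written before the watcher was attached. Reading new files from
        # the beginning avoids the race.
        on_initial => sub {
            my ($self) = @_;
            $self->seek(0);    # instead of $self->seek_to_last("\n")
        },

        on_read => sub {
            my ( $self, $buffref, $eof ) = @_;
            while ( $$buffref =~ s/^(.*)\n// ) {
                handle_line($1);
            }
            return 0;
        },
    );
    $loop->add($stream);
    return ( $stream, $fh );
}

# Scenario 2: before removing the watcher for a dead process, drain the
# file to EOF so buffered-but-unread lines are not lost. (Sketch only:
# it ignores lines split across chunk boundaries, which the real
# read_more loop must handle.)
sub teardown_file_watcher_sketch {
    my ( $stream, $fh ) = @_;
    until ( eof $fh ) {
        defined( read $fh, my $chunk, 65536 ) or last;
        handle_line($_) for split /\n/, $chunk;
    }
    $loop->remove($stream);
}
```

The key ordering constraint is in `teardown_file_watcher_sketch`: the drain must happen before `$loop->remove($stream)`, because once the watcher leaves the loop its `on_read` will never fire again.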