Data loss and cluster unavailability -- Issue fsyncs on database files and directories #32

@ramanala

I am running a three-node iNexus cluster. iNexus can lose data because it does not fsync the database files before acknowledging the client. Specifically, if a crash happens after a write has been acknowledged, the acknowledged data can be lost because the database files were never fsync'd. The result is silent loss of data that the client believes is durable.
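To illustrate the ordering I mean (this is not iNexus code, just a minimal Python sketch with a hypothetical `durable_append` helper), the write should be flushed to stable storage with fsync before the server replies to the client:

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append `record` to `path` and return only after fsync succeeds.

    Hypothetical helper for illustration; the caller would send the
    client acknowledgment only after this function returns.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(record)
        # os.write may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the OS to push the file data to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
```

If the file is newly created, its directory entry also needs to be persisted, which is the second problem described below.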

The cluster can also become unavailable if a crash occurs before the rename of the .dbtmp file to the CURRENT file has been persisted to disk. In that case, the node simply fails to start after the crash, which can render the entire cluster unusable. To fix this, an explicit fsync on the parent directory is required after the rename. For more information, please see http://research.cs.wisc.edu/wind/Publications/alice-osdi14.pdf and https://www.quora.com/When-should-you-fsync-the-containing-directory-in-addition-to-the-file-itself.
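A rough sketch of the rename pattern with the directory fsync added, in Python for brevity. The `durable_replace` name and the paths are hypothetical, and this assumes a POSIX system where a directory can be opened and fsync'd (it will not work as-is on Windows):

```python
import os


def durable_replace(tmp_path: str, final_path: str, data: bytes) -> None:
    """Write `data` to `tmp_path`, rename it over `final_path`, and
    persist both the contents and the rename. Hypothetical helper."""
    # 1. Write the temporary file and fsync its contents.
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # 2. Atomically rename the temporary file over the final name.
    os.replace(tmp_path, final_path)

    # 3. fsync the parent directory so the rename itself is on disk.
    parent = os.path.dirname(os.path.abspath(final_path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without step 3, the rename can still be sitting only in memory when the crash happens, which is the case that leaves the node unable to start.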
