python-dev
Re: [Python-Dev] Ext4 data loss Gisle Aas Thu Mar 12 02:00:22 2009
On Mar 11, 2009, at 22:43 , Cameron Simpson wrote:
> On 11Mar2009 10:09, Joachim König wrote:
>> Guido van Rossum wrote:
>>> On Tue, Mar 10, 2009 at 1:11 PM, Christian Heimes wrote:
>>>> [...]
>>>> https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54 .
>>>> [...]
>>> If I understand the post properly, it's up to the app to call fsync(),
>>> and it's only necessary when you're doing one of the rename dances, or
>>> updating a file in place. Basically, as he explains, fsync() is a very
>>> heavyweight operation; I'm against calling it by default anywhere.
>> To me, the flaw seem to be in the close() call (of the operating system).
>> I'd expect the data to be in a persistent state once the close() returns.
>> So there would be no need to fsync if the file gets closed anyway.
> Not really. On the whole, flush() means "the object has handed all data
> to the OS". close() means "the object has handed all data to the OS and
> released the control data structures" (OS file descriptor release;
> like the OS, the python interpreter may release python stuff later too).
> By contrast, fsync() means "the OS has handed filesystem changes to the
> disc itself". Really really slow, by comparison with memory. It is Very
> Expensive, and a very different operation to close().
...and at least on OS X there is one level more where you actually tell the
disc to flush its buffers to permanent storage with:

    fcntl(fd, F_FULLFSYNC)

The fsync manpage says:

    Note that while fsync() will flush all data from the host to the
    drive (i.e. the "permanent storage device"), the drive itself may
    not physically write the data to the platters for quite some time
    and it may be written in an out-of-order sequence.

    Specifically, if the drive loses power or the OS crashes, the
    application may find that only some or none of their data was
    written. The disk drive may also re-order the data so that later
    writes may be present, while earlier writes are not.

    This is not a theoretical edge case. This scenario is easily
    reproduced with real world workloads and drive power failures.

    For applications that require tighter guarantees about the
    integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl.
    The F_FULLFSYNC fcntl asks the drive to flush all buffered data to
    permanent storage. Applications, such as databases, that require a
    strict ordering of writes should use F_FULLFSYNC to ensure that
    their data is written in the order they expect. Please see
    fcntl(2) for more detail.
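
For illustration, the levels discussed above (flush, fsync, and the OS X
full sync) might look roughly like this from Python. This is only a sketch:
the helper name sync_to_disk is made up for this example, and it assumes
that the fcntl module exposes F_FULLFSYNC only on platforms that define it,
falling back to plain fsync() everywhere else.

```python
import fcntl
import os


def sync_to_disk(f):
    """Push f's data through each buffering level.

    Hypothetical helper, not part of the standard library.
    Returns the name of the strongest sync that succeeded.
    """
    # Level 1: Python's own buffer -> the OS.
    f.flush()

    # Level 2: the OS -> the drive.
    fd = f.fileno()
    os.fsync(fd)

    # Level 3: the drive's cache -> permanent storage.
    # F_FULLFSYNC is looked up dynamically because it is only
    # defined where the platform provides it (Mac OS X).
    full_sync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_sync is None:
        return "fsync"
    try:
        fcntl.fcntl(fd, full_sync)
    except OSError:
        # Some filesystems reject the request; fsync() already ran.
        return "fsync"
    return "F_FULLFSYNC"
```

An application would call this only at the points where it actually needs
the guarantee, for example after writing a file and before renaming it into
place, since each level is slower than the one before it.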

It's not obvious what level of syncing is appropriate to automatically
happen from Python so I think it's better to let the application deal
with it.

--Gisle