[rrd-developers] bus error when disk is full, with mmap & sparse file

Francois-Xavier Bourlet francois-xavier.bourlet at dotcloud.com
Mon Apr 18 07:43:31 CEST 2011


Hi Tobi,

Yes, it happens at create time.

Checking the available free space before the creation process would
introduce a race condition: between the time you check the free space
and the time you actually allocate it, other processes or threads can
still consume it.
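
To make the race concrete, here is a rough sketch of what such a
pre-check could look like. has_free_space() is a hypothetical helper,
not something that exists in rrdtool, and the answer it gives is only a
snapshot: nothing prevents another process from filling the disk between
this call and the later memset on the mapping.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper (not part of rrdtool).
 * Returns 1 if the filesystem holding `dir` reported at least `needed`
 * free bytes at the time of the call, 0 if it reported less, and -1 if
 * statvfs() failed. The result can be stale as soon as it is returned. */
int has_free_space(const char *dir, uint64_t needed)
{
    struct statvfs vfs;

    if (statvfs(dir, &vfs) != 0)
        return -1;

    /* f_bavail counts blocks available to unprivileged users,
     * expressed in units of f_frsize bytes. */
    uint64_t avail = (uint64_t)vfs.f_bavail * (uint64_t)vfs.f_frsize;

    return avail >= needed ? 1 : 0;
}
```

A caller would pass the directory of the RRD file and the size it is
about to create, and would still have to cope with the bus error if the
space disappears after the check.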

But the check could be used in another way: set up a handler for bus
errors that checks the free space and prints a short hint message before
exiting the application. The advantage would be zero overhead (until
you crash... but do you really care at crash time ;) ) and no
modification of the current rrd_open function. What do you think?

On Sun, Apr 17, 2011 at 10:07 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> Hi Francois,
>
> Yesterday Francois-Xavier Bourlet wrote:
>
>> Hello,
>>
>> On my system rrd_open uses mmap, and my system supports sparse files.
>> That means that when my disk gets full, rrd_open can die with a bus
>> error. Here's the scenario in rrd_open:
>>
>> Disk really close to full, few kbytes free
>> open file -> ok
>> seek to end -1 -> ok
>> write 1 -> ok
>> The system only writes the last chunk of the file; all the others
>> will be allocated lazily later because of the sparse file feature.
>> So we have a file bigger than the free space available on the system.
>> The next attempt to write to this file, even without extending its
>> size, will fail with a disk full error.
>>
>> Next, rrd_open maps the file and then
>> memsets the whole file to zero... leading to a bus error, since the
>> kernel can't write into the file because the filesystem is full.
>
> this happens at create time, right ?
>
>> In my case I just have to extend the available disk space and it's
>> fine. But the problem is that you don't have any clue that the bus error
>> happens because your disk is full, and I really wasted a lot of time
>> before I thought of simply checking the free space...
>>
>> I don't really know how to fix the code; maybe we can catch SIGBUS
>> signals and, when we discover that the error is about a file mapping,
>> print a human-readable message to the terminal/log?
>>
>> Trying to recover from a bus error on file-mapped memory seems to be
>> another challenge...
>>
>> Or, rather than memsetting the file to zero, we could simply write
>> zeros to the file before mapping it, so it would be easy to catch
>> a write error.
>
>> Let me know what you think about it; I am available to patch rrd
>> with the best proposed solution.
>
> how about a call to statvfs before starting the whole creation
> process? (for win32 this would probably be GetDiskFreeSpaceEx)
>
> cheers
> tobi
>
>
>
>>
>> Regards,
>>
>
> --
> Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
> http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900
>



-- 
François-Xavier Bourlet


