Re: On malloc/free being slow and non-deterministic just lik…

Top Page

Reply to this message
Author: Dean Michael Berris
Date:  
To: True Computer Science Mailing List
Subject: Re: On malloc/free being slow and non-deterministic just like auto GC
On Tue, May 27, 2008 at 12:44 AM, Andy Sy <andy.sy@???> wrote:
> Dean Michael Berris wrote:
>
>> The behaviour of the memory manager (in the OS) is *not*
>> non-deterministic. There is an algorithm that deterministically
>> determines where the next chunk of memory will come from when you
>> request memory from the OS. This is not determined randomly, and thus
>> the algorithm is deterministic.
>
> You don't get it. Just like you can't predict how long a
> garbage cleanup will take, you won't be able to know
> how many milliseconds a particular malloc or free will take
> because you have no idea how fragmented the heap at the point
> in time you call it.
>


You're confusing things again: free() does not clean up garbage.
Returning an area in memory to the available pool is a matter of
inserting a pointer to a list/tree of pointers and is thus
deterministic. How long it takes has a "bound" -- and the only time
you actually have to wait that long is if you've exhausted all
available memory and encounter thrashing (swapping in an out pages of
memory to disk).

>>> By "expensive", I meant in terms of memory consumption,
>>> not speed of access. Preallocation is a lot more
>>> wasteful of memory space compared to dynamic memory
>>> management. Now re-read what I posted with that
>>> new understanding. Couldn't things be any more
>>> obvious??!?
>>
>> You conveniently snipped the part where I explain that you need to
>> qualify what you mean by expensive. If you're after computational
>> efficiency then I've just discussed why accessing data from the heap
>> is more expensive than accessing data that's allocated from somewhere
>> else (i.e., stack or data segments).
>
> We weren't discussing that at all in this particular case.
> All I'm saying is that preallocation is impractical for just
> about all applications we're used to nowadays because it is
> extremely wasteful of memory compared to dynamic memory
> management.
>


Yes, we actually were discussing that because the garbage collector's
algorithm introduces computational complexity -- you just can't say
something's expensive and not qualify the expense you're talking
about. What I was pointing out was that the expense you're talking
about is imaginary even in the case of a pre-allocating "memory
manager" or "allocator" implementation.

If you're thinking preallocation is impractical, tell that to the game
developers who actually pre-allocate areas in memory to maximize cache
locality especially for making sure that the game doesn't skip around
by constantly acquiring and releasing areas in memory. And yes, they
do it on the heap. Tell that to developers working on Microsoft Word
who use the flyweight pattern to keep the representation of individual
letters in pre-defined and pre-allocated areas in memory and just use
pointers/handles to these areas in memory when representing individual
letters rendered on a canvas. Tell that to developers who use a
pre-allocating pool allocator who very well know the effects of many
small memory allocations/deallocations to the heap and how that causes
heap fragmentation and why they opt to use pre-allocation to mitigate
that problem over using a garbage collector.

Oh and yes you can use a garbage collector in C -- the only question
you ask is whether the performance hit is worth it. Consider any
serious application that's handling thousands and thousands of
requests per second and you'll find an application that's avoiding a
garbage collector and is using a memory pool or preallocated areas in
memory to do whatever you need to do to maximize performance.

>
>> Now if you're talking about space efficiency, pre-allocating
>> everything you know you will need is not expensive because you're
>> again pre-allocating everything you know you need. There's no waste in
>> that situation:
>>
>> // I need 4 integers
>> int buffer[] = { 0 , 1 , 2 , 3 };
>> int & a(buffer[0]), & b(buffer[1]), & c(buffer[2]), & d(buffer[3]);
>>
>> Or
>>
>> // I need 4 integers
>> int a = 0, b = 1, c = 2, d = 3;
>>
>> Which is what you do when you perform pre-allocation.
>
> Right... so to take the web browser example, when you load a URL's
> contents in memory, it may take 3kb or 300kb or even more, so just
> how many KB do you "preallocate"? And don't give me that "it is
> stored as a file" crap, because when you're scrolling through a web page,
> you sure as hell are not accessing disk sectors, because it would
> take forever otherwise. The answer is that you must reserve memory
> as you load the URL's contents, and since you don't know just how
> much there is to load, you have to _dynamically_ allocate (aka use the
> heap). The only other alternative is to WASTEFULLY reserve a humongous
> chunk of memory at the beginning, memory which you will not need
> most of the time.
>


Consider the alternative (which is way faster, and less wasteful):

You pre-allocate a certain amount of memory (say, 1000 contiguous
elements to store node descriptors for the DOM of the document you'll
be rendering in the browser) then have the browser use and re-use this
area until it's exhausted: when you need more, you double that area in
memory, or grow it by another constant factor. The point is you
pre-acquire the areas in memory to store _node descriptors_.

Now once you have these node descriptors set up as a tree, it's then
easier to just go through the actual elements that need to be
rendered. The rendering information can also be stored in the node
descriptors.

Then your problem becomes loading the contents of the page/s from the
file and parsing it to create the node descriptors -- which reside in
a pre-allocated area in memory. Of course, you don't understand that
because you didn't bother to think in that sense: you seem to think
that the entire document (a file) is stored all in memory to be
rendered and kept there while you're rendering the information in a
GUI.

> Dido already gave an extremely clear example of why *it is completely
> absurd to think you can avoid usage of the heap for any normal
> program*. It really takes a mulehead and/or a moron to fail to
> appreciate why.
>


Sorry, no. What Dido presented was a scenario where you cannot avoid
using the heap, but not a situation where pre-allocation will not
work. You're using someone's argument that's meant for a different
thread of thought to support your argument. What you are doing now is
taking Dido's post out of context. I even answered that post with the
same answer I gave you just now, only this time I need to be more
explicit for you to understand the point.

If there was a mulehead or a moron here, that would be you.

> Go to any newsgroup or mailing list where people discuss computer
> science or computer language related topics and try to make the claim
> that it is realistic to expect to be able to avoid use of dynamic memory
> management in any non-trivial app and see if you don't end up being
> the butt of ridicule.
>


That doesn't happen here, because I made valid points which you always
fail to appreciate:

1. You do as much as you can to avoid using the heap for performance reasons

2. When you do use the heap, you should be wary of how you use it,
again for performance and scalability reasons

3. When you do use the heap, you should consider the effect that your
garbage collector has on the performance and scalability of an
application over you do manual memory management

For people who actually have worked on serious applications (high
performance, highly scalable, highly available applications like
servers, batch processing gigabytes/terabytes worth of data, etc.)
what I'm saying is not new. Apparently in your case, it's like I'm
talking to someone who doesn't know the difference between C and C++.

>
>
> A beginner learning C encountering malloc()/free() for the first time
> will ask himself/herself, why do I have to use these when declaring
> variables [on the stack] is so much simpler and less error-prone?
>
> Any reasonably intelligent programmer, after actually writing non-trivial
> programs and finding themselves having to resort to free()/malloc() time
> and again, will eventually realize just why they exist and why one cannot
> realistically hope to avoid their usage for any serious app. The question
> is, what sort of freakish, bizarre, nightmarish coding style are you relying
> on that allows you to not realize this OH-SO-FUNDAMENTAL FACT?
>
> I don't think I want to know the answer... if the code snippets you have
> posted so far are any indication of the way you think, my bet is that if
> I were your team lead and did see the actual code you write for apps you're
> responsible for, I'd have an iron maiden ready and waiting for you in no
> time, to serve as just recompense for the suffering you inflict on others
> who have to deal with your code. (And I'm sure I will also require a
> padded room, a psychiatrist, and some straitjackets for all these unfortunate
> souls...)
>


Then be prepared to have to work as part of the Linux kernel
development team, or the Mozilla browser team, or the Microsoft
Word/Excel/PowerPoint team, Adobe Photoshop team, or the Apache
development team.

It's like I'm talking with a typical PHB here who doesn't know when to
shut up to stop himself from embarassing himself in front of other
people.

It's questions like: "how do you ensure that you get a contiguous area
in memory without having to rely on the heap?" that you need to be
able to answer before you start saying how "bad" or "unmanageable"
code for a real serious application is. And it's a pity that you don't
understand or appreciate the point being made by the examples I've
made, it just shows how lacking you are as far as computer science
knowledge and actual real-world applications development experience is
concerned.

-- 
Dean Michael C. Berris
Software Engineer, Friendster, Inc.
______________________________________________________
True Computer Science Mailing List
compsci@??? (#CompSci @ irc.free.net.ph)
http://lists.free.net.ph/mailman/listinfo/compsci
Searchable Archives: http://archives.free.net.ph