Who should clean up vector_ptr out-argument in case of interruption or error?

I found a piece of code which surprised me. I thought it’s a good idea to start a post on it to make sure we are all on the same page.

If a function takes an output argument that is a vector, it will require an initialized vector, and the caller is responsible for de-allocating that vector. It does not matter if the function returned successfully or with an error.

But what if the output argument is a vector_ptr?

Take a look at what igraph_sir does:

  • If the function returns successfully, it will leave it to the caller to free the result (naturally, as the caller will use it)
  • If the function returns with an error (or gets interrupted), it will destroy each element of the vector_ptr, but it will not free them, nor will it modify the vector_ptr in any other way.

Let’s look at another function:

Here in case of an error all elements get destroyed, they also get freed, and the vector_ptr gets cleared. This is different.

And another:

Here all elements get destroyed and freed, but the vector_ptr is not cleared.

This is a bit of a mess. There should be a clear guidelines on how to deal with vector_ptr arguments.

I was just going to add interruption to igraph_sir. It looks like not calling free() in igraph_i_sir_destroy() was on oversight. @Gabor you wrote this code originally. OK to free the elements in this function, or did I miss something?

If they were allocated here in the function, and there are no more references to them, then yes.

1 Like

I am not quite happy with how this works now, i.e. how most functions handle a vector_ptr when they are interrupted and just set destroyed elements to NULL. Now the caller has to be extra careful, and make sure to check each element of the vector_ptr and see that it is not NULL before destroying it.

Do the Python and R interfaces actually do this? I had a quick glance at the Python one, and it seems to mostly just do a igraph_vector_ptr_destroy without bothering to check if any elements are NULL or setting an item destructor. Am I missing something? If this doesn’t already crash it’s because things aren’t properly destroyed (no item destructor set) …

For example, here:

It would be useful to have a clear standard for how these things are supposed to be handled, and make sure that all functions follow it.

I suggest that when functions get interrupted, if they choose to destroy the elements of the vector_ptr, then they should also clear the vector_ptr, as is done in the second example I showed in the original post.

What to do with the item destructors, I do not know. Here’s another discussion:

https://lists.nongnu.org/archive/html/igraph-help/2017-05/msg00000.html

Re-reading parts of that reminded me that igraph_vector_ptr_clear is actually not always safe to call … it may call item destructors, which leads to double-destruction.

In the Mathematica interface, I avoid ever calling the item destructor that was set on the vector_ptr. Instead, I destroy items manually, if they are not NULL, then I free them manually.

I also have a safe version of clear that won’t call the specified item destructors (which cannot be relied upon).

Okay, I’m trying to sort this out in my head. The way I think about it is as follows.

In my head, every chunk of memory that the code allocates is supposed to have an “owner”, and it is the responsibility of the “owner” to free that memory. By default, the “owner” is the function that allocated the chunk. However, ownership can be transferred from the function to its caller, in which case the caller becomes responsible for freeing the pointer (obviously it means that the pointer has to be returned to the caller somehow).

When you initialize (and maybe even allocate) an igraph_vector_ptr_t in a function on your own, you are obviously responsible for destroying (and maybe freeing) it at the end. However, handling the contents of the igraph_vector_ptr_t needs a conscious decision: will the igraph_vector_ptr_t “own” the pointers that you add into it? In most cases in igraph, it makes sense to say that the igraph_vector_ptr_t “owns” the contents during the time you build the contents of the vector, so the general pattern is that we set a per-item destructor in the igraph_vector_ptr_t and put igraph_vector_ptr_free_all() (if we don’t own the igraph_vector_ptr_t itself) or igraph_vector_ptr_destroy_all() (if we do own the igraph_vector_ptr_t) on the finally stack until we are ready to return the igraph_vector_ptr_t back to the user.

When the vector_ptr is passed back to the user, we obviously pass on the ownership of the igraph_vector_ptr_t to the user (if it was ours to begin with). However, it is undecided what to do with the contents – if the igraph_vector_ptr_t still owns its contents, the user must be aware to call igraph_vector_ptr_destroy_all() (and not igraph_vector_ptr_destroy()) on the igraph_vector_ptr_t to destroy the pointer vector properly. This was an early design mistake that we’ve made with igraph_vector_ptr_t and it is hard to fix now without breaking existing code; maybe the solution would be to create an igraph_ptr_vector_t type, design it “properly” w.r.t. pointer ownership, and then slowly deprecate igraph_vector_ptr_t in favour of igraph_ptr_vector_t.

Anyway, since we don’t want the user to have to constantly think about whether she owns the contents of an igraph_vector_ptr_t returned from an igraph function or not, and there are already tons of places where the user is responsible for freeing the contents of the igraph_vector_ptr_t, the general policy for the time being is that the ownership of the contents of the pointer vector is also passed on to the user. To achieve this, we need to remove the item destructor from the igraph_vector_ptr_t when the vector is returned to the user.