Question about the functionality of partial sum trees

Hey all,

I would like to build a custom random graph model for which I would like to use the partial sum trees provided by igraph to draw from a discrete probability distribution (in the same way it is used for i.e. the barabasi aging model).
Now, I am not an expert in partial sum trees and so I have one question about the way it works. The documentation says:

Each leaf contains the probability corresponding to the items. […]
Samples can be drawn from the probability distribution represented by
the tree by generating a uniform random number between 0 (inclusive) and the
value of the root of the tree (exclusive), and then following the branches
of the tree as follows. In each step, the value in the current node is
compared with the generated number. If the value in the node is larger,
the left branch of the tree is taken; otherwise the generated number is
decreased by the value in the node and the right branch of the tree is
taken, until a leaf node is reached.
Note that the sampling process works only if all the values in the tree
are non-negative. This is enforced by the object; in particular, trying to
set a negative value for an item will produce an igraph error.

I understand that negative values are a problem. But what will happen if I draw a uniform random number that is zero and I have a vertex that I want to choose with probability zero but not all vertices have probability zero? From my understanding, the search will start in the root and continually go down the left path until it reaches a leaf, which would then be my vertex with probability zero.

Most likely my understanding is incorrect, but what am I missing? If I am not missing anything, what steps do I need to take to ensure that probability-zero-vertices can not be chosen? (The vertices in my model might decrease in being-chosen-probability, down to zero).

Thank you for your help!

It is safe to insert zero weights into the tree for as long as some of the other weights are positive. The implementation of barabasi_game() makes use of this to avoid creating multi-edges: vertices which have already been connected to in the current step are temporarily set to have zero weight.

Thank you for your reply - I must be making a mistake somewhere else then. Sometimes when I draw a zero from rng_unif, psumtree_search will return an index whose psumtree_get is zero, while psumtree_sum is strictly greater than zero. I will try to produce a MWE or find the error in my code.

I pulled the state of the sumtree when this error happens out of my code (I dont want to share the whole thing, but I basically copied the barabasi psumtree variant and changed the psumtree_update part to a different weight (which is at least zero)). I dont know if that actually helps but here is what I have

#include <igraph.h>
#include <iostream>


int main() {
    igraph_psumtree_t sumtree;
    IGRAPH_CHECK(igraph_psumtree_init(&sumtree, 100));
    IGRAPH_FINALLY(igraph_psumtree_destroy, &sumtree);
    igraph_real_t sumtree_array[100] = {24,24,0,24,0,0,0,12,12,0,0,0,
                                        0,0,0,2,10,12,0,0,0,0,0,
                                        0,0,0,0,0,0,0,0,1,1,3,7,
                                        7,5,0,0,0,0,0,0,0,0,0,0,
                                        0,0,0,0,0,0,0,0,0,0,0,0,
                                        0,0,0,0,0,1,1,0,0,3,3,4,
                                        3,4,3,2,0,0,0,0,0,0,0,0,
                                        0,0,0,0,0,0,0,0,0,0,0,0,
                                        0,0,0,0,0,};

    igraph_vector_t tmp;
    igraph_vector_init_copy(&tmp, reinterpret_cast<const igraph_real_t *>(&sumtree_array), 100);
    sumtree.v = tmp;
    sumtree.offset = 127;

    igraph_real_t test = 0;
    long int to;
    igraph_psumtree_search(&sumtree, &to, test);
    // this outputs "0 and 24" for me
    std::cout << igraph_psumtree_get(&sumtree, to) << " and " << igraph_psumtree_sum(&sumtree) << std::endl;
    igraph_vector_destroy(&tmp);
    return 0;
}

Sadly I have no idea of what is going on here at all. The difference to the barabasi_game implementation is that my vertices are not temporarily set to zero, but once set to zero, will keep their weight zero. The issue has also only appeared after a few iterations (the smallest I saw was 16 of 100) so maybe I am making a mistake in the tree management. If this ends up being intended behavior and my implementation a bastardization of psumtrees, I will just skip the zero-vertices if they end up being chosen so it’s not a big deal, I am just very confused.

Edit: Maybe I should mention that this error has only happened when the uniformly random value is zero, but it does not happen every time in this scenario. So there must be something else going on.

I haven’t had time to look in detail, but there’s a problem with the example above. The vector in the tree must have length offset + size, not size. So if this tree is of size 100, then sumtree->v should have size 100 + 127 = 227. The state recreated in this example is not consistent.

I see what you mean about passing in a zero search value. There is no need to correct the above example. We will look into this.

1 Like

Exciting! I hope to be updated in the future about this :blush: Thanks for your response!

You can follow this issue: psumtree_search() returns element with zero weight · Issue #2080 · igraph/igraph · GitHub

This problem is now fixed. The fix will be released with igraph 0.9.9, which should come in 1-2 weeks if all goes well. You can use it right away if you check out the master branch of the GitHub repo.

Since you’re one of the few users of the igraph C library, I wanted to let you know that the develop branch, which will become igraph 0.10, is in a pretty good state now, and you can consider using it. There are a very large number of breaking changes compared to igraph 0.9, made as part of the preparation for igraph 1.0, at which point the API will become stable. Any feedback about the version on the develop branch is welcome. The documentation is here: igraph Reference Manual The biggest change is that now we consistently use igraph_integer_t for almost all integers, and that this type is 64-bit wide on systems that support this.

Note that develop is still in flux, and some new breaking changes are still being made. Whether these would affect you depend on which parts of igraph you are using.