@GroteGnoom I saw your question about issue priorities before you deleted it. The issues are not very well prioritized at the moment. It would take some effort to triage them. There are priority labels, but they are not consistently applied at the moment.
If you’d like to work on some issues, I’d suggest to choose based on what you prefer and what you are comfortable with. If you give us some specifics about what you’re looking for (bug fix, new feature, infrastructure work, etc.) we can make some more concrete suggestions.
In general when selecting issues I don’t have very specific preferences, and think I can handle a few mathematical challenges, although it might take me a while if I’m not familiar with the subject. I also don’t mind reading papers if that’s necessary. It’s nice for me to always have something to work on, and I prefer writing new C code to testing. Fixing bugs is also interesting. The reason I started testing is because there’s such a clear list of things to work on without really asking where I should start. I can just ‘assign’ myself, and I always have at least some work to do.
I can handle some infrastructure work as well, but I have to learn a bit more about CMake and the other tools we’re using.
I think the nicest projects are ones where I can work on for days without bothering anyone.
Some more projects that I am personally interested in, but haven’t looked into their difficulty yet:
Generally, feel free to choose any issue (even if it is assigned to someone), just ask us before you start. I won’t be able to work on most issues that are assigned to me in the near future, but I might have some ideas about how to do them.
I am testing igraph_laplacian_spectral_embedding.
I have notices a few issues:
It’s supposed to give U, D and V such that L=U D V^T, but it doesn’t for my test cases. L=U D V^(-1) does seem to hold.
For directed graphs it expects type == IGRAPH_EMBEDDING_OAP. This type is not mentioned in the documentation.
For a directed graph with one edge between two vertices, the function fails:
Two_vertices, one edge:
Error at src/linalg/arpack.c:539 : ARPACK error, 2x2 matrix is not symmetric - Invalid value.
Error at ../../src/misc/embedding.c:711 : - Invalid value.
Error at ../../src/misc/embedding.c:984 : - Invalid value.
Fatal error at igraph_laplacian_spectral_embedding.c:30 : Assertion failed: igraph_laplacian_spectral_embedding( g, no_eigenv, weights, IGRAPH_EIGEN_LM, IGRAPH_EMBEDDING_OAP, 0, &X, &Y, &D, options) == IGRAPH_SUCCESS
In the adjacency_spectral_embedding unit test a directed graph is used, so it shouldn’t be a general problem of not being able to handle asymmetric matrices.
Now my question is: Should I spend a lot of time fixing these issues? Or should I leave it up to someone with more knowledge of the intended functionality of this function? For issue 1. for example I don’t know if the documentation is wrong or the implementation is wrong, and it will take a lot of reading to figure that out.
igraph_neighbors says it returns ‘Adjacent vertices to a vertex’, but when there is a loop it will list the starting vertex itself, and when there are multiple edges to another vertex it will list the other vertex multiple times.
igraph_dyad_census expects unique non-loop neighbors to be returned, and when working on pseudo-diameters I also wanted a function that did this. dyad_census now returns incorrect results when loops or multiple edges are present.
One way to work around this is to use adjlists, because there you can indicate NO_LOOPS, and NO_MULTIPLE.
Should igraph_neigbors also take parameters that indicate how loops and multiple edges should be handled? Should its results be changed so it doesn’t use loops and multiple edges? Those would take quite a bit of work because every call needs to be checked. But maybe that needs to be done anyway to see if multigraphs are handled correctly.
We could also add a new function which takes more parameters.
I could also just continue to handle each case separately, and change the docs to igraph_neighbors to warn for this issue.
There’s been a debate about this, but my personal take is a strong YES, it should.
Or at least there should be an internal variant of the function which provides this (but then why not make it public?)
And that’s a strong no from me. Subjectively, most uses cases require the current behaviour. Most other graph packages which support non-simple graph have the same behaviour. The current behaviour is in many (but not all…) ways consistent with degrees, adjacency matrices, etc.
Quite a big part of code that isn’t covered according to Codecov are external libraries like Bliss and Gengraph. Should we treat them the same way as home-built igraph code?
Reasons to test: We have no reason to assume they’re of perfect quality.
Reasons not to test: Maybe we’ll be doing double work if they’re already tested in other ways, or there are parts in it we don’t need. Maybe we’ll use them as external packages in the future and it somehow seems weird to test external packages, even if the situation isn’t much different.
The coverage report is just a tool, not an end in itself. There is no reason to test code that we don’t use, so don’t spend time on it.
The question is rather: why not remove the unused parts of these included libraries? Well, that kind of surgery also takes care, time, and effort. Some parts might be used in the future. Some libraries (gengraph) are really not maintained outside of igraph, not well documented, and are just easier to understand as a whole, so I’m reluctant to strip out code just to increase the coverage numbers. The extra code has very few drawbacks: while it increases the size of the static library slightly, it does not impact the shared library or executables linked to igraph.
Yes, we shouldn’t just make some number higher if that doesn’t contribute to our code quality. But one advantage of the tool is that we can see what work needs to be done. Right now, that isn’t really clear. Shall we add the external libraries to the ignore paths?
Maybe only certain parts of the external libraries …
In some cases, it was useful to see which parts of their functionality was tested. For exampl, with PRPACK, it helped me notice that it switches methods depending on the size of the graph. One of those branches was not tested, and indeed it turned out to be buggy.
We should not test external libraries directly. But we should test the igraph functions that use them, and fix any issues that affect those functions. With the PRPACK example, this meant computing PageRank both for graphs smaller and larger than 128 vertices.