The unit tests thread

I think the only issue is what I noted here. Did I misunderstand something?

Thanks! No, I was just completely looking in the wrong direction. :slight_smile:

I am working on arpack_unpack_complex. It’s supposed to make the output of igraph_arpack_rnsolve more readable. When the eigenvalues are complex, rnsolve stores both eigenvalues of a complex conjugate pair, but only the eigenvector of the eigenvalue with the positive imaginary part. The documentation of unpack_complex doesn’t explicitly say what it’s supposed to output, but I guess it should output both eigenvectors in that complex case. What it actually does is output the eigenvector of the eigenvalue with the negative imaginary part twice. You can see this here: Add arpack_unpack_complex unit tests · GroteGnoom/igraph@540ae9b · GitHub. (In the outputs, there is first a real column, then an imaginary column, then real again, and so on.)

Am I correct that it should output two different eigenvectors (the original one and its complex conjugate)?

Yes, that is what it is supposed to do. The function has two goals:

  • If you have n eigenvalues, then it should produce a matrix with 2*n columns, real and imaginary columns interspersed, as you have observed.
  • If ARPACK returned more than n eigenvalues for some reason, it should discard the additional eigenvalues.

The main purpose of this function was to make it easier to write unit tests that involve complex eigenvectors. It is not used at the moment in the C core, but the R interface exposes it as arpack.unpack.complex, and its output actually matches what R requires to produce a list of complex vectors.
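To make the packed format concrete, here is a minimal sketch of how I understand the convention. The signature of igraph_arpack_unpack_complex is my reading of the headers, and all the numbers are invented, so treat this as an illustration rather than a reference:

#include <igraph.h>
#include <stdio.h>

int main(void) {
    igraph_matrix_t values, vectors;

    /* One conjugate pair of eigenvalues, 1 +/- 2i, in the packed format:
       column 0 holds the real parts, column 1 the imaginary parts. */
    igraph_matrix_init(&values, 2, 2);
    MATRIX(values, 0, 0) = 1; MATRIX(values, 0, 1) =  2;
    MATRIX(values, 1, 0) = 1; MATRIX(values, 1, 1) = -2;

    /* Packed eigenvectors: only the eigenvector of the eigenvalue with
       the positive imaginary part is stored, as two real columns
       (real part, imaginary part). Made-up values. */
    igraph_matrix_init(&vectors, 2, 2);
    MATRIX(vectors, 0, 0) = 0.5; MATRIX(vectors, 0, 1) =  0.5;
    MATRIX(vectors, 1, 0) = 0.5; MATRIX(vectors, 1, 1) = -0.5;

    /* After unpacking, 'vectors' should have 2*nev = 4 columns:
       re(v1), im(v1), re(v2), im(v2), where v2 = conj(v1). */
    igraph_arpack_unpack_complex(&vectors, &values, 2);

    printf("unpacked vectors: %ld x %ld\n",
           (long) igraph_matrix_nrow(&vectors),
           (long) igraph_matrix_ncol(&vectors));

    igraph_matrix_destroy(&vectors);
    igraph_matrix_destroy(&values);
    return 0;
}

If the function behaves as intended, the third and fourth output columns should be the complex conjugate of the first two, not a repeat of the negative-imaginary-part eigenvector.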


@GroteGnoom On Monday I fixed a bunch of bugs in community detection functions when calling them on 0- or 1-vertex graphs. Having informative error messages (with IGRAPH_ERRORF) that include the actual values was hugely helpful in this! Your work is bearing fruit.
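(As an aside for readers following the thread, here is a hedged sketch of the pattern being described. The function and the check are invented for illustration; only IGRAPH_ERRORF, IGRAPH_EINVAL and igraph_vcount come from igraph, and igraph_error_t is the recent return type, where older versions used int:)

#include <igraph.h>

/* Hypothetical validation step: IGRAPH_ERRORF formats the offending
   value directly into the error message, so the failing input is
   visible in the error report. */
static igraph_error_t check_vertex_count(const igraph_t *graph) {
    if (igraph_vcount(graph) < 2) {
        IGRAPH_ERRORF("Graph has %ld vertices, but at least 2 are needed.",
                      IGRAPH_EINVAL, (long) igraph_vcount(graph));
    }
    return IGRAPH_SUCCESS;
}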

:slightly_smiling_face: That’s great to hear!

igraph_zeroin is in /src/internal/, but it is not used internally (no usage found with grep -r igraph_zeroin). Does that mean it isn’t used at all? Should it still be tested for possible later use?

Since it’s a public function, it’s not bad to have a test for it.

But since it is not directly graph related, not currently included in the documentation, and not used by any high-level interfaces, it is the lowest priority among such functions. Also, it seems to be borrowed from R, which borrowed it from NETLIB (?), so it’s unlikely to be buggy. I would focus on something else.
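That said, if someone does want a quick test, it could be as small as the sketch below. The signature is my recollection of the public declaration (it closely follows R’s zeroin, with the bracket endpoints, tolerance and iteration limit passed by pointer), so double-check it against the header first:

#include <igraph.h>
#include <math.h>
#include <stdio.h>

/* f(x) = x^2 - 2 has a single root, sqrt(2), inside [0, 2]. */
static igraph_real_t my_f(igraph_real_t x, void *info) {
    (void) info;  /* user data for the callback, unused here */
    return x * x - 2;
}

int main(void) {
    igraph_real_t ax = 0, bx = 2, tol = 1e-10, root;
    int maxit = 100;

    igraph_zeroin(&ax, &bx, my_f, NULL, &tol, &maxit, &root);
    printf("found %g, expected %g\n", root, sqrt(2.0));
    return 0;
}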

@Gabor Do you recall why this function was added in the first place, given that it is not even used by the R interface?

For some personal thing that maybe never happened. Just ignore it; we can even remove it.

Is there a quick way to build only the docs needed to check them for a single function?

Also: in the docs, it seems like \c A and <code>A</code> are supposed to produce the same end result, and <code> is just more practical for longer blocks, but they show up in two different fonts when I render the docs locally. Is that a bug? I notice that in the HTML, the <code> blocks have class = literal, while \c produces class = constant.

@tamas would be able to answer these questions.

As for the slow doc build, I also find that painful, but I’m not sure that there’s an easy fix …

Unfortunately there isn’t.

The styles are provided by DocBook; I can fix the CSS so they look the same.

Maybe that does fix everything, but I notice now that the source of my confusion is that the documentation wiki says

`\c` styles the following word as code. It will be rendered in a monospace font. Use `<code>...</code>`
for longer sections of inline code.

while in c-docbook.re it says

REPLACE ----- doxygen \c command is for <constant> ----------------------------

so maybe the wiki should just be updated, and the tags are meant to be used differently than I thought. Still, rendering them in the same font may well be what everyone expects.
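For concreteness, here is how I would now read the intended usage inside a doc comment (the function name and the text are made up):

/**
 * \function igraph_example
 * \brief A made-up doc comment illustrating the two tags.
 *
 * A single word such as \c nev can be marked with the short form,
 * while <code>igraph_vector_size(weights) - 1</code> is a longer
 * piece of inline code wrapped in code tags.
 */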

We have been using \c as an alias for any monospaced stuff for a long time (at least I did), so it’s okay. The whole documentation generation system is a hack anyway :slight_smile: Back in the early days of the Dark Ages, we used Doxygen (I almost wrote \c Doxygen now :slight_smile: ), but then we decided to migrate to DocBook so we could generate PDF, HTML and Texinfo docs from the same source. That was when doxrox.py was born, out of necessity: we had lots of comments written in Doxygen format and did not want to lose them. None of us were experienced Python programmers back then, so doxrox.py is kinda ad hoc and hacky, and it’s terribly slow with the amount of code that we have. But it works, I can’t really see any better solution for building the docs even after so many years, and so it’s likely to stay.

One thing that I can probably do is implement some kind of on-disk documentation chunk cache that maps SHA256 hashes of source files to the documentation chunks extracted from them, so we don’t need to re-generate the chunks if a file has not changed. This would probably speed things up when you are improving the docs iteratively.

I’ll note this down for reference:

Parsing the chunks took 137 seconds on my machine; xmlto (the last step) took 106 seconds. The rest is negligible.

Since xmlto must be run every time and is outside of our control, it sets the absolute lower bound on how fast the doc build process can be.

(Thanks again for all the background info, it really helps a lot :slight_smile: )

One crude thing that works is to comment out lots of .xxml files in doc/CMakeLists.txt and comment out the same files in doc/igraph-docs.xml. If there are external links, I also have to add --skip-validation to the xmlto command in doc/CMakeLists.txt:

COMMAND ${XMLTO_COMMAND} --skip-validation -x ${CMAKE_CURRENT_SOURCE_DIR}/gtk-doc.xsl -o html xhtml igraph-docs.xml

Of course, checking that links are correct is important, but just to check your docs in the proper layout this can be nice, especially if you’re not familiar with all the tags.
It’s really fast :slight_smile: Building only the iterators docs takes 18 s on my slow, old machine. You also get a somewhat nicer overview of the warnings, and missing links still show up as warnings.
Maybe this is annoying to automate, but even doing it manually can speed things up. Are there other downsides I am missing?

There seem to be some strange differences in how long each stage of the doc build takes on different systems, so let’s time them.

To get timings, I added this in the top-level CMakeLists.txt:

set_property(GLOBAL PROPERTY RULE_LAUNCH_CUSTOM "${CMAKE_COMMAND} -E time")

and I used ninja -v html (when building with ninja) or make html VERBOSE=1 (if you prefer make).

On my machine (macOS 10.14, most tools from MacPorts), I get:

  • 2 seconds for parsing documentation chunks (down from 137 s, thanks to @tamas’s work yesterday).
  • Basically no time at all (0 s or 1 s reported) for all subsequent steps except the xmlto run. @tamas mentioned that this stage is much slower for him.
  • 100 s for “Generating HTML documentation with xmlto”, regardless of whether I add --skip-validation.

Yes, --skip-validation is only useful because it allows XML files to be skipped.

For me this step is much slower, due to source-highlight being called on each of the examples that are embedded in the documentation. The part that slows it down is the --gen-references=inline switch; without cross-reference generation it runs in almost no time. The time command shows that the process is CPU-bound, not I/O-bound.

source-highlight was installed from Homebrew; I’ll try to reinstall and see how it goes.
