Add sitewide search to igraph.org

Currently I am using Google to search for specific topics/functions in the reference manual, for example:

     edge selector site:https://igraph.org/c/doc/

It would be nice if one could directly search the documentation within the site.

(I tried to post this to the Development category, but perhaps I do not have permission.)


I agree that this would be a very useful feature; I miss that functionality myself. For the time being, the index of the documentation may be a good alternative.

We should look into providing search on igraph.org. I am not sure whether this will be easy to implement; we will have to see. What do you think, @tamas?

We will probably also switch tools to prepare documentation, but that is something for the long term.

Btw: I think this category fits the topic perfectly, so I will just leave it here.

As far as I know, Algolia has a very generous free plan for “personal” projects – I don’t know whether they consider open-source projects as “personal”, but it would be logical to do so. We could try integrating Algolia with the site. Another option would be to brew an indexing solution on our own, which might be needed anyway if we go through with @szhorvat’s plan to produce a Dash docset for igraph so its documentation can be read offline in the Dash app.

Does anyone know about any other alternatives?

Might this Jekyll plugin be an easy alternative, at least for now?

In the longer term, when changing the documentation, we will have to see how things unfold anyway.

Yes, that could probably work, but we need to investigate this more thoroughly at some point.

@tamas, when we discussed the Dash docset you mentioned something about an index already being created, if I remember correctly. Where is this index, or how can I generate it? I have the ix01.html file, but this is an HTML file, not a structured index that is easy to convert to other formats (e.g. an index for Dash).

ix01.html is the index that is generated by DocBook. Unfortunately it seems like it does not generate a machine-readable index representation that we could parse.

DocBook builds the index by looking for <indexterm> tags in the source XML files. They look like this:

<indexterm><primary>dyad_census</primary></indexterm>

(There could be a secondary part, but we don’t use it yet as far as I know.)

Basically, you could either take all the XML files that we use as DocBook sources and use some simple Python script to collect all the <indexterm> tags (or just use grep for a crude solution), or you could implement an alternative DocBook XSL stylesheet that collects the <indexterm> tags and dumps them somewhere. I believe the former is the easier option. You can also try parsing ix01.html with a lenient HTML parser like BeautifulSoup in Python.

Or, we could hack doxrox.py further to create a list of index terms for us.
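
For the first option mentioned above (a simple Python script that collects the <indexterm> tags), something along these lines might be a starting point. This is only a rough sketch: the function names `collect_index_terms` and `collect_from_directory` are made up for illustration, it assumes the DocBook sources are `*.xml` files under a single directory, and it uses a regular expression instead of a real XML parser, so it does not resolve entities or strip markup nested inside <primary>.

```python
import re
from pathlib import Path

# Crude pattern for <indexterm><primary>...</primary>...</indexterm>.
# Only the <primary> part is captured; secondary terms are ignored.
INDEXTERM_RE = re.compile(
    r"<indexterm\b[^>]*>\s*<primary\b[^>]*>(.*?)</primary>",
    re.DOTALL,
)


def collect_index_terms(text):
    """Return the sorted, de-duplicated primary index terms found in text."""
    # Collapse internal whitespace so terms wrapped across lines compare equal.
    terms = {" ".join(match.split()) for match in INDEXTERM_RE.findall(text)}
    terms.discard("")
    return sorted(terms)


def collect_from_directory(source_dir):
    """Map each XML file under source_dir (hypothetical layout) to its terms."""
    result = {}
    for path in sorted(Path(source_dir).rglob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        terms = collect_index_terms(text)
        if terms:
            result[str(path)] = terms
    return result


# Example using the tag quoted earlier in this thread.
sample = "<indexterm><primary>dyad_census</primary></indexterm>"
print(collect_index_terms(sample))  # prints ['dyad_census']
```

Calling `collect_from_directory` on the directory that holds the generated DocBook XML would give a file-to-terms mapping, which could then be dumped as JSON or converted into whatever format a Dash index or a search backend expects.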