I am writing to report an issue with the read_graph function from the igraph package in R.
For over a year, I have been using the read_graph function to read the empirical networks that I use in my research project, and it has always worked smoothly. However, after the latest update of R (from version 4.3.2 to version 4.3.3), I am having trouble importing my networks into R.
Here is an example of the lines of my code that worked perfectly:
# list.files() expects a regular expression, not a shell glob
filenames <- list.files(pattern = "\\.graphml$")
redes <- lapply(filenames, read_graph, format = "graphml")
Currently, when trying to run the above lines, problems like this arise:
Error in read.graph.graphml(file, …) : At vendor/cigraph/src/io/graphml.c:790 : Duplicate attribute name found: ‘Year cubs born’ (for ). Parse error
The errors persist if I try to read some of my networks one by one, using their file names. Given that I had no problem reading these networks with the read_graph function before, I don’t know what is happening or how to resolve it.
The files you are trying to read are invalid. R/igraph 2.0 is more careful when reading files, and will reject some invalid files that previous versions accepted.
Note that if there are duplicate attributes in the file, you might not have been getting what you expected with earlier versions! Most likely, earlier values were overwritten by later ones.
If you post an example of one of these corrupted files, we might be able to help you fix them.
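To see which files in a batch are rejected, and with what message, one option is to wrap each read in tryCatch. This is only a minimal sketch, not part of igraph: try_read_all and its reader argument are hypothetical names, and the default reader assumes the files are GraphML.

```r
# Hypothetical helper: read every file and keep either the graph or the
# error, so that one bad file does not abort the whole batch.
try_read_all <- function(paths,
                         reader = function(f) igraph::read_graph(f, format = "graphml")) {
  results <- lapply(paths, function(f) {
    tryCatch(reader(f), error = function(e) e)
  })
  names(results) <- paths

  # TRUE for every file whose read raised an error
  failed <- vapply(results, inherits, logical(1), what = "error")

  list(
    graphs = results[!failed],
    errors = vapply(results[failed], conditionMessage, character(1))
  )
}

# Usage (assumes the .graphml files are in the working directory):
# res <- try_read_all(list.files(pattern = "\\.graphml$"))
# res$errors   # named character vector: file name -> error message
```

The files listed in res$errors are the ones worth inspecting or posting.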
I do not recommend downgrading R or igraph. igraph 2.0 has a large number of bug fixes. When something that worked in a previous version seems to stop working in newer versions, it’s likely because you were (inadvertently) doing something wrong. Later igraph versions are stricter and try to do better in alerting you to mistakes that might lead to wrong research results.
Notice that there are two different attributes (d4 and d3) with the same name ("Year cubs born"). If you rename one of these, the file will be read correctly. For example, change "Year cubs born" to "Year cubs born 2" on line 4.
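If you have many files, you can look for such clashes programmatically instead of opening each file. The following is a rough sketch that assumes the xml2 package is installed; find_duplicate_keys is a hypothetical helper, not an igraph function. It lists the GraphML key declarations whose attr.name occurs more than once for the same domain (the "for" attribute).

```r
library(xml2)

# Hypothetical helper: return the <key> declarations whose attr.name is
# repeated within the same domain ("for" = node, edge, graph, ...).
find_duplicate_keys <- function(path) {
  doc  <- read_xml(path)
  # local-name() sidesteps the default GraphML namespace
  keys <- xml_find_all(doc, "//*[local-name()='key']")

  info <- data.frame(
    id     = xml_attr(keys, "id"),
    domain = xml_attr(keys, "for"),
    name   = xml_attr(keys, "attr.name"),
    stringsAsFactors = FALSE
  )

  # Two keys clash when both the domain and the name are identical
  grp <- paste(info$domain, info$name, sep = "\r")
  dup <- (duplicated(grp) | duplicated(grp, fromLast = TRUE)) & !is.na(info$name)
  info[dup, , drop = FALSE]
}

# Usage: one data frame of clashing keys per file
# lapply(list.files(pattern = "\\.graphml$"), find_duplicate_keys)
```

A file with no clashes yields a data frame with zero rows.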
I looked at the GraphML spec but couldn’t decide if this was legal. Gephi and yEd both import this file without problems (yEd crashes on exit, unclear if it’s because of this). Obviously, igraph can’t import such a file as long as it uses the GraphML attribute name as the igraph attribute name. We could do it if we used the GraphML attribute ID as the igraph attribute name, but IDs are typically auto-generated and not meaningful to humans (just like in this file).
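A middle ground that can be applied to the files themselves is to fold the key ID into the duplicated names only, so that every name becomes unique but stays readable. Here is a sketch of that idea, again assuming the xml2 package; disambiguate_keys is a hypothetical helper and the "name (id)" scheme is an arbitrary choice.

```r
library(xml2)

# Hypothetical helper: append the key id to every attr.name that is
# duplicated within its domain, then write the result to `out`.
disambiguate_keys <- function(path, out) {
  doc  <- read_xml(path)
  keys <- xml_find_all(doc, "//*[local-name()='key']")

  name <- xml_attr(keys, "attr.name")
  grp  <- paste(xml_attr(keys, "for"), name, sep = "\r")
  dup  <- (duplicated(grp) | duplicated(grp, fromLast = TRUE)) & !is.na(name)

  # xml2 nodes are modified in place, so the change lands in `doc`
  for (i in which(dup)) {
    new_name <- paste0(name[i], " (", xml_attr(keys[[i]], "id"), ")")
    xml_set_attr(keys[[i]], "attr.name", new_name)
  }

  write_xml(doc, out)
  invisible(out)
}

# Usage: write a repaired copy and leave the original file untouched
# disambiguate_keys("network.graphml", "network_fixed.graphml")
```

The repaired copy should then load with read_graph, with one igraph attribute per original GraphML key.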
NetworkX imports the file incorrectly: it overwrites earlier "Year cubs born" attribute values with later ones.
graph-tool fails with a cryptic error message: “RuntimeError: boost::bad_any_cast: failed conversion using boost::any_cast”.
I’d say that even if it is technically legal to have duplicate attribute names (which is unclear), such files don’t map well to the data model of most libraries. They might fit GUIs like Gephi and yEd, but look confusing even there.