write failed for graphml

I am trying to write a graph with 22m vertices and 70m edges to a file from Python:

g.write_graphmlz('bigGraph.graphml')

but I keep getting this error:

Error at src/io/graphml.c:1767: Write failed. -- File operation error

If I remove a large part of the graph, the write succeeds, so I fear the problem is the size of the graph.
Is there a workaround I can apply on the machine? I am working on Ubuntu and have plenty of RAM (64 GB) and disk space.
Do you know the most compact way to write a huge graph to a file?

What is your python-igraph version and where exactly did you obtain python-igraph?

Have you tried write_graphml() instead, and did it work?

Does your graph have any attributes?

Can you produce a self-contained minimal example (ideally with a generated graph)?

@tamas, can you have a look at this? Here’s line 1767 in graphml.c in current master:

There’s a chance that OP has a slightly different C/igraph version. Does python-igraph have any means to retrieve the precise C/igraph version? It would be very useful to add this.

I’m not sure how the error would be coming from that line, instead of from inside of igraph_i_xml_escape(). igraph_i_xml_escape() can fail either through out-of-memory (unlikely) or by finding a “forbidden control character”. Is it possible that this happens, but then in python-igraph the earlier error somehow gets overwritten by the later one?

@trekianov Does your graph have any string attributes, and if yes, is it possible that some of them contain characters with character code between 0…31 (excluding newline)?

I am using a standard Spyder installation and installed igraph through pip.
I am now using picklez with no problem, even though the resulting file is almost 1 GB.
My graph has only string names and integer amounts, nothing more. Could you please tell which are the characters between 0 and 31?
I cannot reproduce the problem on a smaller graph. I suppose it is tied to the size of the network, since if I reduce the graph size everything works fine with graphmlz as well.

Does python-igraph have any means to retrieve the precise C/igraph version?

from igraph import __igraph_version__
print(__igraph_version__)

Unfortunately this only prints the output of igraph_version(&version, 0, 0, 0); from the C layer and not the exact commit hash. In the case of the OP, he is probably on this commit if he is using the latest released version of igraph from pip:

But indeed, it is not supposed to return Write failed if igraph_i_xml_escape() fails so this is weird.

@trekianov If you can save your graph in picklez format, would it be possible to upload it somewhere so I can download it and try to replicate the issue on my end?

could you please tell which are the characters between 0 and 31?

These are all nonprintable control characters; see the ASCII table. They are not allowed in XML files, except for tabs, line feeds (newlines) and carriage returns.
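To rule out bad characters in a large graph, the string attributes can be scanned before writing. The following is a minimal sketch in plain Python. The helper name `find_forbidden_control_chars` is made up for illustration, and it assumes the check that matters is the XML 1.0 rule described above (codes below 32 are forbidden except tab, line feed and carriage return).

```python
# Characters below 0x20 that XML 1.0 still permits:
# tab, line feed, carriage return.
ALLOWED_CONTROL = {"\t", "\n", "\r"}


def find_forbidden_control_chars(values):
    """Return (index, position, code) for each forbidden control character.

    `index` is the position of the value in `values`, `position` is the
    offset of the offending character inside that value.
    """
    hits = []
    for index, value in enumerate(values):
        for position, char in enumerate(str(value)):
            if ord(char) < 32 and char not in ALLOWED_CONTROL:
                hits.append((index, position, ord(char)))
    return hits


# Hypothetical sample data; the second entry contains 0x1F.
names = ["1A1zP1", "bad\x1fname", "tab\tis fine"]
print(find_forbidden_control_chars(names))
```

With a real graph, the list to check would be something like `g.vs["name"]`; an empty result means no string in that attribute should trip the control character check.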

If it is indeed a bad character, it should report a proper error. Could it really be an out-of-memory? I still don’t understand why it doesn’t report a proper error.

IPython transcript illustrating bad characters:

In [5]: import igraph as ig

In [6]: g=ig.Graph([(0,1),(1,2)])

In [7]: g.vs['name'] = [chr(32), chr(31), chr(30)]

In [8]: g.write_graphml('/Users/szhorvat/foof.xml')
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
<ipython-input-8-3a1e581b8ed4> in <module>
----> 1 g.write_graphml('/Users/szhorvat/foof.xml')

InternalError: Error at src/io/graphml.c:1281: Forbidden control character 0x1F found in igraph_i_xml_escape. -- Invalid value

You did not say which version you are using, but I expect it was earlier than 0.9.9 (the current one) and the error is coming from this line (note that this link is to C/igraph 0.9.5, not 0.9.6):

It’s unclear why fprintf might fail. Exceeding a file size limit? Out of disk space?

I am pretty sure there are no weird characters inside; they are wallet addresses and amounts.
This is the version:
0.9.4-24-gb29e741ea
About the picklez, it’s almost 1 GB, so I fear I cannot share it.
About the file size and running out of disk space, I have 5 GB free on / (for tmp files), 15 GB in my home and 64 GB of RAM.

Okay, so this is the line to look at in the source code. It fails with a single fprintf() call so I think that the problem is that your /tmp partition becomes full during the process. It does not matter that you have a lot more space available in /home. When you try to export into GraphMLz, igraph writes a full GraphML file to the disk in a temporary folder and then it compresses it with zlib after the file has fully been written to the disk. (This is because the zlib compression is not implemented in the C core but in the Python-C glue code).

So, either you need a larger tmp partition, or you need to save the graph as GraphML into your /home folder and then compress it with zlib later.
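The second option can be sketched with the standard library alone. The helpers `free_gib` and `gzip_file` are hypothetical names, the paths in the comments are placeholders, and the sketch assumes a gzip-compressed GraphML file is an acceptable result. It first reports how much room the temporary directory has, then compresses an existing file in chunks so that it never has to fit in memory.

```python
import gzip
import shutil
import tempfile


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def gzip_file(src, dst):
    """Compress `src` into `dst` in chunks, without loading it into memory."""
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


# Show how much room the directory used for temporary files has.
tmp_dir = tempfile.gettempdir()
print(f"{tmp_dir}: {free_gib(tmp_dir):.1f} GiB free")

# Hypothetical usage with the graph from the question:
# g.write_graphml("/home/user/bigGraph.graphml")
# gzip_file("/home/user/bigGraph.graphml", "/home/user/bigGraph.graphml.gz")
```

The uncompressed GraphML file still has to fit on the target partition, so it is worth comparing its expected size with the free space there before starting.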

Also, is your /tmp folder backed by a “real” filesystem or is it backed by tmpfs (which is essentially an in-memory filesystem)?
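On Linux, one way to answer this is to look up the mount that contains /tmp in /proc/mounts. The sketch below is illustrative only: `filesystem_type` is a hypothetical helper, and it assumes the usual /proc/mounts layout of whitespace-separated fields with the mount point second and the filesystem type third.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount point that contains `path`.

    `mounts_text` is the content of /proc/mounts. The longest matching
    mount point wins; on ties, the later (more recent) mount wins.
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with trailing slashes so "/tmp" does not match "/tmpfoo".
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


if os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        # Resolve symlinks first so the lookup uses the real location.
        print(filesystem_type(os.path.realpath("/tmp"), f.read()))
```

If this prints `tmpfs`, temporary files are kept in memory (and swap), so the space available to them is limited by the size of that tmpfs mount rather than by the disk.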

Thanks for the precious help, I will definitely use GraphML