ABI compatibility

We agreed that we maintain at least backward API compatibility in the 0.8.x series. What about ABI compatibility? Consider this:

Simply adding a new error code technically breaks ABI compatibility, even if other codes are not renamed. In other words, other programs depending on igraph will need to be recompiled. If they are not, weird things may happen (such as a crash when this error is generated).

Breaking ABI compatibility is the kind of thing that will drive distro maintainers mad.

On the other hand, we are a small project with limited manpower and dealing with such stuff is not always the best use of resources.

@vtraag @tamas Any opinions on this? We can simply leave such changes to 0.9, but then some crash bugs are not fixed for 0.8.

If I understand correctly, “adding an error code” means adding a new member to an enum (igraph_error_t). In the strict sense, this breaks the ABI because the contract is that we never return values outside the range of the enum, so third-party code that uses our code should never have to anticipate an error code that it has not seen before. If we add a new error code, we suddenly extend the “domain” of the error codes, and other applications using our code have to be prepared for the new error code.

On the other hand, I believe that the standard way of using igraph’s error codes is to check whether the returned code is zero or non-zero; if it’s non-zero, it should be passed back to igraph anyway to get a human-readable error message. Considering this, plus the limited manpower of the project, I would not consider adding a new error code as breaking the ABI, at least not in the 0.x days.

OK, sounds good.

Do you have any opinion on changing what numeric code is assigned for each enum value?

To be specific, now we have this:

When I add a new IGRAPH_GLP_... value, I can

  • either number it 51, and shift the numbers of the remaining ones,
  • or number it 61, as we currently have 60 codes and want to leave those 60 alone.

I’d rather add it at the end even if it makes the numbering less logical. Renumbering existing error codes seems to be a much more severe ABI breakage to me than simply adding a new one and not changing the existing ones.

I think I agree.

In principle, what you suggest, to respect ABI instead of API compatibility, makes sense @szhorvat. Strictly speaking, semantic versioning only mentions the API, not the ABI. From that point of view it might not be necessary.

However, I can understand that it would be preferable if ABI compatibility were ensured within the same minor version (or for 1.x, the same major version). I just keep wondering: how well-defined is ABI compatibility, and is it something that you can guarantee? Could we set up an automatic hook on GitHub that indicates API and/or ABI compatibility? Is ABI compatibility only definable w.r.t. a specific compiler and architecture? I can’t seem to find (easy) answers to these questions, perhaps you can enlighten me a bit?

My knowledge is quite limited in this area. I’ll try to answer, but keep that in mind when you read the below.

Basically, the idea is that if a Linux distro (or whatever package manager like MacPorts or Anaconda) includes package A that depends on library B version 1.2, then it should be safe to update only package B (the binaries) from e.g. version 1.2.3 to 1.2.4 without having to recompile package A (and all other packages that depend on B).

ABI compatibility, or binary compatibility, means that you can upgrade or downgrade a binary package without breaking its dependents.

API compatibility, or source compatibility, means that A should compile with any 1.2.x version of B, but not that an already compiled A can link to any already compiled version of B.

As we have seen in more than one GitHub issue, distro maintainers don’t like duplication. They prefer packages not to embed libraries, but to link to them dynamically. Unless I misremember, there were also suggestions that python-igraph should link dynamically, not statically, to igraph in conda? If it did that, binary compatibility would become a serious concern. That’s one good reason to just link statically.

Some projects explicitly guarantee binary compatibility and consider any breakage a bug. E.g., here’s the statement by Qt: Qt-Version-Compatibility - Qt Wiki

Some don’t. See trouble here: FFmpeg ABI compatibility - possibility to use/compile mpv trying to not check for ffmpeg compatibility? · Issue #5187 · mpv-player/mpv · GitHub

This is not easy to test as breakages tend to be subtle. In my opinion, the fact that our tests pass is rather weak evidence that nothing broke when it comes to binary compatibility. Just think about the Python situation on Windows: you’re supposed to use a specific compiler for each Python version. If you don’t, things will probably work just fine … unless you get very unlucky and hit on that special edge case.

Considering the effort, I don’t think it’s worth bothering with it. We’re not Qt and don’t have their resources. Also, I’d hope to be able to contribute to igraph primarily as a scientist who programs (which I consider myself to be), not as a programmer who knows something about network science and graph theory :slight_smile:

tl;dr No, in most cases you do not need to know what compiler will be used to be able to maintain reasonable binary compatibility. You just need to avoid certain kinds of changes in the source code.

Long version:

Compilers, and even compiler versions, are not always ABI-compatible. Something compiled on a newer Linux (new gcc) won’t always run on an older Linux (old gcc). Right now I compile IGraph/M on Ubuntu 16.04 with -D_GLIBCXX_USE_CXX11_ABI=0 to ensure compatibility with some older systems such as RHEL 7 … But I’ll admit that I’m really doing cargo cult programming here …

This is one dimension of ABI compatibility: two libraries must be compiled with the same (or explicitly compatible) compilers if they are to be linked together. This is one unpleasant aspect of creating binaries that work for everyone on all OS versions … But this is not the issue we’re discussing here.

The question now is: if we make a change to the source code, can that break ABI compatibility without breaking API compatibility, provided that the compiler didn’t change? Yes, a simple example is if the integer codes assigned to enum values change. The API is the same, as we use names in the source code. But the underlying numbers have changed. Another example would be if we add a new field to a struct. The old fields are still there, old source code is compatible. But the offset where this field is stored has now changed: the underlying representation is different.

I guess what you are asking here is: is there a kind of source change that breaks ABI compatibility only when using compiler X but not when using compiler Y? I’m sure the answer is yes, but this goes beyond my knowledge and I cannot give a realistic example :slight_smile: I do recall that in the previous version of Raspbian I kept getting warnings like this, which I think indicate a similar situation: Advance warning during compilation with GCC 6.3.0 · Issue #1323 · catchorg/Catch2 · GitHub (I seem to recall reading that the issue was present only on ARM, so in that sense yes, it is w.r.t a specific architecture).

Thank you for the extensive answer @szhorvat!

This is the essential part I think. I see in the linked Qt guidelines a reference to the type of things that you can(not) do to maintain binary compatibility: https://community.kde.org/Policies/Binary_Compatibility_Examples. Most of these unfortunately do not apply to C at all, only to C++. I don’t see any easy overview of the things that you should avoid in order not to break the ABI. I was sort of surprised that adding a constant to an enum would break the ABI, hence the question of what you can and cannot do. Do you have a reference of some sort?

Reading further from the KDE link (arriving at Policies/Binary Compatibility Issues With C++ - KDE Community Wiki) it indeed seems that not all constraints apply to all compilers equally. This is for C++ though, so perhaps the limitations for C are more broadly shared. Nonetheless, MSVC has its quirks with supporting C standards, so I would not be surprised if we broke the ABI on Windows unknowingly. Perhaps it is also quite challenging to guarantee cross-platform binary compatibility, or at least more challenging than when only supporting gcc, for example.

Regarding automatic testing, I see that ABI Compliance Checker could potentially provide a way to test whether binary compatibility is preserved. We could (in principle) check the resulting binary against each previously built binary in the CI.

All in all, I think it might indeed be difficult to promise or guarantee ABI compatibility. We can guarantee API compatibility (or at least intend to live up to this), but we do not do the same for ABI compatibility. We can say that we will try to maintain ABI compatibility, but that we explicitly cannot guarantee it.