haberman 6 years ago

There is a lot of detail here, but the higher-level takeaway is this: the Python C API is a very "wide" API. It exposes lots of details of CPython: the good, the bad, and the ugly. You get direct access to pointers to the underlying objects. You manually manage reference counts.
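
To make "manual reference counts" concrete, here's a minimal sketch of everyday C-API code (standard Python.h calls; error checking omitted):

    PyObject *list = PyList_New(0);        /* new reference, refcount 1 */
    PyObject *num  = PyLong_FromLong(42);  /* another new reference */
    PyList_Append(list, num);              /* does NOT steal the reference */
    Py_DECREF(num);                        /* so we must drop ours by hand */
    Py_DECREF(list);                       /* forget one of these and you leak */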

For the polar opposite of this, consider the Lua API. You don't get pointers to any VM data structures except a single pointer to the "Lua state". You do not perform any manual memory management.
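
For illustration, here's roughly what calling a (hypothetical) global Lua function f looks like: everything goes through the state pointer and a virtual stack, never through raw object pointers:

    lua_State *L = luaL_newstate();  /* the only VM pointer you ever hold */
    lua_getglobal(L, "f");           /* push the function onto the stack */
    lua_pushnumber(L, 42);           /* push its argument */
    lua_call(L, 1, 1);               /* 1 argument, 1 result */
    double r = lua_tonumber(L, -1);  /* read the result off the stack... */
    lua_pop(L, 1);                   /* ...and pop it; no refcounting needed */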

Lua's approach has yielded amazing results. LuaJIT is not only source-compatible with pre-existing Lua 5.1 extensions, it is binary compatible with them. You can take a .so that you built before LuaJIT ever existed and it will work with LuaJIT without recompiling. This is astounding to me.

Moral of the story: keep your interfaces as narrow and encapsulated as possible!

  • Pxtl 6 years ago

    Yup. People always complain about languages with a "reference implementation" instead of a spec, but because the CPython interpreter exposes so much of its guts through the API, it basically means that Python as a platform is defined by a "reference implementation" instead of a spec.

  • typomatic 6 years ago

    Worth noting that the PyPy article links to a whole site devoted to the troubles and rehabilitation of the CPython API: https://pythoncapi.readthedocs.io/ . It's really exciting to see these backend concerns being addressed with depth and care, especially against the backdrop of syntactic sugar PEPs blowing up uglily.

    • antt 6 years ago

      PyPy is one of the few projects that are doing good work, communicating their work well and making the right strategic decisions. At this rate I won't be surprised if Python 4 is just PyPy.

      • sitkack 6 years ago

        If the PyPy team made something that was compatible with both Python 2 and Python 3, that Python would be the Python.

    • quotemstr 6 years ago

      Thanks. That's a good link. The proposal is a good start, although a bit lacking on the concrete design for a new API. I'm not sure there's a big need for innovation here though: JNI basically got it right. It provides full isolation from VM implementation details (especially of memory management) while not sacrificing performance or completeness. JNI has a bad reputation for some reason I've never fully understood, but it seems like an elegant approach to me.
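
      To sketch what I mean (class and method names made up), a JNI native method only ever sees opaque handles and talks to the VM through the JNIEnv function table:

          #include <jni.h>

          /* jstring/jobject are opaque handles; the VM stays free to move
             the underlying objects, so even a copying GC is no problem */
          JNIEXPORT jstring JNICALL
          Java_Example_greet(JNIEnv *env, jobject self)
          {
              jstring s = (*env)->NewStringUTF(env, "hello");  /* local ref */
              /* local refs die when the native call returns; NewGlobalRef /
                 DeleteGlobalRef manage anything that must outlive the call */
              return s;
          }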

  • dasyatidprime 6 years ago

    … and then LuaJIT converts started saying “don't ever use the traditional C API, it's so slow and indirect, and the FFI is so easy”, and then anyone not running LuaJIT lost access to a bunch of libraries whose authors stopped caring about the old ways.

  • wahern 6 years ago

    I think the central issue is that PyPy uses a copying GC, which means pointers to objects aren't stable by default. The article implies that this is a consequence of having a generational GC, but you can have a non-copying generational GC, which Lua 5.2 and 5.4 actually have.[1]

    [1] Lua 5.3 dropped the generational GC because real-world results didn't justify the complexity, but apparently for 5.4 they came up with a better design.

    • quotemstr 6 years ago

      Object stability precludes bump-pointer allocation of short-lived temporaries, though, and in a very allocation-y language like Python, that's pretty important.
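
      The whole point of a bump allocator is that allocation degenerates to a pointer increment, with survivors evacuated later. A minimal sketch:

          #include <stddef.h>

          static char nursery[1 << 20];
          static char *bump = nursery;

          void *nursery_alloc(size_t n)
          {
              if (bump + n > nursery + sizeof nursery)
                  return NULL;  /* nursery full: minor GC, evacuate survivors */
              void *p = bump;
              bump += n;        /* the allocation is just this increment */
              return p;         /* p moves at the next collection, so it can
                                   never be handed out as a stable PyObject* */
          }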

      • weberc2 6 years ago

        I wonder if escape analysis could compensate as it does with Go.

  • Boxxed 6 years ago

    I don't even think it should be considered an API. The Python "API" is just all the interpreter internals.

    • int_19h 6 years ago

      The more accurate term would be "object model with an ABI", in the same sense that COM or GLib is an object model.

      So CPython is actually several things layered on top of each other.

      - PyObject-based object model; this includes PyObject, PyTypeObject, PyUnicodeObject... and I think that's it? This is the equivalent of COM IUnknown. It doesn't actually know anything about Python proper, but it defines the operations in terms of which the language itself will later be defined (like the idea that objects have a refcount-centric lifetime, and have attributes, and operations like "call" and "add", etc). There's a minimal struct sketch at the end of this comment.

      - A bunch of standard data structures built on top of that, like PyLongObject and PyListObject. This is just a pure extension of the above - again, defining more terms in which the language is defined, like what happens when you add two ints.

      - A bunch of specialized data classes which store Python bytecode and provide the framework for its execution, like PyCodeObject, PyFrameObject and PyFunctionObject. Note that these don't know anything about how the bytecode is produced, nor about how to actually execute it. But they do know about things like local variables (so PyCodeObject will store the list of locals, and PyFrameObject will allocate space for them), so Python-the-language starts creeping in here.

      - The parser which produces AST (which is itself a bunch of Python objects), and the bytecode compiler that produces PyCodeObjects out of that AST.

      - And finally, the actual interpreter, that ties it all together by providing semantics for the bytecode contained in PyCodeObjects.

      This layering is even visible in the Python source itself (https://github.com/python/cpython/tree/v3.7.0) - the first three things live under ./Objects in the source tree, and the parser and the interpreter are under ./Python (with some bits under ./Grammar and ./Parser). So, roughly speaking, ./Objects is the object model, and ./Python is the language proper. The headers are interdependent, unfortunately, but it's not that hard to break them apart if anyone cared to.
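
      For reference, that first layer is literally just a small header that every object starts with, simplified from CPython's Include/object.h:

          typedef struct _object {
              Py_ssize_t ob_refcnt;        /* the refcount-centric lifetime */
              struct _typeobject *ob_type; /* attributes, "call", "add", ... */
          } PyObject;

      Everything else - PyLongObject, PyCodeObject, and every extension type - begins with this same header, which is exactly why extension modules end up coupled to it.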

  • dtech 6 years ago

    It's a common problem because it is so tempting: exposing the compiler/runtime internals enables very powerful things for minimal compiler-developer effort.

    • nicolaslem 6 years ago

      This reminds me of the case of Firefox: extensions could access almost everything internally, which made them very powerful but ultimately made refactoring very hard, hence the switch to WebExtensions.

  • kbumsik 6 years ago

    I'd never heard of Lua's approach, but it sounds interesting. What about the overhead between Lua and native code? I think the Python C API is one of the reasons Python got popular: it enables the "prototype in Python, then rewrite the hot parts in C" approach. That approach is sometimes not viable on other platforms such as Node.js, because calling a native module from JS is quite costly.

    • dividuum 6 years ago

      The overhead is really minimal. I happened to have an old program around that runs on LuaJIT or Lua 5.1 and calls two functions that each increment a C int, one through the FFI and one as an old-style exported function. Through the FFI, 500 million calls take 0.7 seconds. As an old-style exported function, the same 500 million calls take 5.4 seconds. Disabling the JIT increases the FFI time to 32.7s and decreases the old-style time to 4.7s. On plain Lua 5.1, the calls take 6.7s. This is in no way a representative benchmark, but as you can see: Lua is pretty fast.
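
      For anyone wondering why the gap is so large: the FFI calls a plain C function directly, while the classic API marshals every call through the Lua stack. Roughly (not my actual benchmark code):

          /* plain C: the LuaJIT FFI compiles this into a direct call */
          static int counter;
          void bump(void) { counter++; }

          /* classic Lua C API: every call goes through the Lua stack */
          static int l_bump(lua_State *L)
          {
              (void)L;   /* no arguments to read off the stack here */
              counter++;
              return 0;  /* number of results pushed */
          }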

klibertp 6 years ago

> for a total 2-3 person-months of work.

In a year. For a project that could put Python next to JS by removing the last pain point that prevents it.

The community of Python - one of the top 10 most popular languages - and all its industrial users, including some of the most successful companies on the planet, can afford to put three person-months of work into that feature.

There has to be something else at play here that I'm missing. Well, other than missing that "donate" button for a tad too long...

  • rtpg 6 years ago

    There are very few commercial sponsors for Python and related projects. Django has enough resources for basically one full-time paid contributor, but that's it. Huge amounts of Python infrastructure are maintained in people's free time, despite powering so much.

  • ma2rten 6 years ago

    > Most of the work was done during two sprints, for a total 2-3 person-months of work.

    I think what they are saying is just: we got a lot of work done on this during those two sprints. It's not a statement about the work that remains.

quotemstr 6 years ago

Right. It's exactly this sort of coupling between the runtime and extension modules that prompted me to adopt a very conservative design for the Emacs extension module interface, which tries as hard as possible to isolate extension modules and abstract over VM implementation details.

I didn't quite get as far as I wanted, since the module system still relies on conservative stack scanning to find C-extension GC roots (because everyone else wasn't sold on JNI-like explicit local references), but it's still much more tightly specified than the Python API.

The Python extension API has another problem: it relies on FILE* and other assorted bits of the C runtime. That's mostly okay on POSIX-y systems where it's common for a whole process to share one C runtime, but on Windows, where different modules can come with different C runtime versions, this kind of leaking of objects across an interface boundary really hurts.
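
PyObject_Print is the classic example; the API hands a FILE* straight across the module boundary (obj here being any PyObject*):

    /* the FILE* must belong to the same C runtime as the interpreter,
       or this can crash on Windows when modules ship different CRTs */
    PyObject_Print(obj, stdout, 0);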

robmccoll 6 years ago

It seems like it would be easier and more performant (begin heresy and likely vast oversimplification) for PyPy to have a mostly complete implementation of CPython that it interacts with at the object level, such that each object in PyPy could be either a PyPy GC object descended from W_Root or a CPython refcounted object implemented on a PyObject*. There would be many points where you would still have to convert to make operations work. Either way, running under the assumption that "objects from CPython, or objects that are passed to CPython, are more likely to interact with other CPython objects, so let's not convert them back until we have to" might result in closer-to-CPython performance for code with a lot of CPython extensions.

Does this seem reasonable? Is this even possible? I don't know much about the internals of PyPy...

  • antt 6 years ago

    PyPy in general has much better performance than CPython.

faragon 6 years ago

By contrast, calling C from Python, even using callbacks, is very easy and incredibly useful, thanks to the "ctypes" module [1].

[1] https://docs.python.org/2/library/ctypes.html
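
The C side stays completely ordinary. For example, a callback-taking function like this (hypothetical mylib.c) can be driven from Python with ctypes.CDLL and a ctypes.CFUNCTYPE-wrapped Python function:

    /* mylib.c - build with: cc -shared -fPIC mylib.c -o mylib.so */
    typedef int (*cb_t)(int);

    int apply(cb_t cb, int x)
    {
        return cb(x);  /* calls straight back into the Python callback */
    }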

  • simias 6 years ago

    True, but that's not particularly impressive; most non-sandboxed scripting languages have a straightforward way to call C. It's just a lot simpler in that direction: C doesn't have much of a runtime, so you don't have to worry about corrupting C's state too much. There's no garbage collector in C, for instance.

  • mattip 6 years ago

    Better to use cffi instead; it will be much more performant on PyPy.

mattip 6 years ago

It would be nice if funding came through to make C extensions fast on PyPy; then we could try out many of the what-if scenarios described in other comments:

What if we wrote that in pure Python instead? What if we moved the computation to a GPU C extension? What if we used a different GC strategy?

sitkack 6 years ago

I have been saying this for 8+ years, maybe longer. The CPython "Python.h" should be viewed as deprecated, and folks should be using cffi [1] for all of their native code gluing needs. By targeting cffi, one gets future-proofed multi-runtime extensions for free. "Python.h" ties one to a specific implementation.

[1] https://cffi.readthedocs.io/en/latest/
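
To make the contrast concrete, here's a toy example of what each approach asks of your C code:

    /* Python.h style: written against CPython's ABI, so the .so is
       tied to one implementation (and one ABI version) */
    static PyObject *py_add(PyObject *self, PyObject *args)
    {
        long a, b;
        if (!PyArg_ParseTuple(args, "ll", &a, &b))
            return NULL;
        return PyLong_FromLong(a + b);
    }

    /* cffi style: just plain C; the binding lives on the Python side,
       so any runtime that speaks cffi can load it */
    long add(long a, long b) { return a + b; }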

bratao 6 years ago

I started making PyPy the default interpreter for my projects, and you should too!

Maybe I'm a sucker for the underdog, but in my mind, PyPy could save the Python ecosystem from irrelevance. People are looking at Rust and Go with the excuse of performance, and those are now the new hip languages. Even Ruby is catching up.

Without an answer, Python could be side-tracked in an increasing number of scenarios.

Pxtl 6 years ago

Seriously, this surface area should've been closed off when they broke backwards-compatibility with the Python 2 to Python 3 jump.

aportnoy 6 years ago

Can anyone point to an introduction to CPython internals? My goal is to write a Python extension in C that manipulates Python objects.
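
For reference, the minimal skeleton I'm starting from is roughly this (standard Python.h boilerplate, module name made up):

    #include <Python.h>

    static PyObject *hello(PyObject *self, PyObject *args)
    {
        return PyUnicode_FromString("hello");
    }

    static PyMethodDef methods[] = {
        {"hello", hello, METH_NOARGS, "Return a greeting."},
        {NULL, NULL, 0, NULL}
    };

    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT, "demo", NULL, -1, methods
    };

    PyMODINIT_FUNC PyInit_demo(void)
    {
        return PyModule_Create(&moduledef);
    }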

  • Too 6 years ago

    Try Boost.Python, SWIG, or Cython instead. They give you a nicer C++ API and automatically handle ref-counting, exception handling, and other things that are easy to forget.

    https://www.boost.org/doc/libs/1_61_0/libs/python/doc/html/t...

    • eesmith 6 years ago

      How is the Boost/PyPy integration? Sure, there's cpyext, so in principle a Boost-generated extension can be used by PyPy.

      Are there any changes which would make Boost-based extensions better integrated/supported by PyPy?

      The linked-to document only talks about Cython and cffi.

      • mattip 6 years ago

        There is cppyy, which is like cffi for C++. Boost.Python is not being maintained; in the CPython world pybind11 is more popular, but cppyy is PyPy-friendly.

        • eesmith 6 years ago

          Boost.Python isn't being maintained? I did not know that. One of the projects I use often - F/OSS, but I'm a user, not a core developer - uses Boost for C++/Python integration. They chose it many years ago. This topic has never come up on the mailing list or at user group meetings. I suspect that since Boost "just works", no one has cared to re-evaluate that decision.

          Thanks for the pointers to what's going on in the C++/python integration layer. I'll experiment with it.

  • pvg 6 years ago

    Depending on what you're trying to do, Cython might be a slightly easier option as well.

ezoe 6 years ago

For the healthy improvement of a programming language, it is vital to have multiple independent and competing implementations. Python doesn't allow that.

Because of this, I think Python will follow the same fate as Perl.