Hello Python, this is C++ calling!

Oliver Ebert

2019-03-18 00:26

In this blog post I'll give a quick rundown on how to embed the CPython interpreter in a C++ application and call a Python function from C++, with C++ arguments, all using Pybind11 on Linux. The HTML for this post was generated from an Org mode document that contains noweb-style literate programs. I got inspired to try that out after reading Linux containers in 500 lines of code by Lizzie Dixon. We'll see how it goes.

At work we develop high-end test and measurement equipment, and among other things I'm responsible for the software interface layer to our FPGA-based data acquisition subsystem. When our hardware engineers are developing a new feature, or when we need to debug an issue reported by CI (black-/gray-box tests), QA, or occasionally even customers, our C++ hardware abstraction layer every so often needs temporary tweaks so that they can drill down and identify the root cause of a problem. In particular, this means problem-specific modifications to the configuration data¹ that are sent to the FPGAs via PCI Express, which basically constitutes the lowest-level hardware/software boundary. We do have a few callbacks in place for the "common cases", but these are mostly observers and, more crucially, they are C++ APIs through and through. Our hardware engineers do not touch anything C++. Luckily, most of them know their way around Python, at least a bit. So I decided to leverage that and see if I could provide a Python callback for the FPGA configuration data! That would hopefully enable our (digital) hardware engineers to work independently in a large portion of the cases that currently require close cooperation with a systems software engineer (usually me).

I used Boost.Python a couple of years ago to create Python/C API bindings for one of our C++ libraries, but quite honestly even back then Boost.Python felt old. Since then I've become less enthusiastic about Boost in general, and although I gather that development has resumed (?), I personally consider Boost.Python stale at this time and wouldn't use it for new projects. Fortunately there is a successor in spirit based on C++11, the aptly named Pybind11, which we'll use here.

The plan is as follows: We'll create a skeleton C++ application with the minimum scaffolding necessary to evaluate a Python (.py) source file and then call a Python function with a given name (defined in that file, obviously), passing in a reference to a C++ buffer (with the FPGA configuration data from above). The Python function may then read and/or modify the buffer's contents to its liking before ceding control to C++ again.

Alright, let's start by setting up the build environment. I've deliberately kept things terse/declarative and trust that you'll be able to adapt this to your specific circumstances if need be. First I installed Pybind11 using the Fedora package manager:

$ sudo dnf install pybind11-devel

I use CMake out of habit, rather boring:

cmake_minimum_required (VERSION 2.8.12)

project (cppydemo)

set (PYBIND11_CPP_STANDARD -std=c++11)
find_package (pybind11 REQUIRED)

add_executable (cppydemo main.cpp)
target_link_libraries (cppydemo PRIVATE pybind11::embed)

And that's about all we need to get going. Here's the outline:

#include <array>
#include <cstdint>
#include <cstdlib>
#include <ostream>
#include <sstream>

#include <pybind11/embed.h>

namespace py = pybind11;

<<define fpga config python wrapper>>

int
main()
   {
   <<create embedded python interpreter>>
   <<set up python scope>>
   <<evaluate python script>>
   <<call python function>>
   return EXIT_SUCCESS;
   }

Creating the embedded Python interpreter is straightforward.

<<create embedded python interpreter>> =

py::scoped_interpreter interp;

Next we prepare the Python scope in which we're going to evaluate the Python script. Since the __main__ module already exists once the interpreter has been initialized, we'll just use that as our default scope.

<<set up python scope>> =

auto const py_main = py::module::import("__main__");
auto const py_FpgaConfig = py_class_FpgaConfig(py_main);

Defining the Python wrapper for the C++ FPGA configuration buffer was actually the most intricate part and it took me a while to figure this out. The configuration data¹ controls the entire DAQ subsystem of up to 15 FPGAs and consists, broadly speaking, of an array of 1024 32-bit "dwords". These are the data that we'd like to access and manipulate from a Python script.

<<define fpga config python wrapper>> =

using dword = std::uint32_t;  // legacy

std::array<dword, 1024> g_fpga_config;

In order to avoid copying around 4 KiB blocks all over the place we implement the Buffer Protocol for our wrapper to directly expose C++ memory to Python. We can then create memoryviews for efficient, zero-copy access to the underlying buffer as appropriate. For ease of use we also provide basic implementations of the __len__(), __getitem__(), and __setitem__() special methods, supporting only integer arguments but not slices.

<<define fpga config python wrapper>> +=

struct fpga_config_view
   {
   dword *dwords;
   std::size_t n;
   };

std::ostream &
operator<<(std::ostream &lhs, fpga_config_view const &rhs)
   {
   return lhs << "fpga_config_view@" << &rhs << "{dwords=@" << rhs.dwords << ", n=" << rhs.n << '}';
   }

py::object
py_class_FpgaConfig(py::module const m)
   {
   return
     py::class_<fpga_config_view>(m, "FpgaConfig", py::buffer_protocol())
       .def(py::init<fpga_config_view>())
       .def("__len__", [](fpga_config_view const &cfg)
          {
          return cfg.n;
          })
       .def("__getitem__", [](fpga_config_view const &cfg, std::size_t const i)
          {
          if (i < cfg.n)
             return cfg.dwords[i];

          throw py::index_error();
          })
       .def("__setitem__", [](fpga_config_view &cfg, std::size_t const i, dword const val)
          {
          if (i < cfg.n)
             cfg.dwords[i] = val;
          else
             throw py::index_error();
          })
       .def_buffer([](fpga_config_view &cfg)
          {
          return py::buffer_info(cfg.dwords, static_cast<ssize_t>(cfg.n));
          })
       .def("__str__", [](fpga_config_view const &cfg)
          {
          std::ostringstream ss;
          ss << "FpgaConfig" << '{' << cfg << '}';
          return ss.str();
          });
   }
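
From the Python side, the buffer protocol means a script can wrap the object in a memoryview and get slicing for free, even though the wrapper's own __getitem__() only accepts integers. The following is a minimal sketch of that usage. Since FpgaConfig only exists inside the embedded interpreter, a standard-library array.array("I") stands in for it here. That stand-in is an assumption made purely for illustration: like the wrapper, it supports len(), integer indexing, and the buffer protocol with unsigned integer items (32 bits wide on common platforms).

```python
from array import array

# Stand-in for the FpgaConfig object that C++ would pass in (assumption:
# the real object exposes the same 1024 unsigned integer items).
cfg = array("I", [0] * 1024)

view = memoryview(cfg)          # zero-copy: the underlying block is not duplicated

view[0] = 0xCAFEF00D            # write through the view ...
first = cfg[0]                  # ... and read the value back from the object itself

# Slices work on the memoryview even where the wrapper only takes integers.
view[8:12] = array("I", [1, 2, 3, 4])
window = view[8:12].tolist()

print(len(cfg), view.itemsize, hex(first), window)
```

Writes through the view are visible through the original object because both refer to the same memory, which is exactly the property we want for the C++ buffer.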

Now we are ready to actually evaluate a Python script. Note that the script can already use the FpgaConfig type while it is being evaluated, because the class was registered in __main__ in the previous step.

<<evaluate python script>> =

py::eval_file("cppydemo.py", py_main.attr("__dict__"));

The script is supposed to define a process_fpga_config function with a single parameter (assumed to be of type FpgaConfig):

def process_fpga_config(cfg):
    print(cfg)
    ...
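
To make the callback a bit more concrete, here is a sketch of what a script body might look like. The dword index and bit mask are made up for illustration and do not come from any real FPGA layout. The function relies only on len(), integer indexing, and item assignment, which the wrapper provides. A plain list stands in for the FpgaConfig object so that the snippet also runs outside the embedded interpreter.

```python
# Hypothetical values, for illustration only.
DEBUG_DWORD = 3          # index of a made-up control dword
DEBUG_ENABLE = 1 << 31   # made-up "enable debug" bit


def process_fpga_config(cfg):
    """Set the made-up debug bit and return the indices of non-zero dwords."""
    cfg[DEBUG_DWORD] = cfg[DEBUG_DWORD] | DEBUG_ENABLE
    # Index explicitly: the wrapper supports integer indices, not slices.
    return [i for i in range(len(cfg)) if cfg[i] != 0]


# Stand-in for the object C++ would pass in.
demo_cfg = [0] * 1024
demo_cfg[10] = 0x1234
touched = process_fpga_config(demo_cfg)
```

The return value is only there to make the sketch easy to check; the C++ side shown in this post ignores whatever the function returns.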

The final step is then to retrieve a handle to that function and call it with a view of our FPGA configuration buffer.

<<call python function>> =

auto const py_process_fpga_config = py_main.attr("process_fpga_config");
auto const py_fpga_config = py_FpgaConfig(fpga_config_view{ g_fpga_config.data(), g_fpga_config.size() });
py_process_fpga_config(py_fpga_config);
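
In pure-Python terms, the evaluate / look up / call sequence above corresponds roughly to the following. This is an analogy to show the control flow, not a description of how Pybind11 is implemented. The helper name run_callback and the script contents are hypothetical, and a plain list again stands in for the FpgaConfig view.

```python
import os
import tempfile

# Hypothetical script contents, written to a temporary cppydemo.py below.
SCRIPT = """
def process_fpga_config(cfg):
    cfg[0] = 0xDEADBEEF
"""


def run_callback(path, func_name, arg):
    """Execute the script at `path`, then call the function named `func_name`."""
    scope = {"__name__": "__main__"}          # plays the role of __main__'s __dict__
    with open(path) as f:
        exec(compile(f.read(), path, "exec"), scope)
    return scope[func_name](arg)              # look up by name, then call


with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "cppydemo.py")
    with open(script_path, "w") as f:
        f.write(SCRIPT)
    cfg = [0] * 1024                          # stand-in for the FpgaConfig view
    run_callback(script_path, "process_fpga_config", cfg)
```

If the script does not define the expected function, the lookup fails with a KeyError here. In the C++ version the corresponding attribute lookup raises a Python exception, which Pybind11 surfaces as py::error_already_set.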

And that's about all I have right now! At this point it's really just a demo, but it should hopefully give you an idea what's possible and provide a good starting point for further experimentation/exploration. You can download the "tangled" source files.

Footnotes:

¹ To clarify: I'm not talking about an FPGA bitstream here, which contains the programming information for an FPGA, but about application configuration data that tells the already programmed FPGA what to do.