Cython: cost of dereferencing a function pointer (compared with calling the function directly)

I have some Cython code that performs very repetitive pixel-wise operations on NumPy arrays (representing BGR images), of the following form:

import numpy as np
cimport numpy as cnp
cimport cython

ctypedef double (*blend_type)(double, double)  # function pointer

@cython.boundscheck(False)  # Deactivate bounds checking
@cython.wraparound(False)   # Deactivate negative indexing.
cdef cnp.ndarray[cnp.float_t, ndim=3] blend_it(const double[:, :, :] array_1, const double[:, :, :] array_2, const blend_type blendfunc, const double opacity):
    # the base layer is a (array_1)
    # the blend layer is b (array_2)
    # base layer is below blend layer
    cdef Py_ssize_t y_len = array_1.shape[0]
    cdef Py_ssize_t x_len = array_1.shape[1]
    cdef Py_ssize_t a_channels = array_1.shape[2]
    cdef Py_ssize_t b_channels = array_2.shape[2]
    # np.float64 is used here because the np.float_ alias was removed in NumPy 2.0
    cdef cnp.ndarray[cnp.float_t, ndim=3] result = np.zeros((y_len, x_len, a_channels), dtype=np.float64)
    cdef double[:, :, :] result_view = result
    cdef Py_ssize_t x, y, c
    cdef double a, b

    for y in range(y_len):
        for x in range(x_len):
            for c in range(3):  # iterate over BGR channels first
                # calculate channel values via blend mode
                a = array_1[y, x, c]
                b = array_2[y, x, c]
                result_view[y, x, c] = blendfunc(a, b)
                # many other operations involving result_view...
    return result

where blendfunc refers to another Cython function, such as the following overlay_pix:

cdef double overlay_pix(double a, double b):
    if a < 0.5:
        return 2*a*b
    else:
        return 1 - 2*(1 - a)*(1 - b)
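When experimenting with the compiled routine, a pure NumPy reference can help confirm that a faster variant still produces the same pixels. The sketch below is not part of the original code: overlay_ref and blend_ref are hypothetical names, and it reproduces only the per-channel blend step shown above (the opacity handling and the other elided operations are left out).

```python
import numpy as np

def overlay_ref(a, b):
    """Vectorized overlay formula, mirroring overlay_pix element by element."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return np.where(a < 0.5, 2 * a * b, 1 - 2 * (1 - a) * (1 - b))

def blend_ref(array_1, array_2, blendfunc):
    """Apply blendfunc to the first three (BGR) channels of two images.

    Any extra channels stay at zero, as in the np.zeros result of blend_it.
    """
    array_1 = np.asarray(array_1, dtype=np.float64)
    array_2 = np.asarray(array_2, dtype=np.float64)
    result = np.zeros_like(array_1)
    result[:, :, :3] = blendfunc(array_1[:, :, :3], array_2[:, :, :3])
    return result

# Small example: a uniform dark base layer under a mid-grey blend layer.
base = np.full((2, 2, 3), 0.25)
top = np.full((2, 2, 3), 0.5)
print(blend_ref(base, top, overlay_ref)[0, 0])
```

Comparing this output with the Cython result via np.allclose is a quick sanity check before timing anything.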

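The purpose of using function pointers is to avoid rewriting this large block of repetitive code over and over for each blend mode (of which there are many). So I created an interface like this for each blend mode, which saves me that trouble:

def overlay(double[:, :, :] array_1, double[:, :, :] array_2, double opacity = 1.0):
    return blend_it(array_1, array_2, overlay_pix, opacity)

However, there is a substantial time loss when blendfunc is used inside blend_it compared with calling overlay_pix directly in blend_it. I assume this is because blend_it has to dereference the function pointer on every iteration instead of having the function immediately available, but I am not sure.

Solution (for reference):

For each operation you want to perform, define a cdef class containing a staticmethod cdef function.

Define a fused type containing all of the cdef classes. (This is the limitation of this approach: it is not easily extensible, so adding an operation means editing the code.)

Define a function that takes a dummy argument of the fused type. Use this type to call the staticmethod.

Define wrapper functions. You need to use the explicit [type] syntax to make this work.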

Code:

import cython

cdef class Plus:
    @staticmethod
    cdef double func(double x):
        return x+1

cdef class Minus:
    @staticmethod
    cdef double func(double x):
        return x-1

ctypedef fused pick_func:
    Plus
    Minus

cdef run_func(double [::1] x, pick_func dummy):
    cdef int i
    with cython.boundscheck(False), cython.wraparound(False):
        for i in range(x.shape[0]):
            x[i] = cython.typeof(dummy).func(x[i])
    return x.base

def run_func_plus(x):
    return run_func[Plus](x,Plus())

def run_func_minus(x):
    return run_func[Minus](x,Minus())

For comparison, the equivalent code using function pointers is:

cdef double add_one(double x):
    return x+1

cdef double minus_one(double x):
    return x-1

cdef run_func_ptr(double [::1] x, double (*f)(double)):
    cdef int i
    with cython.boundscheck(False), cython.wraparound(False):
        for i in range(x.shape[0]):
            x[i] = f(x[i])
    return x.base

def run_func_ptr_plus(x):
    return run_func_ptr(x,add_one)

def run_func_ptr_minus(x):
    return run_func_ptr(x,minus_one)

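Using timeit, I get about a 2.5x speed-up with my version compared with using function pointers. This suggests that the function pointers are not being optimized for me (although I have not tried changing compiler settings to improve that).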
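A single timeit call can be noisy, so a ratio like this is easier to trust when it is taken from the best of several repeats. The following is a minimal sketch using only the standard library, separate from the script the measurement above came from. best_time and speedup are hypothetical helpers, and the two pure-Python loops are stand-ins so the snippet runs without a compiled module; replacing them with the compiled run_func_plus and run_func_ptr_plus would reproduce the comparison.

```python
from timeit import repeat

def best_time(func, arg, number=100, repeats=5):
    """Return the fastest total time over `repeats` runs of `number` calls each."""
    return min(repeat(lambda: func(arg), number=number, repeat=repeats))

def speedup(candidate, baseline, arg, number=100, repeats=5):
    """How many times faster `candidate` is than `baseline` (>1 means faster)."""
    baseline_time = best_time(baseline, arg, number, repeats)
    candidate_time = best_time(candidate, arg, number, repeats)
    return baseline_time / candidate_time

# Stand-ins for the compiled functions (hypothetical, pure Python).
def plus_direct(values):
    return [v + 1 for v in values]

def plus_indirect(values, f=lambda v: v + 1):
    return [f(v) for v in values]

data = [0.0] * 1000
assert plus_direct(data) == plus_indirect(data)  # check both give the same answer
print(f"speed-up: {speedup(plus_direct, plus_indirect, data):.2f}x")
```

Taking the minimum rather than the mean follows the usual timeit advice: slower runs mostly reflect interference from other processes rather than the code being measured.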

import numpy as np
import example

# show the two methods give the same answer
print(example.run_func_plus(np.ones((10,))))
print(example.run_func_minus(np.ones((10,))))
print(example.run_func_ptr_plus(np.ones((10,))))
print(example.run_func_ptr_minus(np.ones((10,))))

from timeit import timeit

# timing comparison
print(timeit("""run_func_plus(x)""",
             """from example import run_func_plus
from numpy import zeros
x = zeros((10000,))
""", number=10000))

print(timeit("""run_func_ptr_plus(x)""",
             """from example import run_func_ptr_plus
from numpy import zeros
x = zeros((10000,))
""", number=10000))