Using Matrices and Multidimensional Arrays with NumPy in Python

2018-08-14

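1. Introduction

I have recently been porting an algorithm from MATLAB to Python. I am new to Python and still unfamiliar with much of it. My overall impression is that it is easy to pick up but hard to use elegantly. So far, for algorithm simulation research, I still find MATLAB more comfortable, perhaps simply because I know it better. MATLAB ships with many algorithm toolboxes, and looking up functions, calling them, and inspecting variables is very convenient; maybe Python will feel just as handy once I have used it longer. What I like most about MATLAB compared with Python is being able to select a block of code and run it directly. Python can do this too, but it feels less convenient.
Back to the topic. Algorithm work involves many vector and matrix operations. These are second nature in MATLAB, but in Python I have to look each one up as I need it, which is tedious, so I am summarizing them here and will update this post as I gain experience.
Matrix operations in Python rely mainly on the numpy package; the scipy package builds on numpy and greatly extends its capabilities.
2. Creating general multidimensional arrays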
import numpy as np

a = np.array([1,2,3], dtype=int) # create a 1-D array of 3 elements: array([1, 2, 3])
type(a) # numpy.ndarray
a.shape # shape: (3,)
a.dtype.name # platform-dependent: 'int32' on the author's system, 'int64' on most 64-bit platforms
a.size # number of elements: 3
a.itemsize # bytes per element: 4 for int32, 8 for int64

b = np.array([[1,2,3],[4,5,6]], dtype=int) # create a 2*3 array: array([[1, 2, 3], [4, 5, 6]])
b.shape # shape: (2, 3)
b.size # number of elements: 6
b.itemsize # bytes per element: 4 for int32, 8 for int64

c = np.array([[1,2,3],[4,5,6]], dtype='int16') # create a 2*3 array of 16-bit integers
c.shape # shape: (2, 3)
c.size # number of elements: 6
c.itemsize # bytes per element: 2
c.ndim # number of dimensions: 2

d = np.array([[1,2,3],[4,5,6]], dtype=complex) # 2-D array of complex numbers
d.itemsize # bytes per element: 16
d.dtype.name # element type: 'complex128'

3. Creating special multidimensional arrays
a1 = np.zeros((3,4)) # create a 3*4 2-D array of zeros
Output:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
a1.dtype.name # element type: 'float64'
a1.size # number of elements: 12
a1.itemsize # bytes per element: 8

a2 = np.ones((2,3,4), dtype=np.int16) # create a 2*3*4 3-D array of ones
a2 = np.ones((2,3,4), dtype='int16') # equivalent, with the dtype given as a string
Output:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)

a3 = np.empty((2,3)) # create an uninitialized 2*3 array
Output (the contents are arbitrary and may vary):
array([[1., 2., 3.],
       [4., 5., 6.]])

a4 = np.arange(10,30,5) # start 10, stop 30 (exclusive), step 5
Output: array([10, 15, 20, 25])

a5 = np.arange(0,2,0.3) # start 0, stop 2 (exclusive), step 0.3
Output: array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])
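With a floating-point step, the number of elements that arange returns follows from the step size and can be awkward to predict because of rounding. A minimal sketch of the usual alternative, np.linspace, which takes the element count directly (the variable names are illustrative only):

```python
import numpy as np

# arange with a float step: the length is derived from (stop - start) / step
by_step = np.arange(0, 2, 0.3)

# linspace: the length is given explicitly; the stop value is included by default
by_count = np.linspace(0, 2, 9)

# endpoint=False excludes the stop value, like arange does
half_open = np.linspace(0, 2, 8, endpoint=False)

print(len(by_step), len(by_count), len(half_open))
```

When the number of samples matters more than the exact step, linspace is probably the safer choice.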

from numpy import pi

np.linspace(0, 2, 9) # start 0, stop 2 (inclusive), 9 elements
Output:
array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])

x = np.linspace(0, 2*pi, 9)
Output:
array([0.        , 0.78539816, 1.57079633, 2.35619449, 3.14159265,
       3.92699082, 4.71238898, 5.49778714, 6.28318531])

a = np.arange(6)
Output:
array([0, 1, 2, 3, 4, 5])

b = np.arange(12).reshape(4,3)
Output:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

c = np.arange(24).reshape(2,3,4)
Output:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

np.set_printoptions controls how numpy arrays are printed.

In IPython, run help(np.set_printoptions) for usage details and examples.

4. Basic operations on multidimensional arrays

Addition and subtraction work element by element, so the two operands should have the same shape (for example, both M*N) or shapes that NumPy can broadcast to a common shape.
a = np.arange(4)
Output:
array([0, 1, 2, 3])

b = a**2
Output:
array([0, 1, 4, 9])

c = 10*np.sin(a)
Output:
array([0.        , 8.41470985, 9.09297427, 1.41120008])

a < 35
Output:
array([ True,  True,  True,  True])

A = np.array([[1,1],[0,1]])
B = np.array([[2,0],[3,4]])

C = A * B # element-wise product
Output:
array([[2, 0],
       [0, 4]])

D = A.dot(B) # matrix product
Output:
array([[5, 4],
       [3, 4]])

E = np.dot(A,B) # matrix product
Output:
array([[5, 4],
       [3, 4]])
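The shape requirement stated at the start of this section is relaxed by NumPy's broadcasting rules: a dimension of size 1 (or a missing leading dimension) is stretched to match the other operand. A small sketch with made-up arrays m, row, and col:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)    # shape (2, 3)
row = np.array([10, 20, 30])      # shape (3,): added to every row of m
col = np.array([[100], [200]])    # shape (2, 1): added to every column of m

plus_row = m + row
plus_col = m + col
print(plus_row)
print(plus_col)

# Shapes that cannot be broadcast raise a ValueError
try:
    m + np.arange(4)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("incompatible shapes:", err)
```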

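Type conversion in array operations

When operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).

In short, the result is automatically converted to the higher-precision type.
a = np.ones((2,3),dtype=int) # integer array
b = np.random.random((2,3)) # float64
b += a # OK: the integers are upcast to float64
a += b # error: the float64 result cannot be stored back into the integer array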
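The failing in-place addition can be reproduced and worked around as in the sketch below. It assumes a reasonably recent NumPy, where the failed cast raises a TypeError subclass; the names total and truncated are illustrative.

```python
import numpy as np

a = np.ones((2, 3), dtype=int)
b = np.random.random((2, 3))      # floats in [0, 1)

try:
    a += b                        # would need float64 -> int casting in place
    raised = False
except TypeError as err:
    raised = True
    print("in-place add refused:", err)

total = a + b                     # a new float64 array; a itself is unchanged
truncated = (a + b).astype(int)   # explicit cast, which drops the fractional part
print(total.dtype, truncated.dtype)
```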
a = np.ones(3,dtype=np.int32)
b = np.linspace(0,pi,3) # float64
c = a + b # float64
d = np.exp(c*1j)
Output:
array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
       -0.54030231-0.84147098j])

d.dtype.name
Output:
'complex128'

Unary operations on arrays, such as sum, minimum, and maximum
a = np.random.random((2,3))
a.sum()
a.min()
a.max()

b = np.arange(12).reshape(3,4)
Output:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

b.sum(axis=0) # sum of each column
Output:
array([12, 15, 18, 21])

b.sum(axis=1) # sum of each row
Output:
array([ 6, 22, 38])

b.cumsum(axis=0) # cumulative sum down each column
Output:
array([[ 0,  1,  2,  3],
       [ 4,  6,  8, 10],
       [12, 15, 18, 21]])

b.cumsum(axis=1) # cumulative sum along each row
Output:
array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]])

Universal functions (ufuncs)

B = np.arange(3)
np.exp(B)
np.sqrt(B)
C = np.array([2.,-1.,4.])
np.add(B,C)

Other commonly used functions (not all of them are ufuncs in the strict sense) include:

all, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, round, sort, std, sum, trace, transpose, var, vdot, vectorize, where
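Whether a given NumPy function is a true ufunc can be checked with isinstance against np.ufunc. The sketch below tests a handful of names chosen for illustration:

```python
import numpy as np

names = ["exp", "sqrt", "add", "maximum", "sum", "sort", "dot"]

# Map each name to True if the NumPy object of that name is a ufunc
is_ufunc = {name: isinstance(getattr(np, name), np.ufunc) for name in names}

for name, flag in is_ufunc.items():
    print(name, flag)
```

Element-wise functions such as exp and add are ufuncs, while reductions and utilities such as sum, sort, and dot are ordinary functions.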

5. Indexing, slicing, and iterating over arrays
a = np.arange(10)**3
a[2]
a[2:5]
a[::-1] # reversed order
for i in a:
    print(i**(1/3.))

def f(x,y):
    return 10*x+y

b = np.fromfunction(f,(5,4),dtype=int)
b[2,3]
b[0:5,1]
b[:,1]
b[1:3,:]
b[-1] # the last row
c = np.array([[[0,1,2],[10,11,12]],[[100,101,102],[110,111,112]]])
Output:
array([[[  0,   1,   2],
        [ 10,  11,  12]],

       [[100, 101, 102],
        [110, 111, 112]]])

c.shape
Output:
(2, 2, 3)

c[0,...]
c[0,:,:] # equivalent to c[0,...]
Output:
array([[ 0,  1,  2],
       [10, 11, 12]])

c[:,:,2]
c[...,2] # equivalent to c[:,:,2]
Output:
array([[  2,  12],
       [102, 112]])

for row in c: # iterates over the first axis
    print(row)

for element in c.flat: # iterates over every element
    print(element)
a = np.floor(10*np.random.random((3,4)))
Output:
array([[3., 9., 8., 4.],
       [2., 1., 4., 6.],
       [0., 6., 0., 2.]])

a.ravel() # flattened array
Output:
array([3., 9., 8., 4., 2., 1., 4., 6., 0., 6., 0., 2.])

a.reshape(6,2) # returns a reshaped array; a itself is unchanged
Output:
array([[3., 9.],
       [8., 4.],
       [2., 1.],
       [4., 6.],
       [0., 6.],
       [0., 2.]])

a.T # transpose
Output:
array([[3., 2., 0.],
       [9., 1., 6.],
       [8., 4., 0.],
       [4., 6., 2.]])

a.T.shape
Output:
(4, 3)

a.resize((2,6)) # changes a in place and returns None
a
Output:
array([[3., 9., 8., 4., 2., 1.],
       [4., 6., 0., 6., 0., 2.]])

a.shape
Output:
(2, 6)

a.reshape(3,-1) # -1 lets numpy infer that dimension
Output:
array([[3., 9., 8., 4.],
       [2., 1., 4., 6.],
       [0., 6., 0., 2.]])

See the following for details:

ndarray.shape, reshape, resize, ravel
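One detail worth keeping in mind with these functions is whether the result shares memory with the original array. A rough sketch, assuming a contiguous array (for which reshape and ravel normally return views) and using ndarray.flatten as the always-copying counterpart:

```python
import numpy as np

a = np.arange(12)
r = a.reshape(3, -1)       # -1 is inferred as 4; r is a view of a
flat_view = r.ravel()      # a view here, because r is contiguous
flat_copy = r.flatten()    # flatten always returns a copy

r[0, 0] = 99               # writes through to a and flat_view, not to flat_copy

print(a[0], flat_view[0], flat_copy[0])
print(np.shares_memory(a, r), np.shares_memory(a, flat_copy))
```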

6. Combining multidimensional arrays
a = np.floor(10*np.random.random((2,2)))
Output:
array([[5., 2.],
       [6., 2.]])

b = np.floor(10*np.random.random((2,2)))
Output:
array([[0., 2.],
       [4., 1.]])

np.vstack((a,b)) # stack vertically (row-wise)
Output:
array([[5., 2.],
       [6., 2.],
       [0., 2.],
       [4., 1.]])

np.hstack((a,b)) # stack horizontally (column-wise)
Output:
array([[5., 2., 0., 2.],
       [6., 2., 4., 1.]])

from numpy import newaxis

np.column_stack((a,b)) # for 2-D arrays this is the same as hstack
Output:
array([[5., 2., 0., 2.],
       [6., 2., 4., 1.]])

a = np.array([4.,2.])
b = np.array([2.,8.])

a[:,newaxis] # turn the 1-D array into a column
Output:
array([[4.],
       [2.]])

b[:,newaxis]
Output:
array([[2.],
       [8.]])

np.column_stack((a[:,newaxis],b[:,newaxis]))
Output:
array([[4., 2.],
       [2., 8.]])

np.vstack((a[:,newaxis],b[:,newaxis]))
Output:
array([[4.],
       [2.],
       [2.],
       [8.]])

np.r_[1:4,0,4]
Output:
array([1, 2, 3, 0, 4])

np.c_[np.array([[1,2,3]]),0,0,0,np.array([[4,5,6]])]
Output:
array([[1, 2, 3, 0, 0, 0, 4, 5, 6]])

For details, look up the following functions:

hstack, vstack, column_stack, concatenate, c_, r_
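For 2-D inputs, vstack and hstack can be expressed with np.concatenate and an explicit axis, and column_stack differs from hstack only for 1-D inputs. The sketch below checks this on small arrays; the values are copied from the examples above purely for illustration.

```python
import numpy as np

a = np.array([[5., 2.], [6., 2.]])
b = np.array([[0., 2.], [4., 1.]])

v = np.concatenate((a, b), axis=0)   # same result as np.vstack((a, b))
h = np.concatenate((a, b), axis=1)   # same result as np.hstack((a, b))

x = np.array([4., 2.])
y = np.array([2., 8.])
cols = np.column_stack((x, y))       # 1-D inputs become columns: shape (2, 2)
joined = np.hstack((x, y))           # 1-D inputs are joined end to end: shape (4,)

print(v.shape, h.shape, cols.shape, joined.shape)
```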

7. Splitting a large array into smaller arrays
a = np.floor(10*np.random.random((2,12)))
Output:
array([[9., 7., 9., 6., 7., 2., 1., 6., 9., 3., 2., 4.],
       [5., 3., 3., 1., 7., 5., 0., 2., 3., 9., 7., 7.]])

np.hsplit(a,3) # split into 3 equally sized pieces along the columns
Output:
[array([[9., 7., 9., 6.],
       [5., 3., 3., 1.]]), array([[7., 2., 1., 6.],
       [7., 5., 0., 2.]]), array([[9., 3., 2., 4.],
       [3., 9., 7., 7.]])]

np.hsplit(a,(3,4)) # split after the third and after the fourth column
Output:
[array([[9., 7., 9.],
       [5., 3., 3.]]), array([[6.],
       [1.]]), array([[7., 2., 1., 6., 9., 3., 2., 4.],
       [7., 5., 0., 2., 3., 9., 7., 7.]])]

Functions with similar behavior include:

hsplit, vsplit, array_split
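Of the functions listed, array_split is the one that tolerates a length that does not divide evenly, whereas np.split (and hsplit/vsplit with an integer argument) raises an error in that case. A brief sketch on an illustrative 10-element array:

```python
import numpy as np

a = np.arange(10)

parts = np.array_split(a, 3)         # uneven split is allowed
sizes = [len(p) for p in parts]
print(sizes)

try:
    np.split(a, 3)                   # 10 is not divisible by 3
    split_raised = False
except ValueError as err:
    split_raised = True
    print("np.split refused:", err)
```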

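8. Copying multidimensional arrays
a = np.arange(12)
Output:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

No copy at all

b = a # b is just another name for the same object
b is a # True
b.shape = 3,4
a.shape # (3, 4)

def f(x): # Python passes mutable objects as references, so function calls make no copy.
    print(id(x)) # id is the unique identifier of a Python object

id(a) # e.g. 111833936 (the actual value differs from run to run)
id(b) # same value as id(a)
f(a) # prints the same value again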

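Shallow copy (view)
c = a.view() # a new array object that looks at the same data
c is a # False
c.base is a # True
c.flags.owndata # False
c.shape = 2,6
a.shape # (3, 4): reshaping the view does not reshape a
c[0,4] = 1234 # but writing to the view changes a's data
print(a)
Output:
[[   0    1    2    3]
 [1234    5    6    7]
 [   8    9   10   11]]

s = a[:,1:3] # slicing also returns a view
s[:] = 10
print(a)
Output:
[[   0   10   10    3]
 [1234   10   10    7]
 [   8   10   10   11]]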

Deep copy
d = a.copy() # a new array object with its own data
d is a # False
d.base is a # False
d[0,0] = 9999 # a is not affected
print(a)
Output:
[[   0   10   10    3]
 [1234   10   10    7]
 [   8   10   10   11]]

Overview of basic numpy functions and methods

Array Creation

arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity, linspace, logspace, mgrid, ogrid, ones, ones_like, r_, zeros, zeros_like

Conversions

ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat

Manipulations

array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit, hstack, ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze, swapaxes, take, transpose, vsplit, vstack

Questions

all, any, nonzero, where

Ordering

argmax, argmin, argsort, max, min, ptp, searchsorted, sort

Operations

choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put, putmask, real, sum

Basic Statistics

cov, mean, std, var

Basic Linear Algebra

cross, dot, outer, linalg.svd, vdot

Link to the complete list of functions and methods:
9. Fancy indexing tricks

a = np.arange(12)**2 # squares of the first 12 integers
Output:
array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121])
i = np.array([1,1,3,8,5])
a[i] # the elements of a at the positions in i
Output:
array([ 1,  1,  9, 64, 25])

j = np.array([[3,4],[9,7]])
a[j] # the result has the same shape as j
Output:
array([[ 9, 16],
       [81, 49]])

palette = np.array([[0,0,0],[255,0,0],[0,255,0],[0,0,255],[255,255,255]])
image = np.array([[0,1,2,0],[0,3,4,0]])
palette[image]
Output:
array([[[  0,   0,   0],
        [255,   0,   0],
        [  0, 255,   0],
        [  0,   0,   0]],

       [[  0,   0,   0],
        [  0,   0, 255],
        [255, 255, 255],
        [  0,   0,   0]]])

a = np.arange(12).reshape(3,4) # the examples below index this 2-D array
i = np.array([[0,1],[1,2]]) # indices for the first axis
j = np.array([[2,1],[3,3]]) # indices for the second axis
a[i,j]
Output:
array([[ 2,  5],
       [ 7, 11]])
l = (i,j) # a tuple of index arrays is equivalent to a[i,j]
a[l]
Output:
array([[ 2,  5],
       [ 7, 11]])

a[i,2]
Output:
array([[ 2,  6],
       [ 6, 10]])

a[:,j]
Output:
array([[[ 2,  1],
        [ 3,  3]],

       [[ 6,  5],
        [ 7,  7]],

       [[10,  9],
        [11, 11]]])
s = np.array([i,j])
print(s)
Output:
[[[0 1]
  [1 2]]

 [[2 1]
  [3 3]]]

a[tuple(s)] # same as a[i,j]
Output:
array([[ 2,  5],
       [ 7, 11]])
print(tuple(s))
Output:
(array([[0, 1],
       [1, 2]]), array([[2, 1],
       [3, 3]]))
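The conversion to a tuple matters because an index array by itself only addresses the first axis. A small sketch of the difference, assuming the 3*4 array and the index arrays i and j used in this section:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i = np.array([[0, 1], [1, 2]])
j = np.array([[2, 1], [3, 3]])
s = np.array([i, j])              # one index array of shape (2, 2, 2)

picked = a[tuple(s)]              # unpacked into (i, j): same as a[i, j]

try:
    a[s]                          # s indexes axis 0 only, and 3 is out of range there
    failed = False
except IndexError as err:
    failed = True
    print("a[s] failed:", err)

print(picked)
```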

10. Finding maxima/minima and their indices
time = np.linspace(20, 145, 5)
Output:
array([ 20.  ,  51.25,  82.5 , 113.75, 145.  ])

data = np.sin(np.arange(20)).reshape(5,4)
Output:
array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ,  0.6569866 ],
       [ 0.98935825,  0.41211849, -0.54402111, -0.99999021],
       [-0.53657292,  0.42016704,  0.99060736,  0.65028784],
       [-0.28790332, -0.96139749, -0.75098725,  0.14987721]])

ind = data.argmax(axis=0) # row index of the maximum in each column
Output:
array([2, 0, 3, 1])

time_max = time[ind] # times at which each column reaches its maximum
Output:
array([ 82.5 ,  20.  , 113.75,  51.25])

data_max = data[ind, range(data.shape[1])] # range replaces Python 2's xrange
Output:
array([0.98935825, 0.84147098, 0.99060736, 0.6569866 ])

np.all(data_max == data.max(axis=0))
Output:
True
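When the position of the overall maximum is needed rather than one per column, argmax without an axis returns an index into the flattened array, which np.unravel_index converts back to row and column. A short sketch on the same data array:

```python
import numpy as np

data = np.sin(np.arange(20)).reshape(5, 4)

flat_idx = data.argmax()                           # index into the flattened array
row, col = np.unravel_index(flat_idx, data.shape)  # back to 2-D coordinates

print(flat_idx, row, col, data[row, col])
```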

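a = np.arange(5)
a[[1,3,4]] = 0 # assign to several positions at once
print(a)
Output:
[0 0 2 0 0]
a = np.arange(5)
a[[0,0,2]] = [1,2,3] # index 0 is repeated: the last assigned value wins
print(a)
Output:
[2 1 3 3 4]

a = np.arange(5)
a[[0,0,2]] += 1 # index 0 is repeated, but it is incremented only once
print(a)
Output:
[1 1 3 3 4]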
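If every occurrence of a repeated index should count, the ufunc method np.add.at performs the operation unbuffered. A minimal sketch contrasting it with the += form (the array names are illustrative):

```python
import numpy as np

buffered = np.arange(5)
buffered[[0, 0, 2]] += 1              # index 0 is incremented once

accumulated = np.arange(5)
np.add.at(accumulated, [0, 0, 2], 1)  # index 0 is incremented twice

print(buffered)
print(accumulated)
```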
a = np.arange(12).reshape(3,4)
b = a > 4 # boolean array with the same shape as a
Output:
array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])

a[b] # 1-D array of the selected elements
Output:
array([ 5,  6,  7,  8,  9, 10, 11])

a[b] = 0 # set every element greater than 4 to 0
print(a)
Output:
[[0 1 2 3]
 [4 0 0 0]
 [0 0 0 0]]
a = np.arange(12).reshape(3,4)
b1 = np.array([False,True,True]) # selects rows
b2 = np.array([True,False,True,False]) # selects columns
a[b1,:]
Output:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

a[b1] # same as a[b1,:]
Output:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

a[:,b2]
Output:
array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])

a[b1,b2] # pairs the selected rows and columns: elements (1,0) and (2,2)
Output:
array([ 4, 10])
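Note that a[b1,b2] pairs the selected rows and columns one by one instead of returning the sub-block where they intersect. A sketch of two ways to get the sub-block, assuming the same a, b1, and b2 as above; np.ix_ accepts boolean masks as well as integer sequences.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

paired = a[b1, b2]               # elements (1, 0) and (2, 2)
block = a[b1][:, b2]             # rows 1-2 crossed with columns 0 and 2
block_ix = a[np.ix_(b1, b2)]     # the same sub-block in one indexing step

print(paired)
print(block)
```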

11. The ix_() function
a = np.array([2,3,4,5])
b = np.array([8,5,4])
c = np.array([5,4,6,8,3])
ax,bx,cx = np.ix_(a,b,c)
print(ax) # ax.shape is (4, 1, 1)
Output:
[[[2]]

 [[3]]

 [[4]]

 [[5]]]
print(bx) # bx.shape is (1, 3, 1)
Output:
[[[8]
  [5]
  [4]]]
print(cx) # cx.shape is (1, 1, 5)
Output:
[[[5 4 6 8 3]]]

result = ax + bx*cx # broadcasts to shape (4, 3, 5)
Output:
array([[[42, 34, 50, 66, 26],
        [27, 22, 32, 42, 17],
        [22, 18, 26, 34, 14]],

       [[43, 35, 51, 67, 27],
        [28, 23, 33, 43, 18],
        [23, 19, 27, 35, 15]],

       [[44, 36, 52, 68, 28],
        [29, 24, 34, 44, 19],
        [24, 20, 28, 36, 16]],

       [[45, 37, 53, 69, 29],
        [30, 25, 35, 45, 20],
        [25, 21, 29, 37, 17]]])

result[3,2,4] # a[3] + b[2]*c[4]
Output: 17
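What ix_ does here can also be written by hand with explicit new axes, which may make the broadcasting easier to see. The sketch below compares the two forms; manual is a name introduced for illustration.

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

ax, bx, cx = np.ix_(a, b, c)
result = ax + bx * cx

# The same computation with the singleton axes inserted manually
manual = a[:, None, None] + b[None, :, None] * c[None, None, :]

print(result.shape, np.array_equal(result, manual))
```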

12. Linear algebra
a = np.array([[1.,2.],[3.,4.]])
a.transpose() # transpose
np.linalg.inv(a) # inverse
u = np.eye(2) # 2*2 identity matrix
np.dot(a,a) # matrix product
np.trace(a) # trace of the matrix
y = np.array([[5.],[7.]])
np.linalg.solve(a,y) # solve the linear system a x = y
np.linalg.eig(a) # eigenvalues and eigenvectors
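A quick way to sanity-check these routines is to substitute the results back into the defining equations. The sketch below does so for inv, solve, and eig with np.allclose, using the same a and y:

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
y = np.array([[5.], [7.]])

x = np.linalg.solve(a, y)                      # solves a @ x = y
a_inv = np.linalg.inv(a)
eigenvalues, eigenvectors = np.linalg.eig(a)   # eigenvectors are the columns

print(x.ravel())
print(np.allclose(a @ x, y))
print(np.allclose(a @ a_inv, np.eye(2)))
# a @ v_k == w_k * v_k for every eigenpair (column k)
print(np.allclose(a @ eigenvectors, eigenvectors * eigenvalues))
```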

"Automatic" reshaping

a = np.arange(30)
a.shape = 2,-1,3 # -1 means "whatever size is needed"
a.shape # (2, 5, 3)
print(a)
Output:
[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]
  [12 13 14]]

 [[15 16 17]
  [18 19 20]
  [21 22 23]
  [24 25 26]
  [27 28 29]]]

x = np.arange(0,10,2)
y = np.arange(5)
m = np.vstack([x,y])
Output:
array([[0, 2, 4, 6, 8],
       [0, 1, 2, 3, 4]])
n = np.hstack([x,y])
Output:
array([0, 2, 4, 6, 8, 0, 1, 2, 3, 4])

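13. Creating matrices
a = np.array([1,2,3])
a1 = np.asmatrix(a) # written np.mat(a) originally; np.mat was removed in NumPy 2.0
Output:
matrix([[1, 2, 3]])
type(a1)
Output:
numpy.matrix (shown as numpy.matrixlib.defmatrix.matrix by older NumPy versions)
a1.shape
Output:
(1, 3)
a.shape
Output:
(3,)

b=np.matrix([1,2,3])
Output:
matrix([[1, 2, 3]])

from numpy import *
mat = asmatrix # np.mat was removed in NumPy 2.0; this alias keeps the mat(...) calls below working
data1 = mat(zeros((3,3))) # 3*3 zero matrix
data2 = mat(ones((2,4))) # 2*4 matrix of ones
data3 = mat(random.rand(2,2)) # 2*2 matrix of uniform random numbers in [0, 1)
data4 = mat(random.randint(2,8,size=(2,5))) # 2*5 matrix of random integers in [2, 8)
data5 = mat(eye(2,2,dtype=int)) # 2*2 identity matrix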

14. Common matrix operations
a1 = mat([1,2])
a2 = mat([[1],[2]])
a3 = a1 * a2 # for matrices, * is the matrix product
print(a3)
Output:
[[5]]

print(a1*2) # multiplication by a scalar
Output:
[[2 4]]

a1 = mat(eye(2,2)*0.5)
print(a1.I) # matrix inverse
Output:
[[2. 0.]
 [0. 2.]]

a1 = mat([[1,2],[2,3],[4,2]])
a1.sum(axis=0) # sum of each column
Output:
matrix([[7, 7]])
a1.sum(axis=1) # sum of each row
Output:
matrix([[3],
        [5],
        [6]])
a1.max() # largest element of the matrix
Output:
4
a1.min() # smallest element of the matrix
Output:
1

np.max(a1,0) # maximum of each column
Output:
matrix([[4, 3]])
np.max(a1,1) # maximum of each row
Output:
matrix([[2],
        [3],
        [4]])

a = mat(ones((2,2)))
b = mat(eye((2)))
c = hstack((a,b)) # join the matrices side by side
Output:
matrix([[1., 1., 1., 0.],
        [1., 1., 0., 1.]])
d = vstack((a,b)) # stack the matrices on top of each other
Output:
matrix([[1., 1.],
        [1., 1.],
        [1., 0.],
        [0., 1.]])
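The NumPy documentation no longer recommends the matrix class for new code and suggests regular 2-D arrays instead. As a rough sketch, the same operations with plain ndarrays look like this, using the @ operator for the matrix product and np.linalg.inv in place of the .I attribute (variable names are illustrative):

```python
import numpy as np

a1 = np.array([[1, 2]])            # 1*2 row, kept 2-D explicitly
a2 = np.array([[1], [2]])          # 2*1 column

prod = a1 @ a2                     # matrix product; * would be element-wise here
inv = np.linalg.inv(np.eye(2) * 0.5)

m = np.array([[1, 2], [2, 3], [4, 2]])
col_sums = m.sum(axis=0, keepdims=True)   # keepdims keeps the result 2-D, like a matrix
row_max = m.max(axis=1, keepdims=True)

print(prod, col_sums, row_max.ravel())
```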

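15. Converting between matrices, arrays, and lists
aa = [[1,2],[3,4],[5,6]]
bb = array(aa)
cc = mat(bb)

cc.getA() # matrix to array
cc.tolist() # matrix to list
bb.tolist() # array to list

# A 1-D list is a special case: the matrix is always 2-D
aa = [1,2,3,4]
bb = array(aa)
Output:
array([1, 2, 3, 4])
cc = mat(bb)
Output:
matrix([[1, 2, 3, 4]])

cc.tolist() # nested list, because the matrix has one row
Output:
[[1, 2, 3, 4]]

bb.tolist()
Output:
[1, 2, 3, 4]

cc.tolist()[0] # take the first row to recover the flat list
Output:
[1, 2, 3, 4]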