PyTorch Custom Gradients (Defining a BP Operation)

Published: 2020-02-27 · Category: Deep Learning

If you need an operation that PyTorch cannot differentiate automatically, you may have to implement the backward pass (BP) yourself. How do you plug such an operation into the framework? PyTorch provides an extension mechanism for exactly this: torch.autograd.Function.

In the implementation below, we write our own custom autograd Function that performs the ReLU activation.

import torch


class MyReLU(torch.autograd.Function):
    """
    We can implement our own custom autograd Functions by subclassing
    torch.autograd.Function and implementing the forward and backward passes
    which operate on Tensors.
    """

    @staticmethod
    def forward(ctx, input):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for backward computation. You can cache arbitrary
        objects for use in the backward pass using the ctx.save_for_backward method.
        """
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the loss
        with respect to the output, and we need to compute the gradient of the loss
        with respect to the input.
        """
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input


dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0")  # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold input and outputs.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for weights.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(500):
    # To apply our Function, we use Function.apply method. We alias this as 'relu'.
    relu = MyReLU.apply

    # Forward pass: compute predicted y using operations; we compute
    # ReLU using our custom autograd operation.
    y_pred = relu(x.mm(w1)).mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum()
    if t % 100 == 99:
        print(t, loss.item())

    # Use autograd to compute the backward pass.
    loss.backward()

    # Update weights using gradient descent
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()

Note that since PyTorch 1.3, forward and backward must be decorated with @staticmethod, otherwise the Function cannot be called correctly. Another point that is easy to overlook is this line of code:

relu = MyReLU.apply

Only by calling the Function through .apply are the arguments passed through (and the autograd graph recorded) correctly; you should not instantiate the Function and call it directly.
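If you want to verify that a hand-written backward actually matches the numerical gradient, torch.autograd.gradcheck can be used. A minimal sketch, assuming MyReLU from above and double-precision inputs (the check is unreliable exactly at 0, where ReLU is not differentiable, but random inputs essentially never land there):

    import torch

    # Compare MyReLU's analytical backward against finite-difference gradients.
    x = torch.randn(8, 5, dtype=torch.double, requires_grad=True)
    ok = torch.autograd.gradcheck(MyReLU.apply, (x,), eps=1e-6, atol=1e-4)
    print(ok)  # True if the custom backward matches the numerical gradient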

In addition, after defining a custom Function it is common to wrap it in an nn.Module, much like the relationship between nn.functional.conv2d() and nn.Conv2d(); see the sketch below.
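As a hedged illustration (the class name MyReLUModule is made up for this post, not part of PyTorch), such a wrapper might look like this:

    import torch
    import torch.nn as nn


    class MyReLUModule(nn.Module):
        """nn.Module wrapper around the custom autograd Function, analogous to
        how nn.Conv2d wraps nn.functional.conv2d."""

        def forward(self, input):
            # Delegate to the Function; .apply is what records the autograd graph.
            return MyReLU.apply(input)


    # Usage: the wrapper composes like any other layer.
    model = nn.Sequential(nn.Linear(1000, 100), MyReLUModule(), nn.Linear(100, 10))

Wrapping the Function in a Module lets it hold parameters or buffers later and lets it participate in model.children(), nn.Sequential, and state_dict handling like a regular layer.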

Author: HeoLis
Original link: https://ishero.net/Pytorch%E8%87%AA%E5%AE%9A%E4%B9%89%E6%B1%82%E5%AF%BC%E5%8A%9F%E8%83%BD%EF%BC%88%E5%AE%9A%E4%B9%89BP%E6%93%8D%E4%BD%9C%EF%BC%89.html
License: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.
