convolutions

The Cross-Correlation Operation

Suppose we have a 3x3 image with a single color channel and a 2x2 convolution kernel. The cross-correlation operation is then defined as shown in the figure below:

[Figure: cross-correlation-example]
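
In symbols, the operation can be written as follows. This is a sketch of the standard definition with stride 1 and no padding; the symbol names (X for the input of size n_h x n_w, K for the kernel of size h x w, Y for the output) and the 0-based indices are chosen here for illustration.

```latex
Y_{i,j} = \sum_{a=0}^{h-1} \sum_{b=0}^{w-1} X_{i+a,\, j+b} \, K_{a,b},
\qquad 0 \le i \le n_h - h, \quad 0 \le j \le n_w - w
```

Each output element is the sum of the element-wise product between the kernel and the input window whose top-left corner sits at (i, j).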

Implementation:

import tensorflow as tf

def corr2d(X, K):
    """Compute 2D cross-correlation."""
    h, w = K.shape
    # The output shrinks by (kernel size - 1) along each dimension.
    Y = tf.Variable(tf.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1)))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Multiply the current window by the kernel element-wise and sum.
            Y[i, j].assign(tf.reduce_sum(X[i:i + h, j:j + w] * K))
    return Y
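
As a quick sanity check of the definition, the sketch below re-implements the same sliding-window sum in plain Python (no TensorFlow) and applies it to a 3x3 input and a 2x2 kernel. The function name `corr2d_plain` and the sample values are illustrative assumptions, not taken from the original figure.

```python
def corr2d_plain(X, K):
    """Cross-correlate two nested lists (hypothetical helper, stride 1, no padding)."""
    h, w = len(K), len(K[0])
    out_h, out_w = len(X) - h + 1, len(X[0]) - w + 1
    Y = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Sum of the element-wise product of the window at (i, j) with the kernel.
            Y[i][j] = sum(X[i + a][j + b] * K[a][b]
                          for a in range(h) for b in range(w))
    return Y


# Illustrative values only.
X = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
K = [[0, 1], [2, 3]]
Y = corr2d_plain(X, K)
print(Y)  # [[19, 25], [37, 43]]
```

For instance, the top-left output is 0*0 + 1*1 + 3*2 + 4*3 = 19.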

Custom Convolution Layer

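class Conv2DDense(tf.keras.layers.Layer):
    def __init__(self, kernel_size):
        super().__init__()
        # kernel_size is a (height, width) tuple and does not depend on the input size.
        self.kernel_size = kernel_size
        self.weight = None
        self.bias = None

    def build(self, input_shape):
        # Keras calls build() with the input shape on first use; the weights are created here.
        initializer = tf.random_normal_initializer()
        self.weight = self.add_weight(name="weight", shape=self.kernel_size, initializer=initializer)
        self.bias = self.add_weight(name="bias", shape=(1,), initializer=initializer)

    def call(self, inputs, **kwargs):
        # Cross-correlate the 2D input with the kernel, then add the scalar bias.
        return corr2d(inputs, self.weight) + self.bias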
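
Note that the output shape of a convolutional layer is determined jointly by the input and the layer's own kernel, whereas the output shape of the custom layer defined previously was chosen by the user at initialization.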
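
To make the shape rule concrete, here is a minimal sketch assuming stride 1 and no padding, which matches the `corr2d` loop bounds. The helper name `corr2d_output_shape` is hypothetical.

```python
def corr2d_output_shape(input_shape, kernel_shape):
    """Output (height, width) of a stride-1, no-padding cross-correlation."""
    (n_h, n_w), (k_h, k_w) = input_shape, kernel_shape
    if k_h > n_h or k_w > n_w:
        raise ValueError("kernel must not be larger than the input")
    return (n_h - k_h + 1, n_w - k_w + 1)


print(corr2d_output_shape((3, 3), (2, 2)))  # (2, 2)
print(corr2d_output_shape((6, 8), (1, 2)))  # (6, 7)
```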

Learning a Kernel

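When convolving an image, we often do not know in advance what value each element of the kernel should take. Suppose we know the input image and the output image that the convolution should produce. Finding the value of each kernel element is then essentially a linear least-squares fitting problem (the output is linear in the kernel elements), and gradient descent can be used to find an approximate numerical solution.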

from typing import Union

def learning_a_kernel(input: Union[tf.Tensor, tf.Variable], wanted_output: Union[tf.Tensor, tf.Variable], kernel_shape):
    # Construct a two-dimensional convolutional layer with 1 output channel and a
    # kernel of kernel_shape. For the sake of simplicity, we ignore the bias here.
    conv2d = tf.keras.layers.Conv2D(1, kernel_size=kernel_shape, use_bias=False)

    # Conv2D expects input of shape (batch, height, width, channels).
    reshaped_input = tf.reshape(input, (1, input.shape[0], input.shape[1], 1))
    reshaped_wanted_output = tf.reshape(wanted_output, (1, wanted_output.shape[0], wanted_output.shape[1], 1))
    _ = conv2d(reshaped_input)  # a first call builds the layer and creates its weights
    for i in range(10):
        # Trainable layer weights are watched by the tape automatically.
        with tf.GradientTape() as g:
            calculated_output = conv2d(reshaped_input)
            # Squared error between the layer's output and the wanted output.
            loss = (calculated_output - reshaped_wanted_output) ** 2

        # Gradient descent step with learning rate 3.21e-2.
        update = tf.multiply(3.21e-2, g.gradient(loss, conv2d.weights[0]))

        weights = conv2d.get_weights()
        weights[0] = conv2d.weights[0] - update
        conv2d.set_weights(weights)

    # Return the learned kernel, shape (kernel_height, kernel_width, 1, 1).
    return conv2d.get_weights()[0]
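
The update rule can also be spelled out without TensorFlow, which makes the gradient explicit. The following is a minimal plain-Python sketch, not the implementation above: the helper names, the sample image, the target kernel, the all-zero initial kernel, the learning rate and the step count are all assumptions chosen so that this small example converges.

```python
def corr2d_plain(X, K):
    """Plain-Python cross-correlation (stride 1, no padding)."""
    h, w = len(K), len(K[0])
    return [[sum(X[i + a][j + b] * K[a][b] for a in range(h) for b in range(w))
             for j in range(len(X[0]) - w + 1)]
            for i in range(len(X) - h + 1)]


def learn_kernel_plain(X, Y_wanted, kernel_shape, lr=0.02, steps=100):
    """Fit a kernel by gradient descent on the summed squared error."""
    h, w = kernel_shape
    K = [[0.0] * w for _ in range(h)]  # start from an all-zero kernel
    for _ in range(steps):
        Y = corr2d_plain(X, K)
        # dL/dK[a][b] = sum over (i, j) of 2 * (Y[i][j] - Y_wanted[i][j]) * X[i+a][j+b]
        grad = [[sum(2.0 * (Y[i][j] - Y_wanted[i][j]) * X[i + a][j + b]
                     for i in range(len(Y)) for j in range(len(Y[0])))
                 for b in range(w)]
                for a in range(h)]
        # Step against the gradient.
        K = [[K[a][b] - lr * grad[a][b] for b in range(w)] for a in range(h)]
    return K


# Illustrative data: a 6x8 image with a dark band and a 1x2 edge-detecting kernel.
X = [[1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(6)]
K_true = [[1.0, -1.0]]
Y_wanted = corr2d_plain(X, K_true)

K_learned = learn_kernel_plain(X, Y_wanted, (1, 2))
print(K_learned)  # close to [[1.0, -1.0]]
```

Because the loss is quadratic in the kernel elements, gradient descent converges here for any sufficiently small learning rate; too large a step makes the iterates diverge.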