Understanding Tensorflow's RNN model

The RNN cell implementation in Tensorflow can be found here. The RNN model can be found here.

One great LSTM RNN tutorial is Colah’s Understanding LSTM Networks.

RNN Cell

The basic definition of an RNN cell in Tensorflow is

```python
def __call__(self, inputs):
    return inputs
```
```python
def __call__(self, inputs, state, scope=None):
    """Run this RNN cell on inputs, starting from the given state.

    Args:
        inputs: 2D Tensor with shape [batch_size x self.input_size].
        state: 2D Tensor with shape [batch_size x self.state_size].
        scope: VariableScope for the created subgraph; defaults to class name.

    Returns:
        A pair containing:
        - Output: A 2D Tensor with shape [batch_size x self.output_size].
        - New state: A 2D Tensor with shape [batch_size x self.state_size].
    """
```
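To make the contract concrete, here is a hypothetical toy cell (not from Tensorflow; the class name and the fixed element-wise "weights" are made up for illustration) that follows the same `(inputs, state) -> (output, new_state)` shape, driven over two time steps:

```python
import math

class ToyRNNCell:
    """Hypothetical cell following the (inputs, state) -> (output, new_state)
    contract above; the update rule is a stand-in for a learned linear map."""

    def __init__(self, state_size):
        self.state_size = state_size

    def __call__(self, inputs, state):
        # new_state = tanh(inputs + state), element-wise.
        new_state = [math.tanh(x + s) for x, s in zip(inputs, state)]
        # Like BasicRNNCell, the output is the new state itself.
        return new_state, new_state

# Drive the cell over two time steps, threading the state through.
cell = ToyRNNCell(state_size=3)
state = [0.0, 0.0, 0.0]
for inputs in [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]:
    output, state = cell(inputs, state)
```

The loop is the essence of an unrolled RNN: each step consumes one input slice and the previous state, and hands the new state to the next step.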

An instance of this interface, BasicRNNCell, looks like

```python
def __call__(self, inputs, state, scope=None):
    """Most basic RNN: output = new_state = tanh(W * input + U * state + B)."""
    with vs.variable_scope(scope or type(self).__name__):  # "BasicRNNCell"
        output = tanh(linear([inputs, state], self._num_units, True))
    return output, output
```

As we can see from the code, an RNN cell takes two arguments, inputs and state, computes the new output and state, and returns both. Inside the cell, the calculation is performed by tanh(linear([inputs, state], self._num_units, True)), so we need to check the definition of linear, which is
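Note that linear multiplies the concatenation of inputs and state by a single stacked matrix, which is equivalent to computing W * input + U * state with two separate matrices and summing. A small pure-Python sketch (the 2x2 weights and vectors are made up; no Tensorflow involved) checks this equivalence:

```python
import math

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# Hypothetical weights: W acts on the input, U acts on the state.
W = [[0.1, 0.2], [0.3, 0.4]]
U = [[0.5, 0.6], [0.7, 0.8]]
x = [1.0, -1.0]   # input vector
h = [0.5, 0.25]   # state vector

# Two separate products, as in output = tanh(W * input + U * state).
separate = [math.tanh(a + b) for a, b in zip(matvec(W, x), matvec(U, h))]

# One product over the concatenation, as linear([inputs, state], ...) does:
# stack W and U side by side and multiply by [x; h].
WU = [w_row + u_row for w_row, u_row in zip(W, U)]
joined = [math.tanh(v) for v in matvec(WU, x + h)]

assert all(abs(a - b) < 1e-12 for a, b in zip(separate, joined))
```

This is why the cell can pass [inputs, state] as a list and learn a single "Matrix" variable instead of two.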

```python
def linear(args, output_size, bias, bias_start=0.0, scope=None):
    """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

    Args:
        args: a 2D Tensor or a list of 2D, batch x n, Tensors.
        output_size: int, second dimension of W[i].
        bias: boolean, whether to add a bias term or not.
        bias_start: starting value to initialize the bias; 0 by default.
        scope: VariableScope for the created subgraph; defaults to "Linear".

    Returns:
        A 2D Tensor with shape [batch x output_size] equal to
        sum_i(args[i] * W[i]), where W[i]s are newly created matrices.

    Raises:
        ValueError: if some of the arguments has unspecified or wrong shape.
    """
    assert args
    if not isinstance(args, (list, tuple)):
        args = [args]

    # Calculate the total size of arguments on dimension 1.
    total_arg_size = 0
    shapes = [a.get_shape().as_list() for a in args]
    for shape in shapes:
        if len(shape) != 2:
            raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
        if not shape[1]:
            raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
        else:
            total_arg_size += shape[1]

    # Now the computation.
    with vs.variable_scope(scope or "Linear"):
        matrix = vs.get_variable("Matrix", [total_arg_size, output_size])
        if len(args) == 1:
            res = math_ops.matmul(args[0], matrix)
        else:
            res = math_ops.matmul(array_ops.concat(1, args), matrix)
        if not bias:
            return res
        bias_term = vs.get_variable(
            "Bias", [output_size],
            initializer=init_ops.constant_initializer(bias_start))
    return res + bias_term
```
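As a quick sanity check on the shapes linear produces for the BasicRNNCell call above, here is a shape-bookkeeping sketch; the concrete sizes (batch_size=4, input_size=5, state_size=3, num_units=3) are assumptions chosen for illustration:

```python
# Trace the shapes through linear([inputs, state], num_units, True).
batch_size, input_size, state_size, num_units = 4, 5, 3, 3

# inputs is [batch_size x input_size], state is [batch_size x state_size].
shapes = [[batch_size, input_size], [batch_size, state_size]]

# linear sums the sizes on dimension 1 (the concat axis).
total_arg_size = sum(shape[1] for shape in shapes)

matrix_shape = [total_arg_size, num_units]  # the "Matrix" variable
result_shape = [batch_size, num_units]      # res = concat(args) @ Matrix
```

So the single learned matrix has (input_size + state_size) rows, and the output of the cell is [batch_size x num_units].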