Mathematics for Data Science — Linear Algebra Tricks
Data science problems are full of mathematical complexity. Between weights, inputs, outputs, and subsequent activations, linear algebra plays a fundamental role in the computations behind deep learning and data science problems. This is also the reason why these problems tend to run faster on GPUs than on CPUs. However, most GPU operations are not indexing friendly, so index operations should be minimized in favor of matrix operations for better performance on GPUs.
RNN-based Autoencoders
RNNs and their derivatives, LSTMs and GRUs, pose unique challenges. When RNNs are trained, back-propagation is carried out through the unfolded RNN. This essentially means the output of the encoder RNN is presented as a matrix of size N x S, where N is the number of output values and S is the number of RNN states. When this data is carried over to the decoder, the last column, i.e. the final RNN state m[0:N, -1], has to be picked out and replicated in all the columns of the decoder input array of size N x S.
Extracting the Last Column of a Matrix
The last column of a matrix can be selected by using m[0:N, -1] as an indexing operation. However, indexing may not be the best option for GPU performance. Here is an alternative.
Create a row matrix X of size 1 x S where all elements are 0 except the last element, which is 1.
X = [0 0 0 … 1], a 1 x S row matrix
A = matrix of size N x S
C = A x Transpose(X), a matrix of size N x 1
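A minimal sketch of this trick in NumPy (NumPy is an assumption here for illustration; the original uses pseudocode, and in practice the same expression would run on a GPU framework such as PyTorch or TensorFlow):

import numpy as np

N, S = 4, 5                                       # illustrative sizes: N outputs, S RNN states
A = np.arange(N * S, dtype=float).reshape(N, S)   # stand-in for the encoder output matrix
X = np.zeros((1, S))                              # 1 x S selector row, all zeros ...
X[0, -1] = 1.0                                    # ... except the last element
C = A @ X.T                                       # (N x S) x (S x 1) -> N x 1, the last column of A
assert np.allclose(C[:, 0], A[:, -1])             # same result as the indexed A[:, -1]

On a GPU this becomes a single matrix multiplication rather than an indexing (gather) operation.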
Replicating the Column Matrix in All Columns
Replicating the column matrix across all S columns is fairly simple. Multiplying the column matrix C by a row matrix of length S containing all 1s achieves this.
R = [1 1 1 … 1], a 1 x S row matrix
C = matrix of size N x 1
decoder input = C x R, a matrix of size N x S
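Sketched again in NumPy under the same assumptions (here C would be the N x 1 column extracted in the previous step):

import numpy as np

N, S = 4, 5
C = np.arange(N, dtype=float).reshape(N, 1)       # N x 1 column, e.g. the final encoder state
R = np.ones((1, S))                               # 1 x S row of all ones
decoder_input = C @ R                             # (N x 1) x (1 x S) -> N x S, C copied into every column
assert np.allclose(decoder_input, np.repeat(C, S, axis=1))   # matches index-based tiling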
Reversing the Input Sequence
RNN-based autoencoders sometimes expect the results to be generated in the reverse order of the input sequence. Since the data is presented as an N x S matrix, the output is also expected as an N x S matrix, but with the S-th state generated as the first output of the decoder.
Reversing the column order of a matrix of size N x S can be easily achieved by multiplying the matrix by the cross-diagonal (exchange) matrix of size S x S.
The cross-diagonal matrix of size S x S is represented as below (with 0-based indices):
XD = [[1 if (i + j == S - 1) else 0 for j in range(S)] for i in range(S)]
Mrev = M x XD
Mrev will have all the columns of M in reverse order.
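A NumPy sketch of the reversal, again only as an illustration of the idea:

import numpy as np

N, S = 4, 5
M = np.arange(N * S, dtype=float).reshape(N, S)
# cross-diagonal (exchange) matrix: 1 where i + j == S - 1 with 0-based indices, 0 elsewhere
XD = np.array([[1.0 if i + j == S - 1 else 0.0 for j in range(S)] for i in range(S)])
M_rev = M @ XD                                    # columns of M in reverse order
assert np.allclose(M_rev, M[:, ::-1])             # same result as slice-based reversal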
Note: The pseudocode in the steps above is not exact Python and may not work directly through copy and paste.