Tags: python, keras, deep-learning, taylor-series

How to implement a Maclaurin series in Keras?


I am trying to implement an expandable CNN using the Maclaurin series. The basic idea is that a single input node can be decomposed into multiple nodes with different orders and coefficients. Decomposing a single node into multiple ones generates the different non-linear connections that the Maclaurin series produces. Can anyone give me a possible idea of how to extend a CNN with a Maclaurin-series non-linear expansion? Any thoughts?

I cannot quite understand how to decompose an input node into multiple ones with the different non-linear connections generated by the Maclaurin series. As far as I know, the Maclaurin series is a function approximation, but decomposing a node is not intuitive to me in terms of implementation. How do I implement the decomposition of an input node into multiple ones in Python? How can I make this happen easily? Any ideas?
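
To make clearer what I mean by decomposing a node, here is a small NumPy sketch of the idea as I understand it (the helper maclaurin_terms and the coefficients are only my illustration, not part of any existing model):

import math
import numpy as np

# A Maclaurin series approximates f around 0: f(x) ≈ sum_n a_n * x^n with
# a_n = f^(n)(0) / n!. "Decomposing" one input node means replacing the
# scalar x by the vector of its power terms [x^0, x^1, ..., x^N].
def maclaurin_terms(x, order):
    return np.array([x ** n for n in range(order + 1)])

# Example: approximate exp(x) up to order 3 with the known coefficients 1/n!.
x = 0.5
coeffs = np.array([1.0 / math.factorial(n) for n in range(4)])
print(coeffs @ maclaurin_terms(x, 3))  # ~1.6458, close to exp(0.5) ~ 1.6487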

My attempt:

import tensorflow as tf
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Dropout, Flatten, Input, Activation, add
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

(train_imgs, train_label), (test_imgs, test_label)= cifar10.load_data()
output_class = np.unique(train_label)
n_class = len(output_class)

nrows_tr, ncols_tr, ndims_tr = train_imgs.shape[1:]
nrows_ts, ncols_ts, ndims_ts = test_imgs.shape[1:]
train_data = train_imgs.reshape(train_imgs.shape[0], nrows_tr, ncols_tr, ndims_tr)

test_data = test_imgs.reshape(test_imgs.shape[0], nrows_ts, ncols_ts, ndims_ts)
input_shape = (nrows_tr, ncols_tr, ndims_tr)
train_data = train_data.astype('float32')
test_data = test_data.astype('float32')
train_data /= 255
test_data /= 255
train_label_one_hot = to_categorical(train_label)
test_label_one_hot = to_categorical(test_label)

def pown(x, n):
    return x ** n

def expandable_cnn(input_shape, output_shape, approx_order):
    inputs = Input(shape=input_shape)
    x = Dense(input_shape)(inputs)
    y = Dense(output_shape)(x)
    model = Sequential()
    model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation="relu", input_shape=input_shape))
    model.add(Conv2D(filters=32, kernel_size=(3,3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    for i in range(2, approx_order+1):
        y = add([y, Dense(output_shape)(Activation(lambda x: pown(x, n=i))(x))])
    model.add(Dense(n_class, activation='softmax')(y))
    return model

But when I ran the above model, I got a bunch of compile and dimension errors. I assume my way of adding the Taylor non-linear expansion to the CNN model is not correct. Also, I am not sure how to represent the weights. How can I make this work? Any ideas on how to correct my attempt?

Desired output:

I expect to extend the CNN with a Maclaurin-series non-linear expansion. How can I make the above implementation correct and efficient? Any possible ideas or approaches?


Solution

  • Interesting question. I have implemented a Keras model that computes the Taylor expansion as you described:

    import tensorflow as tf
    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Dense, Input, Lambda, LocallyConnected2D
    
    
    def taylor_expansion_network(input_dim, max_pow):
        x = Input((input_dim,))
    
        # 1. Raise input x_i to power p_i for each i in [0, max_pow].
        def raise_power(x, max_pow):
            x_ = x[..., None]  # Shape=(batch_size, input_dim, 1)
            x_ = tf.tile(x_, multiples=[1, 1, max_pow + 1])  # Shape=(batch_size, input_dim, max_pow+1)
            pows = tf.range(0, max_pow + 1, dtype=tf.float32)  # Shape=(max_pow+1,)
            x_p = tf.pow(x_, pows)  # Shape=(batch_size, input_dim, max_pow+1)
            x_p_ = x_p[..., None]  # Shape=(batch_size, input_dim, max_pow+1, 1)
            return x_p_
    
        x_p_ = Lambda(lambda x: raise_power(x, max_pow))(x)
    
        # 2. Multiply by alpha coefficients
        h = LocallyConnected2D(filters=1,
                               kernel_size=1,  # This layer is computing a_i * x^{p_i} for each i in [0, max_pow]
                               use_bias=False)(x_p_)  # Shape=(batch_size, input_dim, max_pow+1, 1)
    
        # 3. Compute s_i for each i in [0, max_pow]
        def cumulative_sum(h):
            h = tf.squeeze(h, axis=-1)  # Shape=(batch_size, input_dim, max_pow+1)
            s = tf.cumsum(h, axis=-1)  # s_i = sum_{j=0}^i h_j. Shape=(batch_size, input_dim, max_pow+1)
            s_ = s[..., None]  # Shape=(batch_size, input_dim, max_pow+1, 1)
            return s_
    
        s_ = Lambda(cumulative_sum)(h)
    
        # 4. Compute the sum of w_i * s_i over i in [0, max_pow]
        s_ = LocallyConnected2D(filters=1,  # This layer is computing w_i * s_i for each i in [0, max_pow]
                                kernel_size=1,
                                use_bias=False)(s_)  # Shape=(batch_size, input_dim, max_pow+1, 1)
        y = Lambda(lambda s_: tf.reduce_sum(tf.squeeze(s_, axis=-1), axis=-1))(s_)  # Shape=(batch_size, input_dim)
    
        # Return Taylor expansion model
        model = Model(inputs=x, outputs=y)
        model.summary()
        return model
    

    The implementation applies the same Taylor expansion to each element of the flattened tensor with shape (batch_size, input_dim=512) coming from the convolutional network.
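
    A quick way to sanity-check the expansion network on its own might look like this (the batch size and max_pow here are arbitrary; I only check the output shape):

    import numpy as np

    # Sanity check on random data: input_dim=512 matches the flattened CNN
    # features, and the output should keep the same shape (batch_size, 512).
    taylor_model = taylor_expansion_network(input_dim=512, max_pow=4)
    dummy = np.random.rand(8, 512).astype('float32')
    print(taylor_model.predict(dummy).shape)  # (8, 512)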


    UPDATE: As we discussed in the comments section, here is some code to show how your function expandable_cnn could be modified to integrate the model defined above:

    def expandable_cnn(input_shape, nclass, approx_order):
        inputs = Input(shape=input_shape)
        h = inputs
        h = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(h)
        h = Conv2D(filters=32, kernel_size=(3, 3), activation='relu')(h)
        h = MaxPooling2D(pool_size=(2, 2))(h)
        h = Dropout(0.25)(h)
        h = Flatten()(h)
        h = Dense(512, activation='relu')(h)
        h = Dropout(0.5)(h)
        taylor_model = taylor_expansion_network(input_dim=512, max_pow=approx_order)
        h = taylor_model(h)
        h = Activation('relu')(h)
        print(h.shape)
        h = Dense(nclass, activation='softmax')(h)
        model = Model(inputs=inputs, outputs=h)
        return model
    

    Please note that I do not guarantee that your model will work (e.g. that you will get good performance). I just provided a solution based on my interpretation of what you want.
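
    If it helps, a minimal way to build and train the combined model with the CIFAR-10 arrays prepared in your question could look like this (the optimizer, batch size, and number of epochs are arbitrary choices, not recommendations):

    # Training sketch using the variables from the question
    # (train_data, train_label_one_hot, input_shape, n_class).
    model = expandable_cnn(input_shape=input_shape, nclass=n_class, approx_order=4)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(train_data, train_label_one_hot,
              batch_size=64, epochs=10,
              validation_data=(test_data, test_label_one_hot))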