
How can a PDE be solved using a Neural Network?


I have tried to search for a simple answer to this question, but all the explanations I found are aimed at people who already understand numerical methods and PDEs. I have a computer science background, and the papers read like jargon to me. I have learned about and used machine learning algorithms for classification and regression problems on images or ordinary tabular data (with input features and a target class). I have also used neural networks, but only as a black box that produces the required outputs for given inputs, tuning parameters for the best accuracy.

Given a PDE to solve, I fail to understand what the input to the neural network is when the net is trained on examples. In the language of machine learning: what are the examples, what are the input features, and what are the parameters?

Any help to understand this would be appreciated.


Solution

  • It depends, of course, on the type of PDE. Many PDEs describe the evolution of a spatially distributed system over time. The state of such a system is given by a value v(x,t) that depends on a spatial variable x, which is usually a vector, and on time t. The value itself can be a vector as well.

    Learning samples can consist of discrete values of the initial condition, v_{i,0} = v(x_i, 0), and the values v_{i,j} = v(x_i, t_j) at later times t_j. The initial values are presented to the network as inputs. The outputs of the (recurrent) network at the later times t_j are then compared with the sample values to calculate the error.
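To make "examples" and "input features" concrete, here is a minimal sketch of how such learning samples could be generated. It assumes a specific PDE that the answer does not name (the 1D heat equation with periodic boundaries) and produces the data with a simple explicit finite-difference solver. The function names `simulate_heat` and `make_samples` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def simulate_heat(v0, alpha=0.1, dx=1.0, dt=1.0, steps=5):
    """Return the states v(x_i, t_j) for j = 0..steps, shape (steps + 1, len(v0)).

    Assumption: 1D heat equation v_t = alpha * v_xx on a periodic grid,
    advanced with an explicit finite-difference scheme.
    """
    r = alpha * dt / dx**2  # the explicit scheme is only stable for r <= 0.5
    states = [np.asarray(v0, dtype=float)]
    for _ in range(steps):
        v = states[-1]
        # Discrete second derivative with wrap-around (periodic) neighbors.
        lap = np.roll(v, 1) - 2.0 * v + np.roll(v, -1)
        states.append(v + r * lap)
    return np.stack(states)

def make_samples(n_samples=4, n_points=32, steps=5, seed=0):
    """Build (input, target) pairs from random initial conditions."""
    rng = np.random.default_rng(seed)
    inputs, targets = [], []
    for _ in range(n_samples):
        traj = simulate_heat(rng.standard_normal(n_points), steps=steps)
        inputs.append(traj[0])    # initial condition v(x_i, 0): the network input
        targets.append(traj[1:])  # later states v(x_i, t_j): compared with the outputs
    return np.stack(inputs), np.stack(targets)

inputs, targets = make_samples()
print(inputs.shape, targets.shape)  # (4, 32) (4, 5, 32)
```

In machine learning terms, each sample is one initial condition, its input features are the values on the spatial grid, and the targets are the grid values at the later times. In practice the samples could just as well come from measurements instead of a solver.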

    Since PDEs are often spatially homogeneous (the same rule applies at every point in space), it makes sense to use recurrent convolutional networks here. This is the same type of network that is often used in image processing.

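    The size of the convolutional kernel depends on the order of the spatial derivatives in your PDE. A finite-difference approximation of an n-th order derivative needs at least n + 1 grid points, so higher-order derivatives call for wider kernels.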
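The link between kernel size and derivative order can be checked numerically. The sketch below applies two fixed 3-point kernels, the standard centered finite-difference stencils for the first and second derivative, to a sampled sine wave. The grid size, the periodic padding, and the helper name `conv_periodic` are assumptions made for this illustration. In a trained network the kernel weights would be learned rather than fixed.

```python
import numpy as np

def conv_periodic(v, kernel):
    """Slide a 3-point kernel over v, wrapping around at both ends."""
    padded = np.concatenate([v[-1:], v, v[:1]])
    # np.correlate applies the kernel without flipping it, which is what
    # deep-learning libraries usually call a "convolution".
    return np.correlate(padded, kernel, mode="valid")

n = 200
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dx = x[1] - x[0]
v = np.sin(x)

k_first = np.array([-0.5, 0.0, 0.5]) / dx     # approximates d/dx
k_second = np.array([1.0, -2.0, 1.0]) / dx**2  # approximates d^2/dx^2

dv = conv_periodic(v, k_first)
d2v = conv_periodic(v, k_second)

# The derivatives of sin(x) are cos(x) and -sin(x); both errors are small.
print(np.max(np.abs(dv - np.cos(x))))
print(np.max(np.abs(d2v + np.sin(x))))
```

A kernel of width 3 is therefore enough to represent first and second spatial derivatives in one dimension; higher orders, or more accurate approximations, need wider kernels.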

    The network needs to have at least one layer for each component of the value v. Additional hidden layers might be required, especially if the PDE has higher-order temporal derivatives.
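The simplest case of this setup can be sketched end to end: a scalar value v, a PDE that is first order in time, and therefore a single convolutional layer with one 3-weight kernel that is applied recurrently. This is a toy illustration under stated assumptions, not a full neural network. The training data comes from a hypothetical "true" update rule (the explicit heat-equation step with kernel `[0.1, 0.8, 0.1]`), and because the model is linear, its three weights are fitted by least squares instead of the gradient descent a real framework would use.

```python
import numpy as np

def neighbors(v):
    """Stack left neighbor, center, and right neighbor (periodic) on a new last axis."""
    return np.stack([np.roll(v, 1, axis=-1), v, np.roll(v, -1, axis=-1)], axis=-1)

def step(v, kernel):
    """One recurrent step: a 3-point convolution with weights shared over space."""
    return neighbors(v) @ kernel

# Assumed ground truth used only to create training data.
true_kernel = np.array([0.1, 0.8, 0.1])
rng = np.random.default_rng(0)
v0 = rng.standard_normal((8, 32))  # 8 initial conditions on 32 grid points
traj = [v0]
for _ in range(5):
    traj.append(step(traj[-1], true_kernel))
traj = np.stack(traj, axis=1)      # shape (8, 6, 32): sample, time, space

# Fit the kernel so that each state predicts the next one.
X = neighbors(traj[:, :-1]).reshape(-1, 3)
y = traj[:, 1:].reshape(-1)
kernel, *_ = np.linalg.lstsq(X, y, rcond=None)

# Roll the learned model out from the initial condition only and compare
# its outputs with the sample values at the later times.
pred, max_err = v0, 0.0
for j in range(1, traj.shape[1]):
    pred = step(pred, kernel)
    max_err = max(max_err, float(np.max(np.abs(pred - traj[:, j]))))

print(kernel, max_err)
```

Here the learned parameters are just the three kernel weights. A vector-valued v would need more components per grid point, and a nonlinear PDE would need nonlinear hidden layers, at which point a deep-learning framework with backpropagation becomes the practical choice.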

    This is a very brief answer to a complex question, but I hope it points you in the right direction. I am also aware that I used some PDE jargon, but I tried to reduce it to the absolute minimum.