
Automatically find the scaling factor of the x-axis using LsqFit (or other method)?


I have the following data: a vector B and a vector R. The vector B is the "independent" variable. For this pair I have two data sets: one is an experimental measurement (Bex, Rex) and the other is a simulation I produced (Bsim, Rsim). The simulation has no "scale" for the x-axis (the B vector). Therefore, when trying to fit my curve to the experiment, I have to find a scaling parameter B0 "by eye", multiply the entire Bsim vector by this number B0, and simply plot(Bsim, Rsim, Bex, Rex).

I wanted to use the LsqFit package to make the procedure automatic and more accurate. However, I am having trouble understanding how I could use it to find the scaling of the independent variable.

My first thought was to just "invert" the roles of B and R. However, there are two issues that I think make matters worse: 1) the R curve/data is not monotonic, 2) the experimental data are much more "dense" (they have more data points: my simulation has 120 points in total, while the experiments have some thousands).

Below I give an example of what I am trying to accomplish (of course, the answer need not use LsqFit). I also attach two figures that demonstrate everything very clearly.

#= stuff happened before this point =#

Bsim, Rsim = load(simulation)
Bex, Rex = load(experiment)

#this is what I want to do:
some_model(x, p) = ???
fit = curve_fit(some_model, Bex, Rex, [3.5])
B0 = fit.param[1]

#this is what I currently do by trial and error:
B0 = 3.85

plot(B0*Bsim, Rsim, Bex, Rex)

P.S.: The R curves (dependent variables) are both normalized by their maximum value because their scaling is not important.

[Figures: before fit, after fit]


Solution

  • A simple approach, if you can always expect both your experiment and simulation to feature one high peak, and you're sure that there's only a scaling factor rather than also an offset, is to simply multiply your Bsim vector by the ratio of the peak positions, mode_rex / mode_rsim (e.g. in your example, mode_rsim = 1 and mode_rex = 4, so multiply Bsim by 4). But I'm sure you've thought of this already.

    For a more general approach, one way is as follows:

    • add and load Interpolations package
    • Create a grid to interpolate over, e.g. Grid = Bex[1]:0.01:Bex[end] (stay within the range of Bex, since a Gridded interpolant throws an error outside its knots)
    • interpolate Rex over that grid, e.g.

      RexInterp = interpolate( (Bex,), Rex, Gridded(Linear())); 
      RexGridVec = RexInterp.(Grid);

      (with current versions of Interpolations.jl the interpolant is called like a function, so broadcast with a dot; the old RexInterp[Grid] indexing is deprecated)
      
    • interpolate Rsim over the same grid, but introduce your multiplier on the Bsim "knots", e.g.

      Multiplier = 0.1; 
      RsimInterp = interpolate( (Multiplier * Bsim,), Rsim, Gridded(Linear()));
      RsimGridVec = RsimInterp.(Grid)

      (if Grid extends beyond the range of Multiplier * Bsim, wrap the interpolant in extrapolate(RsimInterp, 0.0), or restrict the grid to the overlapping range)
      
    • Now you can calculate a square error value between RsimGridVec and RexGridVec, e.g.

      SqErr = sum((RsimGridVec - RexGridVec).^2)
      

    If you follow this technique, you can loop over a range of multipliers (say 0:0.01:10), compute the square error for each, and pick the multiplier for which the square error is the minimum.

    In theory, if you wanted to find the optimal offset too, you can make it the outer loop over a range of offsets. Mind you, this is a brute-force approach, but it should be reasonably efficient judging by the vectors in your graph.
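    Putting the steps above together, here is a runnable sketch of the grid search. The data are synthetic stand-ins (a Gaussian peak with a known "true" scaling of 3.85), since the real Bsim/Rsim and Bex/Rex come from your files; everything else follows the recipe above, with extrapolate(…, 0.0) guarding grid points that fall outside the scaled simulation range.

    ```julia
    using Interpolations

    # Synthetic stand-ins for the asker's data (assumption: one normalized peak)
    Bsim = collect(range(0, 10, length=120))            # coarse, unscaled simulation axis
    Rsim = exp.(-(Bsim .- 5.0).^2)
    true_B0 = 3.85
    Bex = collect(range(0, 10 * true_B0, length=4000))  # dense experimental axis
    Rex = exp.(-((Bex ./ true_B0) .- 5.0).^2)

    # Interpolate the experiment once on a fixed grid
    grid = Bex[1]:0.05:Bex[end]
    rex_itp = interpolate((Bex,), Rex, Gridded(Linear()))
    RexGrid = rex_itp.(grid)

    # Scan candidate multipliers; fill with 0 outside the scaled Bsim range
    # so the Gridded interpolant never errors on out-of-knot grid points
    function scan_multiplier(Bsim, Rsim, grid, RexGrid; candidates = 0.1:0.01:10.0)
        best_B0, best_err = NaN, Inf
        for B0 in candidates
            itp = extrapolate(interpolate((B0 .* Bsim,), Rsim, Gridded(Linear())), 0.0)
            err = sum(abs2, itp.(grid) .- RexGrid)   # square error for this multiplier
            if err < best_err
                best_B0, best_err = B0, err
            end
        end
        return best_B0, best_err
    end

    best_B0, best_err = scan_multiplier(Bsim, Rsim, grid, RexGrid)
    println("best multiplier = ", best_B0)
    ```

    The error is computed inside a function rather than a top-level loop to avoid Julia's soft-scope pitfalls when assigning to the running minimum.
    
    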