I have four "principal" spectra that I want to find coefficients/scalars for to best fit my data. The goal is to know how much of principal x is in the data. I am trying to get the "percent composition" of each principal spectrum to the overall spectrum (I.e. 50% a1, 25% a2, 20% a3, 5% a4.)
#spec = spectrum, a1,a2,a3,a4 = principal components these are all nx1 dimensional arrays
c = 0 #some scalar
d = 0 #some scalar
e = 0 #some scalar
g = 0 #some scalar
def f(spec, c, d, e, g):
y = spec - (a1.multiply(c) - a2.multiply(d) - a3.multiply(e)- a4.multiply(g))
return np.dot(y, y)
res = optimize.minimize(f, spec, args=(c,d,e,g), method='COBYLA', options={'rhobeg': 1.0, 'maxiter': 1000, 'disp': False, 'catol': 0.0002}) #z[0], z[1], z[2], z[3]
best = res['x']
The issue I'm having is that it doesn't seem to give me the scalar values (c,d,e,g) but instead another nx1 dimensional array. Any help greatly appreciated. Also open to other minimize/fit techniques.
After some work, I found two methods that give similar results for this problem.
mport numpy as np
import pandas as pd
import csv
import os
from scipy import optimize
path = '[insert path]'
os.chdir(path)
data = 'data.csv' #original spectra
factors = 'factors.csv' #factor spectra
nfn = 'weights.csv' #new filename
df_data = pd.read_csv(data, header = 0) # read in the spectrum file
df_factors = pd.read_csv(factors, header = 0)
# this array a is going to be our factors
a = df_factors[['0','1','2','3']
Need to seperate the factor spectra from the original data frame.
a1 = pd.Series(a['0'])
a2 = pd.Series(a['1'])
a3 = pd.Series(a['2'])
a4 = pd.Series(a['3'])
b = df_data[['0.75M']] # original spectrum!
b = pd.Series(b['0.75M']) # needs to be in a series
x0 is my initial guess for my coefficient
x0 = np.array([0., 0., 0.,0.])
def f(c):
return b -((c[0]*a1)+(c[1]*a2)+(c[2]*a3)+(c[3]*a4))
using least squares from Scipy optimize least squares and then later with minimize from the same package, both work, minimize is slightly better IMO.
res = optimize.least_squares(f, x0, bounds = (0, np.inf))
xbest = res.x
x0 = np.array([0., 0., 0., 0.])
def f(c):
y = b -((c[0]*a1)+(c[1]*a2)+(c[2]*a3)+(c[3]*a4))
return np.dot(y,y)
res = optimize.minimize(f, x0, bounds = ((0,np.inf),(0,np.inf),(0,np.inf),(0,np.inf)))