Introduction

SinusoidalRegressions.jl aims to provide a set of functions for conveniently fitting noisy data to a variety of sinusoidal models, with or without initial estimates of the parameters.

Features

This package supports five sinusoidal models:

  • 3-parameter sinusoidal, where the frequency $f$ is known exactly:

    $s_s(x; \textrm{DC}, I, Q) = \textrm{DC} + Q\sin(2πfx) + I\cos(2πfx)$

  • 4-parameter sinusoidal, where the frequency $f$ is unknown:

    $s_s(x; f, \textrm{DC}, I, Q) = \textrm{DC} + Q\sin(2πfx) + I\cos(2πfx)$

  • Mixed linear-sinusoidal:

    $s_{mls}(x; f, \textrm{DC}, I, Q, m) = \textrm{DC} + mx + Q\sin(2πfx) + I\cos(2πfx)$

  • (Not yet implemented) General sinusoidal with unknown coefficients $λ_1, λ_2, \ldots, λ_m$, and known functions $g_1(x), g_2(x), \ldots, g_m(x)$:

    $s_g(x; f, I, Q, λ_n) = Q\sin(2πfx) + I\cos(2πfx) + λ_1g_1(x) + \ldots + λ_mg_m(x)$

  • (Not yet implemented) Damped sinusoidal:

    $s_d(x; f, I, Q, α) = \exp(αx) ( Q\sin(2πfx) + I\cos(2πfx) )$

A note on the frequency

The frequency $f$ is assumed to be expressed in hertz throughout.

Three types of sinusoidal fitting algorithms are provided:

  • IEEE 1057 3-parameter and 4-parameter algorithms. These are able to fit only the sinusoidal model $s_s(x)$. The 4-parameter version requires a very close estimate of the frequency $f$ and dense sampling of several periods of the sinusoid.

  • The algorithms proposed by J. Jacquelin, based on the idea of finding an integral equation whose solution is the desired sinusoidal model. This integral equation is linear and can be solved using linear least-squares. These algorithms have several benefits:

    • They are non-iterative.
    • They do not require initial estimates of any of the model's parameters.
    • They often perform quite well, but may also be used to calculate a good initial guess for the more powerful non-linear methods described below.
  • The non-linear fitting function curve_fit from the package LsqFit. This function uses the Levenberg-Marquardt algorithm and is quite powerful, but it requires an initial estimate of the parameters. This package "wraps" curve_fit, automatically defining the correct model and calculating initial parameter estimates (using IEEE 1057 or Jacquelin's algorithms, as appropriate) if none are provided by the user.

Installation

This package is in Julia's general registry. It can be installed by running

using Pkg; Pkg.add("SinusoidalRegressions)

Roadmap and Contributions

The following features are on the roadmap towards version 1.0:

  • Implement Jacquelin's algorithms for damped and general sinusoids.
  • Enhance the front-end to curve_fit by allowing some paramters to be declared as known a priori, and fitting only the remaining parameters.
  • Implement other algorithms and add support for other models, as suggested by the community.
  • Improve tests.

Contributions, feature requests, bug reports and suggestions are welcome; please use the issue tracker on github, or open a discussion on the Julia forums.

Tutorial

Fitting experimental data

Let us fit noisy data to a mixed linear-sinusoidal model using Jacquelin's integral equation algorithm. The model is

$s(x) = \textrm{DC} + mx + Q\sin(2πfx) + I\cos(2πfx)$

with parameters $f$ (the frequency is unknown), $\textrm{DC}$, $Q$, $I$ and $m$.

Types

This package includes several types used to store the parameters of the different models:

All of these types are subtypes of the abstract type SinusoidalFunctionParameters.

Instances of these types are also function types (functors), making it easy to evaluate the model at any point (or collection of points). A plot recipe is also provided. We will use both of these features below.

Since we're assuming a mixed linear-sinusoidal model, we'll use the type MixedLinearSinusoidP.

Generating the data

We will first generate the "true" or exact data, and then add noise. As a second step, we'll fit the noisy data and compare the fit to the exact model.

The exact parameters are $f = 4.1$, $\textrm{DC} = 0.2$, $m = 1$, $Q = -0.75$, and $I = 0.8$. We define the model as follows:

using SinusoidalRegressions
p_exact = MixedLinearSinusoidP(f = 4.1, DC = 0.2, m = 1, Q = -0.75, I = 0.8)
Mixed Linear-Sinusoidal parameters MixedLinearSinusoidP{Float64}:
  Frequency (Hz)       : 4.1
  DC                   : 0.2
  Sine amplitude (Q)   : -0.75
  Cosine amplitude (I) : 0.8
  Linear term (m)      : 1.0

We now genereate 40 "exact" data points with a sampling rate of 40 Hz.

x = range(0, length = 40, step = 1/40)
d_exact = p_exact(x)  # exact data

And finally, we add gaussian noise to the data:

using Random: seed!
seed!(6778899)

data = d_exact .+ 0.2*randn(40)  # noisy data

Fitting and error measurement

The vector data is assumed to contain the experimental or measured data, which we wish to fit to the mixed linear-sinusoidal model. We do so as follows:

fit = mixlinsinfit_j(x, data)
Mixed Linear-Sinusoidal parameters MixedLinearSinusoidP{Float64}:
  Frequency (Hz)       : 4.007299758712902
  DC                   : 0.272063407279927
  Sine amplitude (Q)   : -0.9174404283779977
  Cosine amplitude (I) : 0.579524744987893
  Linear term (m)      : 0.9597992006642564

We calculate the fit's root mean-square error and mean absolute error:

rmse(fit, p_exact, x)
0.1330860129428344
mae(fit, p_exact, x)
0.10876438830411586

Plotting

Plotting the measured data along with the fit:

using Plots
plot(x, data, fit)

Improving the fit with non-linear least squares

We may try to improve the fit by using LsqFit.curve_fit with the parameters estimated above. The function mixlinsinfit calls LsqFit.curve_fit behind the scenes, setting up the model and calculating the initial parameter estimates (using mixlinsinfit_j) if none are provided.

fit_nls = mixlinsinfit(x, data)
Mixed Linear-Sinusoidal parameters MixedLinearSinusoidP{Float64}:
  Frequency (Hz)       : 4.04178681088101
  DC                   : 0.2615232845594382
  Sine amplitude (Q)   : -0.8538423033723734
  Cosine amplitude (I) : 0.6713713094475814
  Linear term (m)      : 0.9669430564245449
rmse(fit_nls, p_exact, x)
0.08911610415562218
mae(fit_nls, p_exact, x)
0.07433163202382657

As expected, this second fit has a smaller error. Plotting the data, the fit, and the exact function for comparison:

plot(x, data, fit_nls, exact = p_exact)

A more difficult scenario

Let us fit the same model as above, with fewer data points, more noise and non-equidistant samples.

x = sort(rand(20))
d_exact = p_exact(x)
data = d_exact .+ 0.3*randn(20)
fit = mixlinsinfit_j(x, data)    # using Jacquelin's algorithm
Mixed Linear-Sinusoidal parameters MixedLinearSinusoidP{Float64}:
  Frequency (Hz)       : 3.6266065318254928
  DC                   : -0.26513310834427833
  Sine amplitude (Q)   : -1.0988190618588671
  Cosine amplitude (I) : -0.2782555281073247
  Linear term (m)      : 1.9355008080025093

Error for Jacquelin's algorithm:

rmse(fit, p_exact, x)
0.4011770903906381
mae(fit, p_exact, x)
0.3190577721521813

As expected, we see larger errors than in the more benign scenario above. Let us improve the fit using non-linear least-squares. Here, fit (calculated above) is explicitly given as an initial guess:

fit_nls = mixlinsinfit(x, data, fit)
Mixed Linear-Sinusoidal parameters MixedLinearSinusoidP{Float64}:
  Frequency (Hz)       : 4.17345824754076
  DC                   : 0.35433918472654413
  Sine amplitude (Q)   : -0.6412613122780881
  Cosine amplitude (I) : 1.279451454423565
  Linear term (m)      : 0.872752006714999

Error for LsqFit.curve_fit:

rmse(fit_nls, p_exact, x)
0.2315065775101001
mae(fit_nls, p_exact, x)
0.20986940773832075

We see that LsqFit.curve_fit was able to improve the fit significantly. Plotting the data, the fit and the exact function:

plot(x, data, fit_nls, exact = p_exact)

References

Jacquelin J., Régressions et Equations Intégrales, 2014, (online) https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales.

IEEE, "IEEE Standard for Digitizing Waveform Recorders", 2018.

Levenberg, K., "A Method for the Solution of Certain Non-Linear Problems in Least Squares", Quarterly of Applied Mathematics, 2 (2), 1944. doi:10.1090/qam/10666.

Marquardt, D., "An Algorithm for Least-Squares Estimation of Nonlinear Parameters", SIAM Journal on Applied Mathematics, 11 (2), 1944. doi:10.1137/0111030.

LsqFit.jl User Manual, (online) https://julianlsolvers.github.io/LsqFit.jl/latest/