---
title: Introduction to dspline
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Get started}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r, include=FALSE}
knitr::opts_knit$set(global.par = TRUE)
knitr::opts_chunk$set(dev = "png", dpi = 300)
```

```{r, include=FALSE}
par(mar = c(4.5, 4.5, 0.5, 0.5))
```

## Overview

This vignette provides a brief introduction to the `dspline` package. We'll do the following:

- compute discrete derivatives;
- smooth by regression onto the space of discrete splines;
- perform interpolation in the space of discrete splines;
- check that a discrete spline's discrete derivatives match its derivatives.

We'll load the package before diving into these sections.

```{r}
library(dspline)
```

## Computing discrete derivatives

Let's begin with a test function.

```{r}
expr = expression(sin(x * (1 + x) * 7 * pi / 2) + (1 - 4 * (x - 0.5)^2) * 4)
f = function(x) eval(expr)
curve(f, n = 1000)
```

As usual, we can compute its derivatives using `D()`. Below, we show how to compute its discrete derivatives, based on evaluating `f` at 100 design points `xd` and multiplying these evaluations by the discrete difference operator defined over the design points. This is done using the function [d_mat_mult()]. We'll demonstrate this with both first and second derivatives.

```{r}
n = 100
xd = 1:n / (n+1)
x = seq(0, 1, length = 1000)

# First derivatives
fd = D(expr, "x")
u1 = as.numeric(eval(fd))
v1 = d_mat_mult(f(xd), 1, xd)
plot(x, u1, type = "l", ylab = "First derivative")
points(xd[2:n], v1, col = 2, type = "o")
legend("bottomleft", lty = 1, pch = c(NA, 21), col = 1:2,
       legend = c("Exact", "Discrete"))
rug(xd)

# Second derivatives
fd2 = D(fd, "x")
u2 = as.numeric(eval(fd2))
v2 = d_mat_mult(f(xd), 2, xd)
plot(x, u2, type = "l", ylab = "Second derivative")
points(xd[3:n], v2, col = 2, type = "o")
legend("bottomleft", legend = c("Exact", "Discrete"),
       lty = 1, pch = c(NA, 21), col = 1:2)
rug(xd)
```

Note that the discrete derivatives come back as a vector of length `n-k`, with `n` being the number of design points and `k` being the derivative order. Each discrete derivative here is left-aligned, meaning that the `k`th discrete derivative at a given point `x` uses the evaluations of `f` at `x` and at the `k` design points to the left of `x`. Thus we associate the `n-k` discrete derivatives with the design points numbered `k+1` through `n`. (The design points `xd` here are taken to be evenly-spaced, but this is not fundamental, and the design points could instead be at arbitrary locations.)

As an alternative to [d_mat_mult()], we could form the difference operator (as a sparse matrix) using [d_mat()], and then do the matrix multiplication manually, but using [d_mat_mult()] is more efficient: it does not actually form any such matrix internally, and instead uses a specialized routine to carry out the discrete derivative computations.
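As a quick sanity check, we can confirm that the two approaches agree. (A small illustrative sketch: the argument order `d_mat(k, xd)` is assumed here by analogy with [d_mat_mult()], and is not taken from the text above; see the package documentation for the definitive signature.)

```{r}
# Form the second order difference operator explicitly as a sparse matrix,
# apply it to the evaluations of f, and compare against d_mat_mult().
# Note: the argument order d_mat(2, xd) is an assumption, mirroring
# d_mat_mult(f(xd), 2, xd)
D2 = d_mat(2, xd)
max(abs(as.numeric(D2 %*% f(xd)) - d_mat_mult(f(xd), 2, xd))) # Should be ~0
```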
## Discrete spline smoothing

Now we'll smooth some noisy data, by regressing it onto the space of discrete splines of cubic degree, with 9 knots that are roughly evenly-spaced among the design points. This is done using the function [dspline_solve()].

```{r}
y = f(xd) + rnorm(n, sd = 0.5)
knot_idx = 1:9 * 10
res = dspline_solve(y, 3, xd, knot_idx)
yhat = res$fit # Fitted values from the regression of y onto discrete splines
N = res$mat    # Discrete B-spline basis, for visualization purposes only!

plot(xd, y, xlab = "x")
points(xd, yhat, col = 2, pch = 19)
matplot(xd, N * 2, type = "l", lty = 1, add = TRUE)
rug(xd)
```

As we can see, the fitted values from the regression onto the space of discrete splines do a qualitatively reasonable job of smoothing. The basis functions used here (i.e., the covariates in the regression) are called discrete B-splines, and are the default choice (corresponding to `basis = "N"`) in [dspline_solve()]. These have compact support, just like the usual B-spline basis. While other bases are available, for regression problems in which the knot points are a sparse subset of the design points, the discrete B-spline basis is likely the best choice in terms of efficiency and numerical stability. The computational cost is linear in the number of knots. For more details, see Section 8.1 of [Tibshirani (2020)](https://www.stat.berkeley.edu/~ryantibs/papers/dspline.pdf).

### Relation to trend filtering

The smoothing above is done using ordinary least squares regression, with fixed knots and without regularization. A related and more advanced technique is trend filtering, which places a knot at each eligible design point (rather than fixing the knots ahead of time) and performs a regularized regression onto the space of discrete splines, by penalizing the $\ell_1$ norm of discrete derivatives across the design points. This has the advantage of being more locally adaptive: it allocates flexibility to the fitted function dynamically, at parts of the input space in which such flexibility is warranted, rather than letting this be dictated by a given fixed knot sequence. To learn more, see the [trendfilter](https://github.com/glmgen/trendfilter) package, [Tibshirani (2014)](https://www.stat.berkeley.edu/~ryantibs/papers/trendfilter.pdf), or Sections 1 and 2.5 of [Tibshirani (2020)](https://www.stat.berkeley.edu/~ryantibs/papers/dspline.pdf).

## Discrete spline interpolation

To go from a sequence of fitted values to a fitted function, we must interpolate in the space of cubic discrete splines. We'll do this using [dspline_interp()].

```{r}
x = seq(0, 1, length = 1000)
fhat = dspline_interp(yhat, 3, xd, x, implicit = FALSE)
plot(xd, y, xlab = "x")
lines(x, fhat, col = 2, lwd = 2)
matplot(xd, N * 2, type = "l", lty = 1, add = TRUE)
abline(v = xd[knot_idx], col = 8)
rug(xd)
```

The function plotted as a thick red line above is a bona fide cubic discrete spline: it is a piecewise cubic polynomial with knots at the specified points (marked as vertical gray lines), and its discrete derivatives of orders 1 and 2 are all continuous at the knots. This is visualized below.

```{r}
dd1 = d_mat_mult(yhat, 1, xd)
dd2 = d_mat_mult(yhat, 2, xd)
plot(xd[2:n], dd1, xlab = "x", ylab = "First discrete derivative")
abline(v = xd[knot_idx], col = 8)
rug(xd)
plot(xd[3:n], dd2, xlab = "x", ylab = "Second discrete derivative")
abline(v = xd[knot_idx], col = 8)
rug(xd)
```

## Matching derivatives

What about the discrete derivatives of order 3? Since `fhat` represents a cubic discrete spline (a piecewise cubic polynomial), these are, unsurprisingly, piecewise constant.

```{r}
inds = x > 0.05 # Exclude points near left boundary
x = x[inds]
fhat = fhat[inds]
dd3 = discrete_deriv(c(yhat, fhat), 3, xd, x)
plot(x, dd3, type = "l", ylab = "Third discrete derivative")
abline(v = xd[knot_idx], col = 8)
rug(xd[xd > min(x)])
```

Here we used [discrete_deriv()] to compute the discrete derivatives over a fine grid of locations `x` outside of the design points `xd`. (Note: as used earlier, the function [d_mat_mult()] is a convenient way to compute discrete derivatives at the design points `xd` themselves.)
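We can also back up the piecewise constant claim numerically: between adjacent knots, the third discrete derivative should be constant, up to numerical error. (A small illustrative check, not essential to the workflow: `findInterval()` is used to assign each grid location to its inter-knot segment.)

```{r}
# Assign each location in x to a segment between adjacent knots, then compute
# the within-segment range of dd3; each range should be ~0
seg = findInterval(x, xd[knot_idx])
sapply(split(dd3, seg), function(v) diff(range(v)))
```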
A special property of `k`th degree discrete splines is that their `k`th discrete derivatives match their `k`th derivatives, everywhere in the domain. To check this, we'll need to first compute an analytic representation for the cubic discrete spline interpolant underlying `fhat`, and then differentiate it using `D()`. For this analytic representation, it is most convenient to use the falling factorial (rather than discrete B-spline) basis.

*Warning: what happens below to compute and evaluate symbolic derivatives of the analytic representation gets a bit hairy! Importantly, you'll never have to do this in order to use the `dspline` package. It's only done for the purpose of verifying the claim that the discrete derivatives match the usual derivatives.*

```{r}
res = dspline_solve(y, 3, xd, knot_idx, basis = "H")
yhat = res$fit # Fitted values from the regression of y onto discrete splines
H = res$mat    # Falling factorial basis, for visualization purposes only!
sol = res$sol  # Falling factorial basis coefficients, for expansion later

# Sanity check: the fitted values from H (instead of N) should look as before
plot(xd, y, xlab = "x")
points(xd, yhat, col = 2, pch = 19)
matplot(xd, H * 80, type = "l", lty = 1, add = TRUE)
rug(xd)

# Now build the analytic expansion in terms of falling factorial basis
# functions. Unfortunately we need to separate out the terms involving
# inequalities, since D() can't properly (symbolically) differentiate through
# inequalities, so this ends up being more complicated than it should be ...
poly_terms = sprintf("%f", sol[1])
ineq_terms = "1"
for (j in 2:length(sol)) {
  if (j <= 4) {
    x_prod = paste(
      sprintf("(x - %f)", xd[1:(j-1)]),
      collapse = " * ")
    poly_terms = c(
      poly_terms,
      sprintf("%f * %s / factorial(%i)", sol[j], x_prod, j-1))
    ineq_terms = c(ineq_terms, "1")
  } else {
    x_prod = paste(
      sprintf("(x - %f)", xd[knot_idx[j-4] - 2:0]),
      collapse = " * ")
    poly_terms = c(
      poly_terms,
      sprintf("%f * %s / factorial(3)", sol[j], x_prod))
    ineq_terms = c(
      ineq_terms,
      sprintf("(x > %f)", xd[knot_idx[j-4]]))
  }
}

# Sanity check: the interpolant from this expression should look as before
combined_terms = paste(
  paste(poly_terms, ineq_terms, sep = " * "),
  collapse = " + ")
fhat = eval(str2expression(combined_terms))
plot(xd, y, xlab = "x")
lines(x, fhat, col = 2, lwd = 2)
matplot(xd, N * 2, type = "l", lty = 1, add = TRUE)
abline(v = xd[knot_idx], col = 8)
rug(xd)

# Higher-order symbolic derivative function (from the help file for D())
DD = function(expr, name, order = 1) {
  if (order == 1) D(expr, name)
  else DD(D(expr, name), name, order - 1)
}

# Finally, compute third derivatives of the falling factorial basis expansion
d3 = numeric(length(x))
for (i in 1:length(x)) {
  if (x[i] <= xd[knot_idx[1]]) {
    expr = str2expression(paste(poly_terms[1:4], collapse = " + "))
  } else {
    j = max(which(x[i] > xd[knot_idx])) + 4
    expr = str2expression(paste(poly_terms[1:j], collapse = " + "))
  }
  d3[i] = eval(DD(expr, "x", 3), list(x = x[i]))
}

plot(x, d3, type = "l", ylab = "Third derivative")
lines(x, dd3, col = 2, lty = 2, lwd = 2)
abline(v = xd[knot_idx], col = 8)
legend("bottomleft", lty = 1:2, col = 1:2, legend = c("Exact", "Discrete"))
rug(xd[xd > min(x)])
```
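If the claim holds, the two curves above should lie right on top of each other. To go beyond the visual comparison, one last small check (included here just for completeness) confirms the match numerically.

```{r}
# Maximum absolute difference between the exact and discrete third
# derivatives over the grid; should be ~0, up to numerical error
max(abs(d3 - dd3))
```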