Sometimes one needs to create new variables from the “previous” (lag()) or “next” (lead()) values in a vector (e.g. a data frame column). A real-world example is first-order differencing in time-series analysis, where a new series is formed by subtracting \(x_{t-1}\) from \(x_{t}\): \[\nabla x_t=x_{t}-x_{t-1}\]

The following R snippets are a simple demo of dplyr's lag() and lead() functions.

Just a tibble containing columns defined with lag() & lead()

Let's create a tibble with the tibble() function:

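library(dplyr)  # provides tibble(), lag(), lead(), mutate() and %>%

df <- tibble(
   x = c(28, 25, 24, 17, 10),
   lagx = lag(x),         # previous value; NA in the first row
   leadx = lead(x),       # next value; NA in the last row
   diff = x - lead(x))    # current value minus the next value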

The result is:

# A tibble: 5 x 4
      x  lagx leadx  diff
  <dbl> <dbl> <dbl> <dbl>
1    28    NA    25     3
2    25    28    24     1
3    24    25    17     7
4    17    24    10     7
5    10    17    NA    NA

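Note that diff is \(x_t - x_{t+1}\), the current value minus the next one, so it is not the first-order difference \(\nabla x_t\) defined above: diff in row \(t\) equals \(-\nabla x_{t+1}\). To get \(\nabla x_t\) itself, use x - lag(x).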
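
For reference, the difference \(\nabla x_t=x_{t}-x_{t-1}\) from the introduction can be computed with lag(). The following is a minimal sketch, assuming dplyr is installed; the variable name nabla is only illustrative.

```r
library(dplyr)

x <- c(28, 25, 24, 17, 10)

# First-order difference x_t - x_{t-1};
# the first element has no predecessor, so it is NA
nabla <- x - lag(x)
nabla
# NA -3 -1 -7 -7

# Base R's diff() gives the same values without the leading NA
all.equal(nabla[-1], diff(x))
# TRUE
```

The leading NA keeps nabla the same length as x, which is convenient when the result is stored as a column next to x.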

The tidyverse way, with the pipe (%>%) and mutate()

df %>% mutate(diff2 = x - lead(x))

diff2 is the same as diff:

# A tibble: 5 x 5
      x  lagx leadx  diff diff2
  <dbl> <dbl> <dbl> <dbl> <dbl>
1    28    NA    25     3     3
2    25    28    24     1     1
3    24    25    17     7     7
4    17    24    10     7     7
5    10    17    NA    NA    NA
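
Both functions also take an n argument (how many positions to shift) and a default argument (the value used to pad the ends instead of NA). A short sketch, assuming dplyr is installed and reusing the same x values as above:

```r
library(dplyr)

x <- c(28, 25, 24, 17, 10)

# Shift by two positions instead of one
lag2 <- lag(x, n = 2)
lag2
# NA NA 28 25 24

# Pad the end with 0 instead of NA
lead0 <- lead(x, default = 0)
lead0
# 25 24 17 10 0
```

Padding with a default is mostly useful when a later calculation cannot handle NA; otherwise NA is usually the more honest marker for a missing neighbour.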