Sometimes one needs to create new variables from the “previous” (lag()) or “next” (lead()) values in a vector (e.g. a data frame column). A real-world example is first-order differencing in time-series analysis, where a new series is formed by subtracting \(x_{t-1}\) from \(x_{t}\): \[\nabla x_t=x_{t}-x_{t-1}\]
The following R snippet is a simple demo of the lag() and lead() functions.
Just a tibble containing columns defined with lag() & lead()
Let's create a tibble with the tibble() function:
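library(dplyr)  # provides tibble(), lag(), lead() and %>%

df <- tibble(
  x = c(28, 25, 24, 17, 10),
  lagx = lag(x),        # previous value, x_{t-1}
  leadx = lead(x),      # next value, x_{t+1}
  diff = x - lead(x))   # current value minus next value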
The result is:
# A tibble: 5 x 4
      x  lagx leadx  diff
  <dbl> <dbl> <dbl> <dbl>
1    28    NA    25     3
2    25    28    24     1
3    24    25    17     7
4    17    24    10     7
5    10    17    NA    NA
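Note that diff is \(x_t - x_{t+1}\), the current value minus the next one. This is the negative of the next first-order difference (\(-\nabla x_{t+1}\)), not \(\nabla x_t\) itself. The first-order difference \(\nabla x_t\) as defined above corresponds to x - lag(x).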
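As a rough sketch of the differencing formula \(\nabla x_t = x_t - x_{t-1}\) from the introduction: the previous value \(x_{t-1}\) corresponds to lag(x), so the difference can be written as x - lag(x). The snippet assumes the dplyr package is installed. The names df_nabla, nabla and base_diff are made up for illustration. It also compares the result with base R's diff(), which returns the same differences without the leading NA.

```r
library(dplyr)

x <- c(28, 25, 24, 17, 10)

# First-order (backward) difference: x_t - x_{t-1}.
# lag(x) shifts the values one position down, so the first row has no predecessor (NA).
df_nabla <- tibble(x = x) %>%
  mutate(nabla = x - lag(x))   # `nabla` is an illustrative column name

# base::diff() computes the same differences but drops the undefined first element
base_diff <- diff(x)

print(df_nabla)
print(base_diff)
```

Here nabla is NA for the first row and then -3, -1, -7, -7, which matches the output of diff(x).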
The tidyverse way, with the pipe (%>%) and mutate()
df %>% mutate(diff2 = x - lead(x))
diff2 is the same as diff:
# A tibble: 5 x 5
      x  lagx leadx  diff diff2
  <dbl> <dbl> <dbl> <dbl> <dbl>
1    28    NA    25     3     3
2    25    28    24     1     1
3    24    25    17     7     7
4    17    24    10     7     7
5    10    17    NA    NA    NA
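The agreement between the two columns can also be checked programmatically rather than by eye. The following is a minimal sketch, assuming dplyr is installed. It rebuilds a reduced version of the tibble (only x and diff) so that it runs on its own, and df2 and same are illustrative names.

```r
library(dplyr)

# Reduced version of the tibble above: only the columns needed for the check
df <- tibble(
  x = c(28, 25, 24, 17, 10),
  diff = x - lead(x))

# Same computation done with the pipe and mutate()
df2 <- df %>% mutate(diff2 = x - lead(x))

# identical() treats the trailing NA in both columns as equal
same <- identical(df2$diff, df2$diff2)
print(same)
```

Using identical() instead of == avoids the NA that an element-wise comparison would return for the last row.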