Problem:
I’m having trouble rearranging the following data frame:
set.seed(45)
dat1 <- data.frame(
  name = rep(c("firstName", "secondName"), each = 4),
  numbers = rep(1:4, 2),
  value = rnorm(8)
)
dat1
        name numbers      value
1  firstName       1  0.3407997
2  firstName       2 -0.7033403
3  firstName       3 -0.3795377
4  firstName       4 -0.7460474
5 secondName       1 -0.8981073
6 secondName       2 -0.3347941
7 secondName       3 -0.5013782
8 secondName       4 -0.1745357
I want to reshape it so that each unique “name” gets its own row, with the “value” entries as observations along that row and the “numbers” as column names. Sort of like this:
        name          1          2          3          4
1  firstName  0.3407997 -0.7033403 -0.3795377 -0.7460474
5 secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357
I’ve looked at melt and cast and a few other things, but none seem to do the job.
How to reshape data from long to wide format? Answer #1:
This can be done with a simple one-liner using the base reshape function:
reshape(dat1, idvar = "name", timevar = "numbers", direction = "wide")
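Note that reshape() names the new columns value.1, value.2, and so on, rather than the bare 1 to 4 shown in the question. A minimal sketch of stripping that prefix, assuming the dat1 from the question (the variable name wide is only illustrative):

```r
# Rebuild the example data from the question
set.seed(45)
dat1 <- data.frame(
  name = rep(c("firstName", "secondName"), each = 4),
  numbers = rep(1:4, 2),
  value = rnorm(8)
)

# Long -> wide with base R; new columns come back as value.1 ... value.4
wide <- reshape(dat1, idvar = "name", timevar = "numbers", direction = "wide")

# Drop the "value." prefix so the column names match the numbers
names(wide) <- sub("^value\\.", "", names(wide))
wide
```

The row names (1 and 5) are carried over from the first row of each group in dat1, which is why the desired output in the question shows them.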
How to reshape data from long to wide format? Answer #2:
You can do this with the base reshape() function, or with the melt() / cast() functions in the reshape package. For the second option, example code is:
library(reshape)
cast(dat1, name ~ numbers)
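Or, using reshape2 (naming value.var explicitly avoids the message dcast prints when it has to guess the value column):
library(reshape2)
dcast(dat1, name ~ numbers, value.var = "value")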
How to reshape data from long to wide format? Answer #3:
The devel version of tidyr (‘0.8.3.9000’) introduced pivot_wider and pivot_longer, which have been on CRAN since tidyr 1.0.0. They generalize reshaping (long -> wide and wide -> long, respectively) from one to multiple columns. Using the OP’s data:
-single column long -> wide
library(dplyr)
library(tidyr)
dat1 %>%
  pivot_wider(names_from = numbers, values_from = value)
# A tibble: 2 x 5
#  name          `1`    `2`    `3`    `4`
#  <fct>       <dbl>  <dbl>  <dbl>  <dbl>
#1 firstName   0.341 -0.703 -0.380 -0.746
#2 secondName -0.898 -0.335 -0.501 -0.175
-multiple columns long -> wide (a second column, value2, is created to show the functionality)
dat1 %>%
  mutate(value2 = value * 2) %>%
  pivot_wider(names_from = numbers, values_from = c("value", "value2"))
# A tibble: 2 x 9
#  name       value_1 value_2 value_3 value_4 value2_1 value2_2 value2_3 value2_4
#  <fct>        <dbl>   <dbl>   <dbl>   <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
#1 firstName    0.341  -0.703  -0.380  -0.746    0.682   -1.41    -0.759   -1.49
#2 secondName  -0.898  -0.335  -0.501  -0.175   -1.80    -0.670   -1.00    -0.349
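The answer mentions pivot_longer for the wide -> long direction but does not show it. A hedged sketch of the round trip, assuming tidyr 1.0.0 or later and the dat1 from the question (the names wide and long are only illustrative):

```r
library(tidyr)

# Rebuild the example data from the question
set.seed(45)
dat1 <- data.frame(
  name = rep(c("firstName", "secondName"), each = 4),
  numbers = rep(1:4, 2),
  value = rnorm(8)
)

# Long -> wide: one column per distinct value of `numbers`
wide <- pivot_wider(dat1, names_from = numbers, values_from = value)

# Wide -> long: gather every column except `name` back into two columns
long <- pivot_longer(
  wide,
  cols = -name,
  names_to = "numbers",
  values_to = "value"
)

# Column names come back as character, so restore the integer type
long$numbers <- as.integer(long$numbers)
long
```

This should give back the eight rows of dat1 in their original order, as a tibble rather than a plain data frame.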
How to reshape data from long to wide format? Answer #4-5:
#4: Using your example data frame, we could use xtabs:
xtabs(value ~ name + numbers, data = dat1)
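One caveat: xtabs returns a contingency table rather than a data frame, and it sums value within each name/numbers cell, so duplicated combinations are added together and missing ones appear as 0. If a data frame is needed, a minimal sketch (again assuming the dat1 from the question; tab and wide are illustrative names) could look like this:

```r
# Rebuild the example data from the question
set.seed(45)
dat1 <- data.frame(
  name = rep(c("firstName", "secondName"), each = 4),
  numbers = rep(1:4, 2),
  value = rnorm(8)
)

# Cross-tabulate: rows are names, columns are numbers, cells are summed values
tab <- xtabs(value ~ name + numbers, data = dat1)

# Convert the 2-d table to a data frame while keeping the wide layout;
# plain as.data.frame() would turn it back into long format
wide <- as.data.frame.matrix(tab)
wide
```

Here the names end up as row names instead of a name column, which is close to what the question describes.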
#5:
Two other options:
Base package (this assumes the rows of dat1 are grouped by name and every name has the same set of numbers):
df <- unstack(dat1, form = value ~ numbers)
rownames(df) <- unique(dat1$name)
df
sqldf package:
library(sqldf)
sqldf('SELECT name,
         MAX(CASE WHEN numbers = 1 THEN value ELSE NULL END) x1,
         MAX(CASE WHEN numbers = 2 THEN value ELSE NULL END) x2,
         MAX(CASE WHEN numbers = 3 THEN value ELSE NULL END) x3,
         MAX(CASE WHEN numbers = 4 THEN value ELSE NULL END) x4
       FROM dat1
       GROUP BY name')