TransWikia.com

Wrangling dataframes with Dplyr

Stack Overflow Asked on January 28, 2021

I’m trying to find a neat Dplyr solution to convert this dataframe;

Rule <- c('Rule 1', 'Rule 1', 'Rule 1', 'Rule 1', 'Rule 2', 'Rule 2', 'Rule 2')
Condition <- c('1 of 4', '2 of 4', '3 of 4', '4 of 4', '1 of 3', '2 of 3', '3 of 3')
Clause <- c('Temperature > 60', 'Temperature < 90', 'Rain = 0', 'Wind < 20', 'Temperature > 55', 'Temperature < 85', 'Rain <= 2')
Lift <- c('1.30', '1.30', '1.30', '1.30', '1.60', '1.60', '1.60')
Coverage <- c('20%','20%','20%','20%','35%','35%','35%')
DF <- data.frame(Rule, Condition, Clause, Lift, Coverage)

Into this dataframe;

Rule <- c('Rule 1', 'Rule 1', 'Rule 1', 'Rule 1','', 'Rule 2', 'Rule 2', 'Rule 2')
Condition <- c('1 of 4', '2 of 4', '3 of 4', '4 of 4','', '1 of 3', '2 of 3', '3 of 3')
Clause <- c('Temperature > 60', 'Temperature < 90', 'Rain = 0', 'Wind < 20','', 'Temperature > 55', 'Temperature < 85', 'Rain <= 2')
Lift <- c('', '', '', '1.30', '','', '', '1.60')
Coverage <- c('','','','20%','','','','35%')
Result <- data.frame(Rule, Condition, Clause, Lift, Coverage)

Notice new blank rows which separates rules and repetitive Lift and Coverage metrics have been removed. Only retaining the Lift and Coverage from the final row of each rule.

One Answer

You can create a blank row to insert in every Rule :

empty_df <- data.frame(matrix('', nrow = 1, ncol = ncol(DF), 
                       dimnames = list(NULL, names(DF))))

Split the data for each unique Rule, replace the repeating values in Lift and Coverage column with blank add empty_df and combine the result.

library(dplyr)

DF %>%
  group_split(Rule) %>%
  purrr::map_df(~.x %>% 
        mutate(across(c(Lift, Coverage), 
              ~replace(., duplicated(., fromLast = TRUE), ''))) %>%
        bind_rows(empty_df)
        ) %>%
  #Remove the blank row from last `Rule`. 
  slice(-n())

#    Rule     Condition Clause             Lift   Coverage
#  <chr>    <chr>     <chr>              <chr>  <chr>   
#1 "Rule 1" "1 of 4"  "Temperature > 60" ""     ""      
#2 "Rule 1" "2 of 4"  "Temperature < 90" ""     ""      
#3 "Rule 1" "3 of 4"  "Rain = 0"         ""     ""      
#4 "Rule 1" "4 of 4"  "Wind < 20"        "1.30" "20%"   
#5 ""       ""        ""                 ""     ""      
#6 "Rule 2" "1 of 3"  "Temperature > 55" ""     ""      
#7 "Rule 2" "2 of 3"  "Temperature < 85" ""     ""      
#8 "Rule 2" "3 of 3"  "Rain <= 2"        "1.60" "35%"   

Correct answer by Ronak Shah on January 28, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP