
Split a string, tokenize substrings, and convert tokens to numeric vectors

Stack Overflow Asked by Mo.ms on December 20, 2021

I have a character string:

String <- "268.1,271.1,280.9,294.7,285.6,288.6,384.4\n124.8,124.2,116.2,117.7,118.3,122.0,168.3\n18,18,18,18,18,18,18"

I would like to split it into three substrings at each newline (\n).

I did that using the following code:

strsplit(String, "\n")

It resulted in three substrings.

  1. How can I get three separate substrings, so that I can use each one as a vector for calculations?

  2. How can I tokenize the substrings, to create vectors of numeric values?

4 Answers

We can use read.table to read String as a data frame, with a comma (,) as the separator; the columns are then read as numeric automatically.

read.table(text = String, sep = ",")

#     V1    V2    V3    V4    V5    V6    V7
#1 268.1 271.1 280.9 294.7 285.6 288.6 384.4
#2 124.8 124.2 116.2 117.7 118.3 122.0 168.3
#3  18.0  18.0  18.0  18.0  18.0  18.0  18.0

We can then use asplit to split the data by row:

asplit(read.table(text = String, sep = ","), 1)

#[[1]]
#   V1    V2    V3    V4    V5    V6    V7 
#268.1 271.1 280.9 294.7 285.6 288.6 384.4 

#[[2]]
#   V1    V2    V3    V4    V5    V6    V7 
#124.8 124.2 116.2 117.7 118.3 122.0 168.3 

#[[3]]
#V1 V2 V3 V4 V5 V6 V7 
#18 18 18 18 18 18 18 
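For the "use each vector for calculations" part of the question, the rows of the data frame can also be pulled out individually. The following is a minimal sketch, assuming String contains real newline characters (\n) between the three groups; the names first, second, row_diff and row_means are made up for this illustration.

```r
# Same data as in the question, with newline-separated rows
String <- "268.1,271.1,280.9,294.7,285.6,288.6,384.4\n124.8,124.2,116.2,117.7,118.3,122.0,168.3\n18,18,18,18,18,18,18"

df <- read.table(text = String, sep = ",")

# One plain numeric vector per row (column names dropped)
first  <- unlist(df[1, ], use.names = FALSE)
second <- unlist(df[2, ], use.names = FALSE)

# Element-wise arithmetic on two rows, and a mean per row
row_diff  <- first - second
row_means <- rowMeans(df)
```

Keeping the data frame around is convenient when most calculations are row-wise or column-wise; extracting vectors is mainly useful when a function expects a bare numeric vector.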

Answered by Ronak Shah on December 20, 2021

We can use scan. After splitting 'String' at \n, loop over the resulting substrings and scan each one to read it as a numeric vector

lapply(strsplit(String, "\n")[[1]], function(x) 
       scan(text = x, what = numeric(), sep=","))

Or using read.table (as was originally shown)

read.table(text = String, sep=",")

If the rows have unequal numbers of elements, use fill = TRUE

read.table(text = String, sep=",", fill = TRUE)
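To see what fill = TRUE does, here is a small sketch with a made-up ragged input (Ragged is a hypothetical string, not the data from the question). Short rows are padded with NA, which can be dropped again if one vector per row is wanted.

```r
# Hypothetical ragged input: rows have 3, 2, and 1 values
Ragged <- "1,2,3\n4,5\n6"

# fill = TRUE pads the shorter rows with NA instead of failing
ragged_df <- read.table(text = Ragged, sep = ",", fill = TRUE)

# Drop the padding NAs to get one numeric vector per row
ragged_list <- lapply(seq_len(nrow(ragged_df)), function(i) {
  row <- unlist(ragged_df[i, ], use.names = FALSE)
  row[!is.na(row)]
})
```

Note that this approach cannot tell padding apart from values that were genuinely missing in the input, so it is only a reasonable choice when the data contain no NA of their own.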

Original answer:

read.table(text = String, sep=",")
#    V1    V2    V3    V4    V5    V6    V7
#1 268.1 271.1 280.9 294.7 285.6 288.6 384.4
#2 124.8 124.2 116.2 117.7 118.3 122.0 168.3
#3  18.0  18.0  18.0  18.0  18.0  18.0  18.0

Answered by akrun on December 20, 2021

Here's an approach with base R. strsplit is a little tricky in that it returns a list and also does not work on a list.

  1. As you suggest in your question, use strsplit with split = "\n" to split the string into a list holding 3 strings.

  2. Use unlist to change that list into a vector of 3 character strings.

  3. Use strsplit again with split = "," to create a list of 3 character vectors.

  4. Use lapply to convert those character vectors into numeric vectors.

lapply(strsplit(unlist(strsplit(String, "\n")), ","), as.numeric)
[[1]]
[1] 268.1 271.1 280.9 294.7 285.6 288.6 384.4

[[2]]
[1] 124.8 124.2 116.2 117.7 118.3 122.0 168.3

[[3]]
[1] 18 18 18 18 18 18 18
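Once the list exists, each element is an ordinary numeric vector. A minimal sketch of how it might be used, assuming String contains real newlines; the labels a, b and c and the variables ratio and total are hypothetical names chosen only for this illustration:

```r
String <- "268.1,271.1,280.9,294.7,285.6,288.6,384.4\n124.8,124.2,116.2,117.7,118.3,122.0,168.3\n18,18,18,18,18,18,18"

vecs <- lapply(strsplit(unlist(strsplit(String, "\n")), ","), as.numeric)

# Hypothetical labels so the vectors can be referred to by name
names(vecs) <- c("a", "b", "c")

# Element-wise calculation between two of the vectors
ratio <- vecs$a / vecs$b

# One summary value per vector
total <- sapply(vecs, sum)
```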

Answered by Ian Campbell on December 20, 2021

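String <- "268.1,271.1,280.9,294.7,285.6,288.6,384.4\n124.8,124.2,116.2,117.7,118.3,122.0,168.3\n18,18,18,18,18,18,18"

# Split on newlines, then flatten the list into a character vector of lines
string_vector <- unlist(strsplit(String, "\n"))

# Split each line on commas, convert to numeric, and flatten into one vector
unlist(lapply(strsplit(string_vector, ","), as.numeric))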

Output

 [1] 268.1 271.1 280.9 294.7 285.6 288.6 384.4 124.8 124.2 116.2 117.7 118.3 122.0 168.3  18.0  18.0  18.0  18.0  18.0  18.0
[21]  18.0
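The flattened result no longer records where one line ends and the next begins. If the grouping is needed afterwards, one option is to reshape the flat vector into a matrix. This is a sketch that assumes every line holds the same number of values; flat and m are names introduced here for illustration.

```r
String <- "268.1,271.1,280.9,294.7,285.6,288.6,384.4\n124.8,124.2,116.2,117.7,118.3,122.0,168.3\n18,18,18,18,18,18,18"

string_vector <- unlist(strsplit(String, "\n"))
flat <- unlist(lapply(strsplit(string_vector, ","), as.numeric))

# One matrix row per input line; byrow = TRUE keeps the original order
m <- matrix(flat, nrow = length(string_vector), byrow = TRUE)

# The second line of the input as a numeric vector
m[2, ]
```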

Answered by mustafaakben on December 20, 2021
