TransWikia.com

Finding meaningful boundaries between two continuous variables in R

Cross Validated Asked on January 10, 2021

To find the relationship between two columns of the iris dataset, I am performing kruskal.test and p.value shows a meaningful relationship between these two columns.

data(iris)
kruskal.test(iris$Petal.Length, iris$Sepal.Width)

Here are the results:

    Kruskal-Wallis rank sum test

data:  iris$Petal.Length and iris$Sepal.Width
Kruskal-Wallis chi-squared = 41.827, df = 22, p-value = 0.00656

The Scatter plot also shows some sort of relationship.

plot(iris$Petal.Length, iris$Petal.Width)

enter image description here

To find the meaningful boundaries of these two variables, I ran pairwise.wilcox.test test, but for this test to work, one of the variables needs to be categorical. If I pass both continuous variables to it, then the results are not as expected.

pairwise.wilcox.test(x = iris$Petal.Length, g = iris$Petal.Width, p.adjust.method = "BH")

As an output, I need a clear cut point where these two variables have some sort of relationship and where this relationship ends (As shown through the red line in the attached image above)

I am not sure if there is any statistical test or another programming technique to find these boundaries.

e.g. manually I can do something like this to mark boundaries –

setDT(iris)[, relationship := ifelse(Petal.Length > 3 & Sepal.Width < 3.5, 1, 0)]

But, is there a test to find such boundaries?

Thanks, Saurabh

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP