TransWikia.com

How to fix the index value that I use for X_test and y_test?

Data Science Asked by As if on July 16, 2021

I am writing code for support vector regression (SVR) and most of it works, but I am stuck on the indexing of X_test and y_test inside the for loop. The test row must be the row in the dataset that comes immediately after the rows used for X_train and y_train, so its index should be the last training index plus one.

For example:

  • In the first iteration (i.e. when i=0), we use the first 1000 rows for training and the next row (i.e. the 1001st row) for testing

  • In the second iteration (i.e. when i=1), we use the 2nd to 1001st rows for training and the next row (i.e. the 1002nd row) for testing

  • In the third iteration (i.e. when i=2), we use the 3rd to 1002nd rows for training and the next row (i.e. the 1003rd row) for testing, and so on.
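The windowing described in the list above can be checked without fitting any model. The following is a minimal sketch, assuming zero-based positions as used by `iloc`; the helper name `window_positions` is hypothetical and not part of the original code.

```python
def window_positions(i, window=1000):
    """Return (train_start, train_stop, test_pos) for iteration i.

    train_stop is exclusive, matching Python/iloc slicing, so the
    training rows are train_start .. train_stop - 1 and the test row
    is the one immediately after them.
    """
    train_start = i
    train_stop = i + window
    test_pos = train_stop
    return train_start, train_stop, test_pos


# Show the first three iterations from the example.
for i in range(3):
    start, stop, test = window_positions(i)
    print(f"i={i}: train rows {start}..{stop - 1}, test row {test}")
```

For i=0 this gives training positions 0 to 999 and test position 1000, which is the 1001st row when counting from one.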

My full code:

import pandas as pd
import numpy as np
from sklearn.svm import SVR

# Make fake dataset
dataset = pd.DataFrame(data= np.random.rand(2000,22))
dataset['age'] = np.random.randint(2, size=2000)

# Separate the target from the other features
target = dataset['age']
data = dataset.drop('age', axis = 1)

X_train, y_train = data.loc[:1000], target.loc[:1000]

X_test,  y_test  = data.loc[1001], target.loc[1001] 

X_test = np.array(X_test).reshape(1, -1)
print(X_test.shape)

SupportVectorRefModel = SVR()
SupportVectorRefModel.fit(X_train, y_train)

y_pred = SupportVectorRefModel.predict(X_test)
y_pred

y_pred_list = []
y_test_list = []


for i in range(1, 2000):

    X_train, y_train = dataset.iloc[i:1000+i], target.iloc[i:1000+i]
    X_test, y_test = dataset.iloc[i], target.iloc [i]


    X_test = np.array(X_test).reshape(1, -1)
    print(X_test.shape)

    SupportVectorRefModel = SVR()
    SupportVectorRefModel.fit(X_train, y_train)
    y_pred = SupportVectorRefModel.predict(X_test)

    y_pred_list.append(y_pred)

    y_test_list.append(y_test)

print(y_test_list, y_pred_list)

The line I want to update is this one:

 X_test, y_test = dataset.iloc[i], target.iloc [i]

How should I change the indexing on this line to meet the requirement above?

One Answer

I've changed the range values in the loop and indexing for training and test data. (Also, it seems you are using the dataset instead of data by mistake.)

The slice i:(1000 + i) selects a window of 1000 rows (the end of an iloc slice is exclusive), and both ends move forward by one on each iteration. The test row at position i + 1000 is the row immediately after that window.

for i in range(0, 1000):

    X_train, y_train = data.iloc[i:(1000 + i)], target.iloc[i:(1000 + i)]
    X_test, y_test = data.iloc[i + 1000], target.iloc[i + 1000]
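A fuller, runnable sketch of this sliding-window loop may help. It assumes each training window holds exactly `window` rows and that the test row is the one right after it; because an `iloc` slice excludes its end position, the slice `i:i + window` stops at row `i + window - 1`. The small sizes (`n_rows`, `n_features`, `window`) and the name `model` are illustrative choices so the example runs quickly, not values from the question.

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVR

# Small fake dataset so the sketch runs quickly (the question uses 2000 rows).
rng = np.random.default_rng(0)
n_rows, n_features, window = 60, 5, 50
dataset = pd.DataFrame(rng.random((n_rows, n_features)))
dataset["age"] = rng.integers(0, 2, size=n_rows)

# Separate the target from the other features.
target = dataset["age"]
data = dataset.drop("age", axis=1)

y_pred_list = []
y_test_list = []

# The last start position is n_rows - window - 1, so the test row stays in range.
for i in range(n_rows - window):
    X_train = data.iloc[i:i + window]      # rows i .. i + window - 1
    y_train = target.iloc[i:i + window]
    X_test = data.iloc[[i + window]]       # double brackets keep a 1-row DataFrame
    y_test = target.iloc[i + window]

    model = SVR()
    model.fit(X_train, y_train)
    y_pred_list.append(model.predict(X_test)[0])
    y_test_list.append(y_test)

print(len(y_test_list), len(y_pred_list))
```

Selecting the test row with `data.iloc[[i + window]]` returns a one-row DataFrame, so the separate `np.array(...).reshape(1, -1)` step is not needed in this variant. With the question's sizes, `n_rows = 2000` and `window = 1000` would give 1000 iterations.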

Answered by Suren on July 16, 2021
