TransWikia.com

How can I extract a section of a pandas dataframe, as marked in the picture below?

Stack Overflow Asked by gsoft on November 12, 2021

Click here to open the marked image

I am trying to extract a section (matrix) of numbers from a pandas dataframe, as marked in the picture linked above.
Can anyone assist me? I want to perform analytics on that section (matrix) of a bigger dataframe. Thank you in advance!

2 Answers

Let sub_rows and sub_cols be the dimensions of the dataframe to be extracted:

import pandas as pd

# df is assumed to be the original (bigger) dataframe, already loaded

sub_rows = 10  # Number of rows to be extracted
sub_cols = 3   # Number of columns to be extracted

if sub_rows > len(df.index):
    print("The sub dataframe has more rows than the original dataframe")
elif sub_cols > len(df.columns):
    print("The sub dataframe has more columns than the original dataframe")
else:
    # The +1 lets the window reach the last valid start position
    for i in range(0, len(df.index) - sub_rows + 1):
        for j in range(0, len(df.columns) - sub_cols + 1):
            sub_df = df.iloc[i:i + sub_rows, j:j + sub_cols]  # Extracted dataframe
            # Put here the code you need for your analysis

Answered by David Felipe Medina Mayorga on November 12, 2021

You can use the .iloc[] indexer to select the rows and columns you want by integer position.

dataframe.iloc[5:15,6:15]

This selects rows 5-14 and columns 6-14, counting positions from zero; the end of each slice is excluded. I am not sure these numbers match the region marked in your picture, but I think this method is what you were looking for.
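To make the slice bounds concrete, here is a small sketch on a made-up frame. The name demo and its 20x20 size are assumptions for illustration only and have nothing to do with the data in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical 20x20 frame filled with 0..399, so each value reveals its position:
# value = row_position * 20 + column_position
demo = pd.DataFrame(np.arange(400).reshape(20, 20))

# Rows at positions 5..14 and columns at positions 6..14 (slice ends are excluded)
section = demo.iloc[5:15, 6:15]

print(section.shape)         # (10, 9)
print(section.iloc[0, 0])    # 106, i.e. row 5, column 6 of demo
print(section.iloc[-1, -1])  # 294, i.e. row 14, column 14 of demo
```

The result is itself a dataframe, so the usual methods such as section.mean() or section.to_numpy() can be applied to it directly.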

edit: changed .loc[] to .iloc[] because we're selecting by integer position rather than by label, and cleaned it up a bit
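The difference between the two indexers is easiest to see on a frame whose row labels are not integers. The following is a minimal sketch with a made-up frame; the names demo, by_position and by_label are assumptions used only for this illustration.

```python
import pandas as pd

# Hypothetical frame with string row labels
demo = pd.DataFrame(
    {"x": [10, 20, 30, 40], "y": [1, 2, 3, 4]},
    index=["a", "b", "c", "d"],
)

# .iloc selects by integer position; the end of the slice is excluded
by_position = demo.iloc[1:3, 0:1]

# .loc selects by label; the end of the slice is included
by_label = demo.loc["b":"c", "x":"x"]

print(by_position.equals(by_label))  # True, both hold rows b and c of column x
print(len(demo.loc["b":"d"]))        # 3, because label slices include the end label
print(len(demo.iloc[1:3]))           # 2, because position slices exclude the end
```

When the dataframe has the default integer index, the two can look interchangeable, but .loc[5:15] would then return one more row than .iloc[5:15].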

Here is code that moves a matrix-sized window over the whole dataframe:

# df = big data frame
shape = (10, 10)  # shape (rows, columns) of the matrix to be analyzed, here 10x10
step = 1  # step size: move the window one cell at a time
# or
# step = 10  # step size: move the window block by block
# Keep in mind that iterating block by block may leave some data out at the
# end of the rows and columns.
# If the matrix isn't square, you can set step = shape; just be sure to use
# step[0] in the row loop and step[1] in the column loop below.
n_rows, n_cols = df.shape  # number of rows and columns of the big dataframe
for row in range(0, n_rows - shape[0] + 1, step):  # every valid top row of the window
    for col in range(0, n_cols - shape[1] + 1, step):  # every valid left column of the window
        matrix = df.iloc[row:row + shape[0], col:col + shape[1]]  # slice out the window
        # analyze matrix here

This is basically the same as what @dafmedinama said. I just added more comments, simplified how the shape of the matrix is specified, and included a step variable in case you don't want to move the matrix one cell at a time.
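For a non-square matrix with a separate step per axis, the loop can be wrapped in a generator. This is only a sketch under assumptions: the function name iter_windows, the demo frame and the sum used as the "analysis" are all made up for illustration and are not part of either answer.

```python
import numpy as np
import pandas as pd

def iter_windows(df, shape, step):
    """Yield (row, col, window) for each window of `shape` found in `df`.

    `shape` and `step` are (rows, columns) tuples.
    """
    win_rows, win_cols = shape
    step_rows, step_cols = step
    n_rows, n_cols = df.shape
    for row in range(0, n_rows - win_rows + 1, step_rows):
        for col in range(0, n_cols - win_cols + 1, step_cols):
            yield row, col, df.iloc[row:row + win_rows, col:col + win_cols]

# Hypothetical 6x8 frame filled with 0..47
demo = pd.DataFrame(np.arange(48).reshape(6, 8))

# Move a 2x3 window block by block and sum each block
window_sums = {
    (row, col): int(window.to_numpy().sum())
    for row, col, window in iter_windows(demo, shape=(2, 3), step=(2, 3))
}

print(len(window_sums))     # 6 windows: top rows 0, 2, 4 and left columns 0, 3
print(window_sums[(0, 0)])  # 30 = 0 + 1 + 2 + 8 + 9 + 10
```

With 8 columns and a block width of 3, the last two columns never fall inside a window, which is the "data left out" effect mentioned in the comments above. Passing step=(1, 1) instead visits every position.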

Answered by George Sebastiaan van Heerden on November 12, 2021
