Is it possible to create a Python script that looks for files in a directory at a given time daily?

Stack Overflow Asked by UndefinedKid01 on January 1, 2022

So basically, I’m creating a directory where users can drop CSV files. I want to write a Python script that looks in that folder every day at a given time (let’s say noon) and picks up the latest file placed there, provided it’s not more than a day old. But I’m not sure if that’s possible.

This is the chunk of code I would like to run if the app finds a new file in the desired directory:

def better_Match(results, best_percent = "Will need to get the match %"):
    result = {}
    result_list = [{ for item in result} for result in results]
    if result_list:
        score_list = [float(item['score']) for item in result_list]
        match_index = max(enumerate(score_list),key=lambda x: x[1])[0]
        logger.debug('MRCs:{}, Chosen MRC:{}'.format(score_list,score_list[match_index]))
        above_threshold = float(result_list[match_index]['score']) >= float(best_percent)
        if above_threshold:
            result = result_list[match_index]
    return result

def clean_plate_code(platecode):
    return str(platecode).lstrip('0').zfill(5)[:5]

def re_ch(file_path, orig_data, return_columns = ['ex_opbin']):
    list_of_chunk_files = list(file_path.glob('*.csv'))
    cb_ch = [pd.read_csv(f, sep=None, dtype=object, engine='python') for f in tqdm(list_of_chunk_files, desc='Combining ch', unit='chunk')]
    cb_ch = pd.concat(cb_ch)
    shared_columns = [column_name.replace('req_','') for column_name in cb_ch.columns if column_name.startswith('req_')]
    cb_ch.columns = cb_ch.columns.str.replace("req_", "")
    return_columns = return_columns + shared_columns
    cb_ch = cb_ch[return_columns]
    for column in shared_columns:
        cb_ch[column] = cb_ch[column].astype(str)
        orig_data[column] = orig_data[column].astype(str)
    final = orig_data.merge(cb_ch, how='left', on=shared_columns)
    return final

2 Answers

This will do the job!

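import os
import threading
import pandas as pd

# Placeholder: set this to the directory you want to watch.
DIR_PATH = 'path/to/watched/directory'


def create_csv_file():
    # Create files.csv, which lists all the files currently in DIR_PATH.
    # This only does any work the first time, when files.csv does not exist yet.
    if not os.path.exists('files.csv'):
        list_of_files = os.listdir(DIR_PATH)
        pd.DataFrame({'files': list_of_files}).to_csv('files.csv', index=False)


def check_for_new_files():
    create_csv_file()
    files = pd.read_csv('files.csv', dtype=str)
    list_of_files = os.listdir(DIR_PATH)
    # Names present in the directory but not yet recorded in files.csv
    new_files = sorted(set(list_of_files) - set(files.files))
    if new_files:
        print('New file added')
        # do what you want with new_files here
        # if you save an output file (e.g. sample.xlsx) into DIR_PATH,
        # add it to list_of_files so it is not reported as new on the next run

        # save the current list of files again
        pd.DataFrame({'files': list_of_files}).to_csv('files.csv', index=False)
    print('Finished for the day!')


ticker = threading.Event()
# Run the check every 86400 seconds = 24h.
# wait() returns False when it times out, so the loop body runs once per interval.
while not ticker.wait(86400):
    check_for_new_files()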

It uses a threading.Event as a timer to check for new files every 86400 s, which is 24 h. The current list of files is saved to files.csv in the directory the script is run from. Each day the script checks for files that are not yet listed in files.csv and appends them to it.
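
Note that a fixed 86400 s wait fires 24 hours after the script was started, not at a particular clock time. If the check should happen at noon, as the question asks, one option is to compute how long to wait until the next noon. This is a minimal sketch assuming naive local time (it does not account for daylight-saving shifts); seconds_until is a hypothetical helper and not part of the answer above:

```python
from datetime import datetime, timedelta


def seconds_until(hour, now=None):
    """Return the seconds from `now` until the next occurrence of hour:00 local time."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        # That time has already passed today, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

The result could then be used as the timeout, for example ticker.wait(seconds_until(12)) in place of ticker.wait(86400).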

Answered by JaniniRami on January 1, 2022

To run the script at a certain time:

You can use cron on Linux. On Windows you can use Task Scheduler.

Here is an example of getting the latest file in a directory:

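import os

# output_folder is the path of the directory to search
files = os.listdir(output_folder)
files = [os.path.join(output_folder, file) for file in files]
files = [file for file in files if os.path.isfile(file)]
# getctime is creation time on Windows, but metadata-change time on Unix
latest_file = max(files, key=os.path.getctime)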
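
To also cover the "not over a day old" part of the question, the same idea can be narrowed to CSV files that were modified within the last 24 hours. This is a sketch under assumptions: latest_recent_csv and max_age_seconds are hypothetical names, and it uses the modification time (os.path.getmtime) instead of getctime, which may or may not be what you want:

```python
import os
import time


def latest_recent_csv(folder, max_age_seconds=86400, now=None):
    """Return the newest .csv in `folder` modified within max_age_seconds, else None."""
    now = time.time() if now is None else now
    candidates = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path) and name.lower().endswith('.csv'):
            mtime = os.path.getmtime(path)
            if now - mtime <= max_age_seconds:
                candidates.append((mtime, path))
    if not candidates:
        return None
    # max() on (mtime, path) tuples picks the most recently modified file.
    return max(candidates)[1]
```

A script scheduled with cron or Task Scheduler could call this once at noon and only process the result when it is not None.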

Answered by DD_N0p on January 1, 2022
