Is it possible to create a Python script that looks for files in a directory at a given time daily?

Stack Overflow Asked by UndefinedKid01 on January 1, 2022

So basically, I'm creating a directory where users can put CSV files. I want to create a Python script that looks in that folder every day at a given time (let's say noon) and picks up the latest file placed there, provided it is not over a day old. I'm not sure whether that's possible.

This is the chunk of code that I would like to run if the app finds a new file in the desired directory:

def better_Match(results, best_percent = "Will need to get the match %"):
    result = {}
    result_list = [{ for item in result} for result in results]
    if result_list:
        score_list = [float(item['score']) for item in result_list]
        match_index = max(enumerate(score_list),key=lambda x: x[1])[0]
        logger.debug('MRCs:{}, Chosen MRC:{}'.format(score_list,score_list[match_index]))
        above_threshold = float(result_list[match_index]['score']) >= float(best_percent)
        if above_threshold:
            result = result_list[match_index]
    return result

def clean_plate_code(platecode):
    return str(platecode).lstrip('0').zfill(5)[:5]

def re_ch(file_path, orig_data, return_columns = ['ex_opbin']):
    list_of_chunk_files = list(file_path.glob('*.csv'))
    cb_ch = [pd.read_csv(f, sep=None, dtype=object, engine='python') for f in tqdm(list_of_chunk_files, desc='Combining ch', unit='chunk')]
    cb_ch = pd.concat(cb_ch)
    shared_columns = [column_name.replace('req_','') for column_name in cb_ch.columns if column_name.startswith('req_')]
    cb_ch.columns = cb_ch.columns.str.replace("req_", "")
    return_columns = return_columns + shared_columns
    cb_ch = cb_ch[return_columns]
    for column in shared_columns:
        cb_ch[column] = cb_ch[column].astype(str)
        orig_data[column] = orig_data[column].astype(str)
    final= orig_data.merge(cb_ch, how='left', on=shared_columns)
    return final

2 Answers

This will do the job!

import os
import threading
import pandas as pd

# Assumption: placeholder path, replace with the directory you want to watch
DIR_PATH = 'path/to/watched/folder'


def create_csv_file():
    # Create files.csv, which records all the files currently in DIR_PATH.
    # This only does anything on the first run, when files.csv is missing.
    if not os.path.exists('files.csv'):
        list_of_files = os.listdir(DIR_PATH)
        pd.DataFrame({'files': list_of_files}).to_csv('files.csv', index=False)


def check_for_new_files():
    files = pd.read_csv('files.csv')
    list_of_files = os.listdir(DIR_PATH)
    # Names present in the directory but not yet recorded in files.csv
    new_files = sorted(set(list_of_files) - set(files.files))
    if new_files:
        print('New file added')
        # do what you want with new_files here

        # save the current list of files again
        pd.DataFrame({'files': list_of_files}).to_csv('files.csv', index=False)
    print('Finished for the day!')


create_csv_file()
ticker = threading.Event()
# Run the check every 86400 seconds = 24h
while not ticker.wait(86400):
    check_for_new_files()

It uses the timeout of a threading.Event to check for new files every 86400 seconds (24 hours). The list of current files is saved to files.csv in the directory the .py file is run from; each day the script looks for files that are not yet listed in files.csv and adds them to it.
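
Note that waiting 86400 seconds repeats the check 24 hours after the script was started, not at a fixed clock time such as noon. One way to line the wait up with a clock time is to compute the seconds remaining until the next occurrence of that time. A minimal sketch, assuming a hypothetical helper named seconds_until (it is not part of the answer above) and naive local time, so daylight-saving shifts are ignored:

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute=0, now=None):
    """Return the seconds from `now` until the next occurrence of hour:minute."""
    if now is None:
        now = datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed (or is right now), so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

With this helper, ticker.wait(seconds_until(12)) could stand in for ticker.wait(86400) so that each pass wakes up at roughly noon.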

Answered by JaniniRami on January 1, 2022

To run a script at a certain time:

On Linux you can use cron; on Windows you can use Task Scheduler. For example, the cron schedule 0 12 * * * runs a command at noon every day.

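Here is an example of getting the latest file in a directory:

import os

# output_folder is the path of the directory to search
files = os.listdir(output_folder)
files = [os.path.join(output_folder, file) for file in files]
files = [file for file in files if os.path.isfile(file)]
# newest by ctime (creation time on Windows, metadata-change time on Unix)
latest_file = max(files, key=os.path.getctime)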
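
The question also asks that the file be picked up only if it is not over a day old. A minimal sketch of that check, assuming a hypothetical function name latest_recent_file and a max_age_seconds parameter chosen for illustration. It uses the modification time (os.path.getmtime) rather than getctime, on the assumption that "when the file was last written" is the timestamp that matters here:

```python
import os
import time


def latest_recent_file(folder, max_age_seconds=86400, now=None):
    """Return the newest regular file in `folder`, or None if it is too old."""
    if now is None:
        now = time.time()
    paths = [os.path.join(folder, name) for name in os.listdir(folder)]
    files = [path for path in paths if os.path.isfile(path)]
    if not files:
        return None  # empty folder: nothing to pick up
    latest = max(files, key=os.path.getmtime)
    if now - os.path.getmtime(latest) > max_age_seconds:
        return None  # the newest file is older than the allowed age
    return latest
```

A script scheduled with cron or Task Scheduler could call latest_recent_file(output_folder) once per run and do nothing when it returns None.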

Answered by DD_N0p on January 1, 2022
