AnswerBun.com

Is it possible to create a Python script that looks for files in a directory at a given time every day?

Stack Overflow Asked by UndefinedKid01 on January 1, 2022

So basically, I’m creating a directory where users can drop CSV files. I want to create a Python script that looks in that folder every day at a given time (let’s say noon) and picks up the latest file placed there, provided it’s not more than a day old. But I’m not sure if that’s possible.

This is the chunk of code that I would like to run if the app finds a new file in the desired directory:

import logging

import pandas as pd
from tqdm import tqdm

logger = logging.getLogger(__name__)


def better_Match(results, best_percent = "Will need to get the match %"):
    # best_percent is still a placeholder; it must be a number before this can run
    result = {}
    result_list = [{item.name: item.text for item in match} for match in results]
    if result_list:
        score_list = [float(item['score']) for item in result_list]
        # index of the highest score
        match_index = max(enumerate(score_list), key=lambda x: x[1])[0]
        logger.debug('MRCs:{}, Chosen MRC:{}'.format(score_list, score_list[match_index]))
        logger.debug(result_list[match_index])
        above_threshold = float(result_list[match_index]['score']) >= float(best_percent)
        if above_threshold:
            result = result_list[match_index]
    return result


def clean_plate_code(platecode):
    # strip leading zeros, then pad/truncate to exactly 5 characters
    return str(platecode).lstrip('0').zfill(5)[:5]


def re_ch(file_path, orig_data, return_columns = ['ex_opbin']):
    # file_path is expected to be a pathlib.Path pointing at the folder of chunk files
    list_of_chunk_files = list(file_path.glob('*.csv'))
    cb_ch = [pd.read_csv(f, sep=None, dtype=object, engine='python') for f in tqdm(list_of_chunk_files, desc='Combining ch', unit='chunk')]
    cb_ch = pd.concat(cb_ch)
    shared_columns = [column_name.replace('req_', '') for column_name in cb_ch.columns if column_name.startswith('req_')]
    cb_ch.columns = cb_ch.columns.str.replace("req_", "")
    return_columns = return_columns + shared_columns
    # copy so the astype assignments below do not act on a slice of the original frame
    cb_ch = cb_ch[return_columns].copy()
    for column in shared_columns:
        cb_ch[column] = cb_ch[column].astype(str)
        orig_data[column] = orig_data[column].astype(str)
    final = orig_data.merge(cb_ch, how='left', on=shared_columns)
    return final

2 Answers

This will do the job!

import os
import threading
import pandas as pd

DIR_PATH = 'DIR_PATH_HERE'
# Snapshot of the file names already seen, stored in the current working directory
SNAPSHOT = 'files.csv'


def create_csv_file():
    # Create files.csv containing all the files currently in DIR_PATH.
    # This only runs once, when no snapshot exists yet.
    if not os.path.exists(SNAPSHOT):
        list_of_files = os.listdir(DIR_PATH)
        pd.DataFrame({'files': list_of_files}).to_csv(SNAPSHOT, index=False)


def check_for_new_files():
    create_csv_file()
    known_files = set(pd.read_csv(SNAPSHOT, dtype=str, keep_default_na=False)['files'])
    # Ignore the snapshot itself in case the script is run from inside DIR_PATH
    new_files = set(os.listdir(DIR_PATH)) - known_files - {SNAPSHOT}
    if new_files:
        print('New file(s) added:', sorted(new_files))
        # do what you want with the new files here,
        # e.g. save your Excel output as sample.xlsx

        # List the directory again so that any output written into DIR_PATH
        # (such as sample.xlsx) is not reported as new on the next run
        list_of_files = os.listdir(DIR_PATH)
        pd.DataFrame({'files': list_of_files}).to_csv(SNAPSHOT, index=False)
        print('Finished for the day!')


ticker = threading.Event()
# Run the check every 86400 seconds = 24h (the first check happens after the first wait)
while not ticker.wait(86400):
    check_for_new_files()

It uses threading.Event.wait to run the check every 86400 s (24 h). The names of the files already seen are saved in files.csv, in the directory the script is run from. Each day the script lists the directory again, compares the result with files.csv to detect new files, and then writes the updated list back to files.csv.
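The loop above fires 24 h after the script was started, not at a fixed clock time. If the check should happen at noon, as the question asks, one option is to compute the wait from the current time. The following is only a sketch: `seconds_until` is a hypothetical helper, not part of the answer, and it uses naive local time, so it ignores daylight-saving shifts.

```python
from datetime import datetime, time, timedelta


def seconds_until(target, now=None):
    """Return the seconds from `now` until the next `target` time of day."""
    now = now or datetime.now()
    next_run = datetime.combine(now.date(), target)
    if next_run <= now:
        # Today's slot has already passed, so schedule for tomorrow
        next_run += timedelta(days=1)
    return (next_run - now).total_seconds()


# Example: waiting from 09:30 until noon on the same day
print(seconds_until(time(12, 0), now=datetime(2022, 1, 1, 9, 30)))  # 9000.0
```

In the loop, `ticker.wait(seconds_until(time(12, 0)))` could then replace the fixed `ticker.wait(86400)`, so that each wait ends at the next noon.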

Answered by JaniniRami on January 1, 2022

To run the script at a certain time:

You can use cron on Linux. On Windows, you can use Task Scheduler.

Here is an example of getting the latest file in a directory:

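import os

output_folder = 'DIR_PATH_HERE'  # placeholder: folder to scan
files = os.listdir(output_folder)
files = [os.path.join(output_folder, file) for file in files]
files = [file for file in files if os.path.isfile(file)]
# getctime: creation time on Windows, last metadata change on Unix
latest_file = max(files, key=os.path.getctime)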
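The question also asks that the file be no more than a day old. A possible variant, sketched here under assumptions, restricts the search to `*.csv` files and rejects the newest one if it is older than a cut-off. `latest_recent_csv` and its parameters are hypothetical names, and it uses the modification time (`st_mtime`) rather than `getctime`, on the assumption that "placed in the folder" roughly matches the last write.

```python
import time
from pathlib import Path


def latest_recent_csv(folder, max_age_seconds=86400, now=None):
    """Return the newest *.csv in `folder` modified within `max_age_seconds`, else None."""
    now = time.time() if now is None else now
    candidates = [p for p in Path(folder).glob('*.csv') if p.is_file()]
    if not candidates:
        # Nothing to pick up (also avoids max() failing on an empty list)
        return None
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    # Reject the file if its last modification is older than the cut-off
    if now - newest.stat().st_mtime > max_age_seconds:
        return None
    return newest
```

A scheduled job (cron, Task Scheduler, or a loop) could call `latest_recent_csv(output_folder)` once a day and run the processing only when the result is not `None`.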

Answered by DD_N0p on January 1, 2022
