TransWikia.com

Sending a CSV file for a client list from a database

Code Review Asked by escarta on October 27, 2021

What can I improve about this code in order to be production-ready? I’m not worried about security but about errors that could occur.

What exceptions should I catch? I feel overwhelmed about exceptions because I feel that there are a lot of exceptions that one can consider.

What else should I do? Should I do unit testing for this?

The script first loads the client list from an SQL database, then takes the daily work quantity about pallet movements from a SQL database, after that it writes a .csv file with this data, and finally sends it through email.

import pyodbc
import csv
import smtplib
import os.path
from datetime import date, timedelta
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication


def load_clients():
    client_list = {}
    connection_string = f'*'
    query = """
          SELECT clifor.a1_codice_clifor, anacf.ragione_soc_fiscale
            FROM anacf anacf, clifor clifor
           WHERE (clifor.a1_ditta_codice=5)
             AND (clifor.a1_tipo_cli1_for2=1)
             AND
              (anacf.a1_codice_anagrafica_generale=clifor.a2_anagrafica_codice)
    """
    cnxn = pyodbc.connect(connection_string, autocommit=True)
    with cnxn:
        cursor = cnxn.cursor()
        cursor.execute(query)
        while True:
            row = cursor.fetchone()
            if not row:
                break
            client_list[row[0]] = row[1]
    return client_list


def load_movements(clients):
    movement_list = []
    connection_string = f'*'
    query = """
          SELECT datadoc, cliente,
            SUM (@DECODE(tipodoc, 3 ,palletts)) AS inputs,
            SUM (@DECODE(tipodoc, 5 ,palletts)) AS outputs
              FROM docmagat
           WHERE ditta=5
             AND anno=2020
             AND tipodoc in (3,5)
             AND datadoc=SYSDATE-1
           GROUP BY 1,2
    """
    cnxn = pyodbc.connect(connection_string)
    with cnxn:
        cursor = cnxn.cursor()
        cursor.execute(query)
        while True:
            row = cursor.fetchone()
            if not row:
                break
            inputs = 0 if row[2] is None else row[2]
            outputs = 0 if row[3] is None else row[3]
            movement_list.append(tuple(
                (clients[row[1]], inputs, outputs, inputs
                 + outputs)))
    return movement_list


def load_file(movements, yesterday, folder):
    filename = 'pallet_movements ' + yesterday + '.csv'
    full_path = os.path.join(folder, filename)
    with open(full_path, mode='w', newline='') as pallet_movements:
        movements_writer = csv.writer(pallet_movements,
                                           delimiter=';')
        movements_writer.writerow(['Day: ' + yesterday])
        movements_writer.writerow(['Company', 'inputs', 'outputs',
                                       'Total'])
        for movement in movements:
            movements_writer.writerow(movement)
    return filename


def send_email(filename, yesterday, folder):
    sender = "*"
    destination = "*"
    msg = MIMEMultipart()
    msg['Subject'] = 'Pallet movements ' + yesterday
    msg['From'] = sender
    msg['To'] = destination
    message_text = 'Good morning,nnYou can find Pallet movements from
                      day ' + yesterday + ' attached.nnGoodbye'
    msg.attach(MIMEText(message_text))
    full_path = os.path.join(folder, filename)
    attachment = MIMEApplication(open(full_path, 'rb').read())
    attachment.add_header('Content-Disposition', 'attachment',
                        filename=filename)
    msg.attach(attachment)
    try:
        with smtplib.SMTP('*', 587) as smtpObj:
            smtpObj.ehlo()
            smtpObj.starttls()
            smtpObj.login("*", "*")
            smtpObj.sendmail(sender, destination, msg.as_string())
    except Exception as e:
        print(e)


def main():
    yesterday = date.today() - timedelta(days=1)
    yesterday = yesterday.strftime(f'%d-%m-%Y')
    local_folder = os.path.dirname(os.path.abspath(__file__))
    clients = load_clients()
    movements = load_movements(clients)
    filename = load_file(movements, yesterday, local_folder)
    send_email(filename, yesterday, local_folder)


if __name__ == "__main__":
    main()

One Answer

Answers to your questions

What can I improve about this code in order to be production-ready? I'm not worried about security but about errors that could occur.

So make a list of errors that you expect could occur, and think about what should happen in each case. For example, just looking at main(), I can think of:

  • Loading clients fails
  • Loading movements fails
  • Creating the movement report file fails
  • Sending the email fails

Is it important to know why something fails exactly? If so make it more specific. For example:

  • Loading client fails:
    • Could not connect to the database
    • Could not run the query

What should happen if any of those errors occur in production? If you know the network is flaky, you could for example retry connecting to the database a few times before giving up. But if the query fails there is most likely not much you can do.

What exceptions should I catch? I feel overwhelmed about exceptions because I feel that there are a lot of exceptions that one can consider.

There are many places where exceptions can be thrown, and besides exceptions there are also function that do not throw an exception but just return a value that indicates that an error occurred. The advantage of exceptions is that if you don't catch them, the program will abort instead of doing something unexpected.

As the name implies, exceptions are normally used to signal that something unexpected has happened. If the system runs out of memory, or it fails to open a file that you expect to exist, there is most likely not much you can do at that point, and letting the program abort is the right thing to do. However, there are situations where you can do better than that. It is up to you to decide what exceptions would be beneficial to catch.

Another nice thing about exceptions is that you do not have to catch them in the same function as the exception is generated. So you don't necessarily have to add exception handling all over the place, but you can defer it to a function higher up the call stack.

The code you showed looks like it does a one-off operation, loading some data, and generating a single email based on the data. You could decide to just not handle exceptions at all, and then if any error occurred, the email would just not be sent. But you could do a bit better than that: handle exceptions from all database access and file loading, and in the exception handler send an email notifying the recipient that an error occurred and that there is no list of pallet movements available today due to an error. Alternatively, you could send a notification to a technician so they can investigate why the code failed. Handling this would look like so:

def main():
    try:
        yesterday = date.today() - timedelta(days=1)
        yesterday = yesterday.strftime(f'%d-%m-%Y')
        local_folder = os.path.dirname(os.path.abspath(__file__))
        clients = load_clients()
        movements = load_movements(clients)
        filename = load_file(movements, yesterday, local_folder)
    except:
        send_error_email()
        raise # Causes the program to abort anyway

    send_email(filename, yesterday, local_folder)

I think that's more useful than just printing the exception and exitting the program normally. For example, if your program is run as a cron job, then handling the exception in send_email() like you did would just hide errors.

What else should I do? Should I do unit testing for this?

Yes, you should test your code, regardless of whether you handle exceptions or not. Your test cases should test whether the code performs as required in all the scenarios you expect can happen. It's perfectly fine to not handle exceptions if the requirement is that your code just aborts in case of any error.

Give load_file() a better name

The function load_file() does not load a file like its name implies, but rather creates a new file. The name create_file() would already be better. But it is still quite generic. What kind of file does it generate? Maybe create_movement_report() would be even better.

Answered by G. Sliepen on October 27, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP