TransWikia.com

C date information program

Code Review Asked on December 15, 2021

Inspired by a section from K&R C (exercises 5-8 and 5-9), I decided to write a program that would take a date numerous numeric forms, all based around DD/MM/YYYY, and from the given date determine some information.

Firstly, find out if the date is valid, i.e. a valid day of a valid month and from there figure out some other things. Is it a leap year? What is the day of the week? What is the day of the year? What is the name of the month? Once this is all calculated the program converts the date into a string and everything is printed out to the user.

I spent a lot of time re-factoring my code and trying to make as many optimisations as possible to code size, memory usage, and potential performance chokes, but I feel as if some of these may have been counterproductive and conflicted with each other and may have netted a decrease in performance. These include, but are not limited to, using the smallest possible variable type for a job (i.e. a short instead of an int) and using the ternary operator where ever possible. How can I improve this code to make it faster, more efficient, less redundant, and easier to read?

/* 
 * Filename:    date.c
 * Author:      Jess Turner
 * Date:        5-9-17
 * Licence:     GNU GPL V3
 *
 * Reads a date string in the form DD/MM/YYYY and determines if it is a valid date, the day of the year, if that year is a leap year, and converts the date to a string
 *
 * Options:
 *  N/A
 *
 * Return/exit codes:
 *  0   EXEC_SUCCESS    - Successful execution
 *  1   MEM_ERROR       - Memory allocation error
 *  2   READ_ERROR      - fgets() error
 *
 * Todo:
 *  N/A
 *
 * Possible additions:
 *  - Add option(s) for taking different date forms (i.e. -a switches to American date formats)
 *
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

#define EXEC_SUCCESS 0
#define MEM_ERROR 1
#define READ_ERROR 2

typedef struct {
    short day;
    short month;
    short year;
} date_t;

const char *months[] = {
            NULL ,
            "January",
            "February", 
            "March", 
            "April",
            "May",
            "June",
            "July",
            "August",
            "September",
            "October",
            "November",
            "December"
};

const char *days[] = {
            "Sunday",
            "Monday",
            "Tuesday",
            "Wednesday",
            "Thursday",
            "Friday",
            "Saturday",
};

const short days_per_month[2][13] = {
                {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
                {0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
};

char *date_to_string(char **string, date_t date);   /* Converts the date in decimal form into a string containing the day and name of the month */
short day_of_year(date_t date);                     /* Determines the day of the year from a given date */
short day_of_week(date_t date);                     /* Determines the numerical value of the day of the week (taking Sunday as the first day), only works on dates after 14/09/1752 due to the adoption of the Gregorian calendar in the UK */
bool is_valid_date(date_t date); 
bool is_leap_year(date_t date);

int main()
{
    date_t date;
    char *date_string;
    char input[128];
    char *format[] = { 
            "%d/%d/%d",
            "%d-%d-%d",
            "%d.%d.%d",
            "%d,%d,%d",
            "%d %d %d"
    };
    
    printf("Note: day of week only accurate after 14/09/1752n");
    
    for(;;) {
        printf("nEnter date: ");
        
        date = (const date_t){ 0 };
        
        if(!fgets(input, sizeof(input), stdin)) {
            fprintf(stderr, "nError: fgets() read faliure!n");
            return READ_ERROR;
        }
        
        for(int i = 0; i < 5 && sscanf(input, format[i], &date.day, &date.month, &date.year) != 3; i++)
            ;
        
        if(!is_valid_date(date)) {
            fprintf(stderr, "Invalid date!n");
            continue;
        }
        
        if(!date_to_string(&date_string, date)) {
            fprintf(stderr, "nError: Memory allocation faliure!n");
            return MEM_ERROR;
        }
        
        printf( "The %s is a %s and is day number %d of the year %d, which is %sa leap yearn",
                date_string,
                days[day_of_week(date)],
                day_of_year(date),
                date.year,
                is_leap_year(date) ? "" : "not "
        );
        
        free(date_string);
    }
    
    return EXEC_SUCCESS;
}

char *date_to_string(char **string, date_t date)
{
    char date_str[48] = { '' };
    char *suffix[] = { "st", "nd", "rd", "th" };
    short date_str_len;
        
    sprintf(date_str,
            "%d%s of %s %d",
            date.day,
            suffix[ (date.day == 1 || date.day == 21 || date.day == 31) ? 0 :
                    (date.day == 2 || date.day == 22) ? 1 :
                    (date.day == 3 || date.day == 23) ? 2 : 3],
            months[date.month],
            date.year
    );
    
    date_str_len = strlen(date_str);
    
    if(!(*string = calloc(date_str_len + 1, sizeof(char)))) { /* Remember kids: Always zero your character strings before you use them */
        fprintf(stderr, "Error: Cannot allocate memory!n");
        return NULL;
    }
    
    strncpy(*string, date_str, date_str_len);
    
    return *string;
}

short day_of_year(date_t date)
{
    short day;
    short month;
    short leap;
    
    leap = is_leap_year(date);
    
    for(day = 0, month = 1; month < date.month; month++)
        day += *(*(days_per_month + leap) + month);
    
    day += date.day;
    
    return day;
}

bool is_valid_date(date_t date)
{
    return ((date.month > 0 && date.month < 13) && ((date.day > 0 && date.day <= days_per_month[0][date.month]) || (date.day == 29 && date.month == 2 && is_leap_year(date))));
}

bool is_leap_year(date_t date)
{
    return ((date.year % 4 == 0 && date.year % 100 != 0) || date.year % 400 == 0);
}

short day_of_week(date_t date)
{
    short day = date.day;
    short month = date.month;
    short year = date.year;

    return (day += month < 3 ? year-- : year - 2, 23 * month / 9 + day + 4 + year / 4 - year / 100 + year / 400) % 7;  
}

3 Answers

In addition to the other reviews:

Bad practice

  • (major issue) Replacing if-else statements with the ternary operator does not lead to better performance! There is no relation between the number of source code lines and program performance. A single line of may result in hundreds of machine instructions, while a whole function may result in a single machine instruction. All that matters for performance is what the code is actually doing.

    So all you achieved by tossing in ?: all over the place is severely reduced readability, as been pointed out in other answers. This ?: readability problem is by far the most serious issue in your code - such "magic one-liners" would never pass any professional review.

    In general, stay away from the ternary operator - it has very limited uses and comes with the quirk of invoking implicit conversions on the operands.

    There are a few valid uses of ?:: conditional variable initialization, emulating function wrappers with function-like macros, making repetitive, simple code more compact etc. But all of these are specialized cases.

  • Never use the strncpy function, it is dangerous and should be avoided. See Why is strncpy insecure? and Why are strlcpy and strlcat considered insecure?. Instead, check the size of the data to copy (to avoid buffer overruns), followed by a call to strcpy(). Alternatively use memcpy() and manually terminate the string.

  • (major issue) Using the short type is almost never correct practice. The use of it in your code is fishy, I assume it is to save memory? Be aware that using small integer types opens up the can of worms that is implicit integer promotion. Therefore it is always best to avoid small integer types unless your system is very resource-restrained, such as a small, bare-metal microcontroller applications.

    If you wish to save memory, you should use a type such as int_least16_t instead. Most often it makes more sense to optimize for speed though, using int_fast16_t. (Though note that these could also be subject to implicit promotion!) In general, the types in stdint.h are to prefer over the native integer data types.

  • Mixing the ++ and -- operators with other operators in the same expression is bad practice, since this opens up the potential for all manner of poorly-defined behavior bugs. In addition, it tends to make the code less readable.

  • Keep tables of strings that aren't modified by the program const, for example const char *suffix = .... In general, whenever you have a pointer to a string literal (or an array of such pointers) it should always be const-qualified. This is to avoid undefined behavior in case the data is accidentally or intentionally modified.

    Look-up tables declared at file scope ("globals") should in addition to const be declared as static, so that you don't clutter up the global namespace.

Standard compliance

  • (minor issue) The form int main() is obsolete and may not work in future versions of the C language. Instead, use int main (void).

  • The return type from strlen() is size_t not short. Your compiler should have told you this. Make sure you have warnings enabled for such implicit type conversions (for example gcc -Wconversion), to avoid all manner of subtle bugs.

Program design

  • Avoid the use of continue. The presence of this keyword is almost always a certain sign of a badly written loop that should be rewritten in more readable ways. continue is also often abused for spaghetti programming.

    As already pointed out, your eternal for loop isn't the best idea. You should replace this loop and all the continue/returns, perhaps with something like this:

    func_status_t status = OK; // some custom result type, enum
    
    while(status == OK)
    {
      status = do_stuff(); // all the code goes into this function
    }
    
    cleanup(); // free memory etc
    
    if(status == SOMETHING)
    {
      return WHATEVER;
    }
    ...
    return 0;
    

Bugs

  • In case something goes wrong, you return from the program without calling free() and doing clean-up. Yes, the OS will do that in most cases, that's not the issue. You should still consider this as a bug, for one important reason:

    If your code contains any form of heap memory corruption bug or dangling pointers - fairly common bugs that could be caused by pretty much anything - your program will likely crash upon calling free(). This is a good thing, because it means you'll get a head's up of those potentially dormant bugs early on, preventing them from getting into the production code.

  • You should initialize date_string to NULL so that even if calloc fails, you can pass the pointer to free(). Passing NULL to free() is well-defined to do nothing.

Coding style

  • (minor issue) Any reason why your brace placement style is different for functions and statements? This is inconsistent. Pick one style and stick to it, don't mix two different styles.

  • (minor issue) Inconsistent indention. Whenever you write an initializer list, you use different indention. Instead, use your standard indention width for these as well - in your case 4 spaces:

    const char *months[] = {
        NULL,
        "January",
        "February",
        ...
    };
    

Answered by Lundin on December 15, 2021

Bug: Wrong specifiers. Temporarily change to below to see the mis-match. Formats should have been "%hd/%hd/%hd", not "%d/%d/%d"

sscanf(input, "%d-%d-%d", &date.day, &date.month, &date.year) != 3; i++)

This implies testing, was not done on a 32-bit platform or not done with this code.


How can I improve this code to make it faster, more efficient, less redundant, and easier to read?

First, make code easier to read and compilable using the strictness warnings. Any code improvements done by OP are linear. To improve performance, focus on larger issues.

The biggest performance issues are in sscanf(), sprintf(), calloc(), strncpy(), array initialization. Improving any of those will only yield modest linear improvements.


No need to perform type shortening - no space not code savings expected. Use the type of the functions return/use, size_t in this case. This avoids automated warnings about shortening/sign-ness change.

Recommend to use the sizeof of the de-referenced variable rather than the size of the type. Easier to code, review and maintain.

// short date_str_len;
size_t date_str_len;
...
date_str_len = strlen(date_str);
// if(!(*string = calloc(date_str_len + 1, sizeof(char)))) {
if(!(*string = calloc(date_str_len + 1, sizeof **string))) {

date_to_string() is not a static function. Suggest to either 1) make it static so it only receives qualified arguments or 2) make the function more tolerant of date values of unexpected values:

char *suffix[] = { "st", "nd", "rd", "th" };
#define SUFFIX_N (sizeof suffix/ sizeof suffix[0])
int ending = abs(date.day % 10);  // works for entire range of data.day
if (ending > SUFFIX_N) { 
  ending = SUFFIX_N;
}

sprintf(date_str,
        "%d%s of %s %d",
   ...
        suffix[ending],
   ...
);

strncpy() gains little here as the certainly of null character termination relies on the prior code in a not so-obvious-way.

strncpy(*string, date_str, date_str_len);
// Alternatives
strcpy(*string, date_str);
memcpy(*string, date_str, date_str_len + 1);

Or better yet, use the common strdup() for that is what is being done here. Not standard yet easier to code.

*string = strdup(date_str);
if (*string == NULL) {
    ...

The adage /* Remember kids: Always zero your character strings before you use them */ is cute but not universal. To zero a string, code only needs to str_var[0] = '';. This is a performance issue in the general case - here is not a major issue either way. Best to follow group's code guidelines on this matter.

If coding guidelines encourage initializing arrays, this is very usually accompanied with initializing all variables.

char date_str[48] = { '' };
// short date_str_len;
short date_str_len = 0;

is_valid_date() does not test the year.

Leap year calculation is incorrect for years prior to 1582. This code appears resilient for very distant years, yet even a test for that is good design practice.. Suggest year checks

#define YEAR_MIN 1583
#define YEAR_MAX INT_MAX
bool is_valid_date(date_t date)
{
    if (date.year < YEAR_MIN || data.year > YEAR_MAX) return false;
    return ((date.month > 0 ...
}

Source code does not present for review well with excessive long lines like

short day_of_week(date_t date);                     /* Determines the numerical value of the day of the week (taking Sunday as the first day), only works on dates after 14/09/1752 due to the adoption of the Gregorian calendar in the UK */

Versus

short day_of_week(date_t date);                     /* Determines the numerical value 
    of the day of the week (taking Sunday as the first day), only works on dates 
    after 14/09/1752 due to the adoption of the Gregorian calendar in the UK */

Source code should be tolerant of auto-formatting to various widths and not rely on manual editing to align.


Dubious use of short when int or bool would work. Rarely does short make for faster code nor smaller code. If memory space was an issue, unclear why code does not use bool or char. As codes tends to zero fill arrays, also better to initialize the variable rather than assign it later.

 // short leap;
 // leap = is_leap_year(date);
 int /* or bool */ leap = is_leap_year(date);

Unclear why code copies the variable into auxiliaries.

// short day = date.day;
// short month = date.month;
// short year = date.year;
// return (day += month < 3 ? year-- : year - 2, 23 * month / 9 + day + 4 + year / 4 - year / 100 + year / 400) % 7; 

Versus

return (date.day += date.month < 3 ? date.year-- : date.year - 2, 23
    * date.month / 9 + date.day + 4 + date.year / 4 - date.year / 100
    + date.year / 400) % 7;

 // or perhaps
date.day += date.month < 3 ? date.year-- : date.year - 2;
return (23 * date.month / 9 + date.day + 4 + date.year / 4 - date.year / 100
    + date.year / 400) % 7;

Answered by chux - Reinstate Monica on December 15, 2021

All in all, this code looks very good. I will touch on a few minor things that can be improved. This will probably be rather unordered as I'm repeatedly scrolling up and down, but I'll do my best to keep it sane. Enough.

Line length

I have to scroll horizontally in a 1920 pixel wide window (mainly due to the amount of empty space on the left and right, but that's not the point). Consider imposing a limit to line length on yourself, something around 80 or 100 characters. This has multiple benefits:

  • Wide enough for most statements
  • Makes you rethink very long lines
  • Source is able to be read without scrolling on (nearly) any display
  • It looks... clean shaven? I feel some very long lines sprinkled here and there make it look like the code is trying to grow a beard, but fails.

General program usage

This probably is very subjective, but since it's C code I deem it fitting: Instead of reading from stdin in an endless loop (that you cannot exit without Ctrl-C!), make it so the programs takes the date as a command line argument. It's more in the spirit of C to do something like

./date_information 6/1/2017
The 6th of January...

instead of

./date_information
Enter date:

At least that's my take on that.

Clever code

for(int i = 0; i < 5 && sscanf(input, format[i], &date.day, &date.month, &date.year) != 3; i++)
        ;

This line is clever, I admit. But I had to stop and think a moment about what's actually going on here. Why not do something like

for(int i = 0; i != 5; ++i) {
    if (sscanf(input, format[i], &date.day, &date.month, &date.year) == 3) {
        break;
    }
}

which does the same thing but shows what's going on intuitively? But props to you for including that semicolon on the otherwise empty line to show the empty loop body is intended! Not many people do that.

The same goes for

return (day += month < 3 ? year-- : year - 2, 23 * month / 9 + day + 4 + year / 4 - year / 100 + year / 400) % 7;

which is... I don't even know. It does so many things, there's even the good old comma operator there! Remember, just because it's written on one line does not mean it's faster. The compiler still produces three-address code.

(Ab)using the ternary operator falls into this category as well.

suffix[ (date.day == 1 || date.day == 21 || date.day == 31) ? 0     
        (date.day == 2 || date.day == 22) ? 1 :
        (date.day == 3 || date.day == 23) ? 2 : 3]

This is borderline readable. But it's mostly a formatting thing!

suffix[ (date.day == 1 || date.day == 21 || date.day == 31) ? 0
      : (date.day == 2 || date.day == 22)                   ? 1
      : (date.day == 3 || date.day == 23)                   ? 2 : 3]

This looks (a bit) cleaner. Haskell probably biased me towards that style of formatting, but I really like it. You can refactor that whole segment though, no shame in local variables.

Tetris-packing Types

short day = date.day;
short month = date.month;
short year = date.year;
...

Now, I could be picky and say both day and month don't need a short, but just a char - a single byte that is. Since their highest possible values are 31 and 12 respectively, you could even cram them into a single byte. Generally, you should prefer using int to any cramming. They will be handled most efficiently (since they're as large as a word) [citation needed]. Using three shorts in a function which is rather short itself (pun intended) is no big deal, yet it may affect performance. Or it may not. This is such a small scale, it's hard to notice. Better go with "default type for integers" instead of "use as little space as possible for these 4 instructions".

Declare it like it's 1980

You declare all your variables at the very top of main. You're doing K&R, so this is expected, but not needed anymore. Also, if the months and days are global, why not make the formats global as well?

printf, sprintf, fprintf

I see lots of usages for fprintf where you don't actually... format something. If you just want to print a string, use puts or fputs. Note though that puts automatically appends a newline!

fprintf(stderr, "nError: fgets() read faliure!n");

becomes

fputs("nError: fgets() read failure!", stderr);

Final Output

You have your final output format hardcoded in a printf

printf( "The %s is a %s and is day number %d of the year %d, which is %sa leap yearn",...

Say you want to go nuts and introduce internationalization. Or just want the user to be able to switch to a different format. Or... just realized you'd prefer "is the 3rd day of the year" instead of "day number 3 of the year". Instead of looking for line 115 (yes I looked that up) you could extract that format and make it global as well.

Conclusion

All of this was just pointing at minor blemishes and uneven parts. As a whole, your code is very nice. Your style is consistent, your names make sense... it handles even calloc errors!

Answered by Phil Kiener on December 15, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP