TransWikia.com

I'm trying to set up a database, to run a small pig farm. What should my events model look like?

Software Engineering Asked by FeRcHo on November 8, 2021

I’m trying to set up a database, to run a small pig farm. The idea is the following:

I have a table called BELLIES, where I codify the sows along with relevant information about them (weight, id, birth date, purchase value, etc).

I want to manage on the basis of events the different processes involved, that is, the sows are inseminated (event), have deliveries (event), wean the piglets (event), etc .; I’m stuck in building the structure, because I do not know which is the most optimal way to relate the tables. I have tried to create an EVENTS table and to relate it N: N (many to many with the table BELLIES), through an intermediate table. But each event has its own information and different between them, so I think it necessary to create one table per event. In short, what is the best way to relate all tables?

3 Answers

I'm going to jump in here because I see some issues with the solutions offered thus far.

As far as approach...

I hate the terms "Database First" and "Code First" because they make it sound like you can only do one or the other. There is no rule that says you can't have a healthy database and a workable object model. That's the whole purpose of the Data Access Layer - to marry these two different worlds together. Design your object model to be robust and make sure you understand how data is going to flow through your database. With a little pre-planning, you really can do both. Anyone who tells you otherwise is way too invested in one of those schools of thought.

So about database...

  • First off... normalize... Normalize... NORMALIZE. Relational integrity beats out convenience every time. If you take anything away from this post, understand that as soon as you step away from normalized data design you are guaranteeing that data anomalies will enter your system. This has been mathematically proven by people significantly smarter than me: relational systems must be normalized to guarantee data integrity. Unless this is a school project that you immediately plan to throw away - short-cuts in design now will make your life miserable during the maintenance phase of your application. I was taught in school to target 3NF as a rule but I've seen people here swear to higher standards. Whatever you decide, just make sure you understand what you sacrifice if you decide to go off-script.
  • If you want your design to work, you need to really understand what data you are planning to collect. Before you ever plan on typing your first SQL script you should have your entire database plan mapped out. For this reason I highly recommend you read up about entity relationship modeling so that you can plan your design in advance. Very simply.. there is no one right way to create a database (other than keeping it normalized) and a single piece of data can radically change your design. Unlike front end design there is some benefit on planning ahead (as a redesign of the database is a heck of a lot more painful than a redesign of the front-end). For that reason, I recommend figuring out what you need now but also what you might need in the near future. Try to make sure that your current design doesn't railroad you later on. Here's a for instance around how the use of a piece of info can radically change your design:

    • You say you want to track Weaning pigs. So before you can determine how you should set this up you need to know more with what you plan to do with this information:

      • Do you just want to track maturity in the life of your bellies? Then this is an attribute and should be represented in a single maturity column of your bellies table. This attribute can hold plenty of statuses: (weaning, piglet, sexually mature, pregnant, nursing, elderly) and you can design business logic around each and use SQL to extract sets of your bellies as needed.
      • Are you trying to track detailed infant information around a belly? Perhaps this detail justifies it's own normalized entity called BabyBellies where you can track weaning information as well as many other important health details in the life of your pigs before the graduate to the bellies table.
      • Do you truly want to treat this as an event only between two separate bellies entries? Then you should read up on bridge or composite entities. Bridge entities are tables placed in between two Many to Many relationships in order to force two 1 to Many relationships. This is a fine tool that you should keep at your disposal when you design your database
  • Don't delete your data! This is a very important point as you work on your data design. One of the important tenets database management is that data stays reliable. That means no deleting data... only marking it as no longer relevant. This is especially important when working with events as all events end.. make use of date and time columns for this purpose.

  • Understand the value of derived attributes. Do you have a piece of information that can be calculated from two or more other pieces of stored data? Great! Unless this calculation is expensive, this doesn't belong on the database. This is a derived attribute and should be defined in your Data Access Layer. Understanding what you shouldn't store on the database is usually just as important as understanding what you should. Make sure to include these types of data elements in your overall plan.

  • Think in set theory not in application calls. One of the other answers mentioned concurrent users. If you have to worry about deadlocks when managing database calls then you have a critically serious design problem with your application. All modern Relational Database Management Systems handle locking at such an advanced level that you will rarely if ever need to worry about transaction management as long as you are thoughtful with your database calls. When you work at the application layer everything is about looping and threading and running multiple processes at once. Not so with the database!! Individual database calls are expensive but most database parsers can process millions of lines of data at a time. Make sure you are taking advantage of your ability to send large requests rather than try and piecemeal small ones. The database is like your harddrive. You want to limit how many times you have to call to it (as it's expensive processing-wise) but when you do need to make that call you want to pull as much info up to your RAM as possible. Take advantage of the tools that SQL gives you to efficiently call to the database and perform comparisons of data in bulk on the database as opposed to bringing it up to the application level.

Ultimately, no one can tell you how to design your database without knowing exactly how you plan to use that data. But I will mention a few ideas that popped into my head as I read your post:

  • Bellies represent individual pigs. Baby pigs are bellies so whenever one of your pigs gives birth it should create some circular relationship to bellies table. To do that in a normalized fashion you'll want a second table which defines relationships between separate records of bellies. This way you can pull in both tables into a SQL script and define pig ancestry.
  • Events typically are best stored as bridge or composite entities between two more significant entities. The parent/child relationships mentioned by Aby Sheffer may work in the short term but their is something about that design that makes me uneasy. If you plan to store additional information about these events, you may want to consider the broader implications of your entity relationships.
  • Insemination to me can have broad reaching consequences. You need to figure out exactly what you want to do with this. Are you tracking mating pairs of pigs? Are you checking health details of infertile pigs? Are you simply trying to ascertain which pigs may become pregnant? Figure out what you want to do with this information and what other information will also need to be collected before deciding on your design. The preferable design, like the Weaning attribute, will change depending on what you want to do with this info.

Answered by DanK on November 8, 2021

KISS - Keep it Simple, Stupid

If an event can only happen once, then it is often good just to model the event attributes as part of the main entity. So having birth_date as part of BELLIES is valid. Note that the Piglets may also be BELLIES so you might indicate parentage by an association back into BELLIES and INSEMINATIOn.

Maybe a state transition engine to capture the weaning/weaned status and date as well as other life-cycle events.

With luck you can pack everything into a BELLIES and an INSEMINATION entity/table.

Answered by Chris Reynolds on November 8, 2021

From everything in this thread, it sounds like you have the outline of your application and the beginning of your DB already set. With that in mind.

I agree that DB first is viable. I would add, what are your strengths Criss? If you are more experienced with development than database design, then it would be a good idea to at least flesh out your application. Classes, libraries, basic flows and handoffs. That way you can think of your database design more along the lines of "what does my app need to get/to send at this point." It may help you see what is best for your data and business processes.

To look at the question you asked:
You have a Bellies table with Bellies_ID, and the pertinent information about that particular sow. You could create that Events table, but as you noted you will end up with a huge table with the potential for a lot of Null values. One of my concerns is how many concurrent users do you have? Because having only 1 table for all events will create locks and potential conflicts if multiple writes are coming in.

So, that leads us to either a table per event, or is there a way to combine insemination + births? Something to think about. Let's assume 3 tables: insemination, births, weaning.

For each you have an FK linking back to your Bellies table. There is a ID each for insemination, births, weaning. What that is based on depends on what is in each table. For example, Insemination seems straightforward: Insemination_ID, FK Bellies_ID, date, sperm source....etc.

From my perspective you can use the Bellies FK to relate the tables, with each Bellies_ID relating to multiple records in the event tables. It may not be fully normalized, but it is easy to understand, create and maintain.
DB Tables

Answered by Aby Sheffer on November 8, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP