TransWikia.com

Mysql : what are the steps for creating a database schema for a big project with many tables?

Database Administrators Asked on November 25, 2021

I am creating a project with lots of tables like one for taxes, one for employees, customers etc. So how should I go about it?
For eg: building the ER model first or something like that

One Answer

These are the steps I take:

  1. Start with a piece of paper, write down the entities and their relations to each other. Write the relations in plain language, e.g. Customer places zero to many Orders.
  2. Once you have an idea of how many entities you will have and their rough relations, formalize that drawing into a hierarchy diagram - this is usually just the entities and their relations to each other, not any column or key information.
  3. Starting at the top of your hierarchy, identify the properties of each entity and identify potential primary keys (candidate keys). There may be more than one per entity. Do not create/assign "id" or other columns at this point unless they are being created by the calling application (InvoiceNumber,etc) and/or generally accepted practice for things like Person,Address. "Codes" provided by outside authorities (ISO, others) are not only acceptable but preferred - CountryCd,CurrencyCd, etc.
  4. Again starting at the top of your hierarchy, choose a primary key from your list of candidate keys and migrate that primary key through all other related entities. Identify and list all required alternate keys.
  5. Create your preliminary ER diagram. I would use IDEF1X as it is easy to read and commonly accepted in the real world. If you are unfamiliar with IDEF1X, please read this primer https://www.softwaregems.com.au/Documents/Documentary%20Examples/IDEF1X%20Notation.pdf
  6. Once you have your draft diagram complete, you can identify instances where a surrogate ("Id") may be appropriate. My rule of thumb for determining when a surrogate is appropriate is a) subtypes that group separate entities b) very wide keys where no element is required for referential integrity later c) the data needs to be obscured for security/privacy reasons. Once you choose a surrogate, it becomes the primary key and your existing primary key must become an alternate key. That last sentence is key. If you use a surrogate and do not have an alternate key, you will not have a relational database.
  7. If there are rules which must be enforced through transactions or check constraints, a brief description of the rule can be assigned and labeled to the appropriate entities.
  8. You now have a logical model which can be used to build the physical model. The physical model will include data types and any instances where you might implement things differently than the logical model (different tables for optional columns, etc.).
  9. Now you can start writing your DDL.

I would also recommend that when you are designing your database that instead of using an "Id" column as a surrogate in every instance, you find if there is an appropriate/accepted shorthand that could be used instead. So if each Order has a Status of "Received/In Process/Shipped/Canceled", when you create the reference table perhaps use a StatusCode like "R/P/S/C" instead of meaningless numbers "1/2/3/4". Not only is CHAR(1) less storage than a typical INT, any queries/code written against the database are much easier to understand and you're less likely to commit the sin of using the wrong digit for the wrong column.

Answered by bbaird on November 25, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP