|
Designing Your Data Model
The Template Manager allows you to edit your data model in HGA.

The easiest way to get started is to begin with the standard Environmental Template that comes with HGA, (available in either imperial or metric units). You can then adjust it to meet your specific project needs. There are only a very limited set of fields that should remain untouched, or which require special configuration in the Profile Settings. A summary of these fields is available here.
One of the most common modifications is to rename fields and tables to match project terminology. For example, you may wish to rename the Hydrogeology table to be Hydrostratigraphy.
Some examples of tables that clients have added include:
- Top of Casing (TOC) Changes – to store historical data showing changes in TOC over time
- Free Product Thickness - to store measurements of thickness of free product found in wells
- Stream Flow – to store date, average daily flow rate, peak flow, gage height, reservoir storage, and lake elevations
- Meteorology – to store daily precipitation, temperature, and evapotranspiration rates
- Well Production – to store production data (for example, for pump and treat systems)
Primary keys
It is good practice to set up your tables with primary keys. Why would you want one? For two reasons. First, having a primary key means you can’t end up with duplicate entries (for example, if you accidentally try to import the same data twice). Second, once the data are in your project, if you want to update a large number of records quickly, having a primary key on your table allows you use it and the Data Transfer System (DTS) in HGA to do a bulk update of all affected records. Examples of when you might want to do a bulk update might be because you have new data for several records, or because you realized a part of the data is wrong (hey – it does happen!).
A primary key can be a single field (such as StationID in the Station table), or a combination of fields (such as From and To for a table like Casing). The important point is that whatever you choose, it should create a UNIQUE combination for every record in that table.
Relationships
Like everything in life, a good data model is built on good relationships. The most common form of relationship (in HGA at least) is the Parent-Child relationship. This relationship describes a hierarchy, where the parent table contains data that is, in some ways, at a higher level than the child table. There is usually a one-to-many relationship between the parent and child. Meaning that, just as in real life, one parent can have many children. And (in the world of HGA), each child table can have only one parent table. As a result, the parent table must contain data before the child table can.

A good example that you would be familiar with from your current projects are the Parameter_Sample and Parameter_Result tables. The Parameter_Sample table contains information about the physical sample (this generally corresponds to the types of data you would collect in the field), while the Parameter_Result table contains information about the analytical results (this generally corresponds to information created by the laboratory). Parameter_Sample is parent to the Parameter_Result table. One sample can have many different results (one for each analyte).
Understanding these types of relationships and having the ability to identify situations where a parent-child relationship is helpful (or necessary) can reduce duplication in your data, and make your data easier to maintain.
For example, I recently worked with an HGA client who has a lot of geophysical data. They often ran the same tool set down the same wells, to see how values were changing over time. The existing geophysical data structure in the classic Environmental Template can’t handle this, because each geophysical data table has only depth and value fields and assumes each log is run only once for each station. There are two different ways this situation could be handled.
Basic approach – no parent child relationship
The client could have modified each relevant table to include a date field, and a run number. They could then use the run number to track the log – for example, a gamma log, over time in a single well. Such a table would look like the one shown below.

You can see that the Run and the Date fields repeat many times, each time with the same content. In fact, they repeat for every Gamma value collected for each specific date. Imagine if you found a mistake in the date for one of the runs. In the example shown, you would have to correct that date 1311 times! Want to import new data? You would be importing 2620 more values than you need to, because of the duplication.
Smart approach – nurture a parent child relationship
The alternative to this approach is to create two tables.
The first we will call Geophysical_Run. This table contains information about each run – a unique run_id field (the primary key), the date the survey was made, the type of geophysical data collected, the measurement unit of the data, and the station in which it was performed. This will be the parent table.

The second table, we will call Geophysical_Result. This table is simple; it contains three fields: run_id, depth, and value. This is the child table. The two tables are linked through the run_id. The primary key is a combination of the run_id and depth fields.
The same data are stored, but now there are no duplicated pieces of data. If you discover an error in a survey date – you can correct it once, and it’s updated everywhere. Plus, you can see at a glance, from the Geophysical_Run table, how many times a particular well has been surveyed. And if that client decides to run a new type of geophysical survey at the site, he doesn’t have to change his data model to include a new table for that survey data. Instead, he just adds a new value to the survey_type pick list and he’s ready to import.

We’d love to hear about the tweaks you’ve made to your data models – let us know.
Top |