Data Modeling: MetaData
Metadata is data about data.
In more technical terms: "Whatever describes all the members of a set cannot itself be a member of the set" (Ludwig Wittgenstein, Tractatus).
Metadata gives the context of the data of interest. While often overlooked by relational database folks, it is a significant issue for geospatial data systems (ref).
The trouble with metadata is that there is always more of it. Every context has another context. If you have ever spent several hours wandering from one internet link to another ("wandering down the rabbit hole"), then snapped awake to ask yourself "how did I get here?", you have been tracing metadata links without a clear goal. A Smart Rock sensor is made of PVC pipe, the pipe was bought in a store, the purchase was processed by a clerk, the clerk has a birthdate, ... Deciding what metadata to collect requires knowing the business context in which the data will be used and what questions will be asked of it.
Always capture the immediate context; capturing the next layer down as well is a good safety measure. The immediate context is what you intend to use, and the next layer down is 'insurance' against unexpected future uses.
Location is a key metadata item. Cell-phone accuracy (3-4 meters) is usually sufficient. With this piece of information it is possible to join other analytic data such as weather, soil type, land slope, or other observation records.
The difficulty with adding this context later is that it may require more and more guesswork about precision and accuracy. Weather data is rarely available for the sampling site itself. Frequently it is derived by triangulating from the three nearest weather reporting sites and calculating an estimated rainfall or temperature.
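As a rough illustration of estimating a value from the three nearest reporting sites, here is a minimal sketch. It assumes inverse-distance weighting, which is only one of several possible interpolation methods and is not necessarily what any particular weather product uses. The function names, the `(lat, lon, value)` tuple layout, and the `power` parameter are all hypothetical choices for this example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def estimate_from_nearest(site, stations, k=3, power=2):
    """Estimate a value at `site` from the k nearest stations.

    site     -- (lat, lon) of the sampling site
    stations -- list of (lat, lon, value) tuples, e.g. daily rainfall in mm
    Uses inverse-distance weighting (an assumption of this sketch).
    """
    if not stations:
        raise ValueError("at least one station is required")
    # Rank stations by distance to the site and keep the k closest.
    ranked = sorted(
        (haversine_km(site[0], site[1], lat, lon), value)
        for lat, lon, value in stations
    )
    nearest = ranked[:k]
    # A station exactly at the site needs no interpolation.
    if nearest[0][0] == 0:
        return nearest[0][1]
    weights = [1.0 / (dist ** power) for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

The estimate is only as good as the site coordinates: the less precisely the sampling location was recorded, the less meaningful the distances, and therefore the weights, become.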
The critical context for water quality monitoring is time and location. Time is metadata about the individual sensor readings, while location applies to all readings during a deployment. At a minimum, location should be latitude and longitude. Additional metadata might include the HUC (Hydrologic Unit Code) ID and possibly the river mile.
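The split between per-reading and per-deployment metadata can be sketched as two record types. This is an illustrative layout only, not the project's actual schema: the class names, field names (including `river_mile` and `parameter`), and the coordinates and placeholder HUC value in the usage lines are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """Metadata shared by every reading taken during one deployment."""
    latitude: float                      # decimal degrees, minimum required location
    longitude: float                     # decimal degrees
    huc_id: Optional[str] = None         # Hydrologic Unit Code, if known
    river_mile: Optional[float] = None   # hypothetical field name

    def __post_init__(self):
        # Reject coordinates that cannot be valid latitude/longitude.
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be between -90 and 90")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be between -180 and 180")


@dataclass(frozen=True)
class Reading:
    """One sensor reading; time is recorded per reading."""
    deployment: Deployment
    timestamp: datetime
    parameter: str       # e.g. "temperature_c" (illustrative name)
    value: float


# Illustrative values only; the HUC string is a placeholder, not a real unit.
site = Deployment(latitude=45.0, longitude=-93.0, huc_id="00000000")
readings = [
    Reading(site, datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc), "temperature_c", 4.2),
    Reading(site, datetime(2020, 1, 1, 12, 15, tzinfo=timezone.utc), "temperature_c", 4.3),
]
```

Storing location once on the deployment, rather than on every reading, keeps the two kinds of metadata from drifting apart and makes a later correction to the site coordinates a single change.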
Device type ("Smart Rock") and version number are also important metadata. These are useful for determining sensor accuracy, but will likely have little value once the data is trusted (for accuracy and precision) and you are doing the actual analytics.
GIS metadata always includes contact information: the organization, and possibly a person to contact for details that are not evident in the data package.