Data_Structures
Data is stored in different formats for different purposes, and the actual physical structure is often customized for particular needs. The structure itself matters less than knowing how to convert data from one structure to another. This is where the patterns and logical models show their value.
There are three main storage categories:
- Data Capture
- Data Exchange
- Data Reporting
Data capture may or may not be physically constrained. If memory is precious, data may be compressed and labels omitted to save space. The storage format also dictates the structure: storing directly to a relational database is different from storing to a CSV file. Important meta-data that is common to every record (sensor id, location, subject) may be put in a meta-data file, or added when the data is removed from the recording device and exported to longer-term storage.
The critical concern is capturing the variable information (timestamp, values), and being able to expand the data into the full template with post-processing.
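As a minimal sketch of that post-processing step, the snippet below merges a label-free capture file with a separate meta-data record. The field names, the sample values, and the `expand_capture` helper are all assumptions made for illustration; they are not taken from a real device or from the StAT template definition.

```python
import csv
import io

# Hypothetical meta-data shared by every record on the device (assumed field names).
METADATA = {"sensor_id": "S-001", "location": "bench-3", "subject": "water-tank"}

# Hypothetical compact capture: no header row, only the variable information.
RAW_CAPTURE = "2024-01-01T00:00:00Z,20.5\n2024-01-01T00:01:00Z,20.7\n"


def expand_capture(raw_text, metadata, attribute="temperature"):
    """Merge each compact (timestamp, value) row with the shared meta-data."""
    records = []
    for timestamp, value in csv.reader(io.StringIO(raw_text)):
        record = dict(metadata)  # copy the fields common to every record
        record.update(timestamp=timestamp, attribute=attribute, value=float(value))
        records.append(record)
    return records


expanded = expand_capture(RAW_CAPTURE, METADATA)
for record in expanded:
    print(record)
```

The capture side stores only two columns per reading. Everything else is added once, at export time, which is the trade-off described above.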
Data exchange focuses on compactness to minimize bandwidth (and transfer time). On the other hand, it is important that all the necessary information is captured and associated in the proper relationships. It is useful to look at GIS (Geographic Information Systems) for meta-data standards and exchange. The GIS community has a long history of working in this area and does a lot of things right. Meta-data often includes the provenance of the data: where it was sourced and the processes by which it has been modified and transformed. It often includes a contact person who can give more explanation about the data. This meta-data is usually a file separate from the actual data.
The critical concern is making sure all the information is present and the correct relationships are expressed.
Data reporting is handled by the reporting system. Its structure will differ depending on whether it is a relational database, a data warehouse or a NoSQL system. In general the records are 'fluffed' for ease of query: each value becomes a separate record with its own timestamp and location information. This expansion may be the result of a relational view, a materialized view or the physical structure design. At a minimum, every field of the StAT template must be present, along with any field expansions.
This is usually the 'end of the road' for the data. Focus is on meeting reporting needs.
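The 'fluffing' described above can be sketched as a wide-to-narrow conversion. This is an illustration under stated assumptions, not the reporting system's actual design: the `fluff` helper, the context field names, and the sample record are hypothetical.

```python
def fluff(wide_record, context_fields=("subject", "sensor", "timestamp", "location")):
    """Turn one wide record into one narrow record per measured value."""
    # Fields that describe the reading as a whole are copied onto every output record.
    context = {name: wide_record[name] for name in context_fields}
    return [
        {**context, "attribute": name, "value": value}
        for name, value in wide_record.items()
        if name not in context_fields
    ]


# Hypothetical wide record holding two measurements taken at the same time and place.
wide = {
    "subject": "tank-1",
    "sensor": "S-001",
    "timestamp": "2024-01-01T00:00:00Z",
    "location": "plant-A",
    "temperature": 20.5,
    "pressure": 101.3,
}

narrow = fluff(wide)
for record in narrow:
    print(record)
```

Each output record carries its own timestamp and location, so a query can filter on a single value without joining back to the original row.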
The goal of data is analytics and reporting. An exchange structure is validated by the ability to convert the data into a fully populated StAT reporting structure. If this cannot be done, the exchange structure does not work and must be modified.
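One way to express this validation is a check that every converted record has every required field filled in. The sketch below assumes a set of field names based on the terms used on this page; the real StAT template may name or group its fields differently, and `find_gaps` is a hypothetical helper.

```python
# Assumed field names; the real StAT template may name or group these differently.
REQUIRED_FIELDS = ("subject", "sensor", "attribute", "value", "timestamp", "location")


def find_gaps(records, required=REQUIRED_FIELDS):
    """Return (record index, field name) pairs for every missing or empty field."""
    return [
        (index, field)
        for index, record in enumerate(records)
        for field in required
        if record.get(field) in (None, "")
    ]


# Hypothetical conversion results from two candidate exchange structures.
complete = [{
    "subject": "tank-1", "sensor": "S-001", "attribute": "temperature",
    "value": 20.5, "timestamp": "2024-01-01T00:00:00Z", "location": "plant-A",
}]
incomplete = [{
    "subject": "tank-1", "sensor": "S-001", "attribute": "temperature",
    "value": 20.5, "timestamp": "2024-01-01T00:00:00Z",
}]

complete_gaps = find_gaps(complete)
incomplete_gaps = find_gaps(incomplete)
print(complete_gaps, incomplete_gaps)
```

An empty result means the exchange structure carried enough information to populate the reporting structure. Any reported gap points at the field the exchange structure needs to add.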
Exchange structures can have different hierarchies depending on which is most compact. If a structure element is common to all data records, it can be moved to the metadata.
- Subject, sensor, attribute - all attributes of the sensor apply to the same subject
- Sensor, subject, attribute - one sensor is measuring multiple things
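Both hierarchies above can be flattened into the same record shape. The sketch below is a hedged illustration: the nested dictionaries, the `shared` meta-data, and the `flatten` helper are hypothetical, chosen only to show that the nesting order changes the exchange layout but not the resulting records.

```python
def flatten(nested, level_names, shared=None):
    """Walk a nested dict whose levels are named by level_names and return flat records."""
    shared = dict(shared or {})
    records = []

    def walk(node, depth, path):
        if depth == len(level_names):
            # Reached a leaf: node is the measured value.
            records.append({**shared, **path, "value": node})
            return
        for key, child in node.items():
            walk(child, depth + 1, {**path, level_names[depth]: key})

    walk(nested, 0, {})
    return records


# Elements common to all records live in the meta-data (hypothetical values).
shared = {"timestamp": "2024-01-01T00:00:00Z", "location": "plant-A"}

# Subject, sensor, attribute: all attributes of the sensor apply to the same subject.
by_subject = {"tank-1": {"S-001": {"temperature": 20.5, "pressure": 101.3}}}

# Sensor, subject, attribute: one sensor is measuring multiple things.
by_sensor = {"S-002": {"tank-1": {"level": 1.2}, "tank-2": {"level": 0.8}}}

from_subject = flatten(by_subject, ("subject", "sensor", "attribute"), shared)
from_sensor = flatten(by_sensor, ("sensor", "subject", "attribute"), shared)
for record in from_subject + from_sensor:
    print(record)
```

Whichever hierarchy is more compact for a given dataset, the flattened records have the same fields, so the choice of exchange layout does not affect the reporting side.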