SaQC - System for automated Quality Control

Anomalies and errors are the rule, not the exception, when working with time series data. This is especially true if such data originates from in-situ measurements of environmental properties. Almost all applications, however, implicitly rely on data that complies with some definition of ‘correct’. To derive reliable data products and tools, there is no alternative to quality control. SaQC provides all the building blocks to comfortably bridge the gap between ‘usually faulty’ and ‘expected to be correct’ in an accessible, consistent, objective and reproducible way.

Documentation

Getting Started
  • installation

  • first steps

  • python API introduction

  • command line syntax

SaQC Configurator
  • parametrisation tool and sand-box for all the SaQC methods

  • usable without setting up a local environment

Functionality Overview (API)
  • flagging methods overview

  • processing algorithms overview

  • tools overview

Cookbooks
  • outlier detection

  • frequency alignment

  • drift detection

  • data modelling

  • wrapping generic or custom functionality

Documentation
  • CSV file-controlled flagging

  • global keywords

  • customization

Developer Resources
  • writing documentation

  • implementing SaQC methods

SaQC is developed and maintained by the Research Data Management Team at the Helmholtz-Centre for Environmental Research - UFZ. It embodies the requirements and experience gained from implementing and operating fully automated quality control pipelines for environmental sensor data. The diversity of the communities involved in this process, and the special needs of scientific data acquisition and provisioning, have shaped SaQC into its current state. We define this state as inherently consistent yet externally extensible, traceable, approachable for non-programmers, and usable in a wide range of applications, from exploratory interactive programming environments to large-scale, fully automated, managed workflows.