Schema Drift
A related concept discussed in RW Concept Drift and Model Decay in Machine Learning:
A model's predictive ability decays over time. Broadly, a model can decay in two ways: through data drift or through concept drift. With data drift, the data evolves over time, potentially introducing previously unseen varieties of data and the new categories that come with them.
# Problem Definition
From industry experience (source: see origin):
The source databases all have similar schemas but are slightly different as they have been snapshotted off a master schema at different points in time. For business reasons, these source schemas cannot be updated as the master schema evolves over time, with the result that we see schema drift/divergence across our source databases.
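To make the divergence concrete, here is a minimal sketch of how one source schema could be compared against the master schema. It assumes schemas are available as plain column-to-type mappings; the `MASTER` columns, `diff_schema`, and `conform_row` are hypothetical names made up for illustration, not part of any tool mentioned in the origin thread.

```python
from dataclasses import dataclass

# Hypothetical master schema: column name -> type name (illustrative only).
MASTER = {
    "id": "int",
    "email": "text",
    "created_at": "timestamp",
    "status": "text",
}


@dataclass
class SchemaDiff:
    missing: dict  # columns in master but absent from the source
    extra: dict    # columns in the source but absent from master
    retyped: dict  # column -> (master_type, source_type)

    @property
    def drifted(self) -> bool:
        # Any difference at all counts as drift in this sketch.
        return bool(self.missing or self.extra or self.retyped)


def diff_schema(master: dict, source: dict) -> SchemaDiff:
    """Compare a source schema against the master schema."""
    missing = {c: t for c, t in master.items() if c not in source}
    extra = {c: t for c, t in source.items() if c not in master}
    retyped = {
        c: (master[c], source[c])
        for c in master.keys() & source.keys()
        if master[c] != source[c]
    }
    return SchemaDiff(missing, extra, retyped)


def conform_row(row: dict, master: dict) -> dict:
    """Project a source row onto the master columns, filling gaps with None."""
    return {c: row.get(c) for c in master}


# A source snapshotted before "status" was added and "email" was retyped.
old_source = {"id": "int", "email": "varchar", "created_at": "timestamp"}
report = diff_schema(MASTER, old_source)
print(report.missing, report.retyped)
```

Filling missing columns with `None` is only one possible policy. Whether dropped, defaulted, or rejected rows are appropriate depends on the pipeline and is left open here.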
# Related Concepts
- Schema Registry
- Avro with included db schema
- Schema Evolution (differences from Schema Drift still need exploring)
# Open Questions
- What are the key differences between Schema Drift and Schema Evolution?
- How can Schema Drift be effectively managed in data pipelines?
Origin:
ETL tool for schema drift : dataengineering
References: Schema Evolution
Created 2022-09-27