Schema Registry
Schema Registry provides a centralized repository for managing and validating schemas for topic message data, and for serialization and deserialization of the data over the network.
Producers and consumers of topic data (for example, in Apache Kafka) can use schemas to ensure data consistency and compatibility as schemas evolve.
Schema Registry is a key component of data governance, helping to ensure data quality, adherence to standards, visibility into data lineage, auditability, cross-team collaboration, efficient application development, and system performance.
# Schema Management
Schema Registry provides a serving layer for your metadata, offering a RESTful interface for storing and retrieving schemas. It is available in several implementations.
# Key Features
Key features across implementations:
- RESTful interface for storing and retrieving schemas
- Versioned history of all schemas based on specified subject name strategies
- Multiple compatibility settings
- Schema Evolution according to configured compatibility settings
- Support for multiple schema types (for example, Avro, JSON Schema, and Protobuf in Confluent's implementation)
- Diffing between schema versions; for example, services can return a diff between two Modern ValueStore versions, showing new and deprecated columns
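To make the compatibility-settings bullet concrete, here is a deliberately simplified sketch of a BACKWARD check (a new schema is compatible if consumers using it can still read data written with the previous schema). All names here are illustrative; real registries apply the full schema-resolution rules of each format, not just field comparison:

```python
# Toy BACKWARD compatibility check for Avro-style record schemas.
# Real registries (e.g. Confluent) do full schema resolution; this
# sketch only checks one rule: fields added in the new schema must
# carry a default, otherwise old data (which lacks them) is unreadable.

def field_map(schema):
    """Map field name -> field dict for a record schema."""
    return {f["name"]: f for f in schema["fields"]}

def is_backward_compatible(new_schema, old_schema):
    old_fields = field_map(old_schema)
    for name, field in field_map(new_schema).items():
        if name not in old_fields and "default" not in field:
            return False
    return True

old = {"type": "record", "name": "User",
       "fields": [{"name": "id", "type": "long"}]}
new_ok = {"type": "record", "name": "User",
          "fields": [{"name": "id", "type": "long"},
                     {"name": "email", "type": "string", "default": ""}]}
new_bad = {"type": "record", "name": "User",
           "fields": [{"name": "id", "type": "long"},
                      {"name": "email", "type": "string"}]}

print(is_backward_compatible(new_ok, old))   # True: added field has a default
print(is_backward_compatible(new_bad, old))  # False: added field lacks a default
```

Other settings (FORWARD, FULL, and their transitive variants) differ only in which direction, and against how many prior versions, this kind of check is applied.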
# Schema Registry Tools
- Confluent Schema Registry with Apache Kafka integration
- Apollo Schema Registry via schema reporting for GraphQL
- Snowplow’s Iglu Server
- xRegistry (xregistry/spec)
- Microsoft Fabric’s stream catalog (“Fabric Real-Time Hub”) will be based on the xRegistry work, with Apache Avro schemas (plus extensions) as the canonical schema model for tooling
# Confluent Schema Registry Deep Dive
Confluent Schema Registry provides a serving layer for your metadata with Apache Avro schemas support. It lives outside of and separately from your Kafka brokers. While producers and consumers communicate with Kafka to publish and read data (messages) to topics, they can concurrently interact with Schema Registry to send and retrieve schemas describing the data models for the messages.
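As a concrete illustration of that interaction, Confluent producers embed the schema's registry ID in every message using a small wire format: a magic byte, a 4-byte big-endian schema ID, then the serialized payload. A minimal framing/unframing sketch (the payload bytes below are a placeholder standing in for real Avro data):

```python
import struct

# Confluent wire format: magic byte (0) + 4-byte big-endian schema ID
# + serialized payload. The producer registers/looks up the schema to
# obtain the ID; the consumer reads the ID back and fetches the schema
# from the registry to deserialize the payload.

MAGIC_BYTE = 0

def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload

def unframe(message: bytes) -> tuple[int, bytes]:
    """Split a framed message back into (schema_id, payload)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("not Confluent wire format")
    return schema_id, message[5:]

msg = frame(42, b"\x02hi")        # placeholder bytes, not real Avro
sid, payload = unframe(msg)
print(sid, payload)               # 42 b'\x02hi'
```

Because only the 5-byte header travels with each message, the full schema text is fetched (and cached) from the registry out of band rather than repeated per record.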
Key design aspects of Confluent Schema Registry:
- Assigns globally unique, monotonically increasing IDs to each registered schema
- Uses Kafka as durable backend and write-ahead changelog
- Distributed architecture with single-primary design
- Primary election coordinated via ZooKeeper or Kafka, depending on configuration
# Related
Origin: Analytics API
Related notes:
Created: 2021-10-12