AI Playbook: 11 - Deprecation and Cleanup¶
This playbook provides a safe and structured process for deprecating and removing resources (such as data models, views, datasets, or transformations) from the CDF project. A disciplined cleanup strategy is essential for managing costs, reducing complexity, and avoiding the accumulation of technical debt.
Goal: To ensure that the removal of resources is a deliberate, communicated, and non-disruptive process.
1. The Four-Step Deprecation Process¶
Simply deleting a resource is dangerous. A proper deprecation process should be followed to prevent breaking downstream dependencies.
Step 1: Mark for Deprecation (in Docs)¶
-
Action: The first step happens in the knowledge base. The resource to be removed (e.g., a specific view in an object specification) is marked with a
deprecation_date. -
Example in
..._specification.md:
- **Property:**
- **Name:** oldLegacyMetric
- **Data Type:** float64
- **Status:** DEPRECATED
- **Deprecation Date:** 2024-12-31
- **Deprecation Note:** "This metric is being replaced by `newAdvancedMetric`. Please update all downstream dependencies before the deprecation date."
- Communication: This change, once merged, serves as the official public notice to all users of the data model that the resource is scheduled for removal.
Step 2: Analyze Impact¶
- Action: Before removing the resource, run an impact analysis to find all downstream dependencies.
- Details:
- Use CDF's lineage capabilities to identify which assets, dashboards, or applications are querying the deprecated resource.
- Search application code repositories for any references to the resource's external ID.
- Tooling: This can be a combination of using the CDF UI, the SDK, and code search tools. The goal is to produce a "dependency report."
Step 3: Remove the Resource (in Docs)¶
- Action: After the
deprecation_datehas passed and all downstream dependencies have been migrated, the resource is removed from the Markdown specification file. - Process:
- Delete the corresponding property, relationship, or entire object file from the knowledge base.
- Create a pull request with this change.
Step 4: Automated Cleanup via YAML Sync¶
- Action: The
05_TOOLKIT_YAML_SYNC.mdplaybook handles the final cleanup automatically. - Process:
- When the pull request from Step 3 is merged, it triggers the YAML Sync playbook.
- The sync agent will compare the new state of the knowledge base (where the resource is missing) with the current state of the Toolkit YAML.
- It will detect that the resource has been removed and will automatically delete the corresponding YAML file or block.
- This change is then promoted through the environments just like any other change, ensuring the resource is cleanly removed from CDF.
2. Deleting a Full Module or Project¶
The same principles apply when retiring an entire module or project, but with additional safety checks.
- Action: Before deleting the entire module folder from
docs/cdf_project/modules/, a final, comprehensive impact analysis must be performed. - Checklist:
- Have all downstream consumers been notified and migrated?
- Have the associated data ingestion pipelines been turned off?
- Has the data been archived if required for compliance?
- Has a final cost savings report been generated?
- Execution: Once the checklist is complete and approved, deleting the module's folder from the knowledge base and running the YAML sync will trigger a cascading delete of all its associated resources in CDF.
Note: This playbook emphasizes that deletion is not a single action but a communicated, multi-step process. The "docs-as-code" framework is what makes this process safe and auditable, as the removal of any resource leaves a clear trail in the Git history.