# Context Engineering

Context engineering is the strategic discipline of designing and building dynamic systems that provide AI assistants with the right information, in the right format, at the right time. It's the foundation of effective AI-assisted development.

## 🎯 What is Context Engineering?

While prompt engineering focuses on crafting the instructions for a single request-response cycle, context engineering governs everything the model sees across an entire workflow: which information is selected, how it is compressed, what is kept in memory, and which tool schemas are exposed.

### Key Distinction

- **Prompt Engineering**: "How do I ask the question?"
- **Context Engineering**: "What does the AI need to know before I even ask the question?"

## 🏗️ The Three Tiers of Context

### Tier 1: Project-Level Context (Global Blueprint)

**Purpose:** Define the overall project environment and standards

**Contents:**

- Technology stack and specific versions
- Overarching architectural patterns
- Core design principles and coding standards
- Code organization and folder structure
- Naming conventions and workflows

**Implementation:** Store in a `.cursor/rules.md` file in the project root

### Tier 2: Feature-Level Context (Local Schematic)

**Purpose:** Provide information relevant to a specific feature or module

**Contents:**

- Related component interfaces
- API schemas and data models
- State management patterns
- Relevant test files and documentation

**Implementation:** Have the relevant files open in the editor or use `@file` references

### Tier 3: Task-Level Context (Immediate Instruction)

**Purpose:** Provide specific details for a single, immediate action

**Contents:**

- Specific implementation goal
- Acceptance criteria
- Performance requirements
- Environment compatibility needs

**Implementation:** Include directly in the prompt or chat message
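
To make the layering concrete, here is a minimal sketch of how the three tiers could be combined into one prompt payload, ordered from most general to most specific. The IDE normally does this assembly for you; `ContextBundle` and `load_project_rules` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ContextBundle:
    """Hypothetical container for the three tiers of context."""

    project_rules: str  # Tier 1: global blueprint
    feature_files: dict[str, str] = field(default_factory=dict)  # Tier 2: path -> content
    task: str = ""  # Tier 3: immediate instruction

    def render(self) -> str:
        """Concatenate the tiers from most general to most specific."""
        parts = ["# Project rules\n" + self.project_rules.strip()]
        for path, content in self.feature_files.items():
            parts.append(f"# File: {path}\n{content.strip()}")
        parts.append("# Task\n" + self.task.strip())
        return "\n\n".join(parts)


def load_project_rules(root: Path) -> str:
    """Read Tier 1 context; return an empty string if the rules file is absent."""
    rules_path = root / ".cursor" / "rules.md"
    return rules_path.read_text(encoding="utf-8") if rules_path.exists() else ""
```

Keeping the tiers as separate fields makes it easy to swap the task while reusing the same project and feature context.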

## 🛠️ Implementation in Cursor IDE

### 1. Project-Level Context (`.cursor/rules.md`)

#### Real Example: CDF Asset Monitoring Solution

# Project Rules for AI Assistant - CDF Asset Monitoring

## Technology Stack
- Cognite Python SDK 7.13.0 for CDF interactions
- Cognite Toolkit 0.1.0 for infrastructure as code
- Python 3.11.7 with type hints everywhere
- FastAPI 0.109.0 for custom APIs
- Apache Spark 3.5.0 for data transformations
- Docker + Kubernetes for deployment
- Poetry for dependency management

## CDF Architecture Standards
- Data Space: `sp_asset_monitoring` for all data models
- Core Data Model: Extend from cdf_cdm:CogniteAsset v1
- Naming Convention: `{source}:{type}:{identifier}`
  - Example: `sap:pump:P-101`, `scada:sensor:temperature`
- External IDs: Use snake_case for all external IDs
- Timestamps: ISO 8601 with UTC timezone everywhere

## Data Modeling Patterns
- Always extend from Cognite Core Data Model (CDM)
- Container structure:
  - `containers/`
    - `Pump.container.yaml`: custom properties for pumps
    - `Compressor.container.yaml`: custom properties for compressors
    - `MaintenanceOrder.container.yaml`
  - `views/`
    - `Pump.view.yaml`: implements CogniteAsset
    - `Compressor.view.yaml`: implements CogniteAsset
    - `MaintenanceOrder.view.yaml`: implements CogniteActivity
- Use `requires` constraints to ensure data integrity
- Link time series using consistent patterns: `{asset_type}:{asset_id}:{measurement}`

## Transformation Standards
- Use Cognite Transformations for all data ingestion
- SQL transformations must include:
  - Proper error handling with COALESCE
  - Data quality checks (NOT NULL constraints)
  - Consistent timestamp handling
  - node_reference() for relationships
- Schedule patterns:
  - Real-time data: Every 5 minutes
  - Master data: Daily at 02:00 UTC
  - Historical loads: One-time with backfill

## Time Series Patterns
- External ID format: `{source}:{asset_type}:{asset_id}:{measurement}:{unit}`
  - Example: `pi:pump:P-101:flow:m3h`
- Metadata requirements:
  - unit: Required for all measurements
  - description: Human-readable description
  - assetId: Link to parent asset
  - source: Origin system identifier

## Security & Access Control
- Groups follow pattern: `{environment}_{role}`
  - Example: `prod_admins`, `dev_users`
- Capabilities assignment:
  - Admins: Full access to space and transformations
  - Users: Read-only access to views and time series
  - Service accounts: Specific write permissions

## Query Optimization
- Use GraphQL for complex queries with relationships
- Implement pagination for large result sets
- Cache frequently accessed master data
- Use projections to limit returned fields
- Index commonly filtered properties

## Error Handling
- All SDK calls must handle CogniteAPIError
- Implement exponential backoff for rate limits
- Log errors with correlation IDs
- Never expose internal IDs in error messages
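
The asset naming convention in the rules above (`{source}:{type}:{identifier}`) can be checked mechanically. This is a minimal sketch. The allowed character set per segment is an assumption, and both helper names are hypothetical rather than part of the Cognite SDK.

```python
import re

# Assumed character set per segment: letters, digits, underscore, dot, hyphen.
_SEGMENT = r"[A-Za-z0-9_.\-]+"
ASSET_ID_PATTERN = re.compile(rf"^{_SEGMENT}:{_SEGMENT}:{_SEGMENT}$")


def is_valid_asset_external_id(external_id: str) -> bool:
    """Return True if the ID has exactly three non-empty, colon-separated segments."""
    return ASSET_ID_PATTERN.fullmatch(external_id) is not None


def build_asset_external_id(source: str, asset_type: str, identifier: str) -> str:
    """Compose an external ID; source and type are lowercased, the identifier is kept as-is."""
    external_id = f"{source.lower()}:{asset_type.lower()}:{identifier}"
    if not is_valid_asset_external_id(external_id):
        raise ValueError(f"Invalid asset external ID: {external_id!r}")
    return external_id
```

A check like this can run in CI or inside an ingestion function so that malformed IDs are rejected before they reach CDF.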

#### Real Example: CDF Data Pipeline Solution

# Project Rules for AI Assistant - CDF Data Pipeline

## Technology Stack
- Cognite Python SDK 7.13.0 for CDF operations
- Apache Spark 3.5.0 for large-scale transformations
- Databricks Runtime 14.3 LTS
- Delta Lake for data storage
- GitHub Actions for CI/CD
- Cognite Functions for serverless compute
- Great Expectations for data quality

## Pipeline Architecture Standards
- RAW databases follow pattern: `raw_{source_system}`
  - Example: `raw_sap`, `raw_pi`, `raw_scada`
- Staging pattern: RAW → Staging → Core
- Data lineage tracking via CDF Datasets
- Idempotent operations (safe to re-run)
- Checkpoint-based processing for reliability

## Transformation Patterns
- Cognite Transformations structure:
  ```yaml
  transformations/
    01_raw_to_staging/
      assets_transformation.sql
      timeseries_transformation.sql
    02_staging_to_core/
      asset_hierarchy.sql
      relationships.sql
    03_data_quality/
      validation_checks.sql
  ```
- Use MERGE for upsert operations
- Window functions for time-based calculations
- CTEs for complex transformations
- Parameterized queries with Jinja2 templates

## Data Quality Standards
- Implement quality gates at each stage:
  - Schema validation on RAW ingestion
  - Business rule validation in staging
  - Referential integrity in core
- Quality metrics to track:
  - Completeness: % of non-null required fields
  - Uniqueness: Duplicate detection
  - Timeliness: Data freshness checks
  - Accuracy: Range and format validation
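
Outside the rules file itself, two of the quality metrics listed above (completeness and uniqueness) could be computed along these lines. This is a sketch over plain dictionaries. The function names and the row shape are assumptions, not part of any CDF API.

```python
from typing import Any, Iterable, Mapping, Sequence


def completeness(rows: Sequence[Mapping[str, Any]], required: Iterable[str]) -> float:
    """Percentage of required fields that are present and non-null across all rows."""
    required = list(required)
    total = len(rows) * len(required)
    if total == 0:
        return 100.0  # nothing to check counts as complete
    filled = sum(1 for row in rows for name in required if row.get(name) is not None)
    return 100.0 * filled / total


def duplicate_keys(rows: Sequence[Mapping[str, Any]], key: str) -> list:
    """Return the sorted key values that occur more than once."""
    seen, dupes = set(), set()
    for row in rows:
        value = row.get(key)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)
```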

## CDF Functions Patterns
- Function naming: `{purpose}_{source}_{target}`
  - Example: `ingest_sap_assets`, `calculate_oee_metrics`
- Standard function structure:

from cognite.client import CogniteClient

def handle(client: CogniteClient, data: dict) -> dict:
    # Input validation
    # Business logic
    # Error handling
    # Return status
    return {"status": "ok"}

- Environment variables for configuration
- Structured logging with correlation IDs
- Timeout handling for long-running operations

## Time Series Ingestion
- Batch size: 100,000 data points per request
- Use async operations for parallel ingestion
- Implement backpressure handling
- Data point validation:
  - Timestamp within reasonable range
  - Value within expected bounds
  - No duplicate timestamps
- Aggregation patterns for downsampling
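
Outside the rules file itself, the validation and batching rules above might look like this in practice. It is a sketch on plain `(timestamp, value)` tuples. The bounds are passed in by the caller, and the helper names are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator

Datapoint = tuple[int, float]  # (timestamp in ms since epoch, value)
MAX_BATCH_SIZE = 100_000  # batch size stated in the rules above


def validate_datapoints(
    points: Iterable[Datapoint],
    min_ts: int,
    max_ts: int,
    min_value: float,
    max_value: float,
) -> list[Datapoint]:
    """Drop points outside the time or value bounds; keep the first point per timestamp."""
    seen: set[int] = set()
    valid: list[Datapoint] = []
    for ts, value in points:
        if not (min_ts <= ts <= max_ts):
            continue
        if not (min_value <= value <= max_value):
            continue
        if ts in seen:
            continue
        seen.add(ts)
        valid.append((ts, value))
    return valid


def batched(points: Iterable[Datapoint], size: int = MAX_BATCH_SIZE) -> Iterator[list[Datapoint]]:
    """Yield consecutive batches of at most `size` points."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(points)
    while batch := list(islice(iterator, size)):
        yield batch
```

Each yielded batch can then be handed to whatever insert call the pipeline uses, one request per batch.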

## Monitoring & Alerting
- Log all transformation runs to CDF Events
- Track metrics:
  - Records processed
  - Processing time
  - Error rates
  - Data quality scores
- Alert thresholds:
  - Failed transformations
  - Data quality below 95%
  - Processing time > 2x average
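
Outside the rules file itself, the three alert thresholds above can be expressed as one small pure function. The `RunMetrics` shape and the alert names are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    """Hypothetical summary of one transformation run."""

    succeeded: bool
    quality_score: float  # percent, 0-100
    processing_seconds: float


def evaluate_alerts(
    run: RunMetrics,
    average_seconds: float,
    min_quality: float = 95.0,
    slow_factor: float = 2.0,
) -> list[str]:
    """Return the names of all alert conditions that this run triggers."""
    alerts: list[str] = []
    if not run.succeeded:
        alerts.append("transformation_failed")
    if run.quality_score < min_quality:
        alerts.append("data_quality_below_threshold")
    if average_seconds > 0 and run.processing_seconds > slow_factor * average_seconds:
        alerts.append("processing_time_exceeded")
    return alerts
```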

## Error Recovery
- Implement circuit breaker pattern
- Dead letter queues for failed records
- Automatic retry with exponential backoff
- Manual intervention workflows
- Rollback procedures for critical failures
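
Outside the rules file itself, the "automatic retry with exponential backoff" item above could be implemented as a decorator. This sketch covers only the retry behaviour, not the circuit breaker or dead letter queue. The name `exponential_backoff` and all of its parameters are illustrative assumptions, and the `sleep` argument exists so tests can avoid real waiting.

```python
import functools
import random
import time
from typing import Callable, Type


def exponential_backoff(
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry the wrapped function, doubling the delay after each failed attempt."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_retries:
                        raise  # retries exhausted: surface the last error
                    delay = min(max_delay, base_delay * 2 ** attempt)
                    # Add up to 10% jitter so parallel clients do not retry in lockstep.
                    sleep(delay + random.uniform(0, delay * 0.1))

        return wrapper

    return decorator
```

In a CDF pipeline, `retry_on` would typically be narrowed to the error types that signal rate limiting or transient failures rather than every exception.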

#### Real Example: CDF InField Inspector App
# Project Rules for AI Assistant - CDF InField Mobile App

## Technology Stack
- React Native 0.73.2 with TypeScript 5.3.3
- Cognite React Components 6.2.0
- Cognite SDK JS 9.13.0 for CDF access
- React Navigation 6.x for navigation
- Redux Toolkit for offline state management
- React Native MMKV for secure storage
- Mapbox for asset location display
- React Native Camera for inspection photos

## CDF Integration Patterns
- Authentication via Cognite OAuth2
- Project configuration:
  ```typescript
  const client = new CogniteClient({
    appId: 'cdf-infield-inspector',
    project: process.env.CDF_PROJECT,
    baseUrl: process.env.CDF_CLUSTER,
    authentication: {
      type: 'OIDC',
      options: { /* OIDC config */ }
    }
  });
  ```
- Offline-first architecture with sync queue
- Cache CDF resources locally for offline access

## Asset Inspection Workflow
- Load asset hierarchy from CDF
- Display asset 360 images and documents
- Create inspection events:

interface InspectionEvent {
  externalId: string;
  type: 'inspection';
  subtype: 'routine' | 'corrective' | 'preventive';
  assetIds: number[];
  startTime: Date;
  endTime: Date;
  metadata: {
    inspector: string;
    findings: Finding[];
    photos: FileReference[];
  };
}

## Offline Capabilities
- Download asset data for offline routes
- Queue all CDF operations when offline
- Sync strategy:
  - Store operations in MMKV queue
  - Retry with exponential backoff
  - Handle conflicts with server data
  - Notify user of sync status
- Maximum offline storage: 500 MB

## 3D Model Integration
- Use Cognite 3D Web Viewer
- Support for CAD and point cloud models
- Highlight equipment in 3D view
- Link 3D positions to assets:

const assetMapping3D = {
  nodeId: model.nodeId,
  assetId: asset.id,
  boundingBox: {...}
};

## Time Series Display
- Show last 24 hours by default
- Support for multiple series overlay
- Gesture-based zoom and pan
- Offline caching of recent data
- Aggregation for performance:

const datapoints = await client.datapoints.retrieve({
  items: [{ id: timeseriesId }],
  start: '24h-ago',
  aggregates: ['average'],
  granularity: '1h'
});

## Work Order Integration
- Create work orders from findings
- Link to maintenance system
- Required fields:
  - Asset reference
  - Priority level
  - Description with photos
  - Estimated completion
- Status tracking and updates

## Performance Requirements
- Asset list loading < 2 seconds
- 3D model initial load < 10 seconds
- Support 10,000+ assets in hierarchy
- Smooth 60fps scrolling
- Image upload with compression
- Background sync without UI blocking

## Security Standards
- Biometric authentication for app access
- Encrypted local storage for sensitive data
- Certificate pinning for CDF endpoints
- Session timeout after 15 minutes
- Audit trail for all inspections
- GDPR compliance for personal data

### 2. Feature-Level Context

#### Example: CDF Asset Hierarchy Feature
When working on asset hierarchy management, provide these contexts:

**Open Files:**
```yaml
# @file:config/modules/asset_monitoring/containers/Pump.container.yaml
apiVersion: v1
kind: Container
metadata:
  externalId: Pump
  name: Pump
  description: Container for pump assets with operational data
spec:
  usedFor: node
  properties:
    flowRate:
      type: float64
      description: Current flow rate in m3/h
    pressure:
      type: float64
      description: Operating pressure in bar
    efficiency:
      type: float64
      description: Pump efficiency percentage
    lastMaintenance:
      type: timestamp
      description: Last maintenance date
    mtbf:
      type: float64
      description: Mean time between failures in hours
```

# @file:src/transformations/asset_hierarchy.py
from cognite.client import CogniteClient
from cognite.client.data_classes import Asset
from typing import List, Dict

def build_asset_hierarchy(
    client: CogniteClient,
    source_data: List[Dict]
) -> List[Asset]:
    """Build asset hierarchy from source system data.

    Args:
        client: CogniteClient instance
        source_data: Raw asset data from source system

    Returns:
        List of Asset objects ready for upload
    """
    assets = []

    # Create asset mapping for parent-child relationships
    asset_map = {item['TAG']: item for item in source_data}

    for item in source_data:
        asset = Asset(
            external_id=f"sap:{item['TAG']}",
            name=item['DESCRIPTION'],
            parent_external_id=f"sap:{item['PARENT_TAG']}" if item.get('PARENT_TAG') else None,
            metadata={
                'source': 'SAP',
                'equipmentType': item['EQUIPMENT_TYPE'],
                'location': item['LOCATION'],
                'criticality': item['CRITICALITY']
            },
            labels=[item['EQUIPMENT_TYPE'], item['LOCATION']]
        )
        assets.append(asset)

    return assets

# @file:config/modules/asset_monitoring/views/Pump.view.yaml
apiVersion: v1
kind: View
metadata:
  externalId: Pump
  name: Pump
  description: View for pump assets with relationships and time series
spec:
  implements:
    - type: view
      space: cdf_cdm
      externalId: CogniteAsset
      version: v1
  properties:
    # Inherited from CogniteAsset
    name:
      container:
        space: cdf_cdm
        externalId: CogniteAsset
      containerPropertyIdentifier: name
    # Custom pump properties
    flowRate:
      container:
        externalId: Pump
      containerPropertyIdentifier: flowRate
    pressure:
      container:
        externalId: Pump
      containerPropertyIdentifier: pressure
  # Time series links
  timeSeries:
    - source: pi
      pattern: "pi:pump:{{externalId}}:flow_rate"
      property: flowRate
    - source: pi
      pattern: "pi:pump:{{externalId}}:pressure"
      property: pressure

#### Example: CDF Maintenance Management Feature

**Context Setup for Maintenance Workflow:**

# @file:config/modules/maintenance/containers/MaintenanceOrder.container.yaml
apiVersion: v1
kind: Container
metadata:
  externalId: MaintenanceOrder
  name: MaintenanceOrder
  description: Container for maintenance work orders
spec:
  usedFor: node
  properties:
    orderNumber:
      type: text
      description: SAP work order number
    orderType:
      type: text
      description: Type of maintenance (PM01, PM02, PM03)
    priority:
      type: int32
      description: Priority level (1-5)
    plannedStart:
      type: timestamp
      description: Planned start date
    plannedEnd:
      type: timestamp
      description: Planned completion date
    actualStart:
      type: timestamp
      nullable: true
    actualEnd:
      type: timestamp
      nullable: true
    status:
      type: text
      description: Current status (CRTD, REL, TECO, CLSD)
    cost:
      type: float64
      nullable: true
      description: Total cost in USD

# @file:src/functions/maintenance_analytics.py
from datetime import datetime
from typing import Dict, List

import pandas as pd
from cognite.client import CogniteClient

def calculate_maintenance_kpis(
    client: CogniteClient,
    asset_ids: List[int],
    start_date: datetime,
    end_date: datetime
) -> Dict[str, float]:
    """Calculate maintenance KPIs for assets.

    Args:
        client: Cognite client instance
        asset_ids: List of asset IDs to analyze
        start_date: Start of analysis period
        end_date: End of analysis period

    Returns:
        Dictionary of KPI values
    """
    # Fetch maintenance events
    events = client.events.list(
        asset_ids=asset_ids,
        type="maintenance",
        start_time={"min": start_date, "max": end_date},
        limit=None
    )

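    # Convert to DataFrame for analysis; skip events without an end time or asset link
    df = pd.DataFrame([
        {
            'asset_id': e.asset_ids[0],
            'type': e.subtype,
            'duration': (e.end_time - e.start_time) / 3600000,  # ms -> hours
            'cost': float((e.metadata or {}).get('cost') or 0)  # metadata values are strings
        }
        for e in events if e.end_time and e.asset_ids
    ])

    # Avoid division by zero when the period contains no completed events
    if df.empty:
        return {'mttr': 0.0, 'maintenance_cost': 0.0, 'preventive_ratio': 0.0, 'emergency_ratio': 0.0}

    # Calculate KPIs
    kpis = {
        'mttr': df.groupby('asset_id')['duration'].mean().mean(),  # Mean Time To Repair
        'maintenance_cost': df['cost'].sum(),
        'preventive_ratio': (df['type'] == 'preventive').mean(),
        'emergency_ratio': (df['type'] == 'emergency').mean()
    }

    return kpis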

-- @file:transformations/maintenance_order_transformation.sql
-- Transform SAP PM orders to CDF maintenance events
SELECT
  node_reference('MaintenanceOrder', concat('sap:pm:', AUFNR)) as id,
  'maintenance' as type,
  CASE
    WHEN AUART = 'PM01' THEN 'preventive'
    WHEN AUART = 'PM02' THEN 'corrective'
    WHEN AUART = 'PM03' THEN 'emergency'
    ELSE 'other'
  END as subtype,
  KTEXT as description,
  GSTRP as startTime,
  GETRI as endTime,
  array(node_reference('Asset', concat('sap:', TPLNR))) as assetIds,
  to_json(object(
    'orderNumber', AUFNR,
    'priority', PRIOK,
    'status', STTXT,
    'cost', ACTUAL_COST,
    'plannedCost', PLANNED_COST,
    'technician', RESPONSIBLE
  )) as metadata
FROM
  `raw_sap`.`pm_orders`
WHERE
  AUFNR IS NOT NULL
  AND GSTRP >= CURRENT_TIMESTAMP - INTERVAL '90 days'

#### Example: CDF Time Series Analytics Feature

**Context for Time Series Processing:**

# @file:src/analytics/time_series_processor.py
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeries, DatapointsList
from typing import List, Dict, Tuple
import numpy as np
import pandas as pd

class TimeSeriesAnalyzer:
    def __init__(self, client: CogniteClient):
        self.client = client
        self.aggregation_methods = ['average', 'min', 'max', 'sum', 'count']

    def calculate_asset_oee(
        self,
        asset_external_id: str,
        start_time: str,
        end_time: str
    ) -> Dict[str, float]:
        """Calculate Overall Equipment Effectiveness (OEE) for an asset.

        OEE = Availability × Performance × Quality

        Args:
            asset_external_id: External ID of the asset
            start_time: Start of calculation period
            end_time: End of calculation period

        Returns:
            Dictionary with OEE components and overall score
        """
        # Define time series external IDs
        ts_mapping = {
            'runtime': f'pi:{asset_external_id}:runtime:hours',
            'planned_runtime': f'pi:{asset_external_id}:planned_runtime:hours',
            'actual_output': f'pi:{asset_external_id}:production:units',
            'target_output': f'pi:{asset_external_id}:target_production:units',
            'good_units': f'pi:{asset_external_id}:good_units:count',
            'total_units': f'pi:{asset_external_id}:total_units:count'
        }

        # Fetch all required time series
        datapoints = self.client.time_series.data.retrieve(
            external_id=list(ts_mapping.values()),
            start=start_time,
            end=end_time,
            aggregates=['sum'],
            granularity='1d'
        )

        # Calculate OEE components
        availability = self._calculate_availability(datapoints)
        performance = self._calculate_performance(datapoints)
        quality = self._calculate_quality(datapoints)

        return {
            'availability': availability,
            'performance': performance,
            'quality': quality,
            'oee': availability * performance * quality,
            'calculation_period': f'{start_time} to {end_time}'
        }

# @file:config/functions/oee_calculator.function.yaml
apiVersion: v1
kind: CogniteFunction
metadata:
  externalId: calculate_oee_daily
  name: Daily OEE Calculator
  description: Calculate OEE metrics for all production assets
spec:
  runtime: python39
  handler: handler.main
  schedule:
    cronExpression: "0 2 * * *"  # Run daily at 2 AM UTC
  env:
    - name: AGGREGATION_WINDOW
      value: "24h"
    - name: ASSET_FILTER
      value: "type:production_line"
  secrets:
    - name: CDF_PROJECT
      value: my-project
  cpu: 0.5
  memory: 512
  timeout: 300

# @file:src/transformations/time_series_aggregation.py
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeries

def create_aggregated_timeseries(
    client: CogniteClient,
    source_ts_id: int,  # internal CDF IDs are integers
    target_ts_external_id: str,
    aggregation: str = 'average',
    granularity: str = '1h'
) -> TimeSeries:
    """Create aggregated time series for performance optimization.

    Args:
        client: CogniteClient instance
        source_ts_id: Source time series ID
        target_ts_external_id: External ID for aggregated series
        aggregation: Aggregation method
        granularity: Time granularity

    Returns:
        Created aggregated time series
    """
    # Get source time series metadata
    source_ts = client.time_series.retrieve(id=source_ts_id)

    # Create aggregated time series
    aggregated_ts = client.time_series.create(
        TimeSeries(
            external_id=target_ts_external_id,
            name=f"{source_ts.name} ({aggregation} per {granularity})",
            description=f"Aggregated {aggregation} values per {granularity}",
            asset_id=source_ts.asset_id,
            unit=source_ts.unit,
            metadata={
                **source_ts.metadata,
                'aggregation_method': aggregation,
                'aggregation_granularity': granularity,
                'source_timeseries': source_ts.external_id
            }
        )
    )

    # Set up continuous aggregation job
    client.transformations.create(
        external_id=f"aggregate_{target_ts_external_id}",
        name=f"Aggregate {source_ts.name}",
        query=f"""
        SELECT
          date_trunc('{granularity}', timestamp) as timestamp,
          {aggregation}(value) as value
        FROM
          timeseries('{source_ts.external_id}')
        GROUP BY
          date_trunc('{granularity}', timestamp)
        """,
        destination={
            "type": "timeseries",
            "external_id": target_ts_external_id
        },
        schedule="0 * * * *"  # Run hourly
    )

    return aggregated_ts

### 3. Task-Level Context

#### Example: Implementing Asset Search in CDF

I need to implement an asset search feature using CDF that:

**Functional Requirements:**
- Search across asset name, description, and metadata fields
- Support fuzzy matching for equipment tags (e.g., "P101" matches "P-101")
- Filter by asset type, location, criticality, and status
- Include parent/child assets in results optionally
- Search within specific asset subtrees
- Support saved search queries per user

**CDF Integration Requirements:**
- Use CDF search API with proper query syntax
- Implement search across multiple asset properties:
  ```python
  search_filter = {
      "search": {
          "name": search_term,
          "description": search_term
      },
      "filter": {
          "labels": {"containsAny": selected_labels},
          "metadata": {"location": location_filter}
      }
  }
  ```
- Handle pagination with cursor-based navigation
- Respect user's access permissions (capability-based filtering)

**Performance Requirements:**
- Search results return in < 500ms for up to 10,000 assets
- Implement client-side caching of recent searches
- Use CDF's search suggestions endpoint for autocomplete
- Batch API requests when fetching related assets

**UI Requirements:**
- Display asset hierarchy in tree view
- Show asset preview with key metadata
- Highlight matching terms in results
- Include asset type icons and criticality indicators
- Export search results to CSV/Excel

**Acceptance Criteria:**
- Search works across all indexed asset properties
- Filters correctly narrow results
- Performance meets targets with large asset hierarchies
- Search history persists across sessions
- Results respect data access permissions
- Export includes all relevant asset metadata
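
The fuzzy tag requirement in this task ("P101" matches "P-101") can be approximated client-side by normalizing tags before comparison. This is one possible approach, not the CDF search API's own behaviour, and `normalize_tag` and `match_tags` are hypothetical helpers.

```python
import re


def normalize_tag(tag: str) -> str:
    """Uppercase the tag and strip everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", tag.upper())


def match_tags(query: str, tags: list[str]) -> list[str]:
    """Return the tags whose normalized form contains the normalized query."""
    needle = normalize_tag(query)
    if not needle:
        return []
    return [tag for tag in tags if needle in normalize_tag(tag)]
```

Substring matching is deliberately permissive here, so "P101" also matches "P-1010". A stricter variant would compare the normalized strings for equality.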

#### Example: Debugging CDF Query Performance
I'm experiencing slow asset hierarchy loading in our CDF application:

**Current Behavior:**
- Asset tree takes 15-20 seconds to load for facility with 5,000 assets
- CDF API calls show high latency (3-5 seconds per request)
- Browser freezes when expanding large asset nodes
- Memory usage spikes to 1.5GB when loading full hierarchy

**Expected Behavior:**
- Initial load should complete in < 3 seconds
- Smooth expansion of asset nodes without UI freeze
- Incremental loading of child assets
- Memory usage under 300MB

**What I've Checked:**
- Using latest Cognite SDK version (7.13.0)
- API requests include proper projections to limit fields
- Implemented pagination with 1000 items per request
- Network latency to CDF cluster is normal (< 50ms)

**Current Implementation:**
```python
# Loading entire hierarchy at once
assets = client.assets.list(
    asset_subtree_ids=[root_asset_id],
    limit=None  # This might be the issue
)
```

**Environment:**
- Cognite Python SDK 7.13.0
- CDF Project: acme-production
- Asset count: ~50,000 total, ~5,000 in this facility
- Running from Azure East US, CDF cluster in Europe

**Code Context:**
- @file:src/services/asset_service.py (asset loading logic)
- @file:src/components/AssetTree.tsx (React component)
- @file:src/hooks/useAssetHierarchy.ts (data fetching hook)

#### Example: CDF Events Integration Task
Implement event tracking system for equipment failures in CDF:

**Integration Requirements:**
- Create events in CDF for all equipment failures
- Link events to affected assets and work orders
- Include root cause analysis data in event metadata
- Support batch event creation for historical data import
- Integrate with notification system for critical events

**Event Schema:**
```python
event = Event(
    external_id=f"failure:{asset_id}:{timestamp}",
    type="equipment_failure",
    subtype=failure_category,  # mechanical, electrical, instrumentation
    description=failure_description,
    asset_ids=[affected_asset_id],
    start_time=failure_start,
    end_time=failure_end,
    metadata={
        "severity": "critical|high|medium|low",
        "root_cause": root_cause_code,
        "detection_method": "sensor|inspection|operator",
        "downtime_hours": downtime_duration,
        "production_loss": units_lost,
        "repair_cost": estimated_cost
    }
)
```

**Data Quality Requirements:**
- Validate asset IDs exist before creating events
- Ensure start_time < end_time
- Require minimum metadata fields for analysis
- Check for duplicate events (same asset + time)
- Validate against business rules (e.g., severity thresholds)

**Performance Requirements:**
- Batch create up to 1000 events per API call
- Process historical data (100k+ events) within 1 hour
- Real-time event creation latency < 1 second
- Support concurrent event streams from multiple sources

**Integration Points:**
- Pull failure data from maintenance system (SAP PM)
- Trigger notifications via CDF Functions
- Update asset metadata with failure counts
- Generate KPI metrics for dashboards

**Testing Requirements:**
- Unit tests for event validation logic
- Integration tests with CDF SDK
- Load tests for batch import performance
- End-to-end test including notifications
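
To show the kind of code the data quality requirements in this task point toward, here is a sketch of an event validator that works on plain dictionaries. The set of required metadata keys is an assumption, since the prompt only says "minimum metadata fields", and `validate_failure_event` is a hypothetical name.

```python
from typing import Any, Mapping

REQUIRED_METADATA = ("severity", "root_cause", "detection_method")  # assumed minimum set
ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}


def validate_failure_event(
    event: Mapping[str, Any],
    known_asset_ids: set[int],
    seen_keys: set[tuple[int, int]],
) -> list[str]:
    """Return a list of validation errors; an empty list means the event is acceptable.

    `seen_keys` holds (asset_id, start_time) pairs of events accepted so far and is
    updated in place so that later duplicates can be detected.
    """
    errors: list[str] = []
    asset_ids = event.get("asset_ids") or []
    start, end = event.get("start_time"), event.get("end_time")
    metadata = event.get("metadata") or {}

    unknown = [a for a in asset_ids if a not in known_asset_ids]
    if not asset_ids:
        errors.append("event has no asset_ids")
    elif unknown:
        errors.append(f"unknown asset ids: {unknown}")

    if start is None or end is None or start >= end:
        errors.append("start_time must be before end_time")

    missing = [key for key in REQUIRED_METADATA if key not in metadata]
    if missing:
        errors.append(f"missing metadata: {missing}")
    if "severity" in metadata and metadata["severity"] not in ALLOWED_SEVERITIES:
        errors.append(f"invalid severity: {metadata['severity']!r}")

    if start is not None:
        for asset_id in asset_ids:
            key = (asset_id, start)
            if key in seen_keys:
                errors.append(f"duplicate event for asset {asset_id} at {start}")
            else:
                seen_keys.add(key)
    return errors
```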

## 📈 Benefits of Context Engineering

### Reduced Refactoring
When AI assistants understand the project architecture upfront, developers report a **60-80% reduction** in manual code refactoring time.

### Accelerated Debugging
With system-level context, debugging shifts from reactive symptom-chasing to proactive root-cause analysis.

### Enhanced Consistency
Centralized context ensures all developers and AI assistants build to the same standards, leading to more uniform, maintainable codebases.

### Improved Onboarding
New team members can leverage the same context as experienced developers, accelerating their productivity.

## 🔄 The Virtuous Cycle

Context engineering creates a positive feedback loop:

1. **Better Documentation** → Forces teams to codify implicit knowledge
2. **Better AI Output** → AI produces code that adheres to standards
3. **Better Developer Adherence** → All developers follow established patterns
4. **Better Failure Analysis** → Structured analysis identifies context gaps
5. **Even Better Documentation** → Continuous improvement of shared knowledge

## 🎯 Best Practices

### Start Simple
Begin with basic project-level context in `.cursor/rules.md` and gradually expand as needed.

### Be Specific
Include concrete examples, not just abstract principles. Show the AI exactly what you want.

### Keep Current
Regularly update context as your project evolves. Outdated context is worse than no context.

### Validate Results
Monitor AI outputs and refine context based on what works and what doesn't.

### Share Knowledge
Make context engineering a team practice. Share successful patterns and learn from failures.

## 📊 Before/After: The Context Impact

### Without Context: Generic, Error-Prone Output

**Prompt**: "Create a function to fetch asset data"

**AI Output** (without context):
```python
def get_asset_data(asset_id):
    # Generic implementation
    response = requests.get(f"https://api.example.com/assets/{asset_id}")
    return response.json()
```

❌ **Problems:**

- No CDF SDK usage
- No error handling
- No authentication
- No type hints
- Doesn't follow project patterns

### With Context: Production-Ready Code

**Same Prompt** with proper context:

**AI Output** (with context):

from typing import Optional, List, Dict
from cognite.client import CogniteClient
from cognite.client.data_classes import Asset
from cognite.client.exceptions import CogniteAPIError
import logging
from src.config import get_cdf_client
from src.utils.retry import exponential_backoff

logger = logging.getLogger(__name__)

@exponential_backoff(max_retries=3)
def get_asset_with_hierarchy(
    asset_external_id: str,
    include_subtree: bool = False,
    client: Optional[CogniteClient] = None
) -> Optional[Asset]:
    """Fetch asset data with optional subtree from CDF.

    Args:
        asset_external_id: External ID of the asset to fetch
        include_subtree: Whether to include all descendant assets
        client: Optional CogniteClient instance (uses default if None)

    Returns:
        The Asset, or None if no asset has this external ID

    Raises:
        CogniteAPIError: If the API request fails
    """
    if not client:
        client = get_cdf_client()

    try:
        # retrieve() returns None when the external ID is unknown
        asset = client.assets.retrieve(external_id=asset_external_id)

        if not asset:
            logger.warning(f"Asset not found: {asset_external_id}")
            return None

        # Optionally fetch subtree
        if include_subtree:
            subtree_assets = client.assets.list(
                asset_subtree_external_ids=[asset_external_id],
                limit=None
            )
            asset.subtree = subtree_assets

        logger.info(
            f"Successfully fetched asset: {asset.name}",
            extra={
                "asset_external_id": asset_external_id,
                "subtree_count": len(subtree_assets) if include_subtree else 0
            }
        )

        return asset

    except CogniteAPIError as e:
        logger.error(
            f"Failed to fetch asset: {asset_external_id}",
            extra={"error": str(e), "code": e.code}
        )
        raise

✅ Improvements:

Uses Cognite SDK properly
Comprehensive error handling
Follows project logging standards
Includes retry logic
Type hints throughout
Proper documentation
Performance optimizations

## 🤝 Team Collaboration Strategies

### Shared Context Repository

Create a centralized context repository that all team members contribute to:

team-context/
├── .cursor/
│   ├── rules.md                    # Main project rules
│   ├── rules.backend.md            # Backend-specific rules
│   ├── rules.frontend.md           # Frontend-specific rules
│   └── rules.data-pipeline.md     # Data pipeline rules
├── examples/
│   ├── asset-hierarchy/            # Example implementations
│   ├── time-series-processing/
│   └── transformations/
├── templates/
│   ├── function-template.py
│   ├── transformation-template.sql
│   └── container-template.yaml
└── docs/
    ├── patterns.md                 # Documented patterns
    └── anti-patterns.md           # What to avoid

### Context Review Process

**Context Pull Requests**: Treat context updates like code

git checkout -b update/context-maintenance-patterns
# Edit .cursor/rules.md
git add .cursor/rules.md
git commit -m "feat(context): add maintenance event patterns"
git push origin update/context-maintenance-patterns

**Weekly Context Sync**: Team meeting agenda

- Review AI output failures
- Identify missing context
- Update rules collaboratively
- Share successful patterns

**Context Ownership**: Assign domain experts

- Data Models: Data Architect owns model patterns
- Transformations: Data Engineer owns SQL patterns
- Functions: Backend Lead owns Python patterns
- Security: Security Lead owns access patterns

Knowledge Sharing Workflows¶

# .github/workflows/context-validation.yml
name: Validate Context Updates

on:
  pull_request:
    paths:
      - '.cursor/**'
      - 'team-context/**'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Validate Markdown Syntax
        run: |
          npm install -g markdownlint-cli
          markdownlint .cursor/rules.md

      - name: Check for Placeholders
        run: |
          if grep -r "TODO\|FIXME\|<REPLACE_ME>" .cursor/; then
            echo "Found unresolved placeholders in .cursor/"
            exit 1
          fi

      - name: Validate Code Examples
        run: |
          python scripts/validate_context_examples.py
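The workflow calls `scripts/validate_context_examples.py`, which is not shown here. As a rough sketch of what such a script might do, the function below extracts fenced Python examples from a rules file and checks that each one at least compiles. It assumes examples are written in fenced `python` blocks and checks syntax only; `find_invalid_examples` is a hypothetical name.

```python
import re

# Build the fence marker at runtime so this code can itself sit inside a fence.
FENCE = "`" * 3
EXAMPLE_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def find_invalid_examples(markdown_text: str) -> list[str]:
    """Return one error message per Python example that fails to compile."""
    errors = []
    for index, match in enumerate(EXAMPLE_RE.finditer(markdown_text), start=1):
        try:
            # compile() checks syntax only; the example is never executed.
            compile(match.group(1), f"<example {index}>", "exec")
        except SyntaxError as exc:
            errors.append(f"example {index}: {exc.msg} (line {exc.lineno})")
    return errors
```

In CI, a script built on this would read `.cursor/rules.md`, print the returned messages, and exit with a non-zero status if the list is not empty.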

🔧 Troubleshooting Common Context Issues¶

Issue: AI Ignores Project Context¶

Symptoms: AI generates generic code despite .cursor/rules.md

Solutions:

Check file location: Must be exactly .cursor/rules.md in project root
Verify file detection:

ls -la .cursor/rules.md
# Should show file with content

Test with explicit reference:

Following our project rules in .cursor/rules.md, create a function to...

Restart Cursor IDE: Sometimes requires fresh context load

Issue: Inconsistent AI Outputs¶

Symptoms: Same prompt gives different styles/patterns

Solutions:

Add concrete examples:

## Good Example
```python
def calculate_oee(asset_id: str) -> float:
    """ALWAYS include docstrings like this."""
    pass
```

## Bad Example
```python
def calc_oee(id):  # NO: Missing types and docstring
    pass
```

Use explicit constraints:

## MANDATORY Rules
- ALWAYS use type hints
- NEVER use print() - use logger instead
- ALWAYS handle CogniteAPIError
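Explicit constraints are easier to enforce when generated code can be checked against them automatically. The sketch below uses Python's standard `ast` module to flag two of the rules above: `print()` calls and functions without type hints. It does not check `CogniteAPIError` handling, and `find_rule_violations` is an illustrative name rather than an existing tool.

```python
import ast


def find_rule_violations(source: str) -> list[str]:
    """Report print() calls and functions that lack full type hints."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            found.append((node.lineno, "print() call, use logger"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            untyped = [
                a.arg
                for a in args
                if a.annotation is None and a.arg not in ("self", "cls")
            ]
            if untyped or node.returns is None:
                found.append((node.lineno, f"{node.name}() is missing type hints"))
    # Sort by line number so the report reads top to bottom.
    return [f"line {lineno}: {message}" for lineno, message in sorted(found)]
```

Running it on the bad example (`def calc_oee(id): ...`) reports the missing type hints, while the good example produces an empty list.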

Issue: Context Too Large/Slow¶

Symptoms: AI responses are slow or incomplete

Solutions:

Split context by domain:

.cursor/
├── rules.md          # Core rules only (< 1000 lines)
├── examples.md       # Separate examples file
└── domain/
    ├── assets.md     # Asset-specific rules
    └── timeseries.md # Time series rules

Use context hierarchy:

# Core Rules (Always Active)
Essential patterns here...

# Extended Rules (Reference when needed)
See .cursor/domain/specific-feature.md
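To notice when a context file has outgrown its budget, a line-count check can run alongside the other validations. This is a minimal sketch that applies the 1000-line limit suggested for `rules.md` to every Markdown file under the context directory, which is an assumption; `oversized_context_files` is a hypothetical helper.

```python
from pathlib import Path

MAX_CORE_LINES = 1000  # core rules should stay under this many lines


def oversized_context_files(
    context_dir: str, max_lines: int = MAX_CORE_LINES
) -> dict[str, int]:
    """Map each Markdown file at or over max_lines to its line count."""
    oversized = {}
    for path in sorted(Path(context_dir).rglob("*.md")):
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count >= max_lines:
            oversized[path.relative_to(context_dir).as_posix()] = line_count
    return oversized
```

Files that show up in the result are candidates for splitting into the `domain/` files shown above.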

Issue: Conflicting Team Patterns¶

Symptoms: Different team members' code conflicts

Solutions:

Version control context:

git log -p .cursor/rules.md  # Review context history

Context linting:

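# scripts/lint_context.py
# Sketch only: parse_rules, find_contradictions, and ContextConflictError
# are project-specific helpers that you need to define yourself.
def check_context_conflicts(file_path):
    """Raise ContextConflictError if the rules file contradicts itself."""
    rules = parse_rules(file_path)
    conflicts = find_contradictions(rules)
    if conflicts:
        raise ContextConflictError(conflicts)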
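The helpers used in the linting snippet are not defined there. One possible interpretation, sketched below, treats a conflict as the same rule text appearing under both `ALWAYS` and `NEVER`. It assumes rules are written as `- ALWAYS ...` and `- NEVER ...` bullets, takes the file contents rather than a path, and only catches exact-text contradictions.

```python
import re

# Matches bullets such as "- ALWAYS use type hints" or "- NEVER use print()".
RULE_RE = re.compile(r"^\s*-\s*(ALWAYS|NEVER)\s+(.+?)\s*$", re.MULTILINE)


class ContextConflictError(Exception):
    """Raised when the same rule text appears under both ALWAYS and NEVER."""


def find_contradictions(markdown_text: str) -> list[str]:
    """Return rule texts that are both required and forbidden."""
    always, never = set(), set()
    for keyword, text in RULE_RE.findall(markdown_text):
        (always if keyword == "ALWAYS" else never).add(text.lower())
    return sorted(always & never)


def check_context_conflicts(markdown_text: str) -> None:
    """Raise ContextConflictError if any contradiction is found."""
    conflicts = find_contradictions(markdown_text)
    if conflicts:
        raise ContextConflictError(conflicts)
```

Subtler conflicts, such as two rules that prescribe different logging libraries, still need a human reviewer.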

🚀 Getting Started¶

Create .cursor/rules.md in your project root
Use a context template (see below) as a starting point
Document your tech stack and coding standards
Include real examples from your codebase
Test and iterate based on AI outputs
Share with team via version control

📋 Context Templates for Different Domains¶

CDF Data Modeling Context Template¶

# CDF Data Modeling Project Context

## Project Overview
- **Project**: [Project Name]
- **CDF Project**: [CDF project name]
- **Data Space**: sp_[project_identifier]
- **Purpose**: [Brief description]

## Data Model Standards
- Extend Cognite Core Data Model (CDM) v1
- Use containers for data storage
- Use views for data consumption
- Implement proper inheritance hierarchies

## Naming Conventions
- Containers: PascalCase (e.g., `PumpStation`)
- Views: PascalCase matching container
- Properties: camelCase (e.g., `maxFlowRate`)
- External IDs: snake_case (e.g., `pump_station_01`)

## Required Properties
All business objects must include:
- `name` (inherited from CDM)
- `sourceId` (reference to source system)
- `sourceUpdatedTime` (last update in source)
- `validityStatus` (active|inactive|deleted)

## Relationship Patterns
[Include your specific patterns]

## Example Implementation
[Include a complete example]
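The naming conventions in this template can be expressed as regular expressions, which makes them checkable in review tooling. The sketch below is one reading of the conventions; the exact patterns and the `follows_convention` helper are assumptions, not an official Cognite validator.

```python
import re

# One regular expression per convention listed in the template.
NAMING_PATTERNS = {
    "container": re.compile(r"[A-Z][a-zA-Z0-9]*"),            # PascalCase
    "view": re.compile(r"[A-Z][a-zA-Z0-9]*"),                 # PascalCase
    "property": re.compile(r"[a-z][a-zA-Z0-9]*"),             # camelCase
    "external_id": re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*"),  # snake_case
}


def follows_convention(kind: str, name: str) -> bool:
    """Return True if name matches the convention for the given kind."""
    return NAMING_PATTERNS[kind].fullmatch(name) is not None
```

For example, `follows_convention("container", "PumpStation")` is true, while `follows_convention("property", "MaxFlowRate")` is false because properties start with a lowercase letter.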

CDF Functions Context Template¶

# CDF Functions Project Context

## Function Standards
- Python 3.11 runtime
- Use Cognite Function handler pattern
- Include comprehensive error handling
- Return structured responses

## Handler Pattern
```python
import logging

from cognite.client import CogniteClient

logger = logging.getLogger(__name__)


def handle(client: CogniteClient, data: dict, secrets: dict) -> dict:
    """Standard function handler.

    Args:
        client: Authenticated CogniteClient
        data: Input parameters
        secrets: Environment secrets

    Returns:
        Structured response dict
    """
    try:
        # Implementation: replace this placeholder with your own logic
        result = {}
        return {"status": "success", "result": result}
    except Exception as e:
        # Log the failure and return a structured error instead of raising
        logger.error(f"Function failed: {e}")
        return {"status": "error", "error": str(e)}
```

## Deployment Configuration
[Include your deployment patterns]
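A handler written in this style can be exercised locally before deployment by passing a stand-in client. The sketch below uses `unittest.mock.MagicMock` in place of an authenticated `CogniteClient`; the asset lookup, the `asset_external_id` input key, and the returned fields are illustrative assumptions.

```python
import logging
from unittest.mock import MagicMock

logger = logging.getLogger(__name__)


def handle(client, data: dict, secrets: dict) -> dict:
    """Example handler that follows the structured-response pattern."""
    try:
        asset = client.assets.retrieve(external_id=data["asset_external_id"])
        return {"status": "success", "result": {"name": asset.name}}
    except Exception as e:
        logger.error(f"Function failed: {e}")
        return {"status": "error", "error": str(e)}


# Stand-in for an authenticated CogniteClient; no network calls are made.
fake_client = MagicMock()
fake_client.assets.retrieve.return_value.name = "Pump 01"

ok = handle(fake_client, {"asset_external_id": "pump_station_01"}, {})
failed = handle(fake_client, {}, {})  # missing input key hits the error branch
```

Because the handler returns a dict on both paths, callers can branch on `status` without wrapping every call in their own `try` block.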

CDF Transformations Context Template¶

# CDF Transformations Project Context

## SQL Standards
- Use Spark SQL syntax
- Include NULL handling
- Use CTEs for readability
- Add comments for complex logic

## Transformation Patterns
- RAW to Staging: Data cleaning and validation
- Staging to Core: Business logic and relationships
- Core to Mart: Aggregations and reporting

## Node Reference Pattern
```sql
SELECT
  -- node_reference takes (space, externalId); the space name is a placeholder
  node_reference('sp_[project_identifier]', external_id) as id,
  property1,
  property2,
  node_reference('sp_[project_identifier]', related_id) as relationshipName
FROM source_table
```

## Error Handling
[Include your error handling patterns]

CDF InField App Context Template¶

# CDF InField App Project Context

## Mobile App Architecture
- React Native with TypeScript
- Offline-first with sync queue
- Cognite SDK for all CDF operations
- Biometric authentication required

## Offline Patterns
- Cache asset hierarchies locally
- Queue mutations when offline
- Conflict resolution: Server wins
- Sync on connection restore

## UI/UX Standards
- Follow Cognite Design System
- Support iOS and Android patterns
- Minimum touch targets: 44pt
- Loading states for all async operations

## Performance Requirements
[Include your specific requirements]

📖 Next Steps¶

Quick Start Guide - Practical setup instructions
Advanced Prompting - Combine context with sophisticated prompting
AI Workflows - Real-world implementation examples

🎯 Success Metrics¶

Track these metrics to measure context engineering success. Treat the figures below as targets to aim for, not guaranteed outcomes:

Productivity Metrics¶

Time to Working Code: 60-80% reduction after context implementation
Refactoring Required: < 20% of AI-generated code needs changes
Review Comments: 70% reduction in style-related PR comments
Onboarding Time: New developers productive in 2-3 days vs 2-3 weeks

Quality Metrics¶

Code Consistency: 95%+ adherence to patterns
Error Rates: 50% reduction in common mistakes
Test Coverage: AI generates tests meeting coverage requirements
Documentation: 100% of generated code includes proper docs

Team Metrics¶

Context Contributions: Each team member contributes monthly
Pattern Reuse: 80% of new code uses established patterns
Knowledge Sharing: Weekly context improvements
Satisfaction: Team reports higher confidence in AI tools
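Several of these metrics are percentage reductions against a baseline measured before introducing context. A minimal sketch of the arithmetic follows; the function names are illustrative, and any numbers you feed in should come from your own measurements.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


def refactor_rate(changed: int, generated: int) -> float:
    """Share of AI-generated changes that needed manual rework, in percent."""
    if generated <= 0:
        raise ValueError("generated must be positive")
    return changed * 100 / generated
```

As a purely hypothetical illustration, a task that took 8 hours before and 2 hours after gives `percent_reduction(8, 2)`, a 75% reduction, which falls inside the 60-80% target range.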

Ready to implement context engineering? Start with the Quick Start Guide for immediate, practical steps.
