Object-Centric Event Data Management for Python
Repository: VincenzoImp/python-oced

python-oced is a Python library for managing Object-Centric Event Data (OCED). It provides a framework for modeling business processes in which a single event can involve multiple objects, each with its own relations and attributes.

Features#

  • Complete OCED Implementation: Full support for objects, relations, attributes, and events
  • Rich Qualifier System: Create, delete, modify, and involve operations for all entity types
  • State Management: Real-time current state tracking with historical event logging
  • Multiple Export Formats: JSON and XML serialization/deserialization
  • Flexible Event Modeling: Support for complex event structures with multiple qualifiers
  • Data Integrity: Comprehensive validation and constraint checking

Installation#

Install the two dependencies, pandas and python-dateutil:

pip install pandas
pip install python-dateutil

Quick Start#

from OCED import *
from datetime import datetime

# Create OCED model
oced_model = OCED()

# Create qualifiers for an event
qualifiers = []
qualifiers.append(create_object('customer_1', 'Customer'))
qualifiers.append(create_object_attribute_value('attr_1', 'customer_1', 'name', 'John Doe'))
qualifiers.append(create_object_attribute_value('attr_2', 'customer_1', 'email', 'john@example.com'))

# Create event
event_time = datetime.now().isoformat()  # named event_time to avoid shadowing the time module
event = Event(event_time, 'customer_registration', qualifiers,
              {'channel': 'web', 'campaign': 'summer2024'})

# Insert event into model
oced_model.insert_event(event)

# Access current state
print(oced_model.current_state)

# Export to JSON
dump_json('model.json', oced_model)

Core Concepts#

Objects#

Objects represent entities in your process (customers, orders, products, etc.):

create_object('order_123', 'Order')
delete_object('order_123')
modify_object('order_123', 'CancelledOrder')
involve_object('order_123')  # Reference without modification

Object Relations#

Define relationships between objects:

create_object_relation('rel_1', 'customer_1', 'order_123', 'places')
delete_object_relation('rel_1')
modify_object_relation('rel_1', 'cancelled')

Object Attributes#

Add dynamic attributes to objects:

create_object_attribute_value('attr_1', 'order_123', 'total_amount', '299.99')
modify_object_attribute_value('attr_1', '349.99')
delete_object_attribute_value('attr_1')

Events#

Events group multiple qualifiers into atomic operations:

qualifiers = [
    create_object('payment_1', 'Payment'),
    create_object_relation('rel_2', 'order_123', 'payment_1', 'paid_by'),
    modify_object_attribute_value('status_attr', 'paid')
]

event = Event(
    event_time=datetime.now().isoformat(),
    event_type='payment_processed',
    qualifiers=qualifiers,
    event_attributes={'amount': '299.99', 'method': 'credit_card'}
)

Data Model#

The OCED model maintains several interconnected tables:

Core Tables#

  • event: Event metadata (id, type, timestamp)
  • object: Object instances with types and existence status
  • object_relation: Relationships between objects
  • object_attribute_value: Dynamic attributes for objects

Cross-Reference Tables#

  • event_x_object: Links events to involved objects
  • event_x_object_relation: Links events to relation changes
  • event_x_object_attribute_value: Links events to attribute changes
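
To illustrate how a cross-reference table ties events to objects, here is a minimal sketch that uses plain Python records instead of DataFrames. The column names `event_id` and `object_id`, the sample rows, and both helper functions are assumptions made for illustration; the library's actual schema may differ.

```python
# Hypothetical rows shaped like the event_x_object table.
# Column names are assumed, not taken from the library.
event_x_object = [
    {"event_id": "e1", "object_id": "customer_1", "qualifier_type": "CREATE"},
    {"event_id": "e2", "object_id": "order_456", "qualifier_type": "CREATE"},
    {"event_id": "e2", "object_id": "customer_1", "qualifier_type": "INVOLVE"},
]


def objects_touched_by(event_id, links):
    """Return (object_id, qualifier_type) pairs linked to one event."""
    return [
        (row["object_id"], row["qualifier_type"])
        for row in links
        if row["event_id"] == event_id
    ]


def events_touching(object_id, links):
    """Return the ids of all events that reference one object, in row order."""
    return [row["event_id"] for row in links if row["object_id"] == object_id]


print(objects_touched_by("e2", event_x_object))
print(events_touching("customer_1", event_x_object))
```

Because one event can appear in several rows, and one object can appear under several events, the table expresses the many-to-many link that makes the data object-centric.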

Lookup Tables#

  • event_type, object_type, object_relation_type: Type definitions
  • event_attribute_name, object_attribute_name: Attribute schemas

Advanced Usage#

Complex Event Example#

# Multi-step order processing event
qualifiers = [
    # Create order
    create_object('order_456', 'Order'),
    create_object_attribute_value('oa_1', 'order_456', 'status', 'pending'),
    create_object_attribute_value('oa_2', 'order_456', 'total', '150.00'),
    
    # Link to customer
    involve_object('customer_1'),
    create_object_relation('rel_3', 'customer_1', 'order_456', 'placed'),
    
    # Add line items
    create_object('item_1', 'LineItem'),
    create_object_attribute_value('oa_3', 'item_1', 'product_id', 'SKU123'),
    create_object_attribute_value('oa_4', 'item_1', 'quantity', '2'),
    create_object_relation('rel_4', 'order_456', 'item_1', 'contains')
]

event = Event(
    event_time=datetime.now().isoformat(),
    event_type='order_created',
    qualifiers=qualifiers,
    event_attributes={'source': 'mobile_app', 'promotion_code': 'SAVE10'}
)

oced_model.insert_event(event)

State Querying#

# Access current state
current_state = oced_model.current_state

# Check object existence before reading its details
if 'order_456' in current_state['object']:
    order_info = current_state['object']['order_456']
    print(f"Order type: {order_info['type']}")
    print(f"Is active: {order_info['existency']}")
    print(f"Related objects: {order_info['object_relation_ids']}")
    print(f"Attributes: {order_info['object_attribute_value_ids']}")

    # Access attribute values (kept inside the check so order_info is always defined)
    for attr_id in order_info['object_attribute_value_ids']:
        if attr_id in current_state['object_attribute_value']:
            attr = current_state['object_attribute_value'][attr_id]
            if attr['existency']:  # Only active attributes
                print(f"{attr['name']}: {attr['value']}")
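
The lookups above can be wrapped in a small helper. The sketch below is not part of the library: `active_attributes` is a hypothetical function, and `sample_state` is a hand-built dictionary that mirrors the `current_state` keys used in this section (`type`, `existency`, `object_attribute_value_ids`, `name`, `value`).

```python
def active_attributes(current_state, object_id):
    """Collect {name: value} for the attributes of one object that still exist."""
    obj = current_state["object"].get(object_id)
    if obj is None or not obj["existency"]:
        return {}
    result = {}
    for attr_id in obj["object_attribute_value_ids"]:
        attr = current_state["object_attribute_value"].get(attr_id)
        if attr is not None and attr["existency"]:
            result[attr["name"]] = attr["value"]
    return result


# Hand-built stand-in for oced_model.current_state (assumed shape).
sample_state = {
    "object": {
        "order_456": {
            "type": "Order",
            "existency": True,
            "object_relation_ids": ["rel_3"],
            "object_attribute_value_ids": ["oa_1", "oa_2", "oa_9"],
        }
    },
    "object_attribute_value": {
        "oa_1": {"name": "status", "value": "pending", "existency": True},
        "oa_2": {"name": "total", "value": "150.00", "existency": True},
        "oa_9": {"name": "note", "value": "old", "existency": False},
    },
}

print(active_attributes(sample_state, "order_456"))
```

Returning an empty dictionary for unknown or deleted objects keeps call sites free of existence checks, at the cost of not distinguishing "no attributes" from "no such object".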

Serialization#

JSON Export/Import#

# Export to JSON
dump_json('process_model.json', oced_model)

# Import from JSON
loaded_model = load_json('process_model.json')

XML Export/Import#

# Export to XML
dump_xml('process_model.xml', oced_model)

# Import from XML
loaded_model = load_xml('process_model.xml')

Data Access#

All data is accessible through pandas DataFrames:

# Access event log
events_df = oced_model.event
print(events_df.head())

# Access objects
objects_df = oced_model.object
active_objects = objects_df[objects_df['object_existency'] == True]

# Access relations
relations_df = oced_model.object_relation
customer_orders = relations_df[
    (relations_df['object_relation_type'] == 'places') & 
    (relations_df['object_relation_existency'] == True)
]

# Event-object associations
event_objects = oced_model.event_x_object
create_operations = event_objects[event_objects['qualifier_type'] == 'CREATE']

Validation Rules#

The system enforces several validation rules:

  • Temporal Consistency: Each event's timestamp must be later than that of the previously inserted event
  • Reference Integrity: Cannot delete/modify non-existent objects
  • Unique Identifiers: Object IDs, relation IDs, and attribute value IDs must be unique
  • Type Safety: All inputs must match expected types
  • Self-Reference Prevention: Objects cannot have relations to themselves
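
As a rough illustration of what such checks might look like, the sketch below implements three of the rules as standalone functions. The function names, signatures, and error messages are assumptions; the library performs its own validation internally and may do so differently. "Later than" is interpreted here as strictly later.

```python
from datetime import datetime


def validate_event_time(new_time, last_time):
    """Temporal consistency: the new ISO timestamp must be strictly later."""
    if last_time is None:
        return  # first event in the log, nothing to compare against
    if datetime.fromisoformat(new_time) <= datetime.fromisoformat(last_time):
        raise ValueError(f"event time {new_time} is not later than {last_time}")


def validate_relation(relation_id, source_id, target_id,
                      existing_relation_ids, existing_object_ids):
    """Check unique ids, reference integrity, and self-reference prevention."""
    if relation_id in existing_relation_ids:
        raise ValueError(f"relation id {relation_id!r} already exists")
    for object_id in (source_id, target_id):
        if object_id not in existing_object_ids:
            raise ValueError(f"object {object_id!r} does not exist")
    if source_id == target_id:
        raise ValueError("an object cannot be related to itself")


# Both calls pass silently because every rule is satisfied.
validate_event_time("2024-06-01T10:00:01", "2024-06-01T10:00:00")
validate_relation("rel_1", "customer_1", "order_123",
                  existing_relation_ids=set(),
                  existing_object_ids={"customer_1", "order_123"})
```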

Architecture#

Qualifier System#

All operations are modeled as qualifiers with four main types:

  • CREATE: Add new entities (objects, relations, attributes)
  • DELETE: Mark entities as non-existent (soft delete)
  • MODIFY: Change entity properties
  • INVOLVE: Reference entities without modification
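
The effect of the four qualifier types listed above on a single object can be sketched as follows. `QualifierType` and `apply_object_qualifier` are hypothetical names used only for this illustration, and the state layout (`type`, `existency`) is borrowed from the State Querying section; the library's internals may be organized differently.

```python
from enum import Enum


class QualifierType(Enum):
    CREATE = "CREATE"
    DELETE = "DELETE"
    MODIFY = "MODIFY"
    INVOLVE = "INVOLVE"


def apply_object_qualifier(objects, qualifier_type, object_id, object_type=None):
    """Apply one object qualifier to a dict of {object_id: {"type", "existency"}}."""
    if qualifier_type is QualifierType.CREATE:
        objects[object_id] = {"type": object_type, "existency": True}
    elif qualifier_type is QualifierType.DELETE:
        objects[object_id]["existency"] = False  # soft delete: the record stays
    elif qualifier_type is QualifierType.MODIFY:
        objects[object_id]["type"] = object_type
    elif qualifier_type is QualifierType.INVOLVE:
        pass  # referenced only; state is unchanged
    return objects


objects = {}
apply_object_qualifier(objects, QualifierType.CREATE, "order_123", "Order")
apply_object_qualifier(objects, QualifierType.MODIFY, "order_123", "CancelledOrder")
apply_object_qualifier(objects, QualifierType.INVOLVE, "order_123")
apply_object_qualifier(objects, QualifierType.DELETE, "order_123")
print(objects)
```

After the final call the object is still present in the dictionary with `existency` set to `False`, which is what distinguishes a soft delete from removing the record.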

State Management#

  • Current State: Real-time view of all active entities
  • Historical Log: Complete audit trail of all operations
  • Event Log: Structured event history with qualifier details

Data Integrity#

  • Atomic Operations: Events execute completely or fail entirely
  • Consistency Checks: Validation before any state changes
  • Audit Trail: Complete history preservation
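
One common way to get all-or-nothing behaviour is to apply an event's operations to a copy of the state and adopt the copy only if every operation succeeds. The sketch below shows that pattern under stated assumptions: `apply_event_atomically`, `make_create`, and `make_delete` are hypothetical helpers, and this document does not say whether the library uses this approach or validates everything up front instead.

```python
import copy


def apply_event_atomically(state, operations):
    """Apply all operations to a copy of state; return it only if none fail."""
    working = copy.deepcopy(state)
    for operation in operations:
        operation(working)  # raises ValueError to reject the whole event
    return working


def make_create(object_id, object_type):
    def operation(state):
        if object_id in state:
            raise ValueError(f"{object_id!r} already exists")
        state[object_id] = {"type": object_type, "existency": True}
    return operation


def make_delete(object_id):
    def operation(state):
        if object_id not in state or not state[object_id]["existency"]:
            raise ValueError(f"{object_id!r} does not exist")
        state[object_id]["existency"] = False
    return operation


state = apply_event_atomically({}, [make_create("order_1", "Order")])

try:
    # The second operation fails, so the first must not take effect either.
    state = apply_event_atomically(
        state, [make_create("item_1", "LineItem"), make_delete("missing")]
    )
except ValueError as error:
    print("event rejected:", error)

print(state)
```

Copying the full state per event is simple but costly for large models; validating all qualifiers before mutating anything achieves the same guarantee without the copy.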

Performance Considerations#

  • Memory Usage: All data is kept in memory using pandas DataFrames
  • Scalability: Suitable for moderate-sized datasets (< 100K events)
  • Query Performance: Direct DataFrame access for analytical queries
  • State Computation: Current state is maintained incrementally

File Structure#

OCED/
├── OCED.py              # Main library implementation
├── test.ipynb           # Example usage notebook
├── OCED_model.json      # Sample JSON export
├── OCED_model.xml       # Sample XML export
└── README.md            # This documentation

Examples#

See test.ipynb for complete examples including:

  • Basic object and relation creation
  • Complex multi-step events
  • State querying and analysis
  • Import/export operations
  • Data validation scenarios

Contributing#

This project follows object-oriented design principles with:

  • Clear separation of concerns
  • Comprehensive input validation
  • Immutable access to internal state
  • Complete audit trail maintenance

Known Limitations#

  1. String-only Values: All attribute values must be strings
  2. Memory Constraints: Not optimized for very large datasets
  3. No Concurrent Access: Single-threaded design
  4. Limited Query Interface: Basic DataFrame access only
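
Because attribute values must be strings, typed data has to be encoded before it is stored and decoded after it is read. One possible convention, sketched below with the standard `json` module, is to store the JSON text of each value. The library does not prescribe this; `encode_value` and `decode_value` are hypothetical helper names.

```python
import json


def encode_value(value):
    """Serialize a Python value to a string usable as an attribute value."""
    return json.dumps(value)


def decode_value(text):
    """Parse a string produced by encode_value back into a Python value."""
    return json.loads(text)


encoded_quantity = encode_value(2)
encoded_tags = encode_value(["gift", "express"])

print(type(encoded_quantity).__name__, encoded_quantity)
print(decode_value(encoded_quantity) + 1)
print(decode_value(encoded_tags))
```

An encoded string could then be passed wherever the examples in this document pass a literal such as '299.99', for instance as the value argument of create_object_attribute_value.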

Future Enhancements#

Planned improvements include:

  • Advanced query DSL
  • Database backend integration
  • Performance optimizations
  • Visualization tools
  • Process mining integration
  • Multi-type attribute support
https://vincenzo.imperati.dev/posts/python-oced/
Author
Vincenzo Imperati
Published at
2023-10-10