Remove task_execution_id column from task_instances table
Summary
The task_execution_id column in task_instances is redundant. Since the composite primary key (instance_id, task_position) is the true identity for a task execution, task_execution_id should be removed from the database schema and all layers of the stack.
Problem
In MODE 3 (Kafka), task_execution_id is already derived from instanceId + taskPosition — confirming that the composite key carries all the identity information and the column is purely redundant.
Keeping task_execution_id around:
- Wastes storage and index space (idx_task_instances_task_execution_id)
- Misleads API consumers into thinking it's a stable identifier
- Adds unnecessary complexity to triggers, mappers, and ingestion code
Scope of changes
Database (MODE 1 — PostgreSQL)
- Drop the task_execution_id column and the idx_task_instances_task_execution_id index
- Update the normalize_task_event() trigger function (both fast path and FK-violation fallback) to remove task_execution_id from the INSERT and ON CONFLICT clauses
Domain model
- TaskExecution.java: derive id from instanceId + taskPosition at the domain/mapper level (no database column needed)
JPA / PostgreSQL storage (MODE 1)
- TaskInstanceEntity.java: remove the taskExecutionId field; update equals(), hashCode(), getId()
- TaskInstanceEntityMapper.java: remove the taskExecutionId ↔ id mappings
- TaskExecutionJPAStorage.java: remove the getTaskExecutionId reference
Kafka ingestion (MODE 3)
- TaskExecutionProcessor.java: remove generateTaskExecutionId()
- Mapper.java: remove generateTaskExecutionId() and its call
- task-instance-upsert.sql: remove task_execution_id from the INSERT columns
Elasticsearch (MODE 2)
- task-events.json index template: remove the taskExecutionId field mapping
GraphQL API
- WorkflowInstanceGraphQLApi.java: update the getTaskExecution(id) identity strategy
Tests
- WorkflowInstanceGraphQLApiTest.java: remove setTaskExecutionId() calls, update assertions
Documentation
- data-index-docs/, data-index-storage-postgresql/README.md, etc.
Identity strategy (resolved)
TaskExecution.id will be derived from instanceId + taskPosition at the domain/mapper level, using the same derivation MODE 3 (Kafka) already uses today. No database column is needed; the composite primary key (instance_id, task_position) is the source of truth, and id is computed on read.
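A rough sketch of what computing id on read could look like. This is illustrative only: the class name TaskExecutionIds, the TaskKey record, the ':' separator, and the choice of String for taskPosition are all assumptions, since the issue does not state the exact format that generateTaskExecutionId() produces today. It requires Java 16+ for records.

```java
import java.util.Objects;

// Hypothetical helper for illustration; not a class from the project.
final class TaskExecutionIds {

    // Assumed separator: the issue does not specify the derived id format.
    private static final char SEPARATOR = ':';

    // Mirrors the composite primary key (instance_id, task_position).
    record TaskKey(String instanceId, String taskPosition) {
        TaskKey {
            Objects.requireNonNull(instanceId, "instanceId");
            Objects.requireNonNull(taskPosition, "taskPosition");
        }
    }

    private TaskExecutionIds() {
    }

    // Computes the id from the key parts; nothing is stored.
    static String derive(String instanceId, String taskPosition) {
        TaskKey key = new TaskKey(instanceId, taskPosition);
        // Reject ids that would make the split in parse() ambiguous.
        if (key.instanceId().indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                    "instanceId must not contain '" + SEPARATOR + "'");
        }
        return key.instanceId() + SEPARATOR + key.taskPosition();
    }

    // Recovers the key from a derived id, e.g. for a getTaskExecution(id) lookup.
    static TaskKey parse(String id) {
        Objects.requireNonNull(id, "id");
        // Split on the first separator; taskPosition may itself contain one.
        int split = id.indexOf(SEPARATOR);
        if (split < 0) {
            throw new IllegalArgumentException(
                    "not a derived task execution id: " + id);
        }
        return new TaskKey(id.substring(0, split), id.substring(split + 1));
    }
}
```

With a reversible derivation like this, getTaskExecution(id) could parse the incoming id back into (instance_id, task_position) and query by the composite primary key directly, so no secondary index on a stored id column would be needed. Whatever format is chosen should match the existing MODE 3 derivation so that ids stay consistent across storage modes.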