H-3848, H-6154: Handle larger Flow payloads, historical flight data Flows #8356

Open
CiaranMn wants to merge 9 commits into main from cm/historical-flights-and-large-flow-improvements

@CiaranMn CiaranMn commented Feb 4, 2026

🌟 What is the purpose of this PR?

This PR:

  1. Adds the ability to retrieve historical flight arrival information for a given airport over a time span.
  2. Improves handling of large Flows, since historical retrieval can produce a lot of data (~30k entities for one month of London Gatwick arrivals), by:
     • Offloading payloads to S3 for both persisted and proposed entities: activities store data in S3 and output object paths, and consuming activities download the object.
     • Handling rate limits for the flight APIs we're using.
     • Configuring timeouts and heartbeats better for affected Flow activities.
  3. Fixes H-3848 as a drive-by, where enum data types were not being generated correctly.
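
The offloading pattern described above (store the payload, pass an object path, download on the consuming side) can be sketched as follows. This is a minimal illustration, not the PR's implementation: `PayloadRef`, `ObjectStore`, `InMemoryStore`, `offloadPayload` and `resolvePayload` are hypothetical names, and an in-memory map stands in for S3 so the sketch runs without credentials.

```typescript
// Hypothetical shapes for illustration only; the PR's real types may differ.
type PayloadRef = { kind: "StoredPayloadRef"; storageKey: string };

interface ObjectStore {
  upload(key: string, body: string): Promise<void>;
  download(key: string): Promise<string>;
}

// In-memory stand-in for S3 (an assumption made so the example is runnable).
class InMemoryStore implements ObjectStore {
  private readonly objects = new Map<string, string>();

  async upload(key: string, body: string): Promise<void> {
    this.objects.set(key, body);
  }

  async download(key: string): Promise<string> {
    const body = this.objects.get(key);
    if (body === undefined) {
      throw new Error(`No object stored at ${key}`);
    }
    return body;
  }
}

const isPayloadRef = (value: unknown): value is PayloadRef =>
  typeof value === "object" &&
  value !== null &&
  (value as { kind?: unknown }).kind === "StoredPayloadRef";

// Producing activity: write the large value out and return only a small reference.
async function offloadPayload<T>(
  store: ObjectStore,
  key: string,
  value: T,
): Promise<PayloadRef> {
  await store.upload(key, JSON.stringify(value));
  return { kind: "StoredPayloadRef", storageKey: key };
}

// Consuming activity: accept either an inline value or a reference to a stored one.
async function resolvePayload<T>(
  store: ObjectStore,
  value: T | PayloadRef,
): Promise<T> {
  if (!isPayloadRef(value)) {
    return value;
  }
  return JSON.parse(await store.download(value.storageKey)) as T;
}
```

The point of the design is that only the small reference object travels between activities, so the size of the workflow history no longer grows with the number of entities.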

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

🐾 Next steps

  • Custom dashboards including LLM generation assistance, with flight data as a test case.

🛡 What tests cover this?

  • None.

@CiaranMn CiaranMn requested a review from Copilot February 4, 2026 18:18
@CiaranMn CiaranMn self-assigned this Feb 4, 2026
vercel bot commented Feb 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| petrinaut | Ready | Preview | Feb 5, 2026 0:48am |

3 Skipped Deployments

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| hash | Ignored | Preview | Feb 5, 2026 0:48am |
| hashdotdesign | Ignored | Preview | Feb 5, 2026 0:48am |
| hashdotdesign-tokens | Ignored | Preview | Feb 5, 2026 0:48am |

cursor bot commented Feb 4, 2026

PR Summary

High Risk
Touches core Flow orchestration, Temporal history parsing, and storage provider interfaces while changing how step outputs are serialized/resolved; failures could break or stall workflows or hide payloads if storage keys/permissions are wrong.

Overview
Improves Flow scalability by offloading large step outputs/inputs to object storage. Multiple AI/integration activities now call storePayload and return a StoredPayloadRef, and downstream activities (plus Flow run detail retrieval) resolve refs via resolvePayloadValue/retrievePayload, so large ProposedEntity/PersistedEntitiesMetadata payloads no longer traverse Temporal history.

Updates Flow execution robustness by adding heartbeats/longer timeouts for persistence activities, introducing shared flow-context utilities (including workflowId/runId and flow-entity lookup by stored workflowId), and tightening payload/parallelization handling to reject aggregating/parallelizing on stored refs.

Adds new aviation integration capabilities: historical arrivals over date ranges, live flight position updates, rate-limited + retrying API clients, and batched create/patch persistence for integration entities/links; also cleans up enum datatype TS generation and extends storage providers with direct upload/download + flow-output key generation.
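
A rate-limited, retrying API client of the kind mentioned in this overview could be sketched roughly as below. Everything here is an assumption for illustration: `createRateLimiter`, `withRetry`, the injected `now`/`sleep` functions, and the convention that a rate-limit failure surfaces as an error object with `status === 429` are not taken from the PR. The exponential backoff is one common choice, not necessarily the one the PR uses.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Spaces calls at least `minIntervalMs` apart (e.g. 200 ms for 5 requests/second).
// `now` and `sleep` are injected so the limiter can be tested without real timers.
function createRateLimiter(
  minIntervalMs: number,
  now: () => number,
  sleep: Sleep,
): () => Promise<void> {
  let nextSlot = 0;
  return async function waitForSlot(): Promise<void> {
    const current = now();
    const scheduled = Math.max(current, nextSlot);
    // Reserve the slot before sleeping so concurrent callers queue behind it.
    nextSlot = scheduled + minIntervalMs;
    if (scheduled > current) {
      await sleep(scheduled - current);
    }
  };
}

// Retries only when the thrown value carries status 429 (hypothetical error shape),
// doubling the delay after each failed attempt.
async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts: number,
  baseDelayMs: number,
  sleep: Sleep,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      const status = (error as { status?: number } | null)?.status;
      if (status !== 429 || attempt >= maxAttempts) {
        throw error;
      }
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

In real use, `now` would be `Date.now` and `sleep` a `setTimeout`-based promise; each outgoing request would first await `waitForSlot()` and then run inside `withRetry`.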

Written by Cursor Bugbot for commit 810094a. This will update automatically on new commits. Configure here.

@github-actions github-actions bot added area/apps > hash* Affects HASH (a `hash-*` app) area/apps > hash-api Affects the HASH API (app) area/libs Relates to first-party libraries/crates/packages (area) type/eng > backend Owned by the @backend team area/tests New or updated tests area/tests > integration New or updated integration tests area/apps labels Feb 4, 2026
codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 0% with 75 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.10%. Comparing base (1492bce) to head (a5f4208).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...orker-ts/src/activities/shared/get-flow-context.ts | 0.00% | 13 Missing ⚠️ |
| apps/hash-api/src/storage/local-file-storage.ts | 0.00% | 12 Missing ⚠️ |
| ...bs/@local/hash-isomorphic-utils/src/flows/types.ts | 0.00% | 10 Missing ⚠️ |
| ...ities/flow-activities/write-google-sheet-action.ts | 0.00% | 8 Missing ⚠️ |
| ...ivities/flow-activities/persist-entities-action.ts | 0.00% | 7 Missing ⚠️ |
| ...ctivities/flow-activities/persist-entity-action.ts | 0.00% | 7 Missing ⚠️ |
| ...ies/research-entities-action/coordinating-agent.ts | 0.00% | 4 Missing ⚠️ |
| ...w-activities/shared/create-file-entity-from-url.ts | 0.00% | 3 Missing ⚠️ |
| ...es/shared/map-action-input-entities-to-entities.ts | 0.00% | 2 Missing ⚠️ |
| apps/hash-api/src/storage/index.ts | 0.00% | 2 Missing ⚠️ |
| ... and 7 more | | |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8356      +/-   ##
==========================================
- Coverage   60.10%   60.10%   -0.01%     
==========================================
  Files        1235     1234       -1     
  Lines      118201   118219      +18     
  Branches     5180     5184       +4     
==========================================
  Hits        71050    71050              
- Misses      46324    46342      +18     
  Partials      827      827              
| Flag | Coverage Δ |
| --- | --- |
| apps.hash-ai-worker-ts | 1.41% <0.00%> (-0.01%) ⬇️ |
| apps.hash-api | 0.00% <0.00%> (ø) |
| local.hash-isomorphic-utils | 0.00% <0.00%> (ø) |

augmentcode bot commented Feb 4, 2026

🤖 Augment PR Summary

Summary: This PR adds support for larger Flow payloads by offloading certain activity inputs/outputs to storage, and introduces new aviation integration flows for historical flight arrivals and live flight position updates.

Changes:

  • Added integration actions/flow definitions for fetching historical flight arrivals (AeroAPI) and live flight positions (FlightRadar24).
  • Introduced StoredPayloadRef and a new payload-storage utility to store/retrieve large payloads (e.g. ProposedEntities, PersistedEntitiesMetadata) outside Temporal.
  • Updated multiple Flow activities to store large outputs and resolve stored refs when consuming inputs, reducing Temporal payload size.
  • Enhanced persistence activities with batching and more frequent Temporal heartbeats/timeouts for long-running work.
  • Updated Flow run detail retrieval to resolve stored payload refs in step outputs when reading Temporal history.
  • Extended the file storage provider interface to support direct upload/download and flow-output storage key generation (S3 + local FS implementations updated).
  • Fixed TypeScript codegen for enum-constrained data types by removing redundant inheritance in schema preprocessing.
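
The batching-with-heartbeats item in the list above can be illustrated with a small sketch. The helper names `chunk` and `persistInBatches` are hypothetical, and the heartbeat is passed in as a plain callback so the example is self-contained; in a Temporal activity that callback would typically be the `heartbeat` function exported by `@temporalio/activity`.

```typescript
// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) {
    throw new Error("Batch size must be at least 1");
  }
  const batches: T[][] = [];
  for (let index = 0; index < items.length; index += size) {
    batches.push(items.slice(index, index + size));
  }
  return batches;
}

// Persists items batch by batch and reports progress after each batch, so a
// long-running activity keeps signalling liveness instead of going silent.
async function persistInBatches<T>(
  items: readonly T[],
  batchSize: number,
  persistBatch: (batch: T[]) => Promise<void>,
  heartbeat: (details: { done: number; total: number }) => void,
): Promise<number> {
  let done = 0;
  for (const batch of chunk(items, batchSize)) {
    await persistBatch(batch);
    done += batch.length;
    heartbeat({ done, total: items.length });
  }
  return done;
}
```

For this to be effective, each batch has to finish within the configured heartbeat timeout, which is the main constraint when picking a batch size.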

Technical Notes: Stored payload kinds are now represented as references in Flow I/O types and must be resolved by activities (and by backend APIs when returning Flow run details to clients).

@augmentcode augmentcode bot left a comment

Review completed. 7 suggestions posted.

Copilot AI left a comment

Pull request overview

This PR adds support for retrieving historical flight arrival information over time spans and improves handling of large Flow payloads by offloading them to S3 storage. It also fixes enum data type generation issues (H-3848).

Changes:

  • Introduced a StoredPayloadRef system that stores large payloads (ProposedEntity, ProposedEntityWithResolvedLinks, PersistedEntitiesMetadata) in S3 instead of passing them through Temporal activities
  • Added historical flight data retrieval capabilities with automatic 24-hour chunking to handle API limitations
  • Implemented rate limiting for AeroAPI (5 requests/second) and retry logic for 429 errors
  • Added live flight position tracking via FlightRadar24 integration
  • Fixed enum data type code generation by removing redundant allOf inheritance
  • Extended activity timeouts and added heartbeat support for long-running batch operations
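
The 24-hour chunking mentioned in the list above amounts to splitting the requested range into consecutive windows of at most one day each. A minimal sketch, assuming half-open windows and a hypothetical helper name `splitIntoDailyWindows` (the PR's actual client code may differ):

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

type TimeWindow = { start: Date; end: Date };

// Splits [start, end) into consecutive windows of at most 24 hours.
// The final window is shortened so that it never extends past `end`.
function splitIntoDailyWindows(start: Date, end: Date): TimeWindow[] {
  const startMs = start.getTime();
  const endMs = end.getTime();
  if (endMs <= startMs) {
    throw new Error("end must be after start");
  }
  const windows: TimeWindow[] = [];
  for (let cursor = startMs; cursor < endMs; cursor += DAY_MS) {
    windows.push({
      start: new Date(cursor),
      end: new Date(Math.min(cursor + DAY_MS, endMs)),
    });
  }
  return windows;
}
```

Each window would then be fetched with its own API request, so a one-month range becomes roughly thirty requests, which is where the rate limiting becomes relevant.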

Reviewed changes

Copilot reviewed 56 out of 56 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| libs/@local/hash-isomorphic-utils/src/flows/types.ts | Core type system changes introducing StoredPayloadRef, StoredPayloadKind, and separated Payload/ResolvedPayload types |
| libs/@local/hash-backend-utils/src/flows/payload-storage.ts | New payload storage infrastructure for S3 offloading with store/retrieve/resolve functions |
| libs/@local/hash-backend-utils/src/integrations/aviation/aero-api/client.ts | Added historical arrivals API integration with date chunking and rate limiting |
| apps/hash-integration-worker/src/activities/flow-activities/integration-activities/persist-integration-entities-action.ts | Refactored to batch operations and resolve stored payloads, with heartbeat support |
| apps/hash-integration-worker/src/activities/flow-activities/aviation-activities/get-historical-flight-arrivals-action.ts | New action for fetching historical flight data over date ranges |
| apps/hash-integration-worker/src/activities/flow-activities/aviation-activities/get-live-flight-positions-action.ts | New action for fetching live flight positions from FlightRadar24 |
| libs/@blockprotocol/graph/src/codegen/preprocess/remove-redundant-data-type-inheritance.ts | New preprocessing step to fix enum data type generation |
| libs/@local/hash-backend-utils/src/flows/get-flow-run-details.ts | Added StoredPayloadRef resolution for step outputs before returning to GraphQL clients |
| apps/hash-integration-worker/src/workflows/run-flow-workflow.ts | Extended activity timeouts to 10 hours with 10-second heartbeat timeouts for batch operations |
| libs/@local/hash-backend-utils/src/file-storage.ts | Unified FileStorageProvider interface with direct upload/download methods |
| apps/hash-ai-worker-ts/src/activities/shared/get-flow-context.ts | Changed flow entity lookup from UUID-based to property query-based |

@vercel vercel bot temporarily deployed to Preview – petrinaut February 4, 2026 19:14 Inactive
@vercel vercel bot temporarily deployed to Preview – petrinaut February 5, 2026 08:46 Inactive
@github-actions github-actions bot added the area/deps Relates to third-party dependencies (area) label Feb 5, 2026
@vercel vercel bot temporarily deployed to Preview – petrinaut February 5, 2026 09:04 Inactive
@vercel vercel bot temporarily deployed to Preview – petrinaut February 5, 2026 09:27 Inactive
@graphite-app graphite-app bot requested review from a team February 5, 2026 10:04
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.
