Create new datasets from formatting pipelines #1566

Merged
naglepuff merged 11 commits into main from issue-1453-formatting-pipelines on Feb 17, 2026

Conversation

@naglepuff
Collaborator

@naglepuff naglepuff commented Jan 6, 2026

Fix #1453

Changes

Pipelines of type filter and transcode now save their output to a new directory under VIAME_DATA. That output is imported as a new dataset after the pipeline runs.

When running pipelines in bulk, the created datasets are given a default name. When running one of these pipelines on a single dataset from the main data view, a new modal window prompts the user to name the new dataset themselves.

A new function, sendToRenderer, sends messages from the main process to renderer processes. Here it tells the renderers to refresh the available datasets, so newly created datasets are shown to the user as soon as they are available.
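
As a rough illustration, a broadcast helper like this could be built on Electron's standard APIs; the channel name 'refresh-datasets' is a placeholder, not necessarily what this PR uses:

```ts
import { BrowserWindow } from 'electron';

// Broadcast a message from the main process to every open renderer window.
function sendToRenderer(channel: string, ...args: unknown[]): void {
  BrowserWindow.getAllWindows().forEach((win) => {
    // webContents.send delivers an async IPC message that the renderer
    // can listen for with ipcRenderer.on(channel, ...).
    win.webContents.send(channel, ...args);
  });
}

// e.g. after a filter/transcode job's output has been imported:
// sendToRenderer('refresh-datasets');
```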

Some imported datasets then need conversion to a web-friendly format. That conversion happens in the same job as the original pipeline, after the new data is ingested.

Testing

Test filter and transcode pipelines from both the bulk pipeline menu and the dataset view pipeline selector. Ensure that new datasets are created with the expected names and that the resulting data is visible.

@naglepuff naglepuff changed the title from "Show default output dataset name for pipelines" to "Create new datasets from formatting pipelines" on Jan 6, 2026
@naglepuff naglepuff force-pushed the issue-1453-formatting-pipelines branch 2 times, most recently from 902638e to e8128d0, on January 16, 2026 20:57
@naglepuff naglepuff force-pushed the issue-1453-formatting-pipelines branch 2 times, most recently from a7b60ea to 85f8a26, on February 4, 2026 17:24
@naglepuff naglepuff force-pushed the issue-1453-formatting-pipelines branch from c648fa5 to 03a3985 on February 10, 2026 14:32
@naglepuff naglepuff requested a review from BryonLewis February 10, 2026 15:11
@naglepuff naglepuff marked this pull request as ready for review February 10, 2026 18:05
Collaborator

@BryonLewis BryonLewis left a comment


Some unused props in JobConfigFilterTranscodeDialog.vue.
One location uses the literal ['filter', 'transcode'] instead of the constant.
Duplicate job.on('exit', () => ...) calls.

There is a suggestion about refactoring and simplifying the job.on('exit') handling; it isn't a hard requirement for this PR, just a possible way to simplify some of the logic at the end of the runPipeline function (sketched below).
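
For illustration only, the suggested consolidation might look something like this; job, importNewDataset, convertIfNeeded, and updateJobStatus are stand-in names, not the actual code in runPipeline:

```ts
// Register a single 'exit' listener instead of two, and branch inside it.
job.on('exit', async (code: number | null) => {
  if (code !== 0) {
    updateJobStatus('failure'); // hypothetical status helper
    return;
  }
  // Import the pipeline output as a new dataset, then run the
  // web-friendly conversion in the same job if the media needs it.
  const dataset = await importNewDataset(outputDirectory);
  await convertIfNeeded(dataset);
  updateJobStatus('success');
});
```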

There may be another thing Matt wants: support for importing annotations as well when running a filter. That would mean, once a job is identified as a filter or transcode job, copying over the existing annotations, or, if the pipeline creates new annotations, copying those over instead.

@mattdawkins
Member

One issue is that when track annotations are generated alongside images/videos, the annotations correspond to the new images, not the original sequence.

I've asked Claude to fix this on a local branch, and it has, in commit 31281e3.

This commit uses the pipeline filename prefixes to determine which pipelines produce image or video output. That is one way to do it, though I'm worried it's not the best. Maybe DIVE can auto-detect when images are produced, or, as a fallback, pipelines could have some specifier in their headers that indicates this? Alternatively, this commit could probably be taken as-is, though I'm worried I'll eventually put a pipeline in Utilities or somewhere that produces image outputs.
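
For concreteness, the prefix-based check described here might look roughly like the following sketch; the prefix list and function name are assumptions, not code from commit 31281e3:

```ts
// Assumed prefixes for illustration; the actual commit may use others.
const MEDIA_PRODUCING_PREFIXES = ['filter_', 'transcode_'];

// Returns true when the pipeline's filename prefix marks it as one that
// writes new images or video (rather than only annotations).
function pipelineProducesMedia(pipelineName: string): boolean {
  return MEDIA_PRODUCING_PREFIXES.some((p) => pipelineName.startsWith(p));
}
```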

@mattdawkins
Member

My other concern, besides the annotation issue, is that having a random hash in the default output filename might be counterintuitive for non-programming users. Instead of random characters at the end, maybe the default string could be something more legible, e.g. [origin_name]_[pipeline_postfix][integer starting from 1], where the postfix is something like enhance, debayered, or just filtered by default. For example: sequence_name_filtered1.

@naglepuff
Collaborator Author

> My other concern, besides the annotation issue, is that having a random hash in the default output filename might be counterintuitive for non-programming users. Instead of random characters at the end, maybe the default string could be something more legible, e.g. [origin_name]_[pipeline_postfix][integer starting from 1], where the postfix is something like enhance, debayered, or just filtered by default. For example: sequence_name_filtered1.

It's actually a timestamp, which should still prevent collisions while letting us create the default name without tracking any state. As an enhancement, I can see updating the bulk pipeline table to let users pick a name for each dataset, but if everything else here seems OK, I'd rather do that as a follow-up.
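
A minimal sketch of timestamp-based default naming as described; the exact format in the PR may differ, and defaultDatasetName is an illustrative name:

```ts
function defaultDatasetName(originName: string, pipelineType: string): string {
  // ISO timestamp with filesystem-unfriendly characters replaced,
  // so two runs started at different times cannot collide.
  const stamp = new Date().toISOString().replace(/[:.]/g, '-');
  return `${originName}_${pipelineType}_${stamp}`;
}

// defaultDatasetName('sequence_name', 'filter')
// -> 'sequence_name_filter_2026-02-12T17-57-00-000Z'
```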

@naglepuff naglepuff requested a review from BryonLewis February 12, 2026 17:57
BryonLewis previously approved these changes Feb 16, 2026
Collaborator

@BryonLewis BryonLewis left a comment


For any remaining issues, just create a task in the backlog for future PRs.

@naglepuff
Collaborator Author

@BryonLewis please take another look at the newest commit: I updated the detectorOutput and trackOutput assignments to point into the new dataset's directory when the pipeline is a filter/transcode.

@mattdawkins this makes the VIAME pipelines write those files directly to the new dataset directory instead of moving them there afterward, as Claude's commit 31281e3 suggests.

@BryonLewis
Collaborator

I fixed a small issue: the conditional was checking pipelineCreatesDatasetMarkers (i.e. ['transcode', 'filter']) against pipeline.name, when it really should check against the pipeline type (runPipelineArgs.pipeline.type).
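
Sketched, the fix amounts to this (identifiers as named above):

```ts
const pipelineCreatesDatasetMarkers = ['transcode', 'filter'];

// Before: pipelineCreatesDatasetMarkers.includes(pipeline.name) // wrong field
const createsDataset = pipelineCreatesDatasetMarkers.includes(
  runPipelineArgs.pipeline.type,
);
```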

I also made a small modification so that deleting a dataset found inside DIVE_Jobs_output deletes the whole folder, to prevent stray data from remaining in a user's local directory (on Windows it could be buried in hidden folders). This involved creating a global const for the DIVE_Jobs_output string and using it instead of the local constant we were using before.
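
A rough sketch of that deletion behavior, assuming fs-extra and a stand-in constant name:

```ts
import fs from 'fs-extra';
import path from 'path';

// Stand-in for the global constant introduced in the PR.
const DiveJobsOutputDirName = 'DIVE_Jobs_output';

async function deleteDataset(datasetPath: string): Promise<void> {
  const parent = path.dirname(datasetPath);
  if (parent.includes(DiveJobsOutputDirName)) {
    // The dataset lives under the job-output tree: remove the whole
    // job folder so no stray data lingers in the user's directory.
    await fs.remove(parent);
  } else {
    await fs.remove(datasetPath);
  }
}
```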

Just check over what I did and I think this is good to merge.

@BryonLewis
Collaborator

BryonLewis commented Feb 17, 2026

Additional commit to simplify the DIVE_Output_Jobs checking logic.

It took a bit of remembering, but I needed to update the tests: before, they didn't load the meta.json file. Loading meta.json uses mock-fs, which requires file contents to be given as string values (hence the JSON.stringify()).
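
For reference, a minimal Jest-style sketch of that mock-fs setup; paths and fields are illustrative:

```ts
import mock from 'mock-fs';

beforeEach(() => {
  mock({
    // mock-fs takes file contents as strings, so the meta object
    // has to be serialized with JSON.stringify().
    '/home/user/VIAME_DATA/my_dataset': {
      'meta.json': JSON.stringify({ version: 1, type: 'image-sequence' }),
    },
  });
});

afterEach(() => mock.restore());
```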

@naglepuff naglepuff merged commit b8fd916 into main Feb 17, 2026
4 checks passed
@naglepuff naglepuff deleted the issue-1453-formatting-pipelines branch February 17, 2026 14:44