Terraform, Lambda, and Frontend Changes #104

Open: gt1074 wants to merge 51 commits into main from Terraform2

@gt1074 (Collaborator) commented Apr 15, 2026

Full implementation of the Terraform AWS suite. Moves the data pipeline deployment from scripts to Terraform configuration. Initializes all necessary S3 buckets, Lambda functions, IAM permissions, ECS and EC2 modules, SNS modules, and private VPC configurations. All required attributes are configurable through Terraform variables.

Changed the GeoJSON metadata configuration. GeoJSON metadata is now configured in the validation lambda, fixing a bug where non-validated files were left in the staging bucket with no way to delete them.

Added an AWS SSM parameter to hold the provider authentication string that the submission lambda uses when submitting to DCDB.
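As a rough illustration of what such a parameter can look like in Terraform, here is a minimal sketch. The variable name `dcdb_provider_auth` and the parameter path `/wibl/dcdb/provider-auth` are assumptions made for this example; the names actually used in the PR are not shown in this thread.

```hcl
# Hypothetical variable name; marked sensitive so Terraform redacts it in plan output.
variable "dcdb_provider_auth" {
  description = "Provider authentication string for DCDB submissions"
  type        = string
  sensitive   = true
}

# Hypothetical resource name and parameter path.
# SecureString asks SSM to encrypt the value at rest.
resource "aws_ssm_parameter" "dcdb_provider_auth" {
  name  = "/wibl/dcdb/provider-auth"
  type  = "SecureString"
  value = var.dcdb_provider_auth
}
```

Note that a value set this way is still written to the Terraform state file, so the state backend needs to be access-controlled as well.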

The submission and validation lambdas now correctly update metadata to include upload and validation status and to show whether a file is online, archived, or deleted. Metadata is now kept even if a file is deleted from S3.

Changed the Python consumers and socket manager to include payload and reconnection support.

Configured the frontend GeoJSON map to render tiles from Amazon Location Service.

Configured the frontend to display the status (upload/validation/conversion) and state (online/archived/deleted) of files.

Switched the frontend and manager to AWS RDS instances instead of ECS container database instances.

Switched to AWS ElastiCache for the Redis implementation.

gt1074 and others added 30 commits September 25, 2025 10:25
…n-fix

Update conversion start lambda permissions
…redis / Changed frontend database configurations / Working towards fixed static file s3 storage
…bdas & ecs / Moving geojson metadata initialization from submission to validation lambda
Comment thread wibl-python/wibl-manager/src/wibl_manager/statistics.py Outdated
@brian-r-calder (Collaborator)

I can't claim to have fully understood every part of the PR (because I haven't internalised all of the structure of the code, only examined the diffs), but except as indicated it appears logical to me. I agree with Brian that we should do a dry install of the system using this setup to check before fully merging.

@selimnairb (Member)

@gt1074 As part of wrapping up this PR, please implement the S3 backend for storing Terraform state as is done for the UploadServer, as described here. You can use the same default bucket name ("unhjhc-wibl-tf-state"), but pick a different key for now, maybe something like "terraform/state/wibl-processing-server-deploy.tfstate". Eventually, we can use the same key so that the UploadServer can be deployed into the same stack, but for now we'll keep them separate.

…oving unused subnet_efs Terraform module / manager statistics bug fixes.
@gt1074 (Collaborator, Author) commented May 14, 2026

Do you want me to have the user run a script nearly identical to the Terraform-bootstrap.sh script, so that the upload server and the wibl-python section use different buckets? Or do you want me to just port the configuration, so that the Terraform module in wibl-python saves its state file in the same bucket as the upload server, just at a different location?

@selimnairb (Member)

> Do you want me to have the user run a script nearly identical to the Terraform-bootstrap.sh script, so that the upload server and the wibl-python section use different buckets? Or do you want me to just port the configuration, so that the Terraform module in wibl-python saves its state file in the same bucket as the upload server, just at a different location?

Just port the configuration over for now, but use a different default key name for the state file for wibl-python. We can merge them later (ideally, the upload server will be fully integrated into the wibl-python cloud stack). I just want to make them as amenable to merging as we can right now.
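For reference, a minimal sketch of an S3 backend block using the bucket and key suggested in this thread. The `region` value is an assumption and should match wherever the state bucket actually lives.

```hcl
terraform {
  backend "s3" {
    # Default bucket shared with the UploadServer, per the discussion above.
    bucket = "unhjhc-wibl-tf-state"
    # Separate key so the two state files do not collide.
    key    = "terraform/state/wibl-processing-server-deploy.tfstate"
    # Assumption: replace with the region of the state bucket.
    region = "us-east-1"
  }
}
```

Backend blocks cannot reference Terraform variables, so values that need to vary per user have to be hard-coded as defaults here or supplied at `terraform init` time with `-backend-config`.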

Comment thread wibl-python/scripts/cloud/AWS/Terraform/plan.sh Outdated
Comment thread wibl-python/scripts/cloud/AWS/Terraform/README.md Outdated
Comment thread wibl-python/src/wibl/validation/cloud/aws/lambda_function.py Outdated
Comment thread wibl-python/scripts/cloud/AWS/Terraform/terraform.tfvars Outdated
Comment thread wibl-python/scripts/cloud/AWS/Terraform/terraform.tfvars Outdated

@selimnairb (Member) commented May 27, 2026

@gt1074 This file path should be listed in .gitignore. We should then check a template file into Git (e.g., default_auth.txt.proto) and copy the .proto to .txt before the build (probably in build.sh). Users should be instructed in the README to add their DCDB key to the .proto. Then there is no need for them to change or even look at the auth_file_name TF variable. As it stands, it would be too easy for someone (even us) to push a DCDB key to a public repo.

Member

@gt1074 On second thought, we'll need to add a step where the user copies the .proto to .tfvars. They shouldn't be adding secrets in the .proto file, since that will be checked into the repo. Duh. Was trying to avoid an extra step for users, but there's not a great way around it.

Member

@gt1074 Like with default_auth.txt we should .gitignore terraform.tfvars and add a terraform.tfvars.proto to the repo. Though I suppose then we'll need to add a step where the user copies the .proto to .tfvars. Having the build.sh script copy .proto to .tfvars for us doesn't help with accidentally committing secrets if we are having the user edit the .proto file, which would be checked into the repo.
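A sketch of what a checked-in `terraform.tfvars.proto` might contain under this proposal. The variable names are the ones visible in the diff for this thread; the placeholder value for `incoming_bucket_name` is made up for illustration. The real `terraform.tfvars` would be listed in .gitignore and created by the user copying this file.

```hcl
# terraform.tfvars.proto: template checked into Git, placeholders only.
# Copy to terraform.tfvars (git-ignored) and edit the copy, never this file.
terraform_state_bucket = "unhjhc-wibl-tf-state"
terraform_state_key    = "terraform/state/wibl-processing-server-deploy.tfstate"

# Placeholder value (hypothetical); users should pick their own bucket name.
incoming_bucket_name = "CHANGE-ME-incoming-bucket"
```

Keeping secrets and user-specific values only in the git-ignored copy is what prevents an accidental commit; the manual copy step is the cost of that.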

terraform_state_bucket = "unhjhc-wibl-tf-state"
terraform_state_key    = "terraform/state/wibl-processing-server-deploy.tfstate"

incoming_bucket_name = "default-incoming-bucket"

Member

@gt1074 The user also must be instructed to change all four of these bucket names as S3 has a more-or-less global namespace (unless they are using GovCloud or are in China).

Member

@gt1074 Actually, I see that as of March 12, the namespace for S3 is no longer global. However, we should still recommend that they change the bucket names.

Member

@gt1074 And it looks like we have to manually scope bucket creation to region::account as per here. This is an important enough change that we should modify the TF config to do this for the user by default.
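One possible way to have the Terraform config derive region- and account-specific bucket names by default is sketched below. This is only an illustration of building a name from the region and account ID: the exact naming format required by the linked AWS guidance is not reproduced in this thread, and the variable names `aws_region` and `incoming_bucket_prefix` are hypothetical.

```hcl
# Looks up the AWS account ID of the credentials Terraform is running with.
data "aws_caller_identity" "current" {}

# Hypothetical variables for this sketch.
variable "aws_region" {
  type = string
}

variable "incoming_bucket_prefix" {
  type    = string
  default = "wibl-incoming"
}

locals {
  # Assumed naming pattern: <prefix>-<region>-<account id>.
  incoming_bucket_name = "${var.incoming_bucket_prefix}-${var.aws_region}-${data.aws_caller_identity.current.account_id}"
}

resource "aws_s3_bucket" "incoming" {
  bucket = local.incoming_bucket_name
}
```

With a pattern like this, users would only need to set a prefix rather than invent a full bucket name for each of the buckets.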

}

backend "s3" {
bucket = "unhjhc-wibl-tf-state"

Member

@gt1074 Please change the default to use an account-region scoped bucket name as per here.

3 participants