#107 Reduce file list memory usage#108
Conversation
We ran into issues with allocation failures when attempting to set up SSL connections. After a lot of searching and trial-and-error, we realized that as the number of log files on the SD card grew, the size of the JSON document returned by GenerateFilelist() grew to consume a significant chunk of heap memory. This document was resident while the upload loop ran, which is the exact time we needed the memory most. Once this was realized, we almost immediately notices that we didn't need this JSON at all, as the only thing we were pulling from it was the file numbers, something we can easily get from a CountLogFiles() call.
While dealing with heap shortages, I realized that we were always holding onto 4 KiB array, even if only the first few entries were actually populated. Four kibibytes isn't a massive amount, but when running near capacity, it might be enough to push us over the edge into SSL allocation failures. To do this shrink-to-fit operation consistently, I replaced one overload of logging::Manager::CountLogFiles() with a new GetLogFileNumbers() method that returns a vector, which provides much nicer ergonomics to the caller.
Previously any missing field would cause the whole row to be dropped. Given that MD5 hashes aren't guaranteed to be present, that felt like overkill.
My goal is to pull the file list out of the status JSON document, which will require calculating the total file count and size separately from the full file list.
I'm running into situations where a large JSON document is being successfully generated, but then allocation fails when trying to copy it into ESP32WiFiAdapter::m_messages. Move semantics to the rescue!
Keeping the document itself on the heap only introduced complexity which we didn't really need. Also store the default buffer size in a named constant.
This document's capacity increased every time a large message was sent, but that memory was never recovered. So if we succeeded in sending a giant file list, ESP32WiFiAdapter would hold onto that memory forever.
…ument This is a subtle dance that was repeated a few times. All of those copies were missing a std::move() call that significantly reduces the peak memory usage of the operation. To me that justifies abstracting it out.
Given that our large JSON documents can grow to fill temporarily fill all available memory, a more conservate growth factor seems appropriate. 1. Less aggressive growth means we are more likely to "just" fit in memory. 2. A growth factor less than the golden ratio allows the possibility of reusing memory blocks the document previously occupied. A growth factor of 2 prevents this.
There was a lot of unnecessary repetition between commands. Other improvements: * Serializing directly to Serial instead of allocating a large temporary string, like some jobs were doing. * Using Serial.println() to print the trailing new line instead of appending a single character to a large string, potentially resulting in another large allocation and copy. * Avoid some unnecessary JsonDocument->String->JsonDocument round-trips.
Just include the total file count and total file size. The file list JSON just isn't guaranteed to fit in memory, so it doesn't make sense to try to cram it inside the status JSON. A failed allocation means no status is sent, which is much more problematic than an incomplete file list. Moving "catalog" out into its own command means the WIBL can serve the status, release memory, then serve the file list. This approach also lets us handle a partial file list by appending a row of ellipses to the file table.
|
Thanks for the PR --- this is a massive improvement (and also updates the C++ to be more idiomatic for newer compilation systems, which is a bonus). Have you attempted to send update status messages from this implementation to an UploadServer implementation? I think it might cause problems with the current implementation (not a problem for this PR, but something we'll have to address later). |
Yes, I assumed that C++11 was fine to use as it is a hard dependency of ArduinoJson version the project currently uses.
No, I have not and I had intended to point this out when posting the PR. As it isn't being used for the CHS project we are working on, we never got around to setting up an UploadServer. I took a quick look at potentially porting it to these changes, but as someone with zero Go experience, I quickly got in over my head. |
Thanks for the update. Because we'll need to make changes to the code before we can fully merge (e.g., to step the firmware version, update and test the update server, etc.) I've made a new branch ("development") to capture this PR in a way that we can modify before final merge. Could you please edit the PR to target this branch instead of main? |
This is a series of small changes that add up to significantly reduce issues related to the memory consumed by JSON documents on the WIBL, especially the file list.
All togther, this is a pretty big change, but each commit is atomic and includes some justification of the change in its commit message. If you have any questions, I'd be happy to answer them.