fix(gmail): re-pad base64url attachment data for standard decoder compatibility #834
🦋 Changeset detected
Latest commit: 5a9a1dd
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses an issue where the Gmail API returns unpadded base64url strings for attachments, causing failures in standard decoders that expect padding. By introducing a targeted padding function in the executor, the system now ensures that attachment data is correctly padded before further processing, improving reliability across different environments.
Code Review
This pull request introduces a fix to re-pad base64url-encoded attachment data in Gmail API responses, ensuring compatibility with standard decoders. The feedback points out that parsing and re-serializing the entire JSON response is highly inefficient for large attachments (up to 25MB) and suggests an in-place string scanning approach to avoid significant CPU overhead and memory allocations.
fn pad_attachment_data(body: String) -> String {
    let mut val: Value = match serde_json::from_str(&body) {
        Ok(v) => v,
        Err(_) => return body,
    };

    if let Some(obj) = val.as_object_mut() {
        if let Some(Value::String(data)) = obj.get_mut("data") {
            let padding = (4 - data.len() % 4) % 4;
            data.push_str(&"=".repeat(padding));
        }
    }

    serde_json::to_string(&val).unwrap_or_else(|_| body)
}
Parsing and re-serializing the entire JSON response body just to append a few padding characters to the data field is highly inefficient, especially for large Gmail attachments (which can be up to 25MB). This approach causes significant CPU overhead and multiple large memory allocations.
Instead, we can perform a highly efficient, in-place string scan to locate the "data" key and append the padding characters directly to the String without parsing or re-serializing the JSON.
fn pad_attachment_data(mut body: String) -> String {
    let mut start_idx = 0;
    while let Some(key_idx) = body[start_idx..].find("\"data\"").map(|idx| start_idx + idx) {
        let rest = &body[key_idx + 6..];
        if let Some(colon_relative_idx) = rest.find(|c: char| !c.is_whitespace()) {
            if rest.as_bytes()[colon_relative_idx] == b':' {
                let after_colon = &rest[colon_relative_idx + 1..];
                if let Some(quote_relative_idx) = after_colon.find(|c: char| !c.is_whitespace()) {
                    if after_colon.as_bytes()[quote_relative_idx] == b'"' {
                        let value_start = key_idx + 6 + colon_relative_idx + 1 + quote_relative_idx + 1;
                        let after_quote = &body[value_start..];
                        if let Some(quote_end) = after_quote.find('"') {
                            let value_end = value_start + quote_end;
                            let data_str = &body[value_start..value_end];
                            if data_str.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_') {
                                let padding = (4 - data_str.len() % 4) % 4;
                                if padding > 0 {
                                    body.insert_str(value_end, &"=".repeat(padding));
                                }
                                return body;
                            }
                        }
                    }
                }
            }
        }
        start_idx = key_idx + 6;
    }
    body
}
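To try the in-place scan idea on small inputs, here is a compact sketch of the same technique. It is not the reviewer's exact code: the name `pad_first_data_field` and the sample JSON bodies are hypothetical, and it assumes a toolchain with `let ... else` (Rust 1.65 or later). Like the suggestion above, it pads only the first `"data"` key whose value consists solely of base64url characters, wherever that key appears in the body.

```rust
/// Hypothetical compact variant of the in-place scan, for illustration only.
/// Pads the first `"data": "<base64url>"` value it finds and returns the body.
fn pad_first_data_field(mut body: String) -> String {
    const KEY: &str = "\"data\"";
    let mut from = 0;
    while let Some(rel) = body[from..].find(KEY) {
        let after_key = from + rel + KEY.len();
        from = after_key; // the next search resumes after this occurrence

        // Expect: optional whitespace, ':', optional whitespace, opening quote.
        let Some(rest) = body[after_key..].trim_start().strip_prefix(':') else { continue };
        let Some(value) = rest.trim_start().strip_prefix('"') else { continue };
        let Some(len) = value.find('"') else { continue };

        // Only touch values made purely of base64url characters.
        let is_b64url = value[..len]
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_');
        if !is_b64url {
            continue;
        }

        // `value` is a suffix of `body`, so its start offset is recoverable.
        let value_end = body.len() - value.len() + len;
        let padding = (4 - len % 4) % 4;
        body.insert_str(value_end, &"=".repeat(padding));
        return body;
    }
    body
}

fn main() {
    // Sample bodies are made up for this example.
    let padded = pad_first_data_field(String::from(r#"{"size": 1, "data": "QQ"}"#));
    println!("{padded}");
}
```

A value that already carries `=` fails the character check, so an already padded body passes through unchanged.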
Description
Google's API returns unpadded base64url data for gmail.users.messages.attachments.get. Standard decoders in Python, Node.js, and other languages require = padding and fail when len % 4 != 0.
This fix detects the gmail.users.messages.attachments.get method and appends the correct number of = padding characters to the data field before returning the response. Scoped to this one method to avoid unintended side effects.
Closes #774
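The method scoping and padding arithmetic described above can be sketched as follows. This is a minimal illustration, not the PR's actual executor code: `padding_len` and `pad_if_attachment` are hypothetical names, and the second method id in `main` is used only as an example of a method that is left untouched.

```rust
/// Method id taken from the description above.
const ATTACHMENT_GET: &str = "gmail.users.messages.attachments.get";

/// Number of '=' characters needed to bring `len` up to a multiple of four.
fn padding_len(len: usize) -> usize {
    (4 - len % 4) % 4
}

/// Hypothetical helper: pads `data` only when the call is the attachment method.
fn pad_if_attachment(method_id: &str, data: &str) -> String {
    let mut out = String::from(data);
    if method_id == ATTACHMENT_GET {
        out.push_str(&"=".repeat(padding_len(data.len())));
    }
    out
}

fn main() {
    // Remainders 0, 2 and 3 need 0, 2 and 1 padding characters respectively.
    for len in [4usize, 6, 7] {
        println!("len {len} -> {} padding", padding_len(len));
    }
    // Padded only for the attachment method; other methods pass through.
    println!("{}", pad_if_attachment(ATTACHMENT_GET, "QQ"));
    println!("{}", pad_if_attachment("gmail.users.messages.get", "QQ"));
}
```

One caveat: a length with remainder 1 cannot come from a valid base64 encoding, so the three `=` characters the formula yields in that case do not make the value decodable.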
Checklist:
google-* crates).
cargo fmt --all to format the code perfectly.
cargo clippy -- -D warnings and resolved all warnings.
pnpx changeset) to document my changes.