Skip to content

Defer log process. until I/O to avoid expensive op on filtered logs#174

Merged
bert-e merged 2 commits into
development/8.2from
improvement/WRLGS-17
Oct 31, 2025
Merged

Defer log process. until I/O to avoid expensive op on filtered logs#174
bert-e merged 2 commits into
development/8.2from
improvement/WRLGS-17

Conversation

@ghost

@ghost ghost commented Sep 8, 2025

Copy link
Copy Markdown

Optimize the logger but not doing any heavy (copy, serialization) operation unless the log is actually printed. This almost remove all overheads we had today when calling log.debug(...) even with complex objects and when the API would not fail (so, the log is never visible).

This is not a breaking change: no need to update test or any of the projects using this library.

Issue: WRLGS-17

Previously, all log entries underwent expensive processing (object
cloning, serialization, timestamp generation) regardless of whether
they would be output based on the configured log level. This caused
significant CPU overhead when logging large objects at debug level
in production environments where the log level is typically set
higher (info/warn/error).

Changes:
- RequestLogger: Move object processing from log() to doLogIO() method
- SimpleLogger: Add per-stream level checking before object operations
- Store raw log data in buffer and defer expensive operations until
  actual I/O
- Maintain existing dump behavior when error threshold is reached

Issue: WRLGS-17
@bert-e

bert-e commented Sep 8, 2025

Copy link
Copy Markdown
Contributor

Hello williamlardier,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e

bert-e commented Sep 8, 2025

Copy link
Copy Markdown
Contributor

Incorrect fix version

The Fix Version/s in issue WRLGS-17 contains:

  • None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 8.2.3

Please check the Fix Version/s of WRLGS-17, or the target
branch of this pull request.

@ghost ghost changed the title Improvement/wrlgs 17 Defer log process. until I/O to avoid expensive op on filtered logs Sep 8, 2025
@bert-e

bert-e commented Sep 8, 2025

Copy link
Copy Markdown
Contributor

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

Comment thread lib/RequestLogger.js
Comment on lines +451 to +452
* @param {object} logEntry - The Logging entry to be processed.
* @param {string} logEntry.msg - The message to be logged

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we include other expected fields such as isEnd in this doc ?

Comment thread lib/RequestLogger.js
level,
fields,
msg,
logFields,

@francoisferrand francoisferrand Sep 9, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to keep the objectCopy() here to ensure the fields appearing in the log cannot be affected by code happening after the log call : we need to perform a deep copy before returning...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should be up to the caller: I cannot find places where what is logged changes after, this is very uncommon, yet has high CPU impact.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is the semantics of any logger : you log the value of the object when the call is made. Not respecting this will not break at every call hopefully, but creates a very significant risk that we don't log relevant data....and thus spend even more time debugging than if we did not have log at all.

An alternative would be to freeze() the object, but this would also cause significant (and breaking!) change on caller, as it may now crash if the variable is modified after being logged. So I don't think it should be used either.

Improving performance is good, but correctness and ease/safety of development matter as well : what is the real-life benefit of not copying the object? Is it really worth updating all logging calls and risking to have an issue with every new log line?

@ghost ghost Sep 10, 2025

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the precision, I understand better as I thought the lack of deep copy was on purpose (we have this bug with the existing code!).

First, on performance, I measured 8% CPU impact on the visible side, and then, it greatly increases the GC pressure. It's hard to precisely measure that part, but it becomes visible when we remove the logs in most places, I can see half of GC duration is from logger. The logger should not be this impactful on the runtime environment... I think we agree on that. Under high load the GC would run up to 50, even 70% of the total duration, clearly a saturation.

On the safety side, I agree. However the original objectCopy was not a deep copy to begin with, but a shallow copy. We only copy the top level fields to recreate an object. This does not protect against object being edited in between. Here is an example script:

#!/usr/bin/env node

const werelogs = require('./index.js');

werelogs.configure({
    level: 'info',
    dump: 'error'
});

const logger = new werelogs.Logger('test-logger');
const requestLogger = logger.newRequestLogger();

const userSession = {
    userId: 12345,
    username: "john_doe", 
    permissions: {
        canRead: true,
        canWrite: false,
        canDelete: false
    },
    metadata: {
        loginTime: new Date().toISOString(),
        lastActivity: "browsing",
        flags: ["premium", "verified"]
    }
};

requestLogger.info('User session started', { session: userSession });

requestLogger.debug('Session details for debugging', { 
    sessionData: userSession,
    additionalInfo: "This debug log will be dumped on error"
});

userSession.username = "jane_doe_MUTATED";
userSession.permissions.canWrite = true;
userSession.permissions.canDelete = true;
userSession.metadata.lastActivity = "MUTATED_ACTIVITY";
userSession.metadata.flags.push("MUTATED_FLAG");

requestLogger.error('Something went wrong - dumping all logs!', {
    errorCode: 'TEST_ERROR',
    message: 'Check if the debug log above shows original or mutated values'
});

Here we log in info, then debug, then edit the object, and finally we log at the dump level: we see that the original object is mutated. Results:

{"name":"test-logger","session":{"userId":12345,"username":"john_doe","permissions":{"canRead":true,"canWrite":false,"canDelete":false},"metadata":{"loginTime":"2025-09-10T07:23:03.230Z","lastActivity":"browsing","flags":["premium","verified"]}},"time":1757488983230,"req_id":"29c710519945a971d51a","level":"info","message":"User session started","hostname":"william","pid":20275}
{"name":"test-logger","session":{"userId":12345,"username":"jane_doe_MUTATED","permissions":{"canRead":true,"canWrite":true,"canDelete":true},"metadata":{"loginTime":"2025-09-10T07:23:03.230Z","lastActivity":"MUTATED_ACTIVITY","flags":["premium","verified","MUTATED_FLAG"]}},"time":1757488983230,"req_id":"29c710519945a971d51a","level":"info","message":"User session started","hostname":"william","pid":20275}
{"name":"test-logger","sessionData":{"userId":12345,"username":"jane_doe_MUTATED","permissions":{"canRead":true,"canWrite":true,"canDelete":true},"metadata":{"loginTime":"2025-09-10T07:23:03.230Z","lastActivity":"MUTATED_ACTIVITY","flags":["premium","verified","MUTATED_FLAG"]}},"additionalInfo":"This debug log will be dumped on error","time":1757488983231,"req_id":"29c710519945a971d51a","level":"debug","message":"Session details for debugging","hostname":"william","pid":20275}
{"name":"test-logger","errorCode":"TEST_ERROR","message":"Something went wrong - dumping all logs!","time":1757488983231,"req_id":"29c710519945a971d51a","level":"error","hostname":"william","pid":20275}

I hope it's easy to read, but in short we can see the mutated object in both the re-printed info log, and the debug log that follows. This is not with the werelogs of this PR branch, but what we have in development/8.2.

So I make two observations:

  • Popular solutions like winston do not copy at all. They are as such affected by the same issue not yet resolved. So a valid concern indeed, and a real problem inherent with the nodejs language, as they also have a PR, not yet merged, as it has clear performance impacts.

  • We have this problem since the beginning in our code. But I cannot exclude that in some case, without knowing, we were looking at wrong logs... That's why in my head this was something done on purpose, as we never edit an object we log, or it's very rare. Now if we do not want that, then it's a real bug we want to fix indeed. So for our optimization I do not really see any solution, unless we make it explicit in the lib, that the object is not deep copied, and add an explicit flag if we want it to be deep copied... But not ideal and likely something a developer will miss.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WRLGS-18 was created

@bert-e

bert-e commented Sep 9, 2025

Copy link
Copy Markdown
Contributor

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following reviewers are expecting changes from the author, or must review again:

@ghost ghost requested a review from francoisferrand September 9, 2025 11:22
@ghost

ghost commented Sep 30, 2025

Copy link
Copy Markdown
Author

Note for reviewers: following discussions in https://scality.atlassian.net/wiki/spaces/OS/pages/3414261785/Heads+Up+Werelogs+Logs+Aren+t+Immutable

The current PR is a status-quo: we keep the shallow copy, but reorganize the order of operations to optimize the lib.
I don't want to introduce a fag to deep copy here: best to have the caller doing it if needed. As such there is no difference for the caller after this PR and before, hence, no real need to communicate on the behavior, this is the standard way of logging in NodeJS.

@bert-e

bert-e commented Oct 28, 2025

Copy link
Copy Markdown
Contributor

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

@ghost

ghost commented Oct 31, 2025

Copy link
Copy Markdown
Author

/approve

@bert-e

bert-e commented Oct 31, 2025

Copy link
Copy Markdown
Contributor

I have successfully merged the changeset of this pull request
into targetted development branches:

  • ✔️ development/8.2

The following branches have NOT changed:

  • development/6.4
  • development/7.10
  • development/7.4
  • development/8.0
  • development/8.1

Please check the status of the associated issue WRLGS-17.

Goodbye williamlardier.

The following options are set: approve

@bert-e bert-e merged commit 99afdc0 into development/8.2 Oct 31, 2025
1 check passed
@bert-e bert-e deleted the improvement/WRLGS-17 branch October 31, 2025 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants