Defer log process. until I/O to avoid expensive op on filtered logs#174
Conversation
Previously, all log entries underwent expensive processing (object cloning, serialization, timestamp generation) regardless of whether they would be output based on the configured log level. This caused significant CPU overhead when logging large objects at debug level in production environments where the log level is typically set higher (info/warn/error). Changes: - RequestLogger: Move object processing from log() to doLogIO() method - SimpleLogger: Add per-stream level checking before object operations - Store raw log data in buffer and defer expensive operations until actual I/O - Maintain existing dump behavior when error threshold is reached Issue: WRLGS-17
Issue: WRLGS-17
Hello williamlardier,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
| * @param {object} logEntry - The Logging entry to be processed. | ||
| * @param {string} logEntry.msg - The message to be logged |
There was a problem hiding this comment.
should we include other expected fields such as isEnd in this doc ?
| level, | ||
| fields, | ||
| msg, | ||
| logFields, |
There was a problem hiding this comment.
we need to keep the objectCopy() here to ensure the fields appearing in the log cannot be affected by code happening after the log call : we need to perform a deep copy before returning...
There was a problem hiding this comment.
I think that should be up to the caller: I cannot find places where what is logged changes after, this is very uncommon, yet has high CPU impact.
There was a problem hiding this comment.
that is the semantics of any logger : you log the value of the object when the call is made. Not respecting this will not break at every call hopefully, but creates a very significant risk that we don't log relevant data....and thus spend even more time debugging than if we did not have log at all.
An alternative would be to freeze() the object, but this would also cause significant (and breaking!) change on caller, as it may now crash if the variable is modified after being logged. So I don't think it should be used either.
Improving performance is good, but correctness and ease/safety of development matter as well : what is the real-life benefit of not copying the object? Is it really worth updating all logging calls and risking to have an issue with every new log line?
There was a problem hiding this comment.
Thanks for the precision, I understand better as I thought the lack of deep copy was on purpose (we have this bug with the existing code!).
First, on performance, I measured 8% CPU impact on the visible side, and then, it greatly increases the GC pressure. It's hard to precisely measure that part, but it becomes visible when we remove the logs in most places, I can see half of GC duration is from logger. The logger should not be this impactful on the runtime environment... I think we agree on that. Under high load the GC would run up to 50, even 70% of the total duration, clearly a saturation.
On the safety side, I agree. However the original objectCopy was not a deep copy to begin with, but a shallow copy. We only copy the top level fields to recreate an object. This does not protect against object being edited in between. Here is an example script:
#!/usr/bin/env node
const werelogs = require('./index.js');
werelogs.configure({
level: 'info',
dump: 'error'
});
const logger = new werelogs.Logger('test-logger');
const requestLogger = logger.newRequestLogger();
const userSession = {
userId: 12345,
username: "john_doe",
permissions: {
canRead: true,
canWrite: false,
canDelete: false
},
metadata: {
loginTime: new Date().toISOString(),
lastActivity: "browsing",
flags: ["premium", "verified"]
}
};
requestLogger.info('User session started', { session: userSession });
requestLogger.debug('Session details for debugging', {
sessionData: userSession,
additionalInfo: "This debug log will be dumped on error"
});
userSession.username = "jane_doe_MUTATED";
userSession.permissions.canWrite = true;
userSession.permissions.canDelete = true;
userSession.metadata.lastActivity = "MUTATED_ACTIVITY";
userSession.metadata.flags.push("MUTATED_FLAG");
requestLogger.error('Something went wrong - dumping all logs!', {
errorCode: 'TEST_ERROR',
message: 'Check if the debug log above shows original or mutated values'
});Here we log in info, then debug, then edit the object, and finally we log at the dump level: we see that the original object is mutated. Results:
{"name":"test-logger","session":{"userId":12345,"username":"john_doe","permissions":{"canRead":true,"canWrite":false,"canDelete":false},"metadata":{"loginTime":"2025-09-10T07:23:03.230Z","lastActivity":"browsing","flags":["premium","verified"]}},"time":1757488983230,"req_id":"29c710519945a971d51a","level":"info","message":"User session started","hostname":"william","pid":20275}
{"name":"test-logger","session":{"userId":12345,"username":"jane_doe_MUTATED","permissions":{"canRead":true,"canWrite":true,"canDelete":true},"metadata":{"loginTime":"2025-09-10T07:23:03.230Z","lastActivity":"MUTATED_ACTIVITY","flags":["premium","verified","MUTATED_FLAG"]}},"time":1757488983230,"req_id":"29c710519945a971d51a","level":"info","message":"User session started","hostname":"william","pid":20275}
{"name":"test-logger","sessionData":{"userId":12345,"username":"jane_doe_MUTATED","permissions":{"canRead":true,"canWrite":true,"canDelete":true},"metadata":{"loginTime":"2025-09-10T07:23:03.230Z","lastActivity":"MUTATED_ACTIVITY","flags":["premium","verified","MUTATED_FLAG"]}},"additionalInfo":"This debug log will be dumped on error","time":1757488983231,"req_id":"29c710519945a971d51a","level":"debug","message":"Session details for debugging","hostname":"william","pid":20275}
{"name":"test-logger","errorCode":"TEST_ERROR","message":"Something went wrong - dumping all logs!","time":1757488983231,"req_id":"29c710519945a971d51a","level":"error","hostname":"william","pid":20275}
I hope it's easy to read, but in short we can see the mutated object in both the re-printed info log, and the debug log that follows. This is not with the werelogs of this PR branch, but what we have in development/8.2.
So I make two observations:
-
Popular solutions like winston do not copy at all. They are as such affected by the same issue not yet resolved. So a valid concern indeed, and a real problem inherent with the nodejs language, as they also have a PR, not yet merged, as it has clear performance impacts.
-
We have this problem since the beginning in our code. But I cannot exclude that in some case, without knowing, we were looking at wrong logs... That's why in my head this was something done on purpose, as we never edit an object we log, or it's very rare. Now if we do not want that, then it's a real bug we want to fix indeed. So for our optimization I do not really see any solution, unless we make it explicit in the lib, that the object is not deep copied, and add an explicit flag if we want it to be deep copied... But not ideal and likely something a developer will miss.
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
The following reviewers are expecting changes from the author, or must review again: |
|
Note for reviewers: following discussions in https://scality.atlassian.net/wiki/spaces/OS/pages/3414261785/Heads+Up+Werelogs+Logs+Aren+t+Immutable The current PR is a status-quo: we keep the shallow copy, but reorganize the order of operations to optimize the lib. |
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
|
/approve |
|
I have successfully merged the changeset of this pull request
The following branches have NOT changed:
Please check the status of the associated issue WRLGS-17. Goodbye williamlardier. The following options are set: approve |
Optimize the logger but not doing any heavy (copy, serialization) operation unless the log is actually printed. This almost remove all overheads we had today when calling
log.debug(...)even with complex objects and when the API would not fail (so, the log is never visible).This is not a breaking change: no need to update test or any of the projects using this library.
Issue: WRLGS-17