Skip to content

Recover from corrupt utmp records instead of aborting at the first#5115

Open
kev365 wants to merge 7 commits into
log2timeline:mainfrom
kev365:bugfix-utmp-resync-btmp
Open

Recover from corrupt utmp records instead of aborting at the first#5115
kev365 wants to merge 7 commits into
log2timeline:mainfrom
kev365:bugfix-utmp-resync-btmp

Conversation

@kev365

@kev365 kev365 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Description

The libc6 utmp parser's per-record loop did break on the first ParseError, abandoning every record after a single corrupt one. Because libc6 utmp/wtmp/btmp records are a fixed 384-byte size, a corrupt record can be skipped and the rest recovered: on a ParseError, if a full record could still follow, emit a warning and seek to the next 384-byte boundary instead of stopping; trailing data shorter than a record is still treated as end-of-file. The first record is still validated strictly (a non-utmp first record still raises WrongParser), and the loop cannot spin (the offset strictly increases).

Related issue: fixes #5111

Testing

Added testParseCorruptUtmpFile with a synthetic, self-authored test_data/utmp_corrupted: a valid record, two consecutive unsupported-type records, a valid record, then a truncated trailing record → 2 events + 2 warnings, proving the trailing record is recovered after the skipped ones and the truncated tail is treated as end-of-file. Expected values were derived from utmpdump. Existing testParseUtmpFile / testParseWtmpFile are unchanged; Black + pylint clean.

Open question for the reviewer

Is skip-and-continue acceptable, or do you prefer the current conservative break? The change is isolated to the loop's exception handler and can be dropped if you'd rather keep the existing behaviour.

Checklist

  • No new dependencies are required or l2tdevtools has been updated.
  • Test data has a Plaso compatible license. (test_data/utmp_corrupted is synthetic and self-authored; no external source, so no ACKNOWLEDGEMENTS entry is needed.)
  • Reviewer assigned.
  • Automated checks (GitHub Actions, AppVeyor) pass.

The utmp parser stopped at the first record it could not parse, abandoning
every record after it. Because libc6 utmp/wtmp/btmp records are a fixed size,
a single corrupt record can be skipped so the remaining records are still
recovered: on a parse error, if a full record could still follow, emit a
warning and seek to the next record boundary instead of stopping. Trailing
data shorter than a record is still treated as the end of the file.

Add a synthetic, self-authored test sample utmp_corrupted: a valid record,
two consecutive unsupported-type records, a valid record, and a truncated
trailing record -- exercising skip-and-continue recovery (including
consecutive corruption) and the truncated-tail path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@joachimmetz joachimmetz self-assigned this Jun 16, 2026
@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 91.37931% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.07%. Comparing base (f4fbb45) to head (3bc42eb).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
plaso/parsers/utmp.py 91.37% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5115      +/-   ##
==========================================
- Coverage   85.07%   85.07%   -0.01%     
==========================================
  Files         455      455              
  Lines       40426    40464      +38     
==========================================
+ Hits        34392    34424      +32     
- Misses       6034     6040       +6     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread plaso/parsers/utmp.py Outdated
_INIT_PROCESS_TYPE = 5
_DEAD_PROCESS_TYPE = 8

# A libc6 utmp entry is a fixed-size record.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Joachim. I've added support for the 64-bit (400-byte) libc6 utmp layout (e.g. aarch64, where ut_session and the ut_tv timeval are widened).

The parser selects the layout by validating the first record with each candidate: the 32-bit and 64-bit layouts share the fields before the session field, so the correct one is identified by a supported type and a valid subsecond field — the microseconds field is defined to be < 1,000,000, and a 64-bit record read with the 32-bit layout has its wider seconds field spill into it and is rejected. The record size is then derived from the selected layout, so the skip-and-recover logic no longer hard-codes 384. Verified against a sample generated on real aarch64 (test_data/utmp_64bit: boot, login and logout records).

@joachimmetz joachimmetz added the pending reporter input Issue is pending input from the reporter label Jul 2, 2026
The parser assumed the 384-byte 32-bit record. On 64-bit builds without 32-bit time compatibility (e.g. aarch64) the libc6 utmp record is 400 bytes: the session and timeval fields are widened. Reading such a file with the 32-bit layout produced wrong timestamps and misaligned every subsequent record.

Add a linux_libc6_utmp_entry_64bit structure and select the layout by validating the first record. The two layouts share the fields before the session field, so the subsecond field distinguishes them: it is defined to be less than 1,000,000, and a 64-bit record read as 32-bit has its wider seconds field spill into it. The record size is derived from the selected layout instead of hard-coded, so the skip-and-recover logic advances by the correct amount for either layout.

Add utmp_64bit, a small self-authored sample generated on aarch64 (boot, login and logout records).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@joachimmetz

joachimmetz commented Jul 3, 2026

Copy link
Copy Markdown
Member

Thanks for the addition, my comment was mostly on the regarding the comment in the code.

I put something together to generate test data https://github.com/dfirlabs/utmp-specimens/

Recovering from a corrupt utmp is tricky given 384 or 400 0-byte could be a valid record. An approach to explore is to try a small number of records and abort if the data does not make sense

Note that the utmp parser should also not interfere with other parsers in the sense of it claiming a file it is not the appropriate parser for.

Also if I'm not mistaken the format is native-endian, which Plaso does not support at the moment.

@joachimmetz joachimmetz removed the pending reporter input Issue is pending input from the reporter label Jul 3, 2026
@joachimmetz

Copy link
Copy Markdown
Member

Maybe scan the the first 8 - 16 records and see:

  • 0 <= ut_type <= 8
  • 0 <= ut_tv.tv_usec <= 999999
  • ut_id and ut_line most likely to have a value if entry.ut_type != 0

kev365 and others added 4 commits July 3, 2026 11:34
This reverts commit 8199ca6.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On 64-bit platforms without 32-bit time compatibility (e.g. aarch64) the
libc6 utmp record is 400 bytes rather than 384: the session and timeval
fields are 64-bit. Reading such a file with the 32-bit layout produced
wrong timestamps and misaligned every subsequent record.

Add a linux_libc6_utmp_entry_64bit structure and select the record layout
by reading the first records with each candidate and counting the valid
non-empty records; a record read with the wrong record size misaligns and
violates the record invariants (ut_type, microseconds range, and a
terminal for non-empty records). The layout with the most valid non-empty
records is used, and a file is only claimed when at least two are found,
so the parser no longer over-claims files that are not utmp files.

Add test_data/utmp_aarch64, generated on aarch64 with the
dfirlabs/utmp-specimens generator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adopt the utmp type -> login_type rename from main's cleanup in the new
64-bit record support: the data type map, the record validation, and the
tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The record-layout detection accepted a file when it found enough valid
non-empty records among the first records, but an executable or other
non-utmp file can contain records that pass the invariants by chance,
causing the parser to claim it and emit records with out-of-range values.

Require the first record to be a valid record for the candidate layout as
well; a non-utmp file typically does not start with one. This keeps the
parser from claiming binaries (regression seen in the ext4_with_binaries
end-to-end test).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@joachimmetz

joachimmetz commented Jul 4, 2026

Copy link
Copy Markdown
Member

utmp from big-endian system

00000000  00 00 00 00 00 00 00 20  00 00 00 00 00 00 00 00  |....... ........|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000150  00 00 00 00 00 00 00 00  00 00 00 00 6a 48 93 69  |............jH.i|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000190  00 08 00 00 00 00 00 20  74 74 79 32 00 00 00 00  |....... tty2....|
000001a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001b0  00 00 00 00 00 00 00 00  74 32 00 00 00 00 00 00  |........t2......|
000001c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000002e0  00 00 00 00 00 00 00 00  00 00 00 00 6a 48 93 69  |............jH.i|
000002f0  00 00 00 00 00 00 00 00  01 02 03 04 00 00 00 00  |................|
00000300  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000320  00 02 00 00 00 00 00 20  73 79 73 74 65 6d 20 62  |....... system b|
00000330  6f 6f 74 00 00 00 00 00  00 00 00 00 00 00 00 00  |oot.............|
00000340  00 00 00 00 00 00 00 00  7e 00 00 00 72 65 62 6f  |........~...rebo|
00000350  6f 74 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |ot..............|
00000360  00 00 00 00 00 00 00 00  00 00 00 00 30 2e 30 2e  |............0.0.|
00000370  30 2e 30 00 00 00 00 00  00 00 00 00 00 00 00 00  |0.0.............|
00000380  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

Full file https://github.com/libyal/dtformats/blob/main/test_data/utmp-s390

The libc6 utmp format is native-endian, so on a big-endian system such as
s390x the record integers are big-endian and ut_type is a 16-bit value in
the high bytes of the field. Reading such a file with a little-endian
layout misreads every integer.

Add a linux_libc6_utmp_entry_64bit_bigendian structure and include it in
the record-layout detection, which selects it by validating records the
same way as the little-endian layouts.

Adopt the dtformats utmp specimens (Apache-2.0) as test data: replace the
self-generated utmp_aarch64 with the canonical one and add utmp_x86_64
(little-endian 384-byte) and utmp_s390 (big-endian).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kev365

kev365 commented Jul 4, 2026

Copy link
Copy Markdown
Contributor Author

Thanks Joachim, for your assist here as well as the speciman set and test data. Reworked this along the lines you suggested -- it validates the first records to pick the layout and decide whether to claim the file. So it now reads 32-bit, 64-bit and big-endian (s390x) utmp, and won't claim non-utmp files. Test data is your dtformats specimens, including utmp-s390 for the big-endian case.

@joachimmetz

Copy link
Copy Markdown
Member

Much appreciated, will try to take review the changes shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

utmp parser aborts at the first corrupt record; 64-bit (400-byte) layout unsupported

2 participants