tjuzek/README.md

Welcome!

Computational linguist · Florida State University. Interests: the language choices AI systems make, the mechanisms behind them, and their influence on human language.

🌐 tjuzek.com · 🔬 Google Scholar · 🆔 ORCID · 🧪 AI Word Explorer · ✉️ tjuzek@fsu.edu


Key projects

  • ai-34-languages: pipeline and interactive explorer for AI-overused words across 34 languages. Powers aiwordexplorer.com. Paper: AI-Associated Lexical Shifts Across 34 Languages (arXiv:2605.25358).
  • delve: code for Why Does ChatGPT "Delve" So Much? (COLING 2025, arXiv:2412.11385).
  • lhf: code for Word Overuse and Alignment in LLMs: The Influence of Learning from Human Feedback (BIAS 2025 @ ECML-PKDD, arXiv:2508.01930).
  • sad: the Syntactic Acceptability Dataset, with the accompanying paper and rating-website code.

Questions, or interested in working with me? See tjuzek.com or email me.

Popular repositories

  1. sad (Public)

    The Syntactic Acceptability Dataset (SAD): syntactic acceptability judgements, with the accompanying paper and the rating-website code.

    PHP 3

  2. delve (Public)

    Code and data for 'Why Does ChatGPT "Delve" So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models' (Juzek & Ward, COLING 2025).

    Jupyter Notebook 1

  3. ai-outperforms (Public)

    An early experiment (Dec 2022): can ChatGPT pass an intro-linguistics reading assignment? The note that seeded my work on AI and language change.

  4. resources (Public)

    Teaching materials, talk and seminar resources, and assorted notes on (computational) linguistics.

    Python

  5. tjuzek.github.io (Public)

    Source for tjuzek.com, the academic website of Thomas Stephan Juzek (computational linguistics; AI and language change).

    HTML

  6. om-uid (Public)

    The C-SALT-mix corpus and analysis for the Open Mind paper 'Signal Smoothing and Syntactic Choices: A Critical Reflection on the UID Hypothesis' (MIT Press).

    Python