Update dependency transformers to v5 (#5)
dev-mend-for-github-com[bot] wants to merge 1 commit.
This PR contains the following updates:
transformers `==4.49.0` → `==5.0.0rc3`
By merging this PR, the below vulnerabilities will be automatically resolved:
Release Notes
huggingface/transformers (transformers)
v5.0.0rc3: Release candidate v5.0.0rc3 (Compare Source)
Release candidate v5.0.0rc3
New models:
What's Changed
We are getting closer and closer to the official release!
This RC is focused on removing more of the deprecated stuff, fixing some minor issues, and doc updates.
- `_get_num_multimodal_tokens` by @Abhinavexists in #43137
- `BartModelIntegrationTest` by @Sai-Suraj-27 in #43160
- `auto_docstring` in Processors by @yonigozlan in #42101
- `BitModelIntegrationTest` by @Sai-Suraj-27 in #43164
- [`Fp8`] Fix experts by @vasqu in #43154
- `salesforce-ctrl`, `xlm` & `gpt-neo` model generation tests by @Sai-Suraj-27 in #43180
- [`Generate`] Allow custom config values in generate config by @vasqu in #43181
- `Pix2StructIntegrationTest` by @Sai-Suraj-27 in #43229
- `PhiIntegrationTests` by @Sai-Suraj-27 in #43214
- `HF_TOKEN` directly and remove `require_read_token` by @ydshieh in #43233
- `Owlv2ModelIntegrationTest` & `OwlViTModelIntegrationTest` by @Sai-Suraj-27 in #43182
- `add_dates` by @yonigozlan in #43199
- `Vip-llava` model integration test by @Sai-Suraj-27 in #43252
- `position_ids` in all `apply_rotary_pos_emb` by @Cyrilvallez in #43255
- `_get_test_info` in `testing_utils.py` by @ydshieh in #43259
- `Hiera`, `SwiftFormer` & `LED` model integration tests by @Sai-Suraj-27 in #43225
- `_toctree.yml` by @Cyrilvallez in #43264
- `PegasusX`, `Mvp` & `LED` model integration tests by @Sai-Suraj-27 in #43245
New Contributors
Full Changelog: huggingface/transformers@v5.0.0rc2...v5.0.0rc3
v5.0.0rc2: Release candidate 5.0.0rc2 (Compare Source)
What's Changed
This release candidate is focused on fixing `AutoTokenizer`, expanding the dynamic weight loading support, and improving performance with MoEs!
MoEs and performance:
Tokenization:
The main issue with the tokenization refactor is that `tokenizer_class` values are now "enforced", when in most cases they are wrong. This took a while to properly isolate, and we now try to use `TokenizersBackend` whenever we can. #42894 has a much more detailed description of the big changes!
- `TokenizersBackend` by @ArthurZucker in #42894
- [`Tokenizers`] Change treatment of special tokens by @vasqu in #42903
Core
Here we focused on boosting the performance of loading weights on device!
- `post_init` and fix all of them by @Cyrilvallez in #42873
- `_init_weights` for ALL models by @Cyrilvallez in #42309
New models
- [`Ernie 4.5`] Ernie VL models by @vasqu in #39585
Quantization
Breaking changes
Mostly around processors!
- `convert_segmentation_map_to_binary_masks` to EoMT by @simonreise in #43073
Thanks again to everyone!
New Contributors
Full Changelog: huggingface/transformers@v5.0.0rc1...v5.0.0rc2
v5.0.0rc1: Release candidate 5.0.0rc1 (Compare Source)
What's Changed
This release candidate was focused mostly on `quantization` support with the new dynamic weight loader, and a few notable 🚨 breaking changes 🚨:
- `from_pretrained` is now `auto`! This is now as fast as before thanks to xet, and is just more convenient on the hub.
Dynamic weight loader updates:
Mostly QoL and fixes, plus bringing back support for CPU offloading.
New models:
Some notable quantization fixes:
Mostly added support for `fbgemm` and `quanto`.
Peft:
The dynamic weight loader broke a few small things; this adds glue for all models but MoEs.
Misc
Tokenization needed more refactoring, and this time it's a lot cleaner!
- `rope_parameters` to empty `dict` if there is something to put in it by @hmellor in #42651
We omitted a lot of other commits for clarity, but thanks to everyone and the new contributors!
New Contributors
Full Changelog: huggingface/transformers@v5.0.0rc0...v5.0.0rc1
v5.0.0rc0: Transformers v5.0.0rc0 (Compare Source)
Transformers v5 release notes
Highlights
We are excited to announce the initial release of Transformers v5. This is the first major release in five years, and the release is significant: 800 commits have been pushed to `main` since the latest minor release. This release removes a lot of long-due deprecations, introduces several refactors that significantly simplify our APIs and internals, and comes with a large number of bug fixes.
We give an overview of our focus for this release in the following blogpost. In these release notes, we'll focus directly on the refactors and new APIs coming with v5.
This release is a release candidate (RC). It is not the final v5 release, and we will push it to PyPI as a pre-release. This means that the current release is purely opt-in: installing `transformers` without specifying this exact release will install the latest version instead (v4.57.3 as of writing).
In order to install this release, please do so with the following:
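(The exact tag below assumes the first release candidate; adjust the version for later RCs.)

```
pip install "transformers==5.0.0rc0"
```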
For us to deliver the best package possible, it is imperative that we have feedback on how the toolkit is currently working for you. Please try it out, and open an issue in case you're facing an inconsistency or a bug.
Transformers version 5 is a community endeavor, and this is the last mile. Let's ship this together!
Significant API changes
Dynamic weight loading
We introduce a new weight loading API in `transformers`, which significantly improves on the previous API. This weight loading API is designed to apply operations to the checkpoints loaded by transformers.
Instead of loading the checkpoint exactly as it is serialized within the model, these operations can reshape, merge,
and split the layers according to how they're defined in this new API. These operations are often a necessity when
working with quantization or parallelism algorithms.
This new API is centered around the new `WeightConverter` class. The weight converter is designed to apply a list of operations on the source keys, resulting in target keys. A common operation done on the attention layers is to fuse the query, key, and value layers. Doing so with this API would amount to defining the following conversion:
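(The import path, argument names, and key patterns below are assumptions for illustration; see the linked PR for the exact API.)

```python
# Assumed import path and signatures, for illustration only.
from transformers.core_model_loading import WeightConverter, Concatenate

# Fuse the per-layer query/key/value projections of the checkpoint into the
# single fused projection expected by the model definition.
qkv_fusion = WeightConverter(
    [
        "model.layers.*.self_attn.q_proj.weight",  # source keys, as serialized
        "model.layers.*.self_attn.k_proj.weight",
        "model.layers.*.self_attn.v_proj.weight",
    ],
    "model.layers.*.self_attn.qkv_proj.weight",    # target key, as defined in the model
    operations=[Concatenate(dim=0)],               # concatenate the three tensors into one
)
```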
In this situation, we apply the `Concatenate` operation, which accepts a list of layers as input and returns a single layer.
This allows us to define a mapping from architecture to a list of weight conversions. Applying those weight conversions
can apply arbitrary transformations to the layers themselves. This significantly simplified the `from_pretrained` method and helped us remove a lot of technical debt that we accumulated over the past few years.
This results in several improvements:
While this is being implemented, expect varying levels of support across different release candidates.
Linked PR: #41580
Tokenization
Just as we moved towards a single backend library for model definition, we want our tokenizers, and the `Tokenizer` object, to be a lot more intuitive. With v5, tokenizer definition is much simpler; one can now initialize an empty `LlamaTokenizer` and train it directly on your corpus.
Defining a new tokenizer object should be as simple as this:
Once the tokenizer is defined as above, you can load it with the following: `Llama5Tokenizer()`. Doing this returns you an empty, trainable tokenizer that follows the definition of the authors of `Llama5` (it does not exist yet 😉).
The above is the main motivation towards refactoring tokenization: we want tokenizers to behave similarly to models: trained or empty, and with exactly what is defined in their class definition.
Backend Architecture Changes: moving away from the slow/fast tokenizer separation
Up to now, transformers maintained two parallel implementations for many tokenizers:
- Slow tokenizers (`tokenization_<model>.py`): Python-based implementations, often using SentencePiece as the backend.
- Fast tokenizers (`tokenization_<model>_fast.py`): Rust-based implementations using the 🤗 tokenizers library.
In v5, we consolidate to a single tokenizer file per model:
`tokenization_<model>.py`. This file will use the most appropriate backend available:
- the `sentencepiece` library; it inherits from `PythonBackend`.
- `tokenizers`; basically allows adding tokens.
- `MistralCommon`'s tokenization library (previously known as the `MistralCommonTokenizer`).
The `AutoTokenizer` automatically selects the appropriate backend based on available files and dependencies. This is transparent: you continue to use `AutoTokenizer.from_pretrained()` as before. This allows transformers to be future-proof and modular, to easily support future backends.
Defining tokenizers outside of the existing backends
We enable users and tokenizer builders to define their own tokenizers from top to bottom. Tokenizers are usually defined using a backend such as `tokenizers`, `sentencepiece` or `mistral-common`, but we offer the possibility to design the tokenizer at a higher level, without relying on those backends.
To do so, you can import the `PythonBackend` (which was previously known as `PreTrainedTokenizer`). This class encapsulates all the logic related to added tokens, encoding, and decoding.
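As a minimal sketch, assuming `PythonBackend` keeps the extension points of the old `PreTrainedTokenizer` (`_tokenize`, `_convert_token_to_id`, `_convert_id_to_token`, ...) and the assumed import path below:

```python
from transformers.tokenization_python import PythonBackend  # assumed import path

class WhitespaceOnlyTokenizer(PythonBackend):
    """Toy tokenizer that splits on whitespace; purely illustrative."""

    def __init__(self, vocab, unk_token="[UNK]", **kwargs):
        self._vocab = dict(vocab)                            # token -> id
        self._ids = {i: t for t, i in self._vocab.items()}   # id -> token
        super().__init__(unk_token=unk_token, **kwargs)

    @property
    def vocab_size(self):
        return len(self._vocab)

    def get_vocab(self):
        return dict(self._vocab)

    def _tokenize(self, text):
        return text.split()

    def _convert_token_to_id(self, token):
        return self._vocab.get(token, self._vocab.get("[UNK]", 0))

    def _convert_id_to_token(self, index):
        return self._ids.get(index, "[UNK]")
```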
If you want something even higher up the stack, `PreTrainedTokenizerBase` is what `PythonBackend` inherits from. It contains the very basic tokenizer API features: `encode`, `decode`, `vocab_size`, `get_vocab`, `convert_tokens_to_ids`, `convert_ids_to_tokens`, `from_pretrained`, `save_pretrained`.
API Changes
1. Direct tokenizer initialization with vocab and merges
Starting with v5, we now enable initializing blank, untrained `tokenizers`-backed tokenizers:
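(A minimal sketch; the training step itself goes through the `tokenizers` library.)

```python
from transformers import LlamaTokenizer

# A blank, untrained tokenizer that follows the LlamaTokenizer class definition.
tokenizer = LlamaTokenizer()
# It can then be trained on your own corpus; see the `tokenizers` documentation
# for the available trainers.
```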
This tokenizer will therefore follow the definition of the `LlamaTokenizer` as defined in its class definition. It can then be trained on a corpus, as can be seen in the `tokenizers` documentation.
These tokenizers can also be initialized from vocab and merges (if necessary), like the previous "slow" tokenizers:
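(Toy vocabulary and merges; the argument names are assumptions for illustration.)

```python
from transformers import LlamaTokenizer

# Toy vocabulary and merges, purely illustrative.
vocab = {"<unk>": 0, "<s>": 1, "</s>": 2, "he": 3, "llo": 4, "hello": 5}
merges = [("he", "llo")]

tokenizer = LlamaTokenizer(vocab=vocab, merges=merges)
```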
This tokenizer will behave as a Llama-like tokenizer, with an updated vocabulary. This allows comparing different tokenizer classes with the same vocab, therefore enabling the comparison of different pre-tokenizers, normalizers, etc.
- `vocab_file` (as in, a path towards a file containing the vocabulary) cannot be used to initialize the `LlamaTokenizer`, as loading from files is reserved for the `from_pretrained` method.
2. Simplified decoding API
The `batch_decode` and `decode` methods have been unified to reflect the behavior of the `encode` method. Both single and batch decoding now use the same `decode` method. See an example of the new behavior below.
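(Checkpoint and token ids are placeholders.)

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint

# Single sequence: one list of ids in, one string out.
text = tokenizer.decode([31373, 995])

# Batch: a list of lists in, a list of strings out, using the same `decode` method.
texts = tokenizer.decode([[31373, 995], [31373]])
```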
We expect `encode` and `decode` to behave as two sides of the same coin: `encode`, process, `decode` should work.
3. Unified encoding API
The `encode_plus` method is deprecated in favor of the single `__call__` method.
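In practice (checkpoint is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint

# v4 (deprecated): enc = tokenizer.encode_plus("hello world", return_tensors="pt")
# v5: call the tokenizer directly.
enc = tokenizer("hello world", return_tensors="pt")
```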
4. `apply_chat_template` returns `BatchEncoding`
Previously, `apply_chat_template` returned `input_ids` for backward compatibility. Starting with v5, it now consistently returns a `BatchEncoding` dict like other tokenizer methods.
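For example (the checkpoint is a placeholder; any tokenizer that ships a chat template works):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")  # placeholder

messages = [{"role": "user", "content": "Hello!"}]
enc = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# v5: `enc` is a BatchEncoding, so fields are accessed by key like any other encoding.
input_ids = enc["input_ids"]
```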
5. Removed legacy configuration file saving
We simplify the serialization of tokenization attributes:
- `special_tokens_map.json`: special tokens are now stored in `tokenizer_config.json`.
- `added_tokens.json`: added tokens are now stored in `tokenizer.json`.
- `added_tokens_decoder` is only stored when there is no `tokenizer.json`.
When loading older tokenizers, these files are still read for backward compatibility, but new saves use the consolidated format. We're gradually moving towards consolidating attributes into fewer files so that other libraries and implementations may depend on them more reliably.
6. Model-Specific Changes
Several models that had identical tokenizers now import from their base implementation:
These modules will eventually be removed altogether.
Removed T5-specific workarounds
The internal `_eventually_correct_t5_max_length` method has been removed. T5 tokenizers now handle max length consistently with other models.
Testing Changes
A few testing changes specific to tokenizers have been applied:
- Tests for the core methods (`add_tokens`, `encode`, `decode`) are now centralized and automatically applied across all tokenizers. This reduces test duplication and ensures consistent behavior.
For legacy implementations, the original BERT Python tokenizer code (including `WhitespaceTokenizer`, `BasicTokenizer`, etc.) is preserved in `bert_legacy.py` for reference purposes.
7. Deprecated / Modified Features
Special Tokens Structure:
- `SpecialTokensMixin`: Merged into `PreTrainedTokenizerBase` to simplify the tokenizer architecture.
- `special_tokens_map`: Now only stores named special token attributes (e.g., `bos_token`, `eos_token`). Use `extra_special_tokens` for additional special tokens (formerly `additional_special_tokens`).
- `all_special_tokens` includes both named and extra tokens.
- `special_tokens_map_extended` and `all_special_tokens_extended`: Removed. Access `AddedToken` objects directly from `_special_tokens_map` or `_extra_special_tokens` if needed.
- `additional_special_tokens`: Still accepted for backward compatibility but is automatically converted to `extra_special_tokens`.
Deprecated Methods:
- `sanitize_special_tokens()`: Already deprecated in v4, removed in v5.
- `prepare_seq2seq_batch()`: Deprecated; use `__call__()` with the `text_target` parameter instead (see the sketch below).
- `BatchEncoding.words()`: Deprecated; use `word_ids()` instead.
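For example, for a sequence-to-sequence tokenizer (checkpoint is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")  # placeholder checkpoint

# Instead of tokenizer.prepare_seq2seq_batch(src_texts, tgt_texts, ...):
batch = tokenizer(
    ["translate English to German: Hello, how are you?"],  # source texts
    text_target=["Hallo, wie geht es dir?"],               # targets end up under "labels"
    padding=True,
    return_tensors="pt",
)
```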
Removed Methods:
- `create_token_type_ids_from_sequences()`: Removed from the base class. Subclasses that need custom token type ID creation should implement this method directly.
- `clean_up_tokenization()`: Removed from the base class. Now defined at the model class level for models that need it (e.g., PLBart, CLVP, Wav2Vec2).
- `prepare_for_model()`, `build_inputs_with_special_tokens()`, `truncate_sequences()`: Moved from `tokenization_utils_base.py` to `tokenization_python.py` for `PythonBackend` tokenizers. `TokenizersBackend` provides model-ready input via `tokenize()` and `encode()`, so these methods are no longer needed in the base class.
- `_switch_to_input_mode()`, `_switch_to_target_mode()`, `as_target_tokenizer()`: Removed from the base class. Use `__call__()` with the `text_target` parameter instead.
- `parse_response()`: Removed from the base class.
Disclaimers for the RC0
PEFT + MoE:
Because we are switching from the naive MoE (`nn.ModuleList` for experts), we currently have an issue with MoEs that have adapters. For more details see #42491 (comment).
We aim for this to be fixed and released in a following release candidate in the week that follows RC0.
Tensor parallel and Expert parallel + MoE
We are streamlining the MoE support with vLLM; while this is being implemented, tensor parallelism and expert parallelism aren't working as expected.
This is known and actively being worked on.
We aim for this to be fixed and released in a following release candidate in the week that follows RC0.
Custom pretrained models:
For anyone inheriting from a transformers `PreTrainedModel`, the weights are automatically initialized with the common scheme:
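The exact scheme lives in `PreTrainedModel._init_weights`; roughly, and only as an assumed sketch of the usual transformers defaults, it amounts to something like:

```python
import torch.nn as nn

# Assumed sketch of the common initialization scheme (not the verbatim implementation).
def init_weights(module: nn.Module, std: float = 0.02) -> None:
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=std)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.Embedding):
        module.weight.data.normal_(mean=0.0, std=std)
    elif isinstance(module, nn.LayerNorm):
        module.weight.data.fill_(1.0)
        module.bias.data.zero_()
```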