Transformers: Cross Attention Tensor Shapes During Inference Mode...


pytorch, transformer-model

Query padding mask and key padding mask in Transformer encoder...


python, machine-learning, pytorch, transformer-model, attention-model

cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub'...


python, nlp, huggingface-transformers, transformer-model, llama

PyTorch Linear operations vary widely after reshaping...


python, debugging, pytorch, transformer-model, attention-model

Why doesn't permuting positional encodings in GPT-2 affect the output as expected?...


pytorch, huggingface-transformers, transformer-model, gpt-2

Does Padding in a Batch of Sequences Affect Performance? How Effective is the Attention Mask?...


pytorch, nlp, huggingface-transformers, transformer-model

Why is the timm visual transformer position embedding initializing to zeros?...


pytorch, transformer-model, vision-transformer

Using positional encoding in pytorch...


deep-learning, pytorch, transformer-model

Inference question through LoRA in Whisper model...


transformer-model, openai-whisper

How to make huggingface transformer for translation return n translation inferences?...


python, huggingface-transformers, transformer-model

How to download a model from huggingface?...


huggingface-transformers, transformer-model

How to extract image hidden states in LLaVa's transformers (Huggingface) implementation?...


huggingface-transformers, transformer-model, multimodal

How to reconstruct text entities with Hugging Face's transformers pipelines without IOB tags?...


nlp, tokenize, transformer-model, named-entity-recognition, huggingface-transformers

Understanding the results of Transformers Learn In Context with Gradient Descent...


machine-learning, nlp, large-language-model, transformer-model, meta-learning

How is transformers loss calculated for blank token predictions?...


machine-learning, nlp, transformer-model, language-model

Warning: Gradients do not exist for variables...


python, tensorflow, keras, transformer-model

No Attention returned even when output_attentions= True...


nlp, huggingface-transformers, bert-language-model, transformer-model, attention-model

TypeError: Exception encountered when calling layer 'embeddings' (type TFBertEmbeddings)...


tensorflow, deep-learning, nlp, bert-language-model, transformer-model

Key matrix redundant in Transformer language models?...


nlp, transformer-model

What are the inputs of the first decoder in the transformer architecture...


transformer-model, encoder, decoder

Positional encoding for Vision transformer...


pytorch, transformer-model, vision-transformer

Informer: loss always Nan...


deep-learning, nan, loss-function, transformer-model

Loading pre-trained weights properly in Pytorch...


python, pytorch, transformer-model, transfer-learning

How to solve: RuntimeError: CUDA error: device-side assert triggered?...


python, pytorch, transformer-model

cocoeval change the number of keypoints and self.kpt_oks_sigmas into 14 but receive error...


python, evaluation, transformer-model

vision transformers: RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x1000 and 768x32)...


python, machine-learning, deep-learning, pytorch, transformer-model

How to convert pretrained hugging face model to .pt and run it fully locally?...


machine-learning, pytorch, huggingface-transformers, transformer-model

Understanding batching in pytorch models...


python, machine-learning, deep-learning, pytorch, transformer-model

How to remove layers in Huggingface's transformers GPT2 pre-trained models?...


python, machine-learning, deep-learning, nlp, transformer-model

Annotated Transformer - Why x + DropOut(Sublayer(LayerNorm(x)))?...


python, pytorch, transformer-model, encoder
