How to extract image hidden states in LLaVa's transformers (Huggingface) implementation?...


huggingface-transformers, transformer-model, multimodal

ValueError: Exception encountered when calling layer 'tf_bert_model' (type TFBertModel)...


tensorflow, tensorflow2.0, huggingface-transformers, bert-language-model, transformer-model

How to correctly apply LayerNorm after MultiheadAttention with different input shapes (batch_first v...


audio, deep-learning, pytorch, transformer-model, pattern-recognition

How to mask inputs with variable size in transformer model when the batches need to be masked diffe...


python, numpy, tensorflow, keras, transformer-model

Warning: Gradients do not exist for variables...


python, tensorflow, keras, transformer-model

How to apply a pretrained transformer model from huggingface?...


huggingface-transformers, named-entity-recognition, transformer-model

Using positional encoding in pytorch...


deep-learning, pytorch, transformer-model

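The entry above asks about positional encoding in PyTorch. As a framework-agnostic sketch (written in NumPy here, but it ports directly to `torch`; the function name is illustrative, not a library API), the classic sinusoidal encoding from "Attention Is All You Need" can be computed as:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding: sin on even dims, cos on odd dims.
    Assumes d_model is even."""
    positions = np.arange(seq_len)[:, None]        # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # (1, d_model // 2)
    # One angle per (position, frequency) pair; frequencies decay geometrically.
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_positional_encoding(50, 16)
```

The resulting `(seq_len, d_model)` matrix is added to the token embeddings before the first attention layer; since it is deterministic, it is usually registered as a non-trainable buffer rather than a parameter.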
How to reconstruct text entities with Hugging Face's transformers pipelines without IOB tags?...


nlp, tokenize, transformer-model, named-entity-recognition, huggingface-transformers

Inference error after training an IP-Adapter plus model...


machine-learning, deep-learning, pytorch, transformer-model, stable-diffusion

How to download a model from huggingface?...


huggingface-transformers, transformer-model

cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub'...


python, nlp, huggingface-transformers, transformer-model, llama

Why do Transformers in Natural Language Processing need a stack of encoders?...


machine-learning, deep-learning, nlp, transformer-model

Is positional encoding necessary for transformer in language modeling?...


transformer-model, language-model

Transformers: Cross Attention Tensor Shapes During Inference Mode...


pytorch, transformer-model

Query padding mask and key padding mask in Transformer encoder...


python, machine-learning, pytorch, transformer-model, attention-model

PyTorch Linear operations vary widely after reshaping...


python, debugging, pytorch, transformer-model, attention-model

Why doesn't permuting positional encodings in GPT-2 affect the output as expected?...


pytorch, huggingface-transformers, transformer-model, gpt-2

Does Padding in a Batch of Sequences Affect Performance? How Effective is the Attention Mask?...


pytorch, nlp, huggingface-transformers, transformer-model

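The entry above asks how effective the attention mask is against padding. The mechanism behind it can be shown in a minimal NumPy sketch (the function name `masked_softmax` is illustrative, not a library API): scores at padded key positions are set to minus infinity before the softmax, so those keys receive exactly zero attention weight and contribute nothing to the output.

```python
import numpy as np

def masked_softmax(scores: np.ndarray, key_padding_mask: np.ndarray) -> np.ndarray:
    """Softmax over the last axis, with True entries in key_padding_mask
    (padded keys) forced to zero attention weight."""
    masked = np.where(key_padding_mask, -np.inf, scores)
    # Subtract the row max for numerical stability; exp(-inf) is exactly 0.
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.array([[1.0, 2.0, 3.0]])          # attention scores for 3 keys
mask = np.array([[False, False, True]])       # last key is padding
weights = masked_softmax(scores, mask)
```

This is the same additive-mask trick Hugging Face models apply internally when an `attention_mask` is passed, which is why padded positions do not affect the contextual representations of real tokens.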
Why is the timm visual transformer position embedding initializing to zeros?...


pytorch, transformer-model, vision-transformer

Inference question through LoRA in Whisper model...


transformer-model, openai-whisper

How to make huggingface transformer for translation return n translation inferences?...


python, huggingface-transformers, transformer-model

Understanding the results of Transformers Learn In Context with Gradient Descent...


machine-learning, nlp, large-language-model, transformer-model, meta-learning

How is transformers loss calculated for blank token predictions?...


machine-learning, nlp, transformer-model, language-model

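The entry above asks how the loss handles blank token predictions. In PyTorch, `torch.nn.CrossEntropyLoss` excludes positions whose label equals `ignore_index` (Hugging Face conventionally uses -100 for padding/blank labels) from both the sum and the averaging denominator. A NumPy sketch of that behaviour (the function name is illustrative):

```python
import numpy as np

def cross_entropy_ignoring(logits: np.ndarray, targets: np.ndarray,
                           ignore_index: int = -100) -> float:
    """Mean cross-entropy over non-ignored positions, mirroring the
    ignore_index semantics of torch.nn.CrossEntropyLoss."""
    keep = targets != ignore_index
    logits, targets = logits[keep], targets[keep]
    # Stable log-softmax: shift by the row max before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

logits = np.zeros((3, 4))                 # uniform logits over 4 classes
targets = np.array([0, 2, -100])          # last position is a blank label
loss = cross_entropy_ignoring(logits, targets)
```

Because the ignored position is dropped before averaging, blank tokens neither add loss nor dilute the mean; with uniform logits the loss here is exactly ln(4).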
No Attention returned even when output_attentions= True...


nlp, huggingface-transformers, bert-language-model, transformer-model, attention-model

TypeError: Exception encountered when calling layer 'embeddings' (type TFBertEmbeddings)...


tensorflow, deep-learning, nlp, bert-language-model, transformer-model

Key matrix redundant in Transformer language models?...


nlp, transformer-model

What are the inputs of the first decoder in the transformer architecture...


transformer-model, encoder, decoder

Positional encoding for Vision transformer...


pytorch, transformer-model, vision-transformer

Informer: loss always NaN...


deep-learning, nan, loss-function, transformer-model

Loading pre-trained weights properly in Pytorch...


pythonpytorchtransformer-modeltransfer-learning

Read More