vllm.model_executor.layers.utils
Utility methods for model layers.
 
 apply_penalties(
    logits: Tensor,
    prompt_tokens_tensor: Tensor,
    output_tokens_tensor: Tensor,
    presence_penalties: Tensor,
    frequency_penalties: Tensor,
    repetition_penalties: Tensor,
) -> Tensor
Applies penalties in place to the logits tensor.

Parameters:
    logits: The input logits tensor of shape [num_seqs, vocab_size].
    prompt_tokens_tensor: A tensor containing the prompt tokens. The prompts are padded to the maximum prompt length within the batch using vocab_size as the padding value. The value vocab_size is used for padding because it does not correspond to any valid token ID in the vocabulary.
    output_tokens_tensor: The output tokens tensor.
    presence_penalties: The presence penalties of shape (num_seqs,).
    frequency_penalties: The frequency penalties of shape (num_seqs,).
    repetition_penalties: The repetition penalties of shape (num_seqs,).
Source code in vllm/model_executor/layers/utils.py
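
To make the role of the three penalty arguments concrete, here is a minimal list-based sketch. It is not the vLLM implementation: the function name apply_penalties_sketch is hypothetical, plain Python lists stand in for tensors, and the penalty formulas are an assumption based on the common convention (repetition penalty divides positive logits and multiplies non-positive ones for any token seen in the prompt or output; frequency penalty scales with the output count; presence penalty is a flat subtraction for tokens present in the output). It also assumes output tokens are padded with vocab_size in the same way as prompts.

```python
def apply_penalties_sketch(logits, prompt_tokens, output_tokens,
                           presence, frequency, repetition):
    """Illustrative only: lists instead of tensors, assumed penalty formulas."""
    vocab_size = len(logits[0])
    rows = zip(logits, prompt_tokens, output_tokens,
               presence, frequency, repetition)
    for row, prompt, output, pres, freq, rep in rows:
        # Count output tokens, ignoring the padding id (== vocab_size).
        counts = [0] * vocab_size
        for tok in output:
            if tok < vocab_size:
                counts[tok] += 1
        # Tokens that appear in either the prompt or the output.
        seen = {t for t in prompt if t < vocab_size}
        seen |= {t for t in output if t < vocab_size}
        for tok in range(vocab_size):
            if tok in seen:
                # Assumed repetition rule: shrink positive logits,
                # push non-positive logits further down.
                row[tok] = row[tok] / rep if row[tok] > 0 else row[tok] * rep
            # Assumed frequency rule: proportional to the output count.
            row[tok] -= freq * counts[tok]
            # Assumed presence rule: flat penalty if the token was generated.
            if counts[tok] > 0:
                row[tok] -= pres
    return logits  # same list object, modified in place


# One sequence, vocabulary of 4 tokens, so 4 is the padding id.
logits = [[2.0, -1.0, 0.5, 0.0]]
apply_penalties_sketch(logits, [[0, 4]], [[1, 1, 4]], [0.5], [0.25], [2.0])
```

In this sketch the prompt-only token 0 is affected by the repetition penalty alone, while token 1, which was generated twice, receives all three penalties.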
  
 get_token_bin_counts_and_mask(
    tokens: Tensor, vocab_size: int, num_seqs: int
) -> tuple[Tensor, Tensor]
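
The page gives only the signature for this function. Judging from its name and return type, it plausibly returns per-sequence token counts together with a boolean "token appeared" mask; that reading is an assumption, as is everything in the sketch below. The name token_bin_counts_and_mask_sketch is hypothetical and nested lists stand in for tensors. The sketch shows why padding with vocab_size is convenient: one extra bin absorbs the padding entries and is dropped afterwards.

```python
def token_bin_counts_and_mask_sketch(tokens, vocab_size, num_seqs):
    """Illustrative only: count token ids per sequence, ignoring padding."""
    bin_counts = []
    for seq in range(num_seqs):
        # vocab_size + 1 bins: the last bin collects the padding id.
        counts = [0] * (vocab_size + 1)
        for tok in tokens[seq]:
            counts[tok] += 1
        # Drop the padding bin so the row matches the vocabulary width.
        bin_counts.append(counts[:vocab_size])
    # A token is "present" if it was counted at least once.
    mask = [[c > 0 for c in row] for row in bin_counts]
    return bin_counts, mask


# Two sequences over a vocabulary of 3 tokens; 3 is the padding id.
counts, mask = token_bin_counts_and_mask_sketch([[0, 2, 2], [1, 3, 3]], 3, 2)
```

With tensors, the same idea is typically expressed as a scatter-add into a [num_seqs, vocab_size + 1] buffer followed by slicing off the last column.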