vllm.model_executor.models.qwen2_rm
Inference-only Qwen2-RM model compatible with HuggingFace weights.
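For orientation, a minimal offline-inference sketch (not part of this file): loading a Qwen2 reward checkpoint and scoring text through vLLM's pooling API. The task="reward" argument and the output layout are version-dependent, and Qwen/Qwen2.5-Math-PRM-7B is just one public checkpoint served by the classes below.

    from vllm import LLM

    # Usage sketch, assuming a recent vLLM; the "reward" task name and the
    # PoolingRequestOutput layout may differ across versions.
    llm = LLM(
        model="Qwen/Qwen2.5-Math-PRM-7B",
        task="reward",
        trust_remote_code=True,
    )

    (output,) = llm.encode("Step 1: 2 + 2 = 4.")
    print(output.outputs.data)  # reward tensor produced by the score head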
 
Qwen2ForProcessRewardModel

  Bases: Qwen2RewardBaseModel

 __init__(*, vllm_config: VllmConfig, prefix: str = '')
  
Qwen2ForRewardModel

  Bases: Qwen2RewardBaseModel

 __init__(*, vllm_config: VllmConfig, prefix: str = '')
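The two subclasses share the base implementation and differ mainly in how many score-head outputs they configure before delegating to the shared constructor. A hedged sketch of that difference (pooler setup and other constructor details are omitted):

    from vllm.config import VllmConfig
    from vllm.model_executor.models.qwen2_rm import Qwen2RewardBaseModel

    class Qwen2ForRewardModel(Qwen2RewardBaseModel):
        def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
            # One scalar reward per pooled sequence.
            vllm_config.model_config.hf_config.num_labels = 1
            super().__init__(vllm_config=vllm_config, prefix=prefix)

    class Qwen2ForProcessRewardModel(Qwen2RewardBaseModel):
        def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
            # Two logits per reasoning step, normalized by a step-wise pooler.
            vllm_config.model_config.hf_config.num_labels = 2
            super().__init__(vllm_config=vllm_config, prefix=prefix)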
  
Qwen2RewardBaseModel

  Bases: Module, SupportsLoRA, SupportsPP

 model instance-attribute
 model = Qwen2Model(
    vllm_config=vllm_config,
    prefix=maybe_prefix(prefix, "model"),
)
 packed_modules_mapping class-attribute instance-attribute
 packed_modules_mapping = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    "gate_up_proj": ["gate_proj", "up_proj"],
}
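packed_modules_mapping tells the weight loader (and LoRA) that each fused vLLM module is fed by several separate tensors in the HuggingFace checkpoint. A hypothetical illustration of how such a mapping resolves names; this is not vLLM's actual loader code:

    PACKED = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }

    def checkpoint_names(fused_name: str) -> list[str]:
        # Map a fused module name to the HF tensor names that fill it.
        for packed, parts in PACKED.items():
            if packed in fused_name:
                return [fused_name.replace(packed, p) for p in parts]
        return [fused_name]

    # "model.layers.0.self_attn.qkv_proj" resolves to that layer's separate
    # q_proj / k_proj / v_proj checkpoint tensors.
    print(checkpoint_names("model.layers.0.self_attn.qkv_proj"))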
 score instance-attribute
 score = Sequential(
    ColumnParallelLinear(
        hidden_size,
        hidden_size,
        quant_config=quant_config,
        return_bias=False,
    ),
    ReLU(),
    RowParallelLinear(
        hidden_size,
        num_labels,
        quant_config=quant_config,
        return_bias=False,
    ),
)
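With tensor_parallel_size == 1, the ColumnParallelLinear/RowParallelLinear pair behaves like two plain linear layers, so the score head reduces to a small MLP from hidden_size to num_labels. A single-GPU sketch; 3584 and 1 are example values for a 7B-scale checkpoint with a scalar reward, and in practice both come from the HF config:

    import torch
    import torch.nn as nn

    hidden_size, num_labels = 3584, 1  # example values only

    score = nn.Sequential(
        nn.Linear(hidden_size, hidden_size),
        nn.ReLU(),
        nn.Linear(hidden_size, num_labels),
    )

    hidden_states = torch.randn(5, hidden_size)  # 5 token positions
    print(score(hidden_states).shape)            # torch.Size([5, 1])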
 
 __init__(*, vllm_config: VllmConfig, prefix: str = '')
  
 forward(
    input_ids: Tensor,
    positions: Tensor,
    intermediate_tensors: Optional[
        IntermediateTensors
    ] = None,
    inputs_embeds: Optional[Tensor] = None,
) -> Union[Tensor, IntermediateTensors]
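A hedged sketch of the forward pass: the Qwen2 backbone produces per-token hidden states, the score head projects them to num_labels logits, and pooling happens outside forward(). The pass-through of IntermediateTensors on non-final pipeline stages is an assumption here, suggested by the Union return type:

    def forward(
        self,
        input_ids: Tensor,
        positions: Tensor,
        intermediate_tensors: Optional[IntermediateTensors] = None,
        inputs_embeds: Optional[Tensor] = None,
    ) -> Union[Tensor, IntermediateTensors]:
        hidden_states = self.model(
            input_ids, positions, intermediate_tensors, inputs_embeds
        )
        if isinstance(hidden_states, IntermediateTensors):
            return hidden_states  # not the final pipeline-parallel stage
        # return_bias=False above, so the Sequential yields a plain tensor
        # of shape (num_tokens, num_labels).
        return self.score(hidden_states)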