vllm.model_executor.models.clip
Minimal implementation of CLIPVisionModel, intended to be used only within a vision-language model.
 
CLIPAttention ¶
Bases: Module
Multi-headed attention from the 'Attention Is All You Need' paper.
out_proj instance-attribute ¶
 out_proj = RowParallelLinear(
    input_size=embed_dim,
    output_size=embed_dim,
    quant_config=quant_config,
    prefix=f"{prefix}.out_proj",
)
qkv_proj instance-attribute ¶
 qkv_proj = QKVParallelLinear(
    hidden_size=embed_dim,
    head_size=head_dim,
    total_num_heads=num_heads,
    quant_config=quant_config,
    prefix=f"{prefix}.qkv_proj",
)
 
 __init__(
    config: CLIPVisionConfig,
    quant_config: Optional[QuantizationConfig] = None,
    prefix: str = "",
)
  
 forward(hidden_states: Tensor)
Input shape: Batch x Time x Channel
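To make the roles of the fused qkv_proj and the out_proj concrete, here is a minimal plain-torch sketch of the same attention pattern. It ignores tensor parallelism and quantization, uses nn.Linear as a stand-in for the parallel layers, and is not vLLM's actual implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCLIPAttention(nn.Module):
    """Plain-torch stand-in for the fused qkv_proj / out_proj pair above."""

    def __init__(self, embed_dim: int, num_heads: int) -> None:
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # nn.Linear stands in for QKVParallelLinear / RowParallelLinear.
        self.qkv_proj = nn.Linear(embed_dim, 3 * embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        bsz, seq_len, embed_dim = hidden_states.size()
        # One fused matmul produces Q, K and V side by side.
        q, k, v = self.qkv_proj(hidden_states).chunk(3, dim=-1)
        # [B, T, C] -> [B, heads, T, head_dim] for per-head attention.
        q, k, v = (t.reshape(bsz, seq_len, self.num_heads, self.head_dim)
                    .transpose(1, 2) for t in (q, k, v))
        # No causal mask: every patch token attends to every other one.
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(bsz, seq_len, embed_dim)
        return self.out_proj(out)

In the real module, QKVParallelLinear shards the fused projection column-wise across tensor-parallel ranks and RowParallelLinear all-reduces the partial outputs, which is why the two are paired.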
  
CLIPEncoder ¶
Bases: Module
Transformer encoder consisting of config.num_hidden_layers self-attention layers. Each layer is a CLIPEncoderLayer.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config | CLIPVisionConfig | CLIPConfig | required | 
layers instance-attribute ¶
layers = ModuleList(
    [
        CLIPEncoderLayer(
            config=config,
            quant_config=quant_config,
            prefix=f"{prefix}.layers.{layer_idx}",
        )
        for layer_idx in range(num_hidden_layers)
    ]
)
 
 __init__(
    config: CLIPVisionConfig,
    quant_config: Optional[QuantizationConfig] = None,
    num_hidden_layers_override: Optional[int] = None,
    prefix: str = "",
) -> None
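Conceptually, the encoder's forward pass just threads the hidden states through these layers in order. A hypothetical sketch (encoder_forward is illustrative, not the actual method):

import torch

def encoder_forward(encoder, inputs_embeds: torch.Tensor) -> torch.Tensor:
    # Each CLIPEncoderLayer maps [Batch x Time x Channel] to the same
    # shape, so the encoder is simply their sequential composition.
    hidden_states = inputs_embeds
    for layer in encoder.layers:
        hidden_states = layer(hidden_states)
    return hidden_states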
  
  
CLIPEncoderInfo ¶
Bases: VisionEncoderInfo[CLIPVisionConfig]
  
CLIPEncoderLayer ¶
Bases: Module
mlp instance-attribute ¶
 mlp = CLIPMLP(
    config,
    quant_config=quant_config,
    prefix=f"{prefix}.mlp",
)
self_attn instance-attribute ¶
 self_attn = CLIPAttention(
    config,
    quant_config=quant_config,
    prefix=f"{prefix}.self_attn",
)
 
 __init__(
    config: CLIPVisionConfig,
    quant_config: Optional[QuantizationConfig] = None,
    prefix: str = "",
) -> None
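An encoder layer is the standard pre-norm transformer block built from these two submodules. A hedged sketch of the usual dataflow; layer_norm1 and layer_norm2 are the conventional CLIP attribute names and are assumptions here, as they are not listed above:

import torch

def encoder_layer_forward(layer, hidden_states: torch.Tensor) -> torch.Tensor:
    # Pre-norm block: normalize -> attend -> add residual, then
    # normalize -> MLP -> add residual. layer_norm1 / layer_norm2 are
    # assumed attribute names, not documented above.
    residual = hidden_states
    hidden_states = layer.self_attn(layer.layer_norm1(hidden_states))
    hidden_states = residual + hidden_states

    residual = hidden_states
    hidden_states = layer.mlp(layer.layer_norm2(hidden_states))
    return residual + hidden_states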
  
  
CLIPMLP ¶
Bases: Module
fc1 instance-attribute ¶
 fc1 = ColumnParallelLinear(
    hidden_size,
    intermediate_size,
    bias=True,
    quant_config=quant_config,
    prefix=f"{prefix}.fc1",
)
fc2 instance-attribute ¶
 fc2 = RowParallelLinear(
    intermediate_size,
    hidden_size,
    bias=True,
    quant_config=quant_config,
    prefix=f"{prefix}.fc2",
)
 
 __init__(
    config: CLIPVisionConfig,
    quant_config: Optional[QuantizationConfig] = None,
    prefix: str = "",
) -> None
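The MLP is a two-layer feed-forward block: fc1 expands to intermediate_size, an activation is applied, and fc2 projects back to hidden_size. A plain-torch sketch without tensor parallelism; nn.GELU stands in for CLIP's GELU-family activation:

import torch
import torch.nn as nn

class SimpleCLIPMLP(nn.Module):
    """Plain-torch equivalent of the fc1/fc2 pair above, without TP."""

    def __init__(self, hidden_size: int, intermediate_size: int) -> None:
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, intermediate_size)
        # CLIP checkpoints typically use quick_gelu in HF; plain GELU
        # is used here for simplicity.
        self.act = nn.GELU()
        self.fc2 = nn.Linear(intermediate_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(self.act(self.fc1(x)))

Splitting fc1 column-wise and fc2 row-wise is the standard Megatron-style pattern: the intermediate activation stays sharded across tensor-parallel ranks, and only fc2's output needs an all-reduce.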
  
    
CLIPVisionEmbeddings ¶
Bases: Module
patch_embedding instance-attribute ¶
 patch_embedding = Conv2d(
    in_channels=num_channels,
    out_channels=embed_dim,
    kernel_size=patch_size,
    stride=patch_size,
    bias=False,
)
 
  
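Because kernel_size equals stride, this convolution applies a non-overlapping linear projection to each patch. A short sketch with illustrative sizes; in the real module the patch grid is then flattened into a token sequence, and class/position embeddings (not shown above) are added:

import torch
import torch.nn as nn

# Illustrative sizes; real values come from CLIPVisionConfig.
patch_size, embed_dim = 14, 1024
patch_embedding = nn.Conv2d(in_channels=3, out_channels=embed_dim,
                            kernel_size=patch_size, stride=patch_size,
                            bias=False)

pixel_values = torch.randn(1, 3, 224, 224)   # [Batch, Channels, H, W]
patches = patch_embedding(pixel_values)      # [1, 1024, 16, 16] patch grid
tokens = patches.flatten(2).transpose(1, 2)  # [1, 256, 1024] token sequence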
  
CLIPVisionModel ¶
Bases: Module, SupportsQuant
vision_model instance-attribute ¶
 vision_model = CLIPVisionTransformer(
    config=config,
    quant_config=quant_config,
    num_hidden_layers_override=num_hidden_layers_override,
    require_post_norm=require_post_norm,
    prefix=f"{prefix}.vision_model",
)
 
 __init__(
    config: CLIPVisionConfig,
    quant_config: Optional[QuantizationConfig] = None,
    *,
    num_hidden_layers_override: Optional[int] = None,
    require_post_norm: Optional[bool] = None,
    prefix: str = "",
) -> None
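A sketch of how a vision-language model might construct this wrapper, following the __init__ signature above. The checkpoint name is illustrative, and the parallel linear layers inside require vLLM's distributed state to be initialized before this runs:

from transformers import CLIPVisionConfig

from vllm.model_executor.models.clip import CLIPVisionModel

config = CLIPVisionConfig.from_pretrained("openai/clip-vit-large-patch14")

# LLaVA-style models take features from the penultimate layer, which is
# what num_hidden_layers_override enables: build one layer fewer.
vision_model = CLIPVisionModel(
    config,
    quant_config=None,
    num_hidden_layers_override=config.num_hidden_layers - 1,
)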
  
    
  
CLIPVisionTransformer ¶
Bases: Module
encoder instance-attribute ¶
 encoder = CLIPEncoder(
    config=config,
    quant_config=quant_config,
    num_hidden_layers_override=num_hidden_layers_override,
    prefix=f"{prefix}.encoder",
)
 
 __init__(
    config: CLIPVisionConfig,
    quant_config: Optional[QuantizationConfig] = None,
    *,
    num_hidden_layers_override: Optional[int] = None,
    require_post_norm: Optional[bool] = None,
    prefix: str = "",
) -> None
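Putting the pieces together, the transformer's forward pass conventionally runs the embeddings, a pre-norm, the encoder, and an optional post-norm. A hedged sketch following the HF CLIP layout; every attribute name here other than encoder is an assumption, since only encoder is documented above:

import torch

def vision_transformer_forward(vt, pixel_values: torch.Tensor) -> torch.Tensor:
    # Assumed pipeline following the HF CLIP layout. embeddings,
    # pre_layrnorm (HF's actual spelling) and post_layernorm are not
    # documented above and are assumptions here.
    hidden_states = vt.embeddings(pixel_values)   # image -> patch tokens
    hidden_states = vt.pre_layrnorm(hidden_states)
    hidden_states = vt.encoder(hidden_states)
    if getattr(vt, "post_layernorm", None) is not None:
        # The post-norm may be omitted when the layer stack is truncated
        # (see require_post_norm in __init__ above).
        hidden_states = vt.post_layernorm(hidden_states)
    return hidden_states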