vllm.v1.spec_decode.ngram_proposer
 
 Source code in vllm/v1/spec_decode/ngram_proposer.py
  
 __init__(vllm_config: VllmConfig)
  
    
Proposes the next sequence of tokens based on n-gram pattern matching in the context. The function looks for an earlier occurrence of the last n tokens in the preceding context and returns up to k tokens that followed that match.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| context_token_ids | ndarray | Numpy array of token IDs representing the context sequence. | required | 
Returns:
| Type | Description |
|---|---|
| ndarray | The sequence of tokens that followed the matched n-gram in the context. |
| None | If no matching n-gram pattern is found. |
Example
If context_token_ids = [1,2,3,4,2,3], min_n = 2, max_n = 3, and k = 4:

- The last 3 (= max_n) tokens [4,2,3] cannot find a match.
- The last 2 tokens [2,3] will be matched against the previous 4 tokens [1,2,3,4].
- Finding a match of [2,3] would return the tokens that followed that pattern. Here we will return [4,2,3] because only three tokens follow the match, fewer than k = 4.
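
The matching rule in this example can be illustrated with a minimal brute-force sketch. It is not the code in ngram_proposer.py: the function name `propose_ngram` is hypothetical, plain Python lists stand in for the NumPy array, and returning the earliest match is an assumption the description above does not confirm.

```python
from typing import Optional


def propose_ngram(
    context: list[int], min_n: int, max_n: int, k: int
) -> Optional[list[int]]:
    """Hypothetical helper: try n = max_n down to min_n and return up to k
    tokens that followed an earlier occurrence of the last n tokens."""
    for n in range(max_n, min_n - 1, -1):
        if n <= 0:
            continue
        suffix = context[-n:]
        # Search only the tokens before the suffix (context[:-n]), so the
        # suffix never matches itself. The range is empty if the context
        # is too short to contain both a match and the suffix.
        for start in range(len(context) - 2 * n + 1):
            if context[start:start + n] == suffix:
                # Assumption: the earliest match wins. Slicing truncates
                # automatically when fewer than k tokens follow the match.
                return context[start + n:start + n + k]
    return None


print(propose_ngram([1, 2, 3, 4, 2, 3], min_n=2, max_n=3, k=4))  # [4, 2, 3]
```

The linear scan keeps the sketch short. It costs O(len(context) * n) per n-gram length, which a KMP-style search avoids.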
  
  Build the lps (longest proper prefix which is also suffix) array for the pattern.
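
For reference, the lps array is the failure table of the Knuth-Morris-Pratt algorithm. The following is a minimal sketch of the textbook construction, not the module's own code; the name `build_lps` and the use of Python lists are assumptions made for illustration.

```python
def build_lps(pattern: list[int]) -> list[int]:
    """Hypothetical helper: lps[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    lps = [0] * len(pattern)
    length = 0  # length of the prefix-suffix currently being extended
    i = 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            # The current prefix-suffix grows by one token.
            length += 1
            lps[i] = length
            i += 1
        elif length != 0:
            # Fall back to the next shorter prefix-suffix and retry
            # the same position i.
            length = lps[length - 1]
        else:
            # No prefix-suffix ends at position i.
            lps[i] = 0
            i += 1
    return lps


print(build_lps([1, 2, 1, 2, 3]))  # [0, 0, 1, 2, 0]
```

During a search, this table tells the matcher how far to shift the pattern after a mismatch without re-reading context tokens, so the scan stays linear in the context length.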