RAGchain.utils.compressor package
Submodules
RAGchain.utils.compressor.base module
- class RAGchain.utils.compressor.base.BaseCompressor
Bases: Runnable[RetrievalResult, RetrievalResult], ABC
- property InputType: Type[Input]
The type of input this runnable accepts specified as a type annotation.
- property OutputType: Type[Output]
The type of output this runnable produces specified as a type annotation.
- invoke(input: Input, config: RunnableConfig | None = None) Output
Compress the passages in input and return the compressed passages. Parameters for the compression algorithm are read from config under the key ‘compressor_options’, which must be a dict placed inside ‘configurable’. Example:
runnable.invoke(retrieval_result, config={"configurable": {"compressor_options": {"n_clusters": 3}}})
Important! The scores of the passages are removed. It is recommended to use this module after all retrieval and reranking steps, just before you pass the passages to the LLM.
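As a rough illustration of the config convention described above, the sketch below shows one way a compressor could read its options from a RunnableConfig-style dict. The helper name `get_compressor_options` is hypothetical and is not part of RAGchain; only the ‘configurable’ and ‘compressor_options’ keys come from the documentation.

```python
from typing import Any, Dict, Optional


def get_compressor_options(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper (not RAGchain API): extract 'compressor_options'.

    Returns an empty dict when the config is missing or has no such key.
    """
    if not config:
        return {}
    configurable = config.get("configurable", {})
    # Copy so callers cannot mutate the caller-supplied config by accident.
    return dict(configurable.get("compressor_options", {}))


config = {"configurable": {"compressor_options": {"n_clusters": 3}}}
options = get_compressor_options(config)
```

Falling back to an empty dict lets a compressor apply its own defaults when no options are supplied.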
RAGchain.utils.compressor.cluster_time module
- class RAGchain.utils.compressor.cluster_time.ClusterTimeCompressor(semantic_cluster: SemanticClustering, split_by_sentences: bool = False)
Bases: BaseCompressor
Compress passages by semantically clustering them and then selecting the most recent passage from each cluster.
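The selection step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `TimedPassage` and `most_recent_per_cluster` are hypothetical names, and the cluster labels are assumed to have been computed already (in RAGchain that is the role of the `SemanticClustering` object passed to the constructor).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Sequence


@dataclass
class TimedPassage:
    """Hypothetical stand-in for a passage that carries a timestamp."""
    content: str
    created_at: datetime


def most_recent_per_cluster(
    passages: Sequence[TimedPassage], labels: Sequence[int]
) -> List[TimedPassage]:
    """Keep only the newest passage from each cluster.

    `labels[i]` is the (assumed precomputed) cluster id of `passages[i]`.
    """
    if len(passages) != len(labels):
        raise ValueError("passages and labels must have the same length")
    newest: Dict[int, TimedPassage] = {}
    for passage, label in zip(passages, labels):
        current = newest.get(label)
        if current is None or passage.created_at > current.created_at:
            newest[label] = passage
    # Return results in cluster-id order for a deterministic output.
    return [newest[label] for label in sorted(newest)]


passages = [
    TimedPassage("old pricing", datetime(2021, 1, 1)),
    TimedPassage("new pricing", datetime(2023, 6, 1)),
    TimedPassage("release notes", datetime(2022, 3, 15)),
]
selected = most_recent_per_cluster(passages, labels=[0, 0, 1])
```

Here the two pricing passages share cluster 0, so only the 2023 one survives alongside the single passage in cluster 1.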
RAGchain.utils.compressor.llm_lingua module
- class RAGchain.utils.compressor.llm_lingua.LLMLinguaCompressor(model_name: str = 'NousResearch/Llama-2-7b-hf', device_map: str = 'cuda', model_config: dict = {}, open_api_config: dict = {}, **kwargs: Any)
Bases: Runnable[Union[PromptValue, str, Sequence[BaseMessage]], str]
Compress the given prompt using LLMLingua. It uses a small model such as Llama-2-7b to calculate the perplexity of the prompt, and then uses that information to compress the prompt and reduce token usage.
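To give some intuition for perplexity-guided compression, here is a deliberately simplified toy. It is not LLMLingua and does not call a language model: it assumes a unigram frequency model as a crude stand-in for the small model's per-token probabilities, and keeps the tokens that model finds most surprising. The function name `compress_by_surprisal` and the `keep_ratio` parameter are hypothetical.

```python
import math
from collections import Counter


def compress_by_surprisal(text: str, keep_ratio: float = 0.5) -> str:
    """Toy stand-in for perplexity-based compression (NOT LLMLingua itself).

    Scores each whitespace token by its surprisal under a unigram model built
    from the text, then keeps the highest-scoring tokens in original order.
    """
    tokens = text.split()
    if not tokens:
        return ""
    counts = Counter(token.lower() for token in tokens)
    total = len(tokens)
    # Surprisal = -log p(token); rare tokens score higher.
    surprisal = [-math.log(counts[token.lower()] / total) for token in tokens]
    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Rank by descending surprisal, breaking ties by position in the text.
    ranked = sorted(range(len(tokens)), key=lambda i: (-surprisal[i], i))
    kept_indices = sorted(ranked[:n_keep])
    return " ".join(tokens[i] for i in kept_indices)


compressed = compress_by_surprisal("the cat and the dog and the bird", keep_ratio=0.5)
```

A real perplexity-based compressor conditions each token's probability on its context using a neural model, so its choices will differ from this frequency-based toy.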
- property InputType: Type[Input]
Get the input type for this runnable.
- property OutputType: Type[Output]
The type of output this runnable produces specified as a type annotation.
- invoke(input: Input, config: RunnableConfig | None = None, **kwargs: Any) Output
Transform a single input into an output. Override to implement.
- Args:
input: The input to the runnable.
config: A config to use when invoking the runnable. The config supports standard keys like ‘tags’, ‘metadata’ for tracing purposes, ‘max_concurrency’ for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details.
- Returns:
The output of the runnable.