RAGchain.utils.compressor package

Submodules

RAGchain.utils.compressor.base module

class RAGchain.utils.compressor.base.BaseCompressor

Bases: Runnable[RetrievalResult, RetrievalResult], ABC

property InputType: Type[Input]: The type of input this runnable accepts specified as a type annotation.

property OutputType: Type[Output]: The type of output this runnable produces specified as a type annotation.

abstract compress(passages: List[Passage], **kwargs) → List[Passage]

invoke(input: Input, config: RunnableConfig | None = None) → Output

Compresses the passages in the input and returns the compressed passages. Parameters for the compression algorithm are read from config: pass them as a dict under the 'compressor_options' key inside 'configurable'. Example:

runnable.invoke(retrieval_result, config={"configurable": {"compressor_options": {"n_clusters": 3}}})

Important! The scores of the passages are removed from the result. Use this module after all retrieval and reranking steps, immediately before passing the passages to the LLM.
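The option lookup and score removal described above can be sketched in isolation. This is a minimal illustration and not RAGchain's implementation: Passage, RetrievalResult, extract_compressor_options and KeepFirstCompressor are hypothetical stand-ins defined here so the example runs without the library, and the toy compressor reuses the 'n_clusters' key only to mirror the example call.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Passage:
    """Hypothetical stand-in; the real Passage schema has more fields."""
    id: str
    content: str


@dataclass
class RetrievalResult:
    """Hypothetical stand-in for the input/output type of a compressor."""
    query: str
    passages: List[Passage]
    scores: List[float]


def extract_compressor_options(config: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return config["configurable"]["compressor_options"], or {} if absent."""
    if not config:
        return {}
    return dict(config.get("configurable", {}).get("compressor_options", {}))


class KeepFirstCompressor:
    """Toy compressor (assumption): keeps the first `n_clusters` passages."""

    def compress(self, passages: List[Passage], **kwargs: Any) -> List[Passage]:
        n = kwargs.get("n_clusters", len(passages))
        return passages[:n]

    def invoke(self, retrieval_result: RetrievalResult,
               config: Optional[Dict[str, Any]] = None) -> RetrievalResult:
        # Forward the options from the config to compress() as keyword arguments.
        options = extract_compressor_options(config)
        compressed = self.compress(retrieval_result.passages, **options)
        # The old scores no longer line up with the compressed passages, so drop them.
        return RetrievalResult(query=retrieval_result.query,
                               passages=compressed, scores=[])
```

Because the returned result carries no scores, any step that ranks or filters by score has to run before the compressor.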

RAGchain.utils.compressor.cluster_time module

class RAGchain.utils.compressor.cluster_time.ClusterTimeCompressor(semantic_cluster: SemanticClustering, split_by_sentences: bool = False)

Bases: BaseCompressor

Compress passages by semantically clustering them and then selecting the most recent passage from each cluster.

compress(passages: List[Passage], **kwargs) → List[Passage]

Parameters:

passages – list of passages to be compressed.
kwargs – kwargs for clustering algorithm.
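The selection step (keep the most recent passage of each cluster) can be illustrated as follows. This sketch assumes the cluster labels have already been computed by some clustering step; TimedPassage and most_recent_per_cluster are hypothetical names used only for this example, not part of RAGchain.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Sequence


@dataclass
class TimedPassage:
    """Hypothetical passage type carrying only the fields this sketch needs."""
    content: str
    timestamp: datetime


def most_recent_per_cluster(passages: Sequence[TimedPassage],
                            labels: Sequence[int]) -> List[TimedPassage]:
    """Return one passage per cluster label: the one with the latest timestamp."""
    if len(passages) != len(labels):
        raise ValueError("passages and labels must have the same length")
    clusters: Dict[int, List[TimedPassage]] = defaultdict(list)
    for passage, label in zip(passages, labels):
        clusters[label].append(passage)
    # Iterate in label order so the output is deterministic.
    return [max(members, key=lambda p: p.timestamp)
            for _, members in sorted(clusters.items())]
```

With this scheme the number of clusters sets the number of passages that survive compression, which is why a clustering parameter such as n_clusters acts as the compression knob.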

RAGchain.utils.compressor.llm_lingua module

class RAGchain.utils.compressor.llm_lingua.LLMLinguaCompressor(model_name: str = 'NousResearch/Llama-2-7b-hf', device_map: str = 'cuda', model_config: dict = {}, open_api_config: dict = {}, **kwargs: Any)

Bases: Runnable[Union[PromptValue, str, Sequence[BaseMessage]], str]

Compresses a given prompt using LLMLingua. It uses a small model such as Llama-2-7b to calculate the perplexity of the prompt, then uses that information to compress the prompt and reduce token usage.
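To make the role of perplexity concrete, here is a deliberately simplified sketch. It is not LLMLingua's actual algorithm, which is considerably more elaborate. It assumes per-token log-probabilities have already been obtained from a small language model, and drops the tokens the model found most predictable. The function names and the keep_ratio parameter are hypothetical.

```python
import math
from typing import List, Sequence


def perplexity(log_probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability."""
    if not log_probs:
        raise ValueError("log_probs must not be empty")
    return math.exp(-sum(log_probs) / len(log_probs))


def drop_predictable_tokens(tokens: Sequence[str],
                            log_probs: Sequence[float],
                            keep_ratio: float) -> List[str]:
    """Keep the `keep_ratio` fraction of tokens with the lowest log-probability.

    Tokens the model predicts easily (high log-probability) carry little
    information, so they are removed first. Original order is preserved.
    """
    if len(tokens) != len(log_probs):
        raise ValueError("tokens and log_probs must have the same length")
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not tokens:
        return []
    n_keep = max(1, math.ceil(len(tokens) * keep_ratio))
    # Sort indices so the most surprising tokens (lowest log-probability) come first.
    ranked = sorted(range(len(tokens)), key=lambda i: log_probs[i])
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept]
```

For example, with tokens ["the", "cat", "sat", "on", "the", "mat"], log-probabilities [-0.1, -3.0, -2.0, -0.2, -0.1, -2.5] and keep_ratio=0.5, the function keeps ["cat", "sat", "mat"].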

property InputType: Type[Input]: Get the input type for this runnable.

property OutputType: Type[Output]: The type of output this runnable produces specified as a type annotation.

invoke(input: Input, config: RunnableConfig | None = None, **kwargs: Any) → Output

Transform a single input into an output. Override to implement.

Parameters:

input – The input to the runnable.
config – A config to use when invoking the runnable. The config supports standard keys like 'tags' and 'metadata' for tracing purposes, 'max_concurrency' for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details.

Returns:

The output of the runnable.
