RAGchain.reranker.pygaggle package

Submodules

RAGchain.reranker.pygaggle.base module

class RAGchain.reranker.pygaggle.base.Query(text: str, id: str | None = None)

Bases: object

Class representing a query. A query contains the query text itself and potentially other metadata.

Parameters

text (str): The query text.
id (Optional[str]): The query id.

class RAGchain.reranker.pygaggle.base.Reranker

Bases: object

Class representing a reranker. A reranker takes a list of texts and returns a list of texts non-destructively (i.e., it does not alter the original input list of texts).

rerank(query: Query, texts: List[Text]) → List[Text]

Sorts a list of texts.

abstract rescore(query: Query, texts: List[Text]) → List[Text]

Reranks a list of texts with respect to a query.

Parameters

query (Query): The query.
texts (List[Text]): The list of texts.

Returns

List[Text]: Reranked list of texts.
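
The relationship between rescore and rerank can be illustrated with a small self-contained sketch. It does not import RAGchain or pygaggle: the Query, Text and Reranker classes below are local stand-ins that only mirror the signatures documented here, OverlapReranker is a hypothetical toy scorer, and sorting by descending score in rerank is an assumption about the base class rather than something this page states.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Query:  # local stand-in mirroring the documented signature
    text: str
    id: Optional[str] = None


@dataclass
class Text:  # local stand-in; only the fields used in this sketch
    text: str
    score: Optional[float] = 0


class Reranker:
    def rerank(self, query: Query, texts: List[Text]) -> List[Text]:
        # Assumption: rerank sorts the rescored texts by descending score.
        return sorted(self.rescore(query, texts), key=lambda t: t.score, reverse=True)

    def rescore(self, query: Query, texts: List[Text]) -> List[Text]:
        raise NotImplementedError


class OverlapReranker(Reranker):
    """Hypothetical toy scorer: counts query words that appear in the text."""

    def rescore(self, query: Query, texts: List[Text]) -> List[Text]:
        texts = deepcopy(texts)  # work on copies so the caller's list is untouched
        query_words = set(query.text.lower().split())
        for t in texts:
            t.score = float(len(query_words & set(t.text.lower().split())))
        return texts


original = [Text("cats sleep a lot"), Text("how to train a reranker model")]
ranked = OverlapReranker().rerank(Query("train reranker"), original)
```

Because rescore copies its input before assigning scores, the objects in original keep their initial scores and order, while ranked holds rescored copies in the new order.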

class RAGchain.reranker.pygaggle.base.Text(text: str, metadata: Mapping[str, Any] | None = None, score: float | None = 0, title: str | None = None)

Bases: object

Class representing a text to be reranked. A text is unspecified with respect to its length; in principle, it could be a full-length document, a paragraph-sized passage, or even a short phrase.

Parameters

text (str): The text to be reranked.
metadata (Mapping[str, Any]): Additional metadata and other annotations.
score (Optional[float]): The score of the text. For example, the score might be the BM25 score from an initial retrieval stage.
title (Optional[str]): The text’s title.
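
As a rough illustration of the parameters above, the sketch below wraps first-stage retrieval hits as Text objects and carries the BM25 score along. The Text class is a local stand-in that mirrors the documented constructor arguments, and the hits_to_texts helper, the (doc_id, body, bm25_score) tuple layout and the metadata keys are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional, Tuple


@dataclass
class Text:  # local stand-in mirroring the documented constructor arguments
    text: str
    metadata: Optional[Mapping[str, Any]] = None
    score: Optional[float] = 0
    title: Optional[str] = None


def hits_to_texts(hits: List[Tuple[str, str, float]]) -> List[Text]:
    """Wrap hypothetical (doc_id, body, bm25_score) tuples as Text objects."""
    return [
        # Keep a copy of the first-stage score in metadata, since a reranker
        # may later replace the value stored in `score`.
        Text(text=body, metadata={"docid": doc_id, "bm25": bm25}, score=bm25)
        for doc_id, body, bm25 in hits
    ]


texts = hits_to_texts([
    ("d1", "T5 is a text-to-text model.", 7.2),
    ("d2", "BM25 is a ranking function.", 3.1),
])
```

Storing the initial score in metadata as well as in score is one way to keep it available for later inspection or score fusion.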

RAGchain.reranker.pygaggle.monoT5 module

class RAGchain.reranker.pygaggle.monoT5.MonoT5Reranker(model_name: str = 'castorini/monot5-3b-msmarco-10k', use_amp: bool = False, token_false=None, token_true=None, *args, **kwargs)

Bases: BaseReranker

Reranks passages using the MonoT5 model. The model is downloaded from the Hugging Face model hub.

invoke(input: Input, config: RunnableConfig | None = None) → Output

Transform a single input into an output. Override to implement.

Parameters:

input – The input to the runnable.
config – A config to use when invoking the runnable. The config supports standard keys like ‘tags’ and ‘metadata’ for tracing purposes, ‘max_concurrency’ for controlling how much work to do in parallel, and other keys. Please refer to RunnableConfig for more details.

Returns:

The output of the runnable.

rerank(query: str, passages: List[Passage]) → List[Passage]

Reranks a list of passages based on a specific ranking algorithm.

Parameters:

passages (List[Passage]) – A list of Passage objects representing the passages to be reranked.
query (str) – The query that was used for retrieving the passages.

Returns:

The reranked list of passages.

Return type:

List[Passage]

RAGchain.reranker.pygaggle.transformer module

This code is from pygaggle. https://github.com/castorini/pygaggle/blob/master/pygaggle/rerank/transformer.py

class RAGchain.reranker.pygaggle.transformer.MonoT5(pretrained_model_name_or_path: str = 'castorini/monot5-base-msmarco-10k', model: T5ForConditionalGeneration | None = None, tokenizer: QueryDocumentBatchTokenizer | None = None, use_amp=False, token_false=None, token_true=None)

Bases: Reranker

static get_model(pretrained_model_name_or_path: str, *args, device: str | None = None, **kwargs) → T5ForConditionalGeneration

static get_prediction_tokens(pretrained_model_name_or_path: str, tokenizer, token_false, token_true)

static get_tokenizer(pretrained_model_name_or_path: str, *args, batch_size: int = 8, **kwargs) → T5BatchTokenizer

rescore(query: Query, texts: List[Text]) → List[Text]

Reranks a list of texts with respect to a query.

Parameters

query (Query): The query.
texts (List[Text]): The list of texts.

Returns

List[Text]: Reranked list of texts.
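
For orientation, the scoring idea behind monoT5-style rescoring can be sketched without loading a model. Each query and document pair is rendered into a prompt, the model produces logits for a "false" and a "true" token (compare the token_false and token_true arguments above), and the log-probability of "true" serves as the relevance score. The helper names and logit values below are hypothetical; in a real run the logits come from the T5 decoder's first output step.

```python
import math


def build_prompt(query: str, document: str) -> str:
    # Input template used by monoT5-style rerankers.
    return f"Query: {query} Document: {document} Relevant:"


def true_log_prob(logit_false: float, logit_true: float) -> float:
    """Log-softmax over the two token logits, returning the 'true' entry."""
    m = max(logit_false, logit_true)  # subtract the max for numerical stability
    log_z = m + math.log(math.exp(logit_false - m) + math.exp(logit_true - m))
    return logit_true - log_z


# Hypothetical (false, true) logits for two documents.
logits = [(2.0, 0.5), (0.1, 3.0)]
scores = [true_log_prob(f, t) for f, t in logits]
```

Scores computed this way are always less than or equal to zero, and a value closer to zero indicates a document the model considers more relevant.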

Module contents

This is the pygaggle implementation used for reranking. https://github.com/castorini/pygaggle
