RAGchain.reranker.pygaggle package

Submodules

RAGchain.reranker.pygaggle.base module

class RAGchain.reranker.pygaggle.base.Query(text: str, id: str | None = None)

Bases: object

Class representing a query. A query contains the query text itself and potentially other metadata.

Parameters

text : str

The query text.

id : Optional[str]

The query id.

class RAGchain.reranker.pygaggle.base.Reranker

Bases: object

Class representing a reranker. A reranker takes a list of texts and returns a list of texts non-destructively (i.e., it does not alter the original input list of texts).

rerank(query: Query, texts: List[Text]) → List[Text]

Sorts a list of texts.

abstract rescore(query: Query, texts: List[Text]) → List[Text]

Reranks a list of texts with respect to a query.

Parameters

query : Query

The query.

texts : List[Text]

The list of texts.

Returns

List[Text]

Reranked list of texts.
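
The contract described above (rescore assigns scores without touching the input, rerank returns the texts in sorted order) can be sketched as follows. This is a minimal, self-contained illustration and not the package's code: Query, Text and Reranker here are stand-ins that only mirror the documented signatures, TermOverlapReranker is a hypothetical toy subclass, and sorting by descending score is an assumption about how rerank orders its result.

```python
from abc import ABC, abstractmethod
from copy import deepcopy
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


# Stand-ins mirroring the documented constructor signatures (assumption:
# the real classes live in RAGchain.reranker.pygaggle.base).
@dataclass
class Query:
    text: str
    id: Optional[str] = None


@dataclass
class Text:
    text: str
    metadata: Optional[Mapping[str, Any]] = None
    score: Optional[float] = 0
    title: Optional[str] = None


class Reranker(ABC):
    def rerank(self, query: Query, texts: List[Text]) -> List[Text]:
        # Assumption: "sorts a list of texts" means highest score first.
        return sorted(self.rescore(query, texts), key=lambda t: t.score, reverse=True)

    @abstractmethod
    def rescore(self, query: Query, texts: List[Text]) -> List[Text]:
        ...


class TermOverlapReranker(Reranker):
    """Hypothetical toy reranker: score = number of shared lowercase terms."""

    def rescore(self, query: Query, texts: List[Text]) -> List[Text]:
        query_terms = set(query.text.lower().split())
        rescored = deepcopy(texts)  # work on copies so the input stays untouched
        for t in rescored:
            t.score = float(len(query_terms & set(t.text.lower().split())))
        return rescored


texts = [
    Text("Paris is the capital of France.", score=1.2),
    Text("The capital of Japan is Tokyo.", score=3.4),
    Text("Bananas are rich in potassium.", score=2.0),
]
ranked = TermOverlapReranker().rerank(Query("capital of Japan"), texts)
```

After the call, ranked holds rescored copies in their new order, while texts still carries the first-stage scores it was created with.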

class RAGchain.reranker.pygaggle.base.Text(text: str, metadata: Mapping[str, Any] | None = None, score: float | None = 0, title: str | None = None)

Bases: object

Class representing a text to be reranked. A text is unspecified with respect to its length; in principle, it could be a full-length document, a paragraph-sized passage, or even a short phrase.

Parameters

text : str

The text to be reranked.

metadata : Mapping[str, Any]

Additional metadata and other annotations.

score : Optional[float]

The score of the text. For example, the score might be the BM25 score from an initial retrieval stage.

title : Optional[str]

The text’s title.

RAGchain.reranker.pygaggle.monoT5 module

class RAGchain.reranker.pygaggle.monoT5.MonoT5Reranker(model_name: str = 'castorini/monot5-3b-msmarco-10k', use_amp: bool = False, token_false=None, token_true=None, *args, **kwargs)

Bases: BaseReranker

Reranks passages using a MonoT5 model. The model is downloaded from the Hugging Face model hub.

invoke(input: Input, config: RunnableConfig | None = None) → Output

Transform a single input into an output. Override to implement.

Args:

input: The input to the runnable.

config: A config to use when invoking the runnable.

The config supports standard keys like ‘tags’, ‘metadata’ for tracing purposes, ‘max_concurrency’ for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details.

Returns:

The output of the runnable.

rerank(query: str, passages: List[Passage]) → List[Passage]

Reranks a list of passages based on a specific ranking algorithm.

Parameters:
  • passages (List[Passage]) – A list of Passage objects representing the passages to be reranked.

  • query (str) – The query that was used for retrieving the passages.

Returns:

The reranked list of passages.

Return type:

List[Passage]
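
The shape of this interface (a plain string query and a list of passages in, the same passages back in a new order) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: Passage is a hypothetical stand-in with only id and content fields, rerank_passages and word_overlap are invented helper names, and the toy scoring function takes the place of the MonoT5 model so the example runs without downloading anything.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Passage:
    """Hypothetical stand-in; the real Passage class has its own fields."""
    id: str
    content: str


def rerank_passages(
    query: str,
    passages: List[Passage],
    score_fn: Callable[[str, str], float],
) -> List[Passage]:
    # Score every passage against the query, then return the passages
    # ordered from most to least relevant. The input list is not modified.
    scored = [(score_fn(query, p.content), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored]


def word_overlap(query: str, content: str) -> float:
    # Toy relevance signal standing in for the model's score.
    return float(len(set(query.lower().split()) & set(content.lower().split())))


passages = [
    Passage("a", "Bananas are rich in potassium"),
    Passage("b", "Japan has Tokyo as its capital city"),
    Passage("c", "Paris is the capital of France"),
]
reranked = rerank_passages("japan capital", passages, word_overlap)
reranked_ids = [p.id for p in reranked]
```

Swapping word_overlap for a model-backed scoring function would leave the calling code unchanged, which is the main point of keeping the query as a string and the passages as opaque objects.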

RAGchain.reranker.pygaggle.transformer module

This code is from pygaggle. https://github.com/castorini/pygaggle/blob/master/pygaggle/rerank/transformer.py

class RAGchain.reranker.pygaggle.transformer.MonoT5(pretrained_model_name_or_path: str = 'castorini/monot5-base-msmarco-10k', model: T5ForConditionalGeneration | None = None, tokenizer: QueryDocumentBatchTokenizer | None = None, use_amp=False, token_false=None, token_true=None)

Bases: Reranker

static get_model(pretrained_model_name_or_path: str, *args, device: str | None = None, **kwargs) → T5ForConditionalGeneration

static get_prediction_tokens(pretrained_model_name_or_path: str, tokenizer, token_false, token_true)

static get_tokenizer(pretrained_model_name_or_path: str, *args, batch_size: int = 8, **kwargs) → T5BatchTokenizer

rescore(query: Query, texts: List[Text]) → List[Text]

Reranks a list of texts with respect to a query.

Parameters

query : Query

The query.

texts : List[Text]

The list of texts.

Returns

List[Text]

Reranked list of texts.
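
This page does not say how the score is computed. As a hedged sketch, assuming the behaviour of the upstream pygaggle code this module is taken from: each query and text pair is rendered into a prompt, the model's first decoding step yields one logit for the token_false token and one for the token_true token, and the log-softmax probability of the true token is used as the relevance score. The function names below are hypothetical and the logits are made-up numbers; no model is loaded.

```python
import math
from typing import List, Tuple


def monot5_prompt(query: str, document: str) -> str:
    # Assumed input template, following the upstream monoT5 convention.
    return f"Query: {query} Document: {document} Relevant:"


def relevance_log_prob(logit_false: float, logit_true: float) -> float:
    # Two-way log-softmax over the (false, true) logits, returning the
    # log-probability of the "true" token. Subtracting the max keeps
    # the exponentials numerically stable.
    m = max(logit_false, logit_true)
    log_norm = m + math.log(math.exp(logit_false - m) + math.exp(logit_true - m))
    return logit_true - log_norm


# Made-up (logit_false, logit_true) pairs for three candidate texts.
logits: List[Tuple[float, float]] = [(2.0, -1.0), (-0.5, 1.5), (0.0, 0.0)]
scores = [relevance_log_prob(f, t) for f, t in logits]

# Indices of the texts from most to least relevant.
order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Because the scores are log-probabilities they are never positive, and a text whose two logits are equal receives log(0.5).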

Module contents

This package contains the pygaggle implementation used for reranking. https://github.com/castorini/pygaggle