RAGchain.reranker.pygaggle package
Subpackages
Submodules
RAGchain.reranker.pygaggle.base module
- class RAGchain.reranker.pygaggle.base.Query(text: str, id: str | None = None)
Bases: object
Class representing a query. A query contains the query text itself and potentially other metadata.
Parameters
- text : str
The query text.
- id : Optional[str]
The query id.
- class RAGchain.reranker.pygaggle.base.Reranker
Bases: object
Class representing a reranker. A reranker takes a list of texts and returns a reranked list of texts non-destructively (i.e., it does not alter the original input list of texts).
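The non-destructive contract described above can be sketched as follows. This is a minimal illustration, not the library's code: the `Query` and `Text` dataclasses are stand-ins that only mirror the constructor signatures documented in this module, and `overlap_rerank` is a hypothetical toy scorer (word overlap) chosen so the example runs without a model.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


@dataclass
class Query:
    """Stand-in mirroring the documented Query signature (assumption)."""
    text: str
    id: Optional[str] = None


@dataclass
class Text:
    """Stand-in mirroring the documented Text signature (assumption)."""
    text: str
    metadata: Optional[Mapping[str, Any]] = None
    score: Optional[float] = 0
    title: Optional[str] = None


def overlap_rerank(query: Query, texts: List[Text]) -> List[Text]:
    """Hypothetical toy reranker: score each text by words shared with the query."""
    query_words = set(query.text.lower().split())
    # Work on copies so the caller's list and its Text objects stay untouched.
    reranked = deepcopy(texts)
    for text in reranked:
        text.score = float(len(query_words & set(text.text.lower().split())))
    # Highest score first.
    reranked.sort(key=lambda t: t.score, reverse=True)
    return reranked


query = Query("neural reranking models")
candidates = [
    Text("Cooking pasta at home", score=2.0),  # e.g. a first-stage BM25 score
    Text("Neural models for reranking passages", score=1.0),
]
result = overlap_rerank(query, candidates)
```

After the call, `result` holds rescored copies in the new order, while `candidates` keeps its original order and first-stage scores.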
- class RAGchain.reranker.pygaggle.base.Text(text: str, metadata: Mapping[str, Any] | None = None, score: float | None = 0, title: str | None = None)
Bases: object
Class representing a text to be reranked. A text is unspecified with respect to its length; in principle, it could be a full-length document, a paragraph-sized passage, or even a short phrase.
Parameters
- text : str
The text to be reranked.
- metadata : Mapping[str, Any]
Additional metadata and other annotations.
- score : Optional[float]
The score of the text. For example, the score might be the BM25 score from an initial retrieval stage.
- title : Optional[str]
The text’s title.
RAGchain.reranker.pygaggle.monoT5 module
- class RAGchain.reranker.pygaggle.monoT5.MonoT5Reranker(model_name: str = 'castorini/monot5-3b-msmarco-10k', use_amp: bool = False, token_false=None, token_true=None, *args, **kwargs)
Bases: BaseReranker
Reranks passages using a MonoT5 model. The model is downloaded from the Hugging Face model hub.
- invoke(input: Input, config: RunnableConfig | None = None) → Output
Transform a single input into an output. Override to implement.
- Args:
input: The input to the runnable.
config: A config to use when invoking the runnable. The config supports standard keys like ‘tags’ and ‘metadata’ for tracing purposes, ‘max_concurrency’ for controlling how much work to do in parallel, and other keys. Refer to RunnableConfig for more details.
- Returns:
The output of the runnable.
RAGchain.reranker.pygaggle.transformer module
This code is from pygaggle. https://github.com/castorini/pygaggle/blob/master/pygaggle/rerank/transformer.py
- class RAGchain.reranker.pygaggle.transformer.MonoT5(pretrained_model_name_or_path: str = 'castorini/monot5-base-msmarco-10k', model: T5ForConditionalGeneration | None = None, tokenizer: QueryDocumentBatchTokenizer | None = None, use_amp=False, token_false=None, token_true=None)
Bases: Reranker
- static get_model(pretrained_model_name_or_path: str, *args, device: str | None = None, **kwargs) → T5ForConditionalGeneration
- static get_prediction_tokens(pretrained_model_name_or_path: str, tokenizer, token_false, token_true)
- static get_tokenizer(pretrained_model_name_or_path: str, *args, batch_size: int = 8, **kwargs) → T5BatchTokenizer
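The `token_false` and `token_true` parameters suggest how a monoT5-style reranker turns model output into a relevance score. The sketch below assumes the behaviour of the upstream pygaggle code as commonly described: the logits of the "false" and "true" prediction tokens are passed through a log-softmax, and the log-probability of "true" becomes the score. This page does not state that detail, so treat it as an assumption; `true_log_prob` and the sample logits are hypothetical and use plain Python rather than a real model.

```python
import math


def true_log_prob(logit_false: float, logit_true: float) -> float:
    """Log-softmax over the two logits, returning the 'true' component."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logit_false, logit_true)
    log_norm = m + math.log(math.exp(logit_false - m) + math.exp(logit_true - m))
    return logit_true - log_norm


# Hypothetical (false, true) logits for three query-passage pairs.
pairs = [(2.0, 0.5), (0.1, 3.0), (1.0, 1.0)]
scores = [true_log_prob(f, t) for f, t in pairs]

# Indices of the passages ordered from most to least relevant.
ranking = sorted(range(len(pairs)), key=lambda i: scores[i], reverse=True)
```

Because the scores are log-probabilities, they are never positive; a value closer to zero means the model leans more strongly towards "true".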
Module contents
This is the pygaggle implementation used for reranking. https://github.com/castorini/pygaggle