RAGchain.benchmark package

Subpackages

Submodules

RAGchain.benchmark.auto module

class RAGchain.benchmark.auto.AutoEvaluator(pipeline: BaseRunPipeline, questions: List[str], metrics=None)

Bases: BaseEvaluator

Evaluates metrics without ground truths. You only need to pass your questions and your pipeline. The data must already be properly ingested into the retrievals and DBs that the pipeline uses; we recommend using IngestPipeline for ingestion.

evaluate(**kwargs) EvaluateResult

Evaluate metrics and return the results.

Parameters:
    validate_passages: If True, validate that the passages in retrieval_gt are already ingested. If False, you can’t use the context_recall and KF1 metrics. We recommend setting this to True for robust evaluation.

Returns:
    EvaluateResult
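A minimal usage sketch based only on the constructor and evaluate() signatures shown above. It assumes RAGchain is installed and that the pipeline passed in is a BaseRunPipeline whose retrievals and DBs are already ingested. The helper name run_auto_evaluation is hypothetical and not part of the library.

```python
from typing import List, Optional


def run_auto_evaluation(pipeline, questions: List[str],
                        metrics: Optional[List[str]] = None):
    """Hypothetical helper: evaluate an already-ingested pipeline without ground truths."""
    # Imported lazily so this sketch can be defined even where RAGchain is not installed.
    from RAGchain.benchmark.auto import AutoEvaluator

    # metrics=None is the constructor default shown in the signature above.
    evaluator = AutoEvaluator(pipeline, questions=questions, metrics=metrics)
    # According to the signature, evaluate() returns an EvaluateResult.
    return evaluator.evaluate()
```

A call might look like run_auto_evaluation(my_pipeline, ["What is RAG?"]), where my_pipeline is a placeholder for your own ingested pipeline. Which metric names the metrics argument accepts is not stated in this section; the no-ground-truth lists in RAGchain.benchmark.base (answer_relevancy, faithfulness, context_precision) are the likely candidates.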

RAGchain.benchmark.base module

class RAGchain.benchmark.base.BaseEvaluator(run_all: bool = True, metrics: List[str] | None = None)

Bases: ABC

answer_gt_metrics = ['BLEU', 'METEOR', 'ROUGE', 'EM_answer']
answer_no_gt_ragas_metrics = ['answer_relevancy', 'faithfulness']
answer_passage_metrics = ['KF1']
abstract evaluate(validate_passages: bool = True) EvaluateResult

Evaluate metrics and return the results.

Parameters:
    validate_passages: If True, validate that the passages in retrieval_gt are already ingested. If False, you can’t use the context_recall and KF1 metrics. We recommend setting this to True for robust evaluation.

Returns:
    EvaluateResult
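To make the answer metric names listed above more concrete, here is a rough sketch of two of them: an exact-match check in the spirit of EM_answer, and a token-overlap F1 in the spirit of KF1, which is listed as an answer-passage metric. The lowercasing, the whitespace tokenisation, and the function names are assumptions made for illustration; RAGchain's own implementations may normalise text differently.

```python
from collections import Counter
from typing import List


def _tokens(text: str) -> List[str]:
    # Assumed normalisation: lowercase, then split on whitespace.
    return text.lower().split()


def exact_match(prediction: str, ground_truths: List[str]) -> float:
    """Return 1.0 if the normalised prediction equals any normalised ground truth."""
    pred = _tokens(prediction)
    return float(any(pred == _tokens(gt) for gt in ground_truths))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall between two strings."""
    pred, ref = _tokens(prediction), _tokens(reference)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For a KF1-style score, the reference string would be the text of a ground-truth passage rather than a ground-truth answer.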

retrieval_gt_metrics = ['Hole', 'TopK_Accuracy', 'EM_retrieval', 'F1_score', 'Recall', 'Precision']
retrieval_gt_metrics_rank_aware = ['AP', 'NDCG', 'CG', 'Ind_DCG', 'DCG', 'Ind_IDCG', 'IDCG', 'RR']
retrieval_gt_ragas_metrics = ['context_recall']
retrieval_no_gt_ragas_metrics = ['context_precision']
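
The retrieval metric names above fall into set-based ones (Precision, Recall, F1_score) and rank-aware ones (RR, DCG, NDCG). The sketch below shows textbook definitions with binary relevance. It is not RAGchain's code: the function names are made up for illustration, and the library may differ in details such as the logarithm base or the handling of duplicates. Ids are converted to strings first because they may be either str or UUID.

```python
import math
from typing import List, Sequence, Tuple, Union
from uuid import UUID

Id = Union[str, UUID]


def to_str_ids(ids: Sequence[Id]) -> List[str]:
    # Normalise mixed str/UUID ids so that they compare equal as strings.
    return [str(i) for i in ids]


def precision_recall_f1(retrieved: Sequence[Id],
                        relevant: Sequence[Id]) -> Tuple[float, float, float]:
    ret, rel = set(to_str_ids(retrieved)), set(to_str_ids(relevant))
    hits = len(ret & rel)
    precision = hits / len(ret) if ret else 0.0
    recall = hits / len(rel) if rel else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def reciprocal_rank(retrieved: Sequence[Id], relevant: Sequence[Id]) -> float:
    rel = set(to_str_ids(relevant))
    # 1 / rank of the first relevant passage, or 0.0 if none was retrieved.
    for rank, pid in enumerate(to_str_ids(retrieved), start=1):
        if pid in rel:
            return 1.0 / rank
    return 0.0


def ndcg(retrieved: Sequence[Id], relevant: Sequence[Id]) -> float:
    rel = set(to_str_ids(relevant))
    gains = [1.0 if pid in rel else 0.0 for pid in to_str_ids(retrieved)]
    # DCG discounts each gain by log2(position + 1), with positions starting at 1.
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    # Ideal DCG: all relevant passages ranked first, capped at the list length.
    ideal_hits = min(len(rel), len(gains))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0
```
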
static uuid_to_str(id_list: List[str | UUID]) List[str]

class RAGchain.benchmark.base.DummyRetrieval

Bases: BaseRetrieval

delete(passages: List[Passage])

Delete passages, identified by their ids, from the vector representation of passages.

ingest(passages: List[Passage])

Ingest passages into the vector representation of passages.

retrieve(query: str, top_k: int = 5, *args, **kwargs) List[Passage]

Retrieve passages from the ingested vector representation of passages.

retrieve_id(query: str, top_k: int = 5, *args, **kwargs) List[str | UUID]

Retrieve passage ids from the ingested vector representation of passages.

retrieve_id_with_scores(query: str, top_k: int = 5, *args, **kwargs) tuple[List[str | UUID], List[float]]

Retrieve passage ids and their similarity scores from the ingested vector representation of passages.
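As a rough illustration of the five-method retrieval interface listed above, the sketch below implements the same method names on a tiny in-memory store. It is not DummyRetrieval or BaseRetrieval: the Passage dataclass is a stand-in with only id and content fields, the class name ToyRetrieval is hypothetical, and the word-overlap score is an assumed placeholder for a real similarity function.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union
from uuid import UUID

Id = Union[str, UUID]


@dataclass
class Passage:  # stand-in for illustration, not RAGchain's Passage schema
    id: Id
    content: str


class ToyRetrieval:
    def __init__(self) -> None:
        # Passages keyed by the string form of their id.
        self._store: Dict[str, Passage] = {}

    def ingest(self, passages: List[Passage]) -> None:
        for p in passages:
            self._store[str(p.id)] = p

    def delete(self, passages: List[Passage]) -> None:
        for p in passages:
            self._store.pop(str(p.id), None)

    def retrieve_id_with_scores(self, query: str,
                                top_k: int = 5) -> Tuple[List[Id], List[float]]:
        q = set(query.lower().split())
        # Placeholder score: fraction of query words that occur in the passage.
        scored = [
            (len(q & set(p.content.lower().split())) / len(q) if q else 0.0, p.id)
            for p in self._store.values()
        ]
        ranked = sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]
        return [pid for _, pid in ranked], [score for score, _ in ranked]

    def retrieve_id(self, query: str, top_k: int = 5) -> List[Id]:
        return self.retrieve_id_with_scores(query, top_k)[0]

    def retrieve(self, query: str, top_k: int = 5) -> List[Passage]:
        return [self._store[str(pid)] for pid in self.retrieve_id(query, top_k)]
```

Building retrieve_id and retrieve on top of retrieve_id_with_scores keeps the ranking logic in one place, so all three methods return results in the same order.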

Module contents