RAGchain.benchmark package
Subpackages
- RAGchain.benchmark.answer package
- RAGchain.benchmark.dataset package
- Submodules
- RAGchain.benchmark.dataset.antique module
- RAGchain.benchmark.dataset.asqa module
- RAGchain.benchmark.dataset.base module
- RAGchain.benchmark.dataset.dstc11_track5 module
- RAGchain.benchmark.dataset.eli5 module
- RAGchain.benchmark.dataset.ko_strategy_qa module
- RAGchain.benchmark.dataset.mr_tydi module
- RAGchain.benchmark.dataset.msmarco module
- RAGchain.benchmark.dataset.natural_question module
- RAGchain.benchmark.dataset.nfcorpus module
- RAGchain.benchmark.dataset.qasper module
- RAGchain.benchmark.dataset.search_qa module
- RAGchain.benchmark.dataset.strategy_qa module
- RAGchain.benchmark.dataset.triviaqa module
- Module contents
- RAGchain.benchmark.extra package
- RAGchain.benchmark.retrieval package
Submodules
RAGchain.benchmark.auto module
- class RAGchain.benchmark.auto.AutoEvaluator(pipeline: BaseRunPipeline, questions: List[str], metrics=None)
Bases: BaseEvaluator
Evaluate metrics without ground truths. You only need to pass your questions and your pipeline. The passages must already be properly ingested into your retrievals and DBs; we recommend using IngestPipeline for ingestion.
- evaluate(**kwargs) → EvaluateResult
Evaluate metrics and return the results.
:param validate_passages: If True, validate that the passages in retrieval_gt are already ingested. If False, the context_recall and KF1 metrics cannot be used. We recommend setting this to True for robust evaluation.
:return: EvaluateResult
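Below is a minimal usage sketch. The pipeline construction and ingestion steps are assumptions for illustration: any concrete BaseRunPipeline subclass will do, as long as its retrievals and DBs are ingested before evaluation.

```python
from RAGchain.benchmark.auto import AutoEvaluator

# Assumption: `pipeline` is a BaseRunPipeline you have already built, whose
# retrievals and DBs were ingested beforehand (e.g. via IngestPipeline).
pipeline = ...  # your BaseRunPipeline instance

questions = [
    "What is RAGchain?",
    "How do I ingest passages before evaluation?",
]

evaluator = AutoEvaluator(pipeline, questions)  # metrics=None runs the default no-ground-truth metrics
result = evaluator.evaluate()                   # returns an EvaluateResult

# EvaluateResult holds the computed scores; inspect it to read the results.
print(result)
```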
RAGchain.benchmark.base module
- class RAGchain.benchmark.base.BaseEvaluator(run_all: bool = True, metrics: List[str] | None = None)
Bases: ABC
- answer_gt_metrics = ['BLEU', 'METEOR', 'ROUGE', 'EM_answer']
- answer_no_gt_ragas_metrics = ['answer_relevancy', 'faithfulness']
- answer_passage_metrics = ['KF1']
- abstract evaluate(validate_passages: bool = True) → EvaluateResult
Evaluate metrics and return the results.
:param validate_passages: If True, validate that the passages in retrieval_gt are already ingested. If False, the context_recall and KF1 metrics cannot be used. We recommend setting this to True for robust evaluation.
:return: EvaluateResult
- retrieval_gt_metrics = ['Hole', 'TopK_Accuracy', 'EM_retrieval', 'F1_score', 'Recall', 'Precision']
- retrieval_gt_metrics_rank_aware = ['AP', 'NDCG', 'CG', 'Ind_DCG', 'DCG', 'Ind_IDCG', 'IDCG', 'RR']
- retrieval_gt_ragas_metrics = ['context_recall']
- retrieval_no_gt_ragas_metrics = ['context_precision']
- static uuid_to_str(id_list: List[str | UUID]) → List[str]
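The class attributes above group metric names by the kind of ground truth they require, and uuid_to_str normalizes mixed id lists to plain strings. A small sketch, assuming only the signatures shown above (the comment about run_all=False is an assumption inferred from the constructor signature):

```python
from uuid import uuid4

from RAGchain.benchmark.base import BaseEvaluator

# uuid_to_str accepts a mixed list of str and UUID ids and returns strings.
mixed_ids = [uuid4(), "already-a-string-id", uuid4()]
str_ids = BaseEvaluator.uuid_to_str(mixed_ids)
assert all(isinstance(i, str) for i in str_ids)

# The metric-name groups are plain class attributes, so they can be used to
# build the `metrics` list passed to a concrete evaluator
# (presumably with run_all=False to restrict evaluation to that list).
selected = BaseEvaluator.retrieval_gt_metrics + BaseEvaluator.answer_gt_metrics
print(selected)
```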
- class RAGchain.benchmark.base.DummyRetrieval
Bases: BaseRetrieval
- retrieve(query: str, top_k: int = 5, *args, **kwargs) → List[Passage]
Retrieve passages from the ingested vector representations of passages.
- retrieve_id(query: str, top_k: int = 5, *args, **kwargs) → List[str | UUID]
Retrieve passage ids from the ingested vector representations of passages.
- retrieve_id_with_scores(query: str, top_k: int = 5, *args, **kwargs) → tuple[List[str | UUID], List[float]]
Retrieve passage ids and similarity scores from the ingested vector representations of passages.
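DummyRetrieval follows the standard BaseRetrieval interface. The sketch below only illustrates how the three retrieve variants are called; that DummyRetrieval itself returns meaningful results (it appears to be a placeholder used internally by the evaluators) is an assumption you should verify against your setup.

```python
from RAGchain.benchmark.base import DummyRetrieval

retrieval = DummyRetrieval()

# All three variants share the same (query, top_k) calling convention.
passages = retrieval.retrieve("what is RAG?", top_k=5)        # List[Passage]
ids = retrieval.retrieve_id("what is RAG?", top_k=5)          # List[str | UUID]

# retrieve_id_with_scores returns an (ids, scores) tuple per the signature above.
ids_with_scores, scores = retrieval.retrieve_id_with_scores("what is RAG?", top_k=5)
```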