WHAT IS THIS BLOG ABOUT?
This blog on conducting systematic literature searches for clinical evaluation and post-market clinical follow-up (PMCF) activities under the European Union (EU) Medical Devices Regulation (MDR) is written by experts, but it is not intended exclusively for experts. It is aimed equally at beginners and at professionals who have been working in the field for some time.
Under the MDR, a literature search is no longer a simple document-gathering exercise, but a structured and reproducible process designed to support clinical evaluation. A systematic literature review is a structured approach used to identify, collect, and synthesise published evidence in an objective manner while minimising selection bias. A robust search strategy should capture evidence on the current state of the art (SOTA) for the relevant medical condition, comparable or similar devices to define acceptable limits of safety and performance, and the subject device itself to support clinical claims.
To ensure transparency and reproducibility, research questions should be clearly defined—typically using a PICO framework—and translated into database-specific search queries with predefined limits, such as article type or language. Search results should then be screened in a two-step process: first at the title and abstract level, and subsequently at the full-text level,[1] using predefined inclusion and exclusion criteria. Importantly, all decisions and justifications should be documented, and the overall screening outcome should be summarised using a PRISMA flow diagram that helps reinforce traceability.
Put simply, a systematic literature search builds trust with Notified Bodies by making the process transparent, reproducible, and traceable. This blog explains how to achieve this in practice, drawing on real examples and feedback received from Notified Bodies across multiple submissions.
[1] While modern AI-driven tools have led some to question the necessity of a tiered screening method, the traditional two-step approach remains a practical necessity in the regulatory landscape. High-level AI tools might theoretically allow for a more direct screening of content, but they do not eliminate the financial barriers associated with literature procurement. From a manufacturer’s perspective, it is economically impractical to purchase the full-text licenses for every initial search result before its relevance has been confirmed. By maintaining a distinct separation between the initial review and the full-text analysis, companies can manage their budgets effectively while still meeting the high evidence-based standards required for a Technical File.
WHAT HAS CHANGED WITH RESPECT TO REQUIREMENTS FROM THE MDD TO MDR?
One of the most impactful regulatory changes in the transition from the Medical Devices Directive (MDD 93/42/EEC) to the Medical Device Regulation (EU 2017/745) concerns the strengthened expectations around clinical evidence.
Under the MDD, many devices, especially those with a long market history, could effectively rely on limited data, expert opinion, or basic claims of equivalence. However, this shift toward stronger clinical evidence did not occur overnight. It began under the MDD with the publication of MEDDEV 2.7/1 rev.4, which significantly strengthened expectations around clinical evaluation and documentation.
The MDR has now brought this transition to its full conclusion, introducing clearer and more stringent requirements for clinical evidence to be continuously generated, actively maintained, and systematically evaluated throughout the device lifecycle. It also formalised expectations regarding the level and quality of evidence, as reflected in guidance such as MDCG 2020-6, which outlines a structured approach to clinical evidence appraisal.
In other words, the shift has been from reliance on “anecdotal evidence” under the MDD to the requirement for systematic clinical data under the MDR.
DEFINING "STATE OF THE ART" (SOTA)
Clinical evaluation is not limited to assessing the device under evaluation (DUE) alone. An essential early step is defining the SOTA for the device. Establishing the SOTA helps place the device within its broader medical context and sets the foundation for meaningful clinical evaluation.
Defining the SOTA involves outlining the current medical landscape, including the standard of care, comparable or similar devices, relevant clinical guidelines, and established clinical benchmarks. It also supports an understanding of how the device’s benefits and risks should be assessed within existing clinical practice.
Importantly, defining the SOTA at this stage does not mean drafting the full SOTA section. Rather, it involves creating a structured outline that clarifies where the device fits within the broader framework of available treatments and technologies. This initial outline of the SOTA directly informs the development of a focused and robust literature search protocol, as discussed in the next section. The outline can later be refined and aligned with the literature actually retrieved for the defined PICO.
THE LITERATURE SEARCH PROTOCOL (LSP)
The literature search protocol (LSP) must be written before the searches are performed. It is the backbone of the searches and ensures they are conducted systematically. The goal is to capture both published and unpublished data (see Section Future Perspectives).
To achieve this, the LSP should clearly define several key elements upfront. These include the period of evaluation covered by the search, the databases to be consulted, and the PICO framework used to structure the research questions, as explained below. The protocol should also describe the overall search strategy and document the specific literature search queries to be applied. Clearly outlining these components in advance helps ensure consistency, transparency, and reproducibility throughout the literature search process.
Challenge:
The key challenge is finding the right balance between completeness and specificity when defining the PICO, so that the literature search remains both efficient and clinically meaningful.
DEFINE THE PERIOD OF EVALUATION
Defining the period of evaluation depends on several factors and must be clearly justified as part of the clinical evaluation planning process. The period of evaluation is primarily guided by regulatory constraints, such as device classification (e.g., class III devices), and by other activities such as PMS or PMCF schedules. It is good practice to align the literature search update frequency with PMS requirements unless specific concerns are identified, for example, emerging safety signals, limited clinical evidence, breakthrough or novel technology, orphan indications, significant design changes, expanded indications, or an evolving state of the art.
PRACTICAL TIP: As a general rule:
- Class III and Class IIb: at least annually
- Class IIa: at least every two years
- Class I: up to every three years may be acceptable
More frequent searches may be justified where uncertainty is higher (e.g., new devices, innovative mechanisms of action, small patient populations, or limited pre-market data).
Other relevant factors include whether the device is a new device under the MDR or a legacy device transitioning from the MDD, whether the evaluation is part of a routine update of the clinical evaluation, and whether a previous evaluation period was defined and, if so, the timeframe and evidence base it covered. These factors influence how the clinical evidence is identified, appraised, and updated to ensure continued compliance with MDR requirements.
In general, literature related to the state of the art (SOTA) and similar devices should be selected over a justified time window that is sufficiently broad to capture relevant clinical practice, safety outcomes, and performance data, while remaining representative of current medical knowledge. The selected evaluation period should reflect:
· the maturity and evolution of the technology,
· the availability of published evidence, and
· the intended purpose of the device.
For newly marketed devices, the literature search should ensure inclusion of all relevant evidence available since market introduction. For legacy devices, the evaluation period may build upon the scope and quality of previous clinical evaluations, while ensuring that additional literature searches are conducted to support MDR conformity. Even where the Article 61(10) route is applied, which allows for clinical evaluation based on non-clinical data when the demonstration of conformity based on clinical data is deemed inappropriate due to the device’s technical nature or low risk,[1] systematic literature searches remain necessary to support the clinical evaluation, confirm alignment with the current SOTA, and demonstrate the continued validity of the benefit–risk determination.
IMPORTANT:
The end date of the period of evaluation should be no older than 90 days at the time of submission; an older cut-off can be questioned during review.
Simply put, the manufacturer defines the period of evaluation and provides an appropriate scientific and regulatory justification for that choice.
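As an illustration, the 90-day rule above can be expressed as a simple date check. This is a minimal sketch; the function name and the configurable threshold are our own additions, not part of any regulation:

```python
from datetime import date, timedelta


def search_is_current(last_search_date: date, today: date = None,
                      max_age_days: int = 90) -> bool:
    """Return True if the end of the evaluation period is no older than
    max_age_days (90 days is the commonly questioned cut-off)."""
    today = today or date.today()
    return (today - last_search_date) <= timedelta(days=max_age_days)


# A search window ending 2025-01-10, reviewed on 2025-03-01 (50 days old):
print(search_is_current(date(2025, 1, 10), today=date(2025, 3, 1)))  # True
```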
[1] According to MDCG 2020-13 and Article 61(10), this route is generally reserved for:
· Simple/Low-Risk Devices: Items like tongue depressors, dental treatment units, or wheelchairs.
· No Direct Clinical Claim: Devices that don’t make a specific therapeutic claim but provide a technical benefit (e.g., medical device accessories).
· Ethical Constraints: Where it would be unethical to conduct a clinical trial because the outcome is already well-known through technical verification.
IDENTIFY THE SEARCH DATABASES
Identify the databases that will be used to search for published literature, clinical trials, and vigilance or recall information. Limiting searches to PubMed alone is not sufficient.
A comprehensive and defensible literature search strategy under the MDR should include multiple sources to ensure that relevant scientific evidence, ongoing or completed clinical investigations, and post-market safety information are adequately captured.
Examples of databases for clinical literature, clinical trials, and vigilance/recall data are given below:
Literature:
- Embase
- PubMed
- Cochrane Library
- Google Scholar
- NICE
- Others, if any

Clinical trials:
- EU Clinical Trials Register
- ClinicalTrials.gov
- Others, if any

Vigilance/recall data:
- FDA MAUDE
- FDA Medical Device Recalls
- FDA TPLC
- BfArM Field Corrective Actions
- MHRA alerts, recalls and safety information: drugs and medical devices
- Swissmedic FSCA and recalls
- Database of Adverse Event Notifications (DAEN) – medical devices
- Database for Recalls, Product Alerts, and Product Corrections (DRAC) (earlier known as System for Australian Recall Actions, SARA)
- EUDAMED
- Others, if any
It is important to note that the draft LSP can undergo many changes during search implementation, based on the type of results obtained. This is explained further below.
DEFINE PICO
PICO stands for Population, Intervention, Comparator, Outcome. Just like DNA defines the blueprint of an organism, PICO defines the structure of your literature search. If your PICO is weak, your search results will be either too broad (thousands of irrelevant papers) or too narrow (missing key safety data).
Separate PICO frameworks should ideally be developed for SOTA, similar device, and device-under-evaluation (DUE) searches, as each search serves a distinct objective and requires tailored search terms and scope.
Element | Description | Examples |
P – Patient / Population | The specific group of patients, disease, or medical condition your device targets. | Define indication(s), age, gender, comorbidities, and the specific stage of the disease, as applicable. |
I – Intervention | The specific medical device or procedure you are evaluating. | Device Details: Use the generic name, trade name, manufacturer name State of the Art (SOTA): Use terms specific to the technology (e.g., “drug-eluting stent”). |
C – Comparator | Similar devices, alternative treatments, or the current standard of care. | State of the Art (SOTA): This is where you identify similar devices (i.e., competitor/benchmark devices) or alternative therapies (e.g., surgery vs. medication). |
O – Outcome | The clinical endpoints used to measure safety and performance. | Focus on measurable outcomes like “mortality rate,” “reduction in pain (VAS score),” or “complication rates.” This criterion can be particularly useful for established technologies where outcome parameters are clear and well-defined, as it helps avoid background noise. |
The search strategy must be informed not only by intended purpose and claims, but also by identified risks from the risk management file and findings from PMS/vigilance data.
How may a PICO be used in the literature search?
There are generally two approaches to using PICO in the literature search process.
1) Using PICO as Exclusion Criteria
In this approach, PICO elements are used directly as screening criteria. Articles that do not meet the predefined PICO criteria are excluded, with justification documented accordingly.
For this method to be defensible, each PICO element must be sufficiently detailed to allow objective inclusion or exclusion decisions. However, in LEXQARA’s opinion, this strategy alone may not be sufficiently robust. Important exclusion criteria, such as study design, minimum sample size, publication type, language restrictions, or duplicate records, are not inherently part of the PICO framework. Therefore, relying solely on PICO for screening may result in inconsistencies or incomplete justification.
2) Using PICO to Justify the Search Strategy (Recommended Approach)
In this approach, PICO is used to structure and justify the search queries rather than to function as the sole exclusion tool. PICO provides a recognised and clinically meaningful framework to define the device in its medical context. It supports the logical combination of keywords and ensures that the search is comprehensive and aligned with the objective.
The selected keywords derived from the PICO can then be applied consistently across all databases, strengthening transparency and reproducibility.
DEFINE THE STRATEGY: COMBINING PICO TERMS
The search strategy should be clearly aligned with the predefined PICO elements and structured using appropriate combinations of terms and Boolean operators (AND, OR, NOT). Defining the right combinations is critical to ensuring that the search is both comprehensive and focused. Each combination should also be accompanied by a brief rationale explaining why it was selected.
For example:
- ‘P’ Population AND ‘I’ Intervention (e.g. Indication AND “automatic wheelchair”)
- ‘P’ Population AND ‘C’ Comparator (e.g. Indication AND “manual wheelchair”)
- ‘P’ Population AND ‘I’ Intervention AND ‘O’ Outcome, where relevant
- ‘I’ Intervention AND ‘O’ Outcome, depending on the research objective
The selected combinations depend on the objective of the search. SOTA searches typically combine Population terms with broader intervention and guideline-related terms to define the current medical landscape and accepted benchmarks. Similar-device searches focus more specifically on comparable technologies and performance ranges, while subject-device searches prioritise combinations that directly capture safety and performance data for the device under evaluation.
Clearly documenting these combinations and the reasoning behind them strengthens transparency and supports reproducibility during audits or Notified Body reviews.
COMMON MISTAKE:
A frequent error is relying on a single broad search string without systematically testing alternative combinations. This can either generate an overwhelming number of irrelevant results or inadvertently exclude critical safety or performance data.
This is why the draft of the LSP may undergo several refinements during the implementation of the searches. Adjustments are often necessary based on the type and volume of results generated by each search query or string.
Note: Do not let “O” terms limit your safety capture; defined endpoints may, however, be used when evaluating performance claims.
Example – Automatic Wheelchair as Device Under Evaluation
Defined PICO
- Population: Patients with mobility impairment (e.g., spinal cord injury, neuromuscular disorders, or severe mobility limitation)
- Intervention: Automatic wheelchair manufactured by the manufacturer
- Comparator: Manual wheelchair or other powered wheelchairs
- Outcome: Mobility improvement, user independence, safety incidents (e.g., falls or collisions), usability, and quality of life
Justified Keyword Combination examples:
1. Population AND Intervention
(“mobility impairment” OR “spinal cord injury” OR “mobility limitation”) AND (“automatic wheelchair” OR “powered wheelchair” OR “electric wheelchair” OR (wheelchair AND (automatic OR powered OR electric)))
Justification: Ensures capture of studies evaluating the DUE within its intended patient population and clinical context.
2. Population AND Comparator
(“mobility impairment” OR “spinal cord injury”) AND (“manual wheelchair” OR “powered wheelchair comparator ABC”)
Justification: Captures SOTA and comparative literature on mobility assistive technologies, helping to establish acceptable safety and performance benchmarks.
3. Intervention AND Outcomes
(“automatic wheelchair” OR “powered wheelchair”) AND (“mobility improvement” OR “user independence” OR “safety” OR “quality of life”)
Justification: Ensures retrieval of studies reporting clinically relevant safety, usability, and performance outcomes associated with powered mobility devices.
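Keyword combinations like those above can be generated consistently across databases. The sketch below is a hypothetical helper (not an official tool): terms within each PICO group are OR'ed, and the groups are AND'ed together, mirroring the combinations in the example:

```python
def build_query(*term_groups: list) -> str:
    """Combine PICO term groups into one Boolean query string:
    terms within a group are OR'ed, groups are AND'ed."""
    ors = ["(" + " OR ".join(f'"{t}"' for t in group) + ")"
           for group in term_groups]
    return " AND ".join(ors)


population = ["mobility impairment", "spinal cord injury", "mobility limitation"]
intervention = ["automatic wheelchair", "powered wheelchair", "electric wheelchair"]

# Population AND Intervention (combination 1 above):
print(build_query(population, intervention))
```

Keeping term lists in one place like this means the same strategy can be re-emitted for each database, which supports the reproducibility discussed throughout this section.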
DEFINE INCLUSION/EXCLUSION CRITERIA
Defining the Inclusion and Exclusion (I/E) criteria is the process of setting the “rules of entry” for your literature search. Inclusion and exclusion criteria must be predefined, objective, and directly aligned with the search objective, intended purpose, identified risks, and defined safety and performance endpoints. Criteria should be sufficiently specific to ensure relevance, yet not so restrictive that clinically important safety signals are inadvertently excluded. All screening decisions must be documented, and exclusions must be supported by a clear rationale consistent with the protocol.
The Inclusion/Exclusion criteria should reflect the PICO elements.
· Inclusion Criteria: Define what a study must have to be considered. Because the search queries are constructed based on the PICO framework, they should already capture the essential characteristics required for inclusion. However, it may be useful to further stratify included articles for analysis. For example, studies may be categorised based on whether they relate to the subject device, an equivalent device, a similar device, therapeutic alternatives, or off-label uses. Such stratification helps clarify the reason for inclusion and supports more structured data extraction and interpretation during the clinical evaluation.
· Exclusion Criteria: Define the rules that disqualify a study, even if it initially seems relevant. When an exclusion criterion is met, the article should be excluded and the justification documented. It is therefore important to define these criteria carefully so that all non-relevant articles can be consistently filtered out. While some exclusion criteria may also be derived from the PICO elements, they often extend beyond them. For instance, additional criteria may relate to study design, publication type, population size, or methodological quality. In cases where the search yields an excessive number of results, exclusion criteria can also help narrow the evidence base by prioritising higher-quality studies (e.g., meta-analyses, systematic reviews, or studies with a minimum number of patients) while still maintaining transparency and methodological consistency.
For example, see the table below:
Criterion | Description | Exclusion (Exclude if…) |
Population | Individuals with permanent mobility impairments (e.g., spinal cord injury, MS, ALS). | Temporary injuries (e.g., broken leg) or pediatric patients (if device is for adults only). |
Intervention | Automatic wheelchairs. | Manual wheelchairs, mobility scooters, etc. |
Comparison | Standard power wheelchairs, manual mobility, or previous versions of automated tech. | Studies comparing the device to non-mobility interventions (e.g., physical therapy alone). |
Outcomes | – | Studies only measuring the “aesthetic appeal” or “market price” of the chair. |
Study Type | Randomized controlled trials (RCTs), observational studies, or systematic reviews. | Bench testing (mechanical stress tests) or purely anecdotal “customer testimonials.” |
Language | – | All other languages without a certified translation. |
Publication Date | – | Studies published before 2016 (considered “outdated” for automated electronic systems). |
Duplicate | – | Studies already captured in the searches. |
Tip: The “Why” Column
It is recommended to add a “Justification” or “Why” column to the internal protocol. For example, if an auditor asks, “Why did you exclude studies older than 10 years?” the protocol should already have the answer: “Because the technology in this field changed significantly in 2015, making older data obsolete.”
Important:
You cannot exclude a study simply because it shows poor results for your device. If a study meets your Inclusion criteria but has negative results, you must include it and then explain it in your “Appraisal and Analysis” section. Excluding negative data is the #1 way to get a “Major Non-Conformance.”
METHOD TO REMOVE DUPLICATION OF ARTICLES
The method for identifying and removing duplicate records should also be clearly defined in the LSP. This includes specifying the software or tool used for deduplication and describing the criteria applied to determine whether records are considered duplicates. Documenting this step strengthens transparency and ensures that the screening process remains systematic and reproducible.
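As an illustration of one possible deduplication rule (an assumption on our part, not a prescribed method): prefer the DOI as the duplicate key and fall back to a normalised title, so that punctuation or casing differences between database exports do not hide duplicates:

```python
import re


def dedup_key(record: dict) -> str:
    """Duplicate key: lower-cased DOI if present, otherwise the title
    stripped to lower-case alphanumerics."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return doi
    return re.sub(r"[^a-z0-9]", "", (record.get("title") or "").lower())


def deduplicate(records: list) -> list:
    """Keep the first occurrence of each record, preserving search order."""
    seen, unique = set(), []
    for rec in records:
        key = dedup_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Whatever rule is chosen, the LSP should state it explicitly, including the tool used to apply it.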
DRAFT LITERATURE SEARCH QUERIES
The literature search queries should align with the PICO term combinations defined and justified in the previous section.
While search queries will naturally vary across databases due to differences in structure and functionality, the core strategy must remain consistent. Search limitations, such as article type, language, species, or publication period, may differ between databases or search type (literature or vigilance/recall) or search objective (SOTA, similar device, or DUE), but the underlying logic should continue to reflect the predefined PICO framework and overall search methodology to ensure coherence, transparency, and reproducibility across all sources searched.
Search limitations can be useful to narrow the results when necessary, depending on the maturity of the technology and the volume of available data. For well-established technologies with a large body of literature, additional limitations such as study type, publication date, or minimum sample size may be applied to focus on the most relevant and higher-quality evidence. Conversely, when limited data are available, for example for newer or niche technologies, fewer restrictions should be applied to ensure that all potentially relevant evidence is captured.
Queries should also be documented in as much detail as possible. In practice, insufficiently detailed queries are a common issue and can compromise reproducibility, for example when the search needs to be repeated during a Notified Body review one year later. In such cases, the inability to reproduce the search strategy may result in a direct non-compliance.
GOOD PRACTICE:
It is good practice to define search limitations clearly and to document the rationale behind each decision made when drafting the search strategy. For instance, the decision to restrict the search to English-language publications should be justified.
The table below provides practical examples of how search queries can differ in quality and level of detail across commonly used databases (note: only a few databases were chosen as examples).
By comparing incomplete queries with fully structured, reproducible ones, it shows how proper documentation of Boolean operators, filters, date ranges, and search fields improves transparency and reproducibility. These examples are intended to highlight good practice rather than serve as exhaustive templates, demonstrating how consistency with the predefined search strategy can be maintained across different platforms.
The examples below are organised per database, giving for each: a tip to maintain consistency with the predefined strategy and to ensure the reproducibility of the literature search process, an example of an incomplete query, and an example of a complete, reproducible query.

Published literature – PubMed
Tip: Documenting complete and detailed search queries, including all applied search limitations, directly within the query itself supports rapid reproducibility during reviews and future updates, such as annual clinical evaluation updates.
- Incomplete query: (wheelchair AND automatic)
- Complete query: ((“wheelchair s”[All Fields] OR “wheelchairs”[MeSH Terms] OR “wheelchairs”[All Fields] OR “wheelchair”[All Fields]) AND (“automatable”[All Fields] OR “automatic”[All Fields] OR “automatical”[All Fields] OR “automatically”[All Fields] OR “automaticities”[All Fields] OR “automaticity”[All Fields] OR “automatics”[All Fields] OR “automatism”[MeSH Terms] OR “automatism”[All Fields] OR “automatisms”[All Fields] OR “automatization”[All Fields] OR “automatize”[All Fields] OR “automatized”[All Fields] OR “automatizes”[All Fields] OR “automatizing”[All Fields]) AND ((clinicaltrial[Filter] OR guideline[Filter] OR meta-analysis[Filter] OR practiceguideline[Filter] OR systematicreview[Filter]) AND (humans[Filter]) AND (2021/1/1:2025/12/31[pdat]) AND (english[Filter])))

Published literature – Google Scholar
Tip: When using Google Scholar, the Advanced search feature should be applied, with clear documentation of which search fields were used, the exact terms entered, and which fields were intentionally left blank. Also indicate the language, article type, and whether citations were included. Note: as Google Scholar does not allow precise date range selection, particular attention should be paid during the screening phase to exclude publications that fall outside the defined search period.
- Incomplete query: “wheelchair” AND “automatic”; Return articles dated between: 2021–2025; Any type: activated
- Complete query (search fields that were kept empty are not listed): With all of the words: “wheelchair” AND “automatic”; Where my words occur: anywhere in the article; Return articles dated between: 2021–2025; Any type: activated; Include citations: activated

Clinical trials – ClinicalTrials.gov
Tip: Date range options on ClinicalTrials.gov are: Study start, Primary completion, First posted, Results first posted, Last update posted, and Study completion. In the context of clinical evaluations under EU MDR or PMCF planning, choosing “Last update posted” (instead of “Study End Date” or “Study First Posted”) for the period of evaluation ensures that your evaluation reflects the current clinical trial status and data availability, which is more relevant than just when the study started or ended. This field reflects when the sponsor last made any modifications to the study record, which is useful for ensuring that the information is current and reflective of the latest available data, even if the study was initiated or completed much earlier.
- Incomplete query: “wheelchair” AND “automated”; Study type: Interventional, Observational; Date range: Last update posted from MM/DD/YYYY to MM/DD/YYYY
- Complete query (search fields that were kept empty are not listed): “wheelchair” AND “automated”; Study status: All studies; Study type: Interventional, Observational; Study results: with results; Date range: Last update posted from MM/DD/YYYY to MM/DD/YYYY

Vigilance/recall – FDA MAUDE
Tip: Certain search fields may be left blank depending on the scope of the search or may be populated to narrow the results. However, the LSP should clearly document which fields were used and how they were completed in order to obtain the reported results.
- Incomplete query: Wheelchair A
- Complete query (other fields left empty): Date Report Received by FDA: YY/XX/ZZZZ to YY/XX/ZZZZ; Brand Name: Wheelchair; Manufacturer: Automatic
THE LITERATURE SEARCH REPORT (LSR)
EXECUTING AND DOCUMENTING THE SEARCHES
Once the Literature Search Protocol (LSP) has been finalized and approved, the systematic search can be executed in accordance with the predefined methodology. At this stage, strict adherence to the documented search strategy is essential to preserve reproducibility and methodological integrity.
Before full implementation, it is also good practice to test the LSP. Pilot searches allow verification that the strategy produces consistent and comparable results across databases while maintaining an appropriate volume of articles to meet the objective of the search. This step helps confirm that the queries are neither too broad nor too restrictive before the final searches are executed and documented in the Literature Search Report (LSR).
All aspects of the search must be fully documented in the LSR as well, including:
· the databases consulted,
· the exact search strings used,
· the date each database was searched,
· the person who conducted the searches,
· applied filters or limits, and
· the number of records retrieved per database.
This level of documentation ensures that the process is transparent, auditable, and capable of independent replication. Even when no results are found, this outcome must still be documented. For example, for low-risk devices where Article 61(10) is applied, it is common for literature searches to return no relevant publications. However, if the search methodology is robust and comprehensive, this absence of results provides reassurance that no new published evidence exists that would challenge or affect the justification for using the Article 61(10) route.
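One simple way to keep this audit trail machine-readable is a small record per search. This is a sketch only; the field names are illustrative, not mandated by any standard:

```python
from dataclasses import dataclass, field, asdict


@dataclass
class SearchRecord:
    """One row of the LSR audit trail."""
    database: str
    query: str                 # the exact search string used
    search_date: str           # ISO date the database was searched
    searcher: str              # person who conducted the search
    filters: list = field(default_factory=list)   # applied limits
    records_retrieved: int = 0  # zero is a valid, reportable outcome


log = [
    SearchRecord("PubMed", "(wheelchair) AND (automatic)", "2025-03-01",
                 "J. Doe", ["english", "humans", "2021-2025"], 148),
    SearchRecord("Cochrane Library", '"automatic wheelchair"', "2025-03-01",
                 "J. Doe", [], 0),  # a nil result is still documented
]
```

A structure like this can be exported directly into the LSR table, so the documented strings are the ones actually executed.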
Ideally, the LSR should be drafted separately for SOTA, similar devices, and the DUE. Separating the reports helps maintain methodological clarity, ensures that the evidence supporting each objective can be traced independently, and facilitates review by Notified Bodies.
THE SELECTION/SCREENING PROCESS
The results generated from the search are then screened against the predefined inclusion and exclusion criteria specified in the protocol. Screening should be conducted in a structured manner, typically in two stages:
· Level 1: title/abstract review followed by
· Level 2: full-text assessment
Each record must have a documented disposition (e.g., included, excluded at abstract stage, excluded at full-text stage), and exclusions must be accompanied by a clear and objective rationale aligned with the protocol criteria. It is commonly accepted practice to document inclusion and exclusion criteria using a coding system. For example, exclusion criteria can be coded as E1, E2, E3, etc., with each code clearly defined. Similarly, when results need to be stratified, inclusion categories can be coded as I1, I2, I3, and so on.
With the emergence of AI-based screening tools, this step can be simplified through algorithms that assign a probability score for inclusion or exclusion and provide reasons for the decision. However, it remains essential to systematically review, modify, or approve these results manually. To date, LEXQARA is not aware of any manufacturer who has formally validated a fully automated screening process without human verification.
Maintaining detailed records of screening decisions is essential to ensure transparency. A PRISMA flow diagram is typically used to document this process, clearly illustrating how the initial set of retrieved records is systematically narrowed down through predefined inclusion and exclusion criteria. This visual “funnel” demonstrates, for example, how 500 initial results were reduced to 25 relevant studies included in the appraisal. Such documentation provides auditors and Notified Bodies with a clear audit trail of the selection process.
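The PRISMA tallies can be derived mechanically from the per-record dispositions. The sketch below assumes illustrative disposition codes (E-abs…, E-ft…, I…) in the spirit of the coding system described above; any consistent coding scheme would work:

```python
from collections import Counter


def prisma_counts(dispositions: list) -> dict:
    """Summarise per-record screening dispositions into PRISMA flow numbers.
    Codes: E-abs* = excluded at title/abstract, E-ft* = excluded at
    full text, I* = included (all codes are illustrative)."""
    c = Counter(dispositions)
    return {
        "identified": len(dispositions),
        "excluded_title_abstract": sum(n for k, n in c.items()
                                       if k.startswith("E-abs")),
        "excluded_full_text": sum(n for k, n in c.items()
                                  if k.startswith("E-ft")),
        "included": sum(n for k, n in c.items() if k.startswith("I")),
    }


# The "funnel" from the text: 500 retrieved records reduced to 25 included.
example = ["E-abs1"] * 400 + ["E-ft2"] * 75 + ["I1"] * 25
print(prisma_counts(example))
# {'identified': 500, 'excluded_title_abstract': 400,
#  'excluded_full_text': 75, 'included': 25}
```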
DEFINE APPRAISAL PROCESS
Once relevant literature and clinical trials have been identified through systematic screening, the next step is to define how the selected studies will be critically appraised. Under EU MDR Annex XIV, clinical data must be evaluated for scientific validity, methodological quality, and relevance to the device under evaluation.
According to MEDDEV 2.7/1 Rev. 4, appraisal must cover three distinct areas:
- Suitability: Does the study address your device, an equivalent device, or a similar one? Does it cover your intended population?
- Data Quality: Was the study design robust? Did it have a sufficient sample size? Was there a clear clinical endpoint?
- Scientific Contribution: How much “weight” does this study add to your overall conclusion of safety and performance?
Types of Appraisal Criteria
[Table: types of appraisal criteria* and associated level-of-evidence grading†]
*Criteria derived from the appraisal criteria defined in IMDRF MDCE WG/N56FINAL:2019
†This could be, for instance, the Oxford Levels of Evidence (Oxford Centre for Evidence-Based Medicine: Levels of Evidence, March 2009, for Therapy / Prevention / Aetiology / Harm)
Once the appraisal criteria are defined, they need to be applied by type of clinical data. For instance, primary clinical investigations and published literature on the subject device typically undergo full appraisal, including suitability, contribution, quality, and level-of-evidence assessment, as they directly support safety and performance claims. In contrast, SOTA literature, PMCF surveys, or manufacturer-held PMS data may primarily require evaluation of regulatory relevance and contribution rather than formal methodological grading. This risk-based, purpose-driven approach ensures that each data source is assessed appropriately without applying unnecessary or artificial appraisal layers.
NOTE:
Appraisal is not about whether the data are favourable; it is about whether, and to what extent, they are credible and applicable.
Manufacturers sometimes use appraisal scores to exclude articles based on a predefined threshold. This approach requires defining and justifying a score threshold above which articles are accepted and below which they are removed. However, LEXQARA does not generally support this practice, as excluding some articles purely on the basis of an appraisal score can be difficult to justify.
For example, articles related to the DUE should not be excluded simply because the study population does not perfectly match the intended population. Off-label uses should still be captured and assessed to determine whether systematic misuse or emerging safety concerns need to be considered. Similarly, articles on similar devices should not be used to demonstrate the performance of the DUE, but they may still be relevant when defining or justifying acceptable safety and performance criteria. Finally, safety-related information, including complications, should not be excluded simply because the publication is a case report or a case series, as such evidence may still provide valuable insights into potential risks.
The appraisal process transforms a collection of studies into defensible clinical evidence. A well-defined, objective, and documented appraisal methodology demonstrates regulatory maturity. It also ensures that the clinical evaluation withstands scrutiny under MDR Annex XIV. This is often where Notified Bodies focus their review, as the credibility of the entire clinical evaluation depends on the integrity of the appraisal process.
COMMON MISTAKES AND AUDITOR "RED FLAGS"
FUTURE PERSPECTIVES
The 2025 proposal to amend the MDR and its Annexes introduces specific changes and requirements regarding literature searches for clinical evaluation.
The proposal explicitly broadens the type of literature that can be accepted as clinical data to reduce the burden on manufacturers.
– Acceptance of Non-Peer-Reviewed Literature: The definition of “clinical data” in Article 2 is amended. Manufacturers will be allowed to use data generated through studies published in scientific literature that are not necessarily peer-reviewed to demonstrate safety and performance. This contrasts with the current requirement, which generally prioritises peer-reviewed data to establish equivalence or safety.
– Systematic Review Requirement: Despite the broader acceptance of data types, the proposal maintains the requirement in Annex XIV (Clinical Evaluation) that manufacturers must identify available clinical data through a “systematic scientific literature review”. This review must identify gaps in clinical evidence.
– Notified Body Assessment: Notified bodies are required to examine and validate the manufacturer’s planning and conduct of the “scientific pre-clinical literature search” and the methodology used for the literature search in the clinical evaluation.
NOTE: These perspectives should be considered with caution. The legislative process may take 18–24 months, and the proposed changes may still be modified or rejected. Until any amendments are formally adopted, it is recommended to continue working under the current regulatory framework.
LEXQARA RESOURCES
To support implementation in practice, LEXQARA provides downloadable templates and forms for the Literature Search Protocol (LSP), Literature Search Report (LSR), and appraisal criteria. These resources are available free of charge on our website and are regularly updated to reflect the latest MDR requirements, relevant MDCG guidance, and practical feedback received from Notified Bodies. They are designed to help manufacturers build structured, reproducible, and audit-ready literature search processes.
Dr. Nandini Dhiman (LEXQARA)