The Document
About the Johns Hopkins-American Healthways Outcomes Summit
In November 2004, Johns Hopkins and American Healthways convened
a conference of 250 practicing physicians and medical managers
from across the United States to meet in Rancho Mirage, CA. This
groundbreaking conference marked the first time that physicians and medical managers
provided a consensus statement on how outcomes-based compensation
arrangements should be developed in order to align health care
toward evidence-based medicine, affordability and public accountability
for how resources are used.
The three objectives of the consensus conference were:
- To review and revise the consensus document draft developed by the steering committee,
- To articulate physician preferences for the specific wording of the design principles, and
- To elicit physician perspectives on the ideal pay-for-performance (PFP) program.
Conference participants were asked to consider
how PFP should be crafted in order to have an impact on outcomes
for the largest number of patients possible. At its conclusion, the conference
generated a blueprint for the design of pay-for-performance
programs. We believe that these design principles, reflecting
the thoughtful deliberations of a large group of physicians, can have an
important impact on future physician payment policy. The principles
that were produced are intended as a guideline for any organization
interested in developing a pay-for-performance program, or
as a framework for studying programs currently in place.
Section 1. Why Physician Payment Must Change
"The social obligation for best practice is part of the commodity the physician sells."
KJ Arrow, 1963
"Quality problems are everywhere, affecting many patients. Between the health
care we have and the care we could have lies not just a gap, but a chasm."
Institute of Medicine, 2001
"There are many mechanisms for paying physicians; some are good and some
are bad. The three worst are fee-for-service, capitation and salary."
JC Robinson, 2001
Americans think their health care is the best in the world.
A large biomedical research enterprise produces a steady output
of new pharmaceutical, medical
and surgical products and procedures. Technological diffusion is rapid, giving
the United States an incredibly resource-rich health care system. However,
availability of technology does not equate with excellence in quality of
care.
A growing body of empirical evidence has documented gaps between how health
care should be delivered to achieve the best possible outcomes and how it
is actually delivered (Schuster, McGlynn et al. 1998; Institute
of Medicine 2001;
Fisher E. S., Wennberg et al. 2003; McGlynn, Asch et al. 2003). These gaps
are so large that a 2001 panel of experts convened by the Institute of
Medicine called them a "quality chasm" (Institute of Medicine 2001).
Our inability to consistently produce health care of the highest
quality results from failures throughout the health care system.
It is not merely a problem
of getting health professionals to do the right thing. Change
needs to occur at multiple levels: (1) among health professionals to ensure
they have current and accurate knowledge, skills and expertise; (2) at the
group or team level to facilitate integration of services across practitioners
and time; (3) among health delivery organizations so that the necessary infrastructure,
such as clinical information systems, is available; and (4) at the larger
environmental level to address regulatory, coverage and payment policies
(Shortell SM 2004).
Transforming health care cannot be done by targeting just one of these levels;
we need to align system change at all four levels.
One of the most powerful levers for modifying the organization
of health care is altering provider payment policy.
Medicare's prospective
payment system is an excellent case in point. In the 1980s, hospital payment
for Medicare switched from a retrospective, cost-based payment methodology
to a prospective, per-case method. This new payment system, also called Diagnosis
Related Groups (DRGs), removed incentives for keeping patients in the hospital.
The per-case DRG method gave hospitals a lump sum for every patient hospitalized
for a specific condition, regardless of how long they stayed. Hospitals' responses
were dramatic and quick (see Figure, page 5). Mean length of stay in the nation
decreased, leading to fewer hospital days, while ambulatory services and
post-acute care increased in importance. Because
DRGs dealt
with inpatient services only, their impact on cost containment across all
health care sectors was minimal. Much of health care delivery shifted
from inpatient to outpatient care. The DRG example suggests (1) that
health care organizations can be dramatically altered in response to
payment and (2) that both intended effects (lower inpatient costs) and unintended
effects (increased outpatient costs as a result of the DRGs) may
result from health care financing reforms that occur in isolation.
The Institute of Medicine has recognized the need to transform
physician payment. Its now famous report, Crossing the Quality
Chasm (IOM 2001), made the following recommendations regarding physician payment:
- fair payment should be given for good clinical management,
- providers should have the opportunity to share in the benefits
of quality improvement,
- purchasers should have the opportunity to recognize quality
differences in health care and direct decisions accordingly,
- financial incentives should align with implementation of care
processes based on best practices and the achievement of better
patient outcomes and
- payment should promote better coordination of care.
Current physician payment systems are not designed to promote
quality or better outcomes. Both theory and history support this
claim. Fee-for-service is essentially pay-for-production and offers
rewards for seeing more patients, generating more services (whether
appropriate or inappropriate care) and upcoding procedures and
diagnoses. When used within an environment in which consumers have
nearly all their costs covered by insurance, fee-for-service can
lead to large increases in health care expenditures. Conceptually,
capitated payments should enhance efficiency of health care production
and reduce provider and patient demand for services. However, some
have argued that capitation causes stinting on care, under-use,
quality problems and risk selection. Salaried payment reduces incentives
for productivity and is basically a pay-for-time method. None of
these forms of provider payment aligns compensation with outcomes.
New methods for paying physicians are needed so that doctors are
appropriately rewarded for providing high-quality care and promoting
better outcomes for their patients.
Section 2. Pay-for-Performance: A Definition
Health care organizations have already begun
to pay physicians for meeting quality standards (Casalino L.,
Gillies et al. 2003; Strunk BC 2004). These
forms of compensation are usually called pay-for-performance (PFP), because
physicians or their practices are given financial incentives for achieving
certain quality targets. Recent reports suggest that just 1% to 2% of physician
compensation in PFP programs is from incentive pay for quality (Kralewski,
Rich et al. 2000; Casalino L., Gillies et al. 2003). Even so, the word from
the market is that the number of physicians and amount of money
that will be involved in some form of PFP in the near future will be substantial
(Epstein, Lee et al. 2004).
Although research projects recently funded by the Robert Wood Johnson
Foundation and the Centers for Medicare and Medicaid Services will
provide results on the effects of alternative PFP models, none has been published yet. In general,
there is very little research on the effects of PFP methods. We have no empirical
information on a wide range of topics related to PFP: how large should payments
be, should payments be made to individual physicians or groups, what metrics
should be used for the payments, what effects on quality and outcomes do these
payments have, and what other changes in the health care system should be made
to enhance and reinforce the effectiveness of the PFP model. Thus, PFP is an
unproven method of physician payment and could be considered experimental. This
lack of documented evidence of benefit suggests the need for evaluations of
PFP interventions.
PFP is not an all-encompassing solution for improving quality. It is one method
among a wide array of approaches targeted at different levels of the health
care system (the milieu, organizations, groups and individual practitioners)
and can be combined with non-financial methods as well.
For the purposes of this document, we define pay-for-performance in the following
way:
DEFINITION: PAY-FOR-PERFORMANCE
The use of incentives to encourage and reinforce the delivery of evidence-based
practices and health care system transformation that promote better
outcomes as efficiently as possible.
We use the term "incentives" to denote reinforcers. A reinforcer
is anything that alters the chances that a behavior occurs (Town
R 2004), or at the organizational level, that a structural change
occurs. In our definition, we therefore include positive reinforcers
(e.g., bonus payments), negative reinforcers (e.g., withhold distributions),
punishments (e.g., physicians pay a penalty for not meeting target
levels for quality or outcome measures) and non-financial mechanisms
(e.g., making results of quality assessments publicly available).
The definition indicates that PFP should reinforce high levels of performance
while encouraging lower performers to improve. It also makes clear that it
is not enough to link incentives with the production of care. PFP is designed
to promote better outcomes, that is, better health, functional status
and well-being.
Section 3. Pay-for-Performance Design Principles: Assumptions
Four assumptions guided the development of the PFP design
principles. They concern (1) the unit of analysis and payment, (2) the
requirement for investment of new resources or redistribution of
existing resources, (3) the exclusion of patient incentives from the principles,
and (4) the exclusion of the percentage of income at risk.
Assumption #1. Our focus is on incentives targeted at individual physicians and physician organizations.
PFP can be implemented at many levels of the health care
system: integrated delivery systems, health plans, hospitals, physician
organizations and individual physician practices. For the purposes
of this document, our emphasis is on community-based physicians,
from whom most individuals get most of their care, and on physician
organizations. Both single-specialty and multi-specialty
group practice organizational models were considered to be within
the scope of physician organizations.
Assumption #2. PFP will require investment of health care resources.
PFP is a tool to improve quality and outcomes. The early stages of PFP programs
are almost certain to entail the allocation of new, or a redistribution of existing,
health care resources. Focusing patient care on the production of outcomes rather
than merely producing services will require physicians and their organizations
to develop new care management capacity. This may include, for example, disease
registries, electronic information systems capable of producing quality and outcome
metrics and the addition of nurses and other practice staff to implement these
new processes. We expect that any return on investment generated from PFP programs
will take several years and will result from better outcomes and reduced complications,
which result in lower downstream demand for high-cost specialty and hospital
services.
Assumption #3. Patient incentives will not be
addressed, although we recognize that patient participation in
the health care process is critically important to enhancing outcomes.
Better outcomes are not achieved solely by applying high-quality medical care
during office visits. Patients themselves play a central role in improving their
health. They choose whether and how to participate in the care delivery process,
as well as whether and how to implement treatment programs. Physicians and plans
have a responsibility for engaging all individuals in health care, not just those
who access services of their own accord.
The role of patient self-management as an integral element of health care quality
and optimal outcomes has received a good deal of attention in the research literature
and the lay press. However, a thoughtful framework for using incentives either
through benefit designs or other mechanisms to encourage healthful patient self-management
(not just cost-shifting) has not been fully developed. Past experience has demonstrated
that financial factors such as differential premium contributions, higher co-payments
or restricted access to services result in lower utilization, but these measures
also may worsen health outcomes (Soumerai 2004). The growing popularity of consumer-directed
plans also places a premium on patient health care purchasing behavior without
clear evidence regarding its long-term impact on quality and health outcomes.
This document does not address strategies for providing incentives to patients
to align their actions with better outcomes. We concluded that this is a topic
so large and complex that both it and physician incentives could not be dealt
with adequately at the same consensus conference.
Assumption #4. The amount of physician or practice
income affected by PFP is not addressed by our design principles.
The design principles do not address the specific percentage of income that should
be performance-based. In one study, bonus payments had little effect on physician
organizations' use of care management programs, largely because they were
perceived as too small to influence decision-making (Casalino L., Gillies et
al. 2003). The specific level of physician or practice income that effects substantive
change without unintended negative consequences, such as practices shunning high-risk
patients, is unknown.
Supporting the government's proposal for a bold experiment in quality,
family practitioners in the United Kingdom voted affirmatively for a new pay-for-performance
bonus payment system. Up to 20% of physician income is based on meeting targets
for clinical indicators, practice organization indicators and patient experiences
obtained from surveys (Roland 2004). The impact of this level of bonus payment
on promoting quality and on other perhaps unforeseen changes in patient care
is unknown and currently being evaluated. Moreover, we acknowledge that any PFP
program must operate within the framework of the statutes and regulations, such
as anti-trust and anti-kick-back laws, governing physician payments.
Section 4. Design Principles
This section describes the design principles, which are grouped into six categories:
(1) Payment Structure, (2) Transparency, (3) Metrics, (4) Evaluation, (5) Community
and Patient Participation, and (6) Fairness.
The conference included an initial session in which each principle and respective
design options were discussed in terms of meaning, clarity and comprehensiveness
of the description. A second session was devoted to formulating specific wording
for each principle. The conference concluded with a consensus process involving
all attendees during which final versions of the principles were produced.
The 15 design principles produced at the consensus conference are reproduced
below. After each one is stated, we provide a discussion of important and in
some cases critical considerations offered by conference participants as they
formulated the final wording of the principle.
Design Principle Category #1: Payment Structure
There are several considerations regarding how to structure the incentives
used in pay-for-performance. This category addresses four discrete principles
related to the structure of payments.
Design Principle 1.1: Accountability Level
The accountability level for physician pay-for-performance may be either the
individual physician or practice. Incentives targeted at physicians are more
likely than those targeted at practices to change physician behavior; however,
practice incentives may have a bigger impact on altering the infrastructure
needed to provide high-quality care. Some have argued that within a practice,
multiple physicians manage any given patient, so there should not be a single
accountable physician designated. Instead, all treating physicians within a
practice should benefit from the outcomes and resultant compensation. Health
professionals often share both patients and resources, such as clinical information
systems and ancillary personnel within a practice. Alternatively, making multiple
physicians accountable for a single patient may dilute the impact of the incentive
and diffuse a sense of accountability.
Regarding accountability level, the consensus conference participants made
the following declaration:
Design Principle 1.1: Accountability Level
Physician organizations that directly
interface with payers should be the
accountable entity in PFP programs.
Discussion: Although payments may be
distributed to physician organizations, practices are strongly
encouraged to disburse funds to individual physicians based on
their performance within the group. Maximal effectiveness of PFP
programs will be achieved when both physician organizations and
individual physicians within those groups are accountable for measurable
patient outcomes. PFP programs targeted at physician organizations
may be more likely to alter structural aspects of the practice
milieu than those targeted at individual physicians.
Design Principle 1.2: Distribution of Financial
Incentives
PFP may take the form of a variable component of payment that is added to base
compensation, which is unaffected by the PFP formula. How to distribute the
incentives, both positive and negative, is a decision that designers must make.
One approach is to give all entities some form of positive incentive with increasing
amounts linked to better performance. In this scenario, even the low-performing
entities would receive some amount of variable payments, albeit not as large
as their higher performing counterparts. Providing incentives to most or all
physician organizations in the PFP program may be necessary to encourage continued
participation among the lower performers.
A second approach is to give positive incentives to entities
meeting certain target thresholds for quality or outcomes, and
the specific amount may or may
not be graded by performance. For some, this approach may be appealing because
only excellent quality is rewarded. The threshold approach to
distributing incentives assumes that physicians will be motivated to make substantial
changes in their clinical actions and/or practice infrastructure as a result
of the possibility of a reward. It is unclear whether this holds true for average
and low performers, who may not perceive the incentives to be within their
grasp.
Regarding distribution of incentives, the consensus conference participants
made the following declaration:
Design Principle 1.2: Distribution of Incentives
PFP programs should provide variable
incentives to physician organizations
that meet certain target thresholds or demonstrate a clear improvement
over baseline performance levels.
Discussion: Thus, the threshold approach
should be applied to high-performing physician organizations to
reward excellence. It should also be applied to organizations showing
some minimum level of improvement over baseline to reward substantive
improvements in quality.
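As an illustration only, the eligibility rule in Design Principle 1.2 can be sketched in code. The function name, the 0-100 score scale and every numeric value below are hypothetical assumptions; the conference did not specify target thresholds, minimum improvement margins or payment amounts.

```python
def qualifies_for_incentive(score, baseline, target_threshold, min_improvement):
    """Return why an organization qualifies for an incentive, or None.

    All parameters are hypothetical placeholders on a common 0-100 scale;
    none of these values comes from the consensus document.
    """
    if score >= target_threshold:
        return "threshold"      # meets the target: rewards excellence
    if score - baseline >= min_improvement:
        return "improvement"    # clear gain over baseline performance
    return None                 # neither criterion met


# Hypothetical organizations: (name, current score, baseline score)
for name, score, baseline in [("A", 92, 90), ("B", 70, 55), ("C", 60, 58)]:
    print(name, qualifies_for_incentive(score, baseline,
                                        target_threshold=85,
                                        min_improvement=10))
```

In this sketch a low-baseline organization such as B can still earn an incentive through improvement, which is the stated purpose of pairing the two criteria.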
Design Principle 1.3: Financial Incentive Type
According to recent surveys, several types of rewards have been used in the
early PFP programs. Most common are bonuses to groups and individual physicians,
tiered co-payments with higher performing providers having lower patient co-payments,
payment rates tied to performance and quality infrastructure grants (Bailit
Health Purchasing 2002; Strunk BC 2004). Bonus payments may be used as a positive
(extra funds) or a negative (expected funds not distributed) incentive. Another
approach is to modify the conversion factor used in a fee-for-service system
or the capitation rate, setting it higher for better performers and lower for
poorer performers. The co-payment structure can be used to encourage patients to use
a physician or practice (by lowering or eliminating co-payments) or to discourage
them from doing so, based on its quality.
There is suggestive evidence that negative incentives may be
more powerful stimuli to induce behavior change among physicians
than positive incentives.
A quality withhold is a form of negative incentive. A portion
of physician income is set aside pending the achievement of certain quality
targets.
Regarding the type of incentive that PFP programs should use, the consensus
conference participants made the following declaration:
Design Principle 1.3: Incentive Type
PFP programs should be based on positive
financial incentives.
Discussion: Negative financial incentives
should be avoided. In the early phases of PFP programs, developing
adequate levels of provider buy-in to the process will be critical
to program success. Negative incentives would discourage providers
from participating. In addition, loss of income associated with
negative incentives can create serious gaps in the flow of practice
revenue.
Design Principle 1.4: Frequency of Assessments
and Incentive Distribution
There is a necessary lag between the end of the assessment period and availability
of data because of the time it takes to collect data and produce the quality
metrics. The impact of incentives is likely to be stronger when they are applied
close in time to the clinical activity. Distributions with very long lags are
less likely to allow physicians to make timely adjustments in their behavior
and practice. Timeliness of assessments and distributions is counterbalanced
by the larger payments that may result from longer intervals. Furthermore,
frequency of assessment must be weighed against the administrative burden on
payers, physicians and practices associated with generating the metrics. In
many cases, the assessment interval is determined by the definition of the
metric.
Regarding the frequency of assessments and incentive distribution, the consensus
conference participants made the following declaration:
Design Principle 1.4: Frequency of Assessments
and Incentive Distribution
Metric assessments and payments should
be made as frequently as possible in order to better align rewards
to actual performance. Results of assessments should be reported
and payments provided to the physicians involved as soon as possible
after the close of the measurement period.
Discussion: Frequent assessments in
the absence of electronic medical records could place a large burden
on practices and payers. Until EMRs or other means of securing
performance data are commonplace, the timing of payments under
PFP should depend on the technical capabilities of the providers
and health plans involved.
Design Principle Category #2: Transparency
Physicians act as patients' agents, helping them, as well as their designated
advocates and caregivers, to make medical decisions in the face of uncertain
effects and outcomes (Arrow KJ 1963). Physicians' decisions are influenced
by their concern for their patients' welfare and health and by their
professional norms. Public disclosure of the results of quality and outcome
assessments is a non-financial incentive that can alter a physician's or group's
esteem among their peers or patients. Public recognition for quality of care
may be a stronger incentive than bonus payments, which in the past have tended
to be too small to garner much attention from physicians and practices (Casalino
L., Gillies et al. 2003). Either used alone or combined with financial rewards,
public disclosure could have powerful effects on modifying the structure and
performance of physicians and their practices.
Like many aspects of PFP programs, the effects that public disclosure will
have on patients, physicians, the patient-physician relationship and the health
care system are not fully known. The argument for disclosure, therefore, is
not predicated on a strong research base. It is motivated in part by a general
consumerist trend to make more information available to the public.
On an ethical level, patient autonomy argues for transparency.
Autonomy refers to the tenet that patients should have all the
information they need to make
informed decisions. Withholding details regarding the methods used to pay physicians,
the providers participating in a PFP program and the results of quality assessments
is a threat to autonomy. Patients' right to know this information is
counterbalanced by physicians' desire to keep information about their
professional practice private and confidential. Thus, regarding public disclosure,
there is a tension between autonomy of patients and autonomy of physicians.
Some research evidence suggests that disclosure of physician incentives to
patients does not alter patients' trust in their doctors or insurance
companies and may actually have a positive effect on trust (Hall, Dugan et
al. 2002).
Public accountability also supports transparency. Society entrusts physicians
with powerful prerogatives in the care of patients, a privileged status that
must be preserved and strengthened. Transparency of incentives supports continued
trust in the profession.
Two design principles can be derived from the concept of transparency. The
first relates to making the method used to
pay physicians or groups transparent to the public. Disclosing the method may
also include revealing the identity of the participants in the pay-for-performance
program. The second principle relates to disclosure of results of
the quality and outcome assessments.
Regarding public disclosure of method, the consensus conference participants
made the following declaration:
Design Principle 2.1: Public Disclosure
of Method
A list of physician organizations
participating in the PFP program,
as well as the quality and outcome metrics used in the program,
should be disclosed to the public.
Discussion: Disclosure
of physician participation in a PFP program sends the public the
positive message that the provider is focusing on quality and improving
care and that they are willing to be accountable for their performance.
Design Principle 2.2: Disclosure of Results
This design principle addresses transparency of the results produced from the
quality and outcome assessments.
Regarding public disclosure of results, the consensus conference participants
made the following declaration:
Design Principle 2.2: Disclosure of Results
PFP programs should publicly disclose
a list of physician organizations
that meet quality and outcome target thresholds and those that are
demonstrating improvement over time.
Discussion: Ranking
of all physician organizations should not be done because of unreliability
inherent in conventional statistical methods and the resultant
risk of falsely identifying outliers.
There should be a baseline period before public disclosure to provide physicians
with opportunities to review, validate and interpret their results. In effect,
this preliminary phase would involve disclosure to the physician organization
only. A process for validating results and expressing disagreement with the
findings should be established in all programs. Once the validity of the quality
and outcome assessments has been substantiated, then disclosure of results
can proceed.
Some participants strongly felt that disclosure has several potential negative
effects. For example, if PFP uses only a limited number of measures, consumers
choosing a practice may have incomplete information about the global quality
of care delivered by that practice. In such cases, doctors' practices
may suffer or be rewarded inappropriately. There were also concerns about potential
misuse of such data for the purposes of contracting or in legal cases.
Design Principle Category #3: Metrics
In PFP, the measures used to assess quality and outcomes provide a basis for
determining the amount of reward to distribute to providers. Health services
researchers have spent years developing the technology necessary to accurately
measure quality and outcomes. The field has advanced to a point that there
are a sufficient number of metrics with established measurement properties
to build credible quality-improvement programs. The design principles in this
category build on this knowledge base and add a few considerations that may
be unique to the PFP context.
Design Principle 3.1: Measurement Level
For the purposes of quality assessment, three measurement levels can be assessed:
structure, process and outcomes. Structural measures refer
to aspects of the health care system that are present before patients and professionals
meet. Examples of structural measures include disease registries, electronic
information sources, electronic medical records and availability of health
educators, social workers and case managers. These measures are easier to obtain
than process or outcome measures and place the least data collection burden
on the practice. On the other hand, they have weaker links with outcomes as
compared with process measures.
Processes of care refer to what happens
when patients and professionals interact. Process quality measures
assess the degree to which those interactions conform to evidence-based
guidelines of care. Some of these measures have strong empirical
evidence for a linkage with outcomes. However, most are disease-specific
and cover a very narrow range of clinical activity. Examples
of process measures are whether certain lab tests (e.g., A1C
and LDL) are checked during a specified interval among patients
with particular diseases, patients' assessments of their
interactions (also called satisfaction with care), and appropriate
administration of certain drugs to patients. Process measures
should also include assessments of waste and inefficient practices.
Outcome measures are the intermediate
and long-term results of health care, and include changes in
health status (both self-assessed health and clinical markers
such as organ function), functioning (ability to participate
in desired activities, cognition, mobility and self-management)
and well-being. Although improving patient outcomes is the most
important goal of health care, providers have voiced concerns
about being held accountable for those changes in health, functioning
and well-being on which their interventions have little direct
effect. For many outcomes, factors outside the control of health
professionals are critically important determinants. Obtaining
outcome information presents the greatest methodological challenges,
because many measures require patient report and survey sampling,
others rely on laboratory results, and clinical measures such
as organ functioning may not be apparent for a number of years.
Outcomes may be intermediate biochemical or physiologic changes (such as LDL
level or results from pulmonary function tests) or long-term end-organ effects
(such as rate of myocardial infarction). Whereas the latter are arguably more
important, it may be more feasible to measure the former.
Ultimately, provider behavior change and system transformation, which are the
targets of PFP, should have positive effects on outcomes. However, as these
change processes unfold and evolve, it will be essential to have quality assessments
done at all three levels of measurement. Structural measures can be used to
assess practice infrastructure, process measures tap into practitioner behavior
and physician-patient interactions, and outcome assessments provide information
on the end results of care processes.
Regarding measurement level, the consensus conference participants made the
following declaration:
Design Principle 3.1: Measurement Level
The metric set used in PFP programs
should include a mix of outcome,
process and structural measurements.
Discussion: Outcome
measures are intentionally listed first, because they are viewed
as the most important type of measure for PFP programs. The exact
mix of all these measures cannot be specified for every program
and will depend on priorities for change and existing technical
capabilities.
Design Principle 3.2: Metric Attributes
When selecting specific metrics, PFP designers must choose a number sufficient
to cover several clinical processes, but not so many as to engender confusion
in the participants regarding how and where they should focus their efforts.
Because a large volume of patients may be needed to obtain stable estimates
of performance, particularly for disease-specific quality, the balance between
comprehensive assessment and respondent burden should be carefully considered.
There are several metric attributes that can be considered during a selection
process. The criteria proposed in this document have been adapted from an Institute
of Clinical Systems Improvement (ICSI) internal document proposing performance
measurement for ICSI member organizations.
First, a candidate measure is more worthy of being used to the extent that
the health care structural element, process of care or outcome is common or
frequently experienced. We term this criterion volume.
Improvements in high-volume measures can have a larger impact on the health
and health care of patients than those that are lower volume.
Second, the potential impact on health associated with changes in performance
is an important consideration. This is called the gravity of
the measure. For example, measures associated with cancer quality/outcomes
have higher gravity than those linked to acne because of cancer's threat
to survival. Delays in treatment for life-threatening illnesses have higher
gravity than long waiting times for routine ambulatory care. Obstetrical concerns
have high gravity because of the many years of health or disability at stake
for the newborn.
Third, a measure is more worthy to the extent that there is empirical evidence linking
changes in the metric with clinically important changes in health, functioning
or well-being. Alternatively, for outcome measures, there should be evidence
that the application of health care is an important determinant of the outcomes.
The metric should be actionable, i.e., specific health care actions associated
with the metric lead to better patient outcomes. Beta-blocker use post-MI is
a good example of an actionable measure for which there is strong evidence
of its linkage to future health outcomes.
Fourth, a proposed measure should assess an aspect of performance for which
there is a gap between current practice and what
can be achieved under optimal circumstances. Metrics for which there is variation,
such as rates of diabetic foot exam, are most useful. Those which are nearly
uniformly done (e.g., blood pressure checks during routine visits) are less
useful.
Fifth, a measure is more suitable for use to the extent that the prospects for
improvement of the measured performance are good. Not only should there be
variation in the metric (what we call a "gap" above), but there
should also be no external factor that would preclude improvement in the quality
or outcomes among health care units.
Sixth, the measure itself should have an acceptable degree of reliability,
validity and feasibility. In other words, a measure is a better choice
to the extent that experience in its use has shown that it produces consistent
results over time and across observers (reliability), it is consistently
associated with outcomes and related health care metrics (validity)
and methods exist for the efficient and minimally burdensome acquisition of
data (feasibility).
Regarding metric attributes, the consensus conference participants made the
following declaration:
Design Principle 3.2: Metric Attributes
The attributes of measures used in PFP
programs should include:
- High volume: common structural
attribute or frequently experienced process/outcome of care,
- High gravity: large potential
impact on health associated with metric,
- Strong evidence-basis: research
evidence of linkage between change in measure and outcomes,
- Gap between current and ideal practice,
- Good prospects for quality
improvement:
no external factor that would preclude health care entities
from
closing gaps between current and ideal practice,
- Measurement reliability: the
metric produces consistent results across time and observers,
- Measurement validity: the metric
actually measures what it is intended to measure and is clearly
defined and
- Measurement feasibility: methods
or technologies exist for the efficient acquisition of the necessary
data.
Discussion: The
field of quality and outcome assessment changes rapidly enough
that PFP programs should have sufficient flexibility to add new
metrics and delete existing ones in a dynamic way. The volume consideration
can be applied to a general population or within sub-groups defined
by age, disease class or some other attribute. With better electronic
information systems, the number of measures that can be included
in PFP programs will be greatly increased, because the feasibility
criterion will be more commonly satisfied.
Design Principle 3.3: Metric Domain
Once the relative mix of structural, process and outcome measures is determined,
PFP designers must select metrics from specific quality domains.
Regarding metric domains, the consensus conference participants made the following
declaration:
Design Principle 3.3: Metric Domain
Metrics for PFP programs should be selected
from the following quality and outcomes domains:
- Patient-centeredness: captures
patients' assessments of their experiences in the care
process and interactions with providers,
- Effectiveness: measures that
are linked to health outcomes in real-world settings,
- Safety: measures associated
with reduced chances of patient harm and
- Efficiency: risk-adjusted
assessments of service use and expenditures.
Discussion: This
approach uses a modified IOM framework for quality assessment (Institute
of Medicine 2001). Measures within these four domains can be selected
by type of service (i.e., preventive care, acute care, chronic
care, long-term care and palliative care) and/or by type of outcome
(e.g., biochemical and physiologic outcomes, end-organ outcomes,
functional status and well-being).
Patient-centered measures capture
patients' assessments of how their physician or the entire
health care team respects their personal values and preferences,
is responsive to their needs, provides emotional support or
physical comfort and involves family and friends. Examples of
these measures include structure (reports on ease or difficulty
accessing providers, ratings of office waits, reports about appointment
waits); process (evaluations of interactions with health
professionals in terms of the respect given to the patient, trust
in the provider and emotional support offered); and outcome
(patient-reported assessments of their health, functioning and well-being).
Effectiveness metrics include
structure or process measures that are linked to outcomes in
real-world settings. Structural metric examples are presence
of clinical information systems and disease registries, both
of which are associated with improved chronic care outcomes.
Process metrics could include conformance with clinical practice
known to be linked to improved outcomes. Checking A1C on a semi-annual
basis is an example of a process-level effectiveness metric,
whereas the actual A1C level is an example of an outcome-level
effectiveness metric.
Safety measures include metrics
associated with reduced chances of patient harm. Drug-drug, drug-age
and drug-disease interactions represent three classes of medication-related
safety measures. An example of a drug interaction structural
measure is presence of electronic prescribing technology, a process
measure would be the actual prescribing of a drug inappropriately,
and an outcome measure would identify a change in a specific
poor health or functional state associated with unsafe medical
practice, such as fall injuries associated with sedative use
among elders.
Efficiency is a cross-cutting
theme in the IOM quality domains. Efficiency measures refer primarily
to assessments of utilization and health care expenditures. We
believe that they are necessary for a balanced PFP metric set,
but they are certainly not sufficient. As our PFP definition
indicates, we suggest designers ought to provide incentives that
improve quality and promote outcomes as efficiently as possible.
To accomplish this goal, some measurement of resource use is
necessary. Efficiency measures must be risk-adjusted (for example,
by using a method such as the Johns Hopkins Adjusted Clinical
Groups [ACG] Case-Mix System) to control for differences in the
morbidity burden of patient populations.
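As a rough illustration of why risk adjustment matters for efficiency measures, the sketch below compares two practices using an observed-to-expected cost ratio (a simple form of indirect standardization). It is not the ACG method, which uses its own proprietary grouping logic. The strata names, reference costs and patient records are all hypothetical values chosen only to show the arithmetic.

```python
def expected_cost(patients, reference_mean_cost):
    """Sum the reference mean cost for each patient's morbidity stratum."""
    return sum(reference_mean_cost[p["stratum"]] for p in patients)


def observed_to_expected(patients, reference_mean_cost):
    """Ratio of a practice's actual spending to what its case mix predicts.

    Values above 1.0 mean spending is higher than expected for the
    practice's morbidity burden; values below 1.0 mean it is lower.
    """
    observed = sum(p["cost"] for p in patients)
    return observed / expected_cost(patients, reference_mean_cost)


# Hypothetical community-wide mean annual cost per morbidity stratum.
REFERENCE = {"low": 1000.0, "medium": 4000.0, "high": 12000.0}

# Hypothetical practice with a mostly healthy panel.
practice_a = [
    {"stratum": "low", "cost": 1200.0},
    {"stratum": "low", "cost": 1100.0},
    {"stratum": "medium", "cost": 4300.0},
]

# Hypothetical practice with a sicker panel and higher raw spending.
practice_b = [
    {"stratum": "high", "cost": 11000.0},
    {"stratum": "high", "cost": 11500.0},
    {"stratum": "medium", "cost": 3900.0},
]

ratio_a = observed_to_expected(practice_a, REFERENCE)
ratio_b = observed_to_expected(practice_b, REFERENCE)
```

In this made-up example, practice B spends far more per patient in raw terms, yet its ratio falls below 1.0 while practice A's is above 1.0. An unadjusted comparison would have ranked the two practices the other way around.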
Design Principle 3.4: Range of Metrics
Some existing PFP programs focus on a few (e.g., three or four chronic care
indicators) or even one metric (e.g., immunization rates), whereas others have
opted to include a much larger number. In California, a consortium of six health
plans (Aetna, Blue Cross of California, Blue Shield of California, CIGNA Healthcare
of California, HealthNet and PacifiCare) has developed a PFP model that includes
about a dozen preventive care, chronic care and patient experience measures.
Additional measures can be (and have been) added to this list (for more information
see http://www.iha.org). This relatively small
set of metrics contrasts with a much more comprehensive, yet complex, set of
indicators in the new British PFP experiment (Roland 2004). British family
practitioners' quality bonuses are benchmarked against performance across
well over 100 indicators that assess care for 10 conditions, each with multiple
indicators, as well as practice organization and patient experience metrics.
PFP designers must decide on the number of metrics to use to calculate the
amount of the incentives. Smaller numbers are easier to explain and understand,
but they may also lead to sub-optimal care for conditions not covered in the
metric set. A more comprehensive set of indicators runs the risk of too much
complexity, leading to physician and practice confusion during implementation.
Regarding range of metrics, the consensus conference participants made the
following declaration:
Design Principle 3.4: Scope of Metrics in
the PFP Program
PFP designers should include a sufficient
number of metrics across
a spectrum of health promotion activities and disease states so that
they provide a balanced view of performance.
Discussion: At
the outset, this number should be limited within domains, so they
could be focused on transforming specific elements of the system
and be accurately measured. Over time, this list needs to be re-evaluated
and redefined and expanded to be more comprehensive. It is important
for multiple groups to be involved in choosing these metrics. These
groups include community physicians, insurers, purchasers and patients.
Ultimately, the metrics should be broad in scope and physician-specialty
specific.
Design Principle
Category #4: Evaluation
Because research on pay-for-performance is scant, the actual effects of this
new payment method are quite uncertain. One could argue that this degree of clinical
uncertainty renders PFP experimental, which suggests the need for evaluations
to assess impact in early adopting health care organizations. On the other hand,
program evaluations require methodological expertise and additional funds, which
are resources that may be difficult to secure for some organizations. It would
also be reasonable to suppose that evaluations are best left for the research
community, rather than organizations that implement the programs.
Regarding the need for evaluation, the consensus conference participants made
the following declaration:
Design Principle 4.1: Need for Evaluation
Every PFP program should have some
level of evaluation.
The evaluations should include periodic assessments of intended
and unintended impacts on access, costs, quality, health outcomes,
physician satisfaction and patient satisfaction.
Discussion: A
national database of PFP evaluation results should be established
so that organizations implementing PFP programs can share their
experiences. Funders of health services research are encouraged
to support scientifically rigorous studies of innovative programs.
Design Category #5:
Community Participation
Within a single medical market, if each payer develops and implements a unique
set of metrics, the impact of PFP is likely to be limited. Moreover, a consortium
of plans in a community that cooperate on the design and implementation of
PFP is more likely to affect a large enough share of physician income to produce
real change. Metric sets that do not overlap across payers (or purchasers)
using PFP will increase the level of confusion among providers regarding which
aspects of clinical care and practice organization should be the focus of their
change efforts. Payers include both public (Medicare and Medicaid) and private
organizations. By developing a common set of metrics and implementation procedures
across payers within a community, specific community priorities can be the
focus of providers' attention. Community-wide participation has the potential
to transform health care within a geographic region and will give high levels
of visibility to the effort. On the other hand, forcing community-wide participation
could limit health plan innovation and may slow down the implementation process.
Regarding community-wide participation, the consensus conference participants made
the following declaration:
Design Principle 5.1: Community-wide Participation
in Program Development
Employers, public purchasers, payers
and providers serving
the same medical market should develop a common set of
metrics and measurement procedures.
Discussion: Community-wide
participation facilitates statistically valid evaluation of smaller
physician practices by capturing a large share of their patient
populations in quality assessments. Moreover, fewer resources among
physician organizations are required for measurement if a common
approach is utilized. Communities should consider developing common
data sets that aggregate data across payers, purchasers and providers
in order to have a uniform methodology for assessing and reporting
performance. A common set of national metrics and implementation
procedures would greatly enhance the capacity of communities to
coordinate efforts across organizations. Without community-wide
participation, PFP faces a high risk of failure due to the large
burden that will be placed on physicians and their practices.
Design Principle 5.2: Patient Participation
Quality and outcomes of care cannot improve without the active engagement of
patients in health care processes. Their importance to improving quality is
often overlooked.
Regarding involving patients in PFP design and implementation processes, the
consensus conference participants made the following declaration:
Design Principle 5.2: Patient Participation
Patients should be involved in PFP
program development and assessment.
Discussion: Patients
have a critical role in quality improvement as central actors in
care processes. Thus, it is logical that patient preferences should
be incorporated into the design of PFP systems.
A dissenting view expressed is that involving patients in the design process
is logistically problematic and unnecessary for the success of PFP.
Design Category #6:
Fairness
This category relates to how fair PFP is to physicians and patients affected
by it. Superior PFP systems minimize the likelihood that any provider or patient
group will be unjustly impacted by PFP.
Patients do not randomly distribute themselves to their providers. Some professionals
care for sicker or more socially complex patient populations than others. Achieving
quality and outcome targets for these providers will be more difficult than
for those whose patients are healthier. For example, in a study on health plan
quality, organizations with higher percentages of minority, rural and low socio-economic
patients had lower quality ratings. Once differences in patient mix were accounted
for, the quality rankings of some organizations changed substantially (Zaslavsky,
Hochheimer et al. 2000), making some bad apples look good.
Regarding methods to maximize fairness, conference participants made the following
declaration:
Design Principle 6.1: Methods to Maximize Fairness
PFP programs should include methods
to maximize fairness by
addressing differences in patient health status, social complexity
and patient adherence.
Discussion: To
promote fairness, this design principle states that assessments,
and thus payments based on those assessments, should be adjusted
for differences in patient mix across providers (i.e., risk adjusted).
Risk adjustment will minimize the effects of risk selection that
may be unfair to both patients and their providers. Another concern
is that if PFP assessments are done using patient populations that
are too small to provide valid results, results that may be publicly
disclosed would unjustly penalize or reward providers.
Regarding sample size, conference participants made the following declaration:
Design Principle 6.2: Sample Size
PFP assessments should be done using
patient samples that are
large enough to produce statistically meaningful results.
Discussion: If
an adequate sample size cannot be achieved using data from the
physician organization only, results could be pooled across reporting
units in order to gain sufficient sample size. For small practices,
statistical reporting may be unreasonable and alternative methods
for assessment may be needed. No physician organization should
be excluded from PFP programs because of the size of its patient
population.
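To make the sample size concern concrete, the following sketch computes a Wilson score confidence interval for a practice-level rate and shows how pooling counts across reporting units narrows it. The counts, and the choice of the Wilson interval itself, are illustrative assumptions rather than a method endorsed by the conference participants.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def pool(units):
    """Combine (successes, n) counts from several reporting units."""
    return sum(s for s, _ in units), sum(n for _, n in units)


# Hypothetical counts: patients meeting a quality target out of those eligible.
small_practice = (12, 20)
reporting_units = [(12, 20), (30, 50), (78, 130)]

small_lo, small_hi = wilson_interval(*small_practice)
pooled_lo, pooled_hi = wilson_interval(*pool(reporting_units))
```

With these hypothetical numbers, both the small practice and the pooled group have an observed rate of 60 percent, but the interval for the 20-patient practice is several times wider than the interval for the pooled 200 patients. A reward or penalty based on the small sample alone would rest largely on chance variation.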
Section 5. Conclusions
There is growing interest in changing physician compensation
to promote better quality and patient outcomes. Some organizations
have already begun to offer physician organizations bonus payments,
better contracts and other financial rewards to better align
payment with quality. Additionally, public recognition of both
good and poor quality is being used and offers a powerful supplement
to financial incentives. All these programs have been developed
without substantive input from physicians. This document fills
this information gap.
Both positive and potentially negative outcomes may result from pay-for-performance
programs. For example, there may be a change in the holistic, patient-oriented
approach to patient care, if health care is delivered by managing
the metric rather than managing the patient. Specifically, some of the
potentially negative impacts include:
- Disincentives for physicians to practice in areas with patient
populations that have high levels of health care needs or social
complexity;
- Less attention to patients' psychosocial
needs due to an increased biomedical orientation of health
professionals;
- Fragmentation of care that could result from management of
metrics rather than management of patients;
- Physician concern that PFP is being done to decrease their
income;
- If co-payment tiering is used, access could worsen for patients
whose physicians are underperforming and as a result have higher
co-payments;
- Disclosure of PFP participation or results could have deleterious
effects on the doctor-patient relationship among physicians who
are poor performers;
- Loss of the art of medicine because of a preoccupation with
charting, flow charts and other forms of documentation;
- Poorer quality of care for conditions not included in the
incentive system;
- Higher practice administrative costs entailed in generation
of the PFP metrics;
- Incentives for physicians and organizations
to cherry-pick the
easiest patients to manage and
- Poorly run programs may discourage physicians from participating.
It is incumbent on PFP designers to build mechanisms that
monitor the effects of the program on patient access, practice
burden, quality and outcomes for conditions not targeted by the
PFP formula. Unintended negative effects, if detected, should
prompt reassessment, and potentially a redesign, of the PFP program.
If incentives are sufficiently powerful to modify clinician behavior
and practice structure, some possible positive effects include
(adapted in part from Roland 2004):
- Better access to and delivery of preventive services;
- Potential for reducing waste and inefficiency;
- Increased use of electronic information systems, including
medical records and disease registries;
- Stronger connections with community resources that patients
may call on to enhance chronic care self-management;
- Improved primary care management of chronic disease with more
practices specializing in chronic care;
- Better quality; and
- Improved outcomes.
The relative balance between positive
and negative effects needs to be carefully monitored by designers
and evaluators of PFP programs. The ongoing input and feedback
of physicians will be critical to determining the future success
or failure of PFP.
Table. Summary of Pay-for-Performance
Design Principles
