Handling performance degradation
AI systems are probabilistic and will sometimes produce incorrect or unexpected output, so it is critical to plan for errors and failures from the earliest stages of development.
What users consider an error depends on their expectations. A recommendation system that is useful 60% of the time might be judged a success or a failure depending on the stakes and the user's context, and those perceptions establish or correct mental models and calibrate trust.

Consider the different ways a system can be wrong. A medical diagnosis AI might miss a condition (a false negative) or flag healthy patients (a false positive). A translation app might produce grammatically correct but culturally inappropriate phrases. A route planner might suggest a technically shorter path through an unsafe area.
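Because users experience these error types very differently, it helps to measure them separately rather than folding them into a single accuracy number. Below is a minimal sketch of that idea; the function name, inputs, and the screening example are illustrative assumptions, not part of any particular product's code.

    # Count false positives and false negatives separately so each
    # can be weighed against its own user-facing cost.
    def error_breakdown(predictions, labels):
        false_positives = sum(1 for p, y in zip(predictions, labels) if p and not y)
        false_negatives = sum(1 for p, y in zip(predictions, labels) if not p and y)
        return {"false_positives": false_positives, "false_negatives": false_negatives}

    # In a medical-screening context, a missed condition (false negative)
    # is usually far more costly than an unnecessary follow-up (false
    # positive), so the two counts should never be collapsed into one metric.
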
Design your system knowing that some people will intentionally abuse it. Make failure safe and boring: avoid making dangerous failures interesting, and don't over-explain vulnerabilities, either of which can encourage people to reproduce them.
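One way this shows up in practice is in how failures are surfaced. The sketch below, with an assumed logging setup and illustrative message text, keeps the details internal and the user-facing response deliberately plain.

    import logging

    logger = logging.getLogger(__name__)

    def safe_failure_response(error: Exception) -> str:
        # Full details go to internal logs for engineers, not to the user.
        logger.error("model request failed", exc_info=error)
        # The user-facing message is deliberately neutral: no stack traces,
        # no hints about which input triggered the failure.
        return "Something didn't work this time. You can try again or continue manually."
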
When AI fails, the easiest path forward is often to let users take over. To do so, they need to be aware of the situation, understand what to do next, and have a way to act. Error messages should sound human, not machine-like: acknowledge the mistake, explain the system's limits, and invite people to keep going.[1]
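As a sketch of that takeover path, the example below pairs a failed suggestion with a fallback that covers the three needs above: awareness of what happened, a next step, and an action the user can take. The route-planning scenario, the suggest_route parameter, and the message text are all hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Fallback:
        what_happened: str    # awareness of the situation
        what_to_do: str       # understanding of the next step
        manual_action: str    # an action the user can take right now

    def plan_route(origin, destination, suggest_route):
        """suggest_route is a hypothetical model call, passed in for illustration."""
        try:
            return suggest_route(origin, destination)
        except Exception:
            return Fallback(
                what_happened="We couldn't suggest a route just now.",
                what_to_do="You can enter waypoints yourself, or try again in a moment.",
                manual_action="open_manual_route_editor",
            )
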