Values that guide action, not just rules.
Internal Ethics & Morals is a research framework for building AI systems that reason about ethical dilemmas using value frameworks rather than rigid rule sets. Instead of relying on brittle, context-blind policies, this approach enables systems to weigh competing values, explain trade-offs, and adapt to diverse cultural and situational contexts while maintaining universal ethical principles.
The core principle: flexible reasoning grounded in values, not inflexible compliance with rules.
Rule-based ethical systems are fundamentally limited. They cannot anticipate every scenario, they invite gaming and loopholes, and they fail in novel contexts. Human moral reasoning draws on values, principles, and contextual judgment—not simple if-then logic.
This framework enables AI to:
- Reason about ethical trade-offs in ambiguous situations
- Adapt to diverse cultural and community norms
- Explain decisions in terms of underlying values
- Maintain accountability through transparent reasoning
Rules provide clarity but lack nuance:
- They cannot cover every possible situation
- They encourage adversarial gaming and edge-case exploitation
- They provide no guidance for novel scenarios
- They offer no mechanism for weighing competing goods
For example, a teacher discovering academic dishonesty must balance:
- Honesty: Addressing the violation
- Fairness: Treating all students equitably
- Compassion: Understanding circumstances and supporting growth
AI should reason in the same multifaceted way, surfacing trade-offs rather than applying blanket policies. Recurring tensions include:
- Privacy vs. Safety: When does protecting one compromise the other?
- Autonomy vs. Welfare: When should systems intervene for user well-being?
- Short-term Comfort vs. Long-term Growth: When is challenge appropriate?
- Individual Rights vs. Collective Good: How to balance competing interests?
Systems must make these tensions explicit and reason about them contextually.
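As a minimal sketch of what "making tensions explicit" could look like, the snippet below represents tensions as data the system can surface to users rather than logic buried in policy code. The names (`ValueTension`, `CORE_TENSIONS`) are illustrative assumptions, not part of any released implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ValueTension:
    """A named pair of values that can pull in opposite directions."""
    value_a: str
    value_b: str
    guiding_question: str

# The tensions listed above, made explicit so the system can present
# them to users instead of silently picking a side.
CORE_TENSIONS = [
    ValueTension("privacy", "safety",
                 "When does protecting one compromise the other?"),
    ValueTension("autonomy", "welfare",
                 "When should the system intervene for user well-being?"),
    ValueTension("short-term comfort", "long-term growth",
                 "When is challenge appropriate?"),
    ValueTension("individual rights", "collective good",
                 "How should competing interests be balanced?"),
]
```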
Communities hold different norms and values:
- A tutoring system for conservative religious families should respect their values
- A system in secular educational settings operates within different boundaries
- Collectivist cultures prioritize group harmony differently than individualist ones
The framework allows configurable cultural and contextual adaptation while maintaining non-negotiable universal principles:
Universal principles (non-configurable):
- Human dignity and worth
- Prevention of harm
- Honesty and non-deception
- Respect for autonomy
Contextual values (configurable):
- Formality and communication style
- Privacy boundaries and data sharing norms
- Authority structures and decision-making
- Definitions of appropriate content and topics
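One way this split between fixed principles and configurable context could be expressed is sketched below. The `UNIVERSAL_PRINCIPLES` constant and `ContextProfile` class are hypothetical names chosen for illustration; the point is that deployment configuration can reweight contextual values but can never disable a universal principle.

```python
from dataclasses import dataclass, field

# Universal principles are fixed at the framework level and cannot be
# overridden by deployment configuration (illustrative identifiers).
UNIVERSAL_PRINCIPLES = frozenset({
    "human_dignity",
    "harm_prevention",
    "honesty",
    "respect_for_autonomy",
})

@dataclass
class ContextProfile:
    """Deployment-specific settings a community or operator may adjust."""
    formality: str = "neutral"              # communication style
    privacy_boundaries: str = "strict"      # data-sharing norms
    authority_model: str = "guardian"       # decision-making structure
    restricted_topics: set[str] = field(default_factory=set)
    deprioritized_values: set[str] = field(default_factory=set)

    def validate(self) -> None:
        # A deployment may reweight contextual values, but it can never
        # deprioritize a universal principle.
        blocked = UNIVERSAL_PRINCIPLES & self.deprioritized_values
        if blocked:
            raise ValueError(
                f"Universal principles are non-configurable: {sorted(blocked)}")
```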
When encountering novel situations, systems use analogical reasoning:
- Identify similar past cases
- Extract relevant value conflicts
- Apply principled reasoning to new context
- Acknowledge uncertainty and defer when appropriate
Critical constraint: Analogical reasoning supports human judgment, never replaces it.
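A sketch of those four steps is shown below, assuming a hypothetical `PrecedentCase` library and a caller-supplied `similarity` function. It is meant to show the shape of the logic, not a definitive implementation; note that a weak analogy results in deferral, not a recommendation.

```python
from dataclasses import dataclass

@dataclass
class PrecedentCase:
    description: str
    value_conflicts: list[str]
    resolution: str

def reason_by_analogy(situation: str,
                      case_library: list[PrecedentCase],
                      similarity,                  # callable: (str, str) -> float in [0, 1]
                      confidence_threshold: float = 0.7):
    """Sketch of the four analogical steps; supports, never replaces, human judgment."""
    # 1. Identify similar past cases.
    scored = sorted(case_library,
                    key=lambda c: similarity(situation, c.description),
                    reverse=True)
    best = scored[0] if scored else None
    score = similarity(situation, best.description) if best else 0.0

    # 2. Extract the value conflicts the closest precedent surfaced.
    conflicts = best.value_conflicts if best else []

    # 3. Apply principled reasoning to the new context (elided in this sketch).
    proposal = best.resolution if best else None

    # 4. Acknowledge uncertainty and defer when the analogy is weak.
    if score < confidence_threshold:
        return {"action": "defer_to_human", "conflicts": conflicts,
                "note": f"closest precedent similarity {score:.2f} is below threshold"}
    return {"action": "recommend", "proposal": proposal, "conflicts": conflicts,
            "note": "recommendation only; the final decision rests with a person"}
```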
Mature ethical reasoning integrates multiple moral traditions:
Deontological (Duty-Based)
- Are there inviolable rules or duties at stake?
- What principles apply regardless of consequences?
Consequentialist (Outcome-Based)
- What are the likely outcomes of different actions?
- How do we weigh harms and benefits across stakeholders?
Virtue Ethics (Character-Based)
- What would a person of good character do?
- What habits and dispositions should guide this choice?
Context determines which framework receives priority, with human oversight ensuring appropriate balance.
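As an illustration of how the three traditions might be held side by side rather than collapsed into one, here is a small sketch. `FRAMEWORK_QUESTIONS` and `frame_dilemma` are hypothetical names; the context weights would come from the contextual calibration described elsewhere in this framework.

```python
# Each moral tradition contributes guiding questions; context assigns
# relative weight under human oversight (illustrative sketch).
FRAMEWORK_QUESTIONS = {
    "deontological": [
        "Are there inviolable rules or duties at stake?",
        "What principles apply regardless of consequences?",
    ],
    "consequentialist": [
        "What are the likely outcomes of different actions?",
        "How do harms and benefits weigh across stakeholders?",
    ],
    "virtue": [
        "What would a person of good character do?",
        "What habits and dispositions should guide this choice?",
    ],
}

def frame_dilemma(context_weights: dict[str, float]) -> list[tuple[float, str]]:
    """Order the guiding questions by the weight context assigns each tradition."""
    prompts = [(context_weights.get(name, 0.0), question)
               for name, questions in FRAMEWORK_QUESTIONS.items()
               for question in questions]
    return sorted(prompts, key=lambda pair: pair[0], reverse=True)
```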
When a system declines a request, it must:
- Identify the values at stake (e.g., "This could compromise user privacy")
- Explain the harms considered (e.g., "Sharing this data could enable identity theft")
- Acknowledge trade-offs (e.g., "I understand this limits convenience")
- Offer alternatives when possible (e.g., "Here's a safer approach")
Never just say "I can't do that" without explanation.
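One possible way to enforce that structure in code is sketched below; the `Refusal` class is illustrative, not an existing API. The design choice it encodes is that a refusal without named values and harms is simply rejected as malformed.

```python
from dataclasses import dataclass, field

@dataclass
class Refusal:
    """A refusal is only valid if it names values, harms, and trade-offs."""
    values_at_stake: list[str]
    harms_considered: list[str]
    tradeoffs_acknowledged: list[str]
    alternatives: list[str] = field(default_factory=list)

    def render(self) -> str:
        if not (self.values_at_stake and self.harms_considered):
            raise ValueError("A bare 'I can't do that' is not an acceptable refusal.")
        parts = [
            "I can't help with that as asked.",
            "Values at stake: " + ", ".join(self.values_at_stake),
            "Harms considered: " + ", ".join(self.harms_considered),
            "Trade-offs: " + ", ".join(self.tradeoffs_acknowledged),
        ]
        if self.alternatives:
            parts.append("Safer alternatives: " + ", ".join(self.alternatives))
        return "\n".join(parts)
```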
Transparent reasoning enables:
- User understanding of system limitations and priorities
- Correction mechanisms when the system's judgment is wrong
- Accountability for designers and operators
- Ongoing improvement through feedback on value conflicts
AI ethical reasoning is supportive, not authoritative:
- Systems surface values and trade-offs for human consideration
- Final decisions in high-stakes scenarios rest with people
- Systems acknowledge uncertainty and limits of their reasoning
- Escalation pathways exist for complex or novel situations
Human oversight ensures:
- Value alignment reflects community standards appropriately
- Error correction when reasoning fails or misfires
- Boundary enforcement on non-negotiable ethical principles
- Adaptation as social norms and contexts evolve
- Core Value Ontology: Structured representation of fundamental values
- Contextual Weight Assignment: How values are prioritized in different situations
- Conflict Detection: Identifying when values compete or contradict
- Resolution Strategies: Frameworks for navigating value tensions
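These components could be sketched roughly as follows; the `Value` and `WeightedValue` types and the `detect_conflicts` helper are assumptions made for illustration. Resolution strategies would then consume the flagged pairs together with the contextual weights; that stage is deliberately left out of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Value:
    name: str
    description: str

@dataclass
class WeightedValue:
    value: Value
    weight: float          # contextual priority in [0, 1]

def detect_conflicts(activated: list[WeightedValue],
                     known_tensions: set[frozenset[str]]) -> list[tuple[str, str]]:
    """Flag pairs of activated values that sit in a known tension."""
    names = [wv.value.name for wv in activated]
    conflicts = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if frozenset({first, second}) in known_tensions:
                conflicts.append((first, second))
    return conflicts
```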
- Situation Assessment: Understanding stakeholders, context, and potential outcomes
- Value Mapping: Identifying which values are activated by the situation
- Trade-off Analysis: Weighing competing considerations
- Justification Generation: Explaining the reasoning chain in human terms
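A minimal sketch of that pipeline as a chain of pluggable stages is shown below; every function name here is hypothetical, and each stage would be supplied by the components described above.

```python
def ethical_reasoning_pipeline(situation, assess, map_values, analyze_tradeoffs, justify):
    """Chain the four stages; each argument is a pluggable stage (hypothetical interfaces)."""
    assessment = assess(situation)               # stakeholders, context, likely outcomes
    activated = map_values(assessment)           # which values the situation activates
    tradeoffs = analyze_tradeoffs(activated)     # weigh competing considerations
    justification = justify(assessment, activated, tradeoffs)  # human-readable reasoning chain
    return {
        "assessment": assessment,
        "activated_values": activated,
        "tradeoffs": tradeoffs,
        "justification": justification,
    }
```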
- User corrections inform value weight adjustments
- Community input shapes contextual calibration
- Edge cases expand the reasoning case library
- Failures trigger systematic review and refinement
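One plausible, simplified shape for the first of these feedback paths is sketched below, assuming contextual weights are kept as a normalized dictionary. The specifics of `apply_correction` are an assumption for illustration, not a documented mechanism; universal principles would be adjusted only through governance review, never by this automatic path.

```python
def apply_correction(weights: dict[str, float],
                     undervalued: str,
                     learning_rate: float = 0.05) -> dict[str, float]:
    """Nudge the contextual weight of a value a human reviewer flagged as undervalued,
    then renormalize so the weights still sum to 1."""
    updated = dict(weights)
    updated[undervalued] = updated.get(undervalued, 0.0) + learning_rate
    total = sum(updated.values())
    return {name: weight / total for name, weight in updated.items()}
```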
- Educational Systems: Balancing academic integrity, student welfare, and growth
- Healthcare AI: Navigating autonomy, beneficence, and resource constraints
- Content Moderation: Weighing free expression, safety, and community standards
- Autonomous Vehicles: Life-and-death decisions requiring transparent ethical reasoning
- Personal Assistants: Privacy, autonomy, and well-being trade-offs in daily life
Between recklessness and paralysis.
Without internal ethics, AI systems oscillate between two failures:
- Reckless: Ignoring ethical implications entirely, causing harm
- Over-restrictive: Refusing benign requests out of excessive caution
Value-based reasoning enables responsible action that people can understand and rely on—flexible enough to handle complexity, principled enough to prevent harm, and transparent enough to invite correction when wrong.
- No deception: Systems never mislead about their reasoning or capabilities
- No exploitation: Ethical reasoning serves user welfare, not platform objectives
- No abdication: Systems take responsibility for their recommendations and actions
- No opacity: Reasoning must be explainable in terms humans can understand
- Value pluralism: Respect for diverse ethical frameworks within universal bounds
- Epistemic humility: Acknowledging limits and uncertainty in moral reasoning
- Human primacy: People retain final authority in ethical decision-making
- Continuous refinement: Ongoing learning from failures and edge cases
- How do we validate that value frameworks align with stated principles?
- When should systems refuse vs. warn vs. comply with problematic requests?
- How can we ensure value calibration doesn't drift toward lowest-common-denominator ethics?
- What oversight mechanisms prevent value frameworks from being gamed or corrupted?
These questions drive ongoing research and require interdisciplinary collaboration.