These are not aspirations. They are the standing requirements that any system we ship — for ourselves or a customer — must meet. Where a specific engagement’s constraints make a principle impossible to honour fully, we say so in writing, explain the trade-off, and seek explicit customer authorisation rather than compromising silently.
1. Human oversight by design
Every AI system has a designed point of human oversight proportionate to its risk tier. Models that affect a person’s rights, credit, employment, health, or safety carry stronger oversight than internal productivity copilots. Oversight is a product feature — a queue, an interface, a confidence surface — not a meeting.
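As an illustration only, tier-proportionate oversight could be sketched as a routing rule that decides whether a decision goes to a human review queue. The names `RiskTier`, `Decision`, `route`, and the threshold `LOW_TIER_MIN_CONFIDENCE` are hypothetical, not part of any specific product.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "low"    # e.g. an internal productivity copilot
    HIGH = "high"  # e.g. affects rights, credit, employment, health, or safety


@dataclass
class Decision:
    subject_id: str
    outcome: str
    confidence: float  # model-reported score in [0, 1]
    tier: RiskTier


# Hypothetical threshold; a real value would be set per engagement.
LOW_TIER_MIN_CONFIDENCE = 0.6


def route(decision: Decision) -> str:
    """Return 'human_review' or 'auto' for a single decision."""
    # Assumption: high-tier decisions always pass through a reviewer.
    if decision.tier is RiskTier.HIGH:
        return "human_review"
    # Low-tier decisions are reviewed only when the model is unsure.
    if decision.confidence < LOW_TIER_MIN_CONFIDENCE:
        return "human_review"
    return "auto"
```

In this sketch the review queue is the oversight point: the high tier never bypasses it, and the low tier reaches it only on low confidence.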
2. Outcomes grounded in sources
For systems that produce factual claims, we engineer source grounding by default. Every claim should be traceable to a retrieved chunk, a referenced document, or a structured fact. Confident hallucination is a failure of design, not the user’s problem.
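One way to make traceability checkable is to require each generated claim to carry the identifiers of the sources it relies on, and to flag any claim that cites nothing or cites something that was never retrieved. This is a minimal sketch under that assumption; `Claim` and `ungrounded_claims` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    source_ids: list[str] = field(default_factory=list)


def ungrounded_claims(claims: list[Claim], retrieved_ids: set[str]) -> list[Claim]:
    """Return claims with no citation, or with a citation outside the retrieved set."""
    flagged = []
    for claim in claims:
        cited = set(claim.source_ids)
        if not cited or not cited <= retrieved_ids:
            flagged.append(claim)
    return flagged
```

A response with any flagged claim could then be blocked, rewritten, or sent for review, depending on the risk tier. The check confirms that a citation exists, not that the source actually supports the claim; that would need a separate evaluation.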
3. Transparency about AI involvement
End users interacting with AI are told so, in plain language. Where the system makes a material decision about the user, they have access to a human-reviewable explanation, the data inputs used, and an appeals path consistent with applicable regulation.
4. Fairness assessment, not assumption
For models that affect protected outcomes, we evaluate performance and error distribution across subgroups before deployment and on an ongoing cadence. Where we find unjustifiable disparity, we close it before promotion — or we do not promote.
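A subgroup check of this kind can be sketched as a promotion gate on the gap in error rates between groups. The function names and the `max_gap` default below are illustrative assumptions; the appropriate metric and tolerance depend on the use case and applicable regulation.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """records: iterable of (group, y_true, y_pred) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += 1
        errors[group] += int(y_true != y_pred)
    return {group: errors[group] / totals[group] for group in totals}


def max_disparity(rates):
    """Largest gap in error rate between any two subgroups."""
    return max(rates.values()) - min(rates.values())


def may_promote(records, max_gap=0.05):
    """Hypothetical gate: block promotion when the gap exceeds max_gap."""
    return max_disparity(error_rates_by_group(records)) <= max_gap
```

Overall error rate is only one lens; the same structure works for false-positive or false-negative rates by changing what counts as an error.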
5. Safety and red-teaming as standard
We red-team systems against the OWASP Top 10 for LLM Applications and the MITRE ATLAS catalogue before launch and on a continuous regression basis. Any system in long-term operation also gets a quarterly red-team campaign. New attack patterns are added to the regression suite the same week they appear publicly.
6. Refusal as a first-class behaviour
We engineer models to refuse rather than guess on out-of-domain or high-risk queries. Refusal rate is instrumented and reviewed. A well-calibrated system declines what it does not know — confidently and with a clear escalation path for the user.
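To show what an instrumented refusal might look like, here is a minimal sketch of a gate that declines out-of-domain or low-confidence queries and counts how often it does so. `RefusalGate`, its `min_confidence` default, and the refusal message are hypothetical; how confidence and domain membership are estimated is outside the sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    refused: bool


class RefusalGate:
    def __init__(self, min_confidence: float = 0.7):
        self.min_confidence = min_confidence  # assumed threshold
        self.counts = Counter()

    def respond(self, draft: str, confidence: float, in_domain: bool) -> Answer:
        self.counts["total"] += 1
        if not in_domain or confidence < self.min_confidence:
            self.counts["refused"] += 1
            # The refusal names an escalation path instead of guessing.
            return Answer("I can't answer that reliably. Please contact a specialist.", True)
        return Answer(draft, False)

    @property
    def refusal_rate(self) -> float:
        total = self.counts["total"]
        return self.counts["refused"] / total if total else 0.0
```

Tracking `refusal_rate` over time is what makes the behaviour reviewable: a rate that climbs or collapses suddenly is a signal to investigate.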
7. Data and model provenance
We track and document the data used to train, fine-tune, evaluate, and ground every system. We use foundation models from providers whose training-data practices we understand. Where licensing or provenance is unclear, we treat it as a risk and act accordingly.
8. Cost and energy awareness
We choose models and architectures with their cost and energy profile in mind. A 7B-parameter model that meets the requirement is not just cheaper; it is the right professional choice. The smallest model that does the job is the right model.
9. Operate for the long run
Every production AI system we ship comes with drift detection, scheduled retraining triggers, an incident-response plan, and a documented ownership matrix. We do not ship and walk away. Most AI value is destroyed in the first year of unmanaged operation, not in the build.
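As one possible form of drift detection, the sketch below computes the Population Stability Index (PSI) between a reference sample and a live sample of a single numeric feature, and uses it as a retraining trigger. PSI is a swapped-in example technique, not necessarily the method used on any given engagement, and the 0.25 threshold is a common rule of thumb treated here as an assumption.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_retraining(expected, actual, threshold=0.25):
    """Hypothetical trigger: flag retraining when PSI exceeds the threshold."""
    return psi(expected, actual) > threshold
```

In practice a check like this would run on a schedule per feature and per model output, with alerts routed to the owner named in the ownership matrix.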
10. Stop when stopping is right
If a system fails its production KPI, drifts beyond acceptable bounds, or produces a harm we did not anticipate, we pause and review before continuing. Kill criteria are written into the engagement plan, not improvised in a crisis.
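Writing kill criteria down in advance could be as simple as a small, versioned record that is evaluated mechanically. The sketch below assumes two numeric criteria plus a harm flag; `KillCriteria`, `pause_reasons`, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillCriteria:
    min_kpi: float    # lowest acceptable value of the production KPI
    max_drift: float  # highest acceptable drift score


def pause_reasons(criteria, kpi, drift, unanticipated_harm):
    """Return the list of triggered criteria; an empty list means continue."""
    reasons = []
    if kpi < criteria.min_kpi:
        reasons.append("kpi_below_target")
    if drift > criteria.max_drift:
        reasons.append("drift_out_of_bounds")
    if unanticipated_harm:
        reasons.append("unanticipated_harm")
    return reasons
```

Because the record is frozen and agreed up front, a pause decision refers to thresholds set before the incident, not to judgement made during it.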
11. Foundation-model selection
We choose foundation models based on demonstrated capability, security posture, evaluation maturity, and the provider’s public stance on safety. We do not commit to a single provider for the lifetime of any system; portability is a design constraint, not an afterthought.
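Portability as a design constraint usually means application code depends on a narrow interface and not on a provider SDK. The following is a minimal sketch of that idea using a structural `Protocol`; `TextModel`, `ProviderA`, `ProviderB`, and `summarise` are hypothetical stand-ins, and no real provider API is shown.

```python
from typing import Protocol


class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...


class ProviderA:
    """Stand-in adapter; a real one would wrap a provider's client."""

    def complete(self, prompt: str) -> str:
        return f"[A] {prompt}"


class ProviderB:
    """Second stand-in adapter with the same interface."""

    def complete(self, prompt: str) -> str:
        return f"[B] {prompt}"


def summarise(model: TextModel, text: str) -> str:
    # Application code sees only the TextModel interface.
    return model.complete(f"Summarise: {text}")
```

Swapping providers then means writing one new adapter and re-running the evaluation suite, with no changes to the calling code.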
12. Honest assessment
If AI is not the right answer to a use case, we say so. If the data is not ready, we say that. The strongest credibility move a consultancy can make is to walk away from work that should not happen. We do.
These principles map directly to the EU AI Act, ISO/IEC 42001, and the NIST AI Risk Management Framework. For specific control mappings, see the Compliance page.