
Custom SLM development

Pain

Most production workloads do not need the largest public model - they need a smaller one that fits latency and cost targets, respects your data rules, and behaves predictably on your tasks.

Delivery

We scope datasets and evaluation gates with you, then train or adapt models and compress them through distillation, quantization, pruning, and PEFT/LoRA where appropriate - so what ships matches what you agreed to measure.

Differentiator

Engagements are built around acceptance criteria and artifacts your ML and platform teams can operate - not a one-off notebook export or a generic checkpoint rename.


How delivery maps to the SLM pipeline

The same four stages we use on the homepage - here in context for custom model work.

Figure: the custom SLM pipeline in four stages, from data through deployment.

What we deliver

Three pillars that usually appear in sequence or in parallel, depending on your baseline model and infrastructure.

Data engineering and curation

We help you define what “good” looks like for your use case: source systems, labeling or weak-supervision strategies, PII and retention policies, train/validation splits, and leakage checks. Deliverables typically include dataset specs, quality reports, and reproducible extraction pipelines aligned with your governance workflow - not a one-time CSV dump.

Figure: sources are normalized into a governed training pack with documented splits and policy checks.
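
For a sense of scale, a train/validation leakage check can start as small as the sketch below - a minimal Python illustration, assuming plain-text records and a normalize-then-hash duplicate rule; the function names are hypothetical, not part of a delivered pipeline.

```python
# Minimal leakage check: hash normalized text and flag validation records
# that collide with the training set. Illustrative only - real dataset specs
# add fuzzy matching and per-source provenance where governance calls for it.
import hashlib

def fingerprint(text: str) -> str:
    # Normalize case and whitespace before hashing so trivial edits still collide.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def leakage_report(train: list[str], validation: list[str]) -> dict:
    train_hashes = {fingerprint(t) for t in train}
    leaked = [v for v in validation if fingerprint(v) in train_hashes]
    return {
        "validation_size": len(validation),
        "leaked_examples": len(leaked),
        "leak_rate": len(leaked) / max(len(validation), 1),
    }

if __name__ == "__main__":
    train = ["Reset my password please", "Invoice 4411 is overdue"]
    val = ["reset my  password PLEASE", "Where is my order?"]
    print(leakage_report(train, val))  # flags the near-duplicate first item
```

Exact-hash matching is only the cheapest gate; near-duplicate detection such as MinHash or embedding similarity layers on top when the dataset spec calls for it.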

Model compression

Distillation transfers behavior from a larger teacher (often run privately inside your boundary) into a student that meets latency targets. Quantization and pruning reduce memory and compute where metrics allow. We document which techniques were applied, how accuracy shifted, and how to re-run compression when baselines change.

Figure: distillation from a private teacher into a student, then optional quantization and pruning under evaluation gates.
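
For readers who want the mechanics, a single distillation step looks roughly like this in PyTorch - a sketch, not our production trainer; the temperature, loss mix, and classifier-shaped logits (batch, classes) are illustrative assumptions, and sequence models would reshape logits first.

```python
# Sketch of a distillation loss: the student matches the teacher's softened
# output distribution while still learning from ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)  # standard rescaling so gradients stay comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# One training step: the teacher runs frozen, only the student updates.
# teacher.eval()
# with torch.no_grad():
#     t_logits = teacher(batch)
# loss = distillation_loss(student(batch), t_logits, labels)
# loss.backward(); optimizer.step()
```

The evaluation gates sit around exactly this loop: each compressed candidate re-runs the agreed metric suite before it can replace the previous baseline.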

PEFT and LoRA

When full fine-tunes are unnecessary, parameter-efficient methods adapt a base checkpoint with smaller update sets - useful for fast iterations and controlled promotion paths. We specify adapter formats, merge strategies, and how your serving stack should load them.

Figure: small adapter matrices train while most base weights stay frozen; merge and serving rules are explicit in handover.
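
To show what "small adapter matrices" means in code, here is a stripped-down LoRA-style layer in PyTorch - written from scratch for clarity rather than with any specific adapter library; the rank, scaling, and init follow common convention but are placeholders, not the adapter format we specify in handover.

```python
# LoRA sketch: a frozen linear layer plus a trainable low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                    # base weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank                    # keeps updates rank-independent

    def forward(self, x):
        # Frozen path plus the scaled low-rank correction.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

    def merge(self) -> nn.Linear:
        # Fold the adapter into the base weight so serving needs no extra ops.
        merged = nn.Linear(self.base.in_features, self.base.out_features,
                           bias=self.base.bias is not None)
        merged.weight.data = (self.base.weight
                              + (self.lora_b @ self.lora_a) * self.scaling).detach()
        if self.base.bias is not None:
            merged.bias.data = self.base.bias.detach().clone()
        return merged
```

The merge method mirrors the serving decision in the figure: either load base weights plus adapters at runtime, or fold the update into the checkpoint before deployment - the handover documents which path your stack should take.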

Deliverables checklist

Exact artifacts are named in the statement of work; the checklist we walk through in planning conversations is a typical superset.

Who it is for

Strong fit

  • Teams with a defined task (support triage, extraction, classification, constrained generation) and measurable quality targets
  • Organizations that can designate data owners and approve access paths for training data
  • Groups ready to run inference on GPUs or inference stacks they operate (or co-design with us)

Usually not a fit (yet)

  • Open-ended “replace Google for everything” mandates without scoped pilots
  • No owner for data quality, retention, or model promotion
  • Expectations of zero engineering involvement after handover

Ready to scope a pilot or PoC?

We align metrics, data access, and infra before proposing a concrete plan.

Request a PoC

Frequently asked questions

Practical answers for technical buyers.

What is the difference between a custom SLM and fine-tuning a public model?
Fine-tuning adapts weights (or adapters) on top of an existing checkpoint. A custom SLM engagement usually includes that adaptation plus explicit work on data curation, evaluation gates, and compression so the resulting model meets latency, cost, and residency targets - not only a new LoRA on top of a generic API.
How do you handle sensitive or regulated data?
We align on where data may live, who can access it, retention limits, and audit expectations before training starts. Work typically stays in environments your security team approves; exact controls are documented in the statement of work. Nothing on this page replaces your legal or DPA process.
Which compression techniques do you use?
Common options include knowledge distillation from a larger teacher, post-training quantization, structured or unstructured pruning where metrics allow, and smaller architectures when re-training from scratch is justified. The mix depends on your accuracy floor and serving hardware - we do not apply a fixed recipe to every client.
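As a toy illustration of the post-training quantization option, the PyTorch call below swaps Linear weights to int8 dynamically - illustrative only, and whether the result clears your accuracy floor is exactly what the evaluation gates decide.

```python
# Dynamic post-training quantization: weights stored as int8, activations
# quantized on the fly. The model and shapes here are placeholders.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 128),
)
model.eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8  # quantize only Linear layers
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, smaller weights - re-run eval before promoting
```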
How long does a first delivery usually take?
Timelines depend on data readiness, evaluation complexity, and infra access. Indicative ranges are summarized on the About page; every schedule is confirmed after discovery. This site does not quote fixed durations.
What do we need to provide from our side?
A product or use-case owner, access to representative data (or agreement on how to collect it), someone who can approve governance decisions, and inference owners who will run or integrate the model. Optional: existing MLOps hooks for CI and promotion.
Can you integrate with our existing MLOps stack?
Yes, when it reduces friction for your teams. We document how artifacts map to your registries, containers, and deployment pipelines rather than forcing a greenfield toolchain.
What happens after a proof of concept?
If metrics meet the agreed gate, we plan production hardening: monitoring, versioning, rollback, and optional scale-out. If not, we document gaps and options - smaller scope, different data, or a different architectural path such as private LLM first.

Discuss data boundaries, evaluation, and rollout with our team

Start with a discovery call or a scoped PoC - see the contact page for both options.