aidy-models is a model registry for LLM providers and models.
It provides a models.json file and keeps
the data shaped for runtime consumers:
- provider defaults
- model-level overrides
- capability metadata
- pricing metadata
- runtime compatibility metadata
Install the Model Pricing skill to help agents answer pricing questions from the latest registry data:
npx skills add ImSingee/aidy-modelsThe current generator merges:
bun run generateThis rewrites models.json.
models.json has the following top-level structure:
{
"_meta": {},
"providers": {},
"models": {}
}Generation metadata:
generatedAt: ISO timestamp of the last generationsources.modelsDev: provider and model counts frommodels.devsources.lobehub: provider and model counts from Lobehub, pluscommitHashmerged: counts after merge and normalizationoverridesApplied: number of local overrides applied fromsrc/generate/overrides.ts
providers is a map keyed by canonical provider id.
Each provider entry contains default runtime metadata for that provider.
models is a map keyed by canonical provider id.
Each value is an array of models for that provider.
Model entries may override provider defaults for cases where a single provider serves multiple protocols or base URLs.
Consumers should treat provider-level fields as defaults and model-level fields as overrides.
Recommended precedence:
- Start from
provider - Apply
model - For nested
compat, mergeprovider.compatfirst and thenmodel.compat
This matters for fields such as:
apibaseUrlheaderscompat
Example:
github-copilothas provider-level static headers- some
github-copilotmodels overrideapifromopenai-completionstoopenai-responsesoranthropic-messages opencodeprovider defaults toopenai-completions, while some models override toopenai-responses,anthropic-messages, orgoogle-generative-ai
Provider describes provider defaults.
id: canonical provider idname: display nameapi: default API protocol for the providerbaseUrl: default base URL for the providerheaders: static default headers to send for this providercompat: provider-level runtime compatibility defaults
official: whether the provider is treated as an official first-party sourcefeatured: whether the provider should be highlighted in consumer UIsdescription: provider descriptionurl: provider homepagedoc: docs URLenabled: upstream enablement flag if availablecheckModel: model id suitable for health checks or API verificationapiKeyUrl: API key management page
Model describes a concrete model entry.
id: model id used by the provider APIname: display nameapi: optional model-level API overridebaseUrl: optional model-level base URL overrideheaders: optional model-level static headerscompat: optional model-level compatibility overrides_: optional unstable passthrough metadata
description: model descriptiontype: upstream type such aschatfamily: model familyreleasedAt: release date stringknowledge: knowledge cutoff string if availableopenWeights: whether the model has open weightsdeprecated: whether the model is deprecated upstream
abilities.toolCall: supports tool / function callingabilities.reasoning: supports reasoning modeabilities.vision: accepts image inputabilities.structuredOutput: supports structured outputabilities.search: supports native searchabilities.imageOutput: supports image generationabilities.video: supports video input or output, depending on sourceabilities.attachment: supports file or attachment inputabilities.temperature: supports configurable temperatureabilities.interleaved: supports interleaved thinking / tool behavior
contextWindow: max input contextmaxOutput: max output tokensmodalities.input: accepted input modalitiesmodalities.output: supported output modalities
_ is reserved for unstable local metadata that is useful to runtime consumers
but not normalized as first-class schema yet.
OpenAI reasoning models may include:
_.reasoningEffort.enum: supported reasoning effort values_.reasoningEffort.default: default reasoning effort value
reasoningEffort constrains effort on reasoning for reasoning models.
Currently supported values are none, minimal, low, medium, high, and
xhigh.
Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
pricing.currency: pricing currency, such asUSDorCNYpricing.unit: billing unit used by this pricing definitionpricing.basePricing: base rates by pricing targetpricing.adjustments: conditional pricing adjustments
pricing contains:
unit: billing denominator such asmillionTokensorimagebasePricing: default rates keyed by pricing target such astextInputadjustments: conditional changes applied on top ofbasePricing
The current schema assumes a single unit per model pricing definition.
Each adjustment contains:
mode:multiplierorabsolutewhen: condition mapunless: optional exception condition map, or a list of exception condition mapsvalues: target -> factor or final rate
Condition values may be:
string,number, orbooleanfor a matching condition[number, number | "infinity"]for ranges such astextTotalInput: [0.2, "infinity"]
Adjustment application order:
- Start from
basePricing - Iterate over
adjustmentsin array order - For each adjustment, first match
when - Skip the adjustment if any
unlessclause also matches - For each matching target:
- if
modeisabsolute, replace the current rate - if
modeismultiplier, multiply the current rate
- if
If unless is a single object, all keys in that object must match to suppress the
adjustment. If unless is an array, matching any one entry suppresses the
adjustment.
This means adjustment order is significant:
absolutefollowed bymultiplier: both apply, because the later multiplier multiplies the overridden ratemultiplierfollowed byabsolute: only the later absolute result remains, because it replaces the already-multiplied rate
| key | Meaning |
|---|---|
textInput |
prompt or other input text billing |
textOutput |
generated text billing |
textInput_cacheRead |
prompt cache read / cache hit billing |
textInput_cacheWrite |
prompt cache write billing |
audioInput |
audio input billing |
audioOutput |
audio output billing |
audioInput_cacheRead |
cached audio input read billing |
imageInput |
image input billing |
imageInput_cacheRead |
cached image input read billing |
imageOutput |
image output billing |
imageGeneration |
image generation billing |
videoGeneration |
video generation billing |
unit |
Meaning |
|---|---|
millionTokens |
price per 1,000,000 tokens |
millionCharacters |
price per 1,000,000 characters |
image |
price per image |
megapixel |
price per megapixel |
second |
price per second |
Use the billing denominator that matches the provider's published pricing. For example:
- text and most chat models use
millionTokens - some TTS models bill
textInputbymillionCharacters - image generation may bill by
imageormegapixel - audio or video duration-based pricing may bill by
second
The when object uses condition keys to describe when an adjustment applies.
| key | Value type | Meaning |
|---|---|---|
cacheTtl |
string |
prompt cache TTL such as 5m, 1h, or 24h |
fastMode |
boolean |
whether provider fast mode is enabled |
serviceTier |
"flex" | "priority" |
OpenAI service tier selector |
textTotalInput |
[number, number | "infinity"] |
total input-token bucket, including textInput + textInput_cacheRead + textInput_cacheWrite, using pricing.unit as the denominator |
textOutput |
[number, number | "infinity"] |
output-token bucket, using pricing.unit as the denominator |
quality |
string |
image quality variant such as standard or hd |
size |
string |
image size such as 1024x1024 |
generateAudio |
boolean |
whether audio generation is enabled |
thinkingMode |
boolean |
whether a model is in thinking / reasoning mode |
Other condition keys may appear if upstream pricing introduces more dimensions.
When that happens, the value type still follows PricingConditionValue.
serviceTier is currently intended for OpenAI-style responses runtimes. The
known values are:
flex: lower-cost background or latency-tolerant service tierpriority: premium higher-priority service tier
The keys in basePricing and adjustments.values use the same enum:
| key | Meaning |
|---|---|
textInput |
prompt or other input text billing |
textOutput |
generated text billing |
textInput_cacheRead |
prompt cache read / cache hit billing |
textInput_cacheWrite |
prompt cache write billing |
audioInput |
audio input billing |
audioOutput |
audio output billing |
audioInput_cacheRead |
cached audio input read billing |
imageInput |
image input billing |
imageInput_cacheRead |
cached image input read billing |
imageOutput |
image output billing |
imageGeneration |
image generation billing |
videoGeneration |
video generation billing |
values means different things depending on mode:
- if
modeismultiplier,values[target]is a factor such as1.5or2 - if
modeisabsolute,values[target]is the final rate inpricing.unit
{
"currency": "USD",
"unit": "millionTokens",
"basePricing": {
"textInput": 3,
"textOutput": 15,
"textInput_cacheRead": 0.3,
"textInput_cacheWrite": 3.75
},
"adjustments": [
{
"mode": "multiplier",
"when": {
"textTotalInput": [0.272, "infinity"]
},
"values": {
"textInput": 2,
"textOutput": 1.5,
"textInput_cacheRead": 2
}
},
{
"mode": "multiplier",
"when": {
"serviceTier": "priority"
},
"values": {
"textInput": 2,
"textOutput": 2,
"textInput_cacheRead": 2
}
},
{
"mode": "multiplier",
"when": {
"cacheTtl": "1h"
},
"values": {
"textInput_cacheWrite": 1.6
}
}
]
}{
"currency": "USD",
"unit": "image",
"basePricing": {
"imageGeneration": 0.04
},
"adjustments": [
{
"mode": "absolute",
"when": {
"quality": "hd",
"size": "1024x1024"
},
"values": {
"imageGeneration": 0.08
}
}
]
}compat stores runtime behavior that is not captured by basic fields like
api, baseUrl, or abilities.
The goal is to move provider- and model-specific runtime quirks out of client code and into registry data.
compat is split by runtime family:
openaiCompletionsopenaiResponsesanthropicbedrockgooglegoogleGeminiCli
For OpenAI-compatible chat-completions style APIs.
supportsStoresupportsDeveloperRolesupportsReasoningEffortsupportsUsageInStreamingmaxTokensField:max_completion_tokensormax_tokensrequiresToolResultNamerequiresAssistantAfterToolResultrequiresThinkingAsTextrequiresMistralToolIdsthinkingFormat:openai,zai, orqwenopenRouterRoutingvercelGatewayRoutingsupportsStrictModeassistantContentFormat:stringorpartstoolCallIdStrategy:preserve,openai-40,pipe-call-40, ormistral-9
For OpenAI Responses-style APIs.
toolCallIdStrategy:preserveorresponses-fc64longPromptCacheTtl: currently24hsupportsServiceTiersupportsAdditionalServiceTiers: array offlexorpriority
longPromptCacheTtl: currently1hsupportsFastModesupportsAdaptiveThinkingxHighReasoningEffort:highormax
supportsAdaptiveThinkingsupportsPromptCachingsupportsThinkingSignaturexHighReasoningEffort
For Google Gemini / Vertex style reasoning and tool behavior.
requiresToolCallIdsupportsMultimodalFunctionResponsereasoningMode:levelorbudgetreasoningLevelMapdefaultThinkingBudgets
For Cloud Code Assist / Gemini CLI style APIs.
toolSchemaFormat:parametersorinput_schemareasoningModereasoningLevelMapdefaultThinkingBudgets
Current api values used in this registry include:
openai-completionsopenai-responsesazure-openai-responsesanthropic-messagesbedrock-converse-streamgoogle-generative-aigoogle-gemini-cligoogle-vertex
- Not every field is present on every provider or model.
- Some upstream fields and local override metadata that are useful but not yet
normalized may live under
_. These fields are unstable and may change at any time.