Sample data: Phase-1 sample snapshot. Official crawling and weekly benchmark jobs are not connected yet. All price, latency and score values validate the product structure only and must be replaced by traceable production data before launch.

Model comparison

The `models=a,b,c` URL parameter already drives the comparison page; selectors and saved comparisons come next.
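
A minimal sketch of how a `models=a,b,c` parameter could be turned into a list of model ids. The function name `parseModelsParam`, the trimming and de-duplication rules, and the example ids are assumptions made for illustration; the page's actual parsing logic is not shown here.

```typescript
// Hypothetical helper: parse the `models` query parameter into an ordered,
// de-duplicated list of model ids. Empty entries (e.g. "a,,b") are dropped.
function parseModelsParam(query: string): string[] {
  // URLSearchParams accepts a query string with or without a leading "?".
  const raw = new URLSearchParams(query).get("models") ?? "";
  const seen = new Set<string>();
  for (const part of raw.split(",")) {
    const id = part.trim();
    if (id !== "") {
      seen.add(id); // a Set keeps first-insertion order
    }
  }
  return Array.from(seen);
}

// Example ids are made up; the real id scheme is not documented in this page.
const selected = parseModelsParam("?models=deepseek-v3,gpt-4o,claude-3.5-sonnet");
console.log(selected);
```

Keeping the order of first appearance means the columns of the comparison would follow the order given in the URL, which seems like the least surprising behavior for a shareable link.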

| Model | Input | Output | TTFT | Context | Value | Updated |
| --- | --- | --- | --- | --- | --- | --- |
| DeepSeek V3 (DeepSeek · closed) | $0.14/1M | $0.28/1M | 124ms | 128K | 96 | 2026-06-09 |
| GPT-4o (OpenAI · closed) | $2.50/1M | $10.00/1M | 89ms | 128K | 73 | 2026-06-09 |
| Claude 3.5 Sonnet (Anthropic · closed) | $3.00/1M | $15.00/1M | 95ms | 200K | 70 | 2026-06-09 |
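
To make the Input and Output prices listed above concrete, here is a rough sketch of a per-request cost calculation. It assumes that "/1M" means US dollars per one million tokens; the `Pricing` type, the `requestCostUsd` function, and the 2,000-input / 500-output token request are all hypothetical. The prices are the sample values from this snapshot, not production data.

```typescript
// Hypothetical pricing record; values are USD per 1,000,000 tokens (assumed unit).
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Cost of one request: each token count is weighted by its per-million price.
function requestCostUsd(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) / 1e6;
}

// Sample prices copied from the comparison rows above.
const samplePricing: Record<string, Pricing> = {
  "DeepSeek V3": { inputPerMillion: 0.14, outputPerMillion: 0.28 },
  "GPT-4o": { inputPerMillion: 2.5, outputPerMillion: 10.0 },
  "Claude 3.5 Sonnet": { inputPerMillion: 3.0, outputPerMillion: 15.0 },
};

// Illustrative request size: 2,000 input tokens and 500 output tokens.
for (const [name, pricing] of Object.entries(samplePricing)) {
  const cost = requestCostUsd(pricing, 2000, 500);
  console.log(`${name}: $${cost.toFixed(5)} per request`);
}
```

Because output tokens are priced higher than input tokens in every row, the output share of a request tends to dominate the cost as responses get longer.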

DeepSeek V3

DeepSeek · Strong value baseline for coding and Chinese tasks in the sample set.

Quality: 86 · Chinese: 93

GPT-4o

OpenAI · High quality and low TTFT in this sample, with higher output token cost.

Quality: 93 · Chinese: 84

Claude 3.5 Sonnet

Anthropic · Excellent coding and reasoning sample scores; expensive output tier.

Quality: 94 · Chinese: 82