Live inference demo
Pick a model and send a prompt. Watch tokens stream in real time with live throughput in the header.
CueInference can make mistakes. Live throughput is shown in the header in tokens per second (tok/s).
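
For readers curious how a live tok/s figure like the one in the header can be derived, here is a minimal sketch. It assumes the figure is a running average (tokens received so far divided by elapsed time), which may not be how CueInference actually computes it. The function name `stream_with_throughput` and its `clock` parameter are hypothetical and are not part of any CueInference API.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def stream_with_throughput(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[Tuple[str, float]]:
    """Yield (token, tokens_per_second) for each token as it arrives.

    Hypothetical helper: the rate is a running average since the stream
    started, not a windowed or smoothed value.
    """
    start = clock()
    count = 0
    for token in tokens:
        count += 1
        elapsed = clock() - start
        # Guard against a zero elapsed time on the very first token.
        rate = count / elapsed if elapsed > 0 else 0.0
        yield token, rate
```

A caller would iterate over the result, appending each token to the output area and refreshing the displayed rate on every step. The `clock` argument is injectable only so the timing can be replaced with fixed values when testing.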