I was just playing with Cerebras a few days ago because it's the fastest inference provider by far. Unfortunately, the only model anywhere near economical to run that fast is gpt-oss-120b, which sucks at Pi's tool calling. So I've been hoping for something faster ever since, especially since my local hardware has a paltry 128GB of unified memory.
Hopefully this pans out and fast models (that are also not ridiculously dumb) become the norm. It's amazing what you can unlock with even a single order of magnitude's speed improvement.