Qwen 3 Coder on Docker Hub would be a good model to test this with: https://github.com/ggml-org/llama.cpp/pull/17570

The maximum context size a MacBook Pro with 36 GB of memory can handle is 65536 tokens:

llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_M -c 65536
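
Once the server is up, a quick smoke test might look like the following. This is a sketch assuming llama-server's defaults (host 127.0.0.1, port 8080, and its OpenAI-compatible chat endpoint); the prompt is just an illustration:

# Send one chat completion request to the locally running server
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Write a hello world in Go."}]}'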