Model size trade-offs, embedding dimension costs, ANN recall vs latency, RAG failure modes, RRF scoring, and prompt engineering with sp_invoke_external_rest_endpoint!