Estimate the token count of LLM prompts. Uses a character-level approximation (~4 chars/token for English text). Results are approximate; actual tokenization varies by model.
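
The approximation described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function name `estimate_tokens`, the `chars_per_token` parameter, and the choice to round up are all assumptions made for the example.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate how many tokens `text` would use.

    Hypothetical helper: divides the character count by an assumed
    average of ~4 characters per token (a common rule of thumb for
    English) and rounds up. Real tokenizers will give different counts.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    prompt = "Hello, world!"  # 13 characters
    print(estimate_tokens(prompt))  # 13 / 4 = 3.25, rounded up to 4
```

Rounding up is a deliberately conservative choice here, since overestimating is usually safer than underestimating when checking a prompt against a context limit. For exact counts, the tokenizer of the target model would need to be used instead.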