ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching | Synapse