Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks | Synapse