How improper dataset split hinders model generalizability: a systematic comparison in Human activity recognition and exercise evaluation tasks | Synapse