An attention mechanism is integrated with embeddings jointly learned from an implicit deep learning model.
Gary Nan Tie (Thu,) studied this question.
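
To make the idea concrete, the following is a minimal sketch, not the author's method. It assumes the implicit model takes the common fixed-point form x = relu(A x + B u), and that the resulting states x serve as embeddings for a standard scaled dot-product attention step. The matrices A and B, the toy inputs, and all function names are hypothetical, chosen only for illustration. No training is performed here, so the "jointly learned" aspect is not shown.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def relu(v):
    """Apply ReLU componentwise."""
    return [max(0.0, x) for x in v]


def implicit_embedding(A, B, u, tol=1e-10, max_iter=500):
    """Solve x = relu(A x + B u) by fixed-point iteration.

    Assumption: the infinity norm of A is below 1, which makes the map
    a contraction (ReLU is 1-Lipschitz), so the iteration converges.
    """
    Bu = matvec(B, u)
    x = [0.0] * len(A)
    for _ in range(max_iter):
        x_new = relu([a + b for a, b in zip(matvec(A, x), Bu)])
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


# Hypothetical toy parameters: each row of A has absolute sum below 1.
A = [[0.2, -0.1], [0.0, 0.3]]
B = [[1.0, 0.5], [-0.5, 1.0]]
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Embeddings come from the implicit model; attention then mixes them.
embeddings = [implicit_embedding(A, B, u) for u in inputs]
weights, context = attention(embeddings[0], embeddings, embeddings)
print("attention weights:", weights)
print("context vector:", context)
```

In this sketch the embeddings double as keys and values, and the first embedding is used as the query. A real model would typically apply separate learned projections and train A and B together with the attention parameters.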