Understanding users' environments is crucial for determining their states, needs, and interactions with technology. This work focuses on route context: environmental factors such as road conditions, traffic, and weather that influence users while traveling. Integrating route context with LLMs enables reasoning over these factors, allowing users to ask questions like 'When is the best moment for a phone call along my route?' or 'Is this a good route for a drive in a convertible?' We introduce the first LLM that natively understands route context. We create ContextualRoutes, a dataset of 320k routes, each comprising road, weather, and traffic data. We annotate these routes using a template and a teacher model to create LabeledRoutes, a multimodal multi-task question-answering dataset with over 1k tasks and 40k conversations containing routes and text. Using ContextualRoutes, we train the first route context tokenizer, which groups routes into semantically meaningful clusters. Building on this tokenizer, we propose the first route-context-aware LLM and find it capable of zero-shot reasoning on routes. Still, we emphasize that further research on learning cross-modal route-to-text understanding is necessary, and we discuss challenges in the future development of artifacts for this novel branch of research.
Hallgarten et al. studied this question.