The wide availability and richness of microblogs have motivated a plethora of recent work on querying, analyzing, and visualizing them (see [3] for a brief survey). Examples of microblogs include tweets, online reviews, and comments on news websites. Unfortunately, existing work on microblogs lacks data management tools that provide the infrastructure needed for efficient storage, indexing, and retrieval. Hence, researchers, developers, and practitioners who need to process microblogs for their own purposes must either build their own ad hoc techniques [5] or use an existing general-purpose big data engine, e.g., Spark, as their backbone [4]. Ad hoc techniques do not scale to large data sizes. Meanwhile, general-purpose big data engines are built generically to support a variety of query workloads. They are therefore not equipped to handle the characteristics of microblogs [2]: they lack necessary infrastructure such as real-time indexing and first-class support for temporal, spatial, and ranking queries. This results in subpar performance when supporting microblogs.
Video: http://kite.cs.umn.edu/video.html
Magdy et al. studied this question.