Google Analytics uses Dremel to scan trillions of cells every second.
The data is sharded by key across leaf nodes, which are queried by parent nodes. Query execution is organised as a tree, with caching for popular queries.
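The tree-shaped execution can be pictured with a small sketch. This is only an illustration, not Dremel's actual code: the names `leaf_scan` and `merge`, the `country` field, and the three shards are all made up for the example. Each leaf computes a partial aggregate over its own shard, and each parent merges the partial results of its children.

```python
from collections import Counter

def leaf_scan(rows):
    """Leaf node: count rows per country over its local shard (hypothetical field)."""
    return Counter(row["country"] for row in rows)

def merge(partials):
    """Parent node: combine the partial counts returned by its children."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total

# Hypothetical shards held by three leaf nodes.
shards = [
    [{"country": "IN"}, {"country": "US"}],
    [{"country": "IN"}],
    [{"country": "US"}, {"country": "US"}],
]

# Two intermediate nodes merge the leaves below them; the root merges those.
left = merge([leaf_scan(shards[0]), leaf_scan(shards[1])])
right = merge([leaf_scan(shards[2])])
root = merge([left, right])
```

Because counts can be merged in any order, no single node ever has to see all the raw rows.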
The on-disk data format is columnar, optimised for aggregation queries. Data is stored on Google's distributed file system (Colossus).
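A toy comparison shows why a columnar layout suits aggregation. This is a sketch with invented field names (`user`, `page`, `latency_ms`), not Dremel's storage format: when each field is stored contiguously, an aggregate over one field reads only that field's values.

```python
# Row layout: every record carries all of its fields together.
rows = [
    {"user": "a", "page": "/home", "latency_ms": 120},
    {"user": "b", "page": "/cart", "latency_ms": 80},
    {"user": "c", "page": "/home", "latency_ms": 100},
]

# Column layout: one contiguous list per field.
columns = {name: [row[name] for row in rows] for name in rows[0]}

def total_latency_rows(rows):
    """Touches every record, including fields the query does not need."""
    return sum(row["latency_ms"] for row in rows)

def total_latency_columns(columns):
    """Touches only the single column the query needs."""
    return sum(columns["latency_ms"])

row_total = total_latency_rows(rows)
column_total = total_latency_columns(columns)
```

Both functions return the same total; the difference is how much data has to be read to get it.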
Dremel also uses approximations to speed up queries and improve fault tolerance.
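One way to picture the approximation idea is a query that returns once enough shards have answered, rather than waiting for slow or failed ones. The sketch below is an assumption-heavy illustration: the function `approximate_count`, the `min_fraction` parameter, and the step that scales the partial sum up are all hypothetical, and the scaling assumes rows are spread evenly across shards.

```python
def approximate_count(shard_results, min_fraction=0.75):
    """Estimate a total count from the shards that answered.

    shard_results: list of per-shard counts, with None for a shard that
    was too slow or failed (hypothetical convention for this sketch).
    """
    answered = [r for r in shard_results if r is not None]
    fraction = len(answered) / len(shard_results)
    if fraction < min_fraction:
        raise RuntimeError("too few shards responded for a useful estimate")
    # Assumption: rows are evenly spread, so scale the partial sum up.
    estimate = sum(answered) / fraction
    return estimate, fraction

# Three of four shards answered; the slow one is skipped.
estimate, fraction = approximate_count([100, 110, None, 90])
```

The caller gets both the estimate and the fraction of shards it is based on, so it can decide whether the answer is good enough.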
Paper: https://research.google/pubs/dremel-interactive-analysis-of-web-scale-datasets-2/
Google Dremel Lessons: https://interviewready.io/learn/system-design-course/google-dremel-deep-dive/google-dremel?tab=chapters
#GoogleAnalytics #Algorithms #SystemDesign