- Understand the fundamentals of weblog data and its importance for eCommerce and online platforms.
- Explore the 41 attributes of a weblog dataset and learn how they map to real-world website activity.
- Install and configure Apache Spark, Spark SQL, and Apache Zeppelin on both Ubuntu and Windows (Docker-based) environments.
- Work with Spark DataFrames and Spark SQL to clean, transform, and analyze weblog data.
- Build end-to-end weblog reports, including Session Reports, Page Views Reports, New Visitor Reports, Referring Domains and Referring URL Reports, Target Domains Reports, Top IP Address Reports, Search Query Reports, and Device, Browser, and Network Analysis Reports.
- Master data visualization in Apache Zeppelin, using charts like bar, pie, and line graphs to bring your reports to life.
- Optimize Spark queries and learn basic job performance tracking and tuning.
- Publish your Databricks or Zeppelin notebooks as shareable reports for business stakeholders.
- Gain hands-on project experience with real-world weblog data, preparing you for data engineering and analytics roles.

Are you ready to master Apache Spark by working on a real-world weblog reporting project?
If you’ve ever wanted to analyze website user activity, generate meaningful insights from weblogs, and build interactive reports with Spark SQL and Apache Zeppelin, this course is designed for you.
