This is one of my most advanced data engineering projects. It automates the collection of daily news from Al Jazeera’s Telegram channel and processes it into categorized, summarized, analytics-ready datasets.
Using Apache Airflow, the pipeline runs twice per day to fetch the latest Telegram news. Each news item is then processed through LLM-powered APIs, which return the category, confidence score, and associated country. All processed data is stored in Google BigQuery for large-scale analytics and reporting.
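As a rough illustration of the classification step, the sketch below validates an LLM reply and turns it into a flat row before it would be loaded into the warehouse. It assumes the API returns JSON with `category`, `confidence`, and `country` keys. The category list, the `ClassifiedNews` type, and both function names are hypothetical and not taken from the project.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical label set; the project's real categories are not listed in the text.
ALLOWED_CATEGORIES = {"politics", "economy", "conflict", "sports", "other"}


@dataclass
class ClassifiedNews:
    message_id: int
    text: str
    category: str
    confidence: float
    country: str


def parse_classification(message_id: int, text: str, raw_response: str) -> ClassifiedNews:
    """Validate an LLM JSON reply (assumed shape) and build a typed record."""
    payload = json.loads(raw_response)

    category = str(payload.get("category", "")).strip().lower()
    if category not in ALLOWED_CATEGORIES:
        category = "other"  # fall back rather than store an unknown label

    confidence = float(payload.get("confidence") or 0.0)
    confidence = min(max(confidence, 0.0), 1.0)  # clamp to the range [0, 1]

    country = str(payload.get("country") or "Unknown").strip()
    return ClassifiedNews(message_id, text, category, confidence, country)


def to_row(record: ClassifiedNews) -> dict:
    """Flatten the record into a plain dict, one key per column."""
    return asdict(record)
```

Validating before loading keeps malformed model output (unknown labels, out-of-range scores, missing countries) from reaching the analytics tables, where it would be harder to spot.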
The pipeline continues with data modeling and transformation using dbt, preparing clean and structured datasets. A second LLM call generates concise daily summaries of the latest news trends.
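One way the input to the second LLM call could be assembled is sketched below: the day's classified items are grouped by category and the highest-confidence ones are placed in a prompt. The prompt wording, the `build_summary_prompt` name, and the item keys (`category`, `confidence`, `country`, `text`) are assumptions made for illustration, not the project's actual prompt or schema.

```python
from collections import defaultdict


def build_summary_prompt(items: list[dict], day: str, max_per_category: int = 5) -> str:
    """Group classified news by category and build a summary prompt (hypothetical format)."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["category"]].append(item)

    lines = [f"Summarize the main news trends for {day} in 3-5 sentences.", ""]
    for category in sorted(grouped):
        lines.append(f"## {category}")
        # Keep only the items the classifier was most confident about.
        top = sorted(grouped[category], key=lambda i: i["confidence"], reverse=True)
        for item in top[:max_per_category]:
            lines.append(f"- ({item['country']}) {item['text']}")
    return "\n".join(lines)
```

Capping the number of items per category keeps the prompt size bounded on busy news days, at the cost of dropping lower-confidence items from the summary input.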
Finally, a real-time dashboard built with Next.js displays the most important insights in a clear, visual format. The entire project was rapidly scaffolded using Vibe Coding, enabling fast development of both the FastAPI backend and the Next.js frontend.