Overview
Snowflake Analytics Hub is a comprehensive enterprise data analytics system built for a trading group with operations across Vietnam and Australia. Previously, the analytics team spent days manually collecting data from dozens of sources — SharePoint, Google Drive, internal ERP systems — and then consolidating everything in Excel before each reporting cycle.
The new system automates the entire data flow: from collection, cleansing, and normalization through to visualization on interactive dashboards, enabling leadership to make data-driven decisions based on real-time figures rather than days-old reports.
The architecture follows the data lakehouse model, combining the advantages of a data lake (flexible storage, low cost) and a data warehouse (fast structured queries) on the Snowflake platform.
The Challenge
Data was scattered across incompatible formats and systems: Excel files in SharePoint, PDF reports in Google Drive, transactional data from on-premise ERP, and logs from third-party SaaS applications. There was no unified schema, no data validation process, and different departments used different definitions for the same business metrics.
Additionally, data volume was growing 40% per quarter, requiring a system capable of scaling without re-architecture.
Our Solution
Ventra Rocket designed a three-layer ETL pipeline: an Ingestion Layer using Apache Kafka to stream real-time data from all sources, a Processing Layer using Laravel jobs to transform and validate data against business rules, and a Serving Layer with Snowflake as the central analytical warehouse.
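The three layers can be sketched as staged functions: ingest raw events, validate and normalize them in the processing layer, then hand clean rows to the serving layer. This is an illustrative TypeScript sketch, not the production Laravel code; the record shape, field names, and validation rules are assumptions.

```typescript
// Hypothetical record shapes; the real schema lives in the data catalog.
type RawRecord = { source: string; amount: string; currency?: string };
type CleanRecord = { source: string; amount: number; currency: string };

// Processing layer: validate against simple business rules (assumed
// here for illustration) and normalize types.
function processRecord(raw: RawRecord): CleanRecord | null {
  const amount = Number(raw.amount);
  if (!Number.isFinite(amount) || amount < 0) return null; // reject bad rows
  return { source: raw.source, amount, currency: raw.currency ?? "AUD" };
}

// Serving-layer stand-in: collect clean rows as they would be batch-loaded
// into Snowflake via a connector.
const warehouse: CleanRecord[] = [];
function serve(record: CleanRecord): void {
  warehouse.push(record);
}

// Ingestion-layer stand-in: in production these events arrive via Kafka.
const incoming: RawRecord[] = [
  { source: "erp", amount: "120.50", currency: "VND" },
  { source: "sharepoint", amount: "not-a-number" }, // fails validation
];

for (const raw of incoming) {
  const clean = processRecord(raw);
  if (clean !== null) serve(clean);
}
```

Keeping each layer a pure function over records is what lets the same validation run against both the 15-minute batch syncs and the continuous Kafka stream.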
The Vue.js frontend provides interactive dashboards with 30+ chart types, multi-dimensional filters, and drill-down from summary to transaction-level detail. Users with no SQL knowledge can build custom reports independently.
Key Features
- Real-time ETL Pipeline: Data from SharePoint and Google Drive syncs automatically every 15 minutes; critical sources like financial transactions stream continuously via Kafka.
- Unified Data Catalog: A metadata catalog helps users discover and understand the meaning of each data field, preventing metric misinterpretation across departments.
- Self-service Analytics: Drag-and-drop report builder for non-technical users with one-click export to Excel or PDF.
- Anomaly Detection: Automatic detection of data anomalies (sudden drops, unusual spikes) with alerts via email and Slack.
- Role-based Access Control: Granular permissions by department and role — each person only sees data they are authorized to access.
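The anomaly detection above can be illustrated with a trailing-window z-score check: a new data point is flagged when it deviates too far from the recent history. This is a minimal sketch; the window size, threshold, and function name are assumptions, not the production configuration.

```typescript
// Flag a value as anomalous if its z-score against the trailing
// history exceeds a threshold (default 3 standard deviations).
function isAnomaly(history: number[], value: number, threshold = 3): boolean {
  if (history.length < 2) return false; // too little data to judge
  const mean = history.reduce((a, b) => a + b, 0) / history.length;
  const variance =
    history.reduce((a, b) => a + (b - mean) ** 2, 0) / history.length;
  const std = Math.sqrt(variance);
  if (std === 0) return value !== mean; // any deviation from a flat series
  return Math.abs(value - mean) / std > threshold;
}

// A sudden spike against a steady daily series is caught,
// while an ordinary reading passes:
const daily = [100, 102, 98, 101, 99, 100, 103];
const spike = isAnomaly(daily, 160); // → true (unusual spike)
const normal = isAnomaly(daily, 101); // → false
```

In production a check like this would run per metric after each sync, with the flagged values routed to the email and Slack alerting described above.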
Impact & Results
Monthly report preparation time dropped from 3 days to 4 hours. The analytics team reclaimed 60 hours of labor per month to focus on deep analysis rather than manual data collection.
Data quality improved markedly: error rate fell from 8% to under 0.3% through the automated validation layer. Leadership gained real-time KPI visibility for the first time instead of waiting for end-of-month reports.
Tech Stack Details
Snowflake was chosen for its auto-scaling query compute without infrastructure management, ideal for a team without a dedicated DBA. Apache Kafka's durable, replicated log buffers events so data survives temporary downtime in source systems. Laravel, with its robust queue system, handles asynchronous ETL tasks reliably. Vue.js with Pinia enables complex UI components with clear, debuggable state management.
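The retry behavior that makes queued ETL tasks reliable can be sketched as follows. Laravel workers express this via a job's `tries` and `backoff` settings; the sketch below is hypothetical TypeScript, shown synchronously for brevity (real workers wait between attempts).

```typescript
// Retry a task up to maxAttempts times, rethrowing the last error
// if every attempt fails.
function runWithRetry<T>(task: () => T, maxAttempts = 3): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return task();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // give up after maxAttempts
}

// A flaky ETL task that fails twice before succeeding, as a source
// in temporary downtime might:
let calls = 0;
const flaky = () => {
  calls += 1;
  if (calls < 3) throw new Error("source temporarily unavailable");
  return "loaded";
};

const outcome = runWithRetry(flaky); // → "loaded" on the third attempt
```

Combined with Kafka's buffering on the ingestion side, this is what lets the pipeline ride out transient outages without manual intervention.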