Data · Deployed

Market Analytics

Market data analytics — automated ETL pipeline, Snowflake warehouse, interactive dashboards.

1M+ records in under 2s
Python, Snowflake, React, D3.js

Overview

Market Analytics is a real estate market data analytics platform built for a leading market research firm in Vietnam. The system collects and analyzes property transaction data from multiple sources, providing market reports and price trend intelligence for investors, banks, and property developers.

The standout capability is processing and visualizing over 1 million transaction records with sub-2-second query times — giving analysts the freedom to explore data interactively without waiting.

The Challenge

Real estate data in Vietnam is fragmented across many sources: online listing portals, provincial land price bulletins, auction records, and brokerage firm reports. Each source uses different formats, data quality is inconsistent, and address representations vary widely — the same district might appear as "Q.1, HCM", "Quận 1, TP.HCM", or "District 1, Ho Chi Minh City" across different sources.

Normalizing addresses was the core analytical challenge: all of these variations needed to be recognized as the same geographic entity for accurate cross-source comparison.
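As a minimal sketch of that recognition step, the three variants quoted above can be collapsed to one canonical pair. This assumes a small hand-built alias table (`CITY_ALIASES`, hypothetical) and uses Python's standard `difflib` as a stand-in fuzzy matcher; the matcher and gazetteer actually used in the project are not described here.

```python
import re
import unicodedata
from difflib import get_close_matches
from typing import Optional, Tuple

# Hypothetical alias table; a real gazetteer would cover every province and district.
CITY_ALIASES = {
    "hcm": "Ho Chi Minh City",
    "tp hcm": "Ho Chi Minh City",
    "ho chi minh city": "Ho Chi Minh City",
    "thanh pho ho chi minh": "Ho Chi Minh City",
}

# Matches folded district prefixes such as "q 1", "quan 1", "district 1".
DISTRICT_RE = re.compile(r"^(?:q|quan|district)\s*(\d+)$")


def fold(text: str) -> str:
    """Lowercase, strip Vietnamese diacritics, and turn punctuation into spaces."""
    text = text.replace("đ", "d").replace("Đ", "D")  # NFD does not decompose đ
    text = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def normalize_district(raw: str) -> Optional[Tuple[str, str]]:
    """Return (district, city) in canonical form, or None if the input is not recognized."""
    parts = [fold(part) for part in raw.split(",")]
    if len(parts) != 2:
        return None
    district = DISTRICT_RE.match(parts[0])
    # Fuzzy lookup tolerates small typos in the city part.
    city_key = get_close_matches(parts[1], list(CITY_ALIASES), n=1, cutoff=0.8)
    if district is None or not city_key:
        return None
    return f"District {int(district.group(1))}", CITY_ALIASES[city_key[0]]


for variant in ["Q.1, HCM", "Quận 1, TP.HCM", "District 1, Ho Chi Minh City"]:
    print(variant, "->", normalize_district(variant))
```

All three inputs map to `('District 1', 'Ho Chi Minh City')`. Folding diacritics before matching keeps the fuzzy step focused on genuine spelling differences rather than encoding differences.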

Our Solution

Ventra Rocket built a Python ETL pipeline with four distinct stages: scraping data from the sources, cleansing it with fuzzy matching to normalize addresses, geocoding each address to standardized coordinates, and loading the result into a Snowflake star schema optimized for analytical queries.
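To illustrate what a star schema for this kind of data might look like, here is a small runnable sketch. It uses Python's built-in `sqlite3` as a stand-in for Snowflake so it runs anywhere; the table names, columns, and sample rows are all invented for illustration and are not the project's actual schema.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to two dimension tables.
SCHEMA = """
CREATE TABLE dim_location (
    location_id INTEGER PRIMARY KEY,
    ward TEXT, district TEXT, city TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    iso_date TEXT, year INTEGER, month INTEGER
);
CREATE TABLE fact_transaction (
    transaction_id INTEGER PRIMARY KEY,
    location_id INTEGER REFERENCES dim_location(location_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    price_vnd INTEGER,
    area_m2 REAL
);
"""


def build_demo_warehouse() -> sqlite3.Connection:
    """Create the schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO dim_location VALUES (1, 'Ben Nghe', 'District 1', 'Ho Chi Minh City')"
    )
    conn.execute("INSERT INTO dim_date VALUES (20240115, '2024-01-15', 2024, 1)")
    conn.executemany(
        "INSERT INTO fact_transaction VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 20240115, 10_000_000_000, 100.0),
         (2, 1, 20240115, 6_000_000_000, 50.0)],
    )
    return conn


def price_per_m2_by_district_month(conn: sqlite3.Connection):
    """Typical analytical query: join the fact table to its dimensions and aggregate."""
    return conn.execute(
        """
        SELECT l.district, d.year, d.month,
               SUM(f.price_vnd) / SUM(f.area_m2) AS vnd_per_m2
        FROM fact_transaction AS f
        JOIN dim_location AS l USING (location_id)
        JOIN dim_date AS d USING (date_id)
        GROUP BY l.district, d.year, d.month
        """
    ).fetchall()


print(price_per_m2_by_district_month(build_demo_warehouse()))
```

The point of the layout is that every dashboard question (price by district, by month, by ward) becomes the same join-and-aggregate pattern over one narrow fact table.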

The React + D3.js frontend provides interactive dashboards with a real estate price heatmap by ward and district, price trend charts over time, and tools to compare market indicators across regions.
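The data behind a price heatmap can be sketched as a two-step aggregation: a median price per square meter for each district, then a bucket index that a map layer could translate into a color. The sketch below is in Python rather than the frontend's JavaScript, uses only the standard library, and runs on made-up sample transactions; function names and the equal-count bucketing are assumptions, not the platform's actual method.

```python
from collections import defaultdict
from statistics import median


def median_price_per_m2(transactions):
    """Group (district, price_vnd, area_m2) rows and return the median VND/m² per district."""
    by_district = defaultdict(list)
    for district, price_vnd, area_m2 in transactions:
        if area_m2 > 0:  # skip rows with missing or invalid area
            by_district[district].append(price_vnd / area_m2)
    return {district: median(values) for district, values in by_district.items()}


def color_classes(values, n_classes=3):
    """Rank districts by value and split them into equal-count buckets (0 = cheapest)."""
    ranked = sorted(values, key=values.get)
    return {
        district: min(i * n_classes // len(ranked), n_classes - 1)
        for i, district in enumerate(ranked)
    }


# Made-up sample rows: (district, price in VND, area in m²).
sample = [
    ("District 1", 12_000_000_000, 100),
    ("District 1", 9_000_000_000, 60),
    ("District 7", 5_000_000_000, 100),
    ("Thu Duc", 3_000_000_000, 100),
]
medians = median_price_per_m2(sample)
print(color_classes(medians))
```

Equal-count buckets keep the map readable when a few very expensive districts would otherwise compress every other area into a single color.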

Impact & Results

The system processes analytical queries across 1M+ records in under 2 seconds, enabled by Snowflake's optimized clustering keys and materialized views. Weekly market report preparation time fell from 2 days to 3 hours.
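For readers unfamiliar with these two Snowflake features, the sketch below shows the kind of statements involved, wrapped in a small Python helper that works with any DB-API style cursor. The table, column, and view names are hypothetical, and the choice of clustering columns is an assumption; the project's real DDL is not shown in this write-up.

```python
FACT_TABLE = "fact_transaction"  # hypothetical table name

TUNING_STATEMENTS = [
    # Clustering key: co-locate rows that dashboards filter on together.
    f"ALTER TABLE {FACT_TABLE} CLUSTER BY (district_id, transaction_date)",
    # Materialized view: pre-aggregate the fact table so common queries
    # read a small summary instead of scanning every transaction.
    f"""CREATE MATERIALIZED VIEW mv_district_daily AS
        SELECT district_id,
               transaction_date,
               COUNT(*)        AS n_transactions,
               SUM(price_vnd)  AS total_price_vnd,
               SUM(area_m2)    AS total_area_m2
        FROM {FACT_TABLE}
        GROUP BY district_id, transaction_date""",
]


def apply_tuning(cursor) -> int:
    """Run each tuning statement on the given cursor and return how many were executed."""
    for statement in TUNING_STATEMENTS:
        cursor.execute(statement)
    return len(TUNING_STATEMENTS)
```

Note that Snowflake materialized views are defined over a single table, which is why the view above aggregates the fact table directly instead of joining dimensions.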

Data quality improved significantly: the share of addresses successfully normalized rose from 60% to 96%. Clients gained the ability to analyze price trends by individual street, a level of granularity previously unavailable in Vietnam's real estate research market.

Tech Stack Details

Python with Pandas and GeoPandas handles the geospatial ETL pipeline; GeoPandas is particularly suited to spatial analysis and coordinate system transformations. Snowflake Time Travel allows querying data as it stood at any point in the past 90 days (the maximum retention period, which requires Enterprise Edition or higher), which is invaluable for auditing and reproducing historical analyses.

React with D3.js builds custom visualizations that off-the-shelf charting libraries cannot provide, particularly choropleth maps and multi-dimensional scatter plots. Mapbox GL keeps interactive maps responsive even with tens of thousands of data points on screen at once.
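A Time Travel query can be sketched as a small helper that builds a `SELECT ... AT(OFFSET => ...)` statement, where the offset is a negative number of seconds relative to now. The helper name, the table name, and the validation rules are assumptions made for illustration.

```python
MAX_RETENTION_DAYS = 90  # retention window stated above
SECONDS_PER_DAY = 24 * 60 * 60


def time_travel_query(table: str, days_ago: int) -> str:
    """Build a Snowflake query that reads `table` as it was `days_ago` days in the past."""
    # Allow only simple identifiers such as "schema.table" to avoid SQL injection.
    if not table.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    if not 0 < days_ago <= MAX_RETENTION_DAYS:
        raise ValueError(f"days_ago must be between 1 and {MAX_RETENTION_DAYS}")
    offset_seconds = -days_ago * SECONDS_PER_DAY
    return f"SELECT * FROM {table} AT(OFFSET => {offset_seconds})"


print(time_travel_query("fact_transaction", 7))
# SELECT * FROM fact_transaction AT(OFFSET => -604800)
```

Rerunning a historical report against the same offset or timestamp is what makes a past analysis reproducible even after the underlying rows have been corrected.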
