Bianchi Dy - Personal Website

High-quality dialogue between the government and the public is a necessary component of the urban planning process. Data from public surveys, online feedback portals, and social media platforms are standard sources of public opinion on current issues and developments. However, deriving area- or issue-specific insights from unstructured text requires planners to read individual messages, a tedious and siloed process that prevents them from learning across related issues and cases.

In this project, we interviewed planners to understand the structure of planning departments at URA, their workflows, and how they currently use citizen feedback in their day-to-day decision making. Aside from coordinating expectations, meetings and deliverables with the DPL team, I developed a workflow that uses natural language processing and machine learning to group unstructured text into clusters or 'topics', with an emphasis on identifying and assessing the quality of persistent ('evergreen') and 'emergent' topics across time. I explored different metrics for assessing cluster quality and for measuring how similar clusters are to one another over time, based on the shared occurrence of keywords. These are reflected in the data visualizations on the next two pages, which were primarily designed by Nazim Ibrahim with my input. Both the workflow and the data visualizations are currently being integrated into the ePlanner, an all-in-one system of planning knowledge where datasets can be layered onto each other or consulted in detail by planners.

This project served as a precursor to another project on machine-assisted reply generation for urban planning queries. It will be part of the to-be-launched SUTD-URA Centre of Excellence.

2019-2021

✦Research

✦Data Visualization

Bianchi Dy, Nazim Ibrahim, Sam Joyce

B Dy, I Nazim, S Koh, A Chua, “Topics Through Time: Clustering and Visualizing Unstructured Public Feedback for City Planning” under review in Computers, Environment and Urban Society, 2022

Eight maps of Singapore, colored by the intensity of messages pertaining to a given cluster

Small multiples of each planning area, broken down into subzones. The bar chart in the top row shows topic volumes, while the bottom row shows where messages are concentrated within the planning area.

I initially explored geospatial representations of the data such as choropleths and dot plots. Messages were clustered using k-means and TF-IDF after stop word removal. This project was the first time the data had been visualised on a map, and this feature was later brought over into URA's planning support system.
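The clustering step described above can be sketched roughly as follows. This is a minimal, standard-library illustration of TF-IDF weighting followed by k-means, not the project's actual pipeline: the stop word list, the smoothed IDF formula, the farthest-point initialisation and the four sample messages are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "and", "is", "are", "to", "of", "in", "at",
              "more", "too", "near", "during"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def tfidf_vectors(docs):
    """Return one L2-normalised TF-IDF vector per document."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for t in tokens for w in t})
    n = len(docs)
    df = Counter(w for t in tokens for w in set(t))  # document frequency
    vectors = []
    for t in tokens:
        tf = Counter(t)
        # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1
        vec = [tf[w] * (math.log((1 + n) / (1 + df[w])) + 1) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up feedback messages, for illustration only.
messages = [
    "more parks and green spaces near the estate",
    "the park needs more trees and green space",
    "bus service is too crowded during peak hours",
    "trains and buses are crowded at peak hours",
]
labels = kmeans(tfidf_vectors(messages), k=2)
print(labels)
```

On this toy input the two greenery messages end up in one cluster and the two transport messages in the other. On real feedback data, a library implementation and a tuned number of clusters would likely be preferable to this hand-rolled version.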

The left chart shows four blurred-out graphs overlaid by the words Volume, Quality, Similarity and Text. The right-hand side shows a series of scatterplots containing t-SNE results.

I experimented with data visualizations such as bar charts, area charts (for cluster metrics) and t-SNE to represent clusters in the dataset. From interviews with planners, we learned that they were most interested in four things: the volume of a cluster, the keywords inside it, its quality, and its similarity to other clusters within the same time slice or in other time slices. While t-SNE (right-hand side) showed cluster volume, quality and similarity to other clusters, planners found it difficult to interpret, especially when making temporal comparisons.

Charts of the temporal cluster tree. The left side shows state-based coloring and the right side shows parent-based coloring.

To show all four aspects planners were interested in, Nazim and I developed the temporal cluster tree. It uses a shared documents score and cosine similarity to map relationships between clusters from different time slices. The two coloring systems enable users to identify lineages of topics, i.e. persistent or "evergreen" concerns and emergent topics.
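As a rough illustration of how clusters from adjacent time slices might be linked, the sketch below computes cosine similarity between cluster keyword weights and assigns each cluster its most similar predecessor. Several things here are assumptions rather than details from the project: the `shared_documents_score` definition (a Jaccard overlap of document IDs, which presumes time slices can share documents), the 0.3 threshold, the use of cosine similarity alone for linking, and all cluster names and weights.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse keyword-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def shared_documents_score(docs_a, docs_b):
    """ASSUMED definition: Jaccard overlap of two clusters' document ID sets."""
    union = docs_a | docs_b
    return len(docs_a & docs_b) / len(union) if union else 0.0

def link_clusters(previous, current, threshold=0.3):
    """Map each current cluster to its most similar previous cluster.

    A cluster with no predecessor at or above the (hypothetical) threshold
    maps to None, i.e. it is treated as an emergent topic.
    """
    links = {}
    for name, cluster in current.items():
        best_parent, best_score = None, threshold
        for parent_name, parent in previous.items():
            score = cosine_similarity(cluster["keywords"], parent["keywords"])
            if score >= best_score:
                best_parent, best_score = parent_name, score
        links[name] = best_parent
    return links

# Made-up clusters for two consecutive time slices.
earlier = {
    "parks": {"keywords": {"park": 0.8, "green": 0.6}, "docs": {1, 2, 3}},
    "transport": {"keywords": {"bus": 0.7, "crowded": 0.7}, "docs": {4, 5}},
}
later = {
    "greenery": {"keywords": {"park": 0.6, "trees": 0.8}, "docs": {3, 6}},
    "cycling": {"keywords": {"bicycle": 1.0}, "docs": {7}},
}

links = link_clusters(earlier, later)
overlap = shared_documents_score(earlier["parks"]["docs"], later["greenery"]["docs"])
print(links, overlap)
```

In this toy example "greenery" is linked to "parks" as a continuing topic, while "cycling" has no parent and would be treated as emergent. Following parent links across many slices gives the kind of lineage that a tree view can color by state or by parent.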


Topics across time: temporal knowledge discovery in urban planning feedback data through machine learning

Design and Planning Lab, Urban Redevelopment Authority of Singapore