Charlottesville data scientists analyze pedestrian wifi usage in the downtown mall

Open Data Challenge encourages Charlottesville community to engage in data science


The Open Data Challenge calls for the Charlottesville data scientist community to examine pedestrian wifi usage in the Downtown Mall.

Xiaoqi Li | Cavalier Daily

An open data initiative is seeking to better understand how visitors to the Charlottesville Downtown Mall are using wireless internet. 

The open data challenge was created by Daniel Bailey, co-founder and chief executive officer of Astraea, a local startup established in 2016 that focuses on machine learning and data science tools. It is engaging the Charlottesville tech community to collaborate and design predictive models that will analyze wifi usage on the mall. Since March 12, the challenge has allowed registered teams of data scientists to create algorithms and statistically interpret pedestrian wifi use, with the ultimate goal of further developing the Downtown Mall’s infrastructure by understanding its flow of people.

With the presence of ubiquitous wifi, people constantly engage in the virtual world. This regular usage has created a plethora of easily accessible data, known as “big data,” stored in every economic sector. Because of its size, measured in terabytes or petabytes, its variety of structures, its complex formats and the speed at which it is created and stored, this “big data” cannot be analyzed by traditional data analysis methods, according to the United Nations Industrial Development Organization.

Additionally, the rate at which this data is created and stored surpasses the rate at which it is analyzed. The question remains, especially for many businesses, of how to analyze their data to “understand the market and customer behavior,” according to research conducted by the United Nations Industrial Development Organization.

“I think as a business owner to understand how your customers are using [the wifi] but then also looking at … [the] weather and events [could] impact sales,” said Jason Ness, Business Development Manager for the City of Charlottesville.

Ness said analyzing the wifi usage helps public service industries plan their maintenance and development of the mall during non-optimal business hours. Additionally, this challenge will help the city prioritize development of the mall based on pedestrian flow.

By partnering with the Open Data Challenge Advisory Group and artificial intelligence computing company NVIDIA, Astraea provided participants with a year’s worth of data from clients who connected to the Charlottesville Downtown Mall wifi. Ting, which provides internet service on the Downtown Mall, has anonymized and aggregated the data usage of pedestrians from nine access points throughout the mall. 

According to Bailey, more than 40,000 customers connected to the wifi last year, accounting for a total of nearly 330,000 sessions. 

In the Open Data Challenge, each team will design predictive models based on this data. So far, 20 teams have registered for the Open Data Challenge, totaling approximately 100 participants. 

This challenge not only raises awareness of the open data initiative undertaken by the city of Charlottesville, but also creates a pathway through which the city and its various counties can engage with the growing tech community for the purpose of social good, Bailey said in an email statement.

Ness said this challenge will leverage Charlottesville’s private sector by engaging it in the contest. He said he hopes the results will further develop the mall by “understanding what types of businesses go where and how that impacts the pedestrians and retail traffic.”

There are two categories of entry for this contest: best predictive model and best storytelling model. The judging criteria for the best predictive models are based on how well these models generate a “one-week forecast for the following time series: clients per day, number of sessions over time, and usage over time,” according to Astraea.
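To make the forecasting task concrete, here is a minimal sketch of one very simple baseline for a one-week forecast of clients per day. It is not any team's entry or Astraea's method. It assumes a "seasonal naive" approach, in which each future day is predicted as the average of the same weekday over the last few weeks. The function name, its parameters and the sample counts are all hypothetical, chosen only for illustration.

```python
from statistics import mean


def seasonal_naive_forecast(daily_clients, horizon=7, season=7, weeks=4):
    """Predict each future day as the mean of the same weekday in recent weeks.

    daily_clients: list of daily client counts, oldest first (hypothetical input).
    horizon: number of days to forecast (7 for a one-week forecast).
    season: length of the repeating cycle in days (7 for a weekly pattern).
    weeks: how many past cycles to average over, if that much history exists.
    """
    if len(daily_clients) < season:
        raise ValueError("need at least one full week of history")

    forecast = []
    for step in range(horizon):
        samples = []
        for k in range(1, weeks + 1):
            # Step back k whole weeks from the target day so the sample
            # falls on the same weekday as the day being predicted.
            idx = len(daily_clients) - k * season + (step % season)
            if idx >= 0:
                samples.append(daily_clients[idx])
        forecast.append(mean(samples))
    return forecast


# Made-up counts for two identical weeks, purely for illustration.
history = [10, 20, 30, 40, 50, 60, 70] * 2
print(seasonal_naive_forecast(history))
```

A baseline like this mainly serves as a yardstick: a more elaborate model that accounts for factors such as weather or events would need to beat it to be worth the added complexity.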

The judging criteria for the best storytelling models are based on the soundness, explainability, appeal, accessibility and engagement of the narrative and visualization. According to Ness, the visualization aspect is especially helpful for people without any data science background who want to understand these predictive models.

The winning team for each category will be awarded $500 and a Titan XP Graphics Processing Unit, which currently has a retail value of nearly $1,200. The winning teams will also present their models at the Tom Tom Founders Festival’s Applied Machine Learning Conference held at the Violet Crown Theater on April 12. 

Ness said the results of this challenge will hopefully be implemented in future development of the Charlottesville Downtown Mall. He looks forward to seeing what teams come up with and how the algorithms and statistical analyses created will be implemented to improve the city’s services. 
