How George Mason professors are challenging students to find new ways to predict peak bloom dates

Several George Mason University professors have turned what started as a way to make statistics exciting for students into a competition to determine who can develop a model to accurately predict when cherry blossoms around the world will reach peak bloom.

Jonathan Auerbach, an assistant professor in George Mason’s department of statistics, said this is the third year for the contest. It’s open to undergraduate and graduate students, researchers and professionals, and encourages participants to think about determining peak bloom dates in a new way.

Usually, Auerbach said, temperature is one of the most significant factors. But, he said, “There are a lot of other factors that can be important, too. And so students try all sorts of traditional and nontraditional methods.”

The National Park Service, Auerbach said, looks closely at the D.C. trees themselves. The agency recently announced that the blossoms along D.C.’s Tidal Basin are expected to reach peak bloom between March 23 and March 26. The contest, though, requires contestants to find models that can predict bloom dates for blossoms in D.C.; Kyoto, Japan; Vancouver, Canada; Liestal-Weideli, Switzerland; and New York City.

“We take for granted that we’ve been observing the cherry trees in D.C. for 100-plus years,” Auerbach said. “Some of these other locations that the contestants have to predict, they only have a few years, or maybe no observations; it’s the first time that someone’s going to call the bloom date. The contestants have to be clever with their resources and make predictions that are going to extrapolate well.”

There are many reasons the competition is hard, Auerbach said. For one, even simple models that use temperature have to predict what the temperature is going to be over the next few weeks. There are also factors specific to each location, such as humidity and altitude, that may play a role.

Now that the entries have been submitted, judges will review submissions to make sure they align with the competition’s rules. The analysis has to be reproducible, and participants have to provide their code. Some judges who are statisticians will be “looking for a coherent narrative that predictions make sense.” Biologists, meanwhile, “are looking for a biological narrative to make sure that the predictions and the context and narrative are biologically meaningful.”

One or more winners will be selected and will be eligible for a cash prize, Auerbach said.

Models that use temperature trends usually produce predictions that are accurate to within a week, he said. Some participants then use “machine learning or data science methods in order to pick up a few extra days,” according to Auerbach.

Based on predictions that have been submitted, the average peak bloom date for D.C. is March 26. Generally, Auerbach said, contestants agree with the Park Service prediction. Historically, participants have guessed later dates, he said.

“It’s a really hard problem,” Auerbach said. “There’s just a lot of unknowns.”

More information about the competition is available online.


© 2024 WTOP. All Rights Reserved.

Scott Gelman

Scott Gelman is a digital editor and writer for WTOP. A South Florida native, Scott graduated from the University of Maryland in 2019. During his time in College Park, he worked for The Diamondback, the school’s student newspaper.
