December 7, 2016

New dataset developed at Duke will benefit solar energy growth

Nicholas Institute for Environmental Policy Solutions


Duke University researchers and students are helping bridge a critical information gap faced by those who seek to evaluate, improve and encourage the use of rooftop solar panels.

Duke has released a new set of open-source data that includes the sizes and locations of more than 19,000 solar panels in four California cities. In an open-access article published in Scientific Data this week, Duke researchers and students describe the dataset, how they developed it and its potential applications.

Why this dataset matters

The dataset indicates the locations of solar panels (also known as photovoltaic solar arrays) and how large they are (a good indicator of how much power they can generate).

The set can be employed immediately by utility officials and third parties in the four California cities to predict where to install or upgrade electric grid infrastructure to meet changing demands on the system. The set can also be useful to researchers working on socioeconomic analysis of renewable energy resource development or electric power grid and microgrid analysis for distributed generation.

But the set could soon have global applications.

That's because it's ground-truth data, annotated manually over eight months by a team of students and researchers who pored over detailed satellite images. Duke researchers are now successfully using the dataset to train algorithms to automate the identification of solar panels through object detection techniques.

"As these algorithms are further refined, they will be useful for identifying solar capacity in any community around the world," notes Kyle Bradbury, managing director of the Energy Data Analytics Lab at the Duke University Energy Initiative. "Solar panel data could then be efficiently collected by machines—rather than by humans staring at satellite images for months at a time or combing through public records from dozens of regional utility commissions."

And the dataset has applications beyond energy, notes Bradbury. It could be invaluable to researchers who are identifying objects of any kind (not just solar panels) via remote sensing data, particularly aerial imagery. It could also be employed in developing machine learning techniques that involve large training datasets (deep neural network development, for example).

Tapping Duke's strengths as an interdisciplinary energy school

The research team involved Duke faculty and students in electrical and computer engineering, environmental policy, economics and computer science. The project was initiated by the Duke University Energy Initiative, which advances energy education, research and engagement across Duke's schools. The team first included student and faculty participants in Data+, an immersive summer program run by the Information Initiative at Duke. Then a Bass Connections team picked up the baton.

"One of the greatest strengths of the team was its disciplinary diversity," points out Bradbury. "By bringing together students and faculty from different departments and schools, we had a built-in network with diverse ideas that considered both the big-picture objectives and the technical details of the implementation."

Authors of the Scientific Data article include Duke faculty Kyle Bradbury (Duke University Energy Initiative), Timothy Johnson (Nicholas School of the Environment), Jordan Malof (Pratt School of Engineering), Leslie Collins (Pratt School of Engineering) and Richard Newell (Nicholas School of the Environment). Duke student coauthors were Arjun Devarajan (Trinity College of Arts and Sciences), Raghav Saboo (Trinity College of Arts and Sciences) and Wuming Zhang (Trinity College of Arts and Sciences).

Additional contributors to the dataset include students Sharrin Manor, Jeffrey Perkins, Natalia Odnoletkova, Joseph Stalin, Cassidee Kido, Aaron Newman, Nicholas von Turkovich, Brody Kellish, Chia-Rui Chang, Ting Lu and Yixuan Zhang.

This work was supported in part by the Alfred P. Sloan Foundation and by Wells Fargo.

Have questions about the dataset or Duke's ongoing work in this area?

Contact Kyle Bradbury at the Duke University Energy Initiative.

Want to keep in touch with energy education, research and engagement at Duke?

Join the Energy Initiative email list and follow us on Twitter (@dukeuenergy).