DeepSolar: A Machine Learning Framework to Efficiently Construct Solar Deployment Database in the United States

Posted on 19/09/2019, in Paper.
  • Overview: This paper builds a solar-panel identification system (DeepSolar) and, on top of it, builds SolarForest, which predicts solar deployment density from local environmental and socioeconomic features.
  • Data: The main dataset for cross-validation is OpenPV (~1 million records with size information). On the satellite-image side, the authors scan 50 cities/towns in the U.S. (0.043% of the total area), yielding ~500k patches of 299×299 pixels after resizing. During the prediction stage, sampling is guided by nighttime lights. About 10% of the training set is positive, while <2% of the validation and test sets are positive.
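The patch pipeline can be sketched as follows. The 299×299 tile size comes from the note above; the non-overlapping stride and the dropping of incomplete border tiles are assumptions for illustration, not the paper's exact scanning scheme:

```python
import numpy as np

def tile_image(image: np.ndarray, tile: int = 299) -> list:
    """Split an H x W x 3 satellite image into non-overlapping
    tile x tile patches, dropping incomplete border tiles.
    (Non-overlapping stride is an assumption.)"""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patches.append(image[y:y + tile, x:x + tile])
    return patches

# A fake 1000 x 1200 "satellite image" yields a 3 x 4 grid of tiles.
img = np.zeros((1000, 1200, 3), dtype=np.uint8)
patches = tile_image(img)
print(len(patches))  # → 12
```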
  • Model: The Inception-v3 model pretrained on ImageNet is adopted. During fine-tuning, the authors first re-train the final affine layer and then fine-tune all other layers. Estimating the size of each solar panel requires a segmentation prediction, but fully supervised segmentation training is infeasible here given the limits of manpower and computational power. The paper instead uses class activation maps (CAMs) as a proxy for segmentation.
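A CAM can be computed directly from the final conv feature maps and the classifier weights of the GAP-linear head; a minimal numpy sketch (channel counts and spatial sizes here are illustrative, not the paper's):

```python
import numpy as np

def class_activation_map(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """CAM for one class: weighted sum of the final conv feature maps.
    features: (C, H, W) activations from the last conv layer.
    weights:  (C,) classifier weights for the target class.
    Returns an (H, W) activation map; thresholding it gives a
    rough panel segmentation mask."""
    return np.tensordot(weights, features, axes=([0], [0]))

rng = np.random.default_rng(0)
feats = rng.random((8, 10, 10))   # 8 channels, 10x10 spatial grid
w = rng.random(8)                 # weights of the "solar panel" class
cam = class_activation_map(feats, w)
mask = cam > cam.mean()           # crude segmentation proxy
print(cam.shape)  # → (10, 10)
```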
  • Greedy layer-wise training: The initial CAMs turned out to be too detailed (picking up nuances such as panel edges and grid shapes). The authors therefore propose greedy layer-wise training, stacking layers one at a time (the paper does not report details such as how many layers are trained with this greedy method):
    • Step 1: Train a model with Backbone_k[Fixed] - Conv - GAP (Global Average Pooling) - linear - softmax
    • Step 2: Append Conv to Backbone_k to obtain Backbone_{k+1}, then repeat
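The two steps above can be sketched as follows, assuming a frozen backbone output and simple numpy stand-ins for the layers (all names, shapes, and the 1×1 conv are illustrative assumptions):

```python
import numpy as np

def conv1x1(x, kernel):
    """1x1 convolution: (C_in, H, W) -> (C_out, H, W).
    Stand-in for the new conv block; the paper's layer is richer."""
    return np.tensordot(kernel, x, axes=([1], [0]))

def gap(x):
    """Global average pooling: (C, H, W) -> (C,)."""
    return x.mean(axis=(1, 2))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def head_forward(feat, kernel, linear_w):
    """Step 1 head: Backbone_k[fixed] -> Conv -> GAP -> linear -> softmax.
    `feat` is the (frozen) output of Backbone_k; only `kernel` and
    `linear_w` would be trained at this step."""
    h = conv1x1(feat, kernel)          # new conv layer
    return softmax(linear_w @ gap(h))  # class probabilities

rng = np.random.default_rng(0)
feat = rng.random((16, 8, 8))        # frozen Backbone_k output
kernel = rng.random((32, 16))        # new conv weights (trained in step 1)
linear_w = rng.random((2, 32))       # binary solar / non-solar head
probs = head_forward(feat, kernel, linear_w)
# Step 2: fold the trained conv into the backbone,
# i.e. Backbone_{k+1}(x) = conv1x1(Backbone_k(x), kernel), then repeat.
```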

  • The paper is published in Joule, and most of the model details are in the supplemental materials. It is worth considering what different audiences would take from the paper and how it might be structured differently for each of them.