Convolutional neural networks (CNNs) have recently achieved unprecedented performance in the automatic recognition of objects (e.g., buildings, roads, or vehicles) in color aerial imagery. Although these results are promising, questions remain about their practical applicability. The visual characteristics of remote sensing imagery vary widely across geographic locations, yet CNNs are often trained and tested on imagery from the same or nearby locations. It is therefore unclear whether trained CNNs will perform well on new, previously unseen geographic locations, which is an important practical consideration. In this work, we investigate this problem by applying CNNs to solar array detection on a large aerial imagery dataset comprising two nearby US cities. We compare the performance of CNNs under two conditions: training and testing on the same city versus training on one city and testing on another. We discuss several subtle difficulties with these experiments and make recommendations. We show that there can be a substantial performance loss in the second case compared to the first. We also investigate how much training data from the unseen city is required to fine-tune the CNN so that it performs well. Finally, we compare several fine-tuning strategies, one of which emerges as a clear winner.