Recently, Convolutional Neural Networks (CNNs) have demonstrated impressive performance on several visual recognition benchmark datasets that use overhead imagery. However, most analyses on these benchmarks test pre-trained CNNs on imagery collected over roughly the same locations as the training imagery. In this work we propose a benchmark problem, termed cross-geographical domain (xGD) adaptation, designed to evaluate CNNs when they are tested on imagery collected over previously unseen geo-locations, a more challenging and practical scenario that we term cross-domain testing. We focus on building segmentation due to the availability of appropriate datasets. The results indicate that CNNs generalize poorly to imagery from geographic locations that were not present in training. Surprisingly, we found that larger models (pre-trained on ImageNet) generalize as well as small models in cross-domain testing, and sometimes better. This work provides the first comprehensive results for cross-domain recognition, raising awareness of this important problem. We hope that xGD can serve as a benchmark for future work; xGD uses publicly available data, and we release our design details with this publication.