(Failure of OOD detection under invariant classifier) Consider an out-of-distribution input which contains the environmental feature: $\phi_{\text{out}}(x) = M_{\text{inv}} z_{\text{out}} + M_e z_e$, where $z_{\text{out}} \perp \mu_{\text{inv}}$. Given the invariant classifier (cf. Lemma 2), the posterior probability for the OOD input is $p(y = 1 \mid \phi_{\text{out}}) = \sigma\left(2 p^\top z_e \beta + \log \frac{\eta}{1 - \eta}\right)$, where $\sigma$ is the logistic function. Thus for arbitrary confidence $0 < c := P(y = 1 \mid \phi_{\text{out}}) < 1$, there exists $\phi_{\text{out}}(x)$ with $z_e$ such that $p^\top z_e = \frac{1}{2\beta} \log \frac{c(1 - \eta)}{\eta(1 - c)}$.

Proof. Consider an out-of-distribution input $x_{\text{out}}$ with $M_{\text{inv}} = \begin{bmatrix} I_{s \times s} \\ 0_{1 \times s} \end{bmatrix}$ and $M_e = \begin{bmatrix} 0_{s \times e} \\ p^\top \end{bmatrix}$. The feature representation is then $\phi_e(x) = \begin{bmatrix} z_{\text{out}} \\ p^\top z_e \end{bmatrix}$, where $p$ is the unit-norm vector defined in Lemma 2.

Then we have $P(y = 1 \mid \phi_{\text{out}}) = P(y = 1 \mid z_{\text{out}}, p^\top z_e) = \sigma\left(2 p^\top z_e \beta + \log \frac{\eta}{1 - \eta}\right)$, where $\sigma$ is the logistic function. Thus for arbitrary confidence $0 < c := P(y = 1 \mid \phi_{\text{out}}) < 1$, there exists $\phi_{\text{out}}(x)$ with $z_e$ such that $p^\top z_e = \frac{1}{2\beta} \log \frac{c(1 - \eta)}{\eta(1 - c)}$. $\square$
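The closed form for $p^\top z_e$ can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names and the values chosen for $\beta$ and $\eta$ are hypothetical, picked only for illustration. It solves for the projection $p^\top z_e$ that yields a target confidence $c$ and then feeds it back through the posterior.

```python
import math


def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def posterior(u, beta, eta):
    """P(y = 1 | phi_out) as a function of the projection u = p^T z_e."""
    return sigmoid(2.0 * beta * u + math.log(eta / (1.0 - eta)))


def required_projection(c, beta, eta):
    """Projection u = p^T z_e that makes the posterior equal to c."""
    if not 0.0 < c < 1.0:
        raise ValueError("c must lie strictly between 0 and 1")
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    return math.log(c * (1.0 - eta) / (eta * (1.0 - c))) / (2.0 * beta)


if __name__ == "__main__":
    beta, eta = 0.8, 0.3  # hypothetical values, for illustration only
    for c in (0.01, 0.5, 0.99):
        u = required_projection(c, beta, eta)
        print(f"target c={c:.2f}  p^T z_e={u:+.4f}  posterior={posterior(u, beta, eta):.4f}")
```

Any target confidence in $(0, 1)$ is reachable because the logistic function is a bijection from the real line onto $(0, 1)$ and $p^\top z_e$ is unconstrained for an OOD input.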

Remark: In a more general case, $z_{\text{out}}$ can be modeled as a random vector that is independent of the in-distribution labels $y = 1$ and $y = -1$ and of the environmental features: $z_{\text{out}} \perp\!\!\!\perp y$ and $z_{\text{out}} \perp\!\!\!\perp z_e$. Therefore in Eq. 5 we have $P(z_{\text{out}} \mid y = 1) = P(z_{\text{out}} \mid y = -1) = P(z_{\text{out}})$. Then $P(y = 1 \mid \phi_{\text{out}}) = \sigma\left(2 p^\top z_e \beta + \log \frac{\eta}{1 - \eta}\right)$, the same as in Eq. 7. Thus our main theorem still holds under the more general case.
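The cancellation behind the remark can be written out as a short derivation. This is a sketch under stated assumptions rather than the paper's own Eq. 5: it writes $u = p^\top z_e$, assumes $z_{\text{out}}$ is independent of $(u, y)$ jointly, and assumes the Gaussian model $u \mid y \sim \mathcal{N}(y\, p^\top \mu_e, \sigma_e^2)$ with $\beta = p^\top \mu_e / \sigma_e^2$. The symbols $\mu_e$ and $\sigma_e$ do not appear in this section and are assumed here.

```latex
\begin{align*}
P(y = 1 \mid z_{\text{out}}, u)
  &= \frac{\eta \, P(z_{\text{out}}) \, P(u \mid y = 1)}
          {\eta \, P(z_{\text{out}}) \, P(u \mid y = 1)
           + (1 - \eta) \, P(z_{\text{out}}) \, P(u \mid y = -1)} \\
  &= \sigma\!\left( \log \frac{P(u \mid y = 1)}{P(u \mid y = -1)}
           + \log \frac{\eta}{1 - \eta} \right), \\
\log \frac{P(u \mid y = 1)}{P(u \mid y = -1)}
  &= \frac{(u + p^\top \mu_e)^2 - (u - p^\top \mu_e)^2}{2 \sigma_e^2}
   = \frac{2 u \, p^\top \mu_e}{\sigma_e^2}
   = 2 \beta u .
\end{align*}
```

Since $P(z_{\text{out}})$ appears in every term of the first line, it cancels, and the posterior depends on the input only through $u = p^\top z_e$.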

## Appendix B Extension: Color Spurious Correlation

To further validate our findings beyond background and gender spurious (environmental) features, we provide additional experimental results on the ColorMNIST dataset, as shown in Figure 5.

## Evaluation Task 3: ColorMNIST

ColorMNIST [lecun1998gradient] composes colored backgrounds on digit images. In this dataset, $E = \{\text{red}, \text{purple}, \text{green}, \text{pink}\}$ denotes the background color and we use $Y = \{0, 1\}$ as in-distribution classes. The correlation between the background color $e$ and the digit $y$ is explicitly controlled, with $r \in \{0.25, \dots, 0.45\}$. That is, $r$ denotes the probability $P(e = \text{red} \mid y = 0) = P(e = \text{purple} \mid y = 0) = P(e = \text{green} \mid y = 1) = P(e = \text{pink} \mid y = 1)$, while $0.5 - r = P(e = \text{green} \mid y = 0) = P(e = \text{pink} \mid y = 0) = P(e = \text{red} \mid y = 1) = P(e = \text{purple} \mid y = 1)$. Note that the maximum correlation $r$ (reported in Table 4) is 0.45. Because ColorMNIST is simpler than Waterbirds and CelebA, increasing the correlation further yields less interesting environments in which the learner can easily pick up the contextual information. For spurious OOD, we use digits $\{5, \dots\}$ with background colors red and green, which share environmental features with the training data. For non-spurious OOD, following common practice [MSP], we use the Textures [cimpoi2014describing], LSUN [lsun] and iSUN [xu2015turkergaze] datasets. We train a ResNet-18 [he2016deep], which achieves 99.9% accuracy on the in-distribution test set. The OOD detection performance is shown in Table 4.
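The color assignment described above can be sketched as a conditional distribution over backgrounds given the digit label. This is an illustrative sketch only, not the authors' data pipeline: the function names are hypothetical, and the four-color set is taken from the probabilities listed in the text.

```python
import random

# Background colors, as named in the conditional probabilities above.
COLORS = ("red", "purple", "green", "pink")


def color_distribution(y, r):
    """Return P(e | y) for label y in {0, 1} and correlation r in [0, 0.5]."""
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie in [0, 0.5]")
    if y == 0:
        # Digit 0 is paired with red/purple with probability r each.
        return {"red": r, "purple": r, "green": 0.5 - r, "pink": 0.5 - r}
    # Digit 1 is paired with green/pink with probability r each.
    return {"red": 0.5 - r, "purple": 0.5 - r, "green": r, "pink": r}


def sample_color(y, r, rng):
    """Draw one background color for a digit with label y."""
    dist = color_distribution(y, r)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    r = 0.45
    draws = [sample_color(0, r, rng) for _ in range(10_000)]
    for color in COLORS:
        print(color, draws.count(color) / len(draws))
```

At $r = 0.25$ all four colors are equally likely for both labels, so the background carries no information about the digit; at $r = 0.45$ the two majority colors together account for 90% of each class.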