I have an ethnicity variable for which 9000 people gave no response. Because ethnicity isn't my main explanatory variable, I don't want those observations to be dropped from my regression. Should I add another level to my categorical ethnicity variable (i.e. 0=white, 1=BAME, 2=missing), or should I create a separate indicator variable for missing values (i.e. 0=other, 1=missing)? When I also include the latter variable in my regression, it gets dropped for collinearity, so I'm not sure how to get round this.
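
To make the two codings concrete, here is a minimal sketch in plain Python. The data are made up for illustration, and every name in it (`ethnicity`, `dummy`, `rank`, `missing_flag`, and so on) is hypothetical rather than taken from the real dataset. It assumes the separate missing flag is entered alongside the three-level ethnicity variable, which may not be exactly how the collinearity message arises in the actual regression.

```python
# Toy data (made up for illustration): None marks a non-response.
ethnicity = ["white", "BAME", None, "white", None, "BAME", "white"]


def dummy(values, level):
    """Return a 0/1 indicator column for one level of a categorical."""
    return [1 if v == level else 0 for v in values]


# Option 1: one categorical with three levels (white is the reference level).
coded = [v if v is not None else "missing" for v in ethnicity]
bame = dummy(coded, "BAME")
missing_level = dummy(coded, "missing")

# Option 2: a separate 0/1 flag for missingness.
missing_flag = [1 if v is None else 0 for v in ethnicity]


def rank(columns):
    """Rank of the matrix with the given columns, via Gaussian elimination."""
    rows = [list(map(float, r)) for r in zip(*columns)]
    rk = 0
    for col in range(len(columns)):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next(
            (i for i in range(rk, len(rows)) if abs(rows[i][col]) > 1e-12), None
        )
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        # Eliminate this column from every other row.
        for i in range(len(rows)):
            if i != rk:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


intercept = [1] * len(ethnicity)
rank_option1 = rank([intercept, bame, missing_level])
rank_both = rank([intercept, bame, missing_level, missing_flag])
print(rank_option1, rank_both)
```

In this toy setup the design matrix with both codings has four columns but the same rank as the three-column one, because `missing_flag` is identical to the dummy for the "missing" level. Under that assumption, the software has no choice but to drop one of the two columns.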

Thank you!