Hey guys, beginner's question:
I am preparing a dataframe for a machine learning model. The purpose of the model is to predict whether people infected with COVID will die or not.
To do this, I am looking at conditions and symptoms such as sore throat, cough, comorbidities, gender, and others, and binarizing them into “yes”/“no” or “male”/“female”.
I have a problem. One of the variables is “pregnant”, but only individuals of the female sex can be pregnant. How can I deal with this variable?
Can I keep it in the dataframe and assign the value “not pregnant” to all male individuals? Or could this harm the model?
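For reference, here is a minimal sketch of what I have in mind, using pandas. The column names (`sex`, `pregnant`) and the toy rows are made up for illustration and are not my real data; I am assuming the pregnancy field is simply missing for male patients.

```python
import pandas as pd

# Hypothetical toy data: column names and values are assumptions for illustration.
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "pregnant": [None, "yes", "no", None],  # missing for male patients
})

# Binarize sex: 1 = female, 0 = male.
df["is_female"] = (df["sex"] == "female").astype(int)

# Binarize pregnancy: 1 only where the value is "yes".
# Missing values (the male rows) compare as False and therefore become 0.
df["is_pregnant"] = (df["pregnant"] == "yes").astype(int)

encoded = df[["is_female", "is_pregnant"]]
print(encoded)
```

With this encoding, `is_pregnant` is 0 for every male row, and the `is_female` column still distinguishes a man from a non-pregnant woman.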
And of course you get downvoted… the virus devours people's minds.
Didn’t expect anything else…