The normal distribution function - Advice

The team has recommended using np.random.normal to create a normal distribution of each variable for each family, with at least 100 samples. However, np.random.normal can also yield negative values, which are not possible for a geochemical variable. So the question is what to do with those negative values. Do we replace them with zero, the mean, the median, or the mode?

A hint: visualizing the samples with a box plot can be a great help in making that decision.
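To make the options concrete, here is a minimal sketch, not the team's method. The mean, standard deviation, and seed are made-up values for one hypothetical variable of one family, and "redraw the negatives" is an extra option that was not in the question. It prints the five numbers a box plot would draw for each option so they can be compared side by side.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability

# Hypothetical mean and standard deviation for one variable of one family.
mean, std, n_samples = 2.0, 1.5, 100

samples = rng.normal(loc=mean, scale=std, size=n_samples)
negative = samples < 0
print(f"negative draws: {negative.sum()} of {n_samples}")

# Option 1: clip negatives to zero (this piles values up at exactly 0).
clipped = np.clip(samples, 0, None)

# Option 2: replace negatives with the median of the non-negative draws.
median_filled = np.where(negative, np.median(samples[~negative]), samples)

# Option 3 (an assumption, not from the question): redraw negatives
# until every value is non-negative.
resampled = samples.copy()
while (resampled < 0).any():
    bad = resampled < 0
    resampled[bad] = rng.normal(loc=mean, scale=std, size=bad.sum())


def five_number_summary(x):
    """Return min, Q1, median, Q3, max: the values a box plot displays."""
    return np.percentile(x, [0, 25, 50, 75, 100])


for name, arr in [
    ("raw", samples),
    ("clipped", clipped),
    ("median-filled", median_filled),
    ("resampled", resampled),
]:
    print(f"{name:>14}:", np.round(five_number_summary(arr), 2))
```

If matplotlib is available, passing the same four arrays to its boxplot function should give the visual version of this comparison.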

Hello mkr6255. I did not handle the negative values at all and still got a pretty good result. As long as we create enough samples, it won’t be a problem. In reality the distribution would not be strictly normal, but it seems a good approximation for this exercise. I also applied the same normalization to the geochemical dataset that I used on the lab samples, which probably helps as well.
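A rough way to judge whether leaving the negatives alone is safe for a given variable is to estimate how many to expect. The sketch below uses the normal CDF, and the mean and standard deviation pairs are made up for illustration. The expected share of negative draws depends on the ratio of mean to standard deviation rather than on the number of samples, so this check is worth running per variable.

```python
import math

import numpy as np


def expected_negative_fraction(mean, std):
    """P(X < 0) for X ~ Normal(mean, std), from the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(-mean / (std * math.sqrt(2.0))))


rng = np.random.default_rng(1)  # arbitrary seed

# Hypothetical (mean, std) pairs, from well separated from zero to close to it.
for mean, std in [(3.0, 1.0), (2.0, 1.0), (1.0, 1.0)]:
    draws = rng.normal(loc=mean, scale=std, size=100_000)
    empirical = (draws < 0).mean()
    theory = expected_negative_fraction(mean, std)
    print(f"mean={mean}, std={std}: expected {theory:.4f}, observed {empirical:.4f}")
```

When the mean sits a few standard deviations above zero, negative draws are rare enough that ignoring them is unlikely to matter much; when it sits close to zero, they make up a noticeable share of the samples.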