Interesting stuff. I won't get into the clearly missing parts (why you chose that data, how you validated it, how you gathered it, and so on), even though that is really the core of why you chose the model you did and what questions are actually driving the probability you're trying to estimate. But why did you take the approach of only using code to flag the outliers, rather than explaining why they are happening in the first place?
For example, you can see the outliers as soon as you plot the data, but what methods are commonly used to identify them, beyond simply accepting them as you did? You never really explained why they are there. Maybe they come from human error? Maybe from an error in the data extract? Just accepting them and shading them red doesn't really teach us how to handle them.
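To make the point concrete, a common alternative to eyeballing a plot is a simple statistical fence such as Tukey's IQR rule. This is a minimal sketch on made-up purchase amounts (the data and values are hypothetical, not from the post being discussed):

```python
import numpy as np

# Hypothetical purchase amounts, with one obvious bulk purchase
amounts = np.array([4.0, 7.5, 3.2, 12.0, 9.9, 6.1, 1000.0])

# Tukey's IQR fence: flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = np.percentile(amounts, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = amounts[(amounts < lower) | (amounts > upper)]
print(outliers)
```

Flagging a point this way is only step one; the step the post skipped is asking *why* the flagged point exists before deciding whether to keep, correct, or exclude it.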
I say this because I have data scientists come to me with outliers that could have been my fault in how they asked for the data. There is a methodology / best practice at play here: the science of explaining the facts behind what you are observing, not accepting everything and taking it straight to the boardroom.
A good example of a retail outlier is a drop shipper. Normal users may be buying items between $1 and $20. Then you have an outlier buying in bulk at $1,000. Understanding why that happens can reveal that drop shippers are not being correctly identified and removed from the analysis entirely, because they never qualified as active users. You can also look at this as discovery: through your analysis you found a new species of fish that you can now start classifying and tracking, which is a win and a reason to investigate your results.
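The drop-shipper case above amounts to segmenting the population before analyzing it, rather than treating the bulk buyer as noise. A rough sketch, where the cutoff and the records are entirely hypothetical stand-ins for real domain knowledge:

```python
# Hypothetical transactions: normal users ($1-$20) plus one bulk buyer
purchases = [
    {"user": "a", "amount": 14.50},
    {"user": "b", "amount": 3.00},
    {"user": "c", "amount": 1000.00},  # drop shipper buying in bulk
]

# Assumed business rule: amounts far above the normal retail range
# belong to a separate drop-shipper segment, not the active-user analysis
RETAIL_CEILING = 20.00  # hypothetical cutoff from domain knowledge

retail_users = [p for p in purchases if p["amount"] <= RETAIL_CEILING]
drop_shippers = [p for p in purchases if p["amount"] > RETAIL_CEILING]
```

The point is that the "outlier" isn't deleted or shaded red; it is reclassified into a segment you can now track on its own.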