
"Well," you might say to yourself, "what a nice precise problem description! At least I do not have to clean the data first … what is that?"

After spending a lot of time looking at the data, correcting it, and talking to smug people who congratulate themselves on not being the ones who have to look after that hot mess, you finally have a dataset that seems ready for a proper statistical analysis.
But what is a statistical analysis? Are you supposed to classify, regress, estimate, test, or cluster something? Or something else entirely?
The goal of a statistical analysis

Scratching your head, you turn to your trusted statistics book and find the following claim¹:
"The goal of a statistical analysis is to find the distribution behind your data."
"What do you mean when you say the distribution behind my data?"
The distribution of your data describes the ranges and frequencies of the features of your data with respect to a population of interest.
The distribution of your data is dependent on the population of interest.
In the simplest case, your population of interest is only your available data. This means your data already describes the distribution perfectly.
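To make that concrete, here is a minimal sketch in Python (pandas, the DataFrame df, and the column age are assumptions made up for illustration): when the population of interest is exactly the observations you have, the relative frequencies in your data already are the distribution of that feature.

```python
import pandas as pd

# Made-up data standing in for "the whole population of interest".
df = pd.DataFrame({"age": [23, 35, 35, 41, 23, 52, 35]})

# Relative frequency of each observed value: this *is* the distribution
# of the feature when the population is just these rows.
empirical_distribution = df["age"].value_counts(normalize=True).sort_index()
print(empirical_distribution)
```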
Managerial Interlude
"Maybe management is only interested in results about the observations in the dataset …"
In that case, congratulations, your statistical analysis is complete. You can answer any follow-up question directly from your dataset. There is no need for any statistical estimation or statistical testing.
Delighted, you go back to management and tell them that you have completed your statistical analysis. Impressed, management immediately invites you to present your findings.

You tell them that the data is ready for any question they have about the observations, as they are only interested in the observations in the data.
Stony-faced, management tells you that they are not only interested in the observations in the data. They want to know things about many more observations: all customers, present and future, not just the few survey customers in your data …

Discouraged, you go back to your desk. "How unfair! How was I supposed to know what the population of interest is … Ok, how do I proceed with my statistical analysis now?"
An exemplary statistical analysis
As your population of interest has grown and your dataset only comprises a subset of it, you now have the problem that your data only gives you a glimpse of the true distribution underlying it.
Your data has too many features to analyze at the same time. You decide to limit your attention to a single feature for the moment. As a first step, you start with a histogram.
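In code, that first look could be as simple as the following sketch (pandas, matplotlib, and the column name monthly_spend are assumptions for illustration, not part of your actual dataset):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data: in practice this would be the single feature you picked.
rng = np.random.default_rng(42)
df = pd.DataFrame({"monthly_spend": rng.normal(loc=50, scale=10, size=500)})

# First step of the analysis: look at the shape of the feature.
df["monthly_spend"].plot.hist(bins=30, edgecolor="black")
plt.xlabel("monthly_spend")
plt.ylabel("frequency")
plt.show()
```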
Data-driven distribution decision?

"Doesn’t this look like a normal distribution? Perhaps that is the underlying distribution for this feature?"
You are faced with a decision now. Do you want to proceed with your assumption that the feature indeed follows a normal distribution?

“Sure I do. If it looks like a normal distribution and quacks like a normal distribution, it is a normal distribution.”

While duck typing is surely your thing, you are, sadly, in violation of data-driven decision making.
"Huh? How? Why?"
You looked at the histogram and came up with the hypothesis that the feature follows a normal distribution. Then you looked at the histogram again and concluded that the histogram confirmed your hypothesis. Doesn’t that strike you as a bit circular?
"Hm … if you put it that way … it really sounds wrong …"
If you want to read more about deciding correctly with data, Cassie Kozyrkov is the right person for you.
"So let’s just call the normal distribution an assumption for the moment. What now?"
Statistical Modeling
By assuming that the underlying distribution is normal, we just decided upon a statistical model for the feature.

"We did? I mean, of course, we did. But just out of curiosity how would you define a statistical model?"
Sure, a statistical model of a random feature (random because we don’t know what value the feature will take for your next observation) is a parameterized set of probability distributions, one of which describes the feature correctly.
A statistical model of X: {P(t) : t in T} with X ~ P(s) for some s in T
"Ah, I see. So P(t) is the normal distribution, am I right?"
Yes, and P(s) is the correct normal distribution that describes our feature. T is called the parameter space. In our case, it is defined as
T := {(μ, σ²) with μ real and σ² > 0}
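If it helps, you can picture the model in code as a mapping from every admissible parameter t = (μ, σ²) in T to one candidate distribution P(t). This is only a sketch; scipy and the function name P are assumptions for illustration:

```python
from scipy import stats

def P(mu: float, sigma_squared: float):
    """One member of the model, indexed by the parameter t = (mu, sigma^2)."""
    if sigma_squared <= 0:
        raise ValueError("t must lie in T, i.e. sigma^2 > 0")
    return stats.norm(loc=mu, scale=sigma_squared ** 0.5)

# Two different candidates from the same statistical model:
candidate_a = P(0.0, 1.0)     # the standard normal distribution
candidate_b = P(50.0, 100.0)  # another member of the family
print(candidate_a.mean(), candidate_b.std())
```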
The true parameters of a statistical model
"And we now need to find the correct pair s = (μ, σ²) in T?"
That is correct. After fixing the statistical model, we now need to find the true parameters behind our feature’s distribution.
After fixing a statistical model, we try to find the true parameters behind the distribution of our data.
"Ok, let’s do that. I can now estimate the mean and variance from my data."
Make sure to use the sample variance to get an unbiased estimator.
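A minimal sketch of this estimation step (numpy and the stand-in sample x are assumptions for illustration; in practice x would hold your feature’s values):

```python
import numpy as np

# Stand-in sample for the feature; replace with your own values.
rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=10, size=200)

mu_hat = x.mean()
sigma_squared_hat = x.var(ddof=1)  # ddof=1 gives the unbiased sample variance

print(f"estimated mu: {mu_hat:.2f}, estimated sigma^2: {sigma_squared_hat:.2f}")
```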
"Done. So this is the true normal distribution P(s) of the feature?"
Sadly no.

It is your estimate of the true parameters behind the distribution. You might be close to the true values or far away. You will only know once you know the true values, so most likely never. However, what you can do now is find intervals around your estimates that have a high probability of covering the true parameters.
"This sounds a lot like confidence intervals."
Exactly. You will see that, for any fixed coverage probability, an interval built around your estimates is smaller than one that does not contain them. Even though your estimates are probably wrong, this makes them good candidates for the true values.
Your estimates of the parameters behind a distribution are probably wrong, but they are still useful.
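Here is a sketch of such intervals, assuming scipy, a 95% confidence level, and the same kind of stand-in sample as before; the exact construction (t-based for the mean, chi-square-based for the variance) is one common choice, not the only one:

```python
import numpy as np
from scipy import stats

# Stand-in sample for the feature; replace with your own values.
rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=10, size=200)

n = len(x)
mu_hat = x.mean()
sigma_squared_hat = x.var(ddof=1)
confidence = 0.95

# t-based interval for the true mean mu
mean_interval = stats.t.interval(
    confidence, df=n - 1, loc=mu_hat, scale=np.sqrt(sigma_squared_hat / n)
)

# chi-square-based interval for the true variance sigma^2
lower_q, upper_q = stats.chi2.ppf(
    [(1 - confidence) / 2, (1 + confidence) / 2], df=n - 1
)
variance_interval = (
    (n - 1) * sigma_squared_hat / upper_q,
    (n - 1) * sigma_squared_hat / lower_q,
)

print("95% CI for mu:", mean_interval)
print("95% CI for sigma^2:", variance_interval)
```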
From individual distributions to a combined distribution
"Ok, so now I just need to repeat the process for the other features and I am done with the statistical analysis?"
Do you remember that we simplified our life by looking at the features individually? Unfortunately, any interdependence between your features can significantly undermine the results of that isolated view once you generalize your findings to multiple features. But it is a good start on your path to determining the true distribution behind your data.
Just remember Cassie Kozyrkov. After you are done with your analysis, you need to confirm the hypotheses and assumptions that helped you arrive at your result. Most of the time, this will require you to get new data. Or you could have split your dataset before you examined it.
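A sketch of that last option: split before looking, explore on one half, and confirm the normality assumption on the half you never touched. The stand-in sample, the 50/50 split, and the choice of scipy’s normaltest are all assumptions for illustration:

```python
import numpy as np
from scipy import stats

# Stand-in sample for the feature; replace with your own values.
rng = np.random.default_rng(0)
feature = rng.normal(loc=50, scale=10, size=400)

# Split *before* examining anything, so the confirmation data stays untouched.
rng.shuffle(feature)
explore, confirm = feature[:200], feature[200:]

# Explore on `explore`: histograms, summary statistics, forming the
# hypothesis "this looks normal" ...
# Confirm on `confirm`: test the hypothesis on data you have not looked at.
statistic, p_value = stats.normaltest(confirm)
print(f"normality test p-value on held-out data: {p_value:.3f}")
```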
Conclusion
"I think I understand what a statistical analysis is now. I have a final question. Why is a statistical analysis always concerned with the true probability distribution governing my data?"
Any statistic, estimator, classifier, or regression model is a function of your data. That means they all depend on the true distribution governing your data.
"Interesting. I will keep that in mind next time. Do I need to do anything else before I present my analysis to management again?"
There is one final thing you have to do. This is the most important thing and is absolutely required before you can present your findings.

Post any further questions below.