Good research should change our thinking. People often think that sample size is an indicator of good research. There are many other, more important factors that determine whether research is any good. Regardless, I’m often asked, what’s the right sample size? My first, smart-ass answer is whatever number is needed to change your thinking and adjust your belief system. I don’t answer this question until I figure out what I need to claim. I’ll also need to understand the nature of the observations (data/evidence) to make a good argument. By nature, I mean determining whether the data will be numbers, words, self-reported, directly observed, opinions, facts, etc. My goal is to pick a sample of people that is sufficient to make a convincing argument, supported by acceptable evidence, while including necessary assumptions and qualifications between claims and evidence.
I don’t have a simple answer to ‘how many is right’ in this line of work. Others do, but I don’t. It’s like asking what’s the meaning of life. There isn’t one answer. Statisticians will teach concepts such as the law of large numbers, the central limit theorem, representativeness, and repeated measures as important things to know when doing research, running stats, and deciding the best sample size. These are important concepts I’ve learned for making observations, interpreting data, forming conclusions, and predicting the probability of future events. But these concepts often don’t come to mind when I’m determining how many people to study. And these concepts may not be necessary for adjusting your belief system. There are three concepts I do often consider when deciding how many people to study: 1) time and budget, 2) data variability, and 3) effect size.
Time and budget have a big impact on how many people can be studied. I don’t know of anything more complex than understanding human behavior. It may take a lifetime for you to understand yourself. And many more lifetimes for me to understand you. If you think you already have yourself figured out, then good for you. But I don’t know the first thing about you. And learning about you is my job. The amount of time and money clients want to spend on understanding people is going to be fairly limited. I like to understand these limits before I decide how many people to study.
Data variability describes how similar or different observations are, or how close to or far from one another they fall. If I shine a bright light in your eyes, then your pupils will get smaller. If I shine a bright light in your friend’s eyes, then her pupils will get smaller. If people behave similarly, then there is less variability. If there is less variability, then fewer people need to be studied to understand the phenomenon. The behavior is fairly predictable and may be understood relatively quickly. If I asked you about BP a year ago, and asked you again now, your opinion may be very different. If I watched you pump gas a year ago, and watch you pump gas now, your behavior may be very similar. Opinions about products can be much more variable than how people use products. If I’m after opinions, I tend to study larger samples than if I’m interested in directly observing the use of products.
I often want to know whether new ideas will improve people’s lives. Is the effect or impact of these ideas real? Is the effect small or large? Effect size measures the strength of the relationship between two variables. For example, product X keeps people entertained longer. If longer is 5 seconds, then that’s a small effect in this case. If longer is 1 hour, then that’s a big effect. Small effect sizes may require a larger sample size, especially if there is a lot of variability in the data and the effect is difficult to see. Large effect sizes tend to be easier to see and can lead to needing fewer people.
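For anyone who wants to see how variability and effect size pull against each other in numbers, here is a rough sketch. It is not how I actually pick a sample; it is a common textbook approximation for comparing the averages of two groups, and everything in it is an assumption for illustration: the function name people_per_group, the 5% false-alarm rate (alpha), the 80% chance of detecting a real effect (power), and the idea that the data are roughly bell-shaped numbers.

```python
from math import ceil
from statistics import NormalDist


def people_per_group(difference, spread, alpha=0.05, power=0.80):
    """Rough number of people needed in each of two groups.

    difference: the smallest change worth detecting (e.g. extra minutes entertained)
    spread:     how much individual observations vary (standard deviation, same units)
    alpha, power: conventional defaults, assumed here, not taken from the article
    """
    # Standardized effect size: how big the change is relative to the variability.
    effect_size = difference / spread

    # Cut-offs from the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    # Normal-approximation formula for a two-group comparison of means.
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Same difference, more variability: more people are needed.
    print(people_per_group(difference=10, spread=20))
    print(people_per_group(difference=10, spread=40))
    # Same variability, bigger effect: fewer people are needed.
    print(people_per_group(difference=40, spread=20))
```

With these assumed defaults, an effect half the size of the spread works out to about 63 people per group, and an effect equal to the spread to about 16. The approximation gets shaky for very small samples and says nothing about words, opinions, or observations that aren't numbers, so treat it as a way to build intuition rather than an answer.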
My goal is to change our thinking, and perhaps adjust our belief system in a fundamental way. If I’m going to convince you, it’s not going to be based solely on a large volume of evidence I’ve gathered. It’s going to be based on whether I can make an argument, supported by acceptable evidence, while also describing assumptions and qualifications between claims and evidence.